Verso/services
claude 4aca4aaac6
Build and Deploy Verso / deploy (push) Successful in 13m36s
fix: make CodeIdent external and replace strongItem*/emphItem* with flat body tokens
Two LALR state-merging bugs prevented Strong/Emphasis nodes from ever being
produced (confirmed: tok-strong/tok-emphasis count = 0 in browser diagnostic).

Bug 1 — _italic_ consumed as CodeIdent:
  CodeIdent was a @tokens rule with identHead = [A-Za-z_], so '_italic_' (the
  entire string including both underscores) matched as one CodeIdent token.
  LALR merging caused CodeIdent to be in item*'s valid set, and CodeIdent >
  "_" in @precedence, so the parser never opened Emphasis.
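
  A minimal sketch of why the whole string was swallowed, assuming the
  identifier tail class is [A-Za-z0-9_] (only identHead is stated above; the
  tail class and the helper name matchCodeIdent are hypothetical):

```typescript
// identHead comes from the commit message; the tail class is an assumption.
const CODE_IDENT = /^[A-Za-z_][A-Za-z0-9_]*/;

// Hypothetical helper: returns the longest CodeIdent-like match at the start
// of `text`, or null when the first character is not a valid identHead.
function matchCodeIdent(text: string): string | null {
  const m = CODE_IDENT.exec(text);
  return m ? m[0] : null;
}

// Both underscores are consumed, so no standalone "_" is left to open Emphasis.
const swallowed = matchCodeIdent("_italic_ text");
```

  Under that assumption the match covers both delimiters, which is
  consistent with the tok-emphasis count of 0 reported above.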

  Fix: move CodeIdent to an external tokenizer (codeIdentTokenizer) with a
  character-level guard — only fires when the preceding non-whitespace char
  is one of '#', '.', '(', ',' (genuine code-context positions).  In body
  text where peek-back finds a newline, space, or markup delimiter, the
  tokenizer returns without emitting, letting '"_"' open Emphasis correctly.
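
  The guard can be sketched as a pure function over the source text.  This
  is an illustration only: the real codeIdentTokenizer is not shown in this
  commit, and the names inCodeContext and CODE_CONTEXT_CHARS, as well as the
  choice to skip only spaces and tabs when peeking back, are assumptions.

```typescript
// Characters the commit message lists as genuine code-context positions.
const CODE_CONTEXT_CHARS = new Set<string>(["#", ".", "(", ","]);

// Hypothetical guard: true when the nearest preceding character, after
// skipping spaces and tabs, is one of the code-context characters.
// Newlines are not skipped, so the start of a line counts as body text.
function inCodeContext(text: string, pos: number): boolean {
  let i = pos - 1;
  while (i >= 0 && (text[i] === " " || text[i] === "\t")) i--;
  return i >= 0 && CODE_CONTEXT_CHARS.has(text[i]);
}
```

  When the guard returns false, the tokenizer would return without
  accepting a token, leaving the leading '_' available to the grammar.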

Bug 2 — StrongText never produced inside Strong:
  The strongItem* / emphItem* loops merged with item* states via Lezer's
  aggressive LALR merging.  In the merged state MarkupContent was in the
  valid set (from the item* side) and MarkupContent > StrongText in
  @precedence, so MarkupContent was always produced.  Because MarkupContent
  is not a valid strongItem, the parser went into error recovery and no
  StrongText ever reached the tree.

  Fix: replace the recursive strongItem* / emphItem* loops with flat external
  tokens StrongBody / EmphBody (contextual: true).  These fire only inside
  Strong → "*" . StrongBody? "*" and Emphasis → "_" . EmphBody? "_", states
  specific enough that canShift is reliable.  They read everything up to the
  closing delimiter or newline in one token, bypassing the LALR merging
  entirely.
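
  The scanning rule for the flat body tokens can be sketched as follows.
  This is a hedged illustration, not the tokenizer from this commit: the
  helper name readFlatBody and the -1 return value for an empty body are
  assumptions, and the contextual canShift check is left out.

```typescript
// Hypothetical helper: scans from `pos` up to (not including) the closing
// delimiter or a newline and returns the end offset of the body token.
// Returns -1 for an empty body, so that no token is emitted and the
// optional StrongBody? / EmphBody? position in the rule stays empty.
function readFlatBody(text: string, pos: number, delimiter: "*" | "_"): number {
  let end = pos;
  while (end < text.length && text[end] !== delimiter && text[end] !== "\n") {
    end++;
  }
  return end > pos ? end : -1;
}
```

  Because the body is read as a single token, there is no inner item loop
  whose states could be merged with item*.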

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-08 22:15:22 +00:00