4aca4aaac6
Two LALR state-merging bugs prevented Strong/Emphasis nodes from ever being
produced (confirmed: tok-strong/tok-emphasis count = 0 in browser diagnostic).
Bug 1 — _italic_ consumed as CodeIdent:
CodeIdent was a @tokens rule with identHead = [A-Za-z_], so '_italic_' (the
entire string including both underscores) matched as one CodeIdent token.
LALR state merging put CodeIdent in the valid token set of item*, and
CodeIdent ranks above "_" in @precedence, so CodeIdent always won and the
parser never opened Emphasis.
Fix: move CodeIdent to an external tokenizer (codeIdentTokenizer) with a
character-level guard — only fires when the preceding non-whitespace char
is one of '#', '.', '(', ',' (genuine code-context positions). In body
text where peek-back finds a newline, space, or markup delimiter, the
tokenizer returns without emitting, letting "_" open Emphasis correctly.
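A minimal sketch of the guard, written as plain string functions rather
than against the Lezer API so it can be run on its own. The helper names
(mayStartCodeIdent, matchCodeIdent) are hypothetical, the identifier tail
class [A-Za-z0-9_] is an assumption (only identHead is stated above), and
the sketch assumes the guard inspects the character immediately before
the candidate token:

```typescript
// Characters that mark a genuine code-context position (from the fix above).
const CODE_CONTEXT_CHARS = new Set(["#", ".", "(", ","]);

// Hypothetical helper: may a CodeIdent token start at `pos`?
// Assumption: only the character directly before `pos` is inspected.
function mayStartCodeIdent(text: string, pos: number): boolean {
  if (pos <= 0) return false; // start of input: nothing to peek back at
  return CODE_CONTEXT_CHARS.has(text.charAt(pos - 1));
}

// Hypothetical helper: end offset of a CodeIdent starting at `pos`,
// or -1 when the tokenizer should return without emitting.
function matchCodeIdent(text: string, pos: number): number {
  if (!mayStartCodeIdent(text, pos)) return -1;
  if (!/[A-Za-z_]/.test(text.charAt(pos))) return -1; // identHead
  let end = pos + 1;
  // Assumed tail class; the real grammar may differ.
  while (end < text.length && /[A-Za-z0-9_]/.test(text.charAt(end))) end++;
  return end;
}
```

In a Lezer ExternalTokenizer the same check would presumably be made
with input.peek(-1) before any call to input.acceptToken, so that in
body text no token is emitted and "_" remains available to the parser.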
Bug 2 — StrongText never produced inside Strong:
The strongItem* / emphItem* loops merged with item* states via Lezer's
aggressive LALR merging. In the merged state MarkupContent was in the
valid set (from the item* side) and MarkupContent > StrongText in
@precedence, so MarkupContent was always produced. MarkupContent is not a
valid strongItem, so the parser fell into error recovery and no StrongText
appeared in the tree.
Fix: replace the recursive strongItem* / emphItem* loops with flat external
tokens StrongBody / EmphBody (contextual: true). These fire only inside
Strong → "*" . StrongBody? "*" and Emphasis → "_" . EmphBody? "_", states
specific enough that canShift is reliable. They read everything up to the
closing delimiter or newline in one token, bypassing the LALR merging
entirely.
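The scanning rule for StrongBody / EmphBody can be sketched the same
way. This is an illustration under assumptions, not the committed
tokenizer: matchBody is a hypothetical name, it works on a plain string
instead of Lezer's input stream, and it treats an empty body as "emit
nothing" because the body is optional in both rules:

```typescript
// Hypothetical helper: end offset of a body token that starts at `pos`
// (just after the opening delimiter), or -1 when the body is empty.
function matchBody(text: string, pos: number, closer: "*" | "_"): number {
  let end = pos;
  while (end < text.length) {
    const ch = text.charAt(end);
    // Stop at the closing delimiter or at a newline, whichever is first.
    if (ch === closer || ch === "\n") break;
    end++;
  }
  // An empty span produces no token, matching StrongBody? / EmphBody?.
  return end > pos ? end : -1;
}
```

For example, matchBody("*bold text* more", 1, "*") stops at the second
"*", so the whole of "bold text" becomes a single token and no item*
state is ever entered inside the delimiters.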
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>