Files

T

claude@clouddev1 0e6f767848 docs: ADR-0042 — continue H1a parse-error pedagogy on the grammar tree

ADR-0020/0021 specified a chumsky-based H1a; ADR-0024 replaced chumsky
with the scannerless walker, leaving both obsolete. Mark them superseded
(kept as institutional memory) and add ADR-0042, which restates H1a
against the architecture as built.

ADR-0042 records that H1a is substantially shipped already — per-command
usage block, available-commands fallback, source-derived ident slot
labels, curated parse.custom.* near-miss messages, and schema-aware
[ERR] diagnostics — and defines the remaining work: a verified
per-command near-miss matrix (the definition of done), friendlier
literal expectation labels that add role context while keeping the
exact literal visible, and advanced-mode SQL parse parity (RETURNING
scope, CROSS JOIN ON, INSERT…SELECT count), kept distinct from
ADR-0019 §OOS-2 engine-error sanitisation.

- docs/adr/0020,0021: superseded notes + README entries
- docs/adr/0042: new ADR
- docs/adr/README.md: index upkeep (ADR-0000 rule)

2026-06-03 14:05:09 +00:00

13 KiB

Raw Blame History

ADR-0042: H1a parse-error pedagogy in the grammar-tree era

Status

Accepted — 2026-06-03.

Continues H1a (requirements.md) from ADR-0021, whose chumsky-based mechanism was superseded by ADR-0024 (unified grammar tree). ADR-0021's intent — surface the grammar of the command at the point of error, not just the next token — is re-stated here against the architecture as actually built, with an inventory of what already ships and a definition of done for the remaining work.

Cross-references ADR-0019 (friendly-error layer + i18n catalog conventions; H1a output shares the catalog), ADR-0022 (ambient typing assistance, which shares the walker's expected-set machinery), ADR-0024 (the grammar tree), and ADR-0009 (DSL surface conventions; usage templates render in the documented surface form).

Context

Why a new ADR rather than amending ADR-0021

ADR-0021 specifies a UsageEntry registry in src/dsl/usage.rs, parse.token.* catalog keys, and a renderer over chumsky's RichPattern<Token> expected sets. None of that exists. ADR-0024 removed chumsky from the project, deleted usage.rs, and folded usage information onto the grammar nodes themselves. Amending ADR-0021 in place would force every reader to mentally translate a dead mechanism; a fresh ADR records the live state directly. ADR-0020 and ADR-0021 keep their superseding notes and remain as institutional memory.

What H1a is

When a learner types something near-correct, the error should name the missing keyword or clause and show the shape of the command, rather than point a caret at the unexpected character. The user-reported gap: typing create once produced parse error: after \create`, expected `table`` — structurally true, pedagogically silent.

What already ships (the baseline — do not re-build)

Verified against code on 2026-06-03. The grammar-tree migration delivered most of ADR-0021's intent through different machinery:

Per-command usage block. Every CommandNode carries usage_ids: &'static [&'static str] (src/dsl/grammar/mod.rs). On any parse error the renderer emits a usage: block listing every form of the matched command family — 38 templates under parse.usage.* (src/friendly/strings/en-US.yaml:499-571), resolved by grammar::usage_keys_for_input and rendered by render_usage_block (src/app.rs:2560).
Available-commands fallback. When no command keyword was consumed, the block becomes available commands: … (parse.available_commands, en-US.yaml:493; app.rs:2593).
Structural error names the consumed prefix and expected set. format_walker_error (src/dsl/parser.rs:289) renders after \`, expected , found <token|end of input>, distinguishing incomplete-at-EOF (at_eof = true`, more input would help) from a definite mid-input mismatch.
Friendly slot labels for identifiers. format_expectation (src/dsl/parser.rs:262) renders Ident slots by source — "table name", "column name", "relationship name", "index name", "type" — instead of a bare "identifier" (ADR-0022 stage 8c).
Curated custom messages for high-value near-misses under parse.custom.* (en-US.yaml:443-478): create_table_needs_pk, insert_form_a_missing_values ("looks like Form A — add values (...)"), change_column_flags_exclusive, bind_type_mismatch, the redundant-constraint and alter-add-primary-key cases, etc.
Schema-aware pre-flight diagnostics that light the [ERR] validity indicator at typing time (ADR-0027 / ADR-0033 / ADR-0036): INSERT arity for Forms A/B/C, unknown table/column, type mismatch, = NULL, NOT-NULL-missing, and — on the advanced-SQL surface — cte_arity_mismatch, compound_arity_mismatch, and projection_alias_misplaced (diagnostic.*, en-US.yaml:577-620; walker logic in src/dsl/walker/mod.rs).
Ambient "Next:" hints and the simple→advanced cross-mode pointer (ADR-0022 / advanced_alternative_note, src/input_render.rs).

So H1a is substantially delivered at the intent level. The handoff's two canonical examples already behave: insert into T ('Oli') → custom Form-A message; update T set x=1 → structural "expected where or --all-rows" + usage block.

What remains — the genuine gap

The remaining work is systematic verification plus targeted polish, not a missing feature:

No enumerated coverage guarantee. Coverage is curated case-by-case; nothing asserts that every required slot in every command produces a pedagogically-sound near-miss message.
Literal expectations render terse. Word/Literal/Punct/ Flag slots come out as backticked literals (`where`, `=`, `--all-rows`). Correct, but a learner is helped more by a short prose gloss in select high-value positions.
Advanced-mode SQL parse pedagogy is thinner than the DSL surface (RETURNING scope, CTE-arity diagnostic positioning, CROSS JOIN … ON, INSERT…SELECT column-count). No other ADR or open issue covers this (ADR-0019 §OOS-2 covers advanced-SQL engine-error sanitisation — a different layer).

Decision

1. Definition of done — a verified near-miss matrix

H1a is "done" when there is a test matrix that, for every command in the REGISTRY, exercises its salient near-miss inputs and asserts the rendered output reads pedagogically. "Salient near-misses" per command means at minimum:

the bare entry keyword alone (create, add, update);
each required clause omitted (e.g. update T set x=1 with no filter rail; insert into T (cols) with no values);
a wrong token where a specific slot is expected (e.g. a number where a table name belongs);
the zero-prefix / unknown-command case (available-commands fallback).

The matrix lives in the existing surfaces — tests/typing_surface/ (snapshot-based, the standalone typing_surface_matrix binary) for the typing-time hint/validity view, and tests/it/parse_error_pedagogy.rs (the consolidated it binary) for the submit-time rendered three-block output. New integration tests go in tests/it/ per the handoff-57 §3 layout rule — not as new top-level tests/*.rs.

Work is test-first: add the matrix entry, observe the current rendering, and only then adjust wording/labels where it reads poorly. A near-miss whose current rendering is already good is locked by a snapshot, not rewritten.

2. Friendlier literal expectation labels

format_expectation gains, for high-value keyword/punct positions, an optional prose gloss while always keeping the exact literal visible — a learner must still see the precise token to type. The principle: a label may add role context, never replace the literal.

Illustrative target (final wording settled per-case against the matrix, as is normal for pedagogical text):

expected \where` or `--all-rows`→expected a filter clause: `where …` or `--all-rows``
expected \values`(after a Form-A column list) → already covered byparse.custom.insert_form_a_missing_values`; the matrix confirms it fires.

Mechanism (illustrative, finalised at implementation time): a grammar Word/Punct node may carry an optional expectation-label key, mirroring how Ident slots derive a label from IdentSource. Absent an override, rendering is unchanged (the backticked literal). This keeps the change additive and per-slot — no blanket reword that would churn the anchor-phrase tests needlessly.

New glosses are catalog-sourced (parse.expect.* or reuse of parse.usage.* fragments — chosen at implementation time) so wording stays in en-US.yaml, not in code, consistent with ADR-0019.

3. Advanced-mode SQL parse pedagogy — in scope

The same matrix discipline (§1) extends to the advanced-mode SQL surface. Two of the relevant arity diagnostics already exist and must not be re-built — cte_arity_mismatch and compound_arity_mismatch (en-US.yaml:590-591); for these the work is matrix coverage and, for CTE, auditing whether the diagnostic is positioned at the CTE name (easiest to fix) rather than the body. The genuine absences the pass adds are: RETURNING column scope, CROSS JOIN rejecting an ON clause, and INSERT…SELECT projection/target column-count (verified absent from the catalog 2026-06-03). Each gets a matrix entry; fixes land as walker diagnostics or parse.custom.* messages following the existing patterns. The : one-shot escape (a simple-mode line run once in advanced mode) is part of the advanced surface and gets at least one near-miss matrix entry.

This stays clear of ADR-0019 §OOS-2 (advanced-SQL engine-error sanitisation): §OOS-2 reworks errors raised by executing SQL; H1a here concerns errors raised while parsing it. If a near-miss turns out to be an engine error rather than a parse error, it is out of H1a scope and noted against §OOS-2 instead.

4. Catalog and anchor-phrase discipline

All new or reworded user-facing strings go through the i18n catalog (en-US.yaml) and the KEYS_AND_PLACEHOLDERS validator, per ADR-0019. No engine vocabulary in any string (CLAUDE.md).

Two anchor styles constrain §2's glosses and both are preserved by its "literal always visible" rule:

The substring assertions in src/dsl/parser.rs tests ("after …", "expected table name", "found end of input", "unknown type", "expected one of").
The substring assertions in tests/it/parse_error_pedagogy.rs, which check for backticked literals and usage fragments (e.g. `column`, `1`, "create table", "with pk"). This test is .contains()-based, not snapshot-based, so a §2 gloss that dropped the bare literal would fail it — which is precisely the regression §2's rule prevents.

The snapshot-based tests/typing_surface/ matrix will re-baseline on any §2 wording change (expected; reviewed via cargo insta), but the two substring suites above must stay green without edits to their assertions.

Out of scope

Advanced-SQL engine-error sanitisation — ADR-0019 §OOS-2.
Tab completion (I3) and syntax highlighting (I4) as features — they share the walker but are separate ADRs.
Schema-aware "did you mean Customers?" spell-correction — ADR-0021's out-of-scope §2; belongs with I3.
Multi-error reporting. The walker reports the first error and stops; unchanged.
messages-style verbosity gating of the usage block. Per ADR-0021 §8 the usage block is always shown; parse errors are exactly when pedagogical surface should be maximal. Unchanged.
Auto-generating usage/help text from the grammar. ADR-0024 left help prose hand-curated; templates stay hand-written.

Consequences

Positive

H1a gains an explicit, enumerated definition of done instead of an open-ended "systematic pass still pending".
The matrix becomes a regression lock: future grammar changes that degrade a near-miss message fail a snapshot.
Literal-label glosses close the last terse-wording gap without a blanket reword.
The advanced-SQL surface reaches parity with the DSL surface for the audience that has switched to raw SQL.

Costs

Wording iteration across many near-miss cases — but cheap, catalog-driven, and snapshot-guarded.
The §2 per-node label field is one more annotation a new command may set (optional; default unchanged).
Snapshot volume grows; acceptable given the existing ~160-entry typing-surface matrix.

Neutral

No public API change. parse_command* signatures, the ParseError shape, and the three-block render path are all unchanged; this ADR adds wording, labels, and tests within them.

Implementation notes

Order of operations (test-first throughout):

Enumerate the per-command near-miss matrix (§1) as failing/asserting tests in tests/typing_surface/ + tests/it/parse_error_pedagogy.rs. Capture current rendering as the starting baseline.
Triage: which entries read poorly? Only those get wording work.
Add the optional expectation-label mechanism (§2) and apply it to the high-value keyword/punct positions surfaced in triage.
Advanced-SQL near-miss audit + fixes (§3), distinguishing parse from engine errors as they arise.
Catalog validator + anchor-phrase checks stay green (§4).
Update requirements.md H1a with the matrix as the done-marker; flip to [x] only when the matrix is complete and green.

13 KiB Raw Blame History