add ADR-0024: unified grammar tree execution plan (accepted)

Concrete specification for the direction in ADR-0023, landed during the round-6 design pass. Resolves all four rounds of open design questions: walker as single source of truth, scannerless terminal vocabulary (~8 building blocks), typed value slots with content validators, WalkContext for schema-aware narrowing from day one, WalkOutcome multi-purpose return, HintMode per-node, ranker as separate layer, static + dynamic sub-grammars, aliases as Word annotations, IdentSource taxonomy, six-phase per-command migration with chumsky and walker side-by-side during the transition.

Key shifts from ADR-0023's sketch:
- Lexer dissolves entirely. Walker operates on bytes directly. dsl/lexer.rs, dsl/keyword.rs go away in Phase F.
- Schema-aware parse from day one (not phased). Typed value slots reject mis-shaped input at parse time with localised wording. Completion narrows per column type.
- Sub-grammars: static (fn() -> Node) for composition; dynamic (fn(&WalkContext) -> Node) for schema-dependent expansion. No global named registry.
- Path-bearing commands: BarePath becomes a routine non-whitespace terminal. Paths with spaces require quoting via StringLit (UX simplification, aligns with standard CLI convention).
- 13-node taxonomy: Word, Punct, Ident, NumberLit, StringLit, BlobLit, Flag, BarePath, Choice, Seq, Optional, Repeated, DynamicSubgrammar.

Migration plan: Phase A (walker scaffolding + app-lifecycle commands), Phase B (DDL without value literals), Phase C (create table), Phase D (data commands with full schema awareness -- the design's central claim landing), Phase E (replay), Phase F (delete chumsky + lexer + legacy parser modules, simplify catalog). Estimated ~4 sessions total.

Also: rename ADR-0023 from 0023-proposed-unified-grammar-tree.md to 0023-unified-grammar-tree.md (git mv preserves history) and update its status to reflect the direction-accepted-but-superseded-for-execution-detail relationship with ADR-0024. Index updated.
2026-05-14 21:52:10 +00:00
parent 3b36bbb4d6
commit 74c3ec1edf
3 changed files with 719 additions and 14 deletions
@@ -1,21 +1,24 @@
-# ADR-0023: Unified declarative grammar tree (proposed direction)
+# ADR-0023: Unified declarative grammar tree (direction)
 ## Status
-**Proposed.**
+**Accepted in direction, superseded for execution detail by
 ADR-0024.** 2026-05-14.
-Not yet accepted. Captures a researched direction for a future
+This ADR captures the architectural critique (the "10-place
-refactor that supersedes the parts of ADR-0001 (chumsky as the
+edit" scatter problem with the current parser shape) and the
-DSL parser), ADR-0019 (separated catalog declaration), ADR-0020
+direction (a unified declarative grammar tree). The round-6
-(lexer + keyword macro), ADR-0021 (per-command usage registry),
+design pass turned that direction into a concrete specification,
-and ADR-0022 (completion via expected-set introspection,
+which ships as ADR-0024. ADR-0024 makes some refinements
-highlighting via lexer) that this ADR identifies as accreted
+beyond what's sketched here — notably the decision to drop the
-rather than designed.
+lexer module entirely (scannerless walker) and to put schema-
 aware narrowing into round 1 rather than phasing it. Read
 ADR-0024 for the executable plan; this ADR remains for the
 institutional memory of why the change is happening.
-Filename carries the `-proposed-` segment so the status is
+The filename was renamed from `0023-proposed-unified-grammar-tree.md`
-visible at directory listing time; on acceptance, rename to
+to `0023-unified-grammar-tree.md` when the direction was
-`0023-unified-grammar-tree.md` via `git mv` (history
+accepted. History is preserved through the `git mv`.
 preserved).
 ## Context
@@ -0,0 +1,701 @@
 # ADR-0024: Unified grammar tree — execution plan
 ## Status
 **Accepted.** 2026-05-14.
 Concrete specification for the direction proposed in ADR-0023.
 Where ADR-0023 captured the critique of the current parser
 shape and the high-level vision, this ADR specifies the data
 model, walker semantics, migration sequence, and cleanup steps
 in enough detail that implementation can proceed without
 further design decisions.
 Supersedes ADR-0023's "Proposed" status. ADR-0023 stays in
 the directory as institutional memory of why this change is
 happening; ADR-0024 is what gets built.
 ## Context
 The design pass in the round-6 session (2026-05-14)
 worked through ADR-0023's open questions and a number of
 implicit decisions that hadn't been written down. Four rounds
 of questions, each followed by user confirmation:
 1. **Round 1 — foundational.** Registry shape, node taxonomy,
   AST output model, failure / "expected" semantics, walker
   API and its mapping to parse / complete / highlight / hint
   concerns.
 2. **Round 2 — concrete representation.** Multi-keyword
   sequences, sub-grammar reusability (static and dynamic),
   path-bearing commands, bare-or-with-suffix commands.
 3. **Round 3 — organisation and migration.** Module layout,
   per-command migration strategy, test discipline during
   migration.
 4. **Round 4 — smaller details.** Aliases on keyword nodes,
   `IdentSlot` fate, highlight palette, external-tooling
   exposure.
 Two larger decisions emerged from the rounds and shifted the
 shape from ADR-0023's sketch:
 - **The lexer dissolves.** The walker operates directly on
  source bytes ("scannerless"). The current `dsl/lexer.rs`
  module's responsibilities (whitespace skipping, token shape
  recognition, byte-span tracking) migrate into terminal-node
  consume functions and the walker driver. The `define_keywords!`
  macro is no longer needed in its current form; keyword
  literals live on `Word` nodes in the grammar.
 - **Schema-aware parse from day one.** ADR-0023 had been
  cautious about coupling parse to schema state. The round-1
  / round-2 discussion concluded that this caution comes from
  general-purpose parser tooling and doesn't apply to an
  interactive DSL editor where the schema *is* the context.
  Typed value slots consult the schema during parse; bind-time
  type checks remain but become belt-and-braces rather than
  the primary defense.
 A separate critique surfaced in the design pass: my (Claude's)
 default pull toward "what's the safe incremental version of
 what general-purpose parser tooling does" repeatedly fought
 against the project owner's cleaner direct design. The pull
 is now explicitly resisted — this ADR ships the direct design,
 not a phased compromise.
 ## Decision summary
 A single trie data structure declared in Rust serves as the
 authority for parsing, completion, syntax highlighting, parse-
 error usage rendering, hint-panel content, and (eventually)
 external-tooling exposure. The walker that consumes this trie
 operates directly on source bytes — no separate lexer pass.
 Schema-aware narrowing flows naturally from the trie's
 structure: typed value slots and dynamic sub-grammars consult
 a per-walk context that carries the current table, the
 resolved column types, and a reference to the schema cache.
 Migration is per-command across six phases. The legacy
 chumsky parser and the new walker run side-by-side during the
 transition; existing behavioural tests guard regressions.
 Phase F removes chumsky, the lexer module, the separate
 `UsageEntry` registry, and the expected-set introspection
 in `completion.rs`.
 Estimated total cost: ~4 sessions — one to land the framework
 and migrate Phase A, two for Phases B-D, one for Phases E + F.
 ## Architecture
 ### Walker as single source of truth
 ```rust
 pub fn walk(
    source: &str,
    bound: WalkBound,
    ctx: &mut WalkContext,
 ) -> WalkResult<'_>;
 pub enum WalkBound {
    EndOfInput,           // parse: walk all input
    Position(usize),      // complete / hint: walk up to cursor byte
 }
 pub struct WalkResult<'a> {
    pub outcome: WalkOutcome,
    pub matched_path: MatchedPath,
    pub per_byte_class: Vec<(ByteRange, HighlightClass)>,
 }
 pub enum WalkOutcome {
    Match { command_idx: usize },
    Incomplete { position: usize, expected: Vec<&'static Node> },
    Mismatch { position: usize, expected: Vec<&'static Node>, found_byte: u8 },
    ValidationFailed { position: usize, message_key: &'static str, args: Vec<(&'static str, String)> },
 }
 ```
 Consumers:
 - **Parse for dispatch.** `walk(source, EndOfInput, ctx)`. On
  `Match`, invoke `commands[command_idx].ast_builder(matched_path)`
  and dispatch the returned `Command`.
 - **Highlighting.** `walk(source, EndOfInput, ctx).per_byte_class`.
  Each terminal records `(byte_range, node.highlight_class())`
  as it matches. Unmatched ranges (past a failure) get the
  `tok_error` overlay.
 - **Completion at cursor.** `walk(source, Position(cursor), ctx)`,
  then inspect the `expected` nodes carried by an `Incomplete` or
  `Mismatch` outcome (the other variants carry none). Each
  expected `Node` contributes
  candidates: `Word` → its primary literal, `Ident { source }`
  → schema-cache lookup, `Flag` → `--name`, value-literal slot
  → type-appropriate hint per `HintMode`, etc.
 - **Hint panel ambient.** Same walk as completion. The hint
  resolver consults `WalkOutcome` variants plus the expected
  nodes' `HintMode` to choose between candidates rendering,
  prose, suppression, etc.
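
As a minimal sketch of how a consumer might branch on the outcome, the block below extracts the `expected` set for completion. It assumes a simplified `WalkOutcome` in which a hypothetical `NodeLabel` string stands in for `&'static Node` and the `args` field is dropped; `expected_at_cursor` is an illustrative helper, not part of the spec.

```rust
/// Hypothetical stand-in for `&'static Node`, to keep the sketch self-contained.
pub type NodeLabel = &'static str;

#[derive(Debug, PartialEq)]
pub enum WalkOutcome {
    Match { command_idx: usize },
    Incomplete { position: usize, expected: Vec<NodeLabel> },
    Mismatch { position: usize, expected: Vec<NodeLabel>, found_byte: u8 },
    ValidationFailed { position: usize, message_key: &'static str },
}

/// Completion only needs the `expected` set; variants without one yield an empty slice.
pub fn expected_at_cursor(outcome: &WalkOutcome) -> &[NodeLabel] {
    match outcome {
        WalkOutcome::Incomplete { expected, .. }
        | WalkOutcome::Mismatch { expected, .. } => expected.as_slice(),
        WalkOutcome::Match { .. } | WalkOutcome::ValidationFailed { .. } => &[],
    }
}
```

Keeping the variant match in one helper means completion and the hint resolver would not each need to know which variants carry an `expected` field.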
 ### Scannerless: no lexer module
 Terminal nodes consume bytes directly. No pre-pass produces a
 `Vec<Token>`. The walker's driver handles whitespace skipping
 between siblings of a `Seq` and dispatches to each terminal's
 `consume(source, position)` function.
 Character-level helpers (identifier shape, digit-sequence shape,
 quoted-string escape handling) live in
 `src/dsl/walker/lex_helpers.rs` — a small shared module used
 by the various terminal consume functions. This is internally
 similar to the current lexer's logic, but it's invoked per-position
 by the walker rather than as a pre-pass.
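
A rough sketch of what two such helpers could look like follows. The function names, the ASCII-only identifier shape, and the `Option<usize>` end-offset return are assumptions for illustration; the ADR does not fix these signatures.

```rust
/// Skip ASCII whitespace starting at `pos`; returns the offset of the first
/// non-whitespace byte (or the end of input).
pub fn skip_ws(source: &str, pos: usize) -> usize {
    let bytes = source.as_bytes();
    let mut i = pos;
    while i < bytes.len() && bytes[i].is_ascii_whitespace() {
        i += 1;
    }
    i
}

/// Consume an identifier-shaped run at `pos`: an ASCII letter or `_`, followed
/// by ASCII letters, digits, or `_`. Returns the end offset on success, or
/// `None` if no identifier starts at `pos`. (Assumed shape, ASCII only.)
pub fn consume_ident(source: &str, pos: usize) -> Option<usize> {
    let bytes = source.as_bytes();
    let first = *bytes.get(pos)?;
    if !(first.is_ascii_alphabetic() || first == b'_') {
        return None;
    }
    let mut end = pos + 1;
    while end < bytes.len() && (bytes[end].is_ascii_alphanumeric() || bytes[end] == b'_') {
        end += 1;
    }
    Some(end)
}
```

The driver would call `skip_ws` between siblings of a `Seq` and then hand the resulting position to the next terminal's consume function, so no token vector is ever materialised.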
 `src/dsl/lexer.rs` and `src/dsl/keyword.rs` are deleted in
 Phase F. The keyword vocabulary is no longer a Rust enum; each
 keyword exists as a `Word` node in the grammar declarations.
 ### Node taxonomy
 Thirteen node kinds. Three categories:
 **Terminals** (consume bytes):
 ```rust
 pub enum Node {
    Word {
        primary: &'static str,
        aliases: &'static [&'static str],
        // Default tok_keyword unless overridden.
        highlight_override: Option<HighlightClass>,
    },
    Punct(char),
    Ident {
        source: IdentSource,
        role: &'static str,
        highlight_override: Option<HighlightClass>,
    },
    NumberLit,
    StringLit,
    BlobLit,
    Flag(&'static str),
    BarePath,
    // Combinators ↓
 }
 ```
 **Combinators** (compose other nodes):
 ```rust
    Choice(&'static [Node]),
    Seq(&'static [Node]),
    Optional(&'static Node),
    Repeated {
        inner: &'static Node,
        separator: Option<&'static Node>,
        min: usize,
    },
 ```
 **Dynamic** (resolves at walk time using `WalkContext`):
 ```rust
    DynamicSubgrammar(fn(&WalkContext) -> Node),
 }
 ```
 `CommandNode` is the top-level entry record:
 ```rust
 pub struct CommandNode {
    pub entry: Word,
    pub shape: Node,                                  // usually a Seq
    pub ast_builder: fn(&MatchedPath) -> Command,
    pub dispatch: fn(&mut App, Command) -> Vec<Action>,
    pub help_id: Option<&'static str>,
    pub usage_id: Option<&'static str>,
    // Hint mode override at command level; nodes can carry their own too.
    pub hint_mode: Option<HintMode>,
 }
 pub const REGISTRY: &[CommandNode] = &[ /* ... */ ];
 ```
 ### Typed value slots
 Value-literal positions use typed slots built from terminals
 plus content validators. One slot factory per data type:
 ```rust
 fn int_slot()      -> Node { Choice(&[NumberLit_with(integer_only_validator), null_word()]) }
 fn real_slot()     -> Node { Choice(&[NumberLit, null_word()]) }
 fn decimal_slot()  -> Node { Choice(&[NumberLit_with(decimal_validator), null_word()]) }
 fn bool_slot()     -> Node { Choice(&[Word("true", &[]), Word("false", &[]), null_word()]) }
 fn text_slot()     -> Node { Choice(&[StringLit, null_word()]) }
 fn date_slot()     -> Node { Choice(&[StringLit_with(date_format_validator), null_word()]) }
 fn datetime_slot() -> Node { Choice(&[StringLit_with(datetime_format_validator), null_word()]) }
 fn blob_slot()     -> Node { Choice(&[BlobLit, null_word()]) }
 ```
 `StringLit_with(validator)` is a `StringLit` terminal carrying
 a content validator that runs after a successful match. Same
 for `NumberLit_with`. A failed validator surfaces as
 `WalkOutcome::ValidationFailed` with the validator's catalog
 key.
 `slot_for_type(ty: Type) -> Node` is the dispatcher: given a
 column type, returns the appropriate slot. Used by dynamic
 sub-grammars (see below).
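
To make the validator idea concrete, here is a hedged sketch of two content validators. The `Validator` signature, the catalog keys, and the exact date rule (shape plus coarse month/day ranges) are assumptions; the ADR only names the validators and says a failure surfaces the validator's catalog key.

```rust
/// Assumed validator shape: `Ok(())` on success, or the catalog key to report
/// through `WalkOutcome::ValidationFailed`.
pub type Validator = fn(&str) -> Result<(), &'static str>;

/// Accepts an optional leading `-` followed by one or more ASCII digits.
pub fn integer_only_validator(lit: &str) -> Result<(), &'static str> {
    let digits = lit.strip_prefix('-').unwrap_or(lit);
    if !digits.is_empty() && digits.bytes().all(|b| b.is_ascii_digit()) {
        Ok(())
    } else {
        Err("parse.value.integer_expected") // hypothetical catalog key
    }
}

/// Accepts `YYYY-MM-DD` with month 01..=12 and day 01..=31.
/// Calendar-accurate checks (days per month, leap years) are left out.
pub fn date_format_validator(lit: &str) -> Result<(), &'static str> {
    const KEY: &str = "parse.value.date_format"; // hypothetical catalog key
    let b = lit.as_bytes();
    let shape_ok = b.len() == 10
        && b[4] == b'-'
        && b[7] == b'-'
        && b.iter().enumerate().all(|(i, c)| i == 4 || i == 7 || c.is_ascii_digit());
    if !shape_ok {
        return Err(KEY);
    }
    // Safe to slice by byte offsets: every byte was checked to be ASCII above.
    let month: u32 = lit[5..7].parse().map_err(|_| KEY)?;
    let day: u32 = lit[8..10].parse().map_err(|_| KEY)?;
    if (1..=12).contains(&month) && (1..=31).contains(&day) {
        Ok(())
    } else {
        Err(KEY)
    }
}
```

Because the validator runs only after the terminal has matched, the walker can still report the matched byte range alongside the failure key.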
 ### `WalkContext`
 ```rust
 pub struct WalkContext<'a> {
    pub schema: &'a SchemaCache,
    // Current table inferred from the partial parse — e.g.,
    // `insert into Customers ...` sets `current_table = "Customers"`.
    pub current_table: Option<String>,
    // The columns of `current_table`, in declaration order, with types.
    // Populated by Ident { source: Tables } when it matches a
    // known table.
    pub current_table_columns: Option<Vec<ColumnInfo>>,
    // For comma-separated value lists, which position we're at.
    pub value_position: usize,
    // For `set` clauses and `where` clauses, the column whose value
    // we're about to consume.
    pub current_column: Option<ColumnInfo>,
 }
 ```
 Nodes can write to `WalkContext`:
 - `Ident { source: Tables, role: "table", writes_table: true }`
  on match sets `ctx.current_table` to the matched identifier
  and resolves `ctx.current_table_columns` from the schema.
 - `Ident { source: Columns, role: "column", writes_current_column: true }`
  on match sets `ctx.current_column` from the resolved column list.
 Nodes can read from `WalkContext`:
 - `DynamicSubgrammar(column_value_list)` reads
  `ctx.current_table_columns` and unfolds to a `Seq` of
  comma-separated typed slots — one per column.
 - The value slot after `set col=` reads `ctx.current_column.user_type`
  to pick the right typed slot.
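
The write side of this flow can be sketched as two small functions. This assumes a `HashMap`-backed stand-in for `SchemaCache`, a trimmed `WalkContext` (no `value_position`), and hypothetical hook names `on_table_matched` and `on_column_matched`; the real walker may attach this behaviour to the nodes differently.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Type { Int, Text }

#[derive(Clone, Debug, PartialEq)]
pub struct ColumnInfo { pub name: String, pub user_type: Type }

/// Stand-in for the schema cache: table name to columns in declaration order.
pub type SchemaCache = HashMap<String, Vec<ColumnInfo>>;

pub struct WalkContext<'a> {
    pub schema: &'a SchemaCache,
    pub current_table: Option<String>,
    pub current_table_columns: Option<Vec<ColumnInfo>>,
    pub current_column: Option<ColumnInfo>,
}

/// What a table-ident terminal might do on match. Returns `false` (and leaves
/// the context untouched) when the table is unknown.
pub fn on_table_matched(ctx: &mut WalkContext, ident: &str) -> bool {
    let schema = ctx.schema; // copy the shared reference out of the context
    match schema.get(ident) {
        Some(cols) => {
            ctx.current_table = Some(ident.to_string());
            ctx.current_table_columns = Some(cols.clone());
            true
        }
        None => false,
    }
}

/// What a column-ident terminal might do on match: resolve against the
/// columns of the current table and record the result.
pub fn on_column_matched(ctx: &mut WalkContext, ident: &str) -> bool {
    let found = ctx
        .current_table_columns
        .as_ref()
        .and_then(|cols| cols.iter().find(|c| c.name == ident).cloned());
    ctx.current_column = found;
    ctx.current_column.is_some()
}
```

A later value slot can then read `ctx.current_column` to pick its typed slot, which is the read side described above.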
 ### `WalkOutcome` and "expected"
 The walker keeps track of the longest prefix that matched and
 the position at which it failed (or completed). At a failure
 or incomplete position, `expected` is the set of nodes that
 could legally continue the walk — derived structurally from
 the trie, not from a separate "expected" table.
 For a `Seq` mid-walk, `expected` is the next child node.
 For a `Choice` that hasn't committed to a branch, `expected`
 is all children. For an `Optional` at a position where its
 inner could start, `expected` includes the inner plus the
 next sibling.
 This is the same information chumsky's
 `ParseError::Invalid::expected` carries today, sourced from
 the trie directly instead of via combinator introspection.
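
The structural derivation can be illustrated with a reduced node set. The sketch below assumes a four-variant `Node` (only `Word`, `Choice`, `Seq`, `Optional`) and reports expected terminals as their literals; `nullable`, `first`, and `expected_in_seq` are illustrative names, and the `show [all] (data | table)` grammar in the usage note is invented for the example.

```rust
pub enum Node {
    Word(&'static str),
    Choice(&'static [Node]),
    Seq(&'static [Node]),
    Optional(&'static Node),
}

/// Can this node match without consuming any input?
fn nullable(node: &Node) -> bool {
    match node {
        Node::Word(_) => false,
        Node::Choice(alts) => alts.iter().any(nullable),
        Node::Seq(items) => items.iter().all(nullable),
        Node::Optional(_) => true,
    }
}

/// Collect the terminals that could legally start `node`.
fn first(node: &'static Node, out: &mut Vec<&'static str>) {
    match node {
        Node::Word(w) => out.push(*w),
        Node::Choice(alts) => {
            for alt in alts.iter() {
                first(alt, out);
            }
        }
        Node::Seq(items) => expected_in_seq(*items, 0, out),
        Node::Optional(inner) => first(*inner, out),
    }
}

/// `expected` for a `Seq` whose children before `idx` have already matched:
/// the next child, plus following siblings for as long as children can be skipped.
pub fn expected_in_seq(items: &'static [Node], idx: usize, out: &mut Vec<&'static str>) {
    for item in &items[idx..] {
        first(item, out);
        if !nullable(item) {
            break;
        }
    }
}
```

For a hypothetical `Seq[Word("show"), Optional(Word("all")), Choice[Word("data"), Word("table")]]`, calling `expected_in_seq` with `idx = 1` collects `all`, `data`, and `table`: the optional's inner plus the next sibling, as described above.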
 ### `HintMode` per node
 Each node may carry a `HintMode`:
 ```rust
 pub enum HintMode {
    /// Candidates if any surface; else prose fallback.
    Default,
    /// Force the prose at this catalog key regardless of candidates.
    /// Used by NewName slots ("Type a name, then `(`").
    ForceProse(&'static str),
    /// Show only the prose; suppress Tab candidates.
    /// Used by typed value slots at empty prefix.
    ProseOnly(&'static str),
    /// Suppress prose; only candidates.
    SuppressProse,
 }
 ```
 The walker propagates each expected node's `HintMode` to the
 hint resolver, which dispatches accordingly.
 The current ad-hoc cases in `input_render.rs::ambient_hint`
 (value-literal slot suppression, NewName slot typing-name
 prose, invalid-ident overlay) migrate to node-attached
 `HintMode` annotations during Phase D.
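
One possible reading of the four modes, written as a resolver function, is sketched below. `HintDecision` and `resolve_hint` are hypothetical names, and the split between panel candidates and Tab candidates is an interpretation of the doc comments above rather than settled behaviour.

```rust
pub enum HintMode {
    Default,
    ForceProse(&'static str),
    ProseOnly(&'static str),
    SuppressProse,
}

/// Hypothetical result type: what the hint panel shows and what Tab may cycle.
#[derive(Debug, PartialEq)]
pub struct HintDecision {
    pub prose: Option<&'static str>,   // catalog key to render, if any
    pub panel_candidates: Vec<String>, // candidates listed in the panel
    pub tab_candidates: Vec<String>,   // candidates available to Tab
}

pub fn resolve_hint(
    mode: &HintMode,
    candidates: Vec<String>,
    fallback_prose: &'static str,
) -> HintDecision {
    match mode {
        // Candidates if any surface; else the prose fallback.
        HintMode::Default => HintDecision {
            prose: if candidates.is_empty() { Some(fallback_prose) } else { None },
            panel_candidates: candidates.clone(),
            tab_candidates: candidates,
        },
        // Prose wins in the panel; Tab candidates are left alone.
        HintMode::ForceProse(key) => HintDecision {
            prose: Some(*key),
            panel_candidates: Vec::new(),
            tab_candidates: candidates,
        },
        // Prose only; Tab candidates are suppressed as well.
        HintMode::ProseOnly(key) => HintDecision {
            prose: Some(*key),
            panel_candidates: Vec::new(),
            tab_candidates: Vec::new(),
        },
        // Candidates only; never fall back to prose.
        HintMode::SuppressProse => HintDecision {
            prose: None,
            panel_candidates: candidates.clone(),
            tab_candidates: candidates,
        },
    }
}
```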
 ### Ranker layer
 A ranker function runs between the walker's raw candidate
 output and the hint-panel renderer:
 ```rust
 pub type Ranker = fn(&WalkContext, Vec<Candidate>) -> Vec<Candidate>;
 pub fn identity_ranker(_: &WalkContext, c: Vec<Candidate>) -> Vec<Candidate> { c }
 ```
 Default is `identity_ranker` — declaration order from the
 trie is preserved. The signature allows future enhancements
 (frequency-based ranking, content-aware priors for type
 suggestions per column name) to plug in without changing
 grammar declarations.
 The ranker lives outside the trie. Grammar declarations are
 about *what's valid*; ranking is about *what's likely useful
 first*.
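
To show how a non-identity ranker could plug into the same signature, here is a small sketch. The `typed_prefix` field on `WalkContext`, the shape of `Candidate`, and `prefix_first_ranker` itself are assumptions made for the example; none of them is specified by this ADR.

```rust
/// Stand-in context; `typed_prefix` is a hypothetical field for this sketch.
pub struct WalkContext {
    pub typed_prefix: String,
}

#[derive(Clone, Debug, PartialEq)]
pub struct Candidate {
    pub text: String,
}

pub type Ranker = fn(&WalkContext, Vec<Candidate>) -> Vec<Candidate>;

pub fn identity_ranker(_: &WalkContext, c: Vec<Candidate>) -> Vec<Candidate> {
    c
}

/// Moves candidates that start with the typed prefix to the front.
/// `sort_by_key` is stable, so declaration order survives within each group.
pub fn prefix_first_ranker(ctx: &WalkContext, mut c: Vec<Candidate>) -> Vec<Candidate> {
    c.sort_by_key(|cand| !cand.text.starts_with(ctx.typed_prefix.as_str()));
    c
}
```

Either function can be stored in a `Ranker` variable, so swapping the ranking policy would not touch any grammar declaration.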
 ### Sub-grammars
 Two flavours, no global registry:
 **Static** — pure composition, function returning a const node:
 ```rust
 const fn qualified_column(role_table: &'static str, role_col: &'static str) -> Node {
    Seq(&[
        Ident { source: Tables, role: role_table, /* ... */ },
        Punct('.'),
        Ident { source: Columns, role: role_col, /* ... */ },
    ])
 }
 const fn where_clause() -> Node {
    Seq(&[
        Word { primary: "where", /* ... */ },
        Ident { source: Columns, role: "filter_column", /* ... */ },
        Punct('='),
        AnyValueSlot,
    ])
 }
 ```
 **Dynamic** — context-aware, expands at walk time:
 ```rust
 fn column_value_list(ctx: &WalkContext) -> Node {
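    // `as_deref` + `&[]` avoids borrowing a temporary `Vec` that would be dropped at the end of the statement.
    let cols: &[ColumnInfo] = ctx.current_table_columns.as_deref().unwrap_or(&[]);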
    let mut children: Vec<Node> = Vec::new();
    for (i, col) in cols.iter().enumerate() {
        if i > 0 { children.push(Punct(',')); }
        children.push(slot_for_type(col.user_type));
    }
    Seq(Box::leak(children.into_boxed_slice()))
 }
 ```
 Dynamic sub-grammars return owned `Node` values that the
 walker treats as inline expansions. The leak above is one
 implementation tactic — alternatively, the walker stores the
 expanded node in a small per-walk arena. Both work; pick at
 implementation time.
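
For reference, a self-contained version of the leak tactic is sketched below. It assumes a reduced `Node` and `Type`, takes the column slice directly instead of a `WalkContext`, and omits the `null` alternative from each slot; it shows only the leak variant, not the arena variant.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Type { Int, Text }

#[derive(Debug, PartialEq)]
pub enum Node {
    NumberLit,
    StringLit,
    Punct(char),
    Seq(&'static [Node]),
}

pub struct ColumnInfo { pub user_type: Type }

/// Reduced dispatcher: one terminal per type, no `null` alternative.
pub fn slot_for_type(ty: Type) -> Node {
    match ty {
        Type::Int => Node::NumberLit,
        Type::Text => Node::StringLit,
    }
}

/// Unfolds to a `Seq` of comma-separated typed slots, one per column.
pub fn column_value_list(cols: &[ColumnInfo]) -> Node {
    let mut children = Vec::with_capacity(cols.len() * 2);
    for (i, col) in cols.iter().enumerate() {
        if i > 0 {
            children.push(Node::Punct(','));
        }
        children.push(slot_for_type(col.user_type));
    }
    // Deliberately leaks one allocation per expansion so the slice can be
    // `'static`. With a walk per keystroke this grows without bound.
    Node::Seq(Box::leak(children.into_boxed_slice()))
}
```

Since the walker re-runs per keystroke, the leak variant allocates on every walk and never frees, which is the main argument for the per-walk arena if memory growth turns out to matter.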
 ### Aliases
 A `Word` node carries `primary` and an `aliases` slice. The
 walker matches input against either; completion surfaces only
 the primary; help text mentions aliases prose-style if
 appropriate. Highlight class is the same for both.
 Round 5's `q` removal is *not* reverted by this design. `q`
 stays gone — adding it back would now be the single line
 `aliases: &["q"]` on the `quit` `Word` node, and would not
 surface as a separate candidate in completion (matching the
 round-5 user request).
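
A minimal sketch of alias-aware matching, assuming a standalone `Word` struct, case-sensitive comparison, and an ASCII word-boundary rule (none of which the ADR pins down). The `q` alias appears only to illustrate the mechanism; as stated above, it is not being restored.

```rust
pub struct Word {
    pub primary: &'static str,
    pub aliases: &'static [&'static str],
}

impl Word {
    /// Returns the end offset if `primary` or any alias matches at `pos` and
    /// is followed by a word boundary (end of input or a non-identifier char).
    pub fn consume(&self, source: &str, pos: usize) -> Option<usize> {
        let rest = source.get(pos..)?;
        for lit in std::iter::once(self.primary).chain(self.aliases.iter().copied()) {
            if let Some(tail) = rest.strip_prefix(lit) {
                let at_boundary =
                    !tail.starts_with(|c: char| c.is_ascii_alphanumeric() || c == '_');
                if at_boundary {
                    return Some(pos + lit.len());
                }
            }
        }
        None
    }

    /// Completion surfaces only the primary literal, never the aliases.
    pub fn completion_candidate(&self) -> &'static str {
        self.primary
    }
}
```

With `Word { primary: "quit", aliases: &["q"] }`, both `quit` and `q` would parse, while completion would still offer only `quit`.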
 ### `IdentSource`
 Replaces the current `dsl::ident_slot::IdentSlot`:
 ```rust
 pub enum IdentSource {
    NewName,         // user invents; no schema lookup; ProseOnly hint
    Tables,          // existing table names
    Columns,         // existing column names (filtered by current table)
    Relationships,   // existing relationship names
    Types,           // closed set from Type::all()
 }
 ```
 `Types` is new — it replaces the magic-string `TYPE_SLOT_LABEL`
 used today. `src/dsl/ident_slot.rs` dissolves into
 `src/dsl/grammar/mod.rs`.
 ### Highlight class assignment
 Per-byte highlight class is computed as a side effect of the
 walk. Each terminal records `(byte_range, class)` in
 `WalkResult::per_byte_class` as it matches. Unmatched ranges
 (past a definite failure) get the `tok_error` overlay,
 identical to today's behaviour.
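
The error overlay step can be sketched as a small post-processing function. This assumes `std::ops::Range<usize>` in place of `ByteRange`, a reduced `HighlightClass`, and a hypothetical `with_error_overlay` helper that receives the failure position from the walk outcome.

```rust
use std::ops::Range;

#[derive(Clone, Copy, Debug, PartialEq)]
pub enum HighlightClass {
    Keyword,
    Identifier,
    Error, // corresponds to the `tok_error` overlay
}

/// Appends an error span covering everything from the failure position to the
/// end of the source. Spans recorded by matched terminals are kept as-is.
pub fn with_error_overlay(
    mut matched: Vec<(Range<usize>, HighlightClass)>,
    failure_at: Option<usize>,
    source_len: usize,
) -> Vec<(Range<usize>, HighlightClass)> {
    if let Some(pos) = failure_at {
        if pos < source_len {
            matched.push((pos..source_len, HighlightClass::Error));
        }
    }
    matched
}
```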
 Default classes per terminal kind:
 | Terminal | Default class |
 |---|---|
 | `Word` | `tok_keyword` |
 | `Punct` | `tok_punct` |
 | `Ident` | `tok_identifier` |
 | `NumberLit` | `tok_number` |
 | `StringLit` | `tok_string` |
 | `BlobLit` | `tok_string` |
 | `Flag` | `tok_flag` |
 | `BarePath` | `tok_string` |
 The `highlight_override: Option<HighlightClass>` field on
 `Word` and `Ident` is reserved for future per-slot variants
 (e.g., a Tables slot in a distinct shade vs a NewName slot
 muted) — left `None` everywhere in round 1.
 No new palette colours for the initial migration.
 ## Migration plan
 ### Code organisation
 ```
 src/dsl/
  grammar/
    mod.rs           — Node enum, IdentSource, HintMode, HighlightClass,
                       MatchedPath, CommandNode, REGISTRY top-level
    data.rs          — insert, update, delete, show
    ddl.rs           — create, drop, add, rename, change
    app.rs           — quit, help, save/save-as, new, load, rebuild,
                       export, import, mode, messages
    shared.rs        — typed value slots (int_slot, date_slot, …),
                       qualified_column, where_clause, action_keyword,
                       column_value_list (dynamic)
    validators.rs    — content validators (integer_only_validator,
                       date_format_validator, datetime_format_validator,
                       type_name_validator, …)
  walker/
    mod.rs           — public walk() entry; orchestration
    driver.rs        — the per-node-kind dispatch
    context.rs       — WalkContext
    outcome.rs       — WalkOutcome, MatchedPath, WalkResult
    lex_helpers.rs   — identifier-shape, digit-shape, string-escape
                       helpers; shared across terminal consume fns
  parser.rs          — Phase A: becomes a router. Phase F: deleted.
  lexer.rs           — Phase F: deleted.
  keyword.rs         — Phase F: deleted.
  ident_slot.rs      — Phase F: dissolved into grammar/mod.rs.
  usage.rs           — Phase F: REGISTRY deleted; the file may go.
 ```
 ### Six-phase migration
 **Phase A — Walker skeleton + app-lifecycle commands.**
 - Build the walker driver, `WalkContext`, `WalkOutcome`,
  `MatchedPath`, the terminal consume functions.
 - Migrate the app-lifecycle commands (no schema dependency,
  no value literals): quit, help, rebuild, save, save as, new,
  load, export, import, mode, messages.
 - Router in `parse_command` consults the walker for migrated
  commands; falls back to chumsky for the rest.
 - Differential test scaffolding: a test helper that, for every
  input in the existing test corpus, runs both parsers and
  asserts identical `Command` output where the input falls
  under a migrated command.
 Exit criteria: walker handles the app-lifecycle commands
 end-to-end; existing tests for those commands pass via the
 walker path; tests for other commands still pass via chumsky.
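
The router and the differential helper might look roughly like the sketch below. The two parsers are toy stand-ins (the real ones are the walker and the chumsky parser), `Command` is reduced to three variants, and `MIGRATED`, `is_migrated`, and `differential_check` are illustrative names.

```rust
#[derive(Debug, PartialEq)]
pub enum Command {
    Quit,
    Help,
    Other(String),
}

/// Entry words already handled by the walker (a Phase A subset, for illustration).
const MIGRATED: &[&str] = &["quit", "help"];

/// Toy stand-in for the walker path.
fn walker_parse(input: &str) -> Option<Command> {
    match input.trim() {
        "quit" => Some(Command::Quit),
        "help" => Some(Command::Help),
        _ => None,
    }
}

/// Toy stand-in for the legacy chumsky path.
fn legacy_parse(input: &str) -> Option<Command> {
    match input.trim() {
        "" => None,
        "quit" => Some(Command::Quit),
        "help" => Some(Command::Help),
        other => Some(Command::Other(other.to_string())),
    }
}

fn is_migrated(input: &str) -> bool {
    input.split_whitespace().next().map_or(false, |w| MIGRATED.contains(&w))
}

/// Router: migrated commands go to the walker, everything else to the legacy parser.
pub fn parse_command(input: &str) -> Option<Command> {
    if is_migrated(input) { walker_parse(input) } else { legacy_parse(input) }
}

/// Runs both parsers on every corpus input under a migrated command and
/// reports the first divergence.
pub fn differential_check(corpus: &[&str]) -> Result<(), String> {
    for input in corpus.iter().copied().filter(|i| is_migrated(i)) {
        let (new, old) = (walker_parse(input), legacy_parse(input));
        if new != old {
            return Err(format!("divergence on {input:?}: walker={new:?}, legacy={old:?}"));
        }
    }
    Ok(())
}
```

In this toy setup, an input such as `quit now` routes to the walker but parses differently under the two stand-ins, so `differential_check` reports it; that is the kind of subtle divergence the scaffolding is meant to catch.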
 **Phase B — DDL commands without value literals.**
 - drop table, drop column, drop relationship.
 - rename column.
 - add column (without the value-literal aspect — type slot
  uses `Ident { source: Types }` with a content validator).
 - add 1:n relationship (referential clauses as a static
  sub-grammar).
 - change column (type slot + flags).
 These exercise schema lookups via `Ident { source: Tables }`
 and `Ident { source: Columns }`, and the `Types` source. No
 typed value slots yet, no `DynamicSubgrammar`.
 Exit criteria: all DDL commands except `create table` pass
 via the walker; the rest still pass via chumsky.
 **Phase C — `create table` with column-list value literals.**
 - The `with pk` clause uses `Repeated` for the column-spec
  list, each spec being a `Seq(Ident{NewName}, Punct(':'),
  Ident{Types}-with-validator)`.
 - First test of `Repeated` with separator.
 Exit criteria: create table works end-to-end via the walker.
 **Phase D — data commands with full schema awareness.**
 - show data, show table.
 - insert: uses `DynamicSubgrammar(column_value_list)` for the
  comma-separated typed value list. Exercises full
  `WalkContext` propagation: `Ident { source: Tables, role:
  "table", writes_table: true }` resolves the column list;
  the dynamic sub-grammar unfolds typed slots per column.
 - update: `set` clauses use `DynamicSubgrammar` to resolve the
  value slot's type from the column. `where` clause uses the
  shared sub-grammar with `AnyValueSlot` (or, optionally, also
  column-typed if the column resolves cleanly).
 - delete: same `where` clause; otherwise simple.
 This is the phase that proves the design's central claim:
 typed slots, dynamic sub-grammars, and schema-aware narrowing
 all collaborate to produce a single coherent grammar
 declaration per command.
 Exit criteria: all data commands pass via the walker; the
 round-5 limitations close automatically (save Tab can offer
 `as`, value slots narrow by column type).
 **Phase E — replay end-to-end.**
 - replay uses `BarePath` + `StringLit` (quoted form).
 - Internally replays each line through the same dispatch
  pipeline.
 Exit criteria: replay works end-to-end via the walker; nested
 replay rejection still fires from the runtime, unchanged.
 **Phase F — cleanup.**
 - Delete `dsl/parser.rs`.
 - Delete `dsl/lexer.rs`.
 - Delete `dsl/keyword.rs`.
 - Delete `dsl/ident_slot.rs` (already merged into
  `grammar/mod.rs` in Phase A).
 - Delete `dsl/usage.rs::REGISTRY`.
 - Delete `chumsky` dependency from `Cargo.toml`.
 - Delete `parse.token.keyword.*` entries from the catalog and
  `keys.rs` that the walker doesn't need (the keyword
  vocabulary is implicit in the grammar nodes).
 - Remove the differential test scaffolding from Phase A.
 Exit criteria: working tree clean of legacy parser code;
 test suite still all-green; `cargo clippy --all-targets --
 -D warnings` passes; `cargo build --release` binary not
 noticeably larger.
 ### Test discipline
 Three guarantees throughout migration:
 1. **Full test suite green at every commit.** Migration is
   per-command; tests are per-behaviour. They don't care
   which parser produces a `Command` — they assert input →
   expected output. If a test fails mid-migration, the
   walker hasn't reproduced behaviour; fix the walker
   before continuing.
 2. **Walker-specific tests for trie-only features.** Schema-
   aware narrowing, `WalkContext` propagation, dynamic sub-
   grammar expansion, `HintMode` per-node behaviour, the
   round-5 "save Tab offers as" gap-closing — each gets new
   tests as the feature lands.
 3. **Differential check during the migration window.** A
   test helper iterates the existing input corpus, runs both
   parsers on inputs that fall under a migrated command, and
   asserts identical `Command` output. Cheap insurance
   against subtle divergence. Removed at Phase F cleanup.
 ### Cleanup pass at Phase F
 Beyond deleting the legacy modules, Phase F includes catalog
 cleanup. The `parse.token.keyword.*` entries (40+ of them) are
 near-mechanical wrappers (`create: "\`create\`"`); with no
 external code looking up these keys (the walker renders
 keyword names from `Word` node literals directly), the
 entries can collapse. A small `format_keyword_for_error(literal)
 -> String` helper replaces them. The `keys.rs` declarations
 go with them.
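
Assuming the existing entries really are uniform backtick wrappers as described, the replacement helper could be as small as the sketch below. The `&str` parameter type is an assumption; the text above gives only the helper's name and its `String` return.

```rust
/// Wraps a keyword literal in backticks for error messages, mirroring what the
/// `parse.token.keyword.*` catalog entries do today.
pub fn format_keyword_for_error(literal: &str) -> String {
    format!("`{literal}`")
}
```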
 Help text in `help.cli_banner` and `help.in_app_body` stays
 as hand-written prose — the alternative (auto-generating from
 the grammar) was deferred during the round-6 discussion as a
 separate concern; the grammar tree carries enough metadata
 (per-command `help_id`) for future automation but the prose
 documentation is still hand-curated for round 1.
 ## Consequences
 ### What we gain
 - **One declaration per command.** Entry keyword, shape, AST
  builder, dispatch handler, usage reference, help reference
  all colocated. Adding a command is one block in one file.
 - **No cross-file scatter.** The round-5 "10 places to remove
  `q`" critique is structurally addressed: there's nowhere
  else for keyword/usage/registry info to live but the
  grammar tree.
 - **Schema-aware narrowing from day one.** Typed value slots
  reject mis-shaped input at parse time with localised error
  wording; completion narrows per column type; the round-5
  value-literal slot hint becomes type-specific
  ("Type a date as 'YYYY-MM-DD'") not generic.
 - **Aliases as a single annotation.** `q` could come back as
  one line on the `quit` `Word` node, no scatter.
 - **Tests focus on behaviour, not enumeration.** Tests that
  hardcoded keyword lists during round 5 (we noted these in
  `usage.rs` and `completion.rs`) can iterate the trie
  registry instead, becoming structural rather than
  literal.
 - **Drift is structurally impossible.** Completion, highlight,
  parse, usage, and help all derive from the same trie. No
  separate sources to keep in sync.
 ### What we accept
 - **Parse depends on schema state.** A DSL command that
  references a non-existent table fails at parse time, not
  at execute time as today. This matches the user mental
  model when typing (the schema cache is current per
  ADR-0022) and yields better completion / hint
  experience. It does mean tests that exercised parser
  behaviour in isolation may now need to set up a schema
  cache.
 - **chumsky's general-purpose features go unused.** Recovery
  on ambiguous input, multi-error reporting in a single
  pass, ambiguous-grammar handling — features chumsky offers
  but our DSL doesn't use. The trade is fine because our
  grammar is deterministic.
 - **Some implementation complexity moves into the walker.**
  Whitespace skipping between siblings, terminal consume
  functions, character-level shape recognition — the lexer
  did some of this implicitly; the walker does it
  explicitly. Net code is comparable or smaller because the
  scatter cost goes away.
 ### What's out of scope for this ADR
 - **External tooling integration (LSP, editor extensions).**
  The registry is `pub` and accessible via accessor
  functions, so future tooling work doesn't fight this design.
  No tooling is built in round 1.
 - **Help text auto-generation.** Grammar tree carries
  `help_id` per node, but the help catalog body stays
  hand-curated.
 - **Performance optimisation.** Walker re-runs per keystroke
  for completion + highlighting. Naïve implementation is
  acceptable; if hot-path concerns emerge later, caching /
  incremental walks become a separate ADR.
 - **Ranker implementations.** The ranker hook exists; default
  is identity. Frequency-based ranking, content-aware priors
  for type completion ("Email → text first, Score → real"),
  recency — all future work that plugs into the ranker
  signature without touching grammar declarations.
 - **Per-slot highlight overrides.** The `highlight_override`
  field exists but stays `None` in round 1. Differentiating
  table-ident from new-name-ident visually is a future
  enhancement.
 ## References
 - ADR-0023 — Unified declarative grammar tree (Proposed direction). Superseded by this ADR for execution detail.
 - ADR-0001 — Language and TUI framework (chumsky choice). Phase F removes the chumsky dependency.
 - ADR-0019 — Friendly error layer and i18n catalog. Catalog conventions stay; `parse.token.keyword.*` entries collapse in Phase F.
 - ADR-0020 — Tokenization layer for the DSL parser. Superseded by the scannerless walker.
 - ADR-0021 — Parser-as-source-of-truth for H1a. Usage info migrates from a separate registry to grammar nodes.
 - ADR-0022 — Ambient typing assistance. The walker subsumes the expected-set introspection that powered completion in that ADR.
 - Round-6 session transcript — design pass that produced this spec.
@@ -28,4 +28,5 @@ This directory contains the project's ADRs, recorded per
 - [ADR-0020 — Tokenization layer for the DSL parser](0020-tokenization-layer-for-the-dsl-parser.md)
 - [ADR-0021 — Parser-as-source-of-truth for H1a (per-command usage in parse errors)](0021-parser-as-source-of-truth-for-h1a.md)
 - [ADR-0022 — Ambient typing assistance: colour, hint panel, completion (I3 + I4)](0022-ambient-typing-assistance.md)
- [ADR-0023 — Unified declarative grammar tree](0023-proposed-unified-grammar-tree.md) — **Proposed** (researched direction, not yet accepted)
+- [ADR-0023 — Unified declarative grammar tree](0023-unified-grammar-tree.md) — direction (superseded for execution detail by ADR-0024)
 - [ADR-0024 — Unified grammar tree: execution plan](0024-unified-grammar-tree-execution-plan.md) — **Accepted**, the executable spec