diff --git a/docs/adr/0023-proposed-unified-grammar-tree.md b/docs/adr/0023-unified-grammar-tree.md similarity index 94% rename from docs/adr/0023-proposed-unified-grammar-tree.md rename to docs/adr/0023-unified-grammar-tree.md index 5d2e825..e648d1f 100644 --- a/docs/adr/0023-proposed-unified-grammar-tree.md +++ b/docs/adr/0023-unified-grammar-tree.md @@ -1,21 +1,24 @@ -# ADR-0023: Unified declarative grammar tree (proposed direction) +# ADR-0023: Unified declarative grammar tree (direction) ## Status -**Proposed.** +**Accepted in direction, superseded for execution detail by +ADR-0024.** 2026-05-14. -Not yet accepted. Captures a researched direction for a future -refactor that supersedes the parts of ADR-0001 (chumsky as the -DSL parser), ADR-0019 (separated catalog declaration), ADR-0020 -(lexer + keyword macro), ADR-0021 (per-command usage registry), -and ADR-0022 (completion via expected-set introspection, -highlighting via lexer) that this ADR identifies as accreted -rather than designed. +This ADR captures the architectural critique (the "10-place +edit" scatter problem with the current parser shape) and the +direction (a unified declarative grammar tree). The round-6 +design pass turned that direction into a concrete specification, +which ships as ADR-0024. ADR-0024 makes some refinements +beyond what's sketched here — notably the decision to drop the +lexer module entirely (scannerless walker) and to put schema- +aware narrowing into round 1 rather than phasing it. Read +ADR-0024 for the executable plan; this ADR remains for the +institutional memory of why the change is happening. -Filename carries the `-proposed-` segment so the status is -visible at directory listing time; on acceptance, rename to -`0023-unified-grammar-tree.md` via `git mv` (history -preserved). +The filename was renamed from `0023-proposed-unified-grammar-tree.md` +to `0023-unified-grammar-tree.md` when the direction was +accepted. History is preserved through the `git mv`. 
## Context diff --git a/docs/adr/0024-unified-grammar-tree-execution-plan.md b/docs/adr/0024-unified-grammar-tree-execution-plan.md new file mode 100644 index 0000000..af71c70 --- /dev/null +++ b/docs/adr/0024-unified-grammar-tree-execution-plan.md @@ -0,0 +1,701 @@ +# ADR-0024: Unified grammar tree — execution plan + +## Status + +**Accepted.** 2026-05-14. + +Concrete specification for the direction proposed in ADR-0023. +Where ADR-0023 captured the critique of the current parser +shape and the high-level vision, this ADR specifies the data +model, walker semantics, migration sequence, and cleanup steps +in enough detail that implementation can proceed without +further design decisions. + +Supersedes ADR-0023's "Proposed" status. ADR-0023 stays in +the directory as institutional memory of why this change is +happening; ADR-0024 is what gets built. + +## Context + +The design pass that landed in the round-6 session (2026-05-14) +worked through ADR-0023's open questions and a number of +implicit decisions that hadn't been written down. Four rounds +of questions, each followed by user confirmation: + +1. **Round 1 — foundational.** Registry shape, node taxonomy, + AST output model, failure / "expected" semantics, walker + API and its mapping to parse / complete / highlight / hint + concerns. +2. **Round 2 — concrete representation.** Multi-keyword + sequences, sub-grammar reusability (static and dynamic), + path-bearing commands, bare-or-with-suffix commands. +3. **Round 3 — organisation and migration.** Module layout, + per-command migration strategy, test discipline during + migration. +4. **Round 4 — smaller details.** Aliases on keyword nodes, + `IdentSlot` fate, highlight palette, external-tooling + exposure. + +Two larger decisions emerged from the rounds and shifted the +shape from ADR-0023's sketch: + +- **The lexer dissolves.** The walker operates directly on + source bytes ("scannerless"). 
The current `dsl/lexer.rs` + module's responsibilities (whitespace skipping, token shape + recognition, byte-span tracking) migrate into terminal-node + consume functions and the walker driver. The `define_keywords!` + macro is no longer needed in its current form; keyword + literals live on `Word` nodes in the grammar. +- **Schema-aware parse from day one.** ADR-0023 had been + cautious about coupling parse to schema state. The round-1 + / round-2 discussion concluded that this caution comes from + general-purpose parser tooling and doesn't apply to an + interactive DSL editor where the schema *is* the context. + Typed value slots consult the schema during parse; bind-time + type checks remain but become belt-and-braces rather than + the primary defense. + +A separate critique surfaced in the design pass: my (Claude's) +default pull toward "what's the safe incremental version of +what general-purpose parser tooling does" repeatedly fought +against the project owner's cleaner direct design. The pull +is now explicitly resisted — this ADR ships the direct design, +not a phased compromise. + +## Decision summary + +A single trie data structure declared in Rust serves as the +authority for parsing, completion, syntax highlighting, parse- +error usage rendering, hint-panel content, and (eventually) +external-tooling exposure. The walker that consumes this trie +operates directly on source bytes — no separate lexer pass. +Schema-aware narrowing flows naturally from the trie's +structure: typed value slots and dynamic sub-grammars consult +a per-walk context that carries the current table, the +resolved column types, and a reference to the schema cache. + +Migration is per-command across six phases. The legacy +chumsky parser and the new walker run side-by-side during the +transition; existing behavioural tests guard regressions. +Phase F removes chumsky, the lexer module, the separate +`UsageEntry` registry, and the expected-set introspection +in `completion.rs`. 
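The decision summary's central claim, that one declared tree can answer both "does this parse?" and "what could come next?", can be illustrated with a deliberately reduced sketch. Everything named here (`MiniNode`, `Outcome`, `parse`, `candidates`, `SAVE_SHAPE`) is hypothetical and far smaller than the taxonomy this ADR specifies: only `Word`, `Seq`, and `Optional` exist, whitespace is skipped inside the `Word` terminal rather than by a driver, and there is no schema context.

```rust
/// Hypothetical reduced node set (an assumption, not the ADR's thirteen kinds).
pub enum MiniNode {
    Word(&'static str),
    Seq(&'static [MiniNode]),
    Optional(&'static MiniNode),
}

#[derive(Debug, PartialEq)]
pub enum Outcome {
    /// Whole input consumed and the shape is satisfied.
    Match,
    /// Input ended early; these literals could legally come next.
    Incomplete { expected: Vec<&'static str> },
    /// The byte at `position` cannot continue the walk.
    Mismatch { position: usize, expected: Vec<&'static str> },
}

/// One declaration; both consumers below read it.
pub static SAVE_SHAPE: MiniNode = MiniNode::Seq(&[
    MiniNode::Word("save"),
    MiniNode::Optional(&MiniNode::Word("as")),
]);

fn skip_ws(src: &str, pos: usize) -> usize {
    let bytes = src.as_bytes();
    let mut i = pos;
    while i < bytes.len() && bytes[i].is_ascii_whitespace() {
        i += 1;
    }
    i
}

struct Walk<'s> {
    src: &'s str,
    /// Literals that failed to match at the furthest position reached.
    expected: Vec<&'static str>,
    at: usize,
}

impl<'s> Walk<'s> {
    fn note(&mut self, pos: usize, lit: &'static str) {
        if pos > self.at {
            self.expected.clear();
            self.at = pos;
        }
        if pos == self.at {
            self.expected.push(lit);
        }
    }

    /// Returns the position after `node` on success, `None` on failure.
    fn step(&mut self, node: &MiniNode, pos: usize) -> Option<usize> {
        match node {
            MiniNode::Word(lit) => {
                let lit = *lit;
                let start = skip_ws(self.src, pos);
                let end = start + lit.len();
                // A keyword must be followed by whitespace or end of input.
                let boundary = self.src.as_bytes().get(end).map_or(true, |b| b.is_ascii_whitespace());
                if self.src[start..].starts_with(lit) && boundary {
                    Some(end)
                } else {
                    self.note(start, lit);
                    None
                }
            }
            MiniNode::Seq(children) => {
                let mut p = pos;
                for child in children.iter() {
                    p = self.step(child, p)?;
                }
                Some(p)
            }
            // An optional that fails consumes nothing but still records what it wanted.
            MiniNode::Optional(inner) => Some(self.step(inner, pos).unwrap_or(pos)),
        }
    }
}

/// Consumer 1: parse for dispatch.
pub fn parse(shape: &MiniNode, src: &str) -> Outcome {
    let mut w = Walk { src, expected: Vec::new(), at: 0 };
    match w.step(shape, 0).map(|p| skip_ws(src, p)) {
        Some(p) if p == src.len() => Outcome::Match,
        Some(p) => {
            let expected = if w.at == p { w.expected } else { Vec::new() };
            Outcome::Mismatch { position: p, expected }
        }
        None if w.at == src.len() => Outcome::Incomplete { expected: w.expected },
        None => Outcome::Mismatch { position: w.at, expected: w.expected },
    }
}

/// Consumer 2: completion candidates at end of input, from the same tree.
pub fn candidates(shape: &MiniNode, src: &str) -> Vec<&'static str> {
    let mut w = Walk { src, expected: Vec::new(), at: 0 };
    let _ = w.step(shape, 0);
    if w.at == src.len() { w.expected } else { Vec::new() }
}
```

In this sketch `parse(&SAVE_SHAPE, "save")` is a match while `candidates(&SAVE_SHAPE, "save ")` still offers `as`, because the optional node records what it would have accepted. The real walker would additionally carry `WalkContext`, highlight classes, and prefix filtering, none of which are modelled here.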
+ +Estimated total cost: ~4 sessions — one to land the framework +and migrate Phase A, two for Phases B-D, one for Phases E + F. + +## Architecture + +### Walker as single source of truth + +```rust +pub fn walk( + source: &str, + bound: WalkBound, + ctx: &mut WalkContext, +) -> WalkResult<'_>; + +pub enum WalkBound { + EndOfInput, // parse: walk all input + Position(usize), // complete / hint: walk up to cursor byte +} + +pub struct WalkResult<'a> { + pub outcome: WalkOutcome, + pub matched_path: MatchedPath, + pub per_byte_class: Vec<(ByteRange, HighlightClass)>, +} + +pub enum WalkOutcome { + Match { command_idx: usize }, + Incomplete { position: usize, expected: Vec<&'static Node> }, + Mismatch { position: usize, expected: Vec<&'static Node>, found_byte: u8 }, + ValidationFailed { position: usize, message_key: &'static str, args: Vec<(&'static str, String)> }, +} +``` + +Consumers: + +- **Parse for dispatch.** `walk(source, EndOfInput, ctx)`. On + `Match`, invoke `commands[command_idx].ast_builder(matched_path)` + and dispatch the returned `Command`. +- **Highlighting.** `walk(source, EndOfInput, ctx).per_byte_class`. + Each terminal records `(byte_range, node.highlight_class())` + as it matches. Unmatched ranges (past a failure) get the + `tok_error` overlay. +- **Completion at cursor.** `walk(source, Position(cursor), ctx)`, + inspect `outcome.expected`. Each expected `Node` contributes + candidates: `Word` → its primary literal, `Ident { source }` + → schema-cache lookup, `Flag` → `--name`, value-literal slot + → type-appropriate hint per `HintMode`, etc. +- **Hint panel ambient.** Same walk as completion. The hint + resolver consults `WalkOutcome` variants plus the expected + nodes' `HintMode` to choose between candidates rendering, + prose, suppression, etc. + +### Scannerless: no lexer module + +Terminal nodes consume bytes directly. No pre-pass produces a +`Vec`. 
The walker's driver handles whitespace skipping +between siblings of a `Seq` and dispatches to each terminal's +`consume(source, position)` function. + +Character-level helpers (identifier shape, digit-sequence shape, +quoted-string escape handling) live in +`src/dsl/walker/lex_helpers.rs` — a small shared module used +by the various terminal consume functions. This is internally +similar to the current lexer's logic, but it's invoked per-position +by the walker rather than as a pre-pass. + +`src/dsl/lexer.rs` and `src/dsl/keyword.rs` are deleted in +Phase F. The keyword vocabulary is no longer a Rust enum; each +keyword exists as a `Word` node in the grammar declarations. + +### Node taxonomy + +Thirteen node kinds. Three categories: + +**Terminals** (consume bytes): + +```rust +pub enum Node { + Word { + primary: &'static str, + aliases: &'static [&'static str], + // Default tok_keyword unless overridden. + highlight_override: Option, + }, + Punct(char), + Ident { + source: IdentSource, + role: &'static str, + highlight_override: Option, + }, + NumberLit, + StringLit, + BlobLit, + Flag(&'static str), + BarePath, + // Combinators ↓ +} +``` + +**Combinators** (compose other nodes): + +```rust + Choice(&'static [Node]), + Seq(&'static [Node]), + Optional(&'static Node), + Repeated { + inner: &'static Node, + separator: Option<&'static Node>, + min: usize, + }, +``` + +**Dynamic** (resolves at walk time using `WalkContext`): + +```rust + DynamicSubgrammar(fn(&WalkContext) -> Node), +} +``` + +`CommandNode` is the top-level entry record: + +```rust +pub struct CommandNode { + pub entry: Word, + pub shape: Node, // usually a Seq + pub ast_builder: fn(&MatchedPath) -> Command, + pub dispatch: fn(&mut App, Command) -> Vec, + pub help_id: Option<&'static str>, + pub usage_id: Option<&'static str>, + // Hint mode override at command level; nodes can carry their own too. + pub hint_mode: Option, +} + +pub const REGISTRY: &[CommandNode] = &[ /* ... 
*/ ]; +``` + +### Typed value slots + +Value-literal positions use typed slots built from terminals +plus content validators. One slot factory per data type: + +```rust +fn int_slot() -> Node { Choice(&[NumberLit_with(integer_only_validator), null_word()]) } +fn real_slot() -> Node { Choice(&[NumberLit, null_word()]) } +fn decimal_slot() -> Node { Choice(&[NumberLit_with(decimal_validator), null_word()]) } +fn bool_slot() -> Node { Choice(&[Word("true", &[]), Word("false", &[]), null_word()]) } +fn text_slot() -> Node { Choice(&[StringLit, null_word()]) } +fn date_slot() -> Node { Choice(&[StringLit_with(date_format_validator), null_word()]) } +fn datetime_slot() -> Node { Choice(&[StringLit_with(datetime_format_validator), null_word()]) } +fn blob_slot() -> Node { Choice(&[BlobLit, null_word()]) } +``` + +`StringLit_with(validator)` is a `StringLit` terminal carrying +a content validator that runs after a successful match. Same +for `NumberLit_with`. A failed validator surfaces as +`WalkOutcome::ValidationFailed` with the validator's catalog +key. + +`slot_for_type(ty: Type) -> Node` is the dispatcher: given a +column type, returns the appropriate slot. Used by dynamic +sub-grammars (see below). + +### `WalkContext` + +```rust +pub struct WalkContext<'a> { + pub schema: &'a SchemaCache, + // Current table inferred from the partial parse — e.g., + // `insert into Customers ...` sets `current_table = "Customers"`. + pub current_table: Option, + // The columns of `current_table`, in declaration order, with types. + // Populated by Ident { source: Tables } when it matches a + // known table. + pub current_table_columns: Option>, + // For comma-separated value lists, which position we're at. + pub value_position: usize, + // For `set` clauses and `where` clauses, the column whose value + // we're about to consume. 
+ pub current_column: Option, +} +``` + +Nodes can write to `WalkContext`: + +- `Ident { source: Tables, role: "table", writes_table: true }` + on match sets `ctx.current_table` to the matched identifier + and resolves `ctx.current_table_columns` from the schema. +- `Ident { source: Columns, role: "column", writes_current_column: true }` + on match sets `ctx.current_column` from the resolved column list. + +Nodes can read from `WalkContext`: + +- `DynamicSubgrammar(column_value_list)` reads + `ctx.current_table_columns` and unfolds to a `Seq` of + comma-separated typed slots — one per column. +- The value slot after `set col=` reads `ctx.current_column.user_type` + to pick the right typed slot. + +### `WalkOutcome` and "expected" + +The walker keeps track of the longest prefix that matched and +the position at which it failed (or completed). At a failure +or incomplete position, `expected` is the set of nodes that +could legally continue the walk — derived structurally from +the trie, not from a separate "expected" table. + +For a `Seq` mid-walk, `expected` is the next child node. +For a `Choice` that hasn't committed to a branch, `expected` +is all children. For an `Optional` at a position where its +inner could start, `expected` includes the inner plus the +next sibling. + +This is the same information chumsky's +`ParseError::Invalid::expected` carries today, sourced from +the trie directly instead of via combinator introspection. + +### `HintMode` per node + +Each node may carry a `HintMode`: + +```rust +pub enum HintMode { + /// Candidates if any surface; else prose fallback. + Default, + /// Force the prose at this catalog key regardless of candidates. + /// Used by NewName slots ("Type a name, then `(`"). + ForceProse(&'static str), + /// Show only the prose; suppress Tab candidates. + /// Used by typed value slots at empty prefix. + ProseOnly(&'static str), + /// Suppress prose; only candidates. 
+ SuppressProse, +} +``` + +The walker propagates each expected node's `HintMode` to the +hint resolver, which dispatches accordingly. + +The current ad-hoc cases in `input_render.rs::ambient_hint` +(value-literal slot suppression, NewName slot typing-name +prose, invalid-ident overlay) migrate to node-attached +`HintMode` annotations during Phase D. + +### Ranker layer + +A ranker function runs between the walker's raw candidate +output and the hint-panel renderer: + +```rust +pub type Ranker = fn(&WalkContext, Vec) -> Vec; + +pub fn identity_ranker(_: &WalkContext, c: Vec) -> Vec { c } +``` + +Default is `identity_ranker` — declaration order from the +trie is preserved. The signature allows future enhancements +(frequency-based ranking, content-aware priors for type +suggestions per column name) to plug in without changing +grammar declarations. + +The ranker lives outside the trie. Grammar declarations are +about *what's valid*; ranking is about *what's likely useful +first*. + +### Sub-grammars + +Two flavours, no global registry: + +**Static** — pure composition, function returning a const node: + +```rust +const fn qualified_column(role_table: &'static str, role_col: &'static str) -> Node { + Seq(&[ + Ident { source: Tables, role: role_table, /* ... */ }, + Punct('.'), + Ident { source: Columns, role: role_col, /* ... */ }, + ]) +} + +const fn where_clause() -> Node { + Seq(&[ + Word { primary: "where", /* ... */ }, + Ident { source: Columns, role: "filter_column", /* ... 
*/ }, + Punct('='), + AnyValueSlot, + ]) +} +``` + +**Dynamic** — context-aware, expands at walk time: + +```rust +fn column_value_list(ctx: &WalkContext) -> Node { + let cols = ctx.current_table_columns.as_ref().unwrap_or(&Vec::new()); + let mut children: Vec = Vec::new(); + for (i, col) in cols.iter().enumerate() { + if i > 0 { children.push(Punct(',')); } + children.push(slot_for_type(col.user_type)); + } + Seq(Box::leak(children.into_boxed_slice())) +} +``` + +Dynamic sub-grammars return owned `Node` values that the +walker treats as inline expansions. The leak above is one +implementation tactic — alternatively, the walker stores the +expanded node in a small per-walk arena. Both work; pick at +implementation time. + +### Aliases + +A `Word` node carries `primary` and an `aliases` slice. The +walker matches input against either; completion surfaces only +the primary; help text mentions aliases prose-style if +appropriate. Highlight class is the same for both. + +Round 5's `q` removal is *not* reverted by this design. `q` +stays gone — adding it back would now be the single line +`aliases: &["q"]` on the `quit` `Word` node, and would not +surface as a separate candidate in completion (matching the +round-5 user request). + +### `IdentSource` + +Replaces the current `dsl::ident_slot::IdentSlot`: + +```rust +pub enum IdentSource { + NewName, // user invents; no schema lookup; ProseOnly hint + Tables, // existing table names + Columns, // existing column names (filtered by current table) + Relationships, // existing relationship names + Types, // closed set from Type::all() +} +``` + +`Types` is new — it replaces the magic-string `TYPE_SLOT_LABEL` +used today. `src/dsl/ident_slot.rs` dissolves into +`src/dsl/grammar/mod.rs`. + +### Highlight class assignment + +Per-byte highlight class is computed as a side effect of the +walk. Each terminal records `(byte_range, class)` in +`WalkResult::per_byte_class` as it matches. 
Unmatched ranges +(past a definite failure) get the `tok_error` overlay, +identical to today's behaviour. + +Default classes per terminal kind: + +| Terminal | Default class | +|---|---| +| `Word` | `tok_keyword` | +| `Punct` | `tok_punct` | +| `Ident` | `tok_identifier` | +| `NumberLit` | `tok_number` | +| `StringLit` | `tok_string` | +| `BlobLit` | `tok_string` | +| `Flag` | `tok_flag` | +| `BarePath` | `tok_string` | + +The `highlight_override: Option` field on +`Word` and `Ident` is reserved for future per-slot variants +(e.g., a Tables slot in a distinct shade vs a NewName slot +muted) — left `None` everywhere in round 1. + +No new palette colours for the initial migration. + +## Migration plan + +### Code organisation + +``` +src/dsl/ + grammar/ + mod.rs — Node enum, IdentSource, HintMode, HighlightClass, + MatchedPath, CommandNode, REGISTRY top-level + data.rs — insert, update, delete, show + ddl.rs — create, drop, add, rename, change + app.rs — quit, help, save/save-as, new, load, rebuild, + export, import, mode, messages + shared.rs — typed value slots (int_slot, date_slot, …), + qualified_column, where_clause, action_keyword, + column_value_list (dynamic) + validators.rs — content validators (integer_only_validator, + date_format_validator, datetime_format_validator, + type_name_validator, …) + walker/ + mod.rs — public walk() entry; orchestration + driver.rs — the per-node-kind dispatch + context.rs — WalkContext + outcome.rs — WalkOutcome, MatchedPath, WalkResult + lex_helpers.rs — identifier-shape, digit-shape, string-escape + helpers; shared across terminal consume fns + parser.rs — Phase A: becomes a router. Phase F: deleted. + lexer.rs — Phase F: deleted. + keyword.rs — Phase F: deleted. + ident_slot.rs — Phase F: dissolved into grammar/mod.rs. + usage.rs — Phase F: REGISTRY deleted; the file may go. 
+``` + +### Six-phase migration + +**Phase A — Walker skeleton + app-lifecycle commands.** + +- Build the walker driver, `WalkContext`, `WalkOutcome`, + `MatchedPath`, the terminal consume functions. +- Migrate the app-lifecycle commands (no schema dependency, + no value literals): quit, help, rebuild, save, save as, new, + load, export, import, mode, messages. +- Router in `parse_command` consults the walker for migrated + commands; falls back to chumsky for the rest. +- Differential test scaffolding: a test helper that, for every + input in the existing test corpus, runs both parsers and + asserts identical `Command` output where the input falls + under a migrated command. + +Exit criteria: walker handles the app-lifecycle commands +end-to-end; existing tests for those commands pass via the +walker path; tests for other commands still pass via chumsky. + +**Phase B — DDL commands without value literals.** + +- drop table, drop column, drop relationship. +- rename column. +- add column (without the value-literal aspect — type slot + uses `Ident { source: Types }` with a content validator). +- add 1:n relationship (referential clauses as a static + sub-grammar). +- change column (type slot + flags). + +These exercise schema lookups via `Ident { source: Tables }` +and `Ident { source: Columns }`, and the `Types` source. No +typed value slots yet, no `DynamicSubgrammar`. + +Exit criteria: all DDL commands except `create table` pass +via the walker; the rest still pass via chumsky. + +**Phase C — `create table` with column-list value literals.** + +- The `with pk` clause uses `Repeated` for the column-spec + list, each spec being a `Seq(Ident{NewName}, Punct(':'), + Ident{Types}-with-validator)`. +- First test of `Repeated` with separator. + +Exit criteria: create table works end-to-end via the walker. + +**Phase D — data commands with full schema awareness.** + +- show data, show table, replay. 
+- insert: uses `DynamicSubgrammar(column_value_list)` for the + comma-separated typed value list. Exercises full + `WalkContext` propagation: `Ident { source: Tables, role: + "table", writes_table: true }` resolves the column list; + the dynamic sub-grammar unfolds typed slots per column. +- update: `set` clauses use `DynamicSubgrammar` to resolve the + value slot's type from the column. `where` clause uses the + shared sub-grammar with `AnyValueSlot` (or, optionally, also + column-typed if the column resolves cleanly). +- delete: same `where` clause; otherwise simple. + +This is the phase that proves the design's central claim: +typed slots, dynamic sub-grammars, and schema-aware narrowing +all collaborate to produce a single coherent grammar +declaration per command. + +Exit criteria: all data commands pass via the walker; the +round-5 limitations close automatically (save Tab can offer +`as`, value slots narrow by column type). + +**Phase E — replay end-to-end.** + +- replay uses `BarePath` + `StringLit` (quoted form). +- Internally replays each line through the same dispatch + pipeline. + +Exit criteria: replay works end-to-end via the walker; nested +replay rejection still fires from the runtime, unchanged. + +**Phase F — cleanup.** + +- Delete `dsl/parser.rs`. +- Delete `dsl/lexer.rs`. +- Delete `dsl/keyword.rs`. +- Delete `dsl/ident_slot.rs` (already merged into + `grammar/mod.rs` in Phase A). +- Delete `dsl/usage.rs::REGISTRY`. +- Delete `chumsky` dependency from `Cargo.toml`. +- Delete `parse.token.keyword.*` entries from the catalog and + `keys.rs` that the walker doesn't need (the keyword + vocabulary is implicit in the grammar nodes). +- Remove the differential test scaffolding from Phase A. + +Exit criteria: working tree clean of legacy parser code; +test suite still all-green; `cargo clippy --all-targets -- +-D warnings` passes; `cargo build --release` binary not +noticeably larger. 
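The router and the differential scaffolding that Phase A introduces and Phase F removes might look roughly like the following. This is a minimal sketch under stated assumptions: `walker_parse` and `legacy_parse` are stand-ins for the new walker and the chumsky parser, the `Command` enum is a placeholder, and deciding "migrated" by entry word (`MIGRATED`, `is_migrated`) is an assumed routing rule, not something this ADR prescribes.

```rust
/// Placeholder command type (assumption; the real `Command` is richer).
#[derive(Debug, Clone, PartialEq)]
pub enum Command {
    Quit,
    Help,
    Other(String),
}

/// Entry words already migrated to the walker (hypothetical list).
const MIGRATED: &[&str] = &["quit", "help"];

pub fn is_migrated(src: &str) -> bool {
    src.split_whitespace()
        .next()
        .map_or(false, |word| MIGRATED.contains(&word))
}

/// Stand-in for the new walker: only understands migrated commands.
pub fn walker_parse(src: &str) -> Option<Command> {
    match src.trim() {
        "quit" => Some(Command::Quit),
        "help" => Some(Command::Help),
        _ => None,
    }
}

/// Stand-in for the legacy parser: understands everything non-empty.
pub fn legacy_parse(src: &str) -> Option<Command> {
    match src.trim() {
        "" => None,
        "quit" => Some(Command::Quit),
        "help" => Some(Command::Help),
        other => Some(Command::Other(other.to_string())),
    }
}

/// Router: migrated commands go to the walker, the rest to the legacy parser.
pub fn parse_command(src: &str) -> Option<Command> {
    if is_migrated(src) {
        walker_parse(src)
    } else {
        legacy_parse(src)
    }
}

/// Differential check: returns every migrated input on which the two
/// parsers disagree. An empty result means no divergence was observed.
pub fn differential_mismatches<'a>(corpus: &[&'a str]) -> Vec<&'a str> {
    corpus
        .iter()
        .copied()
        .filter(|src| is_migrated(src) && walker_parse(src) != legacy_parse(src))
        .collect()
}
```

A test helper would feed the existing input corpus to `differential_mismatches` and assert the result is empty. Deleting this helper together with the legacy stand-in is what the Phase F cleanup item refers to.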
+ +### Test discipline + +Three guarantees throughout migration: + +1. **Full test suite green at every commit.** Migration is + per-command; tests are per-behaviour. They don't care + which parser produces a `Command` — they assert input → + expected output. If a test fails mid-migration, the + walker hasn't reproduced behaviour; fix the walker + before continuing. +2. **Walker-specific tests for trie-only features.** Schema- + aware narrowing, `WalkContext` propagation, dynamic sub- + grammar expansion, `HintMode` per-node behaviour, the + round-5 "save Tab offers as" gap-closing — each gets new + tests as the feature lands. +3. **Differential check during the migration window.** A + test helper iterates the existing input corpus, runs both + parsers on inputs that fall under a migrated command, and + asserts identical `Command` output. Cheap insurance + against subtle divergence. Removed at Phase F cleanup. + +### Cleanup pass at Phase F + +Beyond deleting the legacy modules, Phase F includes catalog +cleanup. The `parse.token.keyword.*` entries (40+ of them) are +near-mechanical wrappers (`create: "\`create\`"`); with no +external code looking up these keys (the walker renders +keyword names from `Word` node literals directly), the +entries can collapse. A small `format_keyword_for_error(literal) +-> String` helper replaces them. The `keys.rs` declarations +go with them. + +Help text in `help.cli_banner` and `help.in_app_body` stays +as hand-written prose — the alternative (auto-generating from +the grammar) was deferred during the round-6 discussion as a +separate concern; the grammar tree carries enough metadata +(per-command `help_id`) for future automation but the prose +documentation is still hand-curated for round 1. + +## Consequences + +### What we gain + +- **One declaration per command.** Entry keyword, shape, AST + builder, dispatch handler, usage reference, help reference + all colocated. Adding a command is one block in one file. 
+- **No cross-file scatter.** The round-5 "10 places to remove + `q`" critique is structurally addressed: there's nowhere + else for keyword/usage/registry info to live but the + grammar tree. +- **Schema-aware narrowing from day one.** Typed value slots + reject mis-shaped input at parse time with localised error + wording; completion narrows per column type; the round-5 + value-literal slot hint becomes type-specific + ("Type a date as 'YYYY-MM-DD'") not generic. +- **Aliases as a single annotation.** `q` could come back as + one line on the `quit` `Word` node, no scatter. +- **Tests focus on behaviour, not enumeration.** Tests that + hardcoded keyword lists during round 5 (we noted these in + `usage.rs` and `completion.rs`) can iterate the trie + registry instead, becoming structural rather than + literal. +- **Drift is structurally impossible.** Completion, highlight, + parse, usage, and help all derive from the same trie. No + separate sources to keep in sync. + +### What we accept + +- **Parse depends on schema state.** A DSL command that + references a non-existent table fails at parse time, not + at execute time as today. This matches the user mental + model when typing (the schema cache is current per + ADR-0022) and yields better completion / hint + experience. It does mean tests that exercised parser + behaviour in isolation may now need to set up a schema + cache. +- **chumsky's general-purpose features go unused.** Recovery + on ambiguous input, multi-error reporting in a single + pass, ambiguous-grammar handling — features chumsky offers + but our DSL doesn't use. The trade is fine because our + grammar is deterministic. +- **Some implementation complexity moves into the walker.** + Whitespace skipping between siblings, terminal consume + functions, character-level shape recognition — the lexer + did some of this implicitly; the walker does it + explicitly. Net code is comparable or smaller because the + scatter cost goes away. 
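The explicit walker-side work named in the last point (whitespace skipping, terminal consume functions, character-level shape recognition) could be sketched as below. The function names, the `Span` alias, the ASCII-only identifier shape, and the exact-case keyword match are all assumptions made for illustration; the ADR only states that such helpers live in `lex_helpers.rs` and are called per position.

```rust
/// Byte range of a matched terminal: start inclusive, end exclusive.
pub type Span = (usize, usize);

/// Skip ASCII whitespace from `pos`; returns the first non-whitespace offset.
pub fn skip_ws(src: &str, pos: usize) -> usize {
    let bytes = src.as_bytes();
    let mut i = pos;
    while i < bytes.len() && bytes[i].is_ascii_whitespace() {
        i += 1;
    }
    i
}

fn is_ident_continue(b: u8) -> bool {
    b.is_ascii_alphanumeric() || b == b'_'
}

/// Identifier shape (assumed): ASCII letter or `_`, then letters, digits, `_`.
pub fn consume_ident(src: &str, pos: usize) -> Option<Span> {
    let bytes = src.as_bytes();
    let first = *bytes.get(pos)?;
    if !(first.is_ascii_alphabetic() || first == b'_') {
        return None;
    }
    let mut end = pos + 1;
    while end < bytes.len() && is_ident_continue(bytes[end]) {
        end += 1;
    }
    Some((pos, end))
}

/// Digit-sequence shape (assumed): one or more ASCII digits.
pub fn consume_digits(src: &str, pos: usize) -> Option<Span> {
    let bytes = src.as_bytes();
    let mut end = pos;
    while end < bytes.len() && bytes[end].is_ascii_digit() {
        end += 1;
    }
    if end > pos { Some((pos, end)) } else { None }
}

/// Keyword terminal: matches `primary` or any alias, and requires that the
/// literal is not immediately followed by an identifier character.
pub fn consume_word(src: &str, pos: usize, primary: &str, aliases: &[&str]) -> Option<Span> {
    let rest = src.get(pos..)?;
    std::iter::once(primary)
        .chain(aliases.iter().copied())
        .find_map(|lit| {
            let end = pos + lit.len();
            let boundary = src
                .as_bytes()
                .get(end)
                .map_or(true, |b| !is_ident_continue(*b));
            (rest.starts_with(lit) && boundary).then_some((pos, end))
        })
}
```

A driver would call `skip_ws` between siblings of a `Seq` and then hand the resulting offset to the terminal's consume function. Because each helper returns a byte span, the same call can feed both the matched path and the per-byte highlight record.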
+ +### What's out of scope for this ADR + +- **External tooling integration (LSP, editor extensions).** + The registry is `pub` and accessible via accessor + functions, so future tooling work doesn't fight this design. + No tooling is built in round 1. +- **Help text auto-generation.** Grammar tree carries + `help_id` per node, but the help catalog body stays + hand-curated. +- **Performance optimisation.** Walker re-runs per keystroke + for completion + highlighting. Naïve implementation is + acceptable; if hot-path concerns emerge later, caching / + incremental walks become a separate ADR. +- **Ranker implementations.** The ranker hook exists; default + is identity. Frequency-based ranking, content-aware priors + for type completion ("Email → text first, Score → real"), + recency — all future work that plugs into the ranker + signature without touching grammar declarations. +- **Per-slot highlight overrides.** The `highlight_override` + field exists but stays `None` in round 1. Differentiating + table-ident from new-name-ident visually is a future + enhancement. + +## References + +- ADR-0023 — Unified declarative grammar tree (Proposed direction). Superseded by this ADR for execution detail. +- ADR-0001 — Language and TUI framework (chumsky choice). Phase F removes the chumsky dependency. +- ADR-0019 — Friendly error layer and i18n catalog. Catalog conventions stay; `parse.token.keyword.*` entries collapse in Phase F. +- ADR-0020 — Tokenization layer for the DSL parser. Superseded by the scannerless walker. +- ADR-0021 — Parser-as-source-of-truth for H1a. Usage info migrates from a separate registry to grammar nodes. +- ADR-0022 — Ambient typing assistance. The walker subsumes the expected-set introspection that powered completion in that ADR. +- Round-6 session transcript — design pass that produced this spec. 
diff --git a/docs/adr/README.md b/docs/adr/README.md index 4a410f8..b3dfada 100644 --- a/docs/adr/README.md +++ b/docs/adr/README.md @@ -28,4 +28,5 @@ This directory contains the project's ADRs, recorded per - [ADR-0020 — Tokenization layer for the DSL parser](0020-tokenization-layer-for-the-dsl-parser.md) - [ADR-0021 — Parser-as-source-of-truth for H1a (per-command usage in parse errors)](0021-parser-as-source-of-truth-for-h1a.md) - [ADR-0022 — Ambient typing assistance: colour, hint panel, completion (I3 + I4)](0022-ambient-typing-assistance.md) -- [ADR-0023 — Unified declarative grammar tree](0023-proposed-unified-grammar-tree.md) — **Proposed** (researched direction, not yet accepted) +- [ADR-0023 — Unified declarative grammar tree](0023-unified-grammar-tree.md) — direction (superseded for execution detail by ADR-0024) +- [ADR-0024 — Unified grammar tree: execution plan](0024-unified-grammar-tree-execution-plan.md) — **Accepted**, the executable spec