add ADR-0024: unified grammar tree execution plan (accepted)

Concrete specification for the direction in ADR-0023, landed
during the round-6 design pass. Resolves all four rounds of
open design questions: walker as single source of truth,
scannerless terminal vocabulary (~8 building blocks), typed
value slots with content validators, WalkContext for schema-
aware narrowing from day one, WalkOutcome multi-purpose
return, HintMode per-node, ranker as separate layer, static
+ dynamic sub-grammars, aliases as Word annotations,
IdentSource taxonomy, six-phase per-command migration with
chumsky and walker side-by-side during the transition.

Key shifts from ADR-0023's sketch:

- Lexer dissolves entirely. Walker operates on bytes directly.
  dsl/lexer.rs, dsl/keyword.rs go away in Phase F.
- Schema-aware parse from day one (not phased). Typed value
  slots reject mis-shaped input at parse time with localised
  wording. Completion narrows per column type.
- Sub-grammars: static (fn() -> Node) for composition;
  dynamic (fn(&WalkContext) -> Node) for schema-dependent
  expansion. No global named registry.
- Path-bearing commands: BarePath becomes a routine
  non-whitespace terminal. Paths with spaces require quoting
  via StringLit (UX simplification, aligns with standard CLI
  convention).
- 13-node taxonomy: Word, Punct, Ident, NumberLit, StringLit,
  BlobLit, Flag, BarePath, Choice, Seq, Optional, Repeated,
  DynamicSubgrammar.

Migration plan: Phase A (walker scaffolding + app-lifecycle
commands), Phase B (DDL without value literals), Phase C
(create table), Phase D (data commands with full schema
awareness -- the design's central claim landing), Phase E
(replay), Phase F (delete chumsky + lexer + legacy parser
modules, simplify catalog). Estimated ~4 sessions total.

Also: rename ADR-0023 from 0023-proposed-unified-grammar-tree.md
to 0023-unified-grammar-tree.md (git mv preserves history)
and update its status to reflect the direction-accepted-but-
superseded-for-execution-detail relationship with ADR-0024.
Index updated.
claude@clouddev1
2026-05-14 21:52:10 +00:00
parent 3b36bbb4d6
commit 74c3ec1edf
3 changed files with 719 additions and 14 deletions
@@ -1,21 +1,24 @@
-# ADR-0023: Unified declarative grammar tree (proposed direction)
+# ADR-0023: Unified declarative grammar tree (direction)
 ## Status
-**Proposed.**
+**Accepted in direction, superseded for execution detail by
+ADR-0024.** 2026-05-14.
-Not yet accepted. Captures a researched direction for a future
-refactor that supersedes the parts of ADR-0001 (chumsky as the
-DSL parser), ADR-0019 (separated catalog declaration), ADR-0020
-(lexer + keyword macro), ADR-0021 (per-command usage registry),
-and ADR-0022 (completion via expected-set introspection,
-highlighting via lexer) that this ADR identifies as accreted
-rather than designed.
+This ADR captures the architectural critique (the "10-place
+edit" scatter problem with the current parser shape) and the
+direction (a unified declarative grammar tree). The round-6
+design pass turned that direction into a concrete specification,
+which ships as ADR-0024. ADR-0024 makes some refinements
+beyond what's sketched here — notably the decision to drop the
+lexer module entirely (scannerless walker) and to put schema-
+aware narrowing into round 1 rather than phasing it. Read
+ADR-0024 for the executable plan; this ADR remains for the
+institutional memory of why the change is happening.
-Filename carries the `-proposed-` segment so the status is
-visible at directory listing time; on acceptance, rename to
-`0023-unified-grammar-tree.md` via `git mv` (history
-preserved).
+The filename was renamed from `0023-proposed-unified-grammar-tree.md`
+to `0023-unified-grammar-tree.md` when the direction was
+accepted. History is preserved through the `git mv`.
 ## Context
@@ -0,0 +1,701 @@
# ADR-0024: Unified grammar tree — execution plan
## Status
**Accepted.** 2026-05-14.
Concrete specification for the direction proposed in ADR-0023.
Where ADR-0023 captured the critique of the current parser
shape and the high-level vision, this ADR specifies the data
model, walker semantics, migration sequence, and cleanup steps
in enough detail that implementation can proceed without
further design decisions.
Supersedes ADR-0023's "Proposed" status. ADR-0023 stays in
the directory as institutional memory of why this change is
happening; ADR-0024 is what gets built.
## Context
The design pass that landed in the round-6 session (2026-05-14)
worked through ADR-0023's open questions and a number of
implicit decisions that hadn't been written down. It ran as four
rounds of questions, each followed by user confirmation:
1. **Round 1 — foundational.** Registry shape, node taxonomy,
AST output model, failure / "expected" semantics, walker
API and its mapping to parse / complete / highlight / hint
concerns.
2. **Round 2 — concrete representation.** Multi-keyword
sequences, sub-grammar reusability (static and dynamic),
path-bearing commands, bare-or-with-suffix commands.
3. **Round 3 — organisation and migration.** Module layout,
per-command migration strategy, test discipline during
migration.
4. **Round 4 — smaller details.** Aliases on keyword nodes,
`IdentSlot` fate, highlight palette, external-tooling
exposure.
Two larger decisions emerged from the rounds and shifted the
shape from ADR-0023's sketch:
- **The lexer dissolves.** The walker operates directly on
source bytes ("scannerless"). The current `dsl/lexer.rs`
module's responsibilities (whitespace skipping, token shape
recognition, byte-span tracking) migrate into terminal-node
consume functions and the walker driver. The `define_keywords!`
macro is no longer needed in its current form; keyword
literals live on `Word` nodes in the grammar.
- **Schema-aware parse from day one.** ADR-0023 had been
cautious about coupling parse to schema state. The round-1
/ round-2 discussion concluded that this caution comes from
general-purpose parser tooling and doesn't apply to an
interactive DSL editor where the schema *is* the context.
Typed value slots consult the schema during parse; bind-time
type checks remain but become belt-and-braces rather than
the primary defense.
A separate critique surfaced in the design pass: my (Claude's)
default pull toward "what's the safe incremental version of
what general-purpose parser tooling does" repeatedly fought
against the project owner's cleaner direct design. The pull
is now explicitly resisted — this ADR ships the direct design,
not a phased compromise.
## Decision summary
A single trie data structure declared in Rust serves as the
authority for parsing, completion, syntax highlighting, parse-
error usage rendering, hint-panel content, and (eventually)
external-tooling exposure. The walker that consumes this trie
operates directly on source bytes — no separate lexer pass.
Schema-aware narrowing flows naturally from the trie's
structure: typed value slots and dynamic sub-grammars consult
a per-walk context that carries the current table, the
resolved column types, and a reference to the schema cache.
Migration is per-command across six phases. The legacy
chumsky parser and the new walker run side-by-side during the
transition; existing behavioural tests guard regressions.
Phase F removes chumsky, the lexer module, the separate
`UsageEntry` registry, and the expected-set introspection
in `completion.rs`.
Estimated total cost: ~4 sessions — one to land the framework
and migrate Phase A, two for Phases B-D, one for Phases E + F.
## Architecture
### Walker as single source of truth
```rust
// `'a` ties the result to `source`: the matched path is assumed to
// borrow slices of the input rather than copy them.
pub fn walk<'a>(
    source: &'a str,
    bound: WalkBound,
    ctx: &mut WalkContext<'_>,
) -> WalkResult<'a>;

pub enum WalkBound {
    EndOfInput,      // parse: walk all input
    Position(usize), // complete / hint: walk up to cursor byte
}

pub struct WalkResult<'a> {
    pub outcome: WalkOutcome,
    pub matched_path: MatchedPath<'a>,
    pub per_byte_class: Vec<(ByteRange, HighlightClass)>,
}

pub enum WalkOutcome {
    Match { command_idx: usize },
    Incomplete { position: usize, expected: Vec<&'static Node> },
    Mismatch { position: usize, expected: Vec<&'static Node>, found_byte: u8 },
    ValidationFailed {
        position: usize,
        message_key: &'static str,
        args: Vec<(&'static str, String)>,
    },
}
```
Consumers:
- **Parse for dispatch.** `walk(source, EndOfInput, ctx)`. On
`Match`, invoke `(REGISTRY[command_idx].ast_builder)(&matched_path)`
and dispatch the returned `Command`.
- **Highlighting.** `walk(source, EndOfInput, ctx).per_byte_class`.
Each terminal records `(byte_range, node.highlight_class())`
as it matches. Unmatched ranges (past a failure) get the
`tok_error` overlay.
- **Completion at cursor.** `walk(source, Position(cursor), ctx)`,
inspect `outcome.expected`. Each expected `Node` contributes
candidates: `Word` → its primary literal, `Ident { source }`
→ schema-cache lookup, `Flag` → `--name`, value-literal slot
→ type-appropriate hint per `HintMode`, etc.
- **Hint panel ambient.** Same walk as completion. The hint
resolver consults `WalkOutcome` variants plus the expected
nodes' `HintMode` to choose between candidates rendering,
prose, suppression, etc.
### Scannerless: no lexer module
Terminal nodes consume bytes directly. No pre-pass produces a
`Vec<Token>`. The walker's driver handles whitespace skipping
between siblings of a `Seq` and dispatches to each terminal's
`consume(source, position)` function.
Character-level helpers (identifier shape, digit-sequence shape,
quoted-string escape handling) live in
`src/dsl/walker/lex_helpers.rs` — a small shared module used
by the various terminal consume functions. This is internally
similar to the current lexer's logic, but it's invoked per-position
by the walker rather than as a pre-pass.
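A minimal sketch of what per-position terminal consumption could look like, assuming ASCII whitespace, case-sensitive keywords, and an identifier shape of `[A-Za-z_][A-Za-z0-9_]*`. None of these rules are fixed by this ADR, and the function names below are illustrative rather than the actual `lex_helpers.rs` API.

```rust
/// Skip ASCII whitespace starting at `pos`; returns the first
/// non-whitespace offset (the driver would call this between siblings).
fn skip_ws(source: &str, pos: usize) -> usize {
    let bytes = source.as_bytes();
    let mut i = pos;
    while i < bytes.len() && bytes[i].is_ascii_whitespace() {
        i += 1;
    }
    i
}

/// Identifier shape assumed for this sketch: [A-Za-z_][A-Za-z0-9_]*.
/// Returns the end offset of the match, or None if nothing matches at `pos`.
fn consume_ident(source: &str, pos: usize) -> Option<usize> {
    let bytes = source.as_bytes();
    let first = *bytes.get(pos)?;
    if !(first.is_ascii_alphabetic() || first == b'_') {
        return None;
    }
    let mut end = pos + 1;
    while end < bytes.len() && (bytes[end].is_ascii_alphanumeric() || bytes[end] == b'_') {
        end += 1;
    }
    Some(end)
}

/// Keyword match: the whole identifier-shaped run must equal `literal`,
/// so `loader` does not match the keyword `load`.
fn consume_word(source: &str, pos: usize, literal: &str) -> Option<usize> {
    let end = consume_ident(source, pos)?;
    if &source[pos..end] == literal {
        Some(end)
    } else {
        None
    }
}

/// Walk the fixed shape `<word> <ident>` with no token pre-pass,
/// returning the identifier text on success.
fn walk_word_then_ident<'a>(source: &'a str, word: &str) -> Option<&'a str> {
    let p = skip_ws(source, 0);
    let p = consume_word(source, p, word)?;
    let p = skip_ws(source, p);
    let end = consume_ident(source, p)?;
    Some(&source[p..end])
}
```

Each helper takes `(source, position)` and returns the new position, so byte spans fall out of the walk without a `Vec<Token>` in between.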
`src/dsl/lexer.rs` and `src/dsl/keyword.rs` are deleted in
Phase F. The keyword vocabulary is no longer a Rust enum; each
keyword exists as a `Word` node in the grammar declarations.
### Node taxonomy
Thirteen node kinds. Three categories:
**Terminals** (consume bytes):
```rust
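pub enum Node {
    Word {
        primary: &'static str,
        aliases: &'static [&'static str],
        // Default tok_keyword unless overridden.
        highlight_override: Option<HighlightClass>,
    },
    Punct(char),
    Ident {
        source: IdentSource,
        role: &'static str,
        highlight_override: Option<HighlightClass>,
        // Context-write flags, as used in the `WalkContext` section.
        writes_table: bool,
        writes_current_column: bool,
    },
    NumberLit,
    StringLit,
    BlobLit,
    Flag(&'static str),
    BarePath,
    // Combinators and the dynamic variant follow in the next two
    // blocks; the enum's closing brace is in the last one.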
```
**Combinators** (compose other nodes):
```rust
    Choice(&'static [Node]),
    Seq(&'static [Node]),
    Optional(&'static Node),
    Repeated {
        inner: &'static Node,
        separator: Option<&'static Node>,
        min: usize,
    },
```
**Dynamic** (resolves at walk time using `WalkContext`):
```rust
    DynamicSubgrammar(fn(&WalkContext) -> Node),
}
```
`CommandNode` is the top-level entry record:
```rust
pub struct CommandNode {
    // Always a `Node::Word`; enum variants are not types in Rust,
    // so the field is typed as `Node`.
    pub entry: Node,
    pub shape: Node, // usually a Seq
    pub ast_builder: fn(&MatchedPath) -> Command,
    pub dispatch: fn(&mut App, Command) -> Vec<Action>,
    pub help_id: Option<&'static str>,
    pub usage_id: Option<&'static str>,
    // Hint mode override at command level; nodes can carry their own too.
    pub hint_mode: Option<HintMode>,
}

pub const REGISTRY: &[CommandNode] = &[ /* ... */ ];
```
### Typed value slots
Value-literal positions use typed slots built from terminals
plus content validators. One slot factory per data type:
```rust
fn int_slot() -> Node { Choice(&[NumberLit_with(integer_only_validator), null_word()]) }
fn real_slot() -> Node { Choice(&[NumberLit, null_word()]) }
fn decimal_slot() -> Node { Choice(&[NumberLit_with(decimal_validator), null_word()]) }
fn bool_slot() -> Node { Choice(&[Word("true", &[]), Word("false", &[]), null_word()]) }
fn text_slot() -> Node { Choice(&[StringLit, null_word()]) }
fn date_slot() -> Node { Choice(&[StringLit_with(date_format_validator), null_word()]) }
fn datetime_slot() -> Node { Choice(&[StringLit_with(datetime_format_validator), null_word()]) }
fn blob_slot() -> Node { Choice(&[BlobLit, null_word()]) }
```
`StringLit_with(validator)` is a `StringLit` terminal carrying
a content validator that runs after a successful match. Same
for `NumberLit_with`. A failed validator surfaces as
`WalkOutcome::ValidationFailed` with the validator's catalog
key.
`slot_for_type(ty: Type) -> Node` is the dispatcher: given a
column type, returns the appropriate slot. Used by dynamic
sub-grammars (see below).
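As an illustration of the validator contract described above, here is a hedged sketch of two content validators. It assumes a validator is a plain function from the matched text to either success or a catalog key. The `Validator` alias and both catalog keys are hypothetical, and the real validators may check more, for example days per month.

```rust
/// Assumed shape of a content validator: runs on the text a terminal
/// already matched and returns a catalog key on failure.
type Validator = fn(&str) -> Result<(), &'static str>;

// Hypothetical catalog keys, for illustration only.
const KEY_INTEGER: &str = "parse.value.expected_integer";
const KEY_DATE: &str = "parse.value.expected_date";

/// Accepts an optional leading '-' followed by one or more ASCII digits.
fn integer_only_validator(text: &str) -> Result<(), &'static str> {
    let digits = text.strip_prefix('-').unwrap_or(text);
    if !digits.is_empty() && digits.bytes().all(|b| b.is_ascii_digit()) {
        Ok(())
    } else {
        Err(KEY_INTEGER)
    }
}

/// Accepts the YYYY-MM-DD shape with month 01..=12 and day 01..=31.
/// It does not check month lengths or leap years.
fn date_format_validator(text: &str) -> Result<(), &'static str> {
    let b = text.as_bytes();
    let shape_ok = b.len() == 10
        && b[4] == b'-'
        && b[7] == b'-'
        && b.iter()
            .enumerate()
            .all(|(i, c)| i == 4 || i == 7 || c.is_ascii_digit());
    if !shape_ok {
        return Err(KEY_DATE);
    }
    // Safe to slice and parse: every byte is an ASCII digit or '-'.
    let month: u32 = text[5..7].parse().unwrap();
    let day: u32 = text[8..10].parse().unwrap();
    if (1..=12).contains(&month) && (1..=31).contains(&day) {
        Ok(())
    } else {
        Err(KEY_DATE)
    }
}

/// Run a validator after a successful terminal match.
fn validate(text: &str, validator: Validator) -> Result<(), &'static str> {
    validator(text)
}
```

A failing result here is what the walker would wrap into `WalkOutcome::ValidationFailed`, carrying the returned key as `message_key`.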
### `WalkContext`
```rust
pub struct WalkContext<'a> {
    pub schema: &'a SchemaCache,
    // Current table inferred from the partial parse — e.g.,
    // `insert into Customers ...` sets `current_table = "Customers"`.
    pub current_table: Option<String>,
    // The columns of `current_table`, in declaration order, with types.
    // Populated by Ident { source: Tables } when it matches a
    // known table.
    pub current_table_columns: Option<Vec<ColumnInfo>>,
    // For comma-separated value lists, which position we're at.
    pub value_position: usize,
    // For `set` clauses and `where` clauses, the column whose value
    // we're about to consume.
    pub current_column: Option<ColumnInfo>,
}
```
Nodes can write to `WalkContext`:
- `Ident { source: Tables, role: "table", writes_table: true }`
on match sets `ctx.current_table` to the matched identifier
and resolves `ctx.current_table_columns` from the schema.
- `Ident { source: Columns, role: "column", writes_current_column: true }`
on match sets `ctx.current_column` from the resolved column list.
Nodes can read from `WalkContext`:
- `DynamicSubgrammar(column_value_list)` reads
`ctx.current_table_columns` and unfolds to a `Seq` of
comma-separated typed slots — one per column.
- The value slot after `set col=` reads `ctx.current_column.user_type`
to pick the right typed slot.
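The write-then-read flow above can be sketched with stand-in types. `SchemaCache`, `ColumnInfo`, and `Type` are simplified here (a `HashMap` of table name to columns), `value_position` is omitted, and the two `on_*_match` functions are hypothetical names for what the walker would do when an `Ident` with the corresponding write flag matches.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, Debug, PartialEq)]
enum Type { Int, Text, Date }

#[derive(Clone, Debug, PartialEq)]
struct ColumnInfo { name: String, user_type: Type }

// Stand-in for the real schema cache.
struct SchemaCache { tables: HashMap<String, Vec<ColumnInfo>> }

struct WalkContext<'a> {
    schema: &'a SchemaCache,
    current_table: Option<String>,
    current_table_columns: Option<Vec<ColumnInfo>>,
    current_column: Option<ColumnInfo>,
}

/// Table ident matched: record the table and resolve its columns.
/// Returns false for an unknown table, which the walker would report
/// as a parse-time failure.
fn on_table_match(ctx: &mut WalkContext<'_>, ident: &str) -> bool {
    let schema = ctx.schema; // copy the shared reference out of ctx
    match schema.tables.get(ident) {
        Some(cols) => {
            ctx.current_table = Some(ident.to_string());
            ctx.current_table_columns = Some(cols.clone());
            true
        }
        None => false,
    }
}

/// Column ident matched: look it up in the already-resolved column list.
fn on_column_match(ctx: &mut WalkContext<'_>, ident: &str) -> bool {
    let found = ctx
        .current_table_columns
        .as_ref()
        .and_then(|cols| cols.iter().find(|c| c.name == ident).cloned());
    let ok = found.is_some();
    ctx.current_column = found;
    ok
}

fn sample_schema() -> SchemaCache {
    let mut tables = HashMap::new();
    tables.insert(
        "Customers".to_string(),
        vec![
            ColumnInfo { name: "id".to_string(), user_type: Type::Int },
            ColumnInfo { name: "name".to_string(), user_type: Type::Text },
            ColumnInfo { name: "joined".to_string(), user_type: Type::Date },
        ],
    );
    SchemaCache { tables }
}
```

After `on_column_match` succeeds, the value slot that follows `set col=` can read `ctx.current_column` and pick the typed slot for its `user_type`.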
### `WalkOutcome` and "expected"
The walker keeps track of the longest prefix that matched and
the position at which it failed (or completed). At a failure
or incomplete position, `expected` is the set of nodes that
could legally continue the walk — derived structurally from
the trie, not from a separate "expected" table.
For a `Seq` mid-walk, `expected` is the next child node.
For a `Choice` that hasn't committed to a branch, `expected`
is all children. For an `Optional` at a position where its
inner could start, `expected` includes the inner plus the
next sibling.
This is the same information chumsky's
`ParseError::Invalid::expected` carries today, sourced from
the trie directly instead of via combinator introspection.
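The structural derivation of `expected` can be sketched over a reduced node set. This assumes only `Word`, `Seq`, `Choice`, and `Optional`, and reports expected keywords as strings rather than `&'static Node`. The `show [all] (data | table)` shape is a made-up grammar for the demonstration; only `show data` and `show table` appear in this ADR.

```rust
// Reduced stand-in for the real Node enum.
enum Node {
    Word(&'static str),
    Seq(&'static [Node]),
    Choice(&'static [Node]),
    Optional(&'static Node),
}

/// True if the node can match without consuming any input.
fn can_be_empty(node: &Node) -> bool {
    match node {
        Node::Word(_) => false,
        Node::Seq(children) => children.iter().all(can_be_empty),
        Node::Choice(children) => children.iter().any(can_be_empty),
        Node::Optional(_) => true,
    }
}

/// Collect the keywords that could legally start `node`.
fn expected_at_start(node: &Node, out: &mut Vec<&'static str>) {
    match node {
        Node::Word(w) => out.push(*w),
        // An uncommitted Choice expects every branch.
        Node::Choice(children) => {
            for child in children.iter() {
                expected_at_start(child, out);
            }
        }
        Node::Optional(inner) => expected_at_start(inner, out),
        Node::Seq(children) => expected_in_seq(children, 0, out),
    }
}

/// Expected set for a Seq whose children before `next` already matched:
/// the next child, plus later siblings for as long as each one is skippable.
fn expected_in_seq(children: &[Node], next: usize, out: &mut Vec<&'static str>) {
    for child in &children[next..] {
        expected_at_start(child, out);
        if !can_be_empty(child) {
            break;
        }
    }
}

// Hypothetical shape: show [all] (data | table)
static SHOW: [Node; 3] = [
    Node::Word("show"),
    Node::Optional(&Node::Word("all")),
    Node::Choice(&[Node::Word("data"), Node::Word("table")]),
];
```

With `show` matched, `expected_in_seq(&SHOW, 1, ..)` yields the optional keyword plus both branches of the following `Choice`, which is the "inner plus the next sibling" rule stated above.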
### `HintMode` per node
Each node may carry a `HintMode`:
```rust
pub enum HintMode {
    /// Candidates if any surface; else prose fallback.
    Default,
    /// Force the prose at this catalog key regardless of candidates.
    /// Used by NewName slots ("Type a name, then `(`").
    ForceProse(&'static str),
    /// Show only the prose; suppress Tab candidates.
    /// Used by typed value slots at empty prefix.
    ProseOnly(&'static str),
    /// Suppress prose; only candidates.
    SuppressProse,
}
```
The walker propagates each expected node's `HintMode` to the
hint resolver, which dispatches accordingly.
The current ad-hoc cases in `input_render.rs::ambient_hint`
(value-literal slot suppression, NewName slot typing-name
prose, invalid-ident overlay) migrate to node-attached
`HintMode` annotations during Phase D.
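One possible reading of how the hint resolver could dispatch on `HintMode`, as a sketch only. It assumes the difference between `ForceProse` and `ProseOnly` is that the former still leaves Tab candidates available while the latter suppresses them. `Panel`, `ResolvedHint`, `resolve_hint`, and the catalog keys in the tests are hypothetical names.

```rust
#[derive(Clone, Copy)]
enum HintMode {
    Default,
    ForceProse(&'static str),
    ProseOnly(&'static str),
    SuppressProse,
}

/// What the hint panel shows.
#[derive(Debug, PartialEq)]
enum Panel {
    Candidates,
    Prose(&'static str), // catalog key of the prose to render
    Empty,
}

#[derive(Debug, PartialEq)]
struct ResolvedHint {
    panel: Panel,
    /// Candidates that Tab may still cycle through.
    tab_candidates: Vec<String>,
}

fn resolve_hint(
    mode: HintMode,
    candidates: Vec<String>,
    fallback_prose: &'static str,
) -> ResolvedHint {
    match mode {
        // Candidates if any surface; else prose fallback.
        HintMode::Default if candidates.is_empty() => ResolvedHint {
            panel: Panel::Prose(fallback_prose),
            tab_candidates: candidates,
        },
        HintMode::Default => ResolvedHint {
            panel: Panel::Candidates,
            tab_candidates: candidates,
        },
        // Prose wins in the panel; Tab candidates are kept (assumption).
        HintMode::ForceProse(key) => ResolvedHint {
            panel: Panel::Prose(key),
            tab_candidates: candidates,
        },
        // Prose only; Tab candidates are dropped.
        HintMode::ProseOnly(key) => ResolvedHint {
            panel: Panel::Prose(key),
            tab_candidates: Vec::new(),
        },
        // Never show prose, even when there is nothing else to show.
        HintMode::SuppressProse if candidates.is_empty() => ResolvedHint {
            panel: Panel::Empty,
            tab_candidates: candidates,
        },
        HintMode::SuppressProse => ResolvedHint {
            panel: Panel::Candidates,
            tab_candidates: candidates,
        },
    }
}
```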
### Ranker layer
A ranker function runs between the walker's raw candidate
output and the hint-panel renderer:
```rust
pub type Ranker = fn(&WalkContext, Vec<Candidate>) -> Vec<Candidate>;
pub fn identity_ranker(_: &WalkContext, c: Vec<Candidate>) -> Vec<Candidate> { c }
```
Default is `identity_ranker` — declaration order from the
trie is preserved. The signature allows future enhancements
(frequency-based ranking, content-aware priors for type
suggestions per column name) to plug in without changing
grammar declarations.
The ranker lives outside the trie. Grammar declarations are
about *what's valid*; ranking is about *what's likely useful
first*.
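To show that a non-identity ranker fits the same signature, here is a hedged sketch of a content-aware prior for type suggestions. The `pending_column_name` field is a hypothetical stand-in and is not part of the `WalkContext` defined in this ADR. The name-to-type rules are invented for the example.

```rust
// Stand-in context; `pending_column_name` is hypothetical.
struct WalkContext {
    pending_column_name: Option<String>,
}

#[derive(Clone, Debug, PartialEq)]
struct Candidate {
    text: String,
}

type Ranker = fn(&WalkContext, Vec<Candidate>) -> Vec<Candidate>;

fn identity_ranker(_: &WalkContext, c: Vec<Candidate>) -> Vec<Candidate> {
    c
}

/// Moves one preferred type to the front based on the column name and
/// leaves every other candidate in declaration order.
fn type_prior_ranker(ctx: &WalkContext, mut c: Vec<Candidate>) -> Vec<Candidate> {
    let name = ctx
        .pending_column_name
        .as_deref()
        .unwrap_or("")
        .to_ascii_lowercase();
    let preferred = if name.contains("email") {
        Some("text")
    } else if name.contains("score") {
        Some("real")
    } else {
        None
    };
    if let Some(p) = preferred {
        // `sort_by_key` is stable and `false < true`, so the preferred
        // candidate sorts first and ties keep their original order.
        c.sort_by_key(|cand| cand.text != p);
    }
    c
}

fn candidates(texts: &[&str]) -> Vec<Candidate> {
    texts.iter().map(|t| Candidate { text: t.to_string() }).collect()
}
```

Swapping rankers is then a matter of passing a different function pointer; the grammar declarations are not touched.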
### Sub-grammars
Two flavours, no global registry:
**Static** — pure composition, function returning a const node:
```rust
const fn qualified_column(role_table: &'static str, role_col: &'static str) -> Node {
    Seq(&[
        Ident { source: Tables, role: role_table, /* ... */ },
        Punct('.'),
        Ident { source: Columns, role: role_col, /* ... */ },
    ])
}

const fn where_clause() -> Node {
    Seq(&[
        Word { primary: "where", /* ... */ },
        Ident { source: Columns, role: "filter_column", /* ... */ },
        Punct('='),
        AnyValueSlot,
    ])
}
```
**Dynamic** — context-aware, expands at walk time:
```rust
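fn column_value_list(ctx: &WalkContext) -> Node {
    // `as_deref` yields `Option<&[ColumnInfo]>`; the empty-slice fallback
    // avoids borrowing a temporary `Vec`.
    let cols: &[ColumnInfo] = ctx.current_table_columns.as_deref().unwrap_or(&[]);
    let mut children: Vec<Node> = Vec::new();
    for (i, col) in cols.iter().enumerate() {
        if i > 0 { children.push(Punct(',')); }
        children.push(slot_for_type(col.user_type));
    }
    // Leaked so the slice is `'static`, as `Seq` requires.
    Seq(Box::leak(children.into_boxed_slice()))
}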
```
Dynamic sub-grammars return owned `Node` values that the
walker treats as inline expansions. The leak above is one
implementation tactic — alternatively, the walker stores the
expanded node in a small per-walk arena. Both work; pick at
implementation time.
### Aliases
A `Word` node carries `primary` and an `aliases` slice. The
walker matches input against either; completion surfaces only
the primary; help text mentions aliases prose-style if
appropriate. Highlight class is the same for both.
Round 5's `q` removal is *not* reverted by this design. `q`
stays gone — adding it back would now be the single line
`aliases: &["q"]` on the `quit` `Word` node, and would not
surface as a separate candidate in completion (matching the
round-5 user request).
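The match-either, complete-primary-only behaviour can be sketched with a reduced `Word` struct. The `q` alias appears purely to illustrate the one-line annotation discussed above (the ADR keeps `q` removed), and `completions` is a hypothetical helper name.

```rust
// Reduced stand-in for the `Word` node: just the literal fields.
struct Word {
    primary: &'static str,
    aliases: &'static [&'static str],
}

impl Word {
    /// The walker accepts the primary literal or any alias.
    fn matches(&self, token: &str) -> bool {
        self.primary == token || self.aliases.iter().any(|a| *a == token)
    }
}

/// Completion surfaces primaries only; aliases never become candidates.
fn completions(words: &[Word], prefix: &str) -> Vec<&'static str> {
    words
        .iter()
        .filter(|w| w.primary.starts_with(prefix))
        .map(|w| w.primary)
        .collect()
}

/// Illustration only: what the registry entries would look like if the
/// `q` alias were ever restored.
fn sample_words() -> [Word; 2] {
    [
        Word { primary: "quit", aliases: &["q"] },
        Word { primary: "help", aliases: &[] },
    ]
}
```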
### `IdentSource`
Replaces the current `dsl::ident_slot::IdentSlot`:
```rust
pub enum IdentSource {
    NewName,       // user invents; no schema lookup; ProseOnly hint
    Tables,        // existing table names
    Columns,       // existing column names (filtered by current table)
    Relationships, // existing relationship names
    Types,         // closed set from Type::all()
}
```
`Types` is new — it replaces the magic-string `TYPE_SLOT_LABEL`
used today. `src/dsl/ident_slot.rs` dissolves into
`src/dsl/grammar/mod.rs`.
### Highlight class assignment
Per-byte highlight class is computed as a side effect of the
walk. Each terminal records `(byte_range, class)` in
`WalkResult::per_byte_class` as it matches. Unmatched ranges
(past a definite failure) get the `tok_error` overlay,
identical to today's behaviour.
Default classes per terminal kind:
| Terminal | Default class |
|---|---|
| `Word` | `tok_keyword` |
| `Punct` | `tok_punct` |
| `Ident` | `tok_identifier` |
| `NumberLit` | `tok_number` |
| `StringLit` | `tok_string` |
| `BlobLit` | `tok_string` |
| `Flag` | `tok_flag` |
| `BarePath` | `tok_string` |
The `highlight_override: Option<HighlightClass>` field on
`Word` and `Ident` is reserved for future per-slot variants
(e.g., a Tables slot in a distinct shade vs a NewName slot
muted) — left `None` everywhere in round 1.
No new palette colours for the initial migration.
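A small sketch of the highlight side effect, assuming `ByteRange` is a `std::ops::Range<usize>` and that the failure position comes from the walk outcome. The enum variants mirror a few of the class names in the table above, and `finish_classes` is a hypothetical helper.

```rust
use std::ops::Range;

type ByteRange = Range<usize>;

// Stand-in for the real HighlightClass; only three classes shown.
#[derive(Clone, Copy, Debug, PartialEq)]
enum HighlightClass {
    TokKeyword,
    TokIdentifier,
    TokError,
}

/// Called once the walk stops. `matched` holds what each terminal
/// recorded as it matched; everything from a definite failure position
/// to the end of input gets the error overlay.
fn finish_classes(
    mut matched: Vec<(ByteRange, HighlightClass)>,
    failure_at: Option<usize>,
    source_len: usize,
) -> Vec<(ByteRange, HighlightClass)> {
    if let Some(pos) = failure_at {
        if pos < source_len {
            matched.push((pos..source_len, HighlightClass::TokError));
        }
    }
    matched
}
```

For the input `drop tabel X`, a walk that matches `drop` and then fails at byte 5 would produce a keyword range `0..4` followed by an error range `5..12`.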
## Migration plan
### Code organisation
```
src/dsl/
  grammar/
    mod.rs         — Node enum, IdentSource, HintMode, HighlightClass,
                     MatchedPath, CommandNode, REGISTRY top-level
    data.rs        — insert, update, delete, show
    ddl.rs         — create, drop, add, rename, change
    app.rs         — quit, help, save/save-as, new, load, rebuild,
                     export, import, mode, messages
    shared.rs      — typed value slots (int_slot, date_slot, …),
                     qualified_column, where_clause, action_keyword,
                     column_value_list (dynamic)
    validators.rs  — content validators (integer_only_validator,
                     date_format_validator, datetime_format_validator,
                     type_name_validator, …)
  walker/
    mod.rs         — public walk() entry; orchestration
    driver.rs      — the per-node-kind dispatch
    context.rs     — WalkContext
    outcome.rs     — WalkOutcome, MatchedPath, WalkResult
    lex_helpers.rs — identifier-shape, digit-shape, string-escape
                     helpers; shared across terminal consume fns
  parser.rs        — Phase A: becomes a router. Phase F: deleted.
  lexer.rs         — Phase F: deleted.
  keyword.rs       — Phase F: deleted.
  ident_slot.rs    — Phase F: dissolved into grammar/mod.rs.
  usage.rs         — Phase F: REGISTRY deleted; the file may go.
```
### Six-phase migration
**Phase A — Walker skeleton + app-lifecycle commands.**
- Build the walker driver, `WalkContext`, `WalkOutcome`,
`MatchedPath`, the terminal consume functions.
- Migrate the app-lifecycle commands (no schema dependency,
no value literals): quit, help, rebuild, save, save as, new,
load, export, import, mode, messages.
- Router in `parse_command` consults the walker for migrated
commands; falls back to chumsky for the rest.
- Differential test scaffolding: a test helper that, for every
input in the existing test corpus, runs both parsers and
asserts identical `Command` output where the input falls
under a migrated command.
Exit criteria: walker handles the app-lifecycle commands
end-to-end; existing tests for those commands pass via the
walker path; tests for other commands still pass via chumsky.
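The Phase A router could be as simple as a lookup on the entry keyword. The sketch below uses stand-in parsers and a toy `Command` enum. `MIGRATED_ENTRIES`, `walker_parse`, and `legacy_parse` are hypothetical names, and the real router would consult the grammar registry rather than a hard-coded list.

```rust
#[derive(Debug, PartialEq)]
enum Command {
    Quit,
    Help,
    // Stand-in for everything the legacy parser still handles.
    Other(String),
}

// Entry keywords already served by the walker (grows phase by phase).
const MIGRATED_ENTRIES: &[&str] = &["quit", "help"];

// Stand-in for the walker path.
fn walker_parse(input: &str) -> Option<Command> {
    match input.trim() {
        "quit" => Some(Command::Quit),
        "help" => Some(Command::Help),
        _ => None,
    }
}

// Stand-in for the chumsky path.
fn legacy_parse(input: &str) -> Option<Command> {
    let trimmed = input.trim();
    if trimmed.is_empty() {
        None
    } else {
        Some(Command::Other(trimmed.to_string()))
    }
}

/// Route on the first whitespace-delimited word: migrated commands go to
/// the walker, everything else falls back to the legacy parser.
fn parse_command(input: &str) -> Option<Command> {
    let entry = input.split_whitespace().next()?;
    if MIGRATED_ENTRIES.iter().any(|e| *e == entry) {
        walker_parse(input)
    } else {
        legacy_parse(input)
    }
}
```

A migrated command never falls through to the legacy path, so a walker failure such as `quit now` surfaces as a failure instead of being silently reparsed.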
**Phase B — DDL commands without value literals.**
- drop table, drop column, drop relationship.
- rename column.
- add column (without the value-literal aspect — type slot
uses `Ident { source: Types }` with a content validator).
- add 1:n relationship (referential clauses as a static
sub-grammar).
- change column (type slot + flags).
These exercise schema lookups via `Ident { source: Tables }`
and `Ident { source: Columns }`, and the `Types` source. No
typed value slots yet, no `DynamicSubgrammar`.
Exit criteria: all DDL commands except `create table` pass
via the walker; the rest still pass via chumsky.
**Phase C — `create table` with column-list value literals.**
- The `with pk` clause uses `Repeated` for the column-spec
list, each spec being a `Seq(Ident{NewName}, Punct(':'),
Ident{Types}-with-validator)`.
- First test of `Repeated` with separator.
Exit criteria: create table works end-to-end via the walker.
**Phase D — data commands with full schema awareness.**
- show data, show table, replay.
- insert: uses `DynamicSubgrammar(column_value_list)` for the
comma-separated typed value list. Exercises full
`WalkContext` propagation: `Ident { source: Tables, role:
"table", writes_table: true }` resolves the column list;
the dynamic sub-grammar unfolds typed slots per column.
- update: `set` clauses use `DynamicSubgrammar` to resolve the
value slot's type from the column. `where` clause uses the
shared sub-grammar with `AnyValueSlot` (or, optionally, also
column-typed if the column resolves cleanly).
- delete: same `where` clause; otherwise simple.
This is the phase that proves the design's central claim:
typed slots, dynamic sub-grammars, and schema-aware narrowing
all collaborate to produce a single coherent grammar
declaration per command.
Exit criteria: all data commands pass via the walker; the
round-5 limitations close automatically (save Tab can offer
`as`, value slots narrow by column type).
**Phase E — replay end-to-end.**
- replay uses `BarePath` + `StringLit` (quoted form).
- Internally replays each line through the same dispatch
pipeline.
Exit criteria: replay works end-to-end via the walker; nested
replay rejection still fires from the runtime, unchanged.
**Phase F — cleanup.**
- Delete `dsl/parser.rs`.
- Delete `dsl/lexer.rs`.
- Delete `dsl/keyword.rs`.
- Delete `dsl/ident_slot.rs` (already merged into
`grammar/mod.rs` in Phase A).
- Delete `dsl/usage.rs::REGISTRY`.
- Delete `chumsky` dependency from `Cargo.toml`.
- Delete `parse.token.keyword.*` entries from the catalog and
`keys.rs` that the walker doesn't need (the keyword
vocabulary is implicit in the grammar nodes).
- Remove the differential test scaffolding from Phase A.
Exit criteria: working tree clean of legacy parser code;
test suite still all-green; `cargo clippy --all-targets --
-D warnings` passes; `cargo build --release` binary not
noticeably larger.
### Test discipline
Three guarantees throughout migration:
1. **Full test suite green at every commit.** Migration is
per-command; tests are per-behaviour. They don't care
which parser produces a `Command` — they assert input →
expected output. If a test fails mid-migration, the
walker hasn't reproduced behaviour; fix the walker
before continuing.
2. **Walker-specific tests for trie-only features.** Schema-
aware narrowing, `WalkContext` propagation, dynamic sub-
grammar expansion, `HintMode` per-node behaviour, the
round-5 "save Tab offers as" gap-closing — each gets new
tests as the feature lands.
3. **Differential check during the migration window.** A
test helper iterates the existing input corpus, runs both
parsers on inputs that fall under a migrated command, and
asserts identical `Command` output. Cheap insurance
against subtle divergence. Removed at Phase F cleanup.
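The differential check in guarantee 3 might look like the following helper, written generically so it does not depend on the real `Command` type. The function name and the closure-based parser parameters are assumptions made for the sketch.

```rust
use std::fmt::Debug;

/// Run both parsers over every corpus input that belongs to a migrated
/// command and describe each disagreement. An empty result means the
/// two parsers agree on the whole migrated subset.
fn differential_mismatches<C: PartialEq + Debug>(
    corpus: &[&str],
    is_migrated: impl Fn(&str) -> bool,
    walker: impl Fn(&str) -> Option<C>,
    legacy: impl Fn(&str) -> Option<C>,
) -> Vec<String> {
    let mut mismatches = Vec::new();
    for &input in corpus {
        if !is_migrated(input) {
            continue; // not migrated yet: only the legacy parser applies
        }
        let from_walker = walker(input);
        let from_legacy = legacy(input);
        if from_walker != from_legacy {
            mismatches.push(format!(
                "{:?}: walker={:?} legacy={:?}",
                input, from_walker, from_legacy
            ));
        }
    }
    mismatches
}
```

In a test, the assertion would be that the returned list is empty, with the formatted entries serving as the failure message.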
### Cleanup pass at Phase F
Beyond deleting the legacy modules, Phase F includes catalog
cleanup. The `parse.token.keyword.*` entries (40+ of them) are
near-mechanical wrappers (`create: "\`create\`"`); with no
external code looking up these keys (the walker renders
keyword names from `Word` node literals directly), the
entries can collapse. A small `format_keyword_for_error(literal)
-> String` helper replaces them. The `keys.rs` declarations
go with them.
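Given the wrapper shape quoted above (`create` rendered as the same word in backticks), the replacement helper is likely to be a one-liner. This sketch assumes backtick wrapping is the only formatting the catalog entries performed.

```rust
/// Render a keyword literal the way the `parse.token.keyword.*`
/// catalog entries did: wrapped in backticks.
fn format_keyword_for_error(literal: &str) -> String {
    format!("`{}`", literal)
}
```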
Help text in `help.cli_banner` and `help.in_app_body` stays
as hand-written prose — the alternative (auto-generating from
the grammar) was deferred during the round-6 discussion as a
separate concern; the grammar tree carries enough metadata
(per-command `help_id`) for future automation but the prose
documentation is still hand-curated for round 1.
## Consequences
### What we gain
- **One declaration per command.** Entry keyword, shape, AST
builder, dispatch handler, usage reference, help reference
all colocated. Adding a command is one block in one file.
- **No cross-file scatter.** The round-5 "10 places to remove
`q`" critique is structurally addressed: there's nowhere
else for keyword/usage/registry info to live but the
grammar tree.
- **Schema-aware narrowing from day one.** Typed value slots
reject mis-shaped input at parse time with localised error
wording; completion narrows per column type; the round-5
value-literal slot hint becomes type-specific
("Type a date as 'YYYY-MM-DD'") not generic.
- **Aliases as a single annotation.** `q` could come back as
one line on the `quit` `Word` node, no scatter.
- **Tests focus on behaviour, not enumeration.** Tests that
hardcoded keyword lists during round 5 (we noted these in
`usage.rs` and `completion.rs`) can iterate the trie
registry instead, becoming structural rather than
literal.
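- **Drift between consumers is structurally prevented.** Completion,
highlighting, parsing, and usage rendering all derive from the
same trie, so there are no separate sources to keep in sync.
Help prose is the one exception: it stays hand-curated and is
linked to the grammar only through each command's `help_id`.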
### What we accept
- **Parse depends on schema state.** A DSL command that
references a non-existent table fails at parse time, not
at execute time as today. This matches the user mental
model when typing (the schema cache is current per
ADR-0022) and yields better completion / hint
experience. It does mean tests that exercised parser
behaviour in isolation may now need to set up a schema
cache.
- **chumsky's general-purpose features go unused.** Recovery
on ambiguous input, multi-error reporting in a single
pass, ambiguous-grammar handling — features chumsky offers
but our DSL doesn't use. The trade is fine because our
grammar is deterministic.
- **Some implementation complexity moves into the walker.**
Whitespace skipping between siblings, terminal consume
functions, character-level shape recognition — the lexer
did some of this implicitly; the walker does it
explicitly. Net code is comparable or smaller because the
scatter cost goes away.
### What's out of scope for this ADR
- **External tooling integration (LSP, editor extensions).**
The registry is `pub` and accessible via accessor
functions, so future tooling work doesn't fight this design.
No tooling is built in round 1.
- **Help text auto-generation.** The grammar tree carries a
`help_id` per command (on `CommandNode`), but the help catalog
body stays hand-curated.
- **Performance optimisation.** Walker re-runs per keystroke
for completion + highlighting. Naïve implementation is
acceptable; if hot-path concerns emerge later, caching /
incremental walks become a separate ADR.
- **Ranker implementations.** The ranker hook exists; default
is identity. Frequency-based ranking, content-aware priors
for type completion ("Email → text first, Score → real"),
recency — all future work that plugs into the ranker
signature without touching grammar declarations.
- **Per-slot highlight overrides.** The `highlight_override`
field exists but stays `None` in round 1. Differentiating
table-ident from new-name-ident visually is a future
enhancement.
## References
- ADR-0023 — Unified declarative grammar tree (direction). Superseded by this ADR for execution detail.
- ADR-0001 — Language and TUI framework (chumsky choice). Phase F removes the chumsky dependency.
- ADR-0019 — Friendly error layer and i18n catalog. Catalog conventions stay; `parse.token.keyword.*` entries collapse in Phase F.
- ADR-0020 — Tokenization layer for the DSL parser. Superseded by the scannerless walker.
- ADR-0021 — Parser-as-source-of-truth for H1a. Usage info migrates from a separate registry to grammar nodes.
- ADR-0022 — Ambient typing assistance. The walker subsumes the expected-set introspection that powered completion in that ADR.
- Round-6 session transcript — design pass that produced this spec.
+2 -1
@@ -28,4 +28,5 @@ This directory contains the project's ADRs, recorded per
 - [ADR-0020 — Tokenization layer for the DSL parser](0020-tokenization-layer-for-the-dsl-parser.md)
 - [ADR-0021 — Parser-as-source-of-truth for H1a (per-command usage in parse errors)](0021-parser-as-source-of-truth-for-h1a.md)
 - [ADR-0022 — Ambient typing assistance: colour, hint panel, completion (I3 + I4)](0022-ambient-typing-assistance.md)
-- [ADR-0023 — Unified declarative grammar tree](0023-proposed-unified-grammar-tree.md) — **Proposed** (researched direction, not yet accepted)
+- [ADR-0023 — Unified declarative grammar tree](0023-unified-grammar-tree.md) — direction (superseded for execution detail by ADR-0024)
+- [ADR-0024 — Unified grammar tree: execution plan](0024-unified-grammar-tree-execution-plan.md) — **Accepted**, the executable spec