ADR-0024 Phase A: walker framework + app-lifecycle commands

Stand up the unified-grammar tree walker alongside the existing
chumsky parser and migrate the eleven app-lifecycle commands
(quit, help, rebuild, save / save as, new, load, export, import,
mode, messages) end-to-end. The router in parse_tokens consults
the walker first; non-migrated commands still fall through to
chumsky.
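
The routing order can be sketched roughly as below. This is a minimal illustration, not the code in src/dsl/parser.rs: `Route`, `WALKER_OWNED`, and `route` are hypothetical names, and the real router hands the input to the walker or to chumsky rather than returning an enum.

```rust
/// Hypothetical result of the routing decision (not a type from this commit).
#[derive(Debug, PartialEq, Eq)]
pub enum Route {
    /// The first word is the entry word of a migrated app-lifecycle command.
    Walker,
    /// Anything else still goes to the legacy chumsky parser.
    Chumsky,
}

/// Entry words of the migrated commands; `save as` shares the `save` entry.
const WALKER_OWNED: &[&str] = &[
    "quit", "help", "rebuild", "save", "new", "load", "export", "import", "mode", "messages",
];

/// Decide which parser owns `input` by looking at its first word only.
pub fn route(input: &str) -> Route {
    // Empty or all-whitespace input has no first word and falls through.
    let first = input.split_whitespace().next().unwrap_or("");
    if WALKER_OWNED.iter().any(|w| first.eq_ignore_ascii_case(w)) {
        Route::Walker
    } else {
        Route::Chumsky
    }
}
```

Under these assumptions, `route("SAVE as out.db")` picks the walker, and any input whose first word is not in the list falls through to chumsky.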

Scope:
- src/dsl/grammar/{mod,app}.rs: Node enum (13 kinds), Word /
  IdentSource / HintMode / HighlightClass / ValidationError /
  CommandNode types, REGISTRY covering the eleven app commands
  (ten CommandNodes; save / save as share one).
- src/dsl/walker/{mod,driver,context,outcome,lex_helpers}.rs:
  scannerless byte-level walker, per-node-kind dispatch with
  Choice/Seq/Optional backtracking, WalkContext (Phase B-D
  schema fields stubbed), WalkOutcome with Match/Incomplete/
  Mismatch/ValidationFailed.
- src/dsl/parser.rs: try_walker_route() runs first in
  parse_tokens; bridge converts WalkOutcome to ParseError
  preserving catalog wording (mode.unknown / messages.unknown
  surface verbatim via friendly::translate). Legacy
  try_parse_app_path_command deleted; chumsky's bare-keyword
  app branches are now unreachable and stay in place until the
  Phase F sweep.

Walker design choices worth noting:
- mode <value> / messages <value> use Choice(Word, Word, Ident)
  so known keywords appear in the expected-set; the trailing
  Ident catch-all funnels unknown values into the friendly
  validator that always errors with the catalog wording.
- save / save as is one CommandNode (Optional(Word("as"))),
  which structurally closes the round-5 "save Tab can't offer
  as" limitation.
- Path-bearing UX shipped per ADR-0024: BarePath terminates at
  whitespace; paths with spaces use the (not-yet-wired) quoted
  form. Existing tests pass on the new shape.
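
A rough sketch of the Choice(Word, Word, Ident) shape follows, using cut-down stand-in types rather than the definitions in this commit. The keywords "alpha" and "beta" are placeholders (the real mode values are not listed here), and the identifier shape (ASCII letter or underscore, then alphanumerics or underscores) is an assumption.

```rust
/// Simplified stand-in for the walker outcome; not the type from this commit.
#[derive(Debug, PartialEq, Eq)]
pub enum Outcome {
    /// A known keyword matched; carries its canonical spelling.
    Match(&'static str),
    /// The catch-all identifier matched, then its validator rejected it.
    ValidationFailed { message_key: &'static str, value: String },
    /// No branch matched; carries the keywords for the expected-set.
    Mismatch(Vec<&'static str>),
}

/// Simplified stand-in for the grammar node type.
pub enum Node {
    Word(&'static str),
    /// Catch-all identifier whose validator always fails with `message_key`.
    Ident { message_key: &'static str },
}

/// Placeholder keywords followed by the trailing catch-all branch.
pub const MODE_VALUE: &[Node] = &[
    Node::Word("alpha"),
    Node::Word("beta"),
    Node::Ident { message_key: "mode.unknown" },
];

/// Assumed identifier shape: [A-Za-z_][A-Za-z0-9_]*.
fn is_ident(s: &str) -> bool {
    let mut chars = s.chars();
    matches!(chars.next(), Some(c) if c.is_ascii_alphabetic() || c == '_')
        && chars.all(|c| c.is_ascii_alphanumeric() || c == '_')
}

/// Try each branch in order; the first branch that matches decides the outcome.
pub fn walk_choice(branches: &[Node], token: &str) -> Outcome {
    let mut expected = Vec::new();
    for branch in branches {
        match branch {
            Node::Word(kw) if token.eq_ignore_ascii_case(kw) => return Outcome::Match(*kw),
            // A keyword that did not match still contributes to the expected-set.
            Node::Word(kw) => expected.push(*kw),
            Node::Ident { message_key } if is_ident(token) => {
                return Outcome::ValidationFailed {
                    message_key: *message_key,
                    value: token.to_string(),
                };
            }
            Node::Ident { .. } => {}
        }
    }
    Outcome::Mismatch(expected)
}
```

Because the keyword branches come first, a known value never reaches the catch-all; an unknown identifier does, and surfaces the catalog key instead of a generic mismatch.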

Tests:
- 28 new walker-specific tests in dsl::walker::tests covering
  every app-lifecycle command, friendly-error wording for
  mode/messages unknown values, trailing-garbage detection,
  whitespace tolerance, and routing fall-through.
- Total: 805 passed, 0 failed, 1 ignored (was 777 passed /
  1 ignored).
- cargo clippy --all-targets -- -D warnings clean.
@@ -0,0 +1,247 @@
//! Unified declarative grammar tree (ADR-0024).
//!
//! The grammar tree is the single source of truth for the DSL:
//! parsing, completion, syntax highlighting, parse-error usage
//! rendering, and hint-panel content all derive from this same
//! data structure (ADR-0023 institutional context).
//!
//! Phase A scope (ADR-0024 §migration): the framework lands
//! alongside the eleven app-lifecycle commands (quit, help,
//! rebuild, save, save as, new, load, export, import, mode,
//! messages). The chumsky parser still owns every other
//! command; the router in `dsl::parser` decides which path to
//! take based on the first token. Schema-aware nodes
//! (`IdentSource::Tables` and friends) and `DynamicSubgrammar`
//! are declared here but not exercised until Phase B-D.
//!
//! The shape of `Node` mirrors ADR-0024 §node-taxonomy with one
//! pragmatic addition for Phase A: each `Ident` carries an
//! optional content validator, used today by the `mode <value>`
//! / `messages <value>` slots to surface friendly catalog
//! wording (`mode.unknown`, `messages.unknown`) on out-of-set
//! identifiers. The same hook generalises naturally to typed
//! value slots in Phase D.

pub mod app;

use crate::dsl::command::Command;
use crate::dsl::walker::context::WalkContext;
use crate::dsl::walker::outcome::MatchedPath;

/// Highlight class assigned to a matched terminal.
///
/// Phase A records these on the `WalkResult::per_byte_class`
/// slice; the existing input-renderer (chumsky-driven) still
/// owns the user-visible highlight today.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[allow(dead_code)]
pub enum HighlightClass {
    Keyword,
    Identifier,
    Number,
    String,
    Punct,
    Flag,
    Error,
}

/// Where an `Ident` slot's candidates come from at completion time.
///
/// Phase A only exercises `NewName` (the `import … as <target>`
/// slot) and `Free` (the catch-all branch in `mode`/`messages`
/// that funnels unknown values into a friendly validator). The
/// schema-aware variants land in Phase B-D.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum IdentSource {
    /// User invents this name. No schema lookup; no completion
    /// candidates beyond the identifier shape itself.
    NewName,
    /// Existing table name. Phase B+.
    #[allow(dead_code)]
    Tables,
    /// Existing column in the current table. Phase B+.
    #[allow(dead_code)]
    Columns,
    /// Existing relationship name. Phase B+.
    #[allow(dead_code)]
    Relationships,
    /// Closed set from `Type::all()`. Phase B+.
    #[allow(dead_code)]
    Types,
    /// Any identifier shape; used by synthetic catch-all branches
    /// (e.g., the unknown-value branch of `mode <value>`).
    Free,
}

/// Hint-panel mode for an expected node.
///
/// Phase A defaults to `Default`; the `ProseOnly` variant
/// attaches to typed value slots in Phase D so the hint reads
/// "Type a date as 'YYYY-MM-DD'" rather than candidate-cycling.
#[derive(Debug, Clone, Copy)]
#[allow(dead_code)]
pub enum HintMode {
    Default,
    ForceProse(&'static str),
    ProseOnly(&'static str),
    SuppressProse,
}

/// A keyword node literal.
///
/// The `aliases` slice is empty for the app-lifecycle commands
/// today; the round-5 `q` removal remains intentional, and any
/// future re-introduction would be a one-line `aliases: &["q"]`
/// addition (ADR-0024 §aliases).
#[derive(Debug, Clone, Copy)]
pub struct Word {
    pub primary: &'static str,
    pub aliases: &'static [&'static str],
    pub highlight_override: Option<HighlightClass>,
}

impl Word {
    pub const fn keyword(primary: &'static str) -> Self {
        Self {
            primary,
            aliases: &[],
            highlight_override: None,
        }
    }

    /// Case-insensitive match against the primary or any alias.
    pub fn matches(&self, candidate: &str) -> bool {
        if candidate.eq_ignore_ascii_case(self.primary) {
            return true;
        }
        self.aliases
            .iter()
            .any(|a| candidate.eq_ignore_ascii_case(a))
    }
}

/// Content-level validator for an `Ident` slot. Returns the
/// catalog key + arg list to surface as `WalkOutcome::ValidationFailed`
/// on mismatch.
pub type IdentValidator = fn(matched: &str) -> Result<(), ValidationError>;

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ValidationError {
    pub message_key: &'static str,
    pub args: Vec<(&'static str, String)>,
}

/// The grammar-tree node taxonomy (ADR-0024 §node-taxonomy).
///
/// Some variants carry data (`Word` literal, `Punct` char,
/// `Ident` source/role/validator); combinators reference their
/// children through `&'static [Node]` / `&'static Node` slices,
/// which lets the entire registry live in `const`s: no runtime
/// allocation, and every command is one declaration block in
/// its grammar file.
pub enum Node {
    /// A keyword token. Case-insensitive match (ADR-0009).
    Word(Word),
    /// A single punctuation character. The exact set comes from
    /// the migrated commands' usage. Phase A needs none of these
    /// (app-lifecycle commands are pure keyword + ident + path),
    /// but the variant is declared for Phase B+ use.
    #[allow(dead_code)]
    Punct(char),
    /// An identifier slot. `source` drives completion candidates;
    /// `role` names the slot for error wording / completion-engine
    /// dispatch; `validator` runs after a successful identifier-
    /// shape match and may reject the value with a catalog-driven
    /// message.
    Ident {
        source: IdentSource,
        role: &'static str,
        validator: Option<IdentValidator>,
        #[allow(dead_code)]
        highlight_override: Option<HighlightClass>,
    },
    #[allow(dead_code)]
    NumberLit,
    #[allow(dead_code)]
    StringLit,
    #[allow(dead_code)]
    BlobLit,
    #[allow(dead_code)]
    Flag(&'static str),
    /// A non-whitespace run consumed verbatim from source. Per
    /// ADR-0024's path-bearing-commands UX change, paths with
    /// spaces use the quoted form (`StringLit`); `BarePath`
    /// terminates at the first whitespace byte.
    BarePath,
    /// Try each child in order. The first one that matches a
    /// non-empty prefix wins; if none match, the choice fails
    /// with the union of expectations.
    Choice(&'static [Self]),
    /// All children must match in order. Whitespace is implicitly
    /// allowed between siblings.
    Seq(&'static [Self]),
    /// The inner node may match or be skipped.
    Optional(&'static Self),
    /// `inner` matches at least `min` times, separated by
    /// `separator` (if any). Phase C+ uses this for `with pk`
    /// column lists.
    #[allow(dead_code)]
    Repeated {
        inner: &'static Self,
        separator: Option<&'static Self>,
        min: usize,
    },
    /// Resolves at walk time using the active `WalkContext`.
    /// Phase D+ uses this for `column_value_list`.
    #[allow(dead_code)]
    DynamicSubgrammar(fn(&WalkContext) -> Self),
}

/// Top-level entry record. One per command. The `entry` keyword
/// alone identifies which command the walker dispatches to;
/// `shape` is what follows the entry word.
pub struct CommandNode {
    pub entry: Word,
    pub shape: Node,
    /// Builds the typed `Command` AST from the matched terminal
    /// path. May fail with a `ValidationError` for content-level
    /// rejections that are easier to express imperatively than
    /// as a per-node validator (Phase A: none; every app
    /// command's `ast_builder` is infallible).
    pub ast_builder: fn(&MatchedPath) -> Result<Command, ValidationError>,
    #[allow(dead_code)]
    pub help_id: Option<&'static str>,
    #[allow(dead_code)]
    pub usage_id: Option<&'static str>,
    #[allow(dead_code)]
    pub hint_mode: Option<HintMode>,
}

/// The active grammar registry. Phase A: the eleven app-lifecycle
/// commands, held in ten entries because `save` and `save as`
/// share one `CommandNode`. Migrated commands route through this;
/// everything else falls through to the chumsky path in
/// `dsl::parser`.
pub static REGISTRY: &[&CommandNode] = &[
    &app::QUIT,
    &app::HELP,
    &app::REBUILD,
    &app::SAVE,
    &app::NEW,
    &app::LOAD,
    &app::EXPORT,
    &app::IMPORT,
    &app::MODE,
    &app::MESSAGES,
];

/// Look up a `CommandNode` by entry word, case-insensitively.
///
/// Used by the router to decide whether the walker owns this
/// input. Returns the index into `REGISTRY` so callers can
/// later use it as a `WalkOutcome::Match { command_idx }`.
pub fn command_for_entry_word(word: &str) -> Option<(usize, &'static CommandNode)> {
    REGISTRY
        .iter()
        .enumerate()
        .find(|(_, c)| c.entry.matches(word))
        .map(|(i, c)| (i, *c))
}
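
For reviewers, a cut-down sketch of how the entry-word lookup behaves. The `Word` and `CommandNode` stand-ins below keep only the fields the lookup touches, and the two-entry `REGISTRY` is illustrative; the real types carry more fields and the real registry has ten entries.

```rust
/// Cut-down stand-in for `Word`: primary spelling plus aliases.
pub struct Word {
    pub primary: &'static str,
    pub aliases: &'static [&'static str],
}

impl Word {
    /// Case-insensitive match against the primary or any alias.
    pub fn matches(&self, candidate: &str) -> bool {
        candidate.eq_ignore_ascii_case(self.primary)
            || self.aliases.iter().any(|a| candidate.eq_ignore_ascii_case(a))
    }
}

/// Cut-down stand-in for `CommandNode`, keeping only the entry word.
pub struct CommandNode {
    pub entry: Word,
}

pub static QUIT: CommandNode = CommandNode {
    entry: Word { primary: "quit", aliases: &[] },
};
pub static SAVE: CommandNode = CommandNode {
    entry: Word { primary: "save", aliases: &[] },
};

/// Illustrative two-entry registry.
pub static REGISTRY: &[&CommandNode] = &[&QUIT, &SAVE];

/// Same lookup shape as in the diff: index plus node for the first entry match.
pub fn command_for_entry_word(word: &str) -> Option<(usize, &'static CommandNode)> {
    REGISTRY
        .iter()
        .enumerate()
        .find(|(_, c)| c.entry.matches(word))
        .map(|(i, c)| (i, *c))
}
```

With empty alias slices, `q` finds nothing, which matches the round-5 removal; a `Word` declared with `aliases: &["q"]` would match it again.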