ADR-0024 Phase F (full) step 5: walker-driven completion

Replaces the `ParseError::Invalid::expected: Vec<String>` round-trip with structured `Expectation`s direct from the walker (ADR-0024 §architecture). The completion engine no longer parses formatted strings back into types: `Expectation::Ident { source, role }`, `Expectation::Word`, `Expectation::Literal`, `Expectation::Flag`, `Expectation::NumberLit`, and `Expectation::StringLit` are consumed as enum variants.

New helper:

- `walker::expected_at_input(source) -> Vec<Expectation>` consolidates the empty-input case (returns every CommandNode entry word), the unknown-command-word case (also entry words), and the walker-engaged case (Incomplete / Mismatch expectations) in one place. ValidationFailed and Match resolve to empty.

`completion.rs` refactor:

- `expected_at(leading)` wraps the walker helper; replaces the legacy string-based `expected_set`.
- Keyword candidates: filter `Expectation::Word(w)` / `Expectation::Literal(s)` to alphabetic-only literals (no more string-parsing / `strip_backticks`).
- Type names: detect `Expectation::Ident { source: IdentSource::Types }` directly (replaces the `TYPE_SLOT_LABEL` magic string).
- Flag candidates: read `Expectation::Flag(body)` and format as `--{body}` (replaces backticked-string matching).
- Composite-literal candidates: match against `Expectation::Literal("1")` (replaces the backticked string `` `1` ``).
- Schema identifiers: `Expectation::Ident { source, .. }` filtered by `source.completes_from_schema()`.
- `is_value_literal_signature` checks for `Expectation::Word` values "null"/"true"/"false" and the `Expectation::NumberLit` + `Expectation::StringLit` variants (replaces backticked-string matching).
- `invalid_ident_at_cursor` and `typing_name_at_cursor` adopt the same path.

The `typing_name_at_cursor` probe (substitute placeholder and re-parse) still goes through `parse_command` because the probe specifically wants the *post-name* expected set, which `parse_command` and its string `expected` field carry today. A future follow-up could thread the structured probe through `walker`, but the value-add is marginal.

`COMPOSITE_CANDIDATES` opener key changes from `` `1` `` (the backticked string the chumsky parser produced) to bare `"1"` (the `Expectation::Literal` payload).

Touched modules: `dsl/walker/mod.rs` (new export), `src/completion.rs` (refactor).

Tests: 806 passing, 0 failing, 1 ignored. Every existing completion test passes unchanged, proving the structured path is behaviour-preserving. Clippy clean.
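
The structured matching described above can be sketched in isolation. This is a minimal, self-contained illustration and not the crate's actual code: the `Expectation` and `IdentSource` enums here are simplified stand-ins (the real `Ident` variant also carries a `role`), the `Tables` variant is a hypothetical label, and the body of `completes_from_schema` is an assumption about what "schema-listable" means.

```rust
// Simplified stand-ins for the crate's types (assumptions, not the real definitions).
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum IdentSource {
    Tables, // hypothetical schema-backed source
    Types,
    NewName,
}

impl IdentSource {
    // Assumed behaviour: only schema-backed sources are listable.
    fn completes_from_schema(self) -> bool {
        matches!(self, IdentSource::Tables)
    }
}

#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq, Eq)]
enum Expectation {
    Word(&'static str),
    Literal(&'static str),
    Flag(&'static str),
    Ident { source: IdentSource },
    NumberLit,
    StringLit,
}

/// Keyword candidates: alphabetic-only `Word` / `Literal` payloads
/// that match the prefix case-insensitively, in declaration order.
fn keyword_candidates(expected: &[Expectation], prefix: &str) -> Vec<String> {
    let lowered = prefix.to_lowercase();
    expected
        .iter()
        .filter_map(|e| match e {
            Expectation::Word(w) | Expectation::Literal(w) => Some(*w),
            _ => None,
        })
        .filter(|w| !w.is_empty() && w.chars().all(|c| c.is_ascii_alphabetic()))
        .filter(|w| w.to_lowercase().starts_with(&lowered))
        .map(str::to_string)
        .collect()
}

/// Flag candidates: the variant carries the bare body; render as `--body`.
fn flag_candidates(expected: &[Expectation]) -> Vec<String> {
    expected
        .iter()
        .filter_map(|e| match e {
            Expectation::Flag(body) => Some(format!("--{body}")),
            _ => None,
        })
        .collect()
}

/// Sources whose candidates would be looked up in the schema cache.
fn schema_sources(expected: &[Expectation]) -> Vec<IdentSource> {
    expected
        .iter()
        .filter_map(|e| match e {
            Expectation::Ident { source } if source.completes_from_schema() => Some(*source),
            _ => None,
        })
        .collect()
}

/// True only when all five value-literal forms are expected at once.
fn is_value_literal_signature(expected: &[Expectation]) -> bool {
    let has_word = |needle: &str| {
        expected
            .iter()
            .any(|e| matches!(e, Expectation::Word(w) if *w == needle))
    };
    has_word("null")
        && has_word("true")
        && has_word("false")
        && expected.iter().any(|e| matches!(e, Expectation::NumberLit))
        && expected.iter().any(|e| matches!(e, Expectation::StringLit))
}
```

Given an expected set containing `Word("to")`, `Word("table")`, `Literal("1")`, and `Flag("all-rows")`, the keyword filter keeps `to` and `table` and drops the digit literal, while the flag filter yields `--all-rows`. No backtick stripping or label comparison is involved, which is the point of the refactor.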
2026-05-15 17:26:08 +00:00
parent 044173bd39
commit bbe12524ab
2 changed files with 163 additions and 114 deletions
@@ -16,27 +16,18 @@
 use crate::dsl::grammar::IdentSource;
 use crate::dsl::types::Type;
+use crate::dsl::walker::outcome::Expectation;
 use crate::dsl::{ParseError, parse_command};
-/// Label emitted by `type_keyword` (in `dsl::parser`) when it
-/// expects a column-type token. Matches the `.labelled("type")`
-/// applied on the inner `select_ref!`. Centralised here so the
-/// completion engine and the parser agree on the magic string.
-const TYPE_SLOT_LABEL: &str = "type";
 /// Composite literal candidates whose lexed shape is more than
 /// one token but which the user types as a single fluent piece.
-/// Pairs of (parser-expected-opener, full-composite-text).
+/// Pairs of (walker-expected-literal, full-composite-text).
 ///
-/// The opener is the first token's backticked label as it
-/// appears in `ParseError::Invalid::expected` — when present,
-/// the engine surfaces the full composite text as a Tab
-/// candidate.
-///
-/// Currently the only entry is `1:n` (start of
-/// `add 1:n relationship`). New entries register here; no
-/// parser change required.
-const COMPOSITE_CANDIDATES: &[(&str, &str)] = &[("`1`", "1:n")];
+/// When the walker reports `Expectation::Literal(opener)` at the
+/// cursor, the engine surfaces the full composite text as a Tab
+/// candidate. Today the only entry is `1:n` (the opener for
+/// `add 1:n relationship`) — adding more is a one-line edit.
+const COMPOSITE_CANDIDATES: &[(&str, &str)] = &[("1", "1:n")];
 /// Per-project schema lookup cache (ADR-0022 §9).
 ///
@@ -65,6 +56,14 @@ impl SchemaCache {
    }
 }
+/// What the grammar would accept at the end of `leading`,
+/// expressed as structured `Expectation`s direct from the
+/// walker (ADR-0024 §architecture, Phase F walker-driven
+/// completion). Replaces the `ParseError`-string round-trip.
+fn expected_at(leading: &str) -> Vec<Expectation> {
+    crate::dsl::walker::expected_at_input(leading)
+}
 /// A single Tab-insertable item with its source (so the
 /// renderer can colour keywords differently from schema
 /// identifiers, and so the ordering can group keywords first).
@@ -139,7 +138,7 @@ pub fn candidates_at_cursor(
    let partial_prefix = input[start..cursor].to_string();
    let leading = &input[..start];
-    let expected = expected_set(leading);
+    let expected = expected_at(leading);
    if expected.is_empty() {
        return None;
    }
@@ -149,13 +148,9 @@ pub fn candidates_at_cursor(
    // keyword literals — number / string-literal slots are
    // descriptive labels, not Tab candidates). Surfacing those
    // three is actively misleading — the user usually wants a
-    // number, a quoted string, or a date, and seeing just
-    // "null true false" implies those are *the* options. We
-    // suppress the keyword candidates here; the ambient_hint
-    // ladder falls through to a prose hint with format
-    // examples instead. Once the user starts typing a prefix
-    // (`n`, `tr`, `fa`) the normal keyword-completion path
-    // applies — the suppression only triggers at empty prefix.
+    // number, a quoted string, or a date. Suppress so the
+    // ambient_hint ladder falls through to a prose hint with
+    // format examples instead.
    if partial_prefix.is_empty() && is_value_literal_signature(&expected) {
        return None;
    }
@@ -163,35 +158,36 @@ pub fn candidates_at_cursor(
    let lowered_prefix = partial_prefix.to_lowercase();
    let matches_prefix = |s: &str| s.to_lowercase().starts_with(&lowered_prefix);
-    // Source 1: keyword candidates from the parser's
-    // expected-set. Preserve `expected`'s order — it reflects
-    // chumsky's source-order traversal of `or_not` / `choice`
-    // chains, which matches the canonical command shape (e.g.
-    // `to` before `table` for `add column [to] [table] …`).
+    // Source 1: keyword candidates direct from the walker's
+    // expected set. `Word(primary)` and `Literal(s)` both
+    // surface here; we keep only the alphabetic ones —
+    // single-digit literals like `1` go through the composite
+    // pipeline below, and punct never surfaces as a candidate.
+    // Declaration order is preserved (matches the canonical
+    // command shape, e.g. `to` before `table` for
+    // `add column [to] [table] …`).
    let mut keywords: Vec<String> = expected
        .iter()
-        .filter_map(|item| strip_backticks(item))
-        // Backticked items are walker `Expectation::Word`s or
-        // `Expectation::Literal`s. Keywords are the
-        // alphabetic-only ones; punct (`,`, `=`) and digit
-        // literals (`1`) live in the same expected-set but
-        // surface through other candidate sources.
-        .filter(|name| !name.is_empty() && name.chars().all(|c| c.is_ascii_alphabetic()))
+        .filter_map(|e| match e {
+            Expectation::Word(w) | Expectation::Literal(w) => Some(*w),
+            _ => None,
+        })
+        .filter(|w| !w.is_empty() && w.chars().all(|c| c.is_ascii_alphabetic()))
        .map(str::to_string)
        .filter(|name| matches_prefix(name))
        .collect();
    let mut seen_kw = std::collections::HashSet::new();
    keywords.retain(|k| seen_kw.insert(k.clone()));
-    // Source 1.5: type-name candidates when the parser expects
-    // a column-type slot. Type names live outside the Keyword
-    // enum (ADR-0020 §2 — type names stay as identifiers,
-    // validated by Type::from_str), so they need their own
-    // completion path. Preserve `Type::all()` declaration
-    // order — that's text/int/real/decimal/bool/date/datetime/
-    // blob/serial/shortid, the order a learner reads them in
-    // ADR-0005.
-    let type_names: Vec<String> = if expected.iter().any(|s| s == TYPE_SLOT_LABEL) {
+    // Source 1.5: type-name candidates when the walker expects
+    // a column-type slot. Type names are a closed set sourced
+    // from `Type::all()` (ADR-0005 declaration order:
+    // text/int/real/decimal/bool/date/datetime/blob/serial/
+    // shortid). The walker surfaces this as
+    // `Expectation::Ident { source: Types }`.
+    let type_names: Vec<String> = if expected.iter().any(|e| {
+        matches!(e, Expectation::Ident { source: IdentSource::Types, .. })
+    }) {
        Type::all()
            .iter()
            .map(|t| t.keyword().to_string())
@@ -201,56 +197,66 @@ pub fn candidates_at_cursor(
        Vec::new()
    };
-    // Source 1.55: flag candidates (`--name`). Like type
-    // names, flags live outside the Keyword enum — the parser
-    // labels them as backticked literals like `` `--all-rows` ``.
-    // Surface them as a distinct CandidateKind so the hint
-    // panel can colour them with `tok_flag` (matching how
-    // they'll appear in the input pane after insertion).
-    //
-    // The user can either Tab from a bare cursor position
-    // (partial empty) or after typing `--` (partial = "--").
-    // The standard prefix matcher walks back over alphanumeric +
-    // underscore, which does NOT cross `-`, so when the user
-    // types `--all` the partial is `all` — match the flag's
-    // body against that. Otherwise match the full `--name`
-    // against the partial (which may be empty or start with `--`).
+    // Source 1.55: flag candidates (`--name`). Surfaced as a
+    // distinct CandidateKind so the hint panel can colour them
+    // with `tok_flag` (matching how they'll appear after
+    // insertion). The standard prefix matcher walks back over
+    // alphanumeric + underscore, which does NOT cross `-`, so
+    // when the user types `--all` the partial is `all` — match
+    // the flag's body against that. Otherwise match the full
+    // `--name` against the partial (which may be empty or start
+    // with `--`).
    let flags: Vec<String> = expected
        .iter()
-        .filter_map(|item| strip_backticks(item))
-        .filter(|name| name.starts_with("--"))
-        .filter(|name| {
-            if partial_prefix.starts_with("--") || partial_prefix.is_empty() {
-                matches_prefix(name)
+        .filter_map(|e| match e {
+            Expectation::Flag(name) => Some(*name),
+            _ => None,
+        })
+        .filter(|body| {
+            if partial_prefix.starts_with("--") {
+                format!("--{body}")
+                    .to_lowercase()
+                    .starts_with(&lowered_prefix)
+            } else if partial_prefix.is_empty() {
+                true
            } else {
                // partial is the alphanumeric tail past `--`
-                let body = &name[2..];
                body.to_lowercase().starts_with(&lowered_prefix)
            }
        })
-        .map(|name| name.to_string())
+        .map(|body| format!("--{body}"))
        .collect();
    // Source 1.6: composite-literal candidates. Some commands
-    // start with a multi-token literal sequence that the lexer
-    // splits into Number/Punct/Identifier (e.g. `1:n` for
-    // `add 1:n relationship`). The parser's expected-set
-    // surfaces just the first token (`` `1` ``), which would
-    // otherwise be filtered out (not a Keyword variant). We
-    // surface the full composite so the user can Tab through
-    // without knowing the surface syntax.
+    // start with a multi-token literal sequence that the user
+    // types as a single fluent piece (e.g. `1:n` for
+    // `add 1:n relationship`). The walker's expected-set
+    // surfaces the first token only (`Expectation::Literal("1")`);
+    // the engine surfaces the full composite text so the user
+    // can Tab through without knowing the surface syntax.
    let composites: Vec<String> = COMPOSITE_CANDIDATES
        .iter()
-        .filter(|(opener, _)| expected.iter().any(|s| s == *opener))
+        .filter(|(opener, _)| {
+            expected.iter().any(|e| match e {
+                Expectation::Literal(l) | Expectation::Word(l) => *l == *opener,
+                _ => false,
+            })
+        })
        .map(|(_, text)| (*text).to_string())
        .filter(|s| matches_prefix(s))
        .collect();
    // Source 2: schema identifiers — accumulated across every
-    // matching known-set slot. `NewName` slots return `&[]`.
+    // matching schema-listable `Ident { source }` expectation.
+    // `NewName` / `Types` / `Free` sources don't query the
+    // schema cache and contribute nothing here.
    let mut identifiers: Vec<String> = expected
        .iter()
-        .filter_map(|item| IdentSource::from_expected_label(item))
+        .filter_map(|e| match e {
+            Expectation::Ident { source, .. } if source.completes_from_schema() => {
+                Some(*source)
+            }
+            _ => None,
+        })
        .flat_map(|source| cache.for_source(source).iter().cloned())
        .filter(|name| matches_prefix(name))
        .collect();
@@ -306,18 +312,22 @@ pub fn candidates_at_cursor(
    })
 }
-fn strip_backticks(s: &str) -> Option<&str> {
-    s.strip_prefix('`').and_then(|s| s.strip_suffix('`'))
-}
 /// Detect a value-literal expected-set signature. A value-literal
-/// slot is the only position where chumsky's expected-set
+/// slot is the only position where the walker's expected-set
 /// simultaneously contains all five forms `null` / `true` /
-/// `false` / number / string literal. See the suppression rationale
-/// at the call site in `candidates_at_cursor`.
-fn is_value_literal_signature(expected: &[String]) -> bool {
-    let has = |needle: &str| expected.iter().any(|e| e == needle);
-    has("`null`") && has("`true`") && has("`false`") && has("number") && has("string literal")
+/// `false` / number / string literal. See the suppression
+/// rationale at the call site in `candidates_at_cursor`.
+fn is_value_literal_signature(expected: &[Expectation]) -> bool {
+    let has_word = |needle: &str| {
+        expected
+            .iter()
+            .any(|e| matches!(e, Expectation::Word(w) if *w == needle))
+    };
+    has_word("null")
+        && has_word("true")
+        && has_word("false")
+        && expected.iter().any(|e| matches!(e, Expectation::NumberLit))
+        && expected.iter().any(|e| matches!(e, Expectation::StringLit))
 }
 /// `Some(prose)` when the cursor sits at an empty-prefix value-literal slot.
@@ -350,7 +360,7 @@ pub fn value_literal_hint_at_cursor(input: &str, cursor: usize) -> Option<String
        return None;
    }
    let leading = &input[..start];
-    let expected = expected_set(leading);
+    let expected = expected_at(leading);
    if !is_value_literal_signature(&expected) {
        return None;
    }
@@ -410,11 +420,16 @@ pub fn typing_name_at_cursor(input: &str, cursor: usize) -> Option<TypingName> {
        }
    }
    let leading = &input[..start];
-    let expected = expected_set(leading);
+    let expected = expected_at(leading);
-    let is_new_name_slot = expected
-        .iter()
-        .filter_map(|item| IdentSource::from_expected_label(item))
-        .any(|source| source == IdentSource::NewName);
+    let is_new_name_slot = expected.iter().any(|e| {
+        matches!(
+            e,
+            Expectation::Ident {
+                source: IdentSource::NewName,
+                ..
+            }
+        )
+    });
    if !is_new_name_slot {
        return None;
    }
@@ -485,15 +500,19 @@ pub fn invalid_ident_at_cursor(
    }
    let partial = &input[start..cursor];
    let leading = &input[..start];
-    let expected = expected_set(leading);
+    let expected = expected_at(leading);
    if expected.is_empty() {
        return None;
    }
-    // Find every known-set slot in the expected list.
+    // Find every schema-listable source in the expected list.
    let sources: Vec<IdentSource> = expected
        .iter()
-        .filter_map(|item| IdentSource::from_expected_label(item))
-        .filter(|s| s.completes_from_schema())
+        .filter_map(|e| match e {
+            Expectation::Ident { source, .. } if source.completes_from_schema() => {
+                Some(*source)
+            }
+            _ => None,
+        })
        .collect();
    if sources.is_empty() {
        return None;
@@ -521,23 +540,10 @@ pub fn invalid_ident_at_cursor(
    })
 }
-/// The expected-token set at the end of `leading`. Empty
-/// `leading` (whitespace only) yields every command-entry
-/// keyword — there's no parser failure to drive this from, so
-/// we synthesise it from the usage registry.
-fn expected_set(leading: &str) -> Vec<String> {
-    if leading.trim().is_empty() {
-        return crate::dsl::grammar::entry_words_alphabetised()
-            .into_iter()
-            .map(|w| format!("`{w}`"))
-            .collect();
-    }
-    match parse_command(leading) {
-        Ok(_) => Vec::new(),
-        Err(ParseError::Empty) => Vec::new(),
-        Err(ParseError::Invalid { expected, .. }) => expected,
-    }
-}
+// `expected_set` is gone: the walker-driven `expected_at` above
+// returns structured `Expectation`s with full `IdentSource`
+// information, avoiding the lossy string round-trip the
+// chumsky-era completion engine relied on.
 /// Snapshot of a freshly-inserted completion. The memo lives
 /// on `App::last_completion` until any non-Tab/non-Shift-Tab
@@ -30,6 +30,49 @@ use crate::dsl::walker::outcome::{
 pub use context::ColumnInfo;
 pub use highlight::highlight_runs;
+/// What the grammar would accept at the end of `source`
+/// (ADR-0024 §architecture, Phase F walker-driven completion).
+///
+/// Empty / whitespace-only input yields every command-entry word
+/// as `Expectation::Word(primary)`. Otherwise the walker is
+/// driven to `EndOfInput`; if the input completes a command,
+/// the result is empty; if it fails or is incomplete, the
+/// walker's expected-set surfaces verbatim — `Ident { source,
+/// role }` carries its `IdentSource` (so the completion engine
+/// can schema-look-up without a string round-trip), `Word` /
+/// `Literal` carry their primary literal, etc.
+///
+/// Inputs whose first token is not a registered entry word
+/// fall back to listing every entry word — matches the
+/// synthetic "unknown command" expectation set the parser
+/// produces.
+#[must_use]
+pub fn expected_at_input(source: &str) -> Vec<outcome::Expectation> {
+    use crate::dsl::grammar::REGISTRY;
+    if source.trim().is_empty() {
+        return REGISTRY
+            .iter()
+            .map(|c| outcome::Expectation::Word(c.entry.primary))
+            .collect();
+    }
+    let mut ctx = context::WalkContext::new();
+    let (result, _cmd) = walk(source, outcome::WalkBound::EndOfInput, &mut ctx);
+    match result.map(|r| r.outcome) {
+        Some(outcome::WalkOutcome::Match { .. }) => Vec::new(),
+        Some(outcome::WalkOutcome::Incomplete { expected, .. }) => expected,
+        Some(outcome::WalkOutcome::Mismatch { expected, .. }) => expected,
+        Some(outcome::WalkOutcome::ValidationFailed { .. }) => Vec::new(),
+        // Walker didn't engage (unknown entry word): the
+        // completion engine should still surface the available
+        // entry words so the user can recover.
+        None => REGISTRY
+            .iter()
+            .map(|c| outcome::Expectation::Word(c.entry.primary))
+            .collect(),
+    }
+}
 /// Public walk entry. `bound` is `EndOfInput` for parse;
 /// `Position(cursor)` for completion / hint (Phase A: not yet
 /// wired).