ADR-0024 Phase F (full) step 5: walker-driven completion

Replaces the `ParseError::Invalid::expected: Vec<String>` round-trip with structured `Expectation`s direct from the walker (ADR-0024 §architecture). The completion engine no longer parses formatted strings back into types: `Expectation::Ident { source, role }`, `Expectation::Word`, `Expectation::Literal`, `Expectation::Flag`, `Expectation::NumberLit`, and `Expectation::StringLit` are consumed as enum variants.

New helper:
- `walker::expected_at_input(source) -> Vec<Expectation>` consolidates the empty-input case (returns every CommandNode entry word), unknown-command-word case (also entry words), and walker-engaged case (Incomplete / Mismatch expectations) in one place. ValidationFailed and Match resolve to empty.

`completion.rs` refactor:
- `expected_at(leading)` wraps the walker helper; replaces the legacy string-based `expected_set`.
- Keyword candidates: filter `Expectation::Word(w)` / `Expectation::Literal(s)` to alphabetic-only literals (no more string-parsing / `strip_backticks`).
- Type names: detect `Expectation::Ident { source: IdentSource::Types }` directly (replaces the `TYPE_SLOT_LABEL` magic string).
- Flag candidates: read `Expectation::Flag(body)` and format as `--{body}` (replaces backticked-string matching).
- Composite-literal candidates: match against `Expectation::Literal("1")` (replaces the backticked-string `` `1` ``).
- Schema identifiers: `Expectation::Ident { source, .. }` filtered by `source.completes_from_schema()`.
- `is_value_literal_signature` checks for `Expectation::Word` values "null"/"true"/"false" and `Expectation::NumberLit` + `Expectation::StringLit` variants (replaces backticked-string matching).
- `invalid_ident_at_cursor` and `typing_name_at_cursor` adopt the same path.

The `typing_name_at_cursor` probe (substitute placeholder and re-parse) still goes through `parse_command` because the probe specifically wants the *post-name* expected set; `parse_command` + the string `expected` field carries that today. A future follow-up could thread the structured probe through `walker`, but the value-add is marginal.

`COMPOSITE_CANDIDATES` opener key changes from `` `1` `` (the backticked-string the chumsky parser produced) to bare `"1"` (the Expectation::Literal payload).

Touched modules: `dsl/walker/mod.rs` (new export), `src/completion.rs` (refactor).

Tests: 806 passing, 0 failing, 1 ignored; every existing completion test passes unchanged, proving the structured path is behaviour-preserving. Clippy clean.
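
For orientation, the structured consumption described above can be sketched in isolation. This is a minimal, standalone approximation and not the crate's actual code: the `Expectation` and `IdentSource` definitions below are simplified stand-ins (the real `Expectation::Ident` also carries a `role` field, omitted here), `IdentSource::Tables` is a hypothetical variant name, the body of `completes_from_schema` is an assumption, and the flag matcher is a condensed form of the three-branch filter in the diff.

```rust
use std::collections::HashSet;

// Simplified stand-ins for the crate's types; the real definitions live in
// `dsl::grammar` and `dsl::walker::outcome` and may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum IdentSource {
    Tables, // hypothetical schema-backed source, for illustration only
    NewName,
    Types,
}

impl IdentSource {
    // Assumed behaviour: only schema-backed sources complete from the cache.
    pub fn completes_from_schema(self) -> bool {
        matches!(self, IdentSource::Tables)
    }
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Expectation {
    Word(&'static str),
    Literal(&'static str),
    Flag(&'static str),
    Ident { source: IdentSource }, // `role` omitted in this sketch
    NumberLit,
    StringLit,
}

/// Keyword candidates: alphabetic-only `Word` / `Literal` payloads that match
/// the typed prefix, deduplicated, in declaration order.
pub fn keyword_candidates(expected: &[Expectation], prefix: &str) -> Vec<String> {
    let lowered = prefix.to_lowercase();
    let mut seen = HashSet::new();
    expected
        .iter()
        .filter_map(|e| match e {
            Expectation::Word(w) | Expectation::Literal(w) => Some(*w),
            _ => None,
        })
        .filter(|w| !w.is_empty() && w.chars().all(|c| c.is_ascii_alphabetic()))
        .filter(|w| w.to_lowercase().starts_with(&lowered))
        .filter(|w| seen.insert(*w))
        .map(str::to_string)
        .collect()
}

/// Flag candidates: `Flag(body)` rendered as `--{body}`, matched against
/// either the full `--name` or the bare body (the tail typed after `--`).
pub fn flag_candidates(expected: &[Expectation], prefix: &str) -> Vec<String> {
    let lowered = prefix.to_lowercase();
    expected
        .iter()
        .filter_map(|e| match e {
            Expectation::Flag(body) => Some(format!("--{body}")),
            _ => None,
        })
        .filter(|flag| {
            let flag = flag.to_lowercase();
            // `flag` always starts with the two ASCII bytes `--`, so the
            // slice below cannot split a character.
            flag.starts_with(&lowered) || flag[2..].starts_with(&lowered)
        })
        .collect()
}

/// Schema-listable identifier sources present in the expected set.
pub fn schema_sources(expected: &[Expectation]) -> Vec<IdentSource> {
    expected
        .iter()
        .filter_map(|e| match e {
            Expectation::Ident { source } if source.completes_from_schema() => Some(*source),
            _ => None,
        })
        .collect()
}

/// Value-literal signature: all five of `null` / `true` / `false` /
/// number literal / string literal are expected at once.
pub fn is_value_literal_signature(expected: &[Expectation]) -> bool {
    let has_word = |needle: &str| {
        expected
            .iter()
            .any(|e| matches!(e, Expectation::Word(w) if *w == needle))
    };
    has_word("null")
        && has_word("true")
        && has_word("false")
        && expected.iter().any(|e| matches!(e, Expectation::NumberLit))
        && expected.iter().any(|e| matches!(e, Expectation::StringLit))
}
```

Given an expected set such as `[Word("to"), Word("table"), Literal("1"), Literal(","), Flag("all-rows")]`, the keyword path in this sketch yields `to` and `table` only: the digit literal and the punctuation fail the alphabetic filter without any backtick stripping, and the flag is picked up separately by its own variant.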
2026-05-15 17:26:08 +00:00
parent 044173bd39
commit bbe12524ab
2 changed files with 163 additions and 114 deletions
@@ -16,27 +16,18 @@

 use crate::dsl::grammar::IdentSource;
 use crate::dsl::types::Type;
+use crate::dsl::walker::outcome::Expectation;
 use crate::dsl::{ParseError, parse_command};

-/// Label emitted by `type_keyword` (in `dsl::parser`) when it
-/// expects a column-type token. Matches the `.labelled("type")`
-/// applied on the inner `select_ref!`. Centralised here so the
-/// completion engine and the parser agree on the magic string.
-const TYPE_SLOT_LABEL: &str = "type";
-
 /// Composite literal candidates whose lexed shape is more than
 /// one token but which the user types as a single fluent piece.
-/// Pairs of (parser-expected-opener, full-composite-text).
+/// Pairs of (walker-expected-literal, full-composite-text).
 ///
-/// The opener is the first token's backticked label as it
-/// appears in `ParseError::Invalid::expected` — when present,
-/// the engine surfaces the full composite text as a Tab
-/// candidate.
-///
-/// Currently the only entry is `1:n` (start of
-/// `add 1:n relationship`). New entries register here; no
-/// parser change required.
-const COMPOSITE_CANDIDATES: &[(&str, &str)] = &[("`1`", "1:n")];
+/// When the walker reports `Expectation::Literal(opener)` at the
+/// cursor, the engine surfaces the full composite text as a Tab
+/// candidate. Today the only entry is `1:n` (the opener for
+/// `add 1:n relationship`) — adding more is a one-line edit.
+const COMPOSITE_CANDIDATES: &[(&str, &str)] = &[("1", "1:n")];

 /// Per-project schema lookup cache (ADR-0022 §9).
 ///
@@ -65,6 +56,14 @@ impl SchemaCache {
    }
 }

+/// What the grammar would accept at the end of `leading`,
+/// expressed as structured `Expectation`s direct from the
+/// walker (ADR-0024 §architecture, Phase F walker-driven
+/// completion). Replaces the `ParseError`-string round-trip.
+fn expected_at(leading: &str) -> Vec<Expectation> {
+    crate::dsl::walker::expected_at_input(leading)
+}
+
 /// A single Tab-insertable item with its source (so the
 /// renderer can colour keywords differently from schema
 /// identifiers, and so the ordering can group keywords first).
@@ -139,7 +138,7 @@ pub fn candidates_at_cursor(
    let partial_prefix = input[start..cursor].to_string();
    let leading = &input[..start];

-    let expected = expected_set(leading);
+    let expected = expected_at(leading);
    if expected.is_empty() {
        return None;
    }
@@ -149,13 +148,9 @@ pub fn candidates_at_cursor(
    // keyword literals — number / string-literal slots are
    // descriptive labels, not Tab candidates). Surfacing those
    // three is actively misleading — the user usually wants a
-    // number, a quoted string, or a date, and seeing just
-    // "null true false" implies those are *the* options. We
-    // suppress the keyword candidates here; the ambient_hint
-    // ladder falls through to a prose hint with format
-    // examples instead. Once the user starts typing a prefix
-    // (`n`, `tr`, `fa`) the normal keyword-completion path
-    // applies — the suppression only triggers at empty prefix.
+    // number, a quoted string, or a date. Suppress so the
+    // ambient_hint ladder falls through to a prose hint with
+    // format examples instead.
    if partial_prefix.is_empty() && is_value_literal_signature(&expected) {
        return None;
    }
@@ -163,35 +158,36 @@ pub fn candidates_at_cursor(
    let lowered_prefix = partial_prefix.to_lowercase();
    let matches_prefix = |s: &str| s.to_lowercase().starts_with(&lowered_prefix);

-    // Source 1: keyword candidates from the parser's
-    // expected-set. Preserve `expected`'s order — it reflects
-    // chumsky's source-order traversal of `or_not` / `choice`
-    // chains, which matches the canonical command shape (e.g.
-    // `to` before `table` for `add column [to] [table] …`).
+    // Source 1: keyword candidates direct from the walker's
+    // expected set. `Word(primary)` and `Literal(s)` both
+    // surface here; we keep only the alphabetic ones —
+    // single-digit literals like `1` go through the composite
+    // pipeline below, and punct never surfaces as a candidate.
+    // Declaration order is preserved (matches the canonical
+    // command shape, e.g. `to` before `table` for
+    // `add column [to] [table] …`).
    let mut keywords: Vec<String> = expected
        .iter()
-        .filter_map(|item| strip_backticks(item))
-        // Backticked items are walker `Expectation::Word`s or
-        // `Expectation::Literal`s. Keywords are the
-        // alphabetic-only ones; punct (`,`, `=`) and digit
-        // literals (`1`) live in the same expected-set but
-        // surface through other candidate sources.
-        .filter(|name| !name.is_empty() && name.chars().all(|c| c.is_ascii_alphabetic()))
+        .filter_map(|e| match e {
+            Expectation::Word(w) | Expectation::Literal(w) => Some(*w),
+            _ => None,
+        })
+        .filter(|w| !w.is_empty() && w.chars().all(|c| c.is_ascii_alphabetic()))
        .map(str::to_string)
        .filter(|name| matches_prefix(name))
        .collect();
    let mut seen_kw = std::collections::HashSet::new();
    keywords.retain(|k| seen_kw.insert(k.clone()));

-    // Source 1.5: type-name candidates when the parser expects
-    // a column-type slot. Type names live outside the Keyword
-    // enum (ADR-0020 §2 — type names stay as identifiers,
-    // validated by Type::from_str), so they need their own
-    // completion path. Preserve `Type::all()` declaration
-    // order — that's text/int/real/decimal/bool/date/datetime/
-    // blob/serial/shortid, the order a learner reads them in
-    // ADR-0005.
-    let type_names: Vec<String> = if expected.iter().any(|s| s == TYPE_SLOT_LABEL) {
+    // Source 1.5: type-name candidates when the walker expects
+    // a column-type slot. Type names are a closed set sourced
+    // from `Type::all()` (ADR-0005 declaration order:
+    // text/int/real/decimal/bool/date/datetime/blob/serial/
+    // shortid). The walker surfaces this as
+    // `Expectation::Ident { source: Types }`.
+    let type_names: Vec<String> = if expected.iter().any(|e| {
+        matches!(e, Expectation::Ident { source: IdentSource::Types, .. })
+    }) {
        Type::all()
            .iter()
            .map(|t| t.keyword().to_string())
@@ -201,56 +197,66 @@ pub fn candidates_at_cursor(
        Vec::new()
    };

-    // Source 1.55: flag candidates (`--name`). Like type
-    // names, flags live outside the Keyword enum — the parser
-    // labels them as backticked literals like `` `--all-rows` ``.
-    // Surface them as a distinct CandidateKind so the hint
-    // panel can colour them with `tok_flag` (matching how
-    // they'll appear in the input pane after insertion).
-    //
-    // The user can either Tab from a bare cursor position
-    // (partial empty) or after typing `--` (partial = "--").
-    // The standard prefix matcher walks back over alphanumeric +
-    // underscore, which does NOT cross `-`, so when the user
-    // types `--all` the partial is `all` — match the flag's
-    // body against that. Otherwise match the full `--name`
-    // against the partial (which may be empty or start with `--`).
+    // Source 1.55: flag candidates (`--name`). Surfaced as a
+    // distinct CandidateKind so the hint panel can colour them
+    // with `tok_flag` (matching how they'll appear after
+    // insertion). The standard prefix matcher walks back over
+    // alphanumeric + underscore, which does NOT cross `-`, so
+    // when the user types `--all` the partial is `all` — match
+    // the flag's body against that. Otherwise match the full
+    // `--name` against the partial (which may be empty or start
+    // with `--`).
    let flags: Vec<String> = expected
        .iter()
-        .filter_map(|item| strip_backticks(item))
-        .filter(|name| name.starts_with("--"))
-        .filter(|name| {
-            if partial_prefix.starts_with("--") || partial_prefix.is_empty() {
-                matches_prefix(name)
+        .filter_map(|e| match e {
+            Expectation::Flag(name) => Some(*name),
+            _ => None,
+        })
+        .filter(|body| {
+            if partial_prefix.starts_with("--") {
+                format!("--{body}")
+                    .to_lowercase()
+                    .starts_with(&lowered_prefix)
+            } else if partial_prefix.is_empty() {
+                true
            } else {
-                // partial is the alphanumeric tail past `--`
-                let body = &name[2..];
                body.to_lowercase().starts_with(&lowered_prefix)
            }
        })
-        .map(|name| name.to_string())
+        .map(|body| format!("--{body}"))
        .collect();

    // Source 1.6: composite-literal candidates. Some commands
-    // start with a multi-token literal sequence that the lexer
-    // splits into Number/Punct/Identifier (e.g. `1:n` for
-    // `add 1:n relationship`). The parser's expected-set
-    // surfaces just the first token (`` `1` ``), which would
-    // otherwise be filtered out (not a Keyword variant). We
-    // surface the full composite so the user can Tab through
-    // without knowing the surface syntax.
+    // start with a multi-token literal sequence that the user
+    // types as a single fluent piece (e.g. `1:n` for
+    // `add 1:n relationship`). The walker's expected-set
+    // surfaces the first token only (`Expectation::Literal("1")`);
+    // the engine surfaces the full composite text so the user
+    // can Tab through without knowing the surface syntax.
    let composites: Vec<String> = COMPOSITE_CANDIDATES
        .iter()
-        .filter(|(opener, _)| expected.iter().any(|s| s == *opener))
+        .filter(|(opener, _)| {
+            expected.iter().any(|e| match e {
+                Expectation::Literal(l) | Expectation::Word(l) => *l == *opener,
+                _ => false,
+            })
+        })
        .map(|(_, text)| (*text).to_string())
        .filter(|s| matches_prefix(s))
        .collect();

    // Source 2: schema identifiers — accumulated across every
-    // matching known-set slot. `NewName` slots return `&[]`.
+    // matching schema-listable `Ident { source }` expectation.
+    // `NewName` / `Types` / `Free` sources don't query the
+    // schema cache and contribute nothing here.
    let mut identifiers: Vec<String> = expected
        .iter()
-        .filter_map(|item| IdentSource::from_expected_label(item))
+        .filter_map(|e| match e {
+            Expectation::Ident { source, .. } if source.completes_from_schema() => {
+                Some(*source)
+            }
+            _ => None,
+        })
        .flat_map(|source| cache.for_source(source).iter().cloned())
        .filter(|name| matches_prefix(name))
        .collect();
@@ -306,18 +312,22 @@ pub fn candidates_at_cursor(
    })
 }

-fn strip_backticks(s: &str) -> Option<&str> {
-    s.strip_prefix('`').and_then(|s| s.strip_suffix('`'))
-}
-
 /// Detect a value-literal expected-set signature. A value-literal
-/// slot is the only position where chumsky's expected-set
+/// slot is the only position where the walker's expected-set
 /// simultaneously contains all five forms `null` / `true` /
-/// `false` / number / string literal. See the suppression rationale
-/// at the call site in `candidates_at_cursor`.
-fn is_value_literal_signature(expected: &[String]) -> bool {
-    let has = |needle: &str| expected.iter().any(|e| e == needle);
-    has("`null`") && has("`true`") && has("`false`") && has("number") && has("string literal")
+/// `false` / number / string literal. See the suppression
+/// rationale at the call site in `candidates_at_cursor`.
+fn is_value_literal_signature(expected: &[Expectation]) -> bool {
+    let has_word = |needle: &str| {
+        expected
+            .iter()
+            .any(|e| matches!(e, Expectation::Word(w) if *w == needle))
+    };
+    has_word("null")
+        && has_word("true")
+        && has_word("false")
+        && expected.iter().any(|e| matches!(e, Expectation::NumberLit))
+        && expected.iter().any(|e| matches!(e, Expectation::StringLit))
 }

 /// `Some(prose)` when the cursor sits at an empty-prefix value-literal slot.
@@ -350,7 +360,7 @@ pub fn value_literal_hint_at_cursor(input: &str, cursor: usize) -> Option<String
        return None;
    }
    let leading = &input[..start];
-    let expected = expected_set(leading);
+    let expected = expected_at(leading);
    if !is_value_literal_signature(&expected) {
        return None;
    }
@@ -410,11 +420,16 @@ pub fn typing_name_at_cursor(input: &str, cursor: usize) -> Option<TypingName> {
        }
    }
    let leading = &input[..start];
-    let expected = expected_set(leading);
-    let is_new_name_slot = expected
-        .iter()
-        .filter_map(|item| IdentSource::from_expected_label(item))
-        .any(|source| source == IdentSource::NewName);
+    let expected = expected_at(leading);
+    let is_new_name_slot = expected.iter().any(|e| {
+        matches!(
+            e,
+            Expectation::Ident {
+                source: IdentSource::NewName,
+                ..
+            }
+        )
+    });
    if !is_new_name_slot {
        return None;
    }
@@ -485,15 +500,19 @@ pub fn invalid_ident_at_cursor(
    }
    let partial = &input[start..cursor];
    let leading = &input[..start];
-    let expected = expected_set(leading);
+    let expected = expected_at(leading);
    if expected.is_empty() {
        return None;
    }
-    // Find every known-set slot in the expected list.
+    // Find every schema-listable source in the expected list.
    let sources: Vec<IdentSource> = expected
        .iter()
-        .filter_map(|item| IdentSource::from_expected_label(item))
-        .filter(|s| s.completes_from_schema())
+        .filter_map(|e| match e {
+            Expectation::Ident { source, .. } if source.completes_from_schema() => {
+                Some(*source)
+            }
+            _ => None,
+        })
        .collect();
    if sources.is_empty() {
        return None;
@@ -521,23 +540,10 @@ pub fn invalid_ident_at_cursor(
    })
 }

-/// The expected-token set at the end of `leading`. Empty
-/// `leading` (whitespace only) yields every command-entry
-/// keyword — there's no parser failure to drive this from, so
-/// we synthesise it from the usage registry.
-fn expected_set(leading: &str) -> Vec<String> {
-    if leading.trim().is_empty() {
-        return crate::dsl::grammar::entry_words_alphabetised()
-            .into_iter()
-            .map(|w| format!("`{w}`"))
-            .collect();
-    }
-    match parse_command(leading) {
-        Ok(_) => Vec::new(),
-        Err(ParseError::Empty) => Vec::new(),
-        Err(ParseError::Invalid { expected, .. }) => expected,
-    }
-}
+// `expected_set` is gone: the walker-driven `expected_at` above
+// returns structured `Expectation`s with full `IdentSource`
+// information, avoiding the lossy string round-trip the
+// chumsky-era completion engine relied on.

 /// Snapshot of a freshly-inserted completion. The memo lives
 /// on `App::last_completion` until any non-Tab/non-Shift-Tab