walker: 2e prereq — §10.3 stage-2 CTE harvest + cte_arity_mismatch

Implements the six ADR-0032 §10.3 output-column derivation rules
at CTE body-frame exit, populating the placeholder CteBinding's
columns. Unblocks `diagnostic.cte_arity_mismatch` (which compares
declared col-list arity vs derived projection arity) and the
upcoming qualified-prefix completion in 2e proper.

- `WalkContext::pending_cte_harvest`: bookkeeping for an in-progress
  CTE harvest, armed by writes_cte_name + extended by cte_column
  idents, consumed by the next walk_scoped_subgrammar invocation
  (CTE syntax has no intervening ScopedSubgrammar, so timing is
  deterministic). Cleared on every walk_scoped_subgrammar entry
  to prevent stale state surviving a speculative walk rollback.

- `run_cte_harvest`: post-walk path-scan classifier that
  reconstructs the body's first leg's projection-list and applies
  the six derivation rules. Compound bodies take columns from the
  first leg per spec; recursive CTE bodies take the non-recursive
  (first) leg. An optional (col-list) renames the derived
  columns positionally, preserving their types.

- `expand_binding`: bridges a TableBinding to a CteColumn list,
  resolving CTE-source bindings (empty columns + table-name
  matches an in-scope CteBinding) through to the CTE's harvested
  columns. Enables sibling CTEs to project correctly: in
  `WITH a AS (...), b AS (SELECT * FROM a) ...`, b's harvest sees
  a's derived columns through the body's from_scope binding.

- `WalkContext::pending_diagnostics`: accumulator for diagnostics
  emitted DURING the walk by node handlers with context the
  post-walk passes can't reconstruct. Drained by the top-level
  walk function on both match and non-match paths so a re-used
  context can't leak entries between walks.
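
The sibling-CTE resolution described for `expand_binding` above can be sketched roughly as follows. This is a minimal illustration, not the code in this commit: the struct shapes, the `ty: String` field, and the exact-match name comparison are all assumptions made for the example.

```rust
// Hypothetical, simplified shapes; the real definitions are not
// shown in this commit view.
#[derive(Debug, Clone, PartialEq)]
struct CteColumn {
    name: String,
    ty: String,
}

#[derive(Debug, Clone)]
struct CteBinding {
    name: String,
    columns: Vec<CteColumn>,
}

#[derive(Debug, Clone)]
struct TableBinding {
    table: String,
    // Assumed empty when the binding's source is a CTE rather than
    // a schema table.
    columns: Vec<CteColumn>,
}

/// Resolve a binding to a column list. A binding with no columns of
/// its own whose table name matches an in-scope CTE borrows that
/// CTE's harvested columns; anything else keeps its own columns.
fn expand_binding(binding: &TableBinding, ctes_in_scope: &[CteBinding]) -> Vec<CteColumn> {
    if binding.columns.is_empty() {
        if let Some(cte) = ctes_in_scope.iter().find(|c| c.name == binding.table) {
            return cte.columns.clone();
        }
    }
    binding.columns.clone()
}
```

With `a` already harvested, a `FROM a` binding inside `b`'s body expands to `a`'s columns, which is what lets `SELECT * FROM a` project correctly.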

Test totals: 1399 → 1414 passing (+15: 10 derivation-rule tests, 1
sibling-CTE test, 4 arity match/mismatch tests). Clippy clean.
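
For readers unfamiliar with the arity check, the positional rename and the `cte_arity_mismatch` condition can be sketched as below. The `CteColumn` shape, the `ArityMismatch` type, and the `apply_col_list` helper are hypothetical names for illustration; whether the real harvest still populates columns on a mismatch is not stated in this commit.

```rust
// Hypothetical column shape; `ty` stands in for whatever type
// representation the walker uses.
#[derive(Debug, Clone, PartialEq)]
struct CteColumn {
    name: String,
    ty: String,
}

// Hypothetical payload for a `cte_arity_mismatch` diagnostic.
#[derive(Debug, PartialEq)]
struct ArityMismatch {
    declared: usize,
    derived: usize,
}

/// Apply an optional `(c1, c2, ...)` rename list to derived columns.
/// An empty list means the CTE declared no column list, so the
/// derived names are kept. Otherwise names are replaced position by
/// position and each column keeps its derived type.
fn apply_col_list(
    derived: Vec<CteColumn>,
    col_list: &[String],
) -> Result<Vec<CteColumn>, ArityMismatch> {
    if col_list.is_empty() {
        return Ok(derived);
    }
    if col_list.len() != derived.len() {
        return Err(ArityMismatch {
            declared: col_list.len(),
            derived: derived.len(),
        });
    }
    Ok(derived
        .into_iter()
        .zip(col_list)
        .map(|(col, name)| CteColumn { name: name.clone(), ty: col.ty })
        .collect())
}
```

Under these assumptions, `WITH a(x, y) AS (SELECT id, label FROM t)` yields columns `x` and `y` with the types of `id` and `label`, while `WITH a(x) AS (SELECT id, label FROM t)` reports declared arity 1 against derived arity 2.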
claude@clouddev1
2026-05-20 17:42:17 +00:00
parent c20c6e05ca
commit dd37a1cbfc
3 changed files with 859 additions and 15 deletions
@@ -164,6 +164,56 @@ pub struct WalkContext<'a> {
/// on top. Always non-empty: the bottom frame is created at
/// `WalkContext::new` / `with_schema` time and never popped.
pub from_scope_stack: Vec<ScopeFrame>,
/// Diagnostics emitted *during* the walk by node handlers
/// that have context the post-walk path scanners can no
/// longer reconstruct (notably the §10.3 CTE harvest, which
/// runs at body-frame exit and has direct access to both
/// the declared col-list and the derived columns). The
/// walker's top-level `walk` function drains this on
/// successful parses and folds the entries into the final
/// diagnostic vector.
pub pending_diagnostics: Vec<crate::dsl::walker::outcome::Diagnostic>,
/// Set by the `writes_cte_name` ident path right after the
/// placeholder `CteBinding` is pushed onto the outer frame.
/// Tells the very next `walk_scoped_subgrammar` invocation
/// that the body it's about to walk is a CTE body and that,
/// on `Matched` exit, it should run the §10.3 harvest into
/// the recorded placeholder. `cte_column` idents (the
/// optional `(c1, c2)` list between the cte name and `AS`)
/// append to `col_list` as they're seen.
///
/// CTE syntax has no intervening `ScopedSubgrammar` between
/// the cte-name ident and the body, so the timing is
/// deterministic. Cleared by `walk_scoped_subgrammar` whether
/// or not the inner walk matched (a speculatively-walked
/// then-rolled-back body must not leave a stale request).
pub pending_cte_harvest: Option<PendingCteHarvest>,
}
/// Bookkeeping for an in-progress CTE harvest (ADR-0032 §10.3
/// stage 2).
///
/// The `writes_cte_name` ident sets one of these after pushing
/// the placeholder `CteBinding`; the next
/// `walk_scoped_subgrammar` invocation takes it and runs the
/// harvest after the body matches.
#[derive(Debug, Clone)]
pub struct PendingCteHarvest {
/// Index of the placeholder `CteBinding` in the *outer*
/// frame's `cte_bindings`. The outer frame is
/// `from_scope_stack[len() - 2]` at the moment the body's
/// frame is on top.
pub placeholder_index: usize,
/// Explicit `(c1, c2, …)` rename list; empty when the CTE
/// declared no column list. The harvest's derived column
/// names are overridden positionally by this list per
/// ADR-0032 §10.3.
pub col_list: Vec<String>,
/// Name of the CTE, as written in the `cte_name` ident.
pub cte_name: String,
/// Span of the `cte_name` ident: the diagnostic anchor for
/// `cte_arity_mismatch` if the col-list arity disagrees with
/// the body's derived arity.
pub cte_name_span: (usize, usize),
}
impl<'a> WalkContext<'a> {
@@ -185,6 +235,8 @@ impl<'a> WalkContext<'a> {
user_listed_columns: None,
subgrammar_depth: 0,
from_scope_stack: vec![ScopeFrame::default()],
pending_diagnostics: Vec::new(),
pending_cte_harvest: None,
}
}
@@ -205,6 +257,8 @@ impl<'a> WalkContext<'a> {
user_listed_columns: None,
subgrammar_depth: 0,
from_scope_stack: vec![ScopeFrame::default()],
pending_diagnostics: Vec::new(),
pending_cte_harvest: None,
}
}
}
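
The arm, take, and drain lifecycle that the two new `WalkContext` fields follow can be sketched in isolation. This is a simplified stand-in, assuming a reduced `Ctx` struct, string diagnostics, and hypothetical method names (`arm`, `push_cte_column`, `walk_body`, `drain_diagnostics`); only the field names come from the diff above.

```rust
// Reduced version of the struct in the diff (name and span omitted).
#[derive(Debug, Clone)]
struct PendingCteHarvest {
    placeholder_index: usize,
    col_list: Vec<String>,
}

// Hypothetical stand-in for `WalkContext`, keeping only the two new fields.
#[derive(Debug, Default)]
struct Ctx {
    pending_cte_harvest: Option<PendingCteHarvest>,
    pending_diagnostics: Vec<String>,
}

impl Ctx {
    /// What the `writes_cte_name` path does after pushing the placeholder.
    fn arm(&mut self, placeholder_index: usize) {
        self.pending_cte_harvest = Some(PendingCteHarvest {
            placeholder_index,
            col_list: Vec::new(),
        });
    }

    /// What a `cte_column` ident does: extend the armed request, if any.
    fn push_cte_column(&mut self, name: &str) {
        if let Some(pending) = self.pending_cte_harvest.as_mut() {
            pending.col_list.push(name.to_string());
        }
    }

    /// Stand-in for the scoped-subgrammar walk. `Option::take` removes
    /// the request on entry, so it is cleared whether or not the body
    /// matches; the request is only acted on when the body matched.
    fn walk_body(&mut self, body_matched: bool) -> Option<PendingCteHarvest> {
        let request = self.pending_cte_harvest.take();
        if body_matched { request } else { None }
    }

    /// Move all accumulated diagnostics out, leaving an empty vector
    /// so a reused context starts clean.
    fn drain_diagnostics(&mut self) -> Vec<String> {
        std::mem::take(&mut self.pending_diagnostics)
    }
}
```

Taking the request up front, rather than clearing it on each exit path, is one way to guarantee that a speculatively walked and rolled-back body cannot leave a stale request behind.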