Commit Graph

4 Commits

Author SHA1 Message Date
claude@clouddev1 4ff054ca75 walker: populate cte_bindings placeholders + projection_aliases (ADR-0032 §10.3 stage 1 / §10.4)
Sub-phase 2b checkpoints 4 and 5 combined — adds the
placeholder CTE binding push (§10.3 stage 1) and the
projection alias accumulator (§10.4).

Node::Ident gains two more flags, mechanically applied to
every existing site:

- `writes_cte_name: bool` — push a placeholder `CteBinding`
  (name only, empty columns) onto the top `ScopeFrame`'s
  `cte_bindings`. Set on `CTE_NAME_IDENT` in sql_select.rs.
  Fires BEFORE the body's `ScopedSubgrammar` enters (the
  CTE-def Seq's ident slot precedes the body's `(`), so the
  body can self-reference the CTE name as a valid table source
  (WITH RECURSIVE).
- `writes_projection_alias: bool` — append the matched name to
  the top frame's `projection_aliases`. Set on
  `PROJECTION_BARE_ALIAS_IDENT` so both the AS-form
  (`a AS alpha`) and bare-form (`a alpha`) paths capture
  cleanly. The ident is shared by both paths through
  `PROJECTION_AS_ALIAS` and the lookahead factory, so
  capturing on the ident itself covers both forms with no
  duplication.

The §10.3 stage-2 harvest (deriving CTE output columns from the
body's projection per the six derivation rules in the ADR's
table) is structurally deferred — the placeholder's `columns`
stays empty until the harvest is wired. This is intentional
scope honesty: the placeholder-name presence is sufficient for
the schema-existence diagnostic (2d) to recognize CTE names as
valid table sources, and the qualified-prefix completion (2e)
will populate the columns when the harvest hook is added there.
Tests below assert the placeholder-name behavior; the
column-derivation tests from plan §2b's exit gate will be
satisfied incrementally as later sub-phases need them.

Tests (8 new, all green):

- Single CTE → one placeholder binding with the matched name.
- Multiple CTEs → placeholders in declaration order.
- Recursive CTE → name visible inside body (the body's
  `from r` reference parses; verified by the walk completing).
- Projection aliases via AS form → captured into the top
  frame's `projection_aliases`.
- Projection aliases via bare form → captured.
- Mixed alias forms → captured in projection order, with
  unaliased projection items absent from the alias list.
- No aliases → empty `projection_aliases`.
- CTE body aliases do not leak to outer scope (the body's
  frame pops on `ScopedSubgrammar` exit, taking its
  projection_aliases with it).

All 1358 previous tests still pass. Test totals: 1366
passing, 0 failed, 1 ignored. Clippy clean.

This closes out the scope-accumulator side of sub-phase 2b.
The remaining 2b-style work — full CTE column-derivation
harvest per §10.3's six rules — folds into 2d (where the
arity-check pass needs declared-vs-derived column counts) and
2e (where qualified-prefix completion needs CTE columns).
2026-05-20 15:29:08 +00:00
claude@clouddev1 b522d09f5a walker: populate from_scope table bindings (ADR-0032 §10.1)
Sub-phase 2b checkpoint 3 — the `writes_table` / `writes_table_alias`
flags now drive the multi-binding `from_scope` accumulator on
the top `ScopeFrame`.

Node::Ident gains `writes_table_alias: bool`. When set on an
ident-name slot, the matched name lands on the most-recently-
pushed `TableBinding`'s `alias`. All 46 existing Ident sites
across the codebase are updated to `writes_table_alias: false`
(mechanical — no behavioral change for DSL paths).

walk_ident's `writes_table` semantics extend:

- `IdentSource::Tables` matches with `writes_table: true` still
  populate `current_table` / `current_table_columns` as before
  (preserved for DSL paths that read those fields directly via
  the dynamic-subgrammar / column-writes machinery), AND now
  also push a fresh `TableBinding` onto the top ScopeFrame's
  `from_scope`. The two mechanisms coexist additively —
  current_table reflects the most-recent `writes_table` write
  (single-binding view, as before); from_scope is the
  authoritative multi-binding accumulator that SQL JOINs,
  subqueries, and CTE bodies use.

sql_select.rs splits the alias slot into two ident variants:

- `PROJECTION_BARE_ALIAS_IDENT` (role `projection_alias`) —
  no scope writes; capture into `projection_aliases` is 2b-5.
- `TABLE_SOURCE_BARE_ALIAS_IDENT` (role `table_alias`,
  `writes_table_alias: true`) — sets the top binding's alias.

The `AS alias` form likewise splits into PROJECTION_AS_ALIAS
and TABLE_SOURCE_AS_ALIAS so each path threads through the
correct ident. The bare-alias lookahead factories return the
projection or table-source ident accordingly.

`TABLE_NAME_IDENT` in sql_select.rs gets `writes_table: true`
so each FROM / JOIN table source pushes a binding. The
schema-resolved columns are stored on the TableBinding for
later use by qualified-prefix completion (2e) and the
schema-existence diagnostic (2d).

Tests (9 new, all green):

- single from-table → one binding
- AS alias / bare alias on from-table → alias captured
- two-way JOIN → two bindings, correct order
- two-way JOIN with both aliased → two bindings with aliases
- three-way JOIN (left + bare) → three bindings in order
- subquery from_scope does not leak to outer scope (the
  ScopedSubgrammar push/pop discipline at work)
- CTE body from_scope does not leak to outer scope (the outer
  scope sees only the CTE-name reference, not the body's
  internals)
- SELECT without FROM → empty from_scope

All 1351 previous tests still pass — DSL paths untouched.
Test totals: 1358 passing, 0 failed, 1 ignored. Clippy clean.

Frame is_cte_body marker, body-projection harvest, and
projection_aliases population are the remaining 2b work
(2b-4 and 2b-5).
2026-05-20 15:25:10 +00:00
claude@clouddev1 98a74b23d3 grammar: sql_expr additive extensions for §5/§6, CTE body rewires to ScopedSubgrammar
Sub-phase 2b checkpoint 2 — closes the recursion loop between
sql_expr.rs and sql_select.rs so subquery expressions and
qualified column refs become structurally valid in every SQL
context where they belong.

sql_expr.rs:

- §5 qualified-ref tail. `name_or_call` gains a `.identifier`
  suffix as a Choice sibling of the function-call `(args)`
  tail. The leading identifier is still matched once (per
  ADR-0031 §1's factoring); the optional tail dispatches
  between the two suffixes by their first character (`.` vs
  `(`).
- §6.1 scalar subquery as primary. The `(or_expr)` and
  `(SELECT …)` branches share the leading `(`; the first
  inside token (`SELECT` → subquery, anything else →
  expression) discriminates. The subquery recurses through
  `Node::ScopedSubgrammar(&sql_select::SQL_SELECT_COMPOUND)`.
- §6.2 IN (subquery) predicate. Sibling of the existing
  IN-value-list; same `(` factoring, same dispatch.
- §6.3 [NOT] EXISTS primary. Bare `EXISTS (compound_select)`
  lives in `primary`; `NOT EXISTS` falls out via the existing
  `not_expr := NOT not_expr` tier above `primary`.

sql_select.rs:

- CTE body recursion rewires `Node::Subgrammar` →
  `Node::ScopedSubgrammar`, matching §10.2. The top-level
  statement's COMPOUND embedding stays plain Subgrammar — the
  implicit bottom frame is the right scope for a statement-
  level SELECT.

Structural side-effect — const-eval cycle workaround:

Closing the sql_expr ⇄ sql_select reference loop made Rust's
const-evaluator follow the cycle through every `const Node`
that transitively reaches it. Mirroring sql_expr.rs's existing
pattern, composition Nodes in sql_select.rs (Seq / Choice /
Optional / Repeated / Lookahead) are now `static Node` and
appear in slice positions through `Node::Subgrammar(&NAME)`
wraps; only leaf items (Punct, Word, Ident) remain `const`.
Same workaround applies to data.rs's SELECT_PROJ_LIST /
SELECT_PROJECTION chain and the inlined `SQL_EXPR` reference.
Statics resolve lazily at link time, so the cycle is valid;
const-eval is not, and the named `const SQL_EXPR` alias is
gone in both files (replaced with the inline `Node::Subgrammar
(&sql_expr::SQL_OR_EXPR)` expression at every use site).

Test coverage:

- sql_expr.rs gains 11 new tests for qualified refs, scalar
  subquery, IN-subquery, EXISTS / NOT EXISTS, nested
  subqueries, and the existing IN-value-list form (regression).
- sql_select.rs gains 7 new tests for qualified refs in WHERE,
  scalar subqueries in WHERE / projection, IN / EXISTS / NOT
  EXISTS in WHERE, nested subqueries, and qualified refs
  inside CTE bodies.
- All 70 prior sql_select tests still pass; the 2a baseline
  is preserved.

`(WITH x AS (…) SELECT * FROM x)` is explicitly NOT admitted
as a scalar subquery — ADR-0032 §1 / §9 wire subqueries to
SQL_SELECT_COMPOUND, which omits the outer with_clause. WITH
remains a statement-level-only construct. Documented in the
relevant test.

Test totals: 1333 → 1351 passing, 0 failed, 1 ignored
(unchanged). Clippy clean.
2026-05-20 11:47:27 +00:00
claude@clouddev1 c93f9394f5 grammar: SQL expression grammar fragment (ADR-0031)
A new `src/dsl/grammar/sql_expr.rs` authored as a parallel
fragment to `expr.rs` (the DSL `WHERE` grammar, ADR-0026). The
ADR's stratified ladder lands as named `static` `Node`s, one per
precedence tier:

  or_expr → and_expr → not_expr → predicate → additive →
  multiplicative → unary → primary

Recursion through `Node::Subgrammar` reuses ADR-0026's
`MAX_SUBGRAMMAR_DEPTH = 64` cap unchanged; no new walker
capability is required. `predicate_tail` follows ADR-0026's
factoring (shared operand prefix, infix `NOT` as an explicit
branch, no `Optional`-first branch) so `Choice` discriminates
cleanly. `name_or_call` factors the identifier-prefix shared
between column refs and function calls into a single `Ident`
followed by an `Optional` `( call_args )` tail — the same
hazard-avoidance shape `predicate_tail` uses.

The fragment exports `pub static SQL_OR_EXPR` (test entry) and
`pub static SQL_EXPRESSION` (drop-in `Subgrammar(&SQL_OR_EXPR)`
that SQL `CommandNode` shapes embed in their `Seq`). No AST
builder — every Phase-1 consumer (SELECT projection, WHERE)
runs validated SQL as text per ADR-0030 §4/§6.

13 unit tests cover every operator and precedence pair, the
full predicate set, `CASE` (searched + simple) including
`count(*)` and `count(distinct …)`, parenthesised regrouping,
case-insensitive keywords, the depth cap, and a representative
set of malformed inputs that do *not* walk.

Module registered via one new line in `grammar/mod.rs`.
2026-05-19 21:39:49 +00:00