Files
rdbms-playground/docs/plans/20260520-adr-0032-phase-2.md
T
claude@clouddev1 a7db7dd2da docs: ADR-0032 + Phase 2 plan — full SQL SELECT grammar
ADR-0030 §3 commissioned a focused ADR for the full SELECT
grammar (the "SELECT — full" phase). ADR-0032 records the
decisions; docs/plans/20260520-adr-0032-phase-2.md is the
implementation plan walking the work.

Phase 2's grammar surface:

- Five JOIN flavours (INNER, LEFT, RIGHT, FULL OUTER, CROSS).
  NATURAL/USING/comma-FROM explicitly OOS.
- All four set ops (UNION, UNION ALL, INTERSECT, EXCEPT).
- WITH and WITH RECURSIVE CTEs, with optional (col-list) renaming.
- Scalar subqueries, IN (SELECT …), [NOT] EXISTS as additive
  primary branches in sql_expr (redeems ADR-0031 §7 OOS-1).
- Qualified column refs t.c / alias.c as a name_or_call tail
  (redeems ADR-0031 §7 OOS-2).
- LIMIT n [OFFSET m]; legacy `LIMIT m, n` OOS.
- DISTINCT/ALL, t.* projection, bare-alias projection (lifts
  Phase-1 §4.2's autonomous decision).

Walker-capability honesty (§10): ADR-0030 §8's "ambient
assistance comes for free" holds for grammar recursion (reuses
ADR-0026's Subgrammar + depth cap unchanged) but not for
completion scope. Phase 2 adds a new Node::ScopedSubgrammar
variant alongside the existing Node::Subgrammar (DSL Expr and
sql_expr recursion untouched), a from_scope_stack of
ScopeFrames holding from_scope / cte_bindings /
projection_aliases, qualified-prefix completion narrowing, and
a post-walk fixup pass that re-resolves projection-list
identifier highlighting/validity once FROM is parsed (the
projection-before-FROM problem).

CTE column resolution (§10.3): SELECT * and explicit-projection
CTE bodies both yield real column completion past cte_alias.|
via a body-projection derivation rule that runs at the body's
ScopedSubgrammar exit and writes derived columns back into the
binding.

Diagnostics (§11): every Phase-2 validation case classified
against ADR-0027's ERROR/WARNING guideline. Five new diagnostic.*
catalog keys for parse-time-detectable cases (unknown_qualifier,
ambiguous_column, projection_alias_misplaced, cte_arity_mismatch,
compound_arity_mismatch) plus eight engine.* translation keys.
A MatchedPath-walking predicate-warnings variant closes the
Phase-1 carry-over gap where SQL WHERE expressions emitted no
LIKE-on-numeric / = NULL / type-mismatch warnings — ADR-0027
Amendment 1 finally extends to the SQL surface.

Result-column type resolution (§12): rusqlite 0.39.0 exposes
column_table_name / column_origin_name / column_database_name
behind a `column_metadata` feature; verified. Bare column refs
recover their playground type — partially lifts Phase-1 §4.5's
bool→0/1 deferral.

The implementation plan breaks Phase 2 into seven sub-phases
(2a–2g) with explicit exit gates per sub-phase and a cross-cut
verification matrix that names every "X comes for free" claim
from ADR-0030/0031/0032. The Phase-1 SQL-expression
predicate-warning gap is a named row, preventing an analogous
silent gap from shipping. The plan encodes the user's standing
authorization for the implementer to walk uninterrupted between
gates and commit with standard messages — escalation
discipline preserved for design ambiguities and real blockers.
Pushes remain user-only.

New docs/plans/ directory sets a pattern for future phase plans.

Status: Accepted.
2026-05-20 10:25:43 +00:00

38 KiB
Raw Blame History

Implementation plan — ADR-0032 (ADR-0030 Phase 2)

Plan author date: 2026-05-20 Driving ADR: ADR-0032 — The full SQL SELECT grammar Parent ADR: ADR-0030 — Advanced mode: the standard-SQL surface

Purpose

ADR-0032 is the decision: what Phase 2 covers, how it composes, and what consequences fall out. This plan is the mechanics: the build order, the per-step exit gates, the cross-cut verifications that prevent the kind of silent gap Phase 1 shipped (the SQL-expression predicate-warning gap closed in ADR-0032 §11.6).

The plan does not re-litigate ADR-0032's decisions. Any finding during implementation that suggests a decision should change is escalated to the user and resolved by ADR amendment (or a new ADR) before code proceeds — never by silent drift.

Why this plan exists

Phase 1 of ADR-0030 shipped with a structurally green test suite (1260 passing, 0 failing) and a believed-but-unverified "ambient assistance comes for free" claim. The verification missed the fact that ADR-0027's predicate warnings — emitted by a pass walking the DSL Expr AST — could not fire on SQL expressions because sql_expr.rs builds no AST (ADR-0031 §2). ADR-0032 §11.6 documents the gap and closes it; this plan exists to make sure Phase 2 does not ship with an analogous unnoticed gap.

The two failure modes the plan is built to prevent:

  • The "free for free" claim shipping without a test that proves the claim. Every "X works for SQL automatically" in ADR-0030/0031/0032 needs an explicit cross-cut test.
  • A sub-phase reporting green because its own scope is covered, while a behaviour it enables for downstream code is silently broken. Each sub-phase's exit gate includes the cross-cut behaviours it unlocks, not just its own internal correctness.

Working-mode reminder

Per CLAUDE.md's "Working mode — solo with sub-agents by default": this work proceeds in solo mode unless the user explicitly invokes /team. The phased structure below maps to solo-mode's Phase 1 → 5 discipline within each sub-phase:

  • Requirements extraction (the sub-phase's scope below) is Phase 1.
  • Divergent exploration is rarely needed for individual sub-phases here (ADR-0032 already selected the approach), but if implementation reveals an ambiguity, it lands as a Phase-2 exploration before code proceeds.
  • The DA hat is worn at every sub-phase gate. The DA critique is written, not silent (CLAUDE.md "Devil's Advocate discipline").
  • Escalation rather than guessing applies throughout.

Standing authorizations and continuation policy

The user has granted standing authorizations for this plan, overriding the corresponding CLAUDE.md defaults for this plan's scope only. They are recorded here verbatim so any implementer (the original session, a future session, or a fresh agent) operates on the same terms.

1. Walk the plan without interruption when no interruption is needed

The plan's sub-phases (2a2g) have explicit, written exit gates. Within a sub-phase, the implementer proceeds through the build steps and verifies the gate without asking for permission at each step. Between sub-phases, the implementer proceeds to the next one without asking, provided the previous gate is green.

The implementer does NOT interrupt to:

  • Confirm "should I continue to the next sub-phase?"
  • Confirm minor implementation choices that match surrounding code (CLAUDE.md's code-style discipline applies; pick the consistent option and move on).
  • Confirm test naming, file organisation, commit granularity within a sub-phase.
  • Report progress mid-step. A status update arrives at gate boundaries — not every five minutes.

2. When the implementer MUST escalate

The escalation rule from CLAUDE.md ("Escalate ambiguity — do not decide for the user") is not relaxed. Standing authorization to walk uninterrupted is not authorization to guess on design questions. The implementer escalates if:

  • An exit gate cannot be made green by work in the current sub-phase's scope — a test fails for a reason that suggests the ADR is wrong, or a behaviour the gate requires turns out to need machinery not in ADR-0032.
  • An ambiguity arises that the ADR and this plan do not settle. See the "Open questions to escalate before code starts" section at the end of this plan — those are pre-identified. Anything similar that surfaces is also escalation territory.
  • An autonomous design decision is required (a choice outside ADR-0032 §§113's settled list and the plan's per-sub-phase scope). Per CLAUDE.md's "Out of scope and non-blocking require user confirmation": the default assumption is that the work is in scope and is blocking unless the user has said otherwise.
  • A "real blocker" is encountered — environment, dependency, or test-infrastructure failure that prevents progress and cannot be resolved within the sub-phase. (A flaky test is not a real blocker — investigate per CLAUDE.md's "no excuses" rule on test infrastructure.)

3. Commits are pre-authorized at standard granularity

The user has explicitly pre-authorized commits within this plan's scope, overriding CLAUDE.md's "All commits MUST be confirmed by the user first" for this plan. The implementer:

  • Commits at natural break points within each sub-phase and always at sub-phase gate boundaries. Granular is better than monolithic.
  • Writes detailed commit messages in the project's existing style — area: short subject first line, then a body explaining what changed and why. Recent precedent:
    grammar: SQL SELECT end-to-end (ADR-0030 Phase 1)
    tests: Phase 1 SQL SELECT integration tests
    app: mode-threaded completion, overlay, and validity indicator
    
    Body should reference the ADR section being implemented (e.g., "ADR-0032 §10.2 — Node::ScopedSubgrammar variant added") so a reader can trace each commit back to a decision.
  • Includes NO AI attribution of any form (Co-Authored-By, generated-by lines, AI footers, …). This is a global rule (CLAUDE.md); the standing authorization here is for committing, not for relaxing attribution discipline.

4. Pushes remain a user step

CLAUDE.md's "Push is a user step — agents cannot and shall never do it" is not overridden. The implementer commits freely; the user pushes when they choose. Unpushed commits are a normal working state, and the implementer never describes work as blocked on a push.

5. End-of-plan handoff

At the end of 2g, after the verification report is produced and the DA's final PASS verdict is recorded, the implementer stops and surfaces a summary to the user — exact test counts, the filled cross-cut matrix, the requirements-to-test mapping, any autonomous decisions made and their justification. The user confirms acceptance before the plan is considered done. (This is the one explicit hand-back point.)

If the user wants follow-up work — adjustments, polish, additional tests — the implementer addresses it and re-runs the gate before re-surfacing. The plan does not declare itself complete; the user does.

Sub-phase overview

Sub-phase Title Test-suite gate
2a Grammar fragment All §1 productions accept; OOS shapes reject
2b ScopedSubgrammar + scope accumulators Subquery scope isolation; CTE self-reference visible
2c Phase-1 grammar migration All 7 Phase-1 SQL SELECT tests still green
2d Diagnostics passes Every catalog key has positive + negative tests; Phase-1 gap closure verified
2e Completion + post-walk fixup Per-slot completion matrix green; projection-before-FROM scenario verified
2f Type resolution Each of 10 playground types preserved through bare ref
2g Integration sweep + cross-cut verification All cross-cut tests green; final verification report

Each sub-phase ships independently: green tests, clippy clean, a written DA gate review, and (per project rules) a user-confirmed commit message before commit.


Sub-phase 2a — Grammar fragment

Scope (in)

  • Author src/dsl/grammar/sql_select.rs exporting pub static SQL_SELECT_STATEMENT: Node and pub static SQL_SELECT_COMPOUND: Node.
  • Encode the full §1 BNF: with_clause, compound_select, select_core, the projection list with DISTINCT/ALL/*/ t.*/bare-alias, the from_clause/join_clause chain with the five JOIN flavours, where_clause, group_by_clause, having_clause, order_by_clause, limit_clause with OFFSET.
  • Wire sql_expr::SQL_OR_EXPR into every expression slot via Subgrammar.
  • Author the structural one-line hint key per slot (per ADR-0032 Consequences "Hint-panel prose" bullet); the keys are placed in the catalog with placeholder text. Rich teaching prose is Phase 6.
  • Encode the reject_internal_table validator on every new table-source slot.

Scope (out — explicit)

  • The ScopedSubgrammar variant — that is 2b.
  • The Phase-1 data::SELECT migration — that is 2c.
  • Any new WalkContext field — that is 2b.

Build steps

  1. Stub sql_select.rs with the static node skeleton and re-export through grammar/mod.rs.
  2. Author each production tier as a named static Node in loosest-to-tightest order, matching the §1 BNF.
  3. For each production, add a unit test in the file (the expr.rs / sql_expr.rs test pattern) that walks a representative input.
  4. Author the OOS reject tests (NATURAL JOIN, comma-FROM, LIMIT m, n, VALUES, LATERAL, window-function OVER).
  5. cargo test and cargo clippy --all-targets -- -D warnings.

Exit gate

Required green tests:

  • Every JOIN flavour (INNER, bare JOIN, LEFT [OUTER], RIGHT [OUTER], FULL [OUTER], CROSS) accepts.
  • Every set op (UNION, UNION ALL, INTERSECT, EXCEPT) accepts in a compound chain.
  • Both CTE forms (WITH, WITH RECURSIVE) accept; both column- list and no-column-list shapes accept.
  • DISTINCT/ALL accepts and rejects each other co-occurring.
  • *, t.*, bare-alias projection, AS alias projection all accept.
  • Qualified column refs (t.c) accept in WHERE/projection/ ORDER BY/HAVING/GROUP BY.
  • LIMIT n and LIMIT n OFFSET m both accept; LIMIT m, n rejects.
  • Every §13 OOS-1 to OOS-7 shape rejects with a parse error (not a silent accept).

Required negative tests: absence of bare RECURSIVE (must be WITH RECURSIVE); absence of JOIN after INNER (must be INNER JOIN); etc. — the obvious parse-failure modes for each production.

Test-suite count: ≥ 50 new unit tests in sql_select.rs (rough order; the actual count is dictated by production coverage).

Other gates:

  • cargo test total: 1260 (baseline) + ≥ 50 (new) = ≥ 1310. Zero failures, one ignored (the unchanged doctest), zero skipped.
  • cargo clippy --all-targets -- -D warnings clean.

DA gate (written)

  • Does every §1 BNF production have at least one positive test?
  • Does every §13 OOS shape have at least one reject test?
  • Did the implementation introduce any structure not covered by ADR-0032 §1 or §10–§11 (an autonomous decision)? If yes, escalate to the user before the gate is signed.
  • Are the OOS rejects emitting engine-neutral parse errors, or are they leaking SQLite messages?

Sub-phase 2b — ScopedSubgrammar + scope accumulators

Scope (in)

  • Add the new walker node variant Node::ScopedSubgrammar(&Node).
  • Add the ScopeFrame struct and from_scope_stack: Vec<ScopeFrame> to WalkContext. Add from_scope, cte_bindings, projection_aliases as fields inside ScopeFrame.
  • Teach the walker driver to push/pop a fresh frame on ScopedSubgrammar entry/exit.
  • Rewire every Subgrammar reference to &SQL_SELECT_COMPOUND in sql_expr.rs and sql_select.rs to use the new variant.
  • Teach from_clause / join_clause walks to populate the current frame's from_scope.
  • Teach with_clause walks to populate the outer frame's cte_bindings (placeholder push pre-body per §10.3 stage 1).
  • Teach projection_item walks to populate the current frame's projection_aliases.
  • Implement §10.3 stage 2: at the CTE body's frame exit, derive output columns from the body's projection and rewrite the placeholder binding in the outer frame.
  • Keep current_table / current_table_columns as derived helpers (top-frame single-binding view) — verify the DSL paths still see them correctly.

Scope (out — explicit)

  • Qualified-prefix completion narrowing — that is 2e.
  • The post-walk fixup pass for projection identifiers — that is 2e.
  • The arity-check ERROR pass (uses the frame-exit hook from §10.3 stage 2 but adds a new pass) — that is 2d.

Build steps

  1. Add the variant to Node; add walker driver match arm (push, recurse, pop).
  2. Add ScopeFrame and from_scope_stack to WalkContext; rewrite current_table / current_table_columns accessors to derive from the top frame.
  3. Rewire Subgrammar references to ScopedSubgrammar in sql_select.rs (subquery primary, IN/EXISTS tails, CTE bodies — except for the ladder-internal recursions in sql_expr.rs that stay as Subgrammar).
  4. Teach the from_clause / join_clause walker handlers to build and push TableBindings.
  5. Teach the with_clause handler to push the placeholder CteBinding before entering the body.
  6. Teach the projection_item handler to record aliases.
  7. Implement the body-frame-exit hook that derives CTE output columns per §10.3 stage 2 rules.
  8. Unit tests for each.

Exit gate

Required green tests:

  • An empty query through the walker still produces the baseline output (no spurious push).
  • A DSL UPDATE T … still sets current_table correctly via the derived helper (DSL paths untouched).
  • A subquery's FROM does not leak: parse SELECT * FROM a WHERE id IN (SELECT id FROM b), then probe a cursor outside the subquery — only a is in from_scope.
  • A correlated reference resolves: parse SELECT * FROM a WHERE EXISTS (SELECT 1 FROM b WHERE b.x = a.x), then probe inside the subquery — a is visible via outer frame.
  • A CTE name is visible inside its own body: parse WITH r AS (SELECT 1 UNION ALL SELECT n+1 FROM r) SELECT * FROM r, then probe inside the body — r is in cte_bindings.
  • A CTE body's SELECT * derives the right output columns: parse WITH x AS (SELECT * FROM users) …, verify x's derived columns match users' columns by name and playground type.
  • A CTE body's SELECT a, b AS bb FROM t derives two columns with the right names and types (a: source type; bb: source type since underlying is a bare ref).
  • A CTE body's SELECT a + 1 FROM t derives one column with name = None.
  • A CTE column list (c1, c2) renames positionally; arity mismatch is detectable at frame exit (the diagnostic is emitted by 2d but the detection runs here).
  • The depth cap (MAX_SUBGRAMMAR_DEPTH = 64) still fires on pathological nesting — exercised through both variants.

Other gates:

  • All 2a tests still green.
  • All 1260 baseline tests still green (DSL paths verified insulated).
  • cargo clippy --all-targets -- -D warnings clean.

DA gate (written)

  • Did any DSL test regress? If yes, the current_table helper is not deriving correctly — fix before gate signs.
  • Is the ScopedSubgrammar push/pop happening on every entry and every exit, including error paths? Read the walker driver's match arm and verify.
  • Does the CTE column derivation handle all six §10.3 rules? Walk the rule table and find a test for each.
  • Did 2b introduce any structure ADR-0032 doesn't sanction?

Sub-phase 2c — Phase-1 grammar migration

Scope (in)

  • Move the Phase-1 SQL SELECT grammar nodes from src/dsl/grammar/data.rs into src/dsl/grammar/sql_select.rs (or remove them if redundant with the new §1 productions).
  • Rebuild the data::SELECT CommandNode so its body is a reference to SQL_SELECT_STATEMENT.
  • Confirm the seven Phase-1 SQL SELECT integration tests still pass without modification.

Scope (out — explicit)

  • Any behaviour change. This is a refactor: the Phase-1 surface must be preserved exactly.

Build steps

  1. Identify the Phase-1 SQL SELECT static nodes in data.rs (SELECT_PROJECTION_*, SELECT_WHERE_*, SELECT_ORDER_*, SELECT_LIMIT_* if separate, plus any sub-tree they reference).
  2. Move them into sql_select.rs or remove if the §1 productions cover them.
  3. Rewrite the data::SELECT CommandNode to reference SQL_SELECT_STATEMENT.
  4. Run the seven Phase-1 SQL SELECT integration tests.

Exit gate

Required green tests:

  • All seven Phase-1 SQL SELECT integration tests (tests/sql_select_*.rs or equivalent) pass without modification.
  • All 2a and 2b tests still green.
  • All 1260 baseline tests still green.

Other gates:

  • cargo clippy --all-targets -- -D warnings clean.
  • Dead code: confirm via cargo check that the old data.rs Phase-1 SELECT nodes are either removed or re-exported, not orphaned.

DA gate (written)

  • Were any Phase-1 SELECT behaviours quietly altered? Diff the Phase-1 integration test outputs (text and structural) — they must be identical, not "equivalent".
  • Did the migration produce dead code? Removing dead code is fine; leaving it isn't.

Sub-phase 2d — Diagnostics passes

Scope (in)

  • Implement the three diagnostic passes per ADR-0032 §11.7:
    • Schema-existence ERROR pass (extended for multi-binding from_scope, qualified-reference resolution, ambiguity detection).
    • Arity-check ERROR pass (new, hooked to CTE-body and compound-query frame-exits from 2b).
    • Predicate-warnings pass (extended with a MatchedPath-walking variant for sql_expr per §11.6).
  • Author the five new diagnostic.* catalog keys (§11.5): unknown_qualifier, ambiguous_column, projection_alias_misplaced, cte_arity_mismatch, compound_arity_mismatch.
  • Author the eight new engine.* catalog keys for friendly- error translations (§11.5).

Scope (out — explicit)

  • The post-walk projection-list fixup pass (§10.6) — that is 2e.
  • Detection of engine-rejected cases (§11.4) — those stay engine-side.

Build steps

  1. Extend the schema-existence pass to iterate every from_scope binding and the active cte_bindings.
  2. Add qualified-reference resolution: when a name_or_call match has a '.' identifier tail, resolve the qualifier against from_scope aliases first, then table names.
  3. Add ambiguity detection: an unqualified column ref that resolves into multiple bindings.
  4. Add projection-alias-misplaced check: walk sql_expr slots in WHERE/GROUP BY/HAVING; identifier matches against the current frame's projection_aliases → ERROR.
  5. Author the arity-check pass: at every ScopedSubgrammar exit, if the frame is a CTE body, compare declared vs derived column counts; if the parent context is a compound-query chain, compare against the previous leg's arity.
  6. Author the MatchedPath-walking predicate-warnings variant: identify LIKE / =/!= with NULL / mismatched-type comparison predicate-tails by node-name labels; emit the corresponding catalog key with the offending operand's span.
  7. Author the catalog entries in the i18n catalog file with engine-neutral wording.
  8. Author tests per §11.2 and §11.6.

Exit gate

Required green tests:

  • One positive and one negative test per new ERROR catalog key (the negative confirms the key does not fire on a well-formed equivalent):
    • diagnostic.unknown_qualifier: SELECT t.c FROM u (where t is not in scope) fires; SELECT u.c FROM u doesn't.
    • diagnostic.ambiguous_column: SELECT id FROM a JOIN b ON a.id = b.id fires for the bare id; SELECT a.id FROM a JOIN b ON a.id = b.id doesn't.
    • diagnostic.projection_alias_misplaced: SELECT a + b AS x FROM t WHERE x > 0 fires; … ORDER BY x doesn't (alias is bound by ORDER BY time).
    • diagnostic.cte_arity_mismatch: `WITH x(a, b) AS (SELECT
      1. SELECT * FROM x` fires; matched arity doesn't.
    • diagnostic.compound_arity_mismatch: SELECT 1, 2 UNION SELECT 1 fires; matched arity doesn't.
  • The Phase-1 gap closure explicitly verified: SELECT * FROM products WHERE price LIKE 5 fires diagnostic.like_numeric (the test that would have caught the Phase-1 gap). Sister tests: WHERE name = NULL fires diagnostic.eq_null; WHERE age = 'old' fires diagnostic.type_mismatch.
  • Predicate warnings extend across every SQL slot: the same LIKE-on-numeric pattern fires in HAVING, ON, a CASE WHEN clause, a projection item, and an ORDER BY item — one test per slot.
  • Engine.* keys: integration tests that drive the engine to the error condition verify the friendly-error layer surfaces the engine-neutral wording (one test per key, run-and-assert shape).

Other gates:

  • All 2a2c tests still green.
  • All 1260 baseline tests still green.
  • Existing DSL predicate warnings (the DSL Expr AST variant) still fire correctly — verify by re-running the DSL test suite section that exercises them.
  • cargo clippy --all-targets -- -D warnings clean.

DA gate (written)

  • Does every new catalog key have both a positive and a negative test, or did any slip in with only one?
  • Was the Phase-1 gap closure verified, or did the test slip past in "well, it should work now"?
  • Are the engine.* messages truly engine-neutral, or does any one leak SQLite product/feature names? (ADR-0030 §7.)
  • Did 2d cover every Phase-2 case in §11.2, or were any silently downgraded to "engine handles"?

Sub-phase 2e — Completion + post-walk fixup

Scope (in)

  • Add qualified-prefix completion narrowing (§10.5): when the matched path immediately preceding an IdentSource::Columns slot ends with Ident '.', the completer narrows candidates to the named binding's columns.
  • Implement the post-walk fixup pass for projection-list identifiers (§10.6): collect projection Ident terminals during the walk; after walk completion, re-resolve each against the final from_scope; rewrite the highlight class and validity diagnostic in the walker's accumulated output.
  • The pass runs as a final stage of the walk itself (§10.6 integration-point convention).

Scope (out — explicit)

  • Completion behaviour beyond column-name narrowing (keyword completion / function names / table names) — those come for free from the walker's expected-set machinery and need only cross-cut verification, not new code.

Build steps

  1. Add a "prefix qualifier" hint to the completion API that the walker passes when the cursor is immediately after Ident '.'.
  2. Extend IdentSource::Columns completion to honour the qualifier hint when present.
  3. Implement the projection-Ident recording during the walk (append a small struct per projection identifier to a walker-local list).
  4. Implement the fixup pass that runs after the main walk: re-resolve each recorded identifier; rewrite the highlight run; update the diagnostics vector.
  5. Tests per the exit gate.

Exit gate

Required green tests:

  • Qualified-prefix completion: at the cursor in SELECT t.|, candidates are t's columns; at SELECT u.| where u is not in scope, candidates are empty.
  • Alias-prefix completion: at the cursor in SELECT FROM a AS x JOIN b AS y ON x.|, candidates are x's (i.e. a's) columns.
  • Qualified-prefix narrowing for CTE: at WITH x AS (SELECT a, b FROM t) SELECT x.|, candidates are a and b (the derived columns).
  • Qualified-prefix narrowing for CTE with SELECT *: at WITH x AS (SELECT * FROM users) SELECT x.|, candidates are users' columns (per §10.3 derivation).
  • Projection-before-FROM: typing SELECT col1, col2 produces a generic-identifier highlight on col1/col2 (no from_scope to resolve against); typing FROM t after produces a column highlight on col1/col2 if they exist in t, or an unknown-identifier diagnostic if they don't — verified within one debounce/walk cycle.
  • ORDER BY alias resolution: parses SELECT a + b AS total FROM t ORDER BY total; the alias resolves without a fixup pass (single-pass walk handles it per §10.6).
  • Completion across every Phase-2 slot: Tab at WHERE |, GROUP BY |, HAVING |, ON |, projection |, and ORDER BY | all offer expected candidates from the union of from_scope bindings.

Other gates:

  • All 2a2d tests still green.
  • All 1260 baseline tests still green.
  • cargo clippy --all-targets -- -D warnings clean.

DA gate (written)

  • Does the projection-before-FROM verification actually exercise the full re-walk cycle, or does it just call the fixup pass directly? (The user-observable behaviour is the re-walk on debounce; that is what the test must exercise.)
  • Does the qualified-prefix narrowing produce empty candidates when the qualifier doesn't resolve, or does it accidentally fall back to the global column set?
  • Did 2e introduce any priority/ordering for outer-frame candidates? ADR-0032 §10.5 explicitly says ordering is completion-tier polish, not specified here — if 2e added priority, that is an autonomous decision and escalates.

Sub-phase 2f — Type resolution

Scope (in)

  • Add "column_metadata" to the rusqlite feature list in Cargo.toml (alongside "bundled").
  • Implement the worker-side type-resolution helper that runs after prepare, before iterating rows: for each result column, query RawStatement::column_table_name(i) / column_origin_name(i); if both return Some, look up the playground type from SchemaCache.columns_for_table; otherwise leave the slot as None.
  • Call the helper from do_run_select before constructing the DataResult.

Scope (out — explicit)

  • Any new walker / grammar work. This is a worker-side change only.

Build steps

  1. Update Cargo.toml. cargo build to confirm the feature resolves through libsqlite3-sys and the bundled SQLite is rebuilt with SQLITE_ENABLE_COLUMN_METADATA.
  2. Add the resolver method on Database (probably resolve_select_column_types(&Statement, &SchemaCache) -> Vec<Option<Type>>).
  3. Wire into do_run_select.
  4. Tests per the exit gate.

Exit gate

Required green tests:

  • One test per playground type (all ten: text, int, real, decimal, bool, date, datetime, blob, serial, shortid) — a bare-column SELECT of a column of that type produces column_types[0] = Some(<that type>) and the renderer formats accordingly.
  • bool renders as true/false (the pedagogically- important case lifting Phase-1 §4.5).
  • Computed expression stays typeless: SELECT a + 1 FROM t produces column_types[0] = None; the renderer uses neutral alignment.
  • Multi-column mixed test: SELECT name, age, is_active, age + 1 FROM users produces [Some(text), Some(int), Some(bool), None].
  • CTE pass-through: WITH x AS (SELECT name FROM users) SELECT name FROM x recovers Some(text) — this verifies the engine's column-origin metadata follows through the CTE in the bundled SQLite build (verified, not assumed).
  • Subquery result: SELECT (SELECT name FROM users WHERE id = 1) — the engine's column-origin metadata may or may not follow through a scalar subquery; whichever it does, the test asserts the actual behaviour (not the wished-for behaviour) and an explanatory comment captures the engine's posture.
  • Unknown-table SELECT: SELECT * FROM nonexistent — the resolver does not panic; the parse-time ERROR already fired in 2d.

Other gates:

  • All 2a2e tests still green.
  • All 1260 baseline tests still green.
  • cargo clippy --all-targets -- -D warnings clean.

DA gate (written)

  • Does the resolver handle every result-column case (named column, expression, NULL literal, CTE pass-through) without panicking?
  • Is the rusqlite feature flag in place? grep column_metadata Cargo.toml.
  • Did 2f surface any cases where engine-side metadata doesn't follow through (CTEs, views, subqueries)? Each such finding must be documented as an honest limitation, with the test capturing the actual behaviour.

Sub-phase 2g — Integration sweep + cross-cut verification

Scope (in)

The catch-all sub-phase. Verifies the full end-to-end Phase 2 surface and every cross-cut "X comes for free" claim. Produces the final phase-exit verification report (next section).

Build steps

  1. Author the end-to-end integration tests (Tier 3) for real-world-shape queries (the exit-gate list below).
  2. Author or extend the cross-cut verification matrix (next section) for every "X comes for free" claim from ADR-0030/0031/0032.
  3. Run the full test suite. Confirm 1260 baseline + all sub-phase 2a2f additions + the 2g integration tests.
  4. Run cargo clippy --all-targets -- -D warnings.
  5. Produce the final verification report.

Exit gate

Required end-to-end integration tests:

  • JOIN with WHERE: SELECT a.name, b.total FROM customers a JOIN orders b ON a.id = b.customer_id WHERE a.country = 'DE' — runs, returns expected rows, the data-table renderer emits a clean output.
  • Recursive CTE: a small tree traversal (e.g., generate integers 1..10 via WITH RECURSIVE) — runs, returns 10 rows, renderer clean.
  • UNION of two SELECTs: runs, returns the union, renderer clean.
  • Correlated EXISTS subquery: SELECT name FROM customers c WHERE EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.id) — runs, returns expected rows, renderer clean.
  • history.log replay: every Phase-2 statement form written above replays from the log faithfully (ADR-0030 §11 unchanged).
  • reject_internal_table: FROM __rdbms_columns parse- rejects; WITH __rdbms_x AS (…) parse-rejects; a FROM __rdbms_* inside a CTE body parse-rejects.

Cross-cut verification matrix (next section): every claim must have a passing test.

Other gates:

  • Total test count: ≥ 1310 + (additions from 2b2g). The exact count depends on per-step authorship; the gate is "every required-test row above and in the cross-cut matrix has at least one green test", not a magic number.
  • Zero failures, zero skipped (excluding the unchanged doctest).
  • cargo clippy --all-targets -- -D warnings clean.

DA gate (written)

The final DA review, per CLAUDE.md Phase-5 verification:

  • Does every requirement on the ADR-0032 checklist have a test? Walk the next-section matrix row by row.
  • Are there any "free for free" claims from ADR-0030/0031/0032 that have no explicit test? (The Phase-1 gap pattern.) Each must have one.
  • Did any sub-phase silently downgrade a requirement to "later"? Phase-1 §4.1–§4.5 were autonomous decisions flagged at handoff; any Phase-2 autonomous decision must also be flagged with the same discipline.
  • Are ALL tests green AND zero-skipped? If not, the verdict is FAIL and 2g loops back.

Cross-cut verification matrix

Each row is a claim from ADR-0030, ADR-0031, or ADR-0032 that needs an explicit test, with the test's location. The matrix is filled in during 2g; the gate is "every row green".

Claim source Claim Test location Status
ADR-0030 §8 Syntax highlighting works for SQL keywords (TBD) (TBD)
ADR-0030 §8 Tab completion works for SQL keywords (TBD) (TBD)
ADR-0030 §8 Hint-panel prose appears at every SQL grammar slot (TBD) (TBD)
ADR-0030 §8 [ERR]/[WRN] validity indicator fires for SQL (TBD) (TBD)
ADR-0030 §8 Per-command parse-error usage fires for SQL (TBD) (TBD)
ADR-0030 §9 Out-of-subset construct (OVER (…)) produces engine-neutral parse error (TBD) (TBD)
ADR-0030 §11 Every Phase-2 SQL statement form logs to history.log and replays (TBD) (TBD)
ADR-0031 §5 Highlighting works for SQL expression operators / CASE / function calls (TBD) (TBD)
ADR-0031 §5 Column completion works for IdentSource::Columns slots in sql_expr (TBD) (TBD)
ADR-0031 §5 Hint prose appears at every sql_expr slot (TBD) (TBD)
ADR-0032 §10.2 DSL Subgrammar recursion (ADR-0026) does NOT push scope (TBD) (TBD)
ADR-0032 §10.2 sql_expr ladder Subgrammar recursion does NOT push scope (TBD) (TBD)
ADR-0032 §10.2 ScopedSubgrammar (SELECT) DOES push scope (TBD) (TBD)
ADR-0032 §10.3 SELECT * CTE body derives output columns from from_scope (TBD) (TBD)
ADR-0032 §10.3 Computed CTE projection without alias yields unnamed slot (TBD) (TBD)
ADR-0032 §10.3 CTE (col-list) renames positionally (TBD) (TBD)
ADR-0032 §10.3 Compound CTE body takes columns from first leg (TBD) (TBD)
ADR-0032 §10.3 Recursive CTE body takes columns from non-recursive leg (TBD) (TBD)
ADR-0032 §10.6 Projection-before-FROM re-resolves on full re-walk (TBD) (TBD)
ADR-0032 §11.6 Phase-1 gap: LIKE-on-numeric fires on SQL WHERE (TBD) (TBD)
ADR-0032 §11.6 = NULL fires on SQL WHERE (TBD) (TBD)
ADR-0032 §11.6 Type-mismatch fires on SQL WHERE (TBD) (TBD)
ADR-0032 §11.6 Predicate warnings fire on HAVING, ON, CASE, projection, ORDER BY (TBD) (TBD)
ADR-0032 §11.4 / §13 Aggregate-in-WHERE rejected by engine, surfaced engine-neutral (TBD) (TBD)
ADR-0032 §11.4 / §13 Non-aggregated-column-with-GROUP-BY engine-neutral (TBD) (TBD)
ADR-0032 §9 Depth cap fires on pathological SELECT nesting (≥ 64 frames) (TBD) (TBD)
ADR-0032 §12 Engine column-origin metadata follows through CTE (TBD) (TBD)
ADR-0032 §12 All 10 playground types recovered via bare ref (TBD) (TBD)
ADR-0032 §13 Every OOS shape rejects (NATURAL, USING, comma-FROM, comma-LIMIT, window, LATERAL, VALUES) (TBD) (TBD)

The implementer fills in Test location and Status (green checkmark or red cross) as 2g proceeds. A row marked red blocks the phase exit.


Final phase-exit verification report

Produced at the end of 2g. A markdown document at docs/handoff/<date>-phase-2-verification.md (or similar). The report contains:

  1. Test suite totals — passed / failed / skipped / ignored, exact numbers from cargo test output. Compared against the Phase-1 baseline (1260 / 0 / 0 / 1).
  2. The full cross-cut verification matrix with every row green.
  3. A requirements-to-test mapping — every numbered decision in ADR-0032 §§113 with the test(s) that prove it.
  4. A list of any autonomous decisions made during implementation, with rationale and explicit user-confirmation pointer for each. (Per CLAUDE.md: an autonomous decision without explicit user confirmation is a verdict FAIL.)
  5. The DA's written final review — PASS or FAIL, no "conditional" verdicts.

Only after the report is produced and signed off does the user-facing "Phase 2 complete" message issue.

Open questions to escalate before code starts

These are matters the plan deliberately leaves to the implementer to escalate when reached, not to decide silently:

  1. Subquery type-resolution through scalar subqueries. The 2f exit gate notes the engine's column-origin metadata may or may not follow through a scalar subquery. The test asserts the actual behaviour; if the behaviour is "doesn't follow", that becomes an honest limitation in §12 — the user is informed, the limitation is documented.
  2. Duplicate CTE-name detection. §11.4 leaves this as engine-rejected (no pre-flight diagnostic). If 2g's verification finds the engine message is confusing, escalate: the user may want a pre-flight diagnostic.duplicate_cte ERROR, which would mean a small extension to 2d.
  3. Function name completion / allowlist. §11.4 / ADR-0030 §13 OOS-3 leaves no allowlist. If real-world testing reveals learners are confused by silently-admitted-then-engine-rejected function names, escalate for an explicit ADR.

What this plan does NOT contain

  • Time estimates. Phase 2 is large and ADR-0032 is the first pass through the problem; calibration is poor. Milestones, not hours.
  • Sub-phase commit granularity beyond "ship green tests at each gate". The implementer commits at natural break points within a sub-phase; the gate is what gets verified.
  • Re-statements of ADR-0032's decisions. The plan refers to ADR-0032 by section; the implementer reads both.