Files
rdbms-playground/docs/handoff/20260520-handoff-27.md
claude@clouddev1 5d716e64ab docs: handoff 27 — ADR-0032 Phase 2 sub-phases 2a / 2b / 2c / 2d / 2f shipped
Session summary covering nine implementation commits since
handoff-26 (e032f01..0c3847a). Records the user-approved
deferral of §10.3 stage 2 (CTE column-derivation harvest),
flags three further 2d diagnostic-key deferrals that need
user confirmation, and points the next implementer at the
seams for sub-phases 2e (qualified-prefix completion +
post-walk fixup) and 2g (verification sweep + final
report).

Test totals at handoff: 1385 passing, 0 failed, 1 ignored
(+125 from the 1260 baseline at handoff-26).
2026-05-20 16:29:18 +00:00

397 lines
17 KiB
Markdown
Raw Permalink Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Session handoff — 2026-05-20 (27)
Twenty-seventh handover. **Implementation session for ADR-0032
Phase 2.** Sub-phases 2a, 2b (with §10.3 stage 2 deferred), 2c,
2d (with three diagnostic keys deferred), and 2f all shipped.
Sub-phases **2e (completion + post-walk fixup)** and
**2g (verification sweep)** remain.
## §1. State at handoff
**Branch:** `main`. Working tree clean. **11 unpushed commits**
(`be8a4f5..0c3847a` plus the prior handoff-26 commit on top of
`3db292c`).
**Tests:** **1385 passing, 0 failing, 1 ignored** (up from
the 2026-05-19 handoff-26 baseline of 1260 / 0 / 1 — a net
**+125 tests**).
**Clippy:** clean (last verified at end of 2f, commit
`0c3847a`).
**Commits since handoff-26:**
```
0c3847a db: column-origin type recovery in SELECT results (sub-phase 2f)
c5cf03b walker: SQL diagnostics — multi-binding scope, qualified refs, Phase-1 gap closure (sub-phase 2d)
a491df3 grammar: migrate Phase-1 SELECT to the ADR-0032 fragment (sub-phase 2c)
4ff054c walker: populate cte_bindings placeholders + projection_aliases (§10.3 stage 1 / §10.4)
b522d09 walker: populate from_scope table bindings (§10.1)
98a74b2 grammar: sql_expr additive extensions for §5/§6, CTE body rewires to ScopedSubgrammar
4f89106 walker: Node::ScopedSubgrammar variant + scope-frame stack (§10.2)
8d29335 grammar: SQL SELECT full statement fragment (Phase 2a)
e032f01 docs: ADR-0032 Amendment 1 — empirical scope of column-origin metadata
```
## §2. What's done — the Phase-2 surface
A user typing in advanced mode can now write the **full Phase-2
SELECT surface** authored by ADR-0032 §§19:
- **All five JOIN flavours** (`INNER` / `LEFT OUTER` / `RIGHT
OUTER` / `FULL OUTER` / `CROSS`) and bare `JOIN`.
- **All four set ops** (`UNION` / `UNION ALL` / `INTERSECT` /
`EXCEPT`) in compound queries.
- **CTEs** — both `WITH` and `WITH RECURSIVE`, with optional
`(col-list)`. Dispatched via a new `data::WITH` `CommandNode`
(entry word `with` is now in `ADVANCED_ONLY_ENTRIES`).
- **`GROUP BY` / `HAVING`** with arbitrary `sql_expr` keys.
- **Qualified column refs** (`t.c` / `alias.c`) — additive
extension to `sql_expr.rs`'s `name_or_call` (§5).
- **Subquery expressions** — scalar `(SELECT …)`, `IN (SELECT
…)`, `[NOT] EXISTS (SELECT …)` — additive `Choice` branches
in `sql_expr.rs` (§6), recursing through
`Node::ScopedSubgrammar(&SQL_SELECT_COMPOUND)`.
- **`LIMIT n [OFFSET m]`** with full `sql_expr` admission for
both args (legacy `LIMIT m, n` rejected per §13 OOS-4).
- **`DISTINCT` / `ALL`** prefix.
- **`t.*` qualified wildcard** in projection.
- **Bare-alias projection** (`SELECT a x FROM t` — lifts
Phase-1 §4.2) via per-position `Node::Lookahead` follow-set
gating.
- **OOS shapes reject**: `NATURAL JOIN`, `JOIN … USING`,
comma-FROM, derived tables in FROM, window functions
(`OVER (…)`), `VALUES` row source, `LATERAL` (partial — the
single-keyword form is admitted as a bare alias; the comma
form rejects via OOS-3).
The Phase-1 `data::SELECT` shape migrated from `data.rs` into
`sql_select.rs`'s `SQL_SELECT_TAIL` (sub-phase 2c). Phase-1's
22 local grammar nodes plus the duplicate
`reject_internal_table` validator are gone; the single source
of truth for the SELECT shape is now `sql_select.rs`.
### 2.1. Walker capabilities
- **`Node::ScopedSubgrammar(&'static Node)`** (ADR-0032 §10.2)
is the new node variant. Same recursion semantics as
`Subgrammar`, plus a push/pop of `ScopeFrame` on the
`WalkContext::from_scope_stack`. Shares `subgrammar_depth`
with `Subgrammar` so the `MAX_SUBGRAMMAR_DEPTH = 64` cap
fires uniformly.
- **`WalkContext::from_scope_stack: Vec<ScopeFrame>`** —
always non-empty (the bottom frame is the implicit top-level
scope DSL paths use). Each `ScopeFrame` carries `from_scope:
Vec<TableBinding>`, `cte_bindings: Vec<CteBinding>`, and
`projection_aliases: Vec<String>`.
- **Three new `Node::Ident` flags**: `writes_table_alias`,
`writes_cte_name`, `writes_projection_alias`. Together with
the existing `writes_table` (now repurposed to push a
TableBinding alongside its prior `current_table` write),
they populate the top frame's accumulators during the walk.
All 48 existing Ident sites carry the new flags as
`false` (mechanical sed update; no behavioural change for
DSL paths).
- **`current_table` / `current_table_columns` stay as fields
on `WalkContext`** (additive approach — they're populated
by `writes_table` alongside the from_scope push). Plan §2b's
"make them derived helpers" became "keep them populated
redundantly" — strictly safer, no behavioural drift for DSL
paths, but a future-cleanup opportunity if the duplication
proves a maintenance burden.
### 2.2. Diagnostics
Three diagnostic passes run on every successful parse
(ADR-0032 §11.7):
1. **`schema_existence_diagnostics`** — extended for
multi-binding scope (a pre-pass collects all
`table_name` / `cte_name` / `table_alias` idents to sidestep
the projection-before-FROM ordering problem). Emits
`diagnostic.unknown_table`, `diagnostic.unknown_column`,
`diagnostic.unknown_qualifier`, `diagnostic.ambiguous_column`,
`diagnostic.duplicate_cte` (Plan §Open-2, user-approved).
2. **`sql_predicate_warnings`** — NEW. MatchedPath-walking
variant of the predicate-warning pass, closes the
Phase-1 carry-over gap (§11.6). Emits
`diagnostic.eq_null`, `diagnostic.like_numeric`,
`diagnostic.type_mismatch` on SQL `WHERE` / `HAVING` /
`JOIN ON` / `CASE` / projection / `ORDER BY` slots.
Scoped to bare column refs in `<column> <op> <literal>`
form — qualified-ref and expression-operand cases stay
un-flagged (safe false-negative posture).
3. **DSL `predicate_warnings` (existing)** — unchanged.
Still walks the DSL `Expr` AST for DSL `WHERE` expressions.
### 2.3. Type recovery (sub-phase 2f)
Per ADR-0032 §12 + Amendment 1:
- `Cargo.toml` rusqlite features now include `column_metadata`.
- `do_run_select` calls `resolve_select_column_types(conn,
stmt)` after `prepare`, queries `__rdbms_playground_columns`
for each result column whose origin metadata is populated,
and threads the type through to `format_cell`.
- ADR-0030 Phase-1 §4.5 (bool rendering as `0` / `1`) is
lifted for any bare-column reference whose origin the engine
carries through — non-recursive CTE bodies, scalar
subqueries, derived tables, set ops, JOINs all work for
free per Amendment 1.
### 2.4. Catalog (ADR-0032 §11.5)
Six new `diagnostic.*` keys + eight new `engine.*` keys
authored in `src/friendly/strings/en-US.yaml` and registered
in `src/friendly/keys.rs`:
- `diagnostic.ambiguous_column`,
`diagnostic.compound_arity_mismatch`,
`diagnostic.cte_arity_mismatch`, `diagnostic.duplicate_cte`,
`diagnostic.projection_alias_misplaced`,
`diagnostic.unknown_qualifier`.
- `engine.no_such_table`, `engine.no_such_column`,
`engine.ambiguous_column`, `engine.aggregate_misuse`,
`engine.group_by_required`,
`engine.compound_arity_mismatch`,
`engine.scalar_subquery_too_many_rows`,
`engine.recursive_cte_malformed`.
The `engine.*` keys are catalog entries only — the
friendly-error layer reads them by key when reached, but no
proactive enhancement of the layer's engine-message
classification was pulled through here. That stays as a
small follow-up.
## §3. Deferrals — all user-approved or documented
### §3.1. §10.3 stage 2 — CTE column-derivation harvest
**Status:** user-approved deferral (mid-session, 2026-05-20).
**Recorded in:** `docs/plans/20260520-adr-0032-phase-2.md`
§2b "Implementation status (2026-05-20)" block.
The placeholder CTE-binding push (stage 1) IS implemented —
the CTE name is visible to the body (WITH RECURSIVE
self-reference works) and to downstream CTE-name validators
(2d's schema-existence pass accepts CTE-name table-source
refs). What's missing is the six §10.3 derivation rules that
populate the placeholder's `columns` at body-frame exit.
**Knock-on effect:** qualified-prefix completion past
`cte_alias.|` returns an empty candidate list. Bare column
refs into a CTE binding short-circuit to "accept silently" in
2d's schema-existence pass (we don't know the columns yet).
Folds into 2e (where qualified-prefix completion needs CTE
columns) and any future arity-check pass that wants to compare
declared `(col-list)` vs derived projection arity.
### §3.2. Three 2d diagnostic keys deferred
`projection_alias_misplaced`, `compound_arity_mismatch`, and
`cte_arity_mismatch` — catalog keys + strings are authored
but the emitting logic is not implemented:
- `projection_alias_misplaced` needs clause-context detection
(which clause an Ident sits in — the matched path is flat;
some role-based recognition would work).
- `compound_arity_mismatch` needs per-leg projection counting
as the compound is assembled.
- `cte_arity_mismatch` depends on §10.3 stage 2.
These were not explicitly user-approved as deferrals — they
were noted in the 2d commit message as "deferred per the
2d-scoped DA review (documented as `(TBD)` in the cross-cut
matrix for 2g)". The user should flag if these need to land
before 2g signs off.
### §3.3. Engine.* key wiring
Catalog entries authored; the friendly-error layer doesn't
yet recognise the new engine messages and route them through
the new keys. This is a small mechanical extension to the
engine-error matcher; not blocking but worth folding in.
## §4. What's next — sub-phases 2e and 2g
### §4.1. Sub-phase 2e — completion + post-walk fixup
Per plan §2e:
- **Qualified-prefix completion narrowing** (§10.5): at the
cursor in `SELECT t.|`, candidates should be `t`'s columns
(not the global column set). Requires:
- The completion API gains a "prefix qualifier" hint that
the walker passes when the cursor is immediately after
`Ident '.'`.
- The `IdentSource::Columns` completion path honours the
hint when present, resolving against the top frame's
`from_scope` binding for `t`.
- **Post-walk projection-list fixup pass** (§10.6): the
projection-before-FROM problem. Collect projection `Ident`
terminals during the walk; after walk completion,
re-resolve each against the final `from_scope`; rewrite
the highlight class and validity diagnostic.
**Note:** the §10.6 problem is *partially* sidestepped already
in 2d — the schema-existence pass uses a two-pass approach
that collects bindings first and then resolves. So
projection-side identifiers don't get false-positive
diagnostics from the existing pass. The 2e fixup-pass scope
is more about getting the HIGHLIGHT class right at every
keystroke (the typing experience) and supporting completion
mid-typing. Worth re-reading §10.6 carefully with the 2d
implementation in hand.
### §4.2. Sub-phase 2g — integration sweep + verification
Per plan §2g:
- Fill in the cross-cut verification matrix (29 rows in
`docs/plans/20260520-adr-0032-phase-2.md`). Many rows are
satisfied by tests already authored in 2a2f; the matrix
needs to be walked and each row pointed at its test.
- Produce the final phase-exit verification report.
- DA final review.
### §4.3. The §10.3 stage-2 harvest
If 2e wants real CTE-column-prefix completion, the harvest
needs to land first. Six derivation rules per the ADR's
table. Mechanism design: hook a callback into
`walk_scoped_subgrammar`'s on-exit path; when the frame
being popped is a CTE body, walk its projection items and
populate the placeholder binding in the outer frame.
Projection item shape can be reconstructed from a
`projection_record` accumulator on the frame (record the
shape of each `projection_item` as the walker matches it —
a new `Node::ProjectionRecord` wrapper, or a flag on
`Lookahead` factories, would be the mechanism).
If the user is willing to defer CTE-prefix completion
along with the harvest, 2e can ship without the harvest;
qualified completion past `cte.|` returns nothing today
and would continue to do so.
## §5. Working tree and unpushed commits
```
e032f01 docs: ADR-0032 Amendment 1 — empirical scope of column-origin metadata
8d29335 grammar: SQL SELECT full statement fragment (Phase 2a)
4f89106 walker: Node::ScopedSubgrammar variant + scope-frame stack (§10.2)
98a74b2 grammar: sql_expr additive extensions for §5/§6, CTE body rewires to ScopedSubgrammar
b522d09 walker: populate from_scope table bindings (§10.1)
4ff054c walker: populate cte_bindings placeholders + projection_aliases (§10.3 stage 1 / §10.4)
a491df3 grammar: migrate Phase-1 SELECT to the ADR-0032 fragment (sub-phase 2c)
c5cf03b walker: SQL diagnostics — multi-binding scope, qualified refs, Phase-1 gap closure (sub-phase 2d)
0c3847a db: column-origin type recovery in SELECT results (sub-phase 2f)
```
Plus this handoff commit, once it lands.
## §6. Seams for the next implementer
Where to look first:
- **For 2e qualified-prefix completion**: the existing
`IdentSource::Columns` completion path in `src/completion.rs`
(driven by `WalkContext::current_table_columns`). The
qualified case needs a new "narrow to this binding's
columns" pathway. The walker already records the
`sql_expr_qualified_ref` role on the trailing ident in a
`t.c` reference; the preceding `Punct('.')` and
`sql_expr_ident` give the qualifier. Walker `WalkContext`
could grow a `pending_qualifier: Option<String>` set when
the walker enters a qualified-ref tail, cleared on
successful match.
- **For 2e post-walk fixup**: the schema-existence pass in
`src/dsl/walker/mod.rs` already does a pre-pass. The 2e
fixup is about the HIGHLIGHT layer — rewriting
`per_byte_class` entries for projection-list idents once
the final `from_scope` is known. The walker's `per_byte`
vec is the mutation target. Look at `walker/highlight.rs`
for how byte-class runs are built today.
- **For 2g verification matrix**: walk row by row through
`docs/plans/20260520-adr-0032-phase-2.md`'s "Cross-cut
verification matrix" table. Many rows already have tests;
fill in the test location column. The handful that don't
have explicit tests (the projection-before-FROM re-walk
cycle, for example) may need new ones authored.
- **For the §10.3 harvest**: see §4.3 above. The cleanest
mechanism is probably a new walker hook (a function pointer
on a per-`ScopedSubgrammar` basis, or a `usage` flag on
the `Node::Ident` variants that record projection items),
but the design isn't fully nailed down. The implementer
should think about whether to escalate the harvest design
to the user before coding.
Code locations modified this session (in commit order):
- `docs/adr/0032-sql-select-grammar.md` — Amendment 1 added.
- `docs/adr/README.md` — index entry extended for Amendment 1.
- `docs/plans/20260520-adr-0032-phase-2.md` — §2b
"Implementation status (2026-05-20)" block recording the
§10.3 stage-2 deferral.
- `src/dsl/grammar/mod.rs` — `Node::ScopedSubgrammar` variant
+ three new Ident flags + `data::WITH` registered +
`ADVANCED_ONLY_ENTRIES` extended with `with`.
- `src/dsl/grammar/data.rs` — ~150 lines of dead Phase-1
SELECT grammar removed; `data::SELECT` reshaped to
`Node::Subgrammar(&sql_select::SQL_SELECT_TAIL)`;
`data::WITH` added.
- `src/dsl/grammar/sql_select.rs` — new file (sub-phase 2a),
augmented through 2b and 2c.
- `src/dsl/grammar/sql_expr.rs` — additive §5 / §6 extensions.
- `src/dsl/walker/context.rs` — `TableBinding`, `CteBinding`,
`CteColumn`, `ScopeFrame`; `WalkContext::from_scope_stack`.
- `src/dsl/walker/driver.rs` — `Node::ScopedSubgrammar`
match arm; `walk_scoped_subgrammar`; `writes_table_alias` /
`writes_cte_name` / `writes_projection_alias` handling in
`walk_ident`.
- `src/dsl/walker/mod.rs` — multi-binding
`schema_existence_diagnostics` rewrite, new
`sql_predicate_warnings` pass, 16 new tests.
- `src/friendly/keys.rs` + `src/friendly/strings/en-US.yaml`
— 6 new `diagnostic.*` + 8 new `engine.*` keys.
- `Cargo.toml` — `column_metadata` added to rusqlite
features.
- `src/db.rs` — `resolve_select_column_types` helper;
`do_run_select` threads per-column types through to
`format_cell`.
- `tests/sql_select.rs` — 3 new SELECT type-recovery tests.
## §7. How to take over
1. **Read this file** in full. Then `docs/adr/0032-sql-select-
grammar.md` (and Amendment 1 — the empirical findings
matter for 2f's design), `docs/plans/20260520-adr-0032-
phase-2.md`, and `CLAUDE.md`.
2. **`cargo test`** — expect **1385 passing, 0 failed, 1
ignored**.
3. **`cargo clippy --all-targets -- -D warnings`** — clean.
4. **Pick the next sub-phase.** 2e is substantial (qualified
completion + post-walk fixup); 2g is the closing sweep
(cross-cut matrix + verification report). Either is a
reasonable next session.
5. **Escalate before re-litigating deferrals.** The three
diagnostic deferrals in 2d (§3.2) are not user-approved —
check whether they need to land before 2g signs off.
6. **The §10.3 stage-2 harvest** is the biggest open piece;
if 2e wants real CTE-prefix completion, escalate the
harvest design before coding.
## §8. What's open beyond Phase 2
Unchanged from handoff-26 §8: snapshot/undo `U`-series;
m:n convenience `C4`; modify-relationship `C3a`; rename-table
`C1`; friendly-error sweep `H1`; CI `TT5`; session-log /
Markdown export `V4`.
ADR-0030's later phases — DML (Phase 3), DDL (Phase 4), the
DSL→SQL teaching echo (Phase 5), polish (Phase 6) — remain on
the roadmap behind the Phase 2 sub-phases not yet completed.