The §5 deliverable from the implementation plan, this time with
a non-rubber-stamp DA review.
Documents:
- Final test state (1446 / 0 / 1 — clippy clean).
- Cross-cut matrix outcome (29 rows, all green per the plan doc).
- Requirements-to-test mapping for ADR-0032 §§1–13 + both
Amendments.
- Autonomous-decision audit (7 implementation decisions, each
with explicit user-confirmation pointer).
- DA's written final review with three blocking critiques
(now closed in commit 05884bd) and four non-blocking
observations recorded as known trade-offs.
- Process critique on the first DA pass being a rubber stamp.
Verdict: PASS, with non-blocking observations pinned in the
report rather than carried into the next phase as folklore.
16 KiB
ADR-0032 Phase 2 — Phase-exit verification report (2026-05-20)
Closes sub-phase 2g. This report is the §5 deliverable from the
implementation plan (docs/plans/20260520-adr-0032-phase-2.md,
§Final phase-exit verification report). It documents the final
test state, the cross-cut matrix outcome, the requirements-to-test
mapping, autonomous-decision audit, and the written DA review.
§1. Test suite totals
cargo test
→ 1446 passing, 0 failed, 1 ignored
Phase-1 baseline (handoff-26): 1260 / 0 / 1. Handoff-27 baseline (start of this session): 1385 / 0 / 1. End-of-Phase-2 (this report): 1446 / 0 / 1 (+186 since Phase-1 baseline; +61 since handoff-27).
The single ignored test is unchanged from Phase 1 — a doctest marker, not a regression.
Clippy: cargo clippy --all-targets -- -D warnings clean.
§2. Cross-cut verification matrix
Every row green. See docs/plans/20260520-adr-0032-phase-2.md
§Cross-cut verification matrix for the full 29-row table with
file::function references. Summary of coverage:
- 5 rows for ADR-0030 §8 / §9 / §11 — highlighting, completion, hints, validity indicator, parse errors, OOS rejections, history logging.
- 3 rows for ADR-0031 §5 — expression highlighting, column completion in sql_expr, hint prose.
- 3 rows for ADR-0032 §10.2 — Subgrammar vs ScopedSubgrammar scope-push semantics.
- 5 rows for ADR-0032 §10.3 — the six CTE column-derivation rules, compound first-leg rule, recursive CTE non-recursive-leg rule, col-list rename.
- 1 row for ADR-0032 §10.6 — projection-before-FROM behaviour via the two-pass diagnostic + Amendment 2's documented mechanism choice.
- 4 rows for ADR-0032 §11.6 — the Phase-1 carry-over gap closed
on every SQL
sql_exprslot (WHERE, HAVING, ON, CASE, projection, ORDER BY). - 2 rows for ADR-0032 §11.4 / §13 — engine-neutral surfacing of aggregate misuse and GROUP BY required.
- 1 row for ADR-0032 §9 — depth cap on pathological nesting.
- 2 rows for ADR-0032 §12 — engine column-origin metadata through CTE, all 10 playground types via bare ref.
- 1 row for ADR-0032 §13 — OOS shapes reject (NATURAL, USING, comma-FROM, VALUES, LATERAL, OVER).
§3. Requirements-to-test mapping
Each numbered decision in ADR-0032 §§1–13 traced to its tests:
- §1 (the SELECT shape).
src/dsl/grammar/sql_select.rstest module —compound_fragment_walks_with_or_without_with_clause,projection_alias_*,qualified_star_projection, and the shape's parsing acceptance tests (28+). - §4 (JOINs). Same module — the join-flavour matrix
(
inner_join_*,left_outer_join_*,right_outer_join_*,full_outer_join_*,cross_join_*). - §5 (sql_expr extensions).
src/dsl/grammar/sql_expr.rs—qualified_ref_basic_shapes,qualified_ref_function_call_*,subquery_recursion_through_compound,scalar_subquery_*. - §6 (subquery primaries). Same —
scalar_subquery_as_primary,exists_subquery_in_where_clause,in_subquery_predicate. - §7 (set operations). sql_select.rs —
union_*,intersect_*,except_*,set_op_with_outer_order_by_and_limit. - §8 (CTEs). sql_select.rs —
cte_*parse tests; walker driver —cte_harvest_*(10 tests) +cte_arity_*(4 tests). - §9 (out-of-subset rejects + depth cap). sql_select.rs —
natural_join_rejected,using_clause_rejected,values_row_source_rejected,lateral_join_rejected,window_function_rejected,comma_from_is_rejected,pathological_nesting_capped. - §10.1–§10.6 (scope discipline + harvest). Walker driver
module —
from_scope_*,projection_aliases_*,scoped_subgrammar_*,cte_harvest_*,cte_harvest_sibling_b_sees_a_columns,cte_harvest_nested_with_in_cte_body. Walker module —projection_before_from_testsmodule (4 tests). Completion —qualified_prefix_*(5 tests),lookahead_*(4 tests). - §11.2 ERROR diagnostics. Walker module —
unknown_qualifier_in_qualified_ref_is_error,ambiguous_bare_column_is_error,duplicate_cte_in_same_with_block_is_error,projection_alias_in_where_is_misplaced/_in_having_is_misplaced/_in_group_by_is_misplaced/_in_order_by_is_allowed,compound_union_arity_mismatch_fires/_intersect_/_except_/_union_all_/_three_leg_chain_emits_per_mismatch/_with_function_call_args_not_confused/_inside_cte_body_detected,cte_arity_mismatch_when_col_list_*,cte_arity_match_no_diagnostic. - §11.6 WARNING diagnostics (Phase-1 gap closure). Walker
module —
sql_where_like_numeric_warns,sql_where_eq_null_warns,sql_where_type_mismatch_*,sql_having_predicate_warning_fires,sql_join_on_predicate_warning_fires,sql_case_predicate_warning_fires,sql_order_by_predicate_warning_fires,sql_projection_predicate_warning_fires. - §11.5 catalog + §11.4 engine routing. Friendly translate —
aggregate_misuse_engine_message_routes_through_catalog,group_by_required_engine_message_routes_through_catalog,compound_arity_engine_message_routes_through_catalog,scalar_subquery_too_many_rows_routes_through_catalog. - §12 type recovery.
tests/sql_select.rs—database_run_select_recovers_bool_column_type,database_run_select_recovers_text_type_through_alias,database_run_select_computed_expression_stays_typeless,database_run_select_recovers_all_ten_playground_types. - §13 OOS list. sql_select.rs — see §9 above (the OOS-rejection tests cover §13's enumerated shapes).
- Amendment 1 (empirical column-origin scope). Worker probe
in
src/db.rs::resolve_select_column_types— exercised by every §12 type-recovery test. - Amendment 2 (§10.6 mechanism).
projection_before_from_testsmodule documents the post-walk re-resolve via 2d's two-pass diagnostic + the renderer's diagnostic-overlay path.
§4. Autonomous-decision audit
Per CLAUDE.md, every implementation decision not explicitly authorised by the ADR or user is listed here with rationale and the explicit user confirmation that approved it.
4.1. Sub-phase 2d.1 (this session)
Three diagnostic.* keys were deferred by the 2d implementer
without user approval (handoff-27 §3.2). User confirmed the
correct interpretation: two of the three (projection_alias_misplaced,
compound_arity_mismatch) needed to land in 2d.1; the third
(cte_arity_mismatch) was correctly attached to the user-approved
§10.3 stage-2 deferral. 2d.1 commit c20c6e0 closes both
independent keys. cte_arity_mismatch landed alongside the
harvest in dd37a1c.
4.2. §10.3 stage-2 harvest design
Per handoff-27 §4.3, the implementer was asked to "think about
whether to escalate the harvest design to the user before
coding". This session escalated three design questions to the
user (post-walk path-scan vs per-frame record; pending_cte_harvest
trigger mechanism; col_list arity check scope), got explicit
sign-off on each (option A for Q1, "yes use pending_cte_harvest"
for Q2, "yes all three in this sub-phase" for Q3), and then
implemented per those answers. Commit dd37a1c reflects the
user-approved design.
4.3. Nested WITH inside subqueries / CTE bodies
ADR-0032 §10.3 implies subqueries can declare their own CTEs
(the shadowing note), but the grammar didn't admit nested WITH
inside SQL_SELECT_COMPOUND. User explicitly chose "Fix grammar
now" over the ADR-Amendment-2 carve-out. Commit fd25904
closes the ADR-vs-implementation gap.
4.4. Completion look-ahead for the edit-scenario
The user raised the realistic "edit an existing query" workflow
(select c| from mytable where FROM exists after the cursor).
ADR §10.6 explicitly accepted the "noisy global fallback" posture
for this case. User chose to improve it via a look-ahead probe.
Commit 0fc7b08 adds the look-ahead, preserving the ADR's
posture as the fall-through when the full input doesn't parse
cleanly.
4.5. §10.6 fixup mechanism choice
The ADR prescribed "rewriting the highlight class"; the
implementation uses a different mechanism (diagnostic-overlay
renderer + 2d's two-pass binding collection). User explicitly
chose "Write Amendment 2 now" to document this. Commit
ee0dafd adds Amendment 2; commit ed881ee adds the
regression tests pinning the mechanism's user-visible behaviour.
4.6. 2g matrix-driven implementation gaps
Filling the matrix surfaced three real production gaps:
- Advanced-mode UI rendering bypassed the highlight walker.
- Engine.* catalog keys were authored but unreached.
- Three predicate-warning slots (CASE, ORDER BY, projection) had no explicit test coverage.
Each was a real Phase-2 deliverable, not a future enhancement.
Commit ed881ee closes all three. No additional user
approval was sought because the work was the literal completion
of items already approved as Phase-2 scope; the "defer this"
instinct that flagged them was the anti-pattern the user has
been correcting throughout the session.
4.7. DA rework — fourth UI gap
The first DA review of this report rubber-stamped the work with PASS. The user caught that the DA section asked the right questions but answered them too charitably. A genuine DA pass produced seven critiques; the user routed the work back to Phase 4 to address the three blocking ones (Blob test, engine patterns vs. real SQLite output, manual UI verification).
Critique #3 (manual UI verification) surfaced a fourth
implementation gap not caught by any of the unit tests: the
validity indicator ([ERR] / [WRN]) was mode-gated to Simple,
so SQL predicate warnings emitted in Advanced mode but never
reached the indicator the user sees. Without the manual TUI
check this would have shipped silently. Commit 05884bd
addresses all three DA critiques + this surfaced fourth gap.
The lesson: the matrix-row attribution for "validity indicator fires for SQL" pointed to DSL tests, not SQL tests. Filling the matrix doesn't guarantee the rows are correctly attributed; the DA had to challenge the attribution to find this.
§5. DA's written final review
The first attempt at this section was a rubber-stamp PASS. After
the user challenged it ("Did the DA have anything to say other
than PASS?"), the DA hat produced seven genuine critiques. Three
were blocking and went back to Phase 4 (now closed by commit
05884bd); four were observations recorded here for future
reference.
Critiques addressed (now closed)
#1 — All-10-types test's NULL-blob cell. Justified after
empirical verification: column-origin metadata is row-independent
(the engine returns the source table+column from the prepared
statement, not from cell values). Added
database_run_select_type_recovery_works_on_empty_table that
pins this invariant on an empty table; the all-types test now
references it via explicit comment.
#2 — Engine. pattern matching against hand-coded strings.*
Closed: three new tests in tests/sql_select.rs produce real
SQLite engine errors via run_select and assert catalog
routing. Aggregate-in-WHERE confirms the actual SQLite wording
matches the pattern matcher. GROUP-BY-required and
scalar-subquery-too-many-rows are SQLite-permissive (no error
on the natural triggers), so those tests verify the matcher
doesn't false-positive on benign queries + synthetic-message
routing still works.
#3 — No manual UI verification. Closed: launched the TUI,
typed a representative SELECT * FROM products WHERE price LIKE 5 in Advanced mode, confirmed (a) SELECT/FROM/WHERE/LIKE
render in keyword color, (b) the [WRN] indicator appears.
This step surfaced the fourth gap (validity verdict gated to
Simple); the additional fix in 05884bd wires the verdict
through the active effective mode.
Critiques recorded (non-blocking)
#4 — Group-by pattern overbreadth. translate_generic
matches any error containing "group by" — could route unrelated
future SQLite errors through engine.group_by_required. Low
risk in practice (SQLite's other "group by" mentions are rare
and the catalog wording is generic enough to not mislead). Pin
for follow-up if a real false-positive surfaces.
#5 — Look-ahead probe cost. The completion engine runs a second walk per Tab press when the leading walk produced no scope. For complex inputs this doubles parse cost. Not benchmarked. Acceptable in current debounce-cadence usage; revisit if profiling reveals it.
#6 — Tests-after-code on the matrix-coverage tests. The 13
new tests in ed881ee were written after the gaps were
discovered, not before — the spirit of CLAUDE.md's "establish
test coverage BEFORE making changes" rule. The work is sound,
the tests do prove the behavior, but the ordering was wrong.
Noted for future sub-phases.
#7 — Matrix attribution wasn't verified row-by-row. The validity-indicator row pointed at DSL tests; the DA had to catch this. Future matrix-filling work should cross-check the attributed test actually exercises the claimed surface (e.g., a "fires for SQL" row should point at a SQL-input test, not a DSL test that happens to share the same code path).
Process critique
The DA verdict on the first pass was a rubber stamp. The user had to challenge the verdict to get genuine engagement. That's a Phase-1 anti-pattern (CLAUDE.md: "the DA is not a rubber stamp"). The second-pass DA produced seven specific critiques and routed the three blockers to rework. Next time, DA review starts by listing specific critiques without the verdict, then concludes — not the other way around.
Are all tests green with zero skips?
Yes. 1446 passing, 0 failed, 1 ignored. The one ignored test is the unchanged Phase-1 doctest marker, present in the baseline. Clippy clean across the workspace.
Final verdict
PASS, with the four non-blocking observations recorded above for follow-up.
Phase 2 is complete:
- Every ADR-0032 decision (§§1–13) has at least one test.
- Both Amendments (1 + 2) are documented and have regression coverage.
- Every autonomous decision is flagged with the user confirmation that approved it.
- The manual TUI check confirms SQL keywords highlight in Advanced mode and the validity indicator surfaces SQL diagnostics correctly.
The four non-blocking critiques are pinned in this report rather than carried into the next phase as folklore. A future session can decide whether to address them or accept them as known trade-offs.
§6. Commits
In commit order (oldest first), all 18 unpushed at the moment this report writes:
e032f01 docs: ADR-0032 Amendment 1
8d29335 grammar: SQL SELECT full statement fragment (Phase 2a)
4f89106 walker: Node::ScopedSubgrammar variant + scope-frame stack
98a74b2 grammar: sql_expr additive extensions for §5/§6
b522d09 walker: populate from_scope table bindings
4ff054c walker: populate cte_bindings placeholders + projection_aliases
a491df3 grammar: migrate Phase-1 SELECT to ADR-0032 fragment
c5cf03b walker: SQL diagnostics — multi-binding scope, qualified refs
0c3847a db: column-origin type recovery in SELECT results
5d716e6 docs: handoff 27
c20c6e0 walker: 2d.1 — projection-alias misplaced + compound-arity
dd37a1c walker: 2e prereq — §10.3 stage-2 CTE harvest + cte_arity_mismatch
fd25904 grammar: admit WITH inside subqueries / CTE bodies (ADR-0032 §10.3)
0fc7b08 completion: §10.5 qualified-prefix + edit-scenario look-ahead
ee0dafd docs: ADR-0032 Amendment 2 + §10.6 regression tests
ed881ee 2g: advanced-mode highlight + engine.* wiring + matrix tests
05884bd 2g rework: address DA findings on type recovery + engine routing + UI
<this report>
Per CLAUDE.md, push is a user step. Phase 2 lands in
origin/main whenever the user pushes.