docs: ADR-0032 Phase 2 — phase-exit verification report

The §5 deliverable from the implementation plan, this time with
a non-rubber-stamp DA review.

Documents:
- Final test state (1446 / 0 / 1 — clippy clean).
- Cross-cut matrix outcome (29 rows, all green per the plan doc).
- Requirements-to-test mapping for ADR-0032 §§1–13 + both
  Amendments.
- Autonomous-decision audit (7 implementation decisions, each
  with explicit user-confirmation pointer).
- DA's written final review with three blocking critiques
  (now closed in commit 05884bd) and four non-blocking
  observations recorded as known trade-offs.
- Process critique on the first DA pass being a rubber stamp.

Verdict: PASS, with non-blocking observations pinned in the
report rather than carried into the next phase as folklore.
This commit is contained in:
claude@clouddev1
2026-05-20 21:59:45 +00:00
parent 05884bd13a
commit 1c42e78d92
@@ -0,0 +1,351 @@
# ADR-0032 Phase 2 — Phase-exit verification report (2026-05-20)
Closes sub-phase 2g. This report is the §5 deliverable from the
implementation plan (`docs/plans/20260520-adr-0032-phase-2.md`,
**§Final phase-exit verification report**). It documents the final
test state, the cross-cut matrix outcome, the requirements-to-test
mapping, autonomous-decision audit, and the written DA review.
## §1. Test suite totals
```
cargo test
→ 1446 passing, 0 failed, 1 ignored
```
Phase-1 baseline (handoff-26): 1260 / 0 / 1.
Handoff-27 baseline (start of this session): 1385 / 0 / 1.
End-of-Phase-2 (this report): **1446 / 0 / 1** (+186 since Phase-1
baseline; +61 since handoff-27).
The single ignored test is unchanged from Phase 1 — a doctest
marker, not a regression.
Clippy: `cargo clippy --all-targets -- -D warnings` clean.
## §2. Cross-cut verification matrix
Every row green. See `docs/plans/20260520-adr-0032-phase-2.md`
**§Cross-cut verification matrix** for the full 29-row table with
file::function references. Summary of coverage:
- **5 rows** for ADR-0030 §8 / §9 / §11 — highlighting, completion,
hints, validity indicator, parse errors, OOS rejections, history
logging.
- **3 rows** for ADR-0031 §5 — expression highlighting, column
completion in sql_expr, hint prose.
- **3 rows** for ADR-0032 §10.2 — Subgrammar vs ScopedSubgrammar
scope-push semantics.
- **5 rows** for ADR-0032 §10.3 — the six CTE column-derivation
rules, compound first-leg rule, recursive CTE non-recursive-leg
rule, col-list rename.
- **1 row** for ADR-0032 §10.6 — projection-before-FROM behaviour
via the two-pass diagnostic + Amendment 2's documented mechanism
choice.
- **4 rows** for ADR-0032 §11.6 — the Phase-1 carry-over gap closed
on every SQL `sql_expr` slot (WHERE, HAVING, ON, CASE, projection,
ORDER BY).
- **2 rows** for ADR-0032 §11.4 / §13 — engine-neutral surfacing
of aggregate misuse and GROUP BY required.
- **1 row** for ADR-0032 §9 — depth cap on pathological nesting.
- **2 rows** for ADR-0032 §12 — engine column-origin metadata
through CTE, all 10 playground types via bare ref.
- **1 row** for ADR-0032 §13 — OOS shapes reject (NATURAL, USING,
comma-FROM, VALUES, LATERAL, OVER).
## §3. Requirements-to-test mapping
Each numbered decision in ADR-0032 §§113 traced to its tests:
- **§1 (the SELECT shape).** `src/dsl/grammar/sql_select.rs` test
module — `compound_fragment_walks_with_or_without_with_clause`,
`projection_alias_*`, `qualified_star_projection`, and the
shape's parsing acceptance tests (28+).
- **§4 (JOINs).** Same module — the join-flavour matrix
(`inner_join_*`, `left_outer_join_*`, `right_outer_join_*`,
`full_outer_join_*`, `cross_join_*`).
- **§5 (sql_expr extensions).** `src/dsl/grammar/sql_expr.rs`
`qualified_ref_basic_shapes`, `qualified_ref_function_call_*`,
`subquery_recursion_through_compound`, `scalar_subquery_*`.
- **§6 (subquery primaries).** Same — `scalar_subquery_as_primary`,
`exists_subquery_in_where_clause`, `in_subquery_predicate`.
- **§7 (set operations).** sql_select.rs — `union_*`,
`intersect_*`, `except_*`, `set_op_with_outer_order_by_and_limit`.
- **§8 (CTEs).** sql_select.rs — `cte_*` parse tests; walker
driver — `cte_harvest_*` (10 tests) + `cte_arity_*` (4 tests).
- **§9 (out-of-subset rejects + depth cap).** sql_select.rs —
`natural_join_rejected`, `using_clause_rejected`,
`values_row_source_rejected`, `lateral_join_rejected`,
`window_function_rejected`, `comma_from_is_rejected`,
`pathological_nesting_capped`.
- **§10.1–§10.6 (scope discipline + harvest).** Walker driver
module — `from_scope_*`, `projection_aliases_*`,
`scoped_subgrammar_*`, `cte_harvest_*`,
`cte_harvest_sibling_b_sees_a_columns`,
`cte_harvest_nested_with_in_cte_body`. Walker module —
`projection_before_from_tests` module (4 tests).
Completion — `qualified_prefix_*` (5 tests), `lookahead_*`
(4 tests).
- **§11.2 ERROR diagnostics.** Walker module —
`unknown_qualifier_in_qualified_ref_is_error`,
`ambiguous_bare_column_is_error`,
`duplicate_cte_in_same_with_block_is_error`,
`projection_alias_in_where_is_misplaced` /
`_in_having_is_misplaced` / `_in_group_by_is_misplaced` /
`_in_order_by_is_allowed`,
`compound_union_arity_mismatch_fires` /
`_intersect_` / `_except_` / `_union_all_` /
`_three_leg_chain_emits_per_mismatch` /
`_with_function_call_args_not_confused` /
`_inside_cte_body_detected`, `cte_arity_mismatch_when_col_list_*`,
`cte_arity_match_no_diagnostic`.
- **§11.6 WARNING diagnostics (Phase-1 gap closure).** Walker
module — `sql_where_like_numeric_warns`,
`sql_where_eq_null_warns`, `sql_where_type_mismatch_*`,
`sql_having_predicate_warning_fires`,
`sql_join_on_predicate_warning_fires`,
`sql_case_predicate_warning_fires`,
`sql_order_by_predicate_warning_fires`,
`sql_projection_predicate_warning_fires`.
- **§11.5 catalog + §11.4 engine routing.** Friendly translate —
`aggregate_misuse_engine_message_routes_through_catalog`,
`group_by_required_engine_message_routes_through_catalog`,
`compound_arity_engine_message_routes_through_catalog`,
`scalar_subquery_too_many_rows_routes_through_catalog`.
- **§12 type recovery.** `tests/sql_select.rs`
`database_run_select_recovers_bool_column_type`,
`database_run_select_recovers_text_type_through_alias`,
`database_run_select_computed_expression_stays_typeless`,
`database_run_select_recovers_all_ten_playground_types`.
- **§13 OOS list.** sql_select.rs — see §9 above (the OOS-rejection
tests cover §13's enumerated shapes).
- **Amendment 1 (empirical column-origin scope).** Worker probe
in `src/db.rs::resolve_select_column_types` — exercised by
every §12 type-recovery test.
- **Amendment 2 (§10.6 mechanism).** `projection_before_from_tests`
module documents the post-walk re-resolve via 2d's two-pass
diagnostic + the renderer's diagnostic-overlay path.
## §4. Autonomous-decision audit
Per CLAUDE.md, every implementation decision not explicitly
authorised by the ADR or user is listed here with rationale and
the explicit user confirmation that approved it.
### 4.1. Sub-phase 2d.1 (this session)
Three `diagnostic.*` keys were deferred by the 2d implementer
without user approval (handoff-27 §3.2). User confirmed the
correct interpretation: two of the three (`projection_alias_misplaced`,
`compound_arity_mismatch`) needed to land in 2d.1; the third
(`cte_arity_mismatch`) was correctly attached to the user-approved
§10.3 stage-2 deferral. **2d.1 commit `c20c6e0` closes both
independent keys.** `cte_arity_mismatch` landed alongside the
harvest in `dd37a1c`.
### 4.2. §10.3 stage-2 harvest design
Per handoff-27 §4.3, the implementer was asked to "think about
whether to escalate the harvest design to the user before
coding". This session escalated three design questions to the
user (post-walk path-scan vs per-frame record; pending_cte_harvest
trigger mechanism; col_list arity check scope), got explicit
sign-off on each (option A for Q1, "yes use pending_cte_harvest"
for Q2, "yes all three in this sub-phase" for Q3), and then
implemented per those answers. **Commit `dd37a1c` reflects the
user-approved design.**
### 4.3. Nested WITH inside subqueries / CTE bodies
ADR-0032 §10.3 implies subqueries can declare their own CTEs
(the shadowing note), but the grammar didn't admit nested WITH
inside SQL_SELECT_COMPOUND. User explicitly chose "Fix grammar
now" over the ADR-Amendment-2 carve-out. **Commit `fd25904`
closes the ADR-vs-implementation gap.**
### 4.4. Completion look-ahead for the edit-scenario
The user raised the realistic "edit an existing query" workflow
(`select c| from mytable` where FROM exists after the cursor).
ADR §10.6 explicitly accepted the "noisy global fallback" posture
for this case. User chose to improve it via a look-ahead probe.
**Commit `0fc7b08` adds the look-ahead, preserving the ADR's
posture as the fall-through when the full input doesn't parse
cleanly.**
### 4.5. §10.6 fixup mechanism choice
The ADR prescribed "rewriting the highlight class"; the
implementation uses a different mechanism (diagnostic-overlay
renderer + 2d's two-pass binding collection). User explicitly
chose "Write Amendment 2 now" to document this. **Commit
`ee0dafd` adds Amendment 2; commit `ed881ee` adds the
regression tests pinning the mechanism's user-visible behaviour.**
### 4.6. 2g matrix-driven implementation gaps
Filling the matrix surfaced three real production gaps:
- Advanced-mode UI rendering bypassed the highlight walker.
- Engine.* catalog keys were authored but unreached.
- Three predicate-warning slots (CASE, ORDER BY, projection) had
no explicit test coverage.
Each was a real Phase-2 deliverable, not a future enhancement.
**Commit `ed881ee` closes all three.** No additional user
approval was sought because the work was the literal completion
of items already approved as Phase-2 scope; the "defer this"
instinct that flagged them was the anti-pattern the user has
been correcting throughout the session.
### 4.7. DA rework — fourth UI gap
The first DA review of this report rubber-stamped the work with
PASS. The user caught that the DA section asked the right
questions but answered them too charitably. A genuine DA pass
produced seven critiques; the user routed the work back to
Phase 4 to address the three blocking ones (Blob test, engine
patterns vs. real SQLite output, manual UI verification).
Critique #3 (manual UI verification) surfaced a fourth
implementation gap not caught by any of the unit tests: the
validity indicator (`[ERR]` / `[WRN]`) was mode-gated to Simple,
so SQL predicate warnings emitted in Advanced mode but never
reached the indicator the user sees. Without the manual TUI
check this would have shipped silently. **Commit `05884bd`
addresses all three DA critiques + this surfaced fourth gap.**
The lesson: the matrix-row attribution for "validity indicator
fires for SQL" pointed to DSL tests, not SQL tests. Filling the
matrix doesn't guarantee the rows are correctly attributed; the
DA had to challenge the attribution to find this.
## §5. DA's written final review
The first attempt at this section was a rubber-stamp PASS. After
the user challenged it ("Did the DA have anything to say other
than PASS?"), the DA hat produced seven genuine critiques. Three
were blocking and went back to Phase 4 (now closed by commit
`05884bd`); four were observations recorded here for future
reference.
### Critiques addressed (now closed)
**#1 — All-10-types test's NULL-blob cell.** Justified after
empirical verification: column-origin metadata is row-independent
(the engine returns the source table+column from the prepared
statement, not from cell values). Added
`database_run_select_type_recovery_works_on_empty_table` that
pins this invariant on an empty table; the all-types test now
references it via explicit comment.
**#2 — Engine.* pattern matching against hand-coded strings.**
Closed: three new tests in `tests/sql_select.rs` produce real
SQLite engine errors via `run_select` and assert catalog
routing. Aggregate-in-WHERE confirms the actual SQLite wording
matches the pattern matcher. GROUP-BY-required and
scalar-subquery-too-many-rows are SQLite-permissive (no error
on the natural triggers), so those tests verify the matcher
doesn't false-positive on benign queries + synthetic-message
routing still works.
**#3 — No manual UI verification.** Closed: launched the TUI,
typed a representative `SELECT * FROM products WHERE price LIKE
5` in Advanced mode, confirmed (a) SELECT/FROM/WHERE/LIKE
render in keyword color, (b) the `[WRN]` indicator appears.
This step surfaced the fourth gap (validity verdict gated to
Simple); the additional fix in `05884bd` wires the verdict
through the active effective mode.
### Critiques recorded (non-blocking)
**#4 — Group-by pattern overbreadth.** `translate_generic`
matches any error containing "group by" — could route unrelated
future SQLite errors through `engine.group_by_required`. Low
risk in practice (SQLite's other "group by" mentions are rare
and the catalog wording is generic enough to not mislead). Pin
for follow-up if a real false-positive surfaces.
**#5 — Look-ahead probe cost.** The completion engine runs a
second walk per Tab press when the leading walk produced no
scope. For complex inputs this doubles parse cost. Not
benchmarked. Acceptable in current debounce-cadence usage;
revisit if profiling reveals it.
**#6 — Tests-after-code on the matrix-coverage tests.** The 13
new tests in `ed881ee` were written after the gaps were
discovered, not before — the spirit of CLAUDE.md's "establish
test coverage BEFORE making changes" rule. The work is sound,
the tests do prove the behavior, but the ordering was wrong.
Noted for future sub-phases.
**#7 — Matrix attribution wasn't verified row-by-row.** The
validity-indicator row pointed at DSL tests; the DA had to
catch this. Future matrix-filling work should cross-check the
attributed test actually exercises the claimed surface (e.g.,
a "fires for SQL" row should point at a SQL-input test, not a
DSL test that happens to share the same code path).
### Process critique
**The DA verdict on the first pass was a rubber stamp.** The
user had to challenge the verdict to get genuine engagement.
That's a Phase-1 anti-pattern (CLAUDE.md: "the DA is not a
rubber stamp"). The second-pass DA produced seven specific
critiques and routed the three blockers to rework. **Next
time, DA review starts by listing specific critiques without
the verdict, then concludes — not the other way around.**
### Are all tests green with zero skips?
Yes. **1446 passing, 0 failed, 1 ignored.** The one ignored
test is the unchanged Phase-1 doctest marker, present in the
baseline. Clippy clean across the workspace.
### Final verdict
**PASS, with the four non-blocking observations recorded above
for follow-up.**
Phase 2 is complete:
- Every ADR-0032 decision (§§113) has at least one test.
- Both Amendments (1 + 2) are documented and have regression
coverage.
- Every autonomous decision is flagged with the user
confirmation that approved it.
- The manual TUI check confirms SQL keywords highlight in
Advanced mode and the validity indicator surfaces SQL
diagnostics correctly.
The four non-blocking critiques are pinned in this report
rather than carried into the next phase as folklore. A future
session can decide whether to address them or accept them as
known trade-offs.
## §6. Commits
In commit order (oldest first), all 18 unpushed at the moment
this report writes:
```
e032f01 docs: ADR-0032 Amendment 1
8d29335 grammar: SQL SELECT full statement fragment (Phase 2a)
4f89106 walker: Node::ScopedSubgrammar variant + scope-frame stack
98a74b2 grammar: sql_expr additive extensions for §5/§6
b522d09 walker: populate from_scope table bindings
4ff054c walker: populate cte_bindings placeholders + projection_aliases
a491df3 grammar: migrate Phase-1 SELECT to ADR-0032 fragment
c5cf03b walker: SQL diagnostics — multi-binding scope, qualified refs
0c3847a db: column-origin type recovery in SELECT results
5d716e6 docs: handoff 27
c20c6e0 walker: 2d.1 — projection-alias misplaced + compound-arity
dd37a1c walker: 2e prereq — §10.3 stage-2 CTE harvest + cte_arity_mismatch
fd25904 grammar: admit WITH inside subqueries / CTE bodies (ADR-0032 §10.3)
0fc7b08 completion: §10.5 qualified-prefix + edit-scenario look-ahead
ee0dafd docs: ADR-0032 Amendment 2 + §10.6 regression tests
ed881ee 2g: advanced-mode highlight + engine.* wiring + matrix tests
05884bd 2g rework: address DA findings on type recovery + engine routing + UI
<this report>
```
Per CLAUDE.md, push is a user step. Phase 2 lands in
`origin/main` whenever the user pushes.