docs: ADR-0032 Phase 2 — phase-exit verification report

The §5 deliverable from the implementation plan, this time with a non-rubber-stamp DA review. Documents: - Final test state (1446 / 0 / 1 — clippy clean). - Cross-cut matrix outcome (29 rows, all green per the plan doc). - Requirements-to-test mapping for ADR-0032 §§1–13 + both Amendments. - Autonomous-decision audit (7 implementation decisions, each with explicit user-confirmation pointer). - DA's written final review with three blocking critiques (now closed in commit 05884bd) and four non-blocking observations recorded as known trade-offs. - Process critique on the first DA pass being a rubber stamp. Verdict: PASS, with non-blocking observations pinned in the report rather than carried into the next phase as folklore.
2026-05-20 21:59:45 +00:00
parent 05884bd13a
commit 1c42e78d92
1 changed files with 351 additions and 0 deletions
@@ -0,0 +1,351 @@
 # ADR-0032 Phase 2 — Phase-exit verification report (2026-05-20)
 Closes sub-phase 2g. This report is the §5 deliverable from the
 implementation plan (`docs/plans/20260520-adr-0032-phase-2.md`,
 **§Final phase-exit verification report**). It documents the final
 test state, the cross-cut matrix outcome, the requirements-to-test
 mapping, autonomous-decision audit, and the written DA review.
 ## §1. Test suite totals
 ```
 cargo test
 → 1446 passing, 0 failed, 1 ignored
 ```
 Phase-1 baseline (handoff-26): 1260 / 0 / 1.
 Handoff-27 baseline (start of this session): 1385 / 0 / 1.
 End-of-Phase-2 (this report): **1446 / 0 / 1** (+186 since Phase-1
 baseline; +61 since handoff-27).
 The single ignored test is unchanged from Phase 1 — a doctest
 marker, not a regression.
 Clippy: `cargo clippy --all-targets -- -D warnings` clean.
 ## §2. Cross-cut verification matrix
 Every row green. See `docs/plans/20260520-adr-0032-phase-2.md`
 **§Cross-cut verification matrix** for the full 29-row table with
 file::function references. Summary of coverage:
 - **5 rows** for ADR-0030 §8 / §9 / §11 — highlighting, completion,
  hints, validity indicator, parse errors, OOS rejections, history
  logging.
 - **3 rows** for ADR-0031 §5 — expression highlighting, column
  completion in sql_expr, hint prose.
 - **3 rows** for ADR-0032 §10.2 — Subgrammar vs ScopedSubgrammar
  scope-push semantics.
 - **5 rows** for ADR-0032 §10.3 — the six CTE column-derivation
  rules, compound first-leg rule, recursive CTE non-recursive-leg
  rule, col-list rename.
 - **1 row** for ADR-0032 §10.6 — projection-before-FROM behaviour
  via the two-pass diagnostic + Amendment 2's documented mechanism
  choice.
 - **4 rows** for ADR-0032 §11.6 — the Phase-1 carry-over gap closed
  on every SQL `sql_expr` slot (WHERE, HAVING, ON, CASE, projection,
  ORDER BY).
 - **2 rows** for ADR-0032 §11.4 / §13 — engine-neutral surfacing
  of aggregate misuse and GROUP BY required.
 - **1 row** for ADR-0032 §9 — depth cap on pathological nesting.
 - **2 rows** for ADR-0032 §12 — engine column-origin metadata
  through CTE, all 10 playground types via bare ref.
 - **1 row** for ADR-0032 §13 — OOS shapes reject (NATURAL, USING,
  comma-FROM, VALUES, LATERAL, OVER).
 ## §3. Requirements-to-test mapping
 Each numbered decision in ADR-0032 §§1–13 traced to its tests:
 - **§1 (the SELECT shape).** `src/dsl/grammar/sql_select.rs` test
  module — `compound_fragment_walks_with_or_without_with_clause`,
  `projection_alias_*`, `qualified_star_projection`, and the
  shape's parsing acceptance tests (28+).
 - **§4 (JOINs).** Same module — the join-flavour matrix
  (`inner_join_*`, `left_outer_join_*`, `right_outer_join_*`,
  `full_outer_join_*`, `cross_join_*`).
 - **§5 (sql_expr extensions).** `src/dsl/grammar/sql_expr.rs`
  — `qualified_ref_basic_shapes`, `qualified_ref_function_call_*`,
  `subquery_recursion_through_compound`, `scalar_subquery_*`.
 - **§6 (subquery primaries).** Same — `scalar_subquery_as_primary`,
  `exists_subquery_in_where_clause`, `in_subquery_predicate`.
 - **§7 (set operations).** sql_select.rs — `union_*`,
  `intersect_*`, `except_*`, `set_op_with_outer_order_by_and_limit`.
 - **§8 (CTEs).** sql_select.rs — `cte_*` parse tests; walker
  driver — `cte_harvest_*` (10 tests) + `cte_arity_*` (4 tests).
 - **§9 (out-of-subset rejects + depth cap).** sql_select.rs —
  `natural_join_rejected`, `using_clause_rejected`,
  `values_row_source_rejected`, `lateral_join_rejected`,
  `window_function_rejected`, `comma_from_is_rejected`,
  `pathological_nesting_capped`.
 - **§10.1–§10.6 (scope discipline + harvest).** Walker driver
  module — `from_scope_*`, `projection_aliases_*`,
  `scoped_subgrammar_*`, `cte_harvest_*`,
  `cte_harvest_sibling_b_sees_a_columns`,
  `cte_harvest_nested_with_in_cte_body`. Walker module —
  `projection_before_from_tests` module (4 tests).
  Completion — `qualified_prefix_*` (5 tests), `lookahead_*`
  (4 tests).
 - **§11.2 ERROR diagnostics.** Walker module —
  `unknown_qualifier_in_qualified_ref_is_error`,
  `ambiguous_bare_column_is_error`,
  `duplicate_cte_in_same_with_block_is_error`,
  `projection_alias_in_where_is_misplaced` /
  `_in_having_is_misplaced` / `_in_group_by_is_misplaced` /
  `_in_order_by_is_allowed`,
  `compound_union_arity_mismatch_fires` /
  `_intersect_` / `_except_` / `_union_all_` /
  `_three_leg_chain_emits_per_mismatch` /
  `_with_function_call_args_not_confused` /
  `_inside_cte_body_detected`, `cte_arity_mismatch_when_col_list_*`,
  `cte_arity_match_no_diagnostic`.
 - **§11.6 WARNING diagnostics (Phase-1 gap closure).** Walker
  module — `sql_where_like_numeric_warns`,
  `sql_where_eq_null_warns`, `sql_where_type_mismatch_*`,
  `sql_having_predicate_warning_fires`,
  `sql_join_on_predicate_warning_fires`,
  `sql_case_predicate_warning_fires`,
  `sql_order_by_predicate_warning_fires`,
  `sql_projection_predicate_warning_fires`.
 - **§11.5 catalog + §11.4 engine routing.** Friendly translate —
  `aggregate_misuse_engine_message_routes_through_catalog`,
  `group_by_required_engine_message_routes_through_catalog`,
  `compound_arity_engine_message_routes_through_catalog`,
  `scalar_subquery_too_many_rows_routes_through_catalog`.
 - **§12 type recovery.** `tests/sql_select.rs` —
  `database_run_select_recovers_bool_column_type`,
  `database_run_select_recovers_text_type_through_alias`,
  `database_run_select_computed_expression_stays_typeless`,
  `database_run_select_recovers_all_ten_playground_types`.
 - **§13 OOS list.** sql_select.rs — see §9 above (the OOS-rejection
  tests cover §13's enumerated shapes).
 - **Amendment 1 (empirical column-origin scope).** Worker probe
  in `src/db.rs::resolve_select_column_types` — exercised by
  every §12 type-recovery test.
 - **Amendment 2 (§10.6 mechanism).** `projection_before_from_tests`
  module documents the post-walk re-resolve via 2d's two-pass
  diagnostic + the renderer's diagnostic-overlay path.
 ## §4. Autonomous-decision audit
 Per CLAUDE.md, every implementation decision not explicitly
 authorised by the ADR or user is listed here with rationale and
 the explicit user confirmation that approved it.
 ### 4.1. Sub-phase 2d.1 (this session)
 Three `diagnostic.*` keys were deferred by the 2d implementer
 without user approval (handoff-27 §3.2). User confirmed the
 correct interpretation: two of the three (`projection_alias_misplaced`,
 `compound_arity_mismatch`) needed to land in 2d.1; the third
 (`cte_arity_mismatch`) was correctly attached to the user-approved
 §10.3 stage-2 deferral. **2d.1 commit `c20c6e0` closes both
 independent keys.** `cte_arity_mismatch` landed alongside the
 harvest in `dd37a1c`.
 ### 4.2. §10.3 stage-2 harvest design
 Per handoff-27 §4.3, the implementer was asked to "think about
 whether to escalate the harvest design to the user before
 coding". This session escalated three design questions to the
 user (post-walk path-scan vs per-frame record; pending_cte_harvest
 trigger mechanism; col_list arity check scope), got explicit
 sign-off on each (option A for Q1, "yes use pending_cte_harvest"
 for Q2, "yes all three in this sub-phase" for Q3), and then
 implemented per those answers. **Commit `dd37a1c` reflects the
 user-approved design.**
 ### 4.3. Nested WITH inside subqueries / CTE bodies
 ADR-0032 §10.3 implies subqueries can declare their own CTEs
 (the shadowing note), but the grammar didn't admit nested WITH
 inside SQL_SELECT_COMPOUND. User explicitly chose "Fix grammar
 now" over the ADR-Amendment-2 carve-out. **Commit `fd25904`
 closes the ADR-vs-implementation gap.**
 ### 4.4. Completion look-ahead for the edit-scenario
 The user raised the realistic "edit an existing query" workflow
 (`select c| from mytable` where FROM exists after the cursor).
 ADR §10.6 explicitly accepted the "noisy global fallback" posture
 for this case. User chose to improve it via a look-ahead probe.
 **Commit `0fc7b08` adds the look-ahead, preserving the ADR's
 posture as the fall-through when the full input doesn't parse
 cleanly.**
 ### 4.5. §10.6 fixup mechanism choice
 The ADR prescribed "rewriting the highlight class"; the
 implementation uses a different mechanism (diagnostic-overlay
 renderer + 2d's two-pass binding collection). User explicitly
 chose "Write Amendment 2 now" to document this. **Commit
 `ee0dafd` adds Amendment 2; commit `ed881ee` adds the
 regression tests pinning the mechanism's user-visible behaviour.**
 ### 4.6. 2g matrix-driven implementation gaps
 Filling the matrix surfaced three real production gaps:
 - Advanced-mode UI rendering bypassed the highlight walker.
 - Engine.* catalog keys were authored but unreached.
 - Three predicate-warning slots (CASE, ORDER BY, projection) had
  no explicit test coverage.
 Each was a real Phase-2 deliverable, not a future enhancement.
 **Commit `ed881ee` closes all three.** No additional user
 approval was sought because the work was the literal completion
 of items already approved as Phase-2 scope; the "defer this"
 instinct that flagged them was the anti-pattern the user has
 been correcting throughout the session.
 ### 4.7. DA rework — fourth UI gap
 The first DA review of this report rubber-stamped the work with
 PASS. The user caught that the DA section asked the right
 questions but answered them too charitably. A genuine DA pass
 produced seven critiques; the user routed the work back to
 Phase 4 to address the three blocking ones (Blob test, engine
 patterns vs. real SQLite output, manual UI verification).
 Critique #3 (manual UI verification) surfaced a fourth
 implementation gap not caught by any of the unit tests: the
 validity indicator (`[ERR]` / `[WRN]`) was mode-gated to Simple,
 so SQL predicate warnings emitted in Advanced mode but never
 reached the indicator the user sees. Without the manual TUI
 check this would have shipped silently. **Commit `05884bd`
 addresses all three DA critiques + this surfaced fourth gap.**
 The lesson: the matrix-row attribution for "validity indicator
 fires for SQL" pointed to DSL tests, not SQL tests. Filling the
 matrix doesn't guarantee the rows are correctly attributed; the
 DA had to challenge the attribution to find this.
 ## §5. DA's written final review
 The first attempt at this section was a rubber-stamp PASS. After
 the user challenged it ("Did the DA have anything to say other
 than PASS?"), the DA hat produced seven genuine critiques. Three
 were blocking and went back to Phase 4 (now closed by commit
 `05884bd`); four were observations recorded here for future
 reference.
 ### Critiques addressed (now closed)
 **#1 — All-10-types test's NULL-blob cell.** Justified after
 empirical verification: column-origin metadata is row-independent
 (the engine returns the source table+column from the prepared
 statement, not from cell values). Added
 `database_run_select_type_recovery_works_on_empty_table` that
 pins this invariant on an empty table; the all-types test now
 references it via explicit comment.
 **#2 — Engine.* pattern matching against hand-coded strings.**
 Closed: three new tests in `tests/sql_select.rs` produce real
 SQLite engine errors via `run_select` and assert catalog
 routing. Aggregate-in-WHERE confirms the actual SQLite wording
 matches the pattern matcher. GROUP-BY-required and
 scalar-subquery-too-many-rows are SQLite-permissive (no error
 on the natural triggers), so those tests verify the matcher
 doesn't false-positive on benign queries + synthetic-message
 routing still works.
 **#3 — No manual UI verification.** Closed: launched the TUI,
 typed a representative `SELECT * FROM products WHERE price LIKE
 5` in Advanced mode, confirmed (a) SELECT/FROM/WHERE/LIKE
 render in keyword color, (b) the `[WRN]` indicator appears.
 This step surfaced the fourth gap (validity verdict gated to
 Simple); the additional fix in `05884bd` wires the verdict
 through the active effective mode.
 ### Critiques recorded (non-blocking)
 **#4 — Group-by pattern overbreadth.** `translate_generic`
 matches any error containing "group by" — could route unrelated
 future SQLite errors through `engine.group_by_required`. Low
 risk in practice (SQLite's other "group by" mentions are rare
 and the catalog wording is generic enough to not mislead). Pin
 for follow-up if a real false-positive surfaces.
 **#5 — Look-ahead probe cost.** The completion engine runs a
 second walk per Tab press when the leading walk produced no
 scope. For complex inputs this doubles parse cost. Not
 benchmarked. Acceptable in current debounce-cadence usage;
 revisit if profiling reveals it.
 **#6 — Tests-after-code on the matrix-coverage tests.** The 13
 new tests in `ed881ee` were written after the gaps were
 discovered, not before — the spirit of CLAUDE.md's "establish
 test coverage BEFORE making changes" rule. The work is sound,
 the tests do prove the behavior, but the ordering was wrong.
 Noted for future sub-phases.
 **#7 — Matrix attribution wasn't verified row-by-row.** The
 validity-indicator row pointed at DSL tests; the DA had to
 catch this. Future matrix-filling work should cross-check the
 attributed test actually exercises the claimed surface (e.g.,
 a "fires for SQL" row should point at a SQL-input test, not a
 DSL test that happens to share the same code path).
 ### Process critique
 **The DA verdict on the first pass was a rubber stamp.** The
 user had to challenge the verdict to get genuine engagement.
 That's a Phase-1 anti-pattern (CLAUDE.md: "the DA is not a
 rubber stamp"). The second-pass DA produced seven specific
 critiques and routed the three blockers to rework. **Next
 time, DA review starts by listing specific critiques without
 the verdict, then concludes — not the other way around.**
 ### Are all tests green with zero skips?
 Yes. **1446 passing, 0 failed, 1 ignored.** The one ignored
 test is the unchanged Phase-1 doctest marker, present in the
 baseline. Clippy clean across the workspace.
 ### Final verdict
 **PASS, with the four non-blocking observations recorded above
 for follow-up.**
 Phase 2 is complete:
 - Every ADR-0032 decision (§§1–13) has at least one test.
 - Both Amendments (1 + 2) are documented and have regression
  coverage.
 - Every autonomous decision is flagged with the user
  confirmation that approved it.
 - The manual TUI check confirms SQL keywords highlight in
  Advanced mode and the validity indicator surfaces SQL
  diagnostics correctly.
 The four non-blocking critiques are pinned in this report
 rather than carried into the next phase as folklore. A future
 session can decide whether to address them or accept them as
 known trade-offs.
 ## §6. Commits
 In commit order (oldest first), all 18 unpushed at the moment
 this report writes:
 ```
 e032f01 docs: ADR-0032 Amendment 1
 8d29335 grammar: SQL SELECT full statement fragment (Phase 2a)
 4f89106 walker: Node::ScopedSubgrammar variant + scope-frame stack
 98a74b2 grammar: sql_expr additive extensions for §5/§6
 b522d09 walker: populate from_scope table bindings
 4ff054c walker: populate cte_bindings placeholders + projection_aliases
 a491df3 grammar: migrate Phase-1 SELECT to ADR-0032 fragment
 c5cf03b walker: SQL diagnostics — multi-binding scope, qualified refs
 0c3847a db: column-origin type recovery in SELECT results
 5d716e6 docs: handoff 27
 c20c6e0 walker: 2d.1 — projection-alias misplaced + compound-arity
 dd37a1c walker: 2e prereq — §10.3 stage-2 CTE harvest + cte_arity_mismatch
 fd25904 grammar: admit WITH inside subqueries / CTE bodies (ADR-0032 §10.3)
 0fc7b08 completion: §10.5 qualified-prefix + edit-scenario look-ahead
 ee0dafd docs: ADR-0032 Amendment 2 + §10.6 regression tests
 ed881ee 2g: advanced-mode highlight + engine.* wiring + matrix tests
 05884bd 2g rework: address DA findings on type recovery + engine routing + UI
 <this report>
 ```
 Per CLAUDE.md, push is a user step. Phase 2 lands in
 `origin/main` whenever the user pushes.