Files
rdbms-playground/docs/handoff/20260523-phase-3-verification.md
T
claude@clouddev1 380c4238ef test+docs: 3k Phase-3 verification sweep — e2e DML + filled cross-cut matrix
Sub-phase 3k of ADR-0033. Adds the Tier-3 end-to-end DML suite (tests/sql_dml_e2e.rs) and the cross-cut gap-fill tests, fills the verification matrix (every row a verified file::function), and produces the phase-exit report.

- tests/sql_dml_e2e.rs: INSERT…SELECT cross-table, all-ten-type multi-row INSERT + RETURNING type recovery, UPDATE-with-subquery-in-SET, cascade DELETE, UPSERT round-trip, RETURNING x3, history.log replay, OOS rejections (full §13 table), validity-indicator-from-SQL-DML.
- walker/mod.rs, highlight.rs, completion.rs, input_render.rs: inherited-diagnostic, DML-keyword highlight, INSERT INTO completion, and advanced-mode DML hint-panel cross-cuts.
- Matrix correction (user-confirmed): predicate warnings fire on row-scoped DML slots; INSERT VALUES has no row scope (ADR-0033 §8.4).
- Auto-snapshot row marked N/A (user-confirmed): ADR-0006 unimplemented for both paths; deferred.

/runda round: added an advanced-mode DML hint-panel test (A6 was attributed to simple-mode prose under the §8 advanced heading); extended OOS coverage to the full ADR-0033 §13 table (OOS-5 INDEXED BY / OOS-6 multi-statement) + a trailing-semicolon guard.

1645 passing / 0 failing / 0 skipped / 1 ignored. Clippy clean.
2026-05-23 22:26:04 +00:00

15 KiB
Raw Permalink Blame History

Phase 3 (ADR-0033) — phase-exit verification report

Date: 2026-05-23 Sub-phase: 3k (verification sweep) Driving ADR: ADR-0033 — full SQL DML grammar (+ Amendments 13) Plan: docs/plans/20260520-adr-0033-phase-3.md

This report closes Phase 3. It records the test totals, the filled cross-cut verification matrix (in the plan doc), the requirements-to-test mapping, the autonomous decisions made during 3k (both user-confirmed), and the Devil's-Advocate final review.


1. Test-suite totals

Metric Phase-2 baseline Phase-3 exit (3k) Δ
passed 1446 1645 +199
failed 0 0 0
skipped 0 0 0
ignored 1 (doctest) 1 (doctest) 0
  • cargo test1645 passed; 0 failed; 0 skipped; 1 ignored. The single ignored test is the unchanged src/friendly/mod.rs doctest (a non-executable illustrative snippet), as in every prior phase.
  • cargo clippy --all-targets -- -D warningsclean.
  • No regression from the Phase-2 baseline; every prior test stays green.

3k itself added 19 tests over the handoff-33 count (1626):

  • tests/sql_dml_e2e.rs — 11 Tier-3 end-to-end DML tests (new file), including the OOS parse-rejections (extended to the full §13 table — OOS-1…6 — during the /runda round) and the single-statement trailing-; guard.
  • src/dsl/walker/mod.rs::tests — 5 inherited-diagnostic cross-cuts (I3 DELETE/UPDATE WHERE, I6 like-numeric on DELETE WHERE, I7 the corrected row-scoped predicate slots).
  • src/dsl/walker/highlight.rs::tests — 1 DML-keyword highlight test.
  • src/completion.rs::tests — 1 INSERT INTO target-table completion test.
  • src/input_render.rs::tests — 1 advanced-mode DML hint-panel test (added in the /runda round; see §5.1).

2. Cross-cut verification matrix

The full matrix is filled in the plan doc: docs/plans/20260520-adr-0033-phase-3.md → "Cross-cut verification matrix". Every row carries a verified file::function (the legend maps the abbreviations) and a status:

  • 85 rows — a green test whose INPUT matches the claim source (SQL claims take SQL-syntax input, per the handoff-29 §4.4 pin, cross-checked at attribution time by reading each test).
  • 1 row N/A — auto-snapshot (ADR-0006), unimplemented for both paths; see §4 / footnote [e].
  • 0 rows .

Five rows carry footnotes recording resolutions surfaced during 3k (matrix footnotes [a][e]): the Amendment-1 dispatch mechanism, the Amendment-3 simple-mode pointer, the Amendment-2 cascade count-diff, the corrected predicate-in-VALUES claim, and the auto-snapshot deferral.

Attribution integrity note. The initial catalog pass (sub-agent survey) produced several fabricated function names and mis-attributed the dispatch rows to the 3a synthetic smoke tests. Every row in the final matrix was re-verified by reading the named test, per the project's "probe, don't reason" discipline. In particular the inherited-diagnostic rows (I1I7) and the dispatch rows now point at the real tests in src/dsl/walker/mod.rs::tests and src/dsl/parser.rs::tests.


3. Requirements-to-test mapping (ADR-0033 §§113)

§ Decision Proven by
§1 INSERT/UPDATE/DELETE statement shapes ins::single_row_…, ins::multi_row_…, ins::no_column_list_…, upd::*, del::*
§2 Dispatch on shared entry words (Amendments 1 & 3) parser::advanced_ambiguous_{insert,update,delete}_routes_to_sql, parser::advanced_dsl_only_delete_falls_back_to_dsl, parser::simple_*
§3 Multi-row VALUES ins::multi_row_insert_persists_both_rows, e2e::e2e_multirow_insert_all_ten_types_…
§4 INSERT … SELECT (+ WITH-prefixed row source) ins::insert_select_copies_rows_and_persists, ins::with_prefixed_insert_select_runs_and_persists, e2e::e2e_insert_select_cross_table_…
§5 RETURNING on all three kinds ins::insert_returning_*, upd::update_returning_*, del::delete_returning_*, e2e::e2e_returning_on_insert_update_delete
§6 shortid auto-fill (worker post-fill) ins::values_autofills_*, ins::*_distinct_shortids, ins::combined_serial_and_shortid_autofill
§7 Cascade summary on DELETE (Amendment 2 — count-diff) del::cascade_parity_with_dsl, del::r2_where_with_subquery, del::cascade_delete_reports_summary_and_repersists_child
§8.1 insert_arity_mismatch (ERROR, per-row) walker::insert_arity_mismatch_*
§8.2 auto_column_overridden (WARNING) walker::auto_column_overridden_*
§8.3 not_null_missing (WARNING) walker::not_null_missing_*
§8.4 Inherited Phase-2 diagnostics on DML slots walker::insert_column_list_unknown_column_is_flagged, walker::sql_update_*, walker::sql_delete_where_*, walker::insert_select_where_predicate_warns
§9 UPSERT (ON CONFLICT … DO NOTHING/UPDATE, excluded) ins::on_conflict_*, completion::excluded_prefix_completes_to_target_columns, walker::excluded_*
§10 Three typed Command variants + worker handlers + persistence ins/upd/del worker round-trips, e2e::*
§11 Engine-error translation (reuses existing keys) friendly::enrich_unique_*, friendly::enrich_not_null_*, del::delete_violating_fk_fails_and_persists_nothing
§12 Result-column type recovery on RETURNING e2e::e2e_multirow_insert_all_ten_types_… (all ten), ins::insert_returning_recovers_multiple_bare_column_types
§13 OOS rejections (DEFAULT VALUES, OR REPLACE, UPDATE FROM, WITH UPDATE/DELETE) e2e::e2e_out_of_scope_dml_forms_parse_reject

4. Autonomous decisions during 3k (both user-confirmed)

Per CLAUDE.md: an autonomous decision without explicit user confirmation is a verdict FAIL. Two ADR-vs-implementation mismatches surfaced while filling the matrix; both were escalated and resolved before any test or matrix edit landed (AskUserQuestion, 2026-05-23):

  1. Matrix row I7 — "predicate warning fires inside INSERT VALUES sql_expr". Probing showed the predicate-warning pass fires on every DML sql_expr slot that carries a resolvable column binding (UPDATE/DELETE WHERE, UPDATE SET incl. CASE, INSERT…SELECT projection & WHERE) but not in plain INSERT … VALUES, because a VALUES literal has no row scope (a bare-column predicate there is meaningless). ADR-0033 §8.4 itself describes the pass as covering "WHERE and CASE expressions." User decision: correct the matrix claim to the realizable, row-scoped slots. Backed by new tests walker::sql_update_set_case_predicate_warns and walker::insert_select_where_predicate_warns. No code change.

  2. Matrix row "Auto-snapshot fires for SQL DML the same way as DSL" (ADR-0006 / ADR-0033 §10). ADR-0006 auto-snapshot/undo is unimplemented for both the DSL and SQL paths (the U-series, per CLAUDE.md "Things deliberately deferred"). User decision: mark N/A — deferred. The "same as DSL" parity holds vacuously and Phase 3 introduces no regression; ADR-0033 §10's auto-snapshot consequence is contingent on the deferred ADR-0006 work.

No other autonomous decisions were made in 3k. (The 3d Option-A/B escalation and the Amendment 1/2/3 decisions were resolved in their own sub-phases with user approval, recorded in ADR-0033.)


5. Devil's-Advocate final review

Specific critiques first; verdict last (handoff-28/29 §4.1 pin — a rubber-stamp PASS is a process failure).

5.1 /runda round (second-pass DA, user-invoked)

A user-invoked /runda round re-checked the matrix attributions against the actual test inputs/modes and re-swept the requirement sources. It found and closed two issues, and re-verified the rest:

  1. A6 (hint-panel) was attributed to Simple-mode tests under an advanced (§8) heading. The hint-prose tests call ambient_hint, which defaults to Mode::Simple (the DSL value-slot prose) — so the §8 advanced-surface claim had no advanced-mode attribution. Fixed: added input_render::advanced_mode_ambient_offers_dml_slot_candidates (column candidates at an advanced-mode UPDATE … SET LHS and inside an INSERT … ( column list); matrix footnote [f].

  2. OOS-5 / OOS-6 (ADR-0033 §13) had no tests. The plan's matrix scoped OOS to §13's first four rows; OOS-5 (INDEXED BY / NOT INDEXED) and OOS-6 (multi-statement batch) were unverified. Probed: all reject in Advanced mode; a single statement with a trailing ; still parses. Fixed: extended e2e_out_of_scope_dml_forms_parse_reject to the full §13 table and added e2e_single_dml_statement_with_trailing_semicolon_parses; added matrix rows OOS-5/OOS-6.

Re-verified clean (no change needed):

  • requirements.md — Q1/Q2 (SQL subset + clear out-of-subset rejection message) and M1/M2 (mode acceptance) are multi-phase and remain [ ] by design; Phase 3 contributes the DML slice without claiming closure. Q2's polished rejection message is explicitly "Implementation pending" — the 3k OOS tests assert only that the forms parse-reject (ADR-0033 §13), not message quality, so no over-claim. M4 is the deferred side-channel (Amendment 3).
  • Issue trackersorigin is GitHub (gh issue list empty); the configured gitea instance does not host this repo. The handoff's "#8/#9/#12" are internal task/finding numbers, not external issues. No external acceptance criteria exist beyond the ADRs / plan / requirements.md.
  • Engine-neutrality — all new Phase-3 catalog strings (also_valid_sql, insert_arity_mismatch, auto_column_overridden, not_null_missing, plus the reused eq_null/like_numeric/ sql_in_simple) are engine-neutral; a full scan of src/friendly/strings/en-US.yaml for sqlite/rusqlite/pragma/ strict finds only a code comment and the internal sqlite: key whose value ("database error: …") is neutral. engine_vocabulary_audit is green.
  • The catalog-attribution sub-agents' fabrications (noted below) were the original trigger for verifying every row by hand; the /runda pass confirmed no fabricated or mode-mismatched attribution survived.

5.2 Final critiques

Critiques raised and their resolution

  1. The catalog sub-agents fabricated test names and mis-attributed the dispatch rows. Real risk of a matrix that points at non-existent or wrong tests. Resolved: every row was re-verified by reading the named test; the I1I7 and dispatch rows now point at the real walker:: / parser:: tests, not the fabricated names or the 3a smoke registry.

  2. Two matrix rows over-claimed vs the implementation (predicate warnings in VALUES; auto-snapshot). The Phase-2 lesson was "expect at least one gap from filling the matrix" — Phase 3 surfaced two. Resolved: both escalated to the user and dispositioned (§4); the corrected predicate claim gained two backing tests, and the auto-snapshot row is N/A-deferred, not silently dropped.

  3. R5 (ten-type recovery via RETURNING) was only 5/10 before 3k. The existing insert_returning_recovers_multiple_bare_column_types covered int/text/decimal/real/bool only. Resolved: the new e2e::e2e_multirow_insert_all_ten_types_… asserts column-origin recovery for all ten types via a RETURNING list.

  4. blob is inserted NULL in the all-types e2e. Honest limitation: src/dsl/value.rs has no blob value-literal yet (a pre-existing, tracked DSL gap — not a Phase-3 regression). The blob column's type still round-trips through the RETURNING column-origin path, which is what §12 claims. The data round-trip is exercised for the other nine types.

  5. The Tier-3 e2e tests go parse → worker, not through the full Tokio event loop. Mitigation: the App dispatch seam is covered separately (parser::* dispatch tests + the sql_select.rs App-driven tests), and the runtime replay loop is exercised end-to-end by e2e::e2e_replay_phase3_dml_forms_from_a_script (run_replay → parse-in-advanced → worker → persisted state). The full stack is covered across these seams.

  6. No requirement was silently downgraded to "later." The only deferrals (auto-snapshot, blob literals) are pre-existing, user-confirmed, and tracked — not new scope reductions introduced by 3k.

Verdict

PASS. All 1645 tests green, zero failed, zero skipped, one unchanged ignored doctest; clippy clean; no regression from the Phase-2 1446 baseline. Every cross-cut matrix row is (85) or N/A-deferred (1, user-confirmed), with verified file::function attributions whose inputs/modes match the claim source — re-checked in the /runda round (§5.1), which corrected the A6 mode mismatch and closed the OOS-5/6 coverage gap. The two over-claims the matrix carried (predicate-in-VALUES, auto-snapshot) were surfaced, escalated, and resolved with the user rather than papered over.


6. State at handoff

  • Branch: main. Tests: 1645 passing, 0 failing, 0 skipped, 1 ignored. Clippy clean.
  • Uncommitted (this session's 3k work, pending the user's commit approval):
    • tests/sql_dml_e2e.rs (new — 11 Tier-3 e2e tests)
    • src/dsl/walker/mod.rs (5 cross-cut tests)
    • src/dsl/walker/highlight.rs (DML-keyword highlight test)
    • src/completion.rs (INSERT INTO target-table completion test)
    • src/input_render.rs (advanced-mode DML hint-panel test — /runda)
    • docs/plans/20260520-adr-0033-phase-3.md (matrix filled + footnotes)
    • docs/handoff/20260523-phase-3-verification.md (this report)
  • ADR-0033 can move from Proposed to Accepted once the user accepts this report (the plan's §5 hand-back point). Its three amendments are already settled.

Follow-ups (tracked, not blocking Phase 3)

Per the user's standing "extra tasks after phase 3 is complete":

  • ADR-0006 auto-snapshot/undo (U-series) — implement for both the DSL and SQL DML worker handlers; satisfies the N/A matrix row.
  • M4 — execution-time mode side-channel (ADR-0033 Amendment 3; requirements.md). Its own future ADR.
  • TASK #8 — ADR-0034 history journal; TASK #9 — replay line format (<ts>|<status>|<source>). Orthogonal to Phase 3.
  • blob value-literal in the DSL/SQL value grammar (src/dsl/value.rs) — pre-existing gap.
  • The cosmetic tests/sql_*.rs helper names (run_sqlinsert etc.) retain their dev-word-era spelling though they now parse the real words.