Files

T

claude@clouddev1 380c4238ef test+docs: 3k Phase-3 verification sweep — e2e DML + filled cross-cut matrix

Sub-phase 3k of ADR-0033. Adds the Tier-3 end-to-end DML suite (tests/sql_dml_e2e.rs) and the cross-cut gap-fill tests, fills the verification matrix (every row a verified file::function), and produces the phase-exit report.

- tests/sql_dml_e2e.rs: INSERT…SELECT cross-table, all-ten-type multi-row INSERT + RETURNING type recovery, UPDATE-with-subquery-in-SET, cascade DELETE, UPSERT round-trip, RETURNING x3, history.log replay, OOS rejections (full §13 table), validity-indicator-from-SQL-DML.
- walker/mod.rs, highlight.rs, completion.rs, input_render.rs: inherited-diagnostic, DML-keyword highlight, INSERT INTO completion, and advanced-mode DML hint-panel cross-cuts.
- Matrix correction (user-confirmed): predicate warnings fire on row-scoped DML slots; INSERT VALUES has no row scope (ADR-0033 §8.4).
- Auto-snapshot row marked N/A (user-confirmed): ADR-0006 unimplemented for both paths; deferred.

/runda round: added an advanced-mode DML hint-panel test (A6 was attributed to simple-mode prose under the §8 advanced heading); extended OOS coverage to the full ADR-0033 §13 table (OOS-5 INDEXED BY / OOS-6 multi-statement) + a trailing-semicolon guard.

1645 passing / 0 failing / 0 skipped / 1 ignored. Clippy clean.

2026-05-23 22:26:04 +00:00

15 KiB

Raw Permalink Blame History

Phase 3 (ADR-0033) — phase-exit verification report

Date: 2026-05-23 Sub-phase: 3k (verification sweep) Driving ADR: ADR-0033 — full SQL DML grammar (+ Amendments 1–3) Plan: docs/plans/20260520-adr-0033-phase-3.md

This report closes Phase 3. It records the test totals, the filled cross-cut verification matrix (in the plan doc), the requirements-to-test mapping, the autonomous decisions made during 3k (both user-confirmed), and the Devil's-Advocate final review.

1. Test-suite totals

Metric	Phase-2 baseline	Phase-3 exit (3k)	Δ
passed	1446	1645	+199
failed	0	0	0
skipped	0	0	0
ignored	1 (doctest)	1 (doctest)	0

cargo test → 1645 passed; 0 failed; 0 skipped; 1 ignored. The single ignored test is the unchanged src/friendly/mod.rs doctest (a non-executable illustrative snippet), as in every prior phase.
cargo clippy --all-targets -- -D warnings → clean.
No regression from the Phase-2 baseline; every prior test stays green.

3k itself added 19 tests over the handoff-33 count (1626):

tests/sql_dml_e2e.rs — 11 Tier-3 end-to-end DML tests (new file), including the OOS parse-rejections (extended to the full §13 table — OOS-1…6 — during the /runda round) and the single-statement trailing-; guard.
src/dsl/walker/mod.rs::tests — 5 inherited-diagnostic cross-cuts (I3 DELETE/UPDATE WHERE, I6 like-numeric on DELETE WHERE, I7 the corrected row-scoped predicate slots).
src/dsl/walker/highlight.rs::tests — 1 DML-keyword highlight test.
src/completion.rs::tests — 1 INSERT INTO target-table completion test.
src/input_render.rs::tests — 1 advanced-mode DML hint-panel test (added in the /runda round; see §5.1).

2. Cross-cut verification matrix

The full matrix is filled in the plan doc: docs/plans/20260520-adr-0033-phase-3.md → "Cross-cut verification matrix". Every row carries a verified file::function (the legend maps the abbreviations) and a status:

85 rows ✅ — a green test whose INPUT matches the claim source (SQL claims take SQL-syntax input, per the handoff-29 §4.4 pin, cross-checked at attribution time by reading each test).
1 row N/A — auto-snapshot (ADR-0006), unimplemented for both paths; see §4 / footnote [e].
0 rows ❌.

Five rows carry footnotes recording resolutions surfaced during 3k (matrix footnotes [a]–[e]): the Amendment-1 dispatch mechanism, the Amendment-3 simple-mode pointer, the Amendment-2 cascade count-diff, the corrected predicate-in-VALUES claim, and the auto-snapshot deferral.

Attribution integrity note. The initial catalog pass (sub-agent survey) produced several fabricated function names and mis-attributed the dispatch rows to the 3a synthetic smoke tests. Every row in the final matrix was re-verified by reading the named test, per the project's "probe, don't reason" discipline. In particular the inherited-diagnostic rows (I1–I7) and the dispatch rows now point at the real tests in src/dsl/walker/mod.rs::tests and src/dsl/parser.rs::tests.

3. Requirements-to-test mapping (ADR-0033 §§1–13)

§	Decision	Proven by
§1	INSERT/UPDATE/DELETE statement shapes	`ins::single_row_…`, `ins::multi_row_…`, `ins::no_column_list_…`, `upd::`, `del::`
§2	Dispatch on shared entry words (Amendments 1 & 3)	`parser::advanced_ambiguous_{insert,update,delete}_routes_to_sql`, `parser::advanced_dsl_only_delete_falls_back_to_dsl`, `parser::simple_*`
§3	Multi-row VALUES	`ins::multi_row_insert_persists_both_rows`, `e2e::e2e_multirow_insert_all_ten_types_…`
§4	INSERT … SELECT (+ WITH-prefixed row source)	`ins::insert_select_copies_rows_and_persists`, `ins::with_prefixed_insert_select_runs_and_persists`, `e2e::e2e_insert_select_cross_table_…`
§5	RETURNING on all three kinds	`ins::insert_returning_`, `upd::update_returning_`, `del::delete_returning_*`, `e2e::e2e_returning_on_insert_update_delete`
§6	shortid auto-fill (worker post-fill)	`ins::values_autofills_`, `ins::_distinct_shortids`, `ins::combined_serial_and_shortid_autofill`
§7	Cascade summary on DELETE (Amendment 2 — count-diff)	`del::cascade_parity_with_dsl`, `del::r2_where_with_subquery`, `del::cascade_delete_reports_summary_and_repersists_child`
§8.1	`insert_arity_mismatch` (ERROR, per-row)	`walker::insert_arity_mismatch_*`
§8.2	`auto_column_overridden` (WARNING)	`walker::auto_column_overridden_*`
§8.3	`not_null_missing` (WARNING)	`walker::not_null_missing_*`
§8.4	Inherited Phase-2 diagnostics on DML slots	`walker::insert_column_list_unknown_column_is_flagged`, `walker::sql_update_`, `walker::sql_delete_where_`, `walker::insert_select_where_predicate_warns`
§9	UPSERT (`ON CONFLICT … DO NOTHING/UPDATE`, `excluded`)	`ins::on_conflict_`, `completion::excluded_prefix_completes_to_target_columns`, `walker::excluded_`
§10	Three typed Command variants + worker handlers + persistence	`ins`/`upd`/`del` worker round-trips, `e2e::*`
§11	Engine-error translation (reuses existing keys)	`friendly::enrich_unique_`, `friendly::enrich_not_null_`, `del::delete_violating_fk_fails_and_persists_nothing`
§12	Result-column type recovery on RETURNING	`e2e::e2e_multirow_insert_all_ten_types_…` (all ten), `ins::insert_returning_recovers_multiple_bare_column_types`
§13	OOS rejections (DEFAULT VALUES, OR REPLACE, UPDATE FROM, WITH UPDATE/DELETE)	`e2e::e2e_out_of_scope_dml_forms_parse_reject`

4. Autonomous decisions during 3k (both user-confirmed)

Per CLAUDE.md: an autonomous decision without explicit user confirmation is a verdict FAIL. Two ADR-vs-implementation mismatches surfaced while filling the matrix; both were escalated and resolved before any test or matrix edit landed (AskUserQuestion, 2026-05-23):

Matrix row I7 — "predicate warning fires inside INSERT VALUES sql_expr". Probing showed the predicate-warning pass fires on every DML sql_expr slot that carries a resolvable column binding (UPDATE/DELETE WHERE, UPDATE SET incl. CASE, INSERT…SELECT projection & WHERE) but not in plain INSERT … VALUES, because a VALUES literal has no row scope (a bare-column predicate there is meaningless). ADR-0033 §8.4 itself describes the pass as covering "WHERE and CASE expressions." User decision: correct the matrix claim to the realizable, row-scoped slots. Backed by new tests walker::sql_update_set_case_predicate_warns and walker::insert_select_where_predicate_warns. No code change.
Matrix row "Auto-snapshot fires for SQL DML the same way as DSL" (ADR-0006 / ADR-0033 §10). ADR-0006 auto-snapshot/undo is unimplemented for both the DSL and SQL paths (the U-series, per CLAUDE.md "Things deliberately deferred"). User decision: mark N/A — deferred. The "same as DSL" parity holds vacuously and Phase 3 introduces no regression; ADR-0033 §10's auto-snapshot consequence is contingent on the deferred ADR-0006 work.

No other autonomous decisions were made in 3k. (The 3d Option-A/B escalation and the Amendment 1/2/3 decisions were resolved in their own sub-phases with user approval, recorded in ADR-0033.)

5. Devil's-Advocate final review

Specific critiques first; verdict last (handoff-28/29 §4.1 pin — a rubber-stamp PASS is a process failure).

5.1 `/runda` round (second-pass DA, user-invoked)

A user-invoked /runda round re-checked the matrix attributions against the actual test inputs/modes and re-swept the requirement sources. It found and closed two issues, and re-verified the rest:

A6 (hint-panel) was attributed to Simple-mode tests under an advanced (§8) heading. The hint-prose tests call ambient_hint, which defaults to Mode::Simple (the DSL value-slot prose) — so the §8 advanced-surface claim had no advanced-mode attribution. Fixed: added input_render::advanced_mode_ambient_offers_dml_slot_candidates (column candidates at an advanced-mode UPDATE … SET LHS and inside an INSERT … ( column list); matrix footnote [f].
OOS-5 / OOS-6 (ADR-0033 §13) had no tests. The plan's matrix scoped OOS to §13's first four rows; OOS-5 (INDEXED BY / NOT INDEXED) and OOS-6 (multi-statement batch) were unverified. Probed: all reject in Advanced mode; a single statement with a trailing ; still parses. Fixed: extended e2e_out_of_scope_dml_forms_parse_reject to the full §13 table and added e2e_single_dml_statement_with_trailing_semicolon_parses; added matrix rows OOS-5/OOS-6.

Re-verified clean (no change needed):

requirements.md — Q1/Q2 (SQL subset + clear out-of-subset rejection message) and M1/M2 (mode acceptance) are multi-phase and remain [ ] by design; Phase 3 contributes the DML slice without claiming closure. Q2's polished rejection message is explicitly "Implementation pending" — the 3k OOS tests assert only that the forms parse-reject (ADR-0033 §13), not message quality, so no over-claim. M4 is the deferred side-channel (Amendment 3).
Issue trackers — origin is GitHub (gh issue list empty); the configured gitea instance does not host this repo. The handoff's "#8/#9/#12" are internal task/finding numbers, not external issues. No external acceptance criteria exist beyond the ADRs / plan / requirements.md.
Engine-neutrality — all new Phase-3 catalog strings (also_valid_sql, insert_arity_mismatch, auto_column_overridden, not_null_missing, plus the reused eq_null/like_numeric/ sql_in_simple) are engine-neutral; a full scan of src/friendly/strings/en-US.yaml for sqlite/rusqlite/pragma/ strict finds only a code comment and the internal sqlite: key whose value ("database error: …") is neutral. engine_vocabulary_audit is green.
The catalog-attribution sub-agents' fabrications (noted below) were the original trigger for verifying every row by hand; the /runda pass confirmed no fabricated or mode-mismatched attribution survived.

5.2 Final critiques

Critiques raised and their resolution

The catalog sub-agents fabricated test names and mis-attributed the dispatch rows. Real risk of a matrix that points at non-existent or wrong tests. Resolved: every row was re-verified by reading the named test; the I1–I7 and dispatch rows now point at the real walker:: / parser:: tests, not the fabricated names or the 3a smoke registry.
Two matrix rows over-claimed vs the implementation (predicate warnings in VALUES; auto-snapshot). The Phase-2 lesson was "expect at least one gap from filling the matrix" — Phase 3 surfaced two. Resolved: both escalated to the user and dispositioned (§4); the corrected predicate claim gained two backing tests, and the auto-snapshot row is N/A-deferred, not silently dropped.
R5 (ten-type recovery via RETURNING) was only 5/10 before 3k. The existing insert_returning_recovers_multiple_bare_column_types covered int/text/decimal/real/bool only. Resolved: the new e2e::e2e_multirow_insert_all_ten_types_… asserts column-origin recovery for all ten types via a RETURNING list.
blob is inserted NULL in the all-types e2e. Honest limitation: src/dsl/value.rs has no blob value-literal yet (a pre-existing, tracked DSL gap — not a Phase-3 regression). The blob column's type still round-trips through the RETURNING column-origin path, which is what §12 claims. The data round-trip is exercised for the other nine types.
The Tier-3 e2e tests go parse → worker, not through the full Tokio event loop. Mitigation: the App dispatch seam is covered separately (parser::* dispatch tests + the sql_select.rs App-driven tests), and the runtime replay loop is exercised end-to-end by e2e::e2e_replay_phase3_dml_forms_from_a_script (run_replay → parse-in-advanced → worker → persisted state). The full stack is covered across these seams.
No requirement was silently downgraded to "later." The only deferrals (auto-snapshot, blob literals) are pre-existing, user-confirmed, and tracked — not new scope reductions introduced by 3k.

Verdict

PASS. All 1645 tests green, zero failed, zero skipped, one unchanged ignored doctest; clippy clean; no regression from the Phase-2 1446 baseline. Every cross-cut matrix row is ✅ (85) or N/A-deferred (1, user-confirmed), with verified file::function attributions whose inputs/modes match the claim source — re-checked in the /runda round (§5.1), which corrected the A6 mode mismatch and closed the OOS-5/6 coverage gap. The two over-claims the matrix carried (predicate-in-VALUES, auto-snapshot) were surfaced, escalated, and resolved with the user rather than papered over.

6. State at handoff

Branch: main. Tests: 1645 passing, 0 failing, 0 skipped, 1 ignored. Clippy clean.
Uncommitted (this session's 3k work, pending the user's commit approval):
- tests/sql_dml_e2e.rs (new — 11 Tier-3 e2e tests)
- src/dsl/walker/mod.rs (5 cross-cut tests)
- src/dsl/walker/highlight.rs (DML-keyword highlight test)
- src/completion.rs (INSERT INTO target-table completion test)
- src/input_render.rs (advanced-mode DML hint-panel test — /runda)
- docs/plans/20260520-adr-0033-phase-3.md (matrix filled + footnotes)
- docs/handoff/20260523-phase-3-verification.md (this report)
ADR-0033 can move from Proposed to Accepted once the user accepts this report (the plan's §5 hand-back point). Its three amendments are already settled.

Follow-ups (tracked, not blocking Phase 3)

Per the user's standing "extra tasks after phase 3 is complete":

ADR-0006 auto-snapshot/undo (U-series) — implement for both the DSL and SQL DML worker handlers; satisfies the N/A matrix row.
M4 — execution-time mode side-channel (ADR-0033 Amendment 3; requirements.md). Its own future ADR.
TASK #8 — ADR-0034 history journal; TASK #9 — replay line format (<ts>|<status>|<source>). Orthogonal to Phase 3.
blob value-literal in the DSL/SQL value grammar (src/dsl/value.rs) — pre-existing gap.
The cosmetic tests/sql_*.rs helper names (run_sqlinsert etc.) retain their dev-word-era spelling though they now parse the real words.

15 KiB Raw Permalink Blame History Unescape Escape