Files
rdbms-playground/docs/adr/0033-sql-dml-grammar.md
claude@clouddev1 c12ed1da9a fix: INSERT Form B value-count UX (ADR-0033 Amendment 5)
Three layered fixes for advanced/simple-mode positional INSERT
value-count mismatches (e.g. `insert into T values (...)` with the
wrong number of values for T's column count), plus ADR-0033
Amendment 5 recording the gate refinement.

Walker diagnostic (dml_insert_arity_diagnostics): the function's own
doc-comment recorded the no-column-list (Form B) case as deferred.
This commit closes that gap. Form B mismatches now emit a new
diagnostic.insert_arity_mismatch_form_b ERROR per offending tuple,
keyed off the target table's column count from the schema cache. The
[ERR] validity indicator (ADR-0027) lights up at typing time for the
reported scenario, no longer needing a submit.

Cross-mode pointer gate (advanced_alternative_note): refactored from
a hand-rolled Form B count check to a single input_verdict_in_mode(
input, schema, Mode::Advanced) call. The pointer fires only when the
verdict is None — the ADR-0027 sense of "valid". Any future static
check added to the verdict pipeline participates automatically; no
per-feature maintenance.

Teaching notes for the value-count cases users can hit before the
indicator turns red:
  - simple-mode submit: insert.form_b_extra_values_note covers under-,
    in-window, and over-supply against the Form B contract; suppressed
    when the cross-mode pointer fires (to avoid parallel advice).
  - advanced-mode dispatch pre-flight:
    insert.form_b_positional_count_mismatch_note catches a submitted
    mismatch with a teaching message before the engine produces its
    raw NOT-NULL / type error.

The advanced_mode.also_valid_sql pointer wording was reworked to
"trying to write SQL? switch with `mode advanced`, or prefix `:` to
run once". One insta snapshot regenerated.

ADR-0033 Amendment 5 records the gate change: Amendment 3's "would
parse in advanced mode" now reads as "valid in advanced mode" in the
explicit verdict-is-None sense, with the precise definition spelled
out so future readers can't drift back to the syntactic-only reading.
ADR-0000 index entry updated; docs/requirements.md H1a citation added
listing the three new pedagogical strings.

Tests added (8): four walker arity tests (under-supply, over-supply,
match, unknown-table); two app-level teaching-note tests for the
sibling cases (under-supply, over-supply beyond total); one pointer-
gate unit test pinning the bug-case suppression; one gate-precedence
test ensuring only one advice line per error. Existing
simple_mode_submit_of_sql_construct_appends_advanced_pointer updated
to use a known schema (the new validity gate requires it).

Full suite: 2031 passed, 0 failed, 0 unexpected skips. Clippy clean.
2026-05-28 16:38:33 +00:00

73 KiB
Raw Permalink Blame History

ADR-0033: The full SQL DML grammar — INSERT / UPDATE / DELETE

Status

Accepted

Phase 3 implemented and verified through sub-phase 3k (2026-05-23); see docs/handoff/20260523-phase-3-verification.md for the phase-exit report and the filled cross-cut matrix in docs/plans/20260520-adr-0033-phase-3.md. Amendments 13 below are part of this acceptance.

Forward note (2026-05-26, ADR-0036 Accepted). ADR-0036 augments the §10 model — it does not change verbatim execution and does not collapse the two-command identity of Amendment 3 (which stands). It adds a value-validation step: INSERT/UPDATE literal values are parsed to typed Values, validated against the column type (sharing the DSL's validators), and retained on the command (so a constraint error can show the offending value) — then the statement still executes verbatim. Command::SqlInsert gains a captured-literals payload; its execution and plan_shortid_autofill are unchanged. Expressions, WHERE, INSERT … SELECT, RETURNING, UPSERT, and SELECT are untouched. (The serial/shortid auto-fill difference vs simple mode is tracked separately as requirements.md X4.)

Context

ADR-0030 commissions advanced mode as a body of SQL grammar inside the unified grammar tree. Phase 1 ("foundations + first SELECT") and Phase 2 ("SELECT — full") have shipped through ADR-0031 and ADR-0032 respectively. The next slice on ADR-0030's roadmap is Phase 3 — DML: INSERT / UPDATE / DELETE as validated SQL.

ADR-0030 §3 fixed the headline scope:

INSERT (single- and multi-row), UPDATE, DELETE.

ADR-0030 §4 fixed the execute path:

DML and SELECT → executed as the validated SQL itself (re-rendered canonically from the matched parse, or the validated original text). … For DML the worker — knowing the statement kind and target table from the parse — runs the statement and re-persists that table's CSV.

ADR-0030 §11 fixed the persistence model. ADR-0030 §12 fixed the safety posture: SQL UPDATE/DELETE without WHERE executes as written (no --all-rows rail).

ADR-0030 §3 also flagged this ADR's existence in advance:

Settle multi-row INSERT and shortid auto-fill on a SQL INSERT.

This ADR fixes the grammar shape, the dispatch architecture for shared entry words, the diagnostic surface new to DML, and the execute-path Command variants. It does not revisit ADR-0030's controlling decisions; references in this text to ADR-0030 §X mean "that decision is controlling."

What ADR-0030, ADR-0031, ADR-0032 already fix

  • One walker, grammar-as-text execution, ambient assistance for free (ADR-0030 §1/§4/§6/§8).
  • sql_expr is the shared expression grammar — UPDATE SET expressions, WHERE predicates, INSERT VALUES tuples, RETURNING projection items all consume it (ADR-0031, ADR-0032 §5/§6).
  • Node::ScopedSubgrammar + from_scope_stack carry CTEs and per-statement scope into nested constructs (ADR-0032 §10).
  • The diagnostic pipeline — schema_existence_diagnostics, sql_predicate_warnings, compound_arity_diagnostics, pending_diagnostics accumulator — applies uniformly to DML's sql_expr slots (ADR-0032 §11).
  • The friendly-error layer translates engine messages through the engine.* catalog (ADR-0019, ADR-0030 §7, ADR-0032 §11.5).
  • The __rdbms_* rejection at every table-source slot (ADR-0030 §6, ADR-0032 §1/§4).
  • MAX_SUBGRAMMAR_DEPTH = 64 (ADR-0026) is the recursion budget.

Decision

1. The top-level statement shapes

insert_statement   := INSERT INTO table_name
                      [ '(' column_name_list ')' ]
                      ( values_clause | sql_select_compound )
                      [ on_conflict_clause ]
                      [ returning_clause ]
                      [ ';' ]

update_statement   := UPDATE table_name
                      SET assignment_list
                      [ where_clause ]
                      [ returning_clause ]
                      [ ';' ]

delete_statement   := DELETE FROM table_name
                      [ where_clause ]
                      [ returning_clause ]
                      [ ';' ]

Where:

values_clause      := VALUES value_tuple ( ',' value_tuple )*
value_tuple        := '(' sql_expr ( ',' sql_expr )* ')'
assignment_list    := assignment ( ',' assignment )*
assignment         := column_name '=' sql_expr
returning_clause   := RETURNING projection_list
on_conflict_clause := ON CONFLICT [ '(' column_name_list ')' ]
                      ( DO NOTHING
                      | DO UPDATE SET assignment_list [ where_clause ] )
  • table_name is the existing Tables ident slot, with reject_internal_table validator (ADR-0030 §6).
  • column_name_list is comma-separated Columns idents — the same shape data::INSERT (DSL) uses for Form A. Reuses the existing pattern; no new node infrastructure.
  • where_clause is WHERE Subgrammar(&sql_expr::SQL_OR_EXPR).
  • projection_list is the Phase-2 projection list, re-used unchanged.

The body of an INSERT … SELECT recurses through Node::Subgrammar(&sql_select::SQL_SELECT_COMPOUND) — the same recursion point Phase 2's CTE bodies, scalar subqueries, and IN/EXISTS bodies use. The Phase-2 SELECT grammar already admits an optional leading WITH (Amendment 2 / commit fd25904), so INSERT INTO t WITH x AS (…) SELECT * FROM x works for free.

2. Dispatch — DSL vs SQL on shared entry words

The entry words insert, update, delete are shared between DSL and SQL. This ADR fixes the dispatch rule and the mechanism that implements it (a known-risk item — see "Open implementation risks" below).

Simple mode → DSL only. SQL constructs (multi-row VALUES, INSERT…SELECT, RETURNING, UPSERT) attempted in simple mode produce a ValidationFailed "this is SQL" hint, matching the gating that select / with already produce (ADR-0030 §2).

Advanced mode → SQL first, DSL fallback. Each shared entry word's CommandNode.shape becomes a Choice between the SQL shape and the DSL shape, declared in that order:

shape: Node::Choice(&[SQL_INSERT_SHAPE, DSL_INSERT_SHAPE])

The walker's Choice is first-match-wins. The SQL shape's constructs are a strict superset of the DSL shape's for these statements (multi-row VALUES, INSERT…SELECT, RETURNING, UPSERT are SQL-only; DSL's --all-rows flag is DSL-only). Any structurally-ambiguous input — delete from t where id = 1 — matches the SQL shape first and routes through the SQL machinery.

This is the design point an earlier draft of ADR-0030 §2 left open ("choice order or a disambiguator decides"). The chosen disambiguator is Choice order: SQL > DSL in Advanced. A DSL-specific construct (e.g., --all-rows) won't match the SQL shape and falls through to DSL — accommodating, not surprising.

Cascade-summary parity (§7) and shortid auto-fill parity (§6) mean a DSL-style INSERT/UPDATE/DELETE in Advanced mode produces identical observable behaviour whether it routes through the SQL or DSL path. The user's mental model can be "Advanced mode runs SQL; DSL syntax still works as a familiar fallback."

Mode-gating mechanism

The existing is_advanced_only infrastructure (src/dsl/grammar/mod.rs::ADVANCED_ONLY_ENTRIES) gates by entry word, fires in the walker's dispatch_command_node path (walker/mod.rs:1707), and emits the "this is SQL" ValidationFailed verdict. It can't be reused to gate a Choice branch by mode.

Mechanism options for branch-level gating:

  • (a) Mode-aware validator on the SQL branch. Each SQL-shape Seq starts with a no-op Node::Validated whose validator inspects ctx.mode; in Simple mode it returns ValidationError(message_key: "advanced_mode.sql_in_simple") before any structural tokens commit. Trade-off: clean, reuses the existing validator path; the failure shape is the same WalkOutcome::ValidationFailed as the entry-word gate.
  • (b) Mode flag on Node::Choice. A new walker capability: Node::Choice branches can carry a min-mode marker. Heavier (touches the walker's walk_choice driver) but lets the grammar declare gating without per-branch boilerplate.
  • (c) Two CommandNodes per entry word; registry picks by mode. The dispatcher consults ctx.mode and skips the SQL CommandNode in Simple mode. Heavier (registry semantics change) and surfaces a "multiple CommandNodes per entry word" pattern the rest of the codebase doesn't use.

Default choice: (a) the validator approach. Smallest mechanism change. Walker-capability caveat: validators today are an Ident { validator: Option<fn> } field — there is no standalone "guard" node that runs a validator without consuming input. Sub-phase 3a's first job is to add Node::Guard(fn(&WalkContext) -> Result<(), ValidationError>) to the walker — a zero-consumption node whose only effect is to fail the Seq with a ValidationError in Simple mode. The existing internal-table rejection (reject_internal_table) shows the wiring pattern; Node::Guard is the "no-byte-consuming version" of it.

3. Multi-row INSERT … VALUES

INSERT INTO orders (customer_id, total) VALUES (1, 99.50), (2, 12.00), (3, 6.25)

The grammar's values_clause repeats value_tuple with a comma separator (Node::Repeated { inner: value_tuple, separator: Some(COMMA), min: 1 }). Each tuple matches the same arity as the (column_name_list) (see §8 for the arity diagnostic).

4. INSERT … SELECT

INSERT INTO archive SELECT * FROM orders WHERE created < '2025-01-01'
INSERT INTO summary (id, total) WITH t AS (…) SELECT id, sum(total) FROM t GROUP BY id

A Choice between values_clause and Subgrammar(&sql_select::SQL_SELECT_COMPOUND) after the optional column list. The Phase-2 SELECT machinery (CTEs, JOINs, subqueries, set ops) applies as the row source for free.

Arity verification: at parse time, the walker counts the declared column list (or, when absent, the target table's column count from the schema cache) against the SELECT's projection list arity. Mismatch emits diagnostic.insert_arity_mismatch (see §8).

5. RETURNING

RETURNING projection_list admits the Phase-2 projection_list unchanged — column refs, *, t.*, expressions with AS aliases, computed expressions.

Execution: the Command::Sql{Insert,Update,Delete} variants carry a returning: bool flag set by the walker's ast_builder when a RETURNING clause matched. The worker reads this flag (no SQL text re-parsing) and routes:

  • returning == true → prepare + step the statement, collect rows into a DataResult (per-result-column type recovery via the same column-origin path Phase 2 introduced, ADR-0032 §12 + Amendment 1).
  • returning == false → execute the statement, collect the affected-row count.

The flag is sufficient because the shape of the projection is already in the SQL the engine prepares; the worker doesn't need to re-parse to know which columns to read.

6. shortid auto-fill — worker post-fills

Parity with the DSL do_insert handler (ADR-0005 / ADR-0018).

For an INSERT whose (column_name_list) omits one or more shortid columns, the worker:

  1. Inspects the table metadata to find the auto-generated columns (serial, shortid) NOT in the user's column list.
  2. For each row in the VALUES list (single or multi-row), and for INSERT … SELECT each yielded row, synthesises a fresh shortid value (the existing shortid::generate path).
  3. Constructs the parameterised INSERT (or executes a rewrite of the SQL with the values bound positionally) and runs it.

The engine sees a fully-specified row; the user's submitted SQL is unmodified in history.log (ADR-0030 §11). The walker recognises an explicit value provided for an auto-gen column — that's an explicit override, not a missing slot — and emits a WARNING (§8b) but lets it through.

serial columns rely on SQLite's INTEGER PRIMARY KEY rowid auto-increment — no worker post-fill needed; the engine populates the value on its own.

7. Cascade summary on DELETE

Parity with the DSL do_delete handler (ADR-0014).

When Command::SqlDelete { target_table, sql, … } runs:

  1. Pre-execution cascade pre-count. For each child table that has an FK referencing target_table with ON DELETE CASCADE (or SET NULL), the worker runs a pre-count query of the form SELECT count(*) FROM <child> WHERE <fk_col> IN (SELECT <pk_col> FROM <target_table> WHERE <user_where_predicate>). The <user_where_predicate> is extracted from the SQL — see "Predicate extraction" below.
  2. The worker executes the DELETE.
  3. Post-execution summary. Per-relationship row counts + cascade impact, formatted by the same cascade-summary formatter the DSL path uses.

Predicate extraction. The DSL handler reuses the typed Expr AST to construct the pre-count subqueries. For SQL DELETE the walker captures the WHERE clause's source byte range from the matched path and re-injects that text into the pre-count SQL. This is grammar-as-text reuse, consistent with ADR-0030 §4's posture; no AST construction required.

When the SQL DELETE has no WHERE, the pre-count is unbounded ("all rows of target_table"). Cost: one extra count(*) per child table; cheaper than the DSL parallel path on a typical schema.

The cascade-summary formatter itself (format_cascade_summary in the DSL handler) is shared with do_sql_delete — same output, same i18n keys.

8. Diagnostics new to Phase 3

Three new keys, classified per ADR-0027's ERROR / WARNING model.

8.1. diagnostic.insert_arity_mismatch (ERROR)

Walker-level pre-flight when the declared (column_name_list)'s arity disagrees with a value_tuple's arity (or, for INSERT … SELECT, with the SELECT's projection list arity, counted at parse time via the same projection-counting infrastructure Phase-2's compound_arity_diagnostics uses).

For multi-row VALUES, each offending row emits its own diagnostic on that row's value_tuple span. A query with five rows and three wrong arities produces three diagnostics; the two right rows are silent. Pre-empts the engine's less-helpful "X values for Y columns" error.

For INSERT … SELECT, the diagnostic anchors on the SELECT's first projection_item span. The Phase-2 projection-count infrastructure (the comma-counting at depth = leg-depth used by compound_arity_diagnostics) is the underlying mechanism.

8.2. diagnostic.auto_column_overridden (WARNING)

Walker-level when an INSERT explicitly assigns a value to a column whose type is serial or shortid. Pedagogical: the explicit value bypasses the auto-counter / generator, which can collide with future auto-generated values.

Catalog wording: "column \{column}` is auto-generated (`{type}`); providing an explicit value bypasses the auto-counter and may collide with later auto-generated values"`.

The statement still runs — the warning is advisory, matching the Phase-2 predicate-warning posture (ADR-0027 §1 / ADR-0032 §11.6).

8.3. diagnostic.not_null_missing (WARNING)

Walker-level when an INSERT's (column_name_list) omits a column declared NOT NULL with no DEFAULT. Pedagogical nudge — the engine will reject the statement, but the walker can pre-flight the missing column at parse time.

Catalog wording: "column \{column}` is required (`NOT NULL`, no default); the statement will fail when run"`.

The walker doesn't BLOCK the statement (it's structurally valid). The engine does. The friendly layer translates the engine error via the existing NOT-NULL path.

8.4. Inherited diagnostics

The Phase-2 diagnostic passes apply to DML's sql_expr slots without further work:

  • schema_existence_diagnostics flags unknown columns in WHERE, SET, RETURNING, VALUES tuples.
  • sql_predicate_warnings flags = NULL, LIKE-on-numeric, type-mismatched comparisons in WHERE and CASE expressions.
  • The pending_diagnostics accumulator path stays available for in-walk emits.

9. UPSERT — ON CONFLICT … DO NOTHING / DO UPDATE

INSERT INTO t (id, name) VALUES (1, 'x')
ON CONFLICT (id) DO UPDATE SET name = excluded.name

INSERT INTO t (id, name) VALUES (1, 'x')
ON CONFLICT DO NOTHING

Grammar shape:

on_conflict_clause := ON CONFLICT [ '(' column_name_list ')' ]
                      ( DO NOTHING
                      | DO UPDATE SET assignment_list [ where_clause ] )

The optional (column_name_list) is the conflict target — which columns' unique constraint to react to. Standard SQL allows this list to be empty (ON CONFLICT DO NOTHING is shorthand for "any conflict").

The DO UPDATE branch reuses the same assignment_list and optional where_clause shapes as the top-level UPDATE statement.

Inside DO UPDATE expressions, the keyword excluded is a pseudo-table reference to the would-have-been-inserted row. Note: excluded is the SQLite + PostgreSQL spelling; standard SQL (MERGE … WHEN MATCHED) uses a different model. The playground commits to the excluded form because it is what SQLite supports and what the project's pedagogical target dialect uses (ADR-0001 / ADR-0002 — the engine is SQLite, even though its name is hidden from user-facing strings). This is a deliberate carve-out from ADR-0030 §7's engine-neutral posture, justified by the absence of a standard-SQL UPSERT spelling that both SQLite and PostgreSQL share.

The walker admits excluded as a table alias for ident-completion purposes inside the DO UPDATE branch — its columns are the target table's columns.

10. Execute path — three typed Command variants

Command gains three variants:

Command::SqlInsert {
    sql: String,
    source: Option<String>,         // for history.log
    target_table: String,           // for re-persistence
    listed_columns: Option<Vec<String>>,  // None if no col-list provided
}

Command::SqlUpdate {
    sql: String,
    source: Option<String>,
    target_table: String,
}

Command::SqlDelete {
    sql: String,
    source: Option<String>,
    target_table: String,
}

The walker's ast_builder for each shape extracts target_table and listed_columns from the matched path. The sql field is the validated source text the engine executes.

Database gains three request types:

Request::RunSqlInsert { command, reply }
Request::RunSqlUpdate { command, reply }
Request::RunSqlDelete { command, reply }

Each worker handler:

  1. Applies statement-specific pre-execution work (shortid auto-fill for RunSqlInsert; cascade pre-count for RunSqlDelete).
  2. Prepares + executes the statement. If RETURNING is present, collects the result set as a DataResult.
  3. Re-persists the target table's CSV (ADR-0030 §11) using the existing persistence write-through machinery.
  4. Returns the result (DataResult for RETURNING; affected-row count + cascade summary for plain DELETE; affected-row count for plain INSERT/UPDATE).

History logging: the worker writes the literal submitted line to history.log (ADR-0030 §11) before execution, exactly as run_select does.

11. Engine-error translation

Three new catalog keys (the friendly-error layer's engine.* translation set, ADR-0019 / ADR-0030 §7):

Key Engine condition
engine.unique_constraint_failed The engine's UNIQUE constraint failed: t.c (already exists; will be re-used as-is for UPSERT-without-conflict-target cases)
engine.not_null_constraint_failed Already exists; covers SQL-side NOT-NULL failures uniformly
engine.foreign_key_constraint_failed Already exists; SQL DELETE that violates FK without cascade

No NEW engine.* keys are required for Phase 3 — the existing FK / UNIQUE / NOT-NULL translation paths cover the engine-side DML errors. Phase 3 reuses them.

12. Result-column type recovery on RETURNING

A RETURNING projection that's a bare column reference recovers its playground type via the same column-origin metadata path Phase 2 introduced (resolve_select_column_types in db.rs, ADR-0032 §12 + Amendment 1).

Computed expressions in RETURNING stay typeless. The renderer uses neutral alignment for typeless columns, consistent with SELECT.

13. Out of scope

ID OOS
OOS-1 INSERT INTO t DEFAULT VALUES (the project's planned seed feature covers this)
OOS-2 INSERT OR REPLACE / OR IGNORE / OR ABORT / OR FAIL / OR ROLLBACK (SQLite-specific)
OOS-3 UPDATE … FROM other_table (SQLite/PostgreSQL extension)
OOS-4 WITH x AS (…) UPDATE … / WITH x AS (…) DELETE … (deferrable; SELECT-side WITH is in scope)
OOS-5 Indexed-by / not-indexed table-source modifiers (SQLite-specific)
OOS-6 Multi-statement batches (already OOS via ADR-0030 §13 OOS-4)

OOS-4 is a deliberate choice: WITH-prefixed DML is standard SQL since SQL:2003, but its execution semantics (the WITH-defined CTE is visible to the DML's WHERE / SET / RETURNING expressions) require additional scope-frame plumbing that doesn't yet exist on the DML side. The Phase-2 grammar admits WITH at the SELECT level; Phase 3 keeps WITH a SELECT-only prefix. A later phase can lift this restriction with a clean ADR Amendment.

Consequences

  • The walker gains a new node variant Node::Guard (§2 mechanism caveat) — a zero-byte-consumption gating node that can fail its enclosing Seq with a ValidationError. The walker driver's exhaustive walk_node_inner match grows one arm. Reusable beyond DML (Phase 4 DDL's similar shared-entry problem is a likely future consumer).
  • Command gains three variants (SqlInsert, SqlUpdate, SqlDelete); every exhaustive match Command callsite picks up three new arms (the recurring ADR-0028/0029 gotcha).
  • The DSL data::INSERT / data::UPDATE / data::DELETE CommandNodes' shapes become Choice(SQL_shape, DSL_shape) with the SQL branch mode-gated via a no-op leading validator (R1).
  • Three new walker diagnostics with three new catalog keys (diagnostic.insert_arity_mismatch, diagnostic.auto_column_overridden, diagnostic.not_null_missing).
  • Three new Request variants (RunSqlInsert/RunSqlUpdate/RunSqlDelete) and three new worker handlers (do_sql_insert/do_sql_update/do_sql_delete).
  • The persistence write-through path gains per-DML-kind entry points; the cascade-affected re-persistence in do_sql_delete is new (cascade-affected child tables are also re-persisted).
  • RETURNING lands a write-statement-returns-rows code path the project hasn't had before. Worker handlers either return affected-row count or a DataResult keyed by the returning: bool flag.
  • The cascade-summary formatter is reused between the DSL and SQL DELETE paths; the predicate-extraction byte-range injection (§7) is new code with the R2 mitigation.
  • shortid auto-fill code in do_sql_insert mirrors the DSL do_insert mechanism but takes a listed_columns slice from the parse rather than a typed Command::Insert.
  • The auto-snapshot machinery (ADR-0006) is unchanged but fires for SQL DML the same way it does for DSL DML — the snapshot trigger lives in the worker handlers.
  • All Phase-2 tests stay green (the SQL grammar pieces this ADR builds on are unchanged).
  • The DSL-vs-SQL dispatch mechanism (3a) is a foundation other projects in this codebase may need (e.g., Phase 4 DDL). Land it generically so it's reusable.

Initial DA review

A first-pass DA review of this ADR's draft surfaced ten critiques. The seven that surfaced real gaps were resolved in the body before this status moved to Proposed:

  1. OOS-4 (WITH-prefixed DML) under-justified → R4 risk describes the mechanism gap concretely.
  2. Dispatch mechanism hand-waved → §2's "Mode-gating mechanism" subsection enumerates three options and commits to the validator approach with R1 as the mitigation if it fails.
  3. RETURNING execution hand-waved → §5 commits to a returning: bool flag on the Command variants.
  4. Cascade-summary code-sharing claim too optimistic → §7's "Predicate extraction" subsection details the byte-range injection mechanism; R2 budgets for the performance-pathology case.
  5. excluded keyword's engine-neutrality status → §9 carves it out explicitly as a justified non-neutral choice.
  6. Multi-row INSERT arity diagnostic emit pattern unspecified → §8.1 clarifies "each offending row emits its own diagnostic" plus the INSERT…SELECT span.
  7. Sub-phase ordering put dispatch (the riskiest piece) last → reordered so 3a lands the mechanism FIRST on a no-op shape.

The remaining three critiques were either process points (DA gates at sub-phase boundaries — added per-sub-phase in Implementation notes) or already-covered (verification matrix — will live in the plan doc, matching Phase 2's pattern).

Per CLAUDE.md / the Phase 2 process lesson, the DA review of THIS ADR will get a second pass before status moves to Accepted. Pin: do not rubber-stamp.

Open implementation risks

Recorded explicitly so the implementation plan can budget for them, not to allow silent drift later.

R1. Mode-gated Choice branch mechanism unproven

§2's option (a) — a no-op validator at the head of the SQL shape's Seq, rejecting in Simple mode — is plausible but untested. If the walker's walk_seq commits the Seq before the validator runs, the SQL branch's NoMatch becomes Failed and the Choice can't fall through to the DSL branch.

Mitigation: sub-phase 3a lands the mechanism end-to-end on the smallest viable shape (a single SqlNop CommandNode whose shape is Choice(SQL_branch[mode-gated-validator + token], DSL_branch[different-token])) before any DML grammar lands. If the validator approach doesn't work, the alternatives (a mode-aware Node::Choice variant, or split-CommandNode dispatch) get evaluated before sub-phases 3b/3c/3d depend on the mechanism.

R2. Cascade-summary predicate extraction

§7 says the worker re-injects the WHERE-clause source bytes into pre-count subqueries. This is straightforward when the WHERE clause is a top-level predicate, but a future user might write WHERE clauses that themselves contain subqueries (e.g., DELETE FROM orders WHERE customer_id IN (SELECT id FROM customers WHERE …)). The byte-range injection still works structurally, but a performance pathology is possible if the subquery is expensive and the cascade child count is large (the same subquery runs N+1 times).

Mitigation: sub-phase 3c benchmarks the pre-count overhead on a representative schema before declaring the path acceptable. If the overhead is prohibitive, fall back to "report cascade count post-execution only" (less informative, but cheaper).

R3. RETURNING result-column type recovery on multi-row INSERT

§5 says result columns get the same column-origin type recovery SELECT uses. For a single-row INSERT … RETURNING this works identically to a SELECT. For multi-row INSERT, the engine emits one result row per inserted row; the column-origin metadata is the same for all (per ADR-0032 Amendment 1, column-origin is a property of the prepared statement, not the rows).

No risk here — calling it out so the implementer doesn't re-derive the question. The empty-table type-recovery test already pins this invariant.

R4. INSERT … SELECT with a CTE row source

The Phase-2 SELECT grammar already admits WITH … SELECT … at the compound level (ADR-0032 §10.3 / Amendment 2). The SQL_SELECT_COMPOUND recursion point §1 uses pulls this in for free. INSERT INTO archive WITH t AS (…) SELECT * FROM t parses and runs.

A user might reasonably expect WITH-prefixed DML to also work (see §13 OOS-4). That's explicitly out of scope for Phase 3; the discrepancy is documented here so the implementer doesn't need to figure it out from the grammar shape.

Implementation notes

Phased sub-plan with explicit exit gates per sub-phase. The implementation plan doc (docs/plans/<date>-adr-0033-phase-3.md, modelled on docs/plans/20260520-adr-0032-phase-2.md) refines these and adds the cross-cut verification matrix in 3i.

Ordering rationale. The dispatch mechanism (3a) is the foundation; if it fails to work, every later sub-phase has to be reworked. Land it first on a no-op shape, then build the DML grammar on top.

3a — Node::Guard + mode-gated Choice mechanism

Scope (in):

  • Add a new walker node variant Node::Guard(fn(&WalkContext) -> Result<(), ValidationError>) (per §2 mechanism caveat) — zero-byte consumption, fails the enclosing Seq with ValidationFailed when the validator returns Err.
  • Implement walk_guard in src/dsl/walker/driver.rs.
  • Add the reject_sql_in_simple_mode guard function.
  • Build a smoke-test grammar: an experimental CommandNode with shape: Choice(Seq[Guard(reject_sql_in_simple_mode), SQL_branch_token], DSL_branch_token). Verify the Choice falls through cleanly in Simple mode (DSL wins) and routes SQL-first in Advanced mode.

Scope (out):

  • Any real DML grammar (3b onwards).
  • Wiring the existing data::INSERT/UPDATE/DELETE shapes (3j).

Exit gate (required green tests):

  • Node::Guard admits via walk_guard; the walker driver's match arm exists and is tested.
  • Smoke-test grammar in Simple mode → DSL branch matches.
  • Smoke-test grammar in Advanced mode + SQL-shape input → SQL branch matches.
  • Smoke-test grammar in Simple mode + SQL-shape input → ValidationFailed with message_key: "advanced_mode.sql_in_simple".
  • Smoke-test grammar in Advanced mode + DSL-shape input → DSL branch matches (Choice fall-through works even when SQL branch is admissible).
  • All Phase-2 tests (1446 baseline) still green.
  • Clippy clean.

DA gate (written):

  • Does the Guard actually fire BEFORE the Seq commits? Failure mode: the Seq processes the Guard, the Guard fails, but the Seq has already "committed" to its branch and the Choice can't fall through. Test specifically: Advanced mode + an input that the DSL branch matches but the SQL branch's remaining tokens reject → must fall through to DSL, not Failed.
  • Is the Guard mechanism reusable beyond DML (e.g., for Phase 4 DDL's similar dispatch)? If yes, document the pattern in src/dsl/grammar/mod.rs doc comments so the next phase doesn't reinvent it.
  • If the Guard mechanism doesn't work (R1's worst case), the sub-phase REOPENS with mechanism option (b) or (c) evaluated before any DML grammar lands on top of broken scaffolding.

3b — INSERT grammar + minimal execution

Scope (in):

  • SQL_INSERT_SHAPE: single-row + multi-row VALUES; the optional (column_name_list); the __rdbms_* rejection on the table-name slot.
  • Command::SqlInsert { sql, source, target_table, listed_columns, returning: false }returning always false in 3b.
  • Request::RunSqlInsert + do_sql_insert worker handler (no shortid auto-fill yet — explicit values required).
  • Persistence write-through on the target table after execution.
  • History-log of the literal submitted line.

Scope (out):

  • INSERT … SELECT (3c).
  • RETURNING (3e).
  • UPSERT (3f).
  • shortid auto-fill (3d).
  • Diagnostics (3g).
  • DSL-vs-SQL dispatch wiring on data::INSERT (3h — development uses a separate data::SqlInsert CommandNode).

Exit gate (required green tests):

  • Single-row INSERT runs end-to-end; affected-row count surfaces; CSV is re-persisted.
  • Multi-row INSERT (INSERT INTO t VALUES (1, 'a'), (2, 'b')) runs; both rows land; CSV reflects them.
  • (column_name_list) variant works (INSERT INTO t (a, b) VALUES (1, 'x')).
  • __rdbms_* rejection at the INSERT target-table slot (per ADR-0030 §6 / ADR-0032 §1).
  • History line is the literal submitted SQL.
  • All Phase-2 tests (1446 baseline) still green.

DA gate (written):

  • Does an INSERT failing mid-multi-row leave persistence in a consistent state? (Engine transactional behaviour applies; worker must not re-persist on failure.)
  • Does the target_table extraction handle case-folding the same way the rest of the codebase does?
  • Are the existing DSL INSERT tests still green? Any DSL-side regression means 3b broke something it shouldn't have.

3c — INSERT … SELECT

Scope (in):

  • Add the row-source Choice to SQL_INSERT_SHAPE: values_clause | Subgrammar(&sql_select::SQL_SELECT_COMPOUND).
  • Update do_sql_insert to handle the SELECT-driven path (no special worker work needed beyond preparing the composite statement; the engine handles the insert-from-query flow).

Exit gate:

  • INSERT INTO archive SELECT * FROM orders WHERE … runs.
  • INSERT INTO target (a, b) SELECT x, y FROM source runs.
  • INSERT INTO t WITH x AS (…) SELECT * FROM x runs (R4 invariant — Phase-2 CTE prefix on SELECT applies).
  • Schema-existence diagnostics fire on the SELECT's projection (Phase-2 machinery, free).

DA gate (written):

  • Does the WITH-prefixed SELECT row source actually parse through SQL_SELECT_COMPOUND, or does the INSERT prefix break the recursion? Test on a real WITH … INSERT … SELECT.
  • Does the cascade-summary path (3e DELETE) accidentally apply to INSERTs whose SELECT touches the target table? (It must not — INSERTs have no cascade.)

3d — shortid auto-fill (worker)

Scope (in):

  • Detect omitted shortid columns at execution time in do_sql_insert.
  • Synthesise fresh shortid values; bind them positionally.
  • Cover both VALUES and SELECT row sources (for SELECT, the shortid columns are added to the projection's bind list before execution).

Exit gate:

  • INSERT INTO t (name) VALUES ('x') where t has a pk shortid column auto-fills that PK.
  • An explicit value for a shortid column is respected (override behaviour — warning fires per 3g but the value is preserved).
  • Multi-row INSERT produces distinct shortids per row.
  • INSERT … SELECT auto-fills shortids for each yielded row.

DA gate (written):

  • Are the synthesised shortids actually unique across a multi-row INSERT? (The generator should be called per-row, not once.)
  • Does the auto-fill respect the column's case-preserved name?
  • What happens if the table has both serial and shortid PK columns? (Spec: serial via SQLite rowid; shortid via worker post-fill. Verify no double-fill collision.)

3e — UPDATE grammar + execution

Scope (in):

  • SQL_UPDATE_SHAPE: target table, SET assignment list, optional WHERE.
  • Command::SqlUpdate, Request::RunSqlUpdate, do_sql_update.
  • Persistence write-through on the target table.

Exit gate:

  • Single-column UPDATE with WHERE runs; affected-row count surfaces; CSV reflects changes.
  • Multi-column UPDATE (SET a = 1, b = 2) runs.
  • UPDATE without WHERE runs (per ADR-0030 §12 — no --all-rows guard).
  • Schema-existence diagnostic fires on unknown columns in SET and WHERE.

DA gate (written):

  • Does the SET assignment expression accept the full sql_expr (function calls, subqueries, CASE)? Verify on a representative expression.
  • Does an UPDATE that touches an auto-gen column produce the 3g auto-gen warning?

3f — DELETE grammar + execution + cascade summary

Scope (in):

  • SQL_DELETE_SHAPE: target table, optional WHERE.
  • Command::SqlDelete, Request::RunSqlDelete, do_sql_delete.
  • Cascade-summary pre-count and post-execution formatting per §7.
  • Persistence write-through on target + affected child tables (cascade-affected tables also need re-persistence).

Exit gate:

  • DELETE with WHERE runs; affected-row count + cascade summary surface; CSVs of target AND cascade-affected children are re-persisted.
  • DELETE without WHERE runs (no --all-rows rail).
  • Cascade summary matches the DSL do_delete output on the same schema/data.
  • The cascade-summary code path is shared (no duplicate formatter).

DA gate (written):

  • R2 mitigation: does the predicate-extraction byte-range injection work on a WHERE containing a subquery? Test DELETE FROM a WHERE id IN (SELECT id FROM b WHERE …).
  • Does the cascade pre-count run in a way that doesn't see the post-DELETE state? (Order matters: pre-count BEFORE execute.)

3g — RETURNING

Scope (in):

  • Add RETURNING projection_list to all three top-level shapes.
  • Walker captures returning: bool on the Command variants.
  • Worker handlers route to DataResult collection when returning == true.
  • Result-column type recovery via the Phase-2 column-origin path.

Exit gate:

  • INSERT … RETURNING * returns the inserted row(s) as a DataResult; affected-row count is implicit (rows.len()).
  • UPDATE … RETURNING id, new_value returns the modified columns.
  • DELETE … RETURNING * returns what was just deleted (BEFORE the row is gone).
  • Result-column types recovered for bare-column refs (per ADR-0032 Amendment 1).
  • The cascade summary on a DELETE … RETURNING is preserved alongside the result set (both surface to the user).

DA gate (written):

  • Does the RETURNING result-set rendering use the same DataResult path SELECT uses? Or did a DML-specific copy sneak in?
  • Does RETURNING expr AS alias admit aliases the same way the projection list does?

3h — UPSERT (ON CONFLICT … DO NOTHING / DO UPDATE)

Scope (in):

  • Add on_conflict_clause to SQL_INSERT_SHAPE.
  • Admit excluded as a table alias inside DO UPDATE expressions (§9's clarification).
  • Walker captures the UPSERT shape; worker executes as-is (no special pre-processing).

Exit gate:

  • INSERT … ON CONFLICT (id) DO NOTHING runs on a conflicting row; affected-row count is 0.
  • INSERT … ON CONFLICT (id) DO UPDATE SET v = excluded.v runs; the conflicting row is updated to the new value.
  • ON CONFLICT DO NOTHING (no target spec) handles any conflict.
  • excluded.col completes / highlights as a column ref to the target table's columns inside the DO UPDATE expressions.

DA gate (written):

  • Does ON CONFLICT … DO UPDATE trigger persistence write-through? (Yes; the target row is modified.)
  • Is the excluded alias scoped to ONLY the DO UPDATE branch, not leaking into the outer INSERT?

3i — Diagnostics

Scope (in):

  • diagnostic.insert_arity_mismatch (§8.1).
  • diagnostic.auto_column_overridden (§8.2).
  • diagnostic.not_null_missing (§8.3).
  • Per-key positive + negative tests + catalog entries.

Exit gate:

  • One positive + one negative test per new key (the ADR-0032 §2d exit gate pattern):
    • insert_arity_mismatch: INSERT INTO t (a, b) VALUES (1, 2, 3) fires; matched arity doesn't.
    • auto_column_overridden: INSERT INTO t (id, name) VALUES (5, 'x') (id is serial) fires; omitting id doesn't.
    • not_null_missing: INSERT INTO t (a) VALUES (1) (b is NOT NULL no default) fires; including b doesn't.
  • Multi-row arity mismatch emits per-row, not just on the first offending row.
  • The inherited Phase-2 diagnostic passes (schema_existence, sql_predicate_warnings, compound_arity_diagnostics) still fire on DML's sql_expr slots.

DA gate (written):

  • Does auto_column_overridden produce a Warning, not an Error? (Spec: WARNING — the statement still runs.)
  • Does not_null_missing produce a Warning? (Same.)
  • Are the catalog wordings engine-neutral?

3j — Dispatch wiring on shared entry words

Scope (in):

  • Convert data::INSERT, data::UPDATE, data::DELETE shapes to Choice(SQL_shape, DSL_shape).
  • Remove the experimental separate data::SqlInsert / SqlUpdate / SqlDelete CommandNodes that 3a-3i used as scaffolding.
  • Verify DSL-only inputs still match the DSL branch in both modes.
  • Verify SQL-only inputs match the SQL branch in Advanced mode and fail with "this is SQL" in Simple mode.

Exit gate:

  • All existing DSL INSERT/UPDATE/DELETE tests still green (unmodified inputs, unmodified outputs).
  • All Phase-3 SQL DML tests still green via the new dispatch path (the SQL branch is reached through data::INSERT etc., not through data::SqlInsert).
  • A structurally-ambiguous input (delete from t where id = 1) in Advanced mode routes through the SQL branch (verifiable by checking that the cascade-summary code path fired — DSL and SQL paths share the formatter but the entry point differs).
  • A DSL-specific input (delete from t --all-rows) in Advanced mode falls through to the DSL branch.

DA gate (written):

  • Verify that no test was changed to make 3j pass — only the shape's structure changes; the observable behaviour is identical.
  • Is the test count higher than before 3j? It should be (the scaffolding CommandNodes had their own tests; now the shared CommandNodes carry both).

3k — Verification sweep

Scope (in):

  • Fill in the cross-cut verification matrix (docs/plans/<date>-adr-0033-phase-3.md).
  • Produce the phase-exit verification report.
  • DA final review (genuine, not rubber-stamp — see Phase 2's process lesson in docs/handoff/20260520-handoff-28.md §4).

Exit gate:

  • Every matrix row green with a concrete file::function test reference.
  • Final test run: zero failures, zero skips, no regressions from the Phase-2 baseline of 1446 / 0 / 1.
  • Clippy clean.
  • Report at docs/handoff/<date>-phase-3-verification.md documents all autonomous decisions with explicit user-confirmation pointers and the DA's written critiques (not just the verdict).

DA gate (written):

  • Did filling the matrix surface gaps that need closing? (Phase 2 surfaced THREE; expect at least one for Phase 3.)
  • Are any sub-phase exit gates above weaker than they should be in hindsight?
  • Did any sub-phase make an autonomous decision that's not in this ADR? If so, flag for retroactive user approval before phase-exit.

Out of scope (clarifications already established)

  • No engine type names in or out (ADR-0030 §5 / §7).
  • No STRICT, no SQLite storage clauses (ADR-0030 §7).
  • No transaction control (ADR-0030 §13 OOS-4).
  • No multi-statement batches (ADR-0030 §13 OOS-4).
  • No EXPLAIN of SQL DML (ADR-0030 §13 OOS-2 — the DSL explain still wraps what it already wraps).

Amendment 1 — Dispatch mechanism: category-grouped dispatch supersedes Node::Guard (2026-05-21)

This amendment supersedes §2's "Mode-gating mechanism" subsection and revises sub-phase 3a's scope. It was written during 3a implementation, after tracing the proposed mechanism against the actual walker code, and is recorded with explicit user approval before any code landed.

The finding — option (a) does not work as §2 frames it

§2 selected option (a): a standalone zero-byte Node::Guard(fn) placed as the first element of the SQL branch's Seq, inside a Choice(SQL_shape, DSL_shape). §2 framed this as the "smallest mechanism change" that "reuses the existing validator path" with the failure surfacing as the same WalkOutcome::ValidationFailed as the entry-word gate — implying no walk_choice change.

Tracing the four required behaviours against src/dsl/walker/driver.rs shows that framing is inaccurate, and that any guard-in-Choice mechanism is forced to modify walk_choice:

  1. walk_choice returns immediately on a branch Failed (driver.rs walk_choice) — it only falls through to the next branch on NoMatch. So in Simple mode, delete from t where id = 1 (which is valid DSL) would hit the SQL branch's guard, the guard would fail, and the user would see the "this is SQL" hint instead of the statement running as DSL. That contradicts §2's own "Simple mode → DSL only" promise. Fixing it requires a walk_choice change regardless of how the guard node is shaped.

  2. walk_seq treats a NoMatch at idx > 0 as a hard Failed (driver.rs walk_seq). A zero-byte Guard at idx 0 still advances idx to 1, pushing the SQL branch's first real token to idx 1. So in Advanced mode, delete from t --all-rows (DSL-only) cannot fall through to DSL — the SQL branch returns Failed, and walk_choice won't try the DSL branch. The literal form needs an additional walk_seq semantics change (idx == 0 → "no bytes consumed yet").

This is exactly the R1 risk the ADR budgeted for ("if walk_seq commits the Seq before the validator runs … the Choice can't fall through").

The replacement — category-grouped, mode-aware dispatch

Rather than gate a Choice branch from inside the recursive grammar, the dispatch decision moves up to the command dispatcher (walker::walk), which is where mode gating already lives: the existing whole-command gate (is_advanced_only + the Simple-mode short-circuit in walk) is precisely this pattern, hardcoded for select / with. The amendment generalises that existing code instead of inventing a parallel in-grammar mechanism.

Concretely:

  • Each registry entry carries a categoryCommandCategory::{Simple, Advanced}. The category lives at the registration site (the REGISTRY table), not as a field on every CommandNode literal, because category is a dispatcher concern, not intrinsic to a command's grammar. select / with (and, from 3b, the SQL insert / update / delete nodes) are Advanced; everything else is Simple. This subsumes ADVANCED_ONLY_ENTRIES: is_advanced_only(entry) becomes "every registry candidate for this entry word is Advanced".

  • A shared entry word has a node in both groups — e.g. a Simple DSL insert node and an Advanced SQL insert node, both with entry word insert. The entry-word lookup returns all candidates; the dispatcher selects by mode.

  • Mode-aware selection in walk (no combinator changes):

    • Simple mode considers Simple candidates. If a Simple candidate exists, it is committed. If none exists but an Advanced candidate does (a pure-SQL entry word like select), the dispatcher emits the advanced_mode.sql_in_simple ValidationFailed — identical to today's whole-command gate.
    • Advanced mode tries Advanced candidate(s) first, then the Simple candidate as a fallback. The first candidate that fully matches wins (delete from t where id = 1 → SQL; delete from t --all-rows → falls through to DSL because the SQL shape doesn't match --all-rows).
  • No Node::Guard, no walk_choice / walk_seq change. The recursive combinators are untouched; each candidate is walked cleanly and independently. Speculative evaluation runs on a scratch WalkContext; only the committed candidate is walked into the caller's context, so post-walk context state (completion / hint accumulators) always reflects the chosen candidate.

The two details (user-approved)

  1. "This is SQL" hint in Simple mode for SQL-shaped input. For SQL-only input in Simple mode (e.g. delete … returning *), the Simple candidate won't match. To keep parity with select / with, the dispatcher speculatively checks whether the Advanced candidate would match; if so, it emits the advanced_mode.sql_in_simple hint rather than a generic DSL parse error. Lives entirely in walk.

  2. Advanced-mode completion is SQL-first, DSL as full-line fallback. The Choice approach would have unioned SQL and DSL continuations mid-typing for free; the grouping approach surfaces one candidate's walk for completion / hints. The decision is to show SQL completions primarily in Advanced mode and fall to DSL only on a full DSL-shaped line — simpler and matching the "SQL-first, DSL as familiar fallback" intent. No unioned completion.

Consequences of the amendment

  • Node::Guard is NOT added. §2's mechanism caveat, Consequences bullet 1 ("the walker gains a new node variant Node::Guard"), and the original 3a scope are withdrawn.
  • REGISTRY's element type changes to carry the category alongside each &CommandNode. The handful of registry iteration sites (entry-word listing, command_for_entry_word, the completion entry-word filters, note_help) destructure the tuple; no behavioural change for the current single-candidate registry.
  • The Command variants (§10), worker handlers, RETURNING flag, cascade-summary plan, shortid auto-fill, and diagnostics are unaffected — this amendment is purely about how the dispatcher routes a shared entry word to the right grammar.
  • Sub-phase 3a now validates the dispatch mechanism, not the guard: a smoke registry with a shared entry word (a Simple and an Advanced node using distinguishable tokens) exercises the Simple/Advanced selection, the fallback, and the "this is SQL" path. The 3a exit gate's four cases map onto the new mechanism unchanged in spirit.
  • 3b adds the first real shared entry words to REGISTRY (the SQL insert / update / delete nodes). At that point the completion entry-word lists will need to de-duplicate the shared primary (two candidates, one entry word) — tracked for 3b, not 3a.
  • The mechanism remains reusable for Phase 4 DDL's analogous shared-entry problem (create, …): tag the SQL DDL nodes Advanced; the dispatcher handles the rest.

Amendment 2 — Cascade summary on SQL DELETE: count-diff parity supersedes WHERE-injected pre-count (2026-05-22)

This amendment supersedes §7's "Predicate extraction" subsection and the pre-count mechanism in §7 steps 13, and withdraws Open Implementation Risk R2. It was written during sub-phase 3f, after tracing §7's prescribed mechanism against the actual DSL do_delete handler, and is recorded with explicit user approval before any 3f code landed.

The finding — §7's premise about the DSL handler is incorrect

§7 prescribes a per-child pre-execution pre-count of the form SELECT count(*) FROM <child> WHERE <fk_col> IN (SELECT <pk_col> FROM <target_table> WHERE <user_where_predicate>), with the <user_where_predicate> re-injected from the SQL DELETE's WHERE-clause source bytes. Its stated justification:

"The DSL handler reuses the typed Expr AST to construct the pre-count subqueries."

Tracing do_delete in src/db.rs shows this premise is factually wrong. The DSL handler does not construct pre-count subqueries from the typed Expr. It detects cascade effects by before/after row-count diffing: read the inbound relationships, count each child table's rows before the DELETE, execute the DELETE inside a transaction, count each child after, and report the positive difference as a CascadeEffect. No pre-count query and no Expr-derived subquery exist anywhere in the path.

Two consequences of the correct mechanism

  1. Count-diff reports ON DELETE CASCADE row removals only. ON DELETE SET NULL does not change a child's row count, so the DSL path does not report it (its own code comment notes this). Reporting SET NULL would require value-level diffing.

  2. §7's pre-count is in tension with the parity promise. §2 promises "a DSL-style … DELETE in Advanced mode produces identical observable behaviour whether it routes through the SQL or DSL path", and the plan's 3f exit gate requires the SQL cascade summary to match the DSL do_delete output. A pre-count that counted ON DELETE CASCADE and SET NULL (as §7 step 1 says) would report more than the DSL path — breaking that parity. To match the DSL path it would have to drop SET NULL anyway, leaving WHERE-byte extraction and the R2 N+1-subquery pathology as pure cost with no benefit over count-diff.

The replacement — count-diff parity

do_sql_delete mirrors do_delete exactly: read inbound relationships, snapshot child row counts, run the verbatim validated DELETE SQL inside a transaction via execute_with_fk_enrichment, diff the child counts, build the same Vec<CascadeEffect>, and re-persist the target plus every cascade-affected child before commit. Because both handlers return the same DeleteResult and both route through CommandOutcome::Deletehandle_dsl_delete_success, the per-relationship summary formatter is already shared at the render layer — no formatter refactor and no duplicate path.

This dominates the §7 mechanism on every axis the 3f exit gate measures:

  • Exact parity with the DSL path — identical detection mechanism, not merely a matching formatter.
  • No WHERE-clause byte extraction. The verbatim DELETE (including any subquery in its WHERE) executes once; the engine cascades; the diff observes the result. The R2 invariant case (DELETE FROM a WHERE id IN (SELECT … )) is correct by construction and carries no N+1 cost. R2 is withdrawn.
  • Shared render with no duplicate formatter, satisfying §7's shared-formatter promise structurally.

SET NULL reporting — deferred for both paths (user-confirmed)

3f keeps strict DSL parity: ON DELETE CASCADE row removals are reported; SET NULL is not, on either path. Adding SET NULL reporting (value-level diffing) is a tracked future enhancement to be applied to both do_delete and do_sql_delete together so parity is preserved. The user confirmed this deferral.

Consequences of the amendment

  • §7's "Predicate extraction" subsection and pre-count steps 13 are superseded by count-diff; the per-relationship summary and shared-formatter outcomes are unchanged.
  • R2 (cascade-summary predicate extraction) is withdrawn — the mechanism it budgeted for is no longer used.
  • Command::SqlDelete carries { sql, target_table } only — no WHERE-clause field is needed (the worker never inspects the predicate).
  • The handoff-31 §4 "WHERE-byte-extraction is tractable for DELETE" heads-up is moot — no extraction happens.

Amendment 3 — Command identity is intrinsic; execution-mode is side-channel (deferred) (2026-05-23)

This amendment clarifies the command-identity model that §2 and Amendment 1 implement, corrects a false premise in the implementation plan's sub-phase 3j exit gate, and records a deferred follow-up (the execution-time mode side-channel). It was written during sub-phase 3j — wiring the shared insert / update / delete entry words — after the dispatch model's interaction with the existing test suite surfaced a question about what a "command" is and how input mode relates to it. Recorded with explicit user approval before any 3j code landed.

The model — a command is a mode-rooted grammar-path outcome

A command is the typed outcome of a grammar path rooted at the input mode: simple → update → … or advanced → update → …. Its identity is intrinsic — determined by which grammar matched, not by a mode flag carried on one shared command type. In Advanced mode the dispatcher (Amendment 1) tries the Advanced (SQL) candidate(s) first and falls back to the Simple (DSL) candidate when no Advanced branch recognizes a token; the fallback produces the Simple command.

Worked example (the --all-rows fall-through): delete from T --all-rows in Advanced mode — no Advanced (SQL) branch recognizes --all-rows (the SQL DELETE has no expression slot after the table to absorb it), so the path falls back to simple → delete → … → all-rows and yields Command::Delete { filter: AllRows } (the DSL command), exactly as it would in Simple mode. The mode rooted the path; the command that came out is the Simple one.

Counter-example (no fall-through): update T set x = 42 --all-rows in Advanced mode does not fall back — the SQL UPDATE's SET <expr> greedily consumes --all-rows as the expression 42 - -all - rows (with all/rows as column refs), so the SQL shape matches and Command::SqlUpdate is produced. Whether --all-rows is a DSL flag or part of a SQL expression is decided by which grammar wins, not by special-casing the token. (At execution this is harmless: the engine treats --all-rows as a SQL line comment, so the statement runs as update T set x = 42 — all rows — the same effect as the DSL flag.)

Simple mode commits the DSL candidate; an advanced-mode pointer combines

In Simple mode a shared entry word always commits its DSL candidate, so the user sees DSL completion and the real DSL error (with its position) — not a bare "this is SQL" that discards the actionable detail. A plain ValidationFailed/Mismatch "this is SQL" dispatch outcome is reserved for entry words that have no DSL form (select / with); for those the DSL surface has nothing to offer, so the simple-mode gate points at advanced mode (ADR-0030 §2).

To keep the SQL-discoverability the original §2 envisioned without losing the DSL fix, the rendering layer combines them: when a simple-mode line is a definite DSL error (not merely incomplete) and the same line would parse in advanced mode, the DSL error prose is suffixed with the advanced_mode.also_valid_sql pointer — e.g. for \Name`: Type a quoted string … (valid as SQL in advanced mode — `mode advanced` or prefix `:`). The DSL detail and the mode hint coexist (input_render::ambient_hint_in_mode/advanced_alternative_note`). Mid-typing (incomplete) input is not suffixed, to avoid noise during normal DSL entry. This supersedes the draft of this amendment's earlier suggestion that a SQL-shaped line in simple mode emits a bare "this is SQL" hint for shared words.

Overlapping inputs produce two distinct commands — both are tested

For a fully-overlapping shape — insert into T values (…) is valid in both grammars — Simple mode yields Command::Insert (a typed AST, DSL execution) and Advanced mode yields Command::SqlInsert (validated text, SQL execution). This is correct, not a defect: the two commands do the same thing but execute differently (ADR-0030 §4 — the DSL lowers to a typed AST; SQL is grammar-as-text run verbatim). Because they are distinct commands, each is tested in the mode that produces it — DSL grammar tests run in Simple mode; the SQL command variants are tested in Advanced mode.

Correction to the plan's 3j exit gate

The implementation plan's sub-phase 3j exit gate says "all existing DSL INSERT/UPDATE/DELETE tests still green — unmodified inputs, unmodified outputs", and its scope-out says "observable behaviour must be identical before and after 3j". That rested on a false premise: that the existing DSL DML tests run in Simple mode. They do not — the common helpers (ok/err/parse) call parse_command, which defaults to Advanced mode. Today that is harmless because insert/update/delete have no SQL competitor, so they route to the DSL command even in Advanced mode. Once they become shared entry words, §2's Advanced-mode SQL-first rule routes the overlap to the SQL command (Command::Sql*), changing the parsed variant those tests assert.

The corrected invariant: Simple-mode behaviour is unchanged; Advanced mode is SQL-first per §2; the DSL grammar is tested in Simple mode (its canonical surface, ADR-0003); both command variants are tested in their producing mode. No production behaviour regresses — the §6 (shortid auto-fill) and §7 (cascade summary) parity promises keep the two paths observably equivalent for overlapping inputs; only the implementation-internal Command variant differs, and only in Advanced mode.

Replay uses the same parser, in advanced mode — and needs no per-line mode

A consequence worth stating explicitly, because it is easy to over-think (and a prior reading of this work did). Replay parses each recorded line with the same schema-aware parser the interactive path uses, in advanced mode (the full surface) — it skips nothing and simplifies nothing. The only difference from an interactive line is the mode argument: interactive uses the user's current mode, replay fixes it to advanced.

This is correct and complete without the per-line execution-mode side-channel below, because of the §6 (shortid auto-fill) and §7 (cascade-summary) parity guarantees: a valid overlapping command produces identical effects whether it lowers to the DSL variant or the SQL variant. So replaying a log of valid commands in advanced mode is identical to typing those lines interactively in advanced mode — and, by parity, identical in effect to the simple-mode entry that originally produced them. (Replay only ever sees lines that already executed successfully: history.log is success-only, and ADR-0034's deferred journal replays ok lines only.)

The one observable nuance is purely in error reporting for an invalid line — which only arises from a hand-built script, never from a real journal. Such a line is still rejected and not applied; where the rejection lands just follows the grammar, exactly as interactively: an insert … values … line is SQL in advanced mode, so a wrong column-type value is rejected by the engine at execute time rather than by a DSL typed slot at parse time. This is not replay using a lesser parser — it is the same advanced-mode parse a user would get typing the line. Replay therefore does not depend on the deferred side-channel below.

Execution-time mode is side-channel — deferred to its own ADR

Independent of command identity, every command should — at execution time — know which of three modes it ran under: simple, advanced, or advanced-one-shot (the : escape from Simple mode, ADR-0003), so execution can adjust output without changing identity (e.g. a Simple-mode create table echoing the generated SQL when run in Advanced mode, while staying silent in Simple mode).

Today this exists only as a rendering side-channel (OutputLine.mode_at_submission, consumed by the echo-line mode tag in ui.rs). The Mode enum is two-way (Simple / Advanced); the one-shot distinction is a transient "effective mode" collapsed at submission; and neither Action::ExecuteDsl nor the database worker carries any mode. Wiring an execution-time mode side-channel — widening Mode to the three-way distinction and threading it through the Action → worker interface — is out of scope for Phase 3 and is deferred to its own ADR (user-confirmed). It is not required for Phase 3's dispatch to be correct: the routing and the --all-rows fall-through above are complete on the two-way mode. This amendment forward-references that future ADR so the requirement is not lost.

Consequences of the amendment

  • No code or grammar change from this amendment itself — it records the model that the dispatch (Amendment 1) already implements and corrects the plan's test-mode premise.
  • The 3j test migration moves the DSL grammar / completion tests to Simple mode (inputs and Command::Insert/Update/Delete assertions unchanged), keeps the --all-rows fall-through tests in Advanced mode (they validate the fallback to the Simple command), relies on the migrated tests/sql_*.rs for the SQL command variants in Advanced mode, and adds dispatcher routing tests (Advanced + structurally-ambiguous → SQL; Advanced + --all-rows → DSL fallback; Simple + a SQL-only entry word select → "this is SQL"; Simple + a shared word with a SQL-only construct → DSL error, carrying the combined pointer at the rendering layer).
  • The combined pointer lives in input_render::ambient_hint_in_mode (live typing) and App::dispatch_dsl (submit), both via the shared advanced_alternative_note, so a Simple-mode definite DSL error that would run as SQL gains the advanced_mode.also_valid_sql suffix in both surfaces (the submit path covers SQL constructs that surface only on submit, e.g. delete … returning). Found via the /runda round.
  • Internal-table rejection symmetrised (/runda finding B, ADR-0030 §6): the DSL data-command target slots (insert / update / delete / show data / show table) gained reject_internal_table, so __rdbms_* tables are refused in Simple mode too — previously only the advanced SQL grammar rejected them, so a simple-mode DML could touch the internal metadata tables.
  • The execution-time mode side-channel + three-way Mode is tracked as a future ADR; OutputLine.mode_at_submission remains the only mode side-channel until then. No structure built in 3j assumes mode is irrelevant to execution.

Amendment 4 — update … --all-rows falls back to the DSL; Amendment 3's counter-example was a bug (2026-05-27)

Designing the DSL → SQL teaching echo (ADR-0038) re-examined Amendment 3's counter-example — the claim that update T set x = 42 --all-rows in Advanced mode does not fall back to the DSL because the SQL UPDATE's SET <expr> "greedily consumes --all-rows as the expression 42 - -all - rows (with all/rows as column refs)." That is a bug, not correct behaviour, and this amendment reverses it. Recorded with explicit user approval (2026-05-27).

Why it is a bug

  • The walker has no -- comment support (it lexes two minus operators); the engine does treat -- as a line comment. So the walker's parse (42 - -all - rows) diverges from execution (which runs update T set x = 42).
  • Absent columns literally named all and rows, the walker is silently accepting a SET expression over non-existent columns — precisely the case ADR-0027 says to flag ("mark error if we know it will fail at runtime"). It only appears to succeed because the engine's comment leniency masks the divergence.
  • Trailing SQL comments are not a supported feature of the surface, so relying on the engine to treat --all-rows as one is wrong.

So the SQL UPDATE shape should not match … --all-rows.

Decision

The --all-rows token sequence makes the SQL UPDATE shape fail, so dispatch falls back to the DSL Command::Update { filter: AllRows } — restoring symmetry with delete … --all-rows, which already falls back (the SQL DELETE has no slot to absorb the flag). Consequences:

  • update … --all-rows in Advanced mode is the DSL command, runs all rows via the safe DSL path, and (ADR-0038) gains the teaching echo UPDATE T SET … with the WHERE omitted.
  • No -- comment feature is introduced; the playground still does not support trailing SQL comments (consistent with the rest of the surface, engine-neutral).

Mechanism — key on the adjacent --

The DSL --all-rows is matched by Node::Flag("all-rows") via consume_flag (walker/lex_helpers.rs), which requires an adjacent --. The SQL expression grammar has no flag node, so it consumes the same source as - -all - rows (two minus operators, all/rows as column refs) and the SQL UPDATE shape wins (SQL-first). Crucially, --all-rows and a legitimate - -all tokenise identically once split, so the only robust discriminator is the adjacency of the two dashes:

  • The fix: the SQL expression treats an adjacent -- as a boundary it will not consume, so the SQL UPDATE shape fails there and dispatch falls back to the DSL, whose Node::Flag matches --all-rows. (Adjacency is what separates the DSL flag from real arithmetic — see the preserved case below.)

Verified consequences (probed against the current grammar), which the build must lock down test-first:

  • update T set x = 42 --all-rows → DSL Update { AllRows } (the goal; inverts sql_dml_e2e.rs::e2e_update_all_rows_in_advanced_does_not_fall_back_to_dsl).
  • update T set x = 42 - -3unchanged, stays SqlUpdate (= 45): the dashes are space-separated, not an adjacent --. This invariant is non-negotiable and gets its own test.
  • update T set x = 42--3 and update T set x = 42 --col (adjacent -- before a number or a real column) → today they parse Ok as SqlUpdate; after the fix the SQL expression stops at -- and the DSL flag grammar (which accepts only all-rows) rejects the rest, so they become parse errors. This is a deliberate, acceptable behaviour change for these contrived adjacent-dash inputs — the playground does not support -- line comments, so there is no valid reading of an adjacent -- here. (User heads-up flagged 2026-05-27.)

The existing delete … --all-rows fall-back is unaffected. Folded into the ADR-0038 echo effort (the fix that makes update … --all-rows echoable).

Amendment 5 — also_valid_sql fires on validity, not just parse (2026-05-28)

Amendment 3 introduced the advanced_mode.also_valid_sql pointer and described its gate as "the same line would parse in advanced mode." Issue #1 surfaced that this is too weak: an advanced-mode positional INSERT INTO T VALUES (…) with a value count that doesn't match the target table's column count parses (the SQL grammar accepts any positional value count) but then fails at the engine. The pointer fired, telling the user to switch modes, and the same line failed again — the suggestion was misleading. Recorded with explicit user approval (2026-05-28).

Why "parse" was the wrong bar

"Would parse" is a purely syntactic check: the structural grammar accepts the input. It says nothing about whether subsequent static checks (type slots, INSERT arity, predicate warnings, schema-existence diagnostics, …) would also pass. The cross-mode pointer's job is to direct the user to the mode that actually runs the line; a syntactic-only gate over-promises in every case where a non-grammar static check would still reject. The Form B positional arity case is just the first one observed.

"Valid" — the explicit definition used by this amendment

The pointer fires only when the line is valid in advanced mode, where valid is the ADR-0027 sense — operationally, the value returned by crate::dsl::walker::input_verdict_in_mode(input, schema, Mode::Advanced) is None:

  • The structural parse succeeds (WalkOutcome::Match), and
  • No diagnostic of Severity::Warning or Severity::Error is produced by any pass that contributes to input_verdict_in_mode (schema-existence, SQL-predicate warnings, compound-arity errors, INSERT-arity errors — Form A and Form B — and any further diagnostic added to the pipeline in the future).

This is the same condition the in-app [ERR] / [WRN] validity indicator uses (ADR-0027 §2). Tying the pointer to it gives one canonical "is this line OK in this mode?" answer that every surface shares — by construction, the indicator and the pointer can never disagree about a given line.

Decision

  • advanced_alternative_note (src/input_render.rs) reads input_verdict_in_mode(…, Mode::Advanced). The pointer fires only for verdict None.
  • Any static diagnostic that participates in the verdict automatically participates in the gate. No bespoke pre-flight check lives in the pointer helper.
  • The Amendment 3 description ("would parse in advanced mode") should be read as a synonym for valid in this amendment's sense, not as a literal syntactic-only test.

Diagnostic added to close the issue #1 gap

ADR-0033 §8.1's existing dml_insert_arity_diagnostics checked only Form A (column list present). Its own doc-comment recorded the Form B case as deferred. This amendment closes that gap: the same function now also checks the Form B case (no column list) by comparing each VALUES tuple's arity against the target table's full column count from the schema cache, emitting a new diagnostic key diagnostic.insert_arity_mismatch_form_b. The [ERR] validity indicator now lights up at typing time for the user-reported scenario, and the cross-mode pointer gate sees the verdict and stays silent.

Tests

The amendment's behavioural promise is locked down by:

  • dsl::walker::tests::insert_form_b_arity_mismatch_under_supply_fires — the user's reported 4-col / 3-value case produces an Error diagnostic.
  • dsl::walker::tests::insert_form_b_arity_mismatch_over_supply_fires — symmetric case, 5 values for a 4-col table.
  • dsl::walker::tests::insert_form_b_arity_match_is_silent — happy path (every column supplied, autos included) stays silent.
  • dsl::walker::tests::insert_form_b_arity_unknown_table_is_silent — defensive: an unknown table defers to the schema-existence pass.
  • input_render::tests::ambient_hint_omits_advanced_pointer_when_form_b_value_count_wouldnt_match — the pointer is suppressed for the bug case via the validity verdict.
  • app::tests::simple_mode_submit_of_sql_construct_appends_advanced_pointer — multi-row VALUES (a true SQL-only construct) still fires the pointer when the table exists in the schema.

Behaviour preserved

The pointer still fires for every input that is genuinely SQL-only syntax in simple mode and is valid as advanced SQL — delete … returning *, multi-row VALUES against a known table, INSERT … ON CONFLICT … over a known target, and so on. The narrowing is exactly the set of inputs where switching modes wouldn't help, no more.

See also

  • ADR-0005 — the ten-type vocabulary INSERT works with.
  • ADR-0006 — the auto-snapshot before destructive ops.
  • ADR-0014 — the DSL DML model + cascade-summary precedent.
  • ADR-0015 — the persistence write-through path.
  • ADR-0018 — the DSL INSERT forms (A / B / C).
  • ADR-0019 — the friendly-error layer DML errors route through.
  • ADR-0027 — the validity indicator surfaces DML diagnostics.
  • ADR-0030 — the SQL surface in advanced mode; Phase 3 is this ADR's commission.
  • ADR-0031 — the SQL expression grammar, consumed by DML WHERE / SET / VALUES / RETURNING.
  • ADR-0032 — the SQL SELECT grammar; INSERT … SELECT and RETURNING reuse its projection-list and scope-frame machinery.