rdbms-playground/docs/adr/0033-sql-dml-grammar.md

# ADR-0033: The full SQL DML grammar — `INSERT` / `UPDATE` / `DELETE`

## Status

Accepted

Phase 3 implemented and verified through sub-phase 3k
(2026-05-23); see `docs/handoff/20260523-phase-3-verification.md`
for the phase-exit report and the filled cross-cut matrix in
`docs/plans/20260520-adr-0033-phase-3.md`. Amendments 1–3 below
are part of this acceptance.

> **Forward note (2026-05-26, ADR-0036 Accepted).** ADR-0036 **augments**
> the §10 model — it does **not** change verbatim execution and does
> **not** collapse the two-command identity of **Amendment 3** (which
> stands). It adds a *value-validation* step: `INSERT`/`UPDATE` literal
> values are parsed to typed `Value`s, **validated** against the column
> type (sharing the DSL's validators), and **retained** on the command
> (so a constraint error can show the offending value) — then the
> statement still executes **verbatim**. `Command::SqlInsert` gains a
> captured-literals payload; its execution and `plan_shortid_autofill`
> are unchanged. Expressions, `WHERE`, `INSERT … SELECT`, `RETURNING`,
> UPSERT, and `SELECT` are untouched. (The serial/shortid auto-fill
> difference vs simple mode is tracked separately as `requirements.md`
> X4.)

## Context

ADR-0030 commissions advanced mode as a body of SQL grammar
inside the unified grammar tree. Phase 1 ("foundations + first
`SELECT`") and Phase 2 ("`SELECT` — full") have shipped through
ADR-0031 and ADR-0032 respectively. The next slice on ADR-0030's
roadmap is **Phase 3 — DML**: `INSERT` / `UPDATE` / `DELETE` as
validated SQL.

ADR-0030 §3 fixed the headline scope:

> **`INSERT`** (single- and multi-row), **`UPDATE`**, **`DELETE`**.

ADR-0030 §4 fixed the execute path:

> **DML and `SELECT`** → executed as the **validated SQL itself**
> (re-rendered canonically from the matched parse, or the validated
> original text). … For DML the worker — knowing the statement kind
> and target table from the parse — runs the statement and
> re-persists that table's CSV.

ADR-0030 §11 fixed the persistence model. ADR-0030 §12 fixed the
safety posture: SQL `UPDATE`/`DELETE` without `WHERE` executes as
written (no `--all-rows` rail).

ADR-0030 §3 also flagged this ADR's existence in advance:

> Settle multi-row `INSERT` and `shortid` auto-fill on a SQL
> `INSERT`.

This ADR fixes the **grammar shape**, the **dispatch
architecture** for shared entry words, the **diagnostic surface**
new to DML, and the **execute-path** Command variants. It does
not revisit ADR-0030's controlling decisions; references in this
text to ADR-0030 §X mean "that decision is controlling."

### What ADR-0030, ADR-0031, ADR-0032 already fix

- One walker, grammar-as-text execution, ambient assistance for
  free (ADR-0030 §1/§4/§6/§8).
- `sql_expr` is the shared expression grammar — UPDATE `SET`
  expressions, WHERE predicates, INSERT VALUES tuples, RETURNING
  projection items all consume it (ADR-0031, ADR-0032 §5/§6).
- `Node::ScopedSubgrammar` + `from_scope_stack` carry CTEs and
  per-statement scope into nested constructs (ADR-0032 §10).
- The diagnostic pipeline — `schema_existence_diagnostics`,
  `sql_predicate_warnings`, `compound_arity_diagnostics`,
  `pending_diagnostics` accumulator — applies uniformly to DML's
  `sql_expr` slots (ADR-0032 §11).
- The friendly-error layer translates engine messages through the
  `engine.*` catalog (ADR-0019, ADR-0030 §7, ADR-0032 §11.5).
- The `__rdbms_*` rejection at every table-source slot
  (ADR-0030 §6, ADR-0032 §1/§4).
- `MAX_SUBGRAMMAR_DEPTH = 64` (ADR-0026) is the recursion budget.

## Decision

### 1. The top-level statement shapes

```
insert_statement   := INSERT INTO table_name
                      [ '(' column_name_list ')' ]
                      ( values_clause | sql_select_compound )
                      [ on_conflict_clause ]
                      [ returning_clause ]
                      [ ';' ]

update_statement   := UPDATE table_name
                      SET assignment_list
                      [ where_clause ]
                      [ returning_clause ]
                      [ ';' ]

delete_statement   := DELETE FROM table_name
                      [ where_clause ]
                      [ returning_clause ]
                      [ ';' ]
```

Where:

```
values_clause      := VALUES value_tuple ( ',' value_tuple )*
value_tuple        := '(' sql_expr ( ',' sql_expr )* ')'
assignment_list    := assignment ( ',' assignment )*
assignment         := column_name '=' sql_expr
returning_clause   := RETURNING projection_list
on_conflict_clause := ON CONFLICT [ '(' column_name_list ')' ]
                      ( DO NOTHING
                      | DO UPDATE SET assignment_list [ where_clause ] )
```

- `table_name` is the existing `Tables` ident slot, with
  `reject_internal_table` validator (ADR-0030 §6).
- `column_name_list` is comma-separated `Columns` idents — the
  same shape `data::INSERT` (DSL) uses for Form A. Reuses the
  existing pattern; no new node infrastructure.
- `where_clause` is `WHERE Subgrammar(&sql_expr::SQL_OR_EXPR)`.
- `projection_list` is the Phase-2 projection list, re-used
  unchanged.

The body of an `INSERT … SELECT` recurses through
`Node::Subgrammar(&sql_select::SQL_SELECT_COMPOUND)` — the same
recursion point Phase 2's CTE bodies, scalar subqueries, and
`IN`/`EXISTS` bodies use. The Phase-2 SELECT grammar already
admits an optional leading `WITH` (Amendment 2 / commit
`fd25904`), so `INSERT INTO t WITH x AS (…) SELECT * FROM x`
works for free.

### 2. Dispatch — DSL vs SQL on shared entry words

The entry words `insert`, `update`, `delete` are **shared**
between DSL and SQL. This ADR fixes the dispatch rule **and**
the mechanism that implements it (a known-risk item — see
"Open implementation risks" below).

**Simple mode → DSL only.** SQL constructs (multi-row VALUES,
INSERT…SELECT, RETURNING, UPSERT) attempted in simple mode
produce a `ValidationFailed` "this is SQL" hint, matching the
gating that `select` / `with` already produce (ADR-0030 §2).

**Advanced mode → SQL first, DSL fallback.** Each shared entry
word's `CommandNode.shape` becomes a `Choice` between the SQL
shape and the DSL shape, declared in that order:

```
shape: Node::Choice(&[SQL_INSERT_SHAPE, DSL_INSERT_SHAPE])
```

The walker's `Choice` is first-match-wins. The SQL shape's
constructs are a strict superset of the DSL shape's for these
statements (multi-row VALUES, INSERT…SELECT, RETURNING, UPSERT
are SQL-only; DSL's `--all-rows` flag is DSL-only). Any
structurally-ambiguous input — `delete from t where id = 1` —
matches the SQL shape first and routes through the SQL
machinery.

This is the design point an earlier draft of ADR-0030 §2 left
open ("choice order or a disambiguator decides"). The chosen
disambiguator is `Choice` order: **SQL > DSL in Advanced**. A
DSL-specific construct (e.g., `--all-rows`) won't match the SQL
shape and falls through to DSL — accommodating, not surprising.

Cascade-summary parity (§7) and `shortid` auto-fill parity (§6)
mean a DSL-style INSERT/UPDATE/DELETE in Advanced mode produces
identical observable behaviour whether it routes through the SQL
or DSL path. The user's mental model can be "Advanced mode runs
SQL; DSL syntax still works as a familiar fallback."

#### Mode-gating mechanism

The existing `is_advanced_only` infrastructure
(`src/dsl/grammar/mod.rs::ADVANCED_ONLY_ENTRIES`) gates by
**entry word**, fires in the walker's `dispatch_command_node`
path (`walker/mod.rs:1707`), and emits the "this is SQL"
ValidationFailed verdict. It can't be reused to gate a Choice
*branch* by mode.

Mechanism options for branch-level gating:

- **(a) Mode-aware validator on the SQL branch.** Each
  SQL-shape Seq starts with a no-op `Node::Validated` whose
  validator inspects `ctx.mode`; in Simple mode it returns
  `ValidationError(message_key: "advanced_mode.sql_in_simple")`
  before any structural tokens commit. Trade-off: clean, reuses
  the existing validator path; the failure shape is the same
  `WalkOutcome::ValidationFailed` as the entry-word gate.
- **(b) Mode flag on `Node::Choice`.** A new walker capability:
  `Node::Choice` branches can carry a min-mode marker. Heavier
  (touches the walker's `walk_choice` driver) but lets the
  grammar declare gating without per-branch boilerplate.
- **(c) Two CommandNodes per entry word; registry picks by
  mode.** The dispatcher consults `ctx.mode` and skips the SQL
  CommandNode in Simple mode. Heavier (registry semantics
  change) and surfaces a "multiple CommandNodes per entry word"
  pattern the rest of the codebase doesn't use.

**Default choice: (a) the validator approach.** Smallest
mechanism change. **Walker-capability caveat:** validators
today are an `Ident { validator: Option<fn> }` field — there
is no standalone "guard" node that runs a validator without
consuming input. Sub-phase 3a's first job is to add
`Node::Guard(fn(&WalkContext) -> Result<(), ValidationError>)`
to the walker — a zero-consumption node whose only effect is
to fail the Seq with a ValidationError in Simple mode. The
existing internal-table rejection (`reject_internal_table`)
shows the wiring pattern; `Node::Guard` is the
"no-byte-consuming version" of it.

### 3. Multi-row `INSERT … VALUES`

```
INSERT INTO orders (customer_id, total) VALUES (1, 99.50), (2, 12.00), (3, 6.25)
```

The grammar's `values_clause` repeats `value_tuple` with a comma
separator (`Node::Repeated { inner: value_tuple, separator:
Some(COMMA), min: 1 }`). Each tuple matches the same arity as
the `(column_name_list)` (see §8 for the arity diagnostic).

### 4. `INSERT … SELECT`

```
INSERT INTO archive SELECT * FROM orders WHERE created < '2025-01-01'
INSERT INTO summary (id, total) WITH t AS (…) SELECT id, sum(total) FROM t GROUP BY id
```

A `Choice` between `values_clause` and
`Subgrammar(&sql_select::SQL_SELECT_COMPOUND)` after the
optional column list. The Phase-2 SELECT machinery (CTEs, JOINs,
subqueries, set ops) applies as the row source for free.

Arity verification: at parse time, the walker counts the
declared column list (or, when absent, the target table's column
count from the schema cache) against the SELECT's projection
list arity. Mismatch emits `diagnostic.insert_arity_mismatch`
(see §8).

### 5. `RETURNING`

`RETURNING projection_list` admits the Phase-2 projection_list
unchanged — column refs, `*`, `t.*`, expressions with `AS`
aliases, computed expressions.

Execution: the `Command::Sql{Insert,Update,Delete}` variants
carry a `returning: bool` flag set by the walker's
`ast_builder` when a `RETURNING` clause matched. The worker
reads this flag (no SQL text re-parsing) and routes:

- `returning == true` → prepare + step the statement,
  collect rows into a `DataResult` (per-result-column type
  recovery via the same column-origin path Phase 2 introduced,
  ADR-0032 §12 + Amendment 1).
- `returning == false` → execute the statement,
  collect the affected-row count.

The flag is sufficient because the *shape* of the projection
is already in the SQL the engine prepares; the worker doesn't
need to re-parse to know which columns to read.

### 6. `shortid` auto-fill — worker post-fills

Parity with the DSL `do_insert` handler (ADR-0005 / ADR-0018).

For an `INSERT` whose `(column_name_list)` omits one or more
`shortid` columns, the worker:

1. Inspects the table metadata to find the auto-generated columns
   (`serial`, `shortid`) NOT in the user's column list.
2. For each row in the VALUES list (single or multi-row), and for
   `INSERT … SELECT` each yielded row, synthesises a fresh
   `shortid` value (the existing `shortid::generate` path).
3. Constructs the parameterised INSERT (or executes a rewrite of
   the SQL with the values bound positionally) and runs it.

The engine sees a fully-specified row; the user's submitted SQL
is unmodified in `history.log` (ADR-0030 §11). The walker
recognises an explicit value provided for an auto-gen column —
that's an explicit override, not a missing slot — and emits a
WARNING (§8b) but lets it through.

`serial` columns rely on SQLite's `INTEGER PRIMARY KEY` rowid
auto-increment — no worker post-fill needed; the engine
populates the value on its own.

### 7. Cascade summary on `DELETE`

Parity with the DSL `do_delete` handler (ADR-0014).

When `Command::SqlDelete { target_table, sql, … }` runs:

1. **Pre-execution cascade pre-count.** For each child table
   that has an FK referencing `target_table` with
   `ON DELETE CASCADE` (or `SET NULL`), the worker runs a
   pre-count query of the form `SELECT count(*) FROM <child>
   WHERE <fk_col> IN (SELECT <pk_col> FROM <target_table>
   WHERE <user_where_predicate>)`. The `<user_where_predicate>`
   is extracted from the SQL — see "Predicate extraction" below.
2. The worker executes the DELETE.
3. **Post-execution summary.** Per-relationship row counts +
   cascade impact, formatted by the same cascade-summary
   formatter the DSL path uses.

**Predicate extraction.** The DSL handler reuses the typed
`Expr` AST to construct the pre-count subqueries. For SQL
DELETE the walker captures the WHERE clause's source byte
range from the matched path and re-injects that text into
the pre-count SQL. This is grammar-as-text reuse, consistent
with ADR-0030 §4's posture; no AST construction required.

When the SQL DELETE has no WHERE, the pre-count is unbounded
("all rows of `target_table`"). Cost: one extra count(*) per
child table; cheaper than the DSL parallel path on a typical
schema.

The cascade-summary formatter itself (`format_cascade_summary`
in the DSL handler) is shared with `do_sql_delete` — same
output, same i18n keys.

### 8. Diagnostics new to Phase 3

Three new keys, classified per ADR-0027's ERROR / WARNING model.

#### 8.1. `diagnostic.insert_arity_mismatch` (ERROR)

Walker-level pre-flight when the declared `(column_name_list)`'s
arity disagrees with a `value_tuple`'s arity (or, for `INSERT …
SELECT`, with the SELECT's projection list arity, counted at
parse time via the same projection-counting infrastructure
Phase-2's `compound_arity_diagnostics` uses).

For multi-row VALUES, **each** offending row emits its own
diagnostic on that row's `value_tuple` span. A query with five
rows and three wrong arities produces three diagnostics; the
two right rows are silent. Pre-empts the engine's less-helpful
"X values for Y columns" error.

For `INSERT … SELECT`, the diagnostic anchors on the SELECT's
first `projection_item` span. The Phase-2 projection-count
infrastructure (the comma-counting at depth = leg-depth used by
`compound_arity_diagnostics`) is the underlying mechanism.

#### 8.2. `diagnostic.auto_column_overridden` (WARNING)

Walker-level when an INSERT explicitly assigns a value to a
column whose type is `serial` or `shortid`. Pedagogical: the
explicit value bypasses the auto-counter / generator, which can
collide with future auto-generated values.

Catalog wording: `"column \`{column}\` is auto-generated
(\`{type}\`); providing an explicit value bypasses the
auto-counter and may collide with later auto-generated values"`.

The statement still runs — the warning is advisory, matching
the Phase-2 predicate-warning posture (ADR-0027 §1 /
ADR-0032 §11.6).

#### 8.3. `diagnostic.not_null_missing` (WARNING)

Walker-level when an INSERT's `(column_name_list)` omits a
column declared `NOT NULL` with no `DEFAULT`. Pedagogical
nudge — the engine will reject the statement, but the
walker can pre-flight the missing column at parse time.

Catalog wording: `"column \`{column}\` is required (\`NOT NULL\`,
no default); the statement will fail when run"`.

The walker doesn't BLOCK the statement (it's structurally
valid). The engine does. The friendly layer translates the
engine error via the existing NOT-NULL path.

#### 8.4. Inherited diagnostics

The Phase-2 diagnostic passes apply to DML's `sql_expr` slots
without further work:

- `schema_existence_diagnostics` flags unknown columns in WHERE,
  SET, RETURNING, VALUES tuples.
- `sql_predicate_warnings` flags `= NULL`, `LIKE`-on-numeric,
  type-mismatched comparisons in WHERE and CASE expressions.
- The `pending_diagnostics` accumulator path stays available
  for in-walk emits.

### 9. UPSERT — `ON CONFLICT … DO NOTHING / DO UPDATE`

```
INSERT INTO t (id, name) VALUES (1, 'x')
ON CONFLICT (id) DO UPDATE SET name = excluded.name

INSERT INTO t (id, name) VALUES (1, 'x')
ON CONFLICT DO NOTHING
```

Grammar shape:

```
on_conflict_clause := ON CONFLICT [ '(' column_name_list ')' ]
                      ( DO NOTHING
                      | DO UPDATE SET assignment_list [ where_clause ] )
```

The optional `(column_name_list)` is the conflict target —
which columns' unique constraint to react to. Standard SQL allows
this list to be empty (`ON CONFLICT DO NOTHING` is shorthand for
"any conflict").

The DO UPDATE branch reuses the same `assignment_list` and
optional `where_clause` shapes as the top-level UPDATE statement.

Inside DO UPDATE expressions, the keyword `excluded` is a
pseudo-table reference to the would-have-been-inserted row.
**Note:** `excluded` is the SQLite + PostgreSQL spelling;
standard SQL (`MERGE … WHEN MATCHED`) uses a different model.
The playground commits to the `excluded` form because it is what
SQLite supports and what the project's pedagogical target dialect
uses (ADR-0001 / ADR-0002 — the engine is SQLite, even though
its name is hidden from user-facing strings). This is a
deliberate carve-out from ADR-0030 §7's engine-neutral posture,
justified by the absence of a standard-SQL UPSERT spelling that
both SQLite and PostgreSQL share.

The walker admits `excluded` as a table alias for
ident-completion purposes inside the DO UPDATE branch — its
columns are the target table's columns.

### 10. Execute path — three typed Command variants

`Command` gains three variants:

```
Command::SqlInsert {
    sql: String,
    source: Option<String>,         // for history.log
    target_table: String,           // for re-persistence
    listed_columns: Option<Vec<String>>,  // None if no col-list provided
}

Command::SqlUpdate {
    sql: String,
    source: Option<String>,
    target_table: String,
}

Command::SqlDelete {
    sql: String,
    source: Option<String>,
    target_table: String,
}
```

The walker's `ast_builder` for each shape extracts
`target_table` and `listed_columns` from the matched path. The
`sql` field is the validated source text the engine executes.

`Database` gains three request types:

```
Request::RunSqlInsert { command, reply }
Request::RunSqlUpdate { command, reply }
Request::RunSqlDelete { command, reply }
```

Each worker handler:

1. Applies statement-specific pre-execution work (shortid
   auto-fill for `RunSqlInsert`; cascade pre-count for
   `RunSqlDelete`).
2. Prepares + executes the statement. If RETURNING is present,
   collects the result set as a `DataResult`.
3. Re-persists the target table's CSV (ADR-0030 §11) using the
   existing persistence write-through machinery.
4. Returns the result (DataResult for RETURNING; affected-row
   count + cascade summary for plain DELETE; affected-row count
   for plain INSERT/UPDATE).

History logging: the worker writes the literal submitted line
to `history.log` (ADR-0030 §11) before execution, exactly as
`run_select` does.

### 11. Engine-error translation

Three new catalog keys (the friendly-error layer's engine.*
translation set, ADR-0019 / ADR-0030 §7):

| Key                                       | Engine condition                                                |
|-------------------------------------------|-----------------------------------------------------------------|
| `engine.unique_constraint_failed`         | The engine's `UNIQUE constraint failed: t.c` (already exists; will be re-used as-is for UPSERT-without-conflict-target cases) |
| `engine.not_null_constraint_failed`       | Already exists; covers SQL-side NOT-NULL failures uniformly     |
| `engine.foreign_key_constraint_failed`    | Already exists; SQL DELETE that violates FK without cascade     |

No NEW engine.* keys are required for Phase 3 — the existing FK
/ UNIQUE / NOT-NULL translation paths cover the engine-side DML
errors. Phase 3 reuses them.

### 12. Result-column type recovery on RETURNING

A RETURNING projection that's a bare column reference recovers
its playground type via the same column-origin metadata path
Phase 2 introduced (`resolve_select_column_types` in `db.rs`,
ADR-0032 §12 + Amendment 1).

Computed expressions in RETURNING stay typeless. The renderer
uses neutral alignment for typeless columns, consistent with
SELECT.

### 13. Out of scope

| ID    | OOS                                                                                  |
|-------|--------------------------------------------------------------------------------------|
| OOS-1 | `INSERT INTO t DEFAULT VALUES` (the project's planned seed feature covers this)      |
| OOS-2 | `INSERT OR REPLACE` / `OR IGNORE` / `OR ABORT` / `OR FAIL` / `OR ROLLBACK` (SQLite-specific) |
| OOS-3 | `UPDATE … FROM other_table` (SQLite/PostgreSQL extension)                           |
| OOS-4 | `WITH x AS (…) UPDATE …` / `WITH x AS (…) DELETE …` (deferrable; SELECT-side WITH is in scope) |
| OOS-5 | Indexed-by / not-indexed table-source modifiers (SQLite-specific)                   |
| OOS-6 | Multi-statement batches (already OOS via ADR-0030 §13 OOS-4)                        |

OOS-4 is a deliberate choice: WITH-prefixed DML is standard SQL
since SQL:2003, but its execution semantics (the WITH-defined
CTE is visible to the DML's WHERE / SET / RETURNING expressions)
require additional scope-frame plumbing that doesn't yet exist
on the DML side. The Phase-2 grammar admits WITH at the
SELECT level; Phase 3 keeps WITH a SELECT-only prefix. A later
phase can lift this restriction with a clean ADR Amendment.

## Consequences

- The walker gains a new node variant `Node::Guard` (§2
  mechanism caveat) — a zero-byte-consumption gating node that
  can fail its enclosing Seq with a `ValidationError`. The
  walker driver's exhaustive `walk_node_inner` match grows one
  arm. Reusable beyond DML (Phase 4 DDL's similar shared-entry
  problem is a likely future consumer).
- `Command` gains three variants (`SqlInsert`, `SqlUpdate`,
  `SqlDelete`); every exhaustive `match Command` callsite picks
  up three new arms (the recurring ADR-0028/0029 gotcha).
- The DSL `data::INSERT` / `data::UPDATE` / `data::DELETE`
  CommandNodes' shapes become `Choice(SQL_shape, DSL_shape)`
  with the SQL branch mode-gated via a no-op leading validator
  (R1).
- Three new walker diagnostics with three new catalog keys
  (`diagnostic.insert_arity_mismatch`,
  `diagnostic.auto_column_overridden`,
  `diagnostic.not_null_missing`).
- Three new `Request` variants
  (`RunSqlInsert`/`RunSqlUpdate`/`RunSqlDelete`) and three new
  worker handlers
  (`do_sql_insert`/`do_sql_update`/`do_sql_delete`).
- The persistence write-through path gains per-DML-kind entry
  points; the cascade-affected re-persistence in `do_sql_delete`
  is new (cascade-affected child tables are also re-persisted).
- `RETURNING` lands a write-statement-returns-rows code path
  the project hasn't had before. Worker handlers either return
  affected-row count or a `DataResult` keyed by the
  `returning: bool` flag.
- The cascade-summary formatter is reused between the DSL and
  SQL DELETE paths; the predicate-extraction byte-range
  injection (§7) is new code with the R2 mitigation.
- `shortid` auto-fill code in `do_sql_insert` mirrors the DSL
  `do_insert` mechanism but takes a `listed_columns` slice from
  the parse rather than a typed `Command::Insert`.
- The auto-snapshot machinery (ADR-0006) is unchanged but
  fires for SQL DML the same way it does for DSL DML — the
  snapshot trigger lives in the worker handlers.
- All Phase-2 tests stay green (the SQL grammar pieces this ADR
  builds on are unchanged).
- The DSL-vs-SQL dispatch mechanism (3a) is a foundation other
  projects in this codebase may need (e.g., Phase 4 DDL). Land
  it generically so it's reusable.

## Initial DA review

A first-pass DA review of this ADR's draft surfaced ten
critiques. The seven that surfaced real gaps were resolved in
the body before this status moved to Proposed:

1. OOS-4 (WITH-prefixed DML) under-justified → R4 risk
   describes the mechanism gap concretely.
2. Dispatch mechanism hand-waved → §2's "Mode-gating
   mechanism" subsection enumerates three options and commits
   to the validator approach with R1 as the mitigation if it
   fails.
3. RETURNING execution hand-waved → §5 commits to a
   `returning: bool` flag on the Command variants.
4. Cascade-summary code-sharing claim too optimistic → §7's
   "Predicate extraction" subsection details the byte-range
   injection mechanism; R2 budgets for the
   performance-pathology case.
5. `excluded` keyword's engine-neutrality status → §9 carves
   it out explicitly as a justified non-neutral choice.
6. Multi-row INSERT arity diagnostic emit pattern unspecified
   → §8.1 clarifies "each offending row emits its own
   diagnostic" plus the INSERT…SELECT span.
7. Sub-phase ordering put dispatch (the riskiest piece) last
   → reordered so 3a lands the mechanism FIRST on a no-op
   shape.

The remaining three critiques were either process points (DA
gates at sub-phase boundaries — added per-sub-phase in
Implementation notes) or already-covered (verification matrix —
will live in the plan doc, matching Phase 2's pattern).

Per CLAUDE.md / the Phase 2 process lesson, the DA review of
THIS ADR will get a second pass before status moves to
Accepted. Pin: do not rubber-stamp.

## Open implementation risks

Recorded explicitly so the implementation plan can budget for
them, not to allow silent drift later.

### R1. Mode-gated `Choice` branch mechanism unproven

§2's option (a) — a no-op validator at the head of the SQL
shape's Seq, rejecting in Simple mode — is plausible but
untested. If the walker's `walk_seq` commits the Seq before the
validator runs, the SQL branch's NoMatch becomes Failed and
the Choice can't fall through to the DSL branch.

**Mitigation:** sub-phase 3a lands the mechanism end-to-end on
the smallest viable shape (a single `SqlNop` CommandNode whose
shape is `Choice(SQL_branch[mode-gated-validator + token],
DSL_branch[different-token])`) before any DML grammar lands. If
the validator approach doesn't work, the alternatives (a
mode-aware `Node::Choice` variant, or split-CommandNode
dispatch) get evaluated before sub-phases 3b/3c/3d depend on
the mechanism.

### R2. Cascade-summary predicate extraction

§7 says the worker re-injects the WHERE-clause source bytes into
pre-count subqueries. This is straightforward when the WHERE
clause is a top-level predicate, but a future user might write
WHERE clauses that themselves contain subqueries (e.g.,
`DELETE FROM orders WHERE customer_id IN (SELECT id FROM customers WHERE …)`).
The byte-range injection still works structurally, but a
performance pathology is possible if the subquery is expensive
and the cascade child count is large (the same subquery runs N+1
times).

**Mitigation:** sub-phase 3c benchmarks the pre-count overhead
on a representative schema before declaring the path acceptable.
If the overhead is prohibitive, fall back to "report cascade
count post-execution only" (less informative, but cheaper).

### R3. RETURNING result-column type recovery on multi-row INSERT

§5 says result columns get the same column-origin type recovery
SELECT uses. For a single-row INSERT … RETURNING this works
identically to a SELECT. For multi-row INSERT, the engine emits
one result row per inserted row; the column-origin metadata is
the same for all (per ADR-0032 Amendment 1, column-origin is a
property of the prepared statement, not the rows).

No risk here — calling it out so the implementer doesn't
re-derive the question. The empty-table type-recovery test
already pins this invariant.

### R4. `INSERT … SELECT` with a CTE row source

The Phase-2 SELECT grammar already admits `WITH … SELECT …` at
the compound level (ADR-0032 §10.3 / Amendment 2). The
`SQL_SELECT_COMPOUND` recursion point §1 uses pulls this in for
free. `INSERT INTO archive WITH t AS (…) SELECT * FROM t` parses
and runs.

A user might reasonably expect WITH-prefixed DML to also work
(see §13 OOS-4). That's explicitly out of scope for Phase 3;
the discrepancy is documented here so the implementer doesn't
need to figure it out from the grammar shape.

## Implementation notes

Phased sub-plan with **explicit exit gates per sub-phase**. The
implementation plan doc (`docs/plans/<date>-adr-0033-phase-3.md`,
modelled on `docs/plans/20260520-adr-0032-phase-2.md`) refines
these and adds the cross-cut verification matrix in 3i.

**Ordering rationale.** The dispatch mechanism (3a) is the
foundation; if it fails to work, every later sub-phase has to
be reworked. Land it first on a no-op shape, then build the DML
grammar on top.

### 3a — `Node::Guard` + mode-gated Choice mechanism

**Scope (in):**
- Add a new walker node variant
  `Node::Guard(fn(&WalkContext) -> Result<(), ValidationError>)`
  (per §2 mechanism caveat) — zero-byte consumption, fails the
  enclosing Seq with `ValidationFailed` when the validator
  returns Err.
- Implement `walk_guard` in `src/dsl/walker/driver.rs`.
- Add the `reject_sql_in_simple_mode` guard function.
- Build a smoke-test grammar: an experimental `CommandNode`
  with `shape: Choice(Seq[Guard(reject_sql_in_simple_mode),
  SQL_branch_token], DSL_branch_token)`. Verify the Choice
  falls through cleanly in Simple mode (DSL wins) and routes
  SQL-first in Advanced mode.

**Scope (out):**
- Any real DML grammar (3b onwards).
- Wiring the existing `data::INSERT/UPDATE/DELETE` shapes (3j).

**Exit gate (required green tests):**
- `Node::Guard` admits via `walk_guard`; the walker driver's
  match arm exists and is tested.
- Smoke-test grammar in Simple mode → DSL branch matches.
- Smoke-test grammar in Advanced mode + SQL-shape input → SQL
  branch matches.
- Smoke-test grammar in Simple mode + SQL-shape input →
  `ValidationFailed` with `message_key:
  "advanced_mode.sql_in_simple"`.
- Smoke-test grammar in Advanced mode + DSL-shape input →
  DSL branch matches (Choice fall-through works even when SQL
  branch is admissible).
- All Phase-2 tests (1446 baseline) still green.
- Clippy clean.

**DA gate (written):**
- Does the Guard actually fire BEFORE the Seq commits? Failure
  mode: the Seq processes the Guard, the Guard fails, but the
  Seq has already "committed" to its branch and the Choice
  can't fall through. Test specifically: Advanced mode + an
  input that the DSL branch matches but the SQL branch's
  remaining tokens reject → must fall through to DSL, not
  Failed.
- Is the Guard mechanism reusable beyond DML (e.g., for
  Phase 4 DDL's similar dispatch)? If yes, document the
  pattern in `src/dsl/grammar/mod.rs` doc comments so the
  next phase doesn't reinvent it.
- If the Guard mechanism doesn't work (R1's worst case), the
  sub-phase REOPENS with mechanism option (b) or (c)
  evaluated before any DML grammar lands on top of broken
  scaffolding.

### 3b — INSERT grammar + minimal execution

**Scope (in):**
- `SQL_INSERT_SHAPE`: single-row + multi-row VALUES; the
  optional `(column_name_list)`; the `__rdbms_*` rejection on
  the table-name slot.
- `Command::SqlInsert { sql, source, target_table,
  listed_columns, returning: false }` — `returning` always
  false in 3b.
- `Request::RunSqlInsert` + `do_sql_insert` worker handler
  (no shortid auto-fill yet — explicit values required).
- Persistence write-through on the target table after
  execution.
- History-log of the literal submitted line.

**Scope (out):**
- `INSERT … SELECT` (3c).
- `RETURNING` (3e).
- `UPSERT` (3f).
- `shortid` auto-fill (3d).
- Diagnostics (3g).
- DSL-vs-SQL dispatch wiring on `data::INSERT` (3h —
  development uses a separate `data::SqlInsert` CommandNode).

**Exit gate (required green tests):**
- **Single-row INSERT** runs end-to-end; affected-row count
  surfaces; CSV is re-persisted.
- **Multi-row INSERT** (`INSERT INTO t VALUES (1, 'a'), (2, 'b')`)
  runs; both rows land; CSV reflects them.
- **`(column_name_list)`** variant works (`INSERT INTO t (a, b)
  VALUES (1, 'x')`).
- **`__rdbms_*` rejection** at the INSERT target-table slot
  (per ADR-0030 §6 / ADR-0032 §1).
- **History line** is the literal submitted SQL.
- All Phase-2 tests (1446 baseline) still green.

**DA gate (written):**
- Does an INSERT failing mid-multi-row leave persistence in a
  consistent state? (Engine transactional behaviour applies;
  worker must not re-persist on failure.)
- Does the `target_table` extraction handle case-folding the
  same way the rest of the codebase does?
- Are the existing DSL INSERT tests still green? Any DSL-side
  regression means 3b broke something it shouldn't have.

### 3c — INSERT … SELECT

**Scope (in):**
- Add the row-source `Choice` to `SQL_INSERT_SHAPE`:
  `values_clause | Subgrammar(&sql_select::SQL_SELECT_COMPOUND)`.
- Update `do_sql_insert` to handle the SELECT-driven path
  (no special worker work needed beyond preparing the
  composite statement; the engine handles the
  insert-from-query flow).

**Exit gate:**
- `INSERT INTO archive SELECT * FROM orders WHERE …` runs.
- `INSERT INTO target (a, b) SELECT x, y FROM source` runs.
- `INSERT INTO t WITH x AS (…) SELECT * FROM x` runs
  (R4 invariant — Phase-2 CTE prefix on SELECT applies).
- Schema-existence diagnostics fire on the SELECT's projection
  (Phase-2 machinery, free).

**DA gate (written):**
- Does the WITH-prefixed SELECT row source actually parse
  through SQL_SELECT_COMPOUND, or does the INSERT prefix
  break the recursion? Test on a real `WITH … INSERT … SELECT`.
- Does the cascade-summary path (3e DELETE) accidentally apply
  to INSERTs whose SELECT touches the target table? (It must
  not — INSERTs have no cascade.)

### 3d — `shortid` auto-fill (worker)

**Scope (in):**
- Detect omitted `shortid` columns at execution time in
  `do_sql_insert`.
- Synthesise fresh shortid values; bind them positionally.
- Cover both VALUES and SELECT row sources (for SELECT, the
  shortid columns are added to the projection's bind list
  before execution).

**Exit gate:**
- `INSERT INTO t (name) VALUES ('x')` where `t` has a `pk
  shortid` column auto-fills that PK.
- An explicit value for a shortid column is respected
  (override behaviour — warning fires per 3g but the value
  is preserved).
- Multi-row INSERT produces distinct shortids per row.
- INSERT … SELECT auto-fills shortids for each yielded row.

**DA gate (written):**
- Are the synthesised shortids actually unique across a
  multi-row INSERT? (The generator should be called per-row,
  not once.)
- Does the auto-fill respect the column's case-preserved name?
- What happens if the table has both `serial` and `shortid`
  PK columns? (Spec: serial via SQLite rowid; shortid via
  worker post-fill. Verify no double-fill collision.)

### 3e — UPDATE grammar + execution

**Scope (in):**
- `SQL_UPDATE_SHAPE`: target table, SET assignment list,
  optional WHERE.
- `Command::SqlUpdate`, `Request::RunSqlUpdate`,
  `do_sql_update`.
- Persistence write-through on the target table.

**Exit gate:**
- Single-column UPDATE with WHERE runs; affected-row count
  surfaces; CSV reflects changes.
- Multi-column UPDATE (`SET a = 1, b = 2`) runs.
- UPDATE without WHERE runs (per ADR-0030 §12 — no
  `--all-rows` guard).
- Schema-existence diagnostic fires on unknown columns in
  SET and WHERE.

**DA gate (written):**
- Does the SET assignment expression accept the full
  `sql_expr` (function calls, subqueries, CASE)? Verify on a
  representative expression.
- Does an UPDATE that touches an auto-gen column produce the
  3g auto-gen warning?

### 3f — DELETE grammar + execution + cascade summary

**Scope (in):**
- `SQL_DELETE_SHAPE`: target table, optional WHERE.
- `Command::SqlDelete`, `Request::RunSqlDelete`,
  `do_sql_delete`.
- Cascade-summary pre-count and post-execution formatting
  per §7.
- Persistence write-through on target + affected child tables
  (cascade-affected tables also need re-persistence).

**Exit gate:**
- DELETE with WHERE runs; affected-row count + cascade
  summary surface; CSVs of target AND cascade-affected
  children are re-persisted.
- DELETE without WHERE runs (no `--all-rows` rail).
- Cascade summary matches the DSL `do_delete` output on the
  same schema/data.
- The cascade-summary code path is shared (no duplicate
  formatter).

**DA gate (written):**
- R2 mitigation: does the predicate-extraction byte-range
  injection work on a WHERE containing a subquery? Test
  `DELETE FROM a WHERE id IN (SELECT id FROM b WHERE …)`.
- Does the cascade pre-count run in a way that doesn't see
  the post-DELETE state? (Order matters: pre-count BEFORE
  execute.)

### 3g — RETURNING

**Scope (in):**
- Add `RETURNING projection_list` to all three top-level
  shapes.
- Walker captures `returning: bool` on the Command variants.
- Worker handlers route to `DataResult` collection when
  `returning == true`.
- Result-column type recovery via the Phase-2 column-origin
  path.

**Exit gate:**
- `INSERT … RETURNING *` returns the inserted row(s) as a
  DataResult; affected-row count is implicit (rows.len()).
- `UPDATE … RETURNING id, new_value` returns the modified
  columns.
- `DELETE … RETURNING *` returns what was just deleted
  (BEFORE the row is gone).
- Result-column types recovered for bare-column refs
  (per ADR-0032 Amendment 1).
- The cascade summary on a `DELETE … RETURNING` is preserved
  alongside the result set (both surface to the user).

**DA gate (written):**
- Does the RETURNING result-set rendering use the same
  DataResult path SELECT uses? Or did a DML-specific copy
  sneak in?
- Does `RETURNING expr AS alias` admit aliases the same way
  the projection list does?

### 3h — UPSERT (ON CONFLICT … DO NOTHING / DO UPDATE)

**Scope (in):**
- Add `on_conflict_clause` to `SQL_INSERT_SHAPE`.
- Admit `excluded` as a table alias inside DO UPDATE
  expressions (§9's clarification).
- Walker captures the UPSERT shape; worker executes as-is
  (no special pre-processing).

**Exit gate:**
- `INSERT … ON CONFLICT (id) DO NOTHING` runs on a
  conflicting row; affected-row count is 0.
- `INSERT … ON CONFLICT (id) DO UPDATE SET v = excluded.v`
  runs; the conflicting row is updated to the new value.
- `ON CONFLICT DO NOTHING` (no target spec) handles any
  conflict.
- `excluded.col` completes / highlights as a column ref to the
  target table's columns inside the DO UPDATE expressions.

**DA gate (written):**
- Does ON CONFLICT … DO UPDATE trigger persistence
  write-through? (Yes; the target row is modified.)
- Is the `excluded` alias scoped to ONLY the DO UPDATE
  branch, not leaking into the outer INSERT?

### 3i — Diagnostics

**Scope (in):**
- `diagnostic.insert_arity_mismatch` (§8.1).
- `diagnostic.auto_column_overridden` (§8.2).
- `diagnostic.not_null_missing` (§8.3).
- Per-key positive + negative tests + catalog entries.

**Exit gate:**
- One positive + one negative test per new key (the
  ADR-0032 §2d exit gate pattern):
  - `insert_arity_mismatch`: `INSERT INTO t (a, b) VALUES
    (1, 2, 3)` fires; matched arity doesn't.
  - `auto_column_overridden`: `INSERT INTO t (id, name)
    VALUES (5, 'x')` (id is `serial`) fires; omitting `id`
    doesn't.
  - `not_null_missing`: `INSERT INTO t (a) VALUES (1)`
    (b is NOT NULL no default) fires; including `b` doesn't.
- Multi-row arity mismatch emits per-row, not just on the
  first offending row.
- The inherited Phase-2 diagnostic passes
  (`schema_existence`, `sql_predicate_warnings`,
  `compound_arity_diagnostics`) still fire on DML's
  `sql_expr` slots.

**DA gate (written):**
- Does `auto_column_overridden` produce a Warning, not an
  Error? (Spec: WARNING — the statement still runs.)
- Does `not_null_missing` produce a Warning? (Same.)
- Are the catalog wordings engine-neutral?

### 3j — Dispatch wiring on shared entry words

**Scope (in):**
- Convert `data::INSERT`, `data::UPDATE`, `data::DELETE`
  shapes to `Choice(SQL_shape, DSL_shape)`.
- Remove the experimental separate `data::SqlInsert` /
  `SqlUpdate` / `SqlDelete` CommandNodes that 3a-3i used as
  scaffolding.
- Verify DSL-only inputs still match the DSL branch in both
  modes.
- Verify SQL-only inputs match the SQL branch in Advanced
  mode and fail with "this is SQL" in Simple mode.

**Exit gate:**
- All existing DSL INSERT/UPDATE/DELETE tests still green
  (unmodified inputs, unmodified outputs).
- All Phase-3 SQL DML tests still green via the new dispatch
  path (the SQL branch is reached through `data::INSERT` etc.,
  not through `data::SqlInsert`).
- A structurally-ambiguous input (`delete from t where id =
  1`) in Advanced mode routes through the SQL branch
  (verifiable by checking that the cascade-summary code path
  fired — DSL and SQL paths share the formatter but the entry
  point differs).
- A DSL-specific input (`delete from t --all-rows`) in
  Advanced mode falls through to the DSL branch.

**DA gate (written):**
- Verify that no test was changed to make 3j pass — only the
  shape's structure changes; the observable behaviour is
  identical.
- Is the test count higher than before 3j? It should be (the
  scaffolding CommandNodes had their own tests; now the
  shared CommandNodes carry both).

### 3k — Verification sweep

**Scope (in):**
- Fill in the cross-cut verification matrix
  (`docs/plans/<date>-adr-0033-phase-3.md`).
- Produce the phase-exit verification report.
- DA final review (genuine, not rubber-stamp — see Phase 2's
  process lesson in `docs/handoff/20260520-handoff-28.md` §4).

**Exit gate:**
- Every matrix row green with a concrete `file::function`
  test reference.
- Final test run: zero failures, zero skips, no regressions
  from the Phase-2 baseline of 1446 / 0 / 1.
- Clippy clean.
- Report at `docs/handoff/<date>-phase-3-verification.md`
  documents all autonomous decisions with explicit
  user-confirmation pointers and the DA's written critiques
  (not just the verdict).

**DA gate (written):**
- Did filling the matrix surface gaps that need closing?
  (Phase 2 surfaced THREE; expect at least one for Phase 3.)
- Are any sub-phase exit gates above weaker than they should
  be in hindsight?
- Did any sub-phase make an autonomous decision that's not
  in this ADR? If so, flag for retroactive user approval
  before phase-exit.

## Out of scope (clarifications already established)

- **No engine type names** in or out (ADR-0030 §5 / §7).
- **No `STRICT`**, no SQLite storage clauses (ADR-0030 §7).
- **No transaction control** (ADR-0030 §13 OOS-4).
- **No multi-statement batches** (ADR-0030 §13 OOS-4).
- **No `EXPLAIN` of SQL DML** (ADR-0030 §13 OOS-2 — the DSL
  `explain` still wraps what it already wraps).

## Amendment 1 — Dispatch mechanism: category-grouped dispatch supersedes `Node::Guard` (2026-05-21)

This amendment **supersedes §2's "Mode-gating mechanism"
subsection** and **revises sub-phase 3a's scope**. It was
written during 3a implementation, after tracing the proposed
mechanism against the actual walker code, and is recorded with
explicit user approval before any code landed.

### The finding — option (a) does not work as §2 frames it

§2 selected option (a): a standalone zero-byte `Node::Guard(fn)`
placed as the first element of the SQL branch's `Seq`, inside a
`Choice(SQL_shape, DSL_shape)`. §2 framed this as the "smallest
mechanism change" that "reuses the existing validator path" with
the failure surfacing as the same `WalkOutcome::ValidationFailed`
as the entry-word gate — implying **no `walk_choice` change**.

Tracing the four required behaviours against
`src/dsl/walker/driver.rs` shows that framing is inaccurate, and
that **any** guard-in-`Choice` mechanism is forced to modify
`walk_choice`:

1. **`walk_choice` returns immediately on a branch `Failed`**
   (`driver.rs` `walk_choice`) — it only falls through to the
   next branch on `NoMatch`. So in **Simple mode**,
   `delete from t where id = 1` (which is *valid DSL*) would hit
   the SQL branch's guard, the guard would fail, and the user
   would see the "this is SQL" hint instead of the statement
   running as DSL. That contradicts §2's own "Simple mode → DSL
   only" promise. Fixing it requires a `walk_choice` change
   regardless of how the guard node is shaped.

2. **`walk_seq` treats a `NoMatch` at `idx > 0` as a hard
   `Failed`** (`driver.rs` `walk_seq`). A zero-byte Guard at
   `idx 0` still advances `idx` to 1, pushing the SQL branch's
   first real token to `idx 1`. So in **Advanced mode**,
   `delete from t --all-rows` (DSL-only) cannot fall through to
   DSL — the SQL branch returns `Failed`, and `walk_choice`
   won't try the DSL branch. The literal form needs an
   *additional* `walk_seq` semantics change (`idx == 0` → "no
   bytes consumed yet").

This is exactly the R1 risk the ADR budgeted for ("if `walk_seq`
commits the Seq before the validator runs … the Choice can't
fall through").

### The replacement — category-grouped, mode-aware dispatch

Rather than gate a `Choice` branch from inside the recursive
grammar, the dispatch decision moves up to the command
dispatcher (`walker::walk`), which is **where mode gating
already lives**: the existing whole-command gate
(`is_advanced_only` + the Simple-mode short-circuit in `walk`)
is precisely this pattern, hardcoded for `select` / `with`. The
amendment generalises that existing code instead of inventing a
parallel in-grammar mechanism.

Concretely:

- **Each registry entry carries a category** —
  `CommandCategory::{Simple, Advanced}`. The category lives at
  the registration site (the `REGISTRY` table), not as a field
  on every `CommandNode` literal, because category is a
  dispatcher concern, not intrinsic to a command's grammar.
  `select` / `with` (and, from 3b, the SQL `insert` / `update` /
  `delete` nodes) are `Advanced`; everything else is `Simple`.
  This subsumes `ADVANCED_ONLY_ENTRIES`: `is_advanced_only(entry)`
  becomes "every registry candidate for this entry word is
  `Advanced`".

- **A shared entry word has a node in both groups** — e.g. a
  `Simple` DSL `insert` node and an `Advanced` SQL `insert`
  node, both with entry word `insert`. The entry-word lookup
  returns *all* candidates; the dispatcher selects by mode.

- **Mode-aware selection in `walk` (no combinator changes):**
  - **Simple mode** considers `Simple` candidates. If a Simple
    candidate exists, it is committed. If none exists but an
    `Advanced` candidate does (a pure-SQL entry word like
    `select`), the dispatcher emits the
    `advanced_mode.sql_in_simple` `ValidationFailed` — identical
    to today's whole-command gate.
  - **Advanced mode** tries `Advanced` candidate(s) first, then
    the `Simple` candidate as a fallback. The first candidate
    that fully matches wins (`delete from t where id = 1` →
    SQL; `delete from t --all-rows` → falls through to DSL
    because the SQL shape doesn't match `--all-rows`).

- **No `Node::Guard`, no `walk_choice` / `walk_seq` change.** The
  recursive combinators are untouched; each candidate is walked
  cleanly and independently. Speculative evaluation runs on a
  scratch `WalkContext`; only the committed candidate is walked
  into the caller's context, so post-walk context state
  (completion / hint accumulators) always reflects the chosen
  candidate.

### The two details (user-approved)

1. **"This is SQL" hint in Simple mode for SQL-shaped input.**
   For SQL-only input in Simple mode (e.g.
   `delete … returning *`), the Simple candidate won't match.
   To keep parity with `select` / `with`, the dispatcher
   speculatively checks whether the `Advanced` candidate *would*
   match; if so, it emits the `advanced_mode.sql_in_simple`
   hint rather than a generic DSL parse error. Lives entirely
   in `walk`.

2. **Advanced-mode completion is SQL-first, DSL as full-line
   fallback.** The `Choice` approach would have unioned SQL and
   DSL continuations mid-typing for free; the grouping approach
   surfaces one candidate's walk for completion / hints. The
   decision is to show SQL completions primarily in Advanced
   mode and fall to DSL only on a full DSL-shaped line — simpler
   and matching the "SQL-first, DSL as familiar fallback"
   intent. No unioned completion.

### Consequences of the amendment

- **`Node::Guard` is NOT added.** §2's mechanism caveat,
  Consequences bullet 1 ("the walker gains a new node variant
  `Node::Guard`"), and the original 3a scope are withdrawn.
- **`REGISTRY`'s element type changes** to carry the category
  alongside each `&CommandNode`. The handful of registry
  iteration sites (entry-word listing, `command_for_entry_word`,
  the completion entry-word filters, `note_help`) destructure the
  tuple; no behavioural change for the current single-candidate
  registry.
- **The `Command` variants (§10), worker handlers, RETURNING
  flag, cascade-summary plan, shortid auto-fill, and diagnostics
  are unaffected** — this amendment is purely about how the
  dispatcher routes a shared entry word to the right grammar.
- **Sub-phase 3a now validates the dispatch mechanism**, not the
  guard: a smoke registry with a shared entry word (a `Simple`
  and an `Advanced` node using distinguishable tokens) exercises
  the Simple/Advanced selection, the fallback, and the
  "this is SQL" path. The 3a exit gate's four cases map onto the
  new mechanism unchanged in spirit.
- **3b adds the first real shared entry words** to `REGISTRY`
  (the SQL `insert` / `update` / `delete` nodes). At that point
  the completion entry-word *lists* will need to de-duplicate
  the shared primary (two candidates, one entry word) — tracked
  for 3b, not 3a.
- The mechanism remains reusable for Phase 4 DDL's analogous
  shared-entry problem (`create`, …): tag the SQL DDL nodes
  `Advanced`; the dispatcher handles the rest.

## Amendment 2 — Cascade summary on SQL DELETE: count-diff parity supersedes WHERE-injected pre-count (2026-05-22)

This amendment **supersedes §7's "Predicate extraction" subsection
and the pre-count mechanism in §7 steps 1–3**, and **withdraws Open
Implementation Risk R2**. It was written during sub-phase 3f, after
tracing §7's prescribed mechanism against the actual DSL `do_delete`
handler, and is recorded with explicit user approval before any 3f
code landed.

### The finding — §7's premise about the DSL handler is incorrect

§7 prescribes a per-child **pre-execution pre-count** of the form
`SELECT count(*) FROM <child> WHERE <fk_col> IN (SELECT <pk_col> FROM
<target_table> WHERE <user_where_predicate>)`, with the
`<user_where_predicate>` re-injected from the SQL DELETE's
WHERE-clause source bytes. Its stated justification:

> "The DSL handler reuses the typed `Expr` AST to construct the
> pre-count subqueries."

Tracing `do_delete` in `src/db.rs` shows this premise is factually
wrong. The DSL handler does **not** construct pre-count subqueries
from the typed `Expr`. It detects cascade effects by **before/after
row-count diffing**: read the inbound relationships, count each child
table's rows *before* the DELETE, execute the DELETE inside a
transaction, count each child *after*, and report the positive
difference as a `CascadeEffect`. No pre-count query and no
`Expr`-derived subquery exist anywhere in the path.

### Two consequences of the correct mechanism

1. **Count-diff reports `ON DELETE CASCADE` row removals only.** `ON
   DELETE SET NULL` does not change a child's row count, so the DSL
   path does not report it (its own code comment notes this).
   Reporting SET NULL would require value-level diffing.

2. **§7's pre-count is in tension with the parity promise.** §2
   promises "a DSL-style … DELETE in Advanced mode produces identical
   observable behaviour whether it routes through the SQL or DSL
   path", and the plan's 3f exit gate requires the SQL cascade summary
   to *match* the DSL `do_delete` output. A pre-count that counted
   `ON DELETE CASCADE` *and* `SET NULL` (as §7 step 1 says) would
   report **more** than the DSL path — breaking that parity. To match
   the DSL path it would have to drop SET NULL anyway, leaving
   WHERE-byte extraction and the R2 N+1-subquery pathology as pure
   cost with no benefit over count-diff.

### The replacement — count-diff parity

`do_sql_delete` mirrors `do_delete` exactly: read inbound
relationships, snapshot child row counts, run the **verbatim**
validated DELETE SQL inside a transaction via
`execute_with_fk_enrichment`, diff the child counts, build the same
`Vec<CascadeEffect>`, and re-persist the target plus every
cascade-affected child before commit. Because both handlers return
the same `DeleteResult` and both route through
`CommandOutcome::Delete` → `handle_dsl_delete_success`, the
per-relationship summary formatter is **already shared at the render
layer** — no formatter refactor and no duplicate path.

This dominates the §7 mechanism on every axis the 3f exit gate
measures:

- **Exact parity** with the DSL path — identical detection mechanism,
  not merely a matching formatter.
- **No WHERE-clause byte extraction.** The verbatim DELETE (including
  any subquery in its WHERE) executes once; the engine cascades; the
  diff observes the result. The R2 invariant case (`DELETE FROM a
  WHERE id IN (SELECT … )`) is correct by construction and carries no
  N+1 cost. **R2 is withdrawn.**
- **Shared render** with no duplicate formatter, satisfying §7's
  shared-formatter promise structurally.

### SET NULL reporting — deferred for both paths (user-confirmed)

3f keeps strict DSL parity: `ON DELETE CASCADE` row removals are
reported; `SET NULL` is not, on either path. Adding SET NULL
reporting (value-level diffing) is a tracked future enhancement to be
applied to **both** `do_delete` and `do_sql_delete` together so
parity is preserved. The user confirmed this deferral.

### Consequences of the amendment

- §7's "Predicate extraction" subsection and pre-count steps 1–3 are
  superseded by count-diff; the per-relationship summary and
  shared-formatter outcomes are unchanged.
- **R2 (cascade-summary predicate extraction) is withdrawn** — the
  mechanism it budgeted for is no longer used.
- `Command::SqlDelete` carries `{ sql, target_table }` only — no
  WHERE-clause field is needed (the worker never inspects the
  predicate).
- The handoff-31 §4 "WHERE-byte-extraction is tractable for DELETE"
  heads-up is moot — no extraction happens.

## Amendment 3 — Command identity is intrinsic; execution-mode is side-channel (deferred) (2026-05-23)

This amendment **clarifies the command-identity model** that §2 and
Amendment 1 implement, **corrects a false premise in the
implementation plan's sub-phase 3j exit gate**, and **records a
deferred follow-up** (the execution-time mode side-channel). It was
written during sub-phase 3j — wiring the shared `insert` / `update` /
`delete` entry words — after the dispatch model's interaction with the
existing test suite surfaced a question about what a "command" *is* and
how input mode relates to it. Recorded with explicit user approval
before any 3j code landed.

### The model — a command is a mode-rooted grammar-path outcome

A command is the typed outcome of a grammar path **rooted at the input
mode**: `simple → update → …` or `advanced → update → …`. Its identity
is **intrinsic** — determined by *which grammar matched*, not by a mode
flag carried on one shared command type. In Advanced mode the
dispatcher (Amendment 1) tries the Advanced (SQL) candidate(s) first and
**falls back to the Simple (DSL) candidate** when no Advanced branch
recognizes a token; the fallback produces the **Simple** command.

Worked example (the `--all-rows` fall-through): `delete from T
--all-rows` in Advanced mode — no Advanced (SQL) branch recognizes
`--all-rows` (the SQL `DELETE` has no expression slot after the table
to absorb it), so the path falls back to `simple → delete → … →
all-rows` and yields `Command::Delete { filter: AllRows }` (the DSL
command), exactly as it would in Simple mode. The mode rooted the path;
the command that came out is the Simple one.

*Counter-example (no fall-through):* `update T set x = 42 --all-rows`
in Advanced mode does **not** fall back — the SQL `UPDATE`'s
`SET <expr>` greedily consumes `--all-rows` as the expression
`42 - -all - rows` (with `all`/`rows` as column refs), so the SQL shape
matches and `Command::SqlUpdate` is produced. Whether `--all-rows` is a
DSL flag or part of a SQL expression is decided by which grammar wins,
not by special-casing the token. (At execution this is harmless: the
engine treats `--all-rows` as a SQL line comment, so the statement runs
as `update T set x = 42` — all rows — the same effect as the DSL flag.)

### Simple mode commits the DSL candidate; an advanced-mode pointer combines

In **Simple mode** a shared entry word **always commits its DSL
candidate**, so the user sees DSL completion and the *real* DSL error
(with its position) — not a bare "this is SQL" that discards the
actionable detail. A plain `ValidationFailed`/`Mismatch` "this is SQL"
dispatch outcome is reserved for entry words that have **no** DSL form
(`select` / `with`); for those the DSL surface has nothing to offer, so
the simple-mode gate points at advanced mode (ADR-0030 §2).

To keep the SQL-discoverability the original §2 envisioned *without*
losing the DSL fix, the rendering layer **combines** them: when a
simple-mode line is a *definite* DSL error (not merely incomplete) and
the same line would parse in advanced mode, the DSL error prose is
suffixed with the `advanced_mode.also_valid_sql` pointer — e.g.
`for \`Name\`: Type a quoted string … (valid as SQL in advanced mode —
\`mode advanced\` or prefix \`:\`)`. The DSL detail and the mode hint
coexist (`input_render::ambient_hint_in_mode` /
`advanced_alternative_note`). Mid-typing (incomplete) input is not
suffixed, to avoid noise during normal DSL entry. This supersedes the
draft of this amendment's earlier suggestion that a SQL-shaped line in
simple mode emits a *bare* "this is SQL" hint for shared words.

### Overlapping inputs produce two distinct commands — both are tested

For a **fully-overlapping** shape — `insert into T values (…)` is valid
in both grammars — Simple mode yields `Command::Insert` (a typed AST,
DSL execution) and Advanced mode yields `Command::SqlInsert` (validated
text, SQL execution). This is **correct, not a defect**: the two
commands *do the same thing but execute differently* (ADR-0030 §4 — the
DSL lowers to a typed AST; SQL is grammar-as-text run verbatim). Because
they are distinct commands, **each is tested in the mode that produces
it** — DSL grammar tests run in Simple mode; the SQL command variants
are tested in Advanced mode.

### Correction to the plan's 3j exit gate

The implementation plan's sub-phase 3j exit gate says "all existing DSL
`INSERT`/`UPDATE`/`DELETE` tests still green — unmodified inputs,
unmodified outputs", and its scope-out says "observable behaviour must
be identical before and after 3j". That rested on a **false premise**:
that the existing DSL DML tests run in Simple mode. They do not — the
common helpers (`ok`/`err`/`parse`) call `parse_command`, which
**defaults to Advanced mode**. Today that is harmless because
`insert`/`update`/`delete` have no SQL competitor, so they route to the
DSL command even in Advanced mode. Once they become **shared** entry
words, §2's Advanced-mode **SQL-first** rule routes the overlap to the
SQL command (`Command::Sql*`), changing the parsed variant those tests
assert.

The corrected invariant: **Simple-mode behaviour is unchanged; Advanced
mode is SQL-first per §2; the DSL grammar is tested in Simple mode (its
canonical surface, ADR-0003); both command variants are tested in their
producing mode.** No *production* behaviour regresses — the §6 (shortid
auto-fill) and §7 (cascade summary) parity promises keep the two paths
observably equivalent for overlapping inputs; only the
implementation-internal `Command` variant differs, and only in Advanced
mode.

### Replay uses the same parser, in advanced mode — and needs no per-line mode

A consequence worth stating explicitly, because it is easy to
over-think (and a prior reading of this work did). **Replay parses
each recorded line with the same schema-aware parser the interactive
path uses, in advanced mode** (the full surface) — it skips nothing
and simplifies nothing. The only difference from an interactive line
is the mode argument: interactive uses the user's current mode,
replay fixes it to advanced.

This is correct and complete **without** the per-line execution-mode
side-channel below, because of the §6 (shortid auto-fill) and §7
(cascade-summary) parity guarantees: a valid overlapping command
produces *identical effects* whether it lowers to the DSL variant or
the SQL variant. So replaying a log of valid commands in advanced
mode is identical to typing those lines interactively in advanced
mode — and, by parity, identical in effect to the simple-mode entry
that originally produced them. (Replay only ever sees lines that
already executed successfully: `history.log` is success-only, and
ADR-0034's deferred journal replays `ok` lines only.)

The one observable nuance is purely in **error reporting for an
invalid line** — which only arises from a hand-built script, never
from a real journal. Such a line is still rejected and not applied;
*where* the rejection lands just follows the grammar, exactly as
interactively: an `insert … values …` line is SQL in advanced mode,
so a wrong column-type value is rejected by the engine at execute
time rather than by a DSL typed slot at parse time. This is **not**
replay using a lesser parser — it is the same advanced-mode parse a
user would get typing the line. Replay therefore does **not** depend
on the deferred side-channel below.

### Execution-time mode is side-channel — deferred to its own ADR

Independent of command *identity*, every command should — at **execution
time** — know which of three modes it ran under: `simple`, `advanced`,
or `advanced-one-shot` (the `:` escape from Simple mode, ADR-0003), so
execution can adjust **output** without changing **identity** (e.g. a
Simple-mode `create table` echoing the generated SQL when run in
Advanced mode, while staying silent in Simple mode).

Today this exists only as a **rendering** side-channel
(`OutputLine.mode_at_submission`, consumed by the echo-line mode tag in
`ui.rs`). The `Mode` enum is two-way (`Simple` / `Advanced`); the
one-shot distinction is a transient "effective mode" collapsed at
submission; and neither `Action::ExecuteDsl` nor the database worker
carries any mode. Wiring an **execution-time** mode side-channel —
widening `Mode` to the three-way distinction and threading it through
the `Action` → worker interface — is **out of scope for Phase 3** and is
**deferred to its own ADR** (user-confirmed). It is *not* required for
Phase 3's dispatch to be correct: the routing and the `--all-rows`
fall-through above are complete on the two-way mode. This amendment
forward-references that future ADR so the requirement is not lost.

### Consequences of the amendment

- **No code or grammar change** from this amendment itself — it records
  the model that the dispatch (Amendment 1) already implements and
  corrects the plan's test-mode premise.
- The 3j test migration **moves the DSL grammar / completion tests to
  Simple mode** (inputs and `Command::Insert`/`Update`/`Delete`
  assertions unchanged), **keeps the `--all-rows` fall-through tests in
  Advanced mode** (they validate the fallback to the Simple command),
  relies on the migrated `tests/sql_*.rs` for the SQL command variants
  in Advanced mode, and adds dispatcher routing tests
  (Advanced + structurally-ambiguous → SQL; Advanced + `--all-rows` →
  DSL fallback; Simple + a SQL-only *entry word* `select` → "this is
  SQL"; Simple + a shared word with a SQL-only *construct* → DSL error,
  carrying the combined pointer at the rendering layer).
- The **combined pointer** lives in `input_render::ambient_hint_in_mode`
  (live typing) and `App::dispatch_dsl` (submit), both via the shared
  `advanced_alternative_note`, so a Simple-mode definite DSL error that
  would run as SQL gains the `advanced_mode.also_valid_sql` suffix in
  both surfaces (the submit path covers SQL constructs that surface
  only on submit, e.g. `delete … returning`). Found via the `/runda`
  round.
- **Internal-table rejection symmetrised** (`/runda` finding B,
  ADR-0030 §6): the DSL data-command target slots (`insert` / `update` /
  `delete` / `show data` / `show table`) gained
  `reject_internal_table`, so `__rdbms_*` tables are refused in Simple
  mode too — previously only the advanced SQL grammar rejected them, so
  a simple-mode DML could touch the internal metadata tables.
- The **execution-time mode side-channel + three-way `Mode`** is tracked
  as a future ADR; `OutputLine.mode_at_submission` remains the only mode
  side-channel until then. No structure built in 3j assumes mode is
  irrelevant to execution.

## Amendment 4 — `update … --all-rows` falls back to the DSL; Amendment 3's counter-example was a bug (2026-05-27)

Designing the DSL → SQL teaching echo (ADR-0038) re-examined Amendment 3's
**counter-example** — the claim that `update T set x = 42 --all-rows` in
Advanced mode does *not* fall back to the DSL because the SQL `UPDATE`'s
`SET <expr>` "greedily consumes `--all-rows` as the expression
`42 - -all - rows` (with `all`/`rows` as column refs)." **That is a bug,
not correct behaviour**, and this amendment reverses it. Recorded with
explicit user approval (2026-05-27).

### Why it is a bug

- The walker has **no `--` comment support** (it lexes two minus
  operators); the engine *does* treat `--` as a line comment. So the
  walker's parse (`42 - -all - rows`) **diverges from execution** (which
  runs `update T set x = 42`).
- Absent columns literally named `all` and `rows`, the walker is
  **silently accepting a `SET` expression over non-existent columns** —
  precisely the case ADR-0027 says to flag ("mark error if we know it
  will fail at runtime"). It only *appears* to succeed because the
  engine's comment leniency masks the divergence.
- **Trailing SQL comments are not a supported feature** of the surface,
  so relying on the engine to treat `--all-rows` as one is wrong.

So the SQL `UPDATE` shape should **not** match `… --all-rows`.

### Decision

The `--all-rows` token sequence makes the **SQL `UPDATE` shape fail**, so
dispatch **falls back to the DSL** `Command::Update { filter: AllRows }`
— restoring symmetry with `delete … --all-rows`, which already falls
back (the SQL `DELETE` has no slot to absorb the flag). Consequences:

- `update … --all-rows` in Advanced mode is the **DSL** command, runs all
  rows via the safe DSL path, and (ADR-0038) gains the teaching echo
  `UPDATE T SET … ` with the `WHERE` omitted.
- **No `--` comment feature is introduced**; the playground still does
  not support trailing SQL comments (consistent with the rest of the
  surface, engine-neutral).

### Mechanism — key on the **adjacent `--`**

The DSL `--all-rows` is matched by `Node::Flag("all-rows")` via
`consume_flag` (`walker/lex_helpers.rs`), which requires an **adjacent
`--`**. The SQL expression grammar has no flag node, so it consumes the
same source as `- -all - rows` (two minus operators, `all`/`rows` as
column refs) and the SQL `UPDATE` shape wins (SQL-first). Crucially,
`--all-rows` and a legitimate `- -all` tokenise *identically* once split,
so the **only** robust discriminator is the adjacency of the two dashes:

- **The fix:** the SQL expression treats an **adjacent `--`** as a
  boundary it will not consume, so the SQL `UPDATE` shape **fails** there
  and dispatch falls back to the DSL, whose `Node::Flag` matches
  `--all-rows`. (Adjacency is what separates the DSL flag from real
  arithmetic — see the preserved case below.)

Verified consequences (probed against the current grammar), which the
build must lock down test-first:

- `update T set x = 42 --all-rows` → DSL `Update { AllRows }` *(the goal;
  inverts `sql_dml_e2e.rs::e2e_update_all_rows_in_advanced_does_not_fall_back_to_dsl`)*.
- `update T set x = 42 - -3` → **unchanged**, stays `SqlUpdate` (= 45):
  the dashes are space-separated, not an adjacent `--`. This invariant is
  non-negotiable and gets its own test.
- `update T set x = 42--3` and `update T set x = 42 --col` *(adjacent
  `--` before a number or a real column)* → today they parse `Ok` as
  `SqlUpdate`; after the fix the SQL expression stops at `--` and the DSL
  flag grammar (which accepts only `all-rows`) rejects the rest, so they
  become **parse errors**. This is a deliberate, acceptable behaviour
  change for these contrived adjacent-dash inputs — the playground does
  not support `--` line comments, so there is no valid reading of an
  adjacent `--` here. *(User heads-up flagged 2026-05-27.)*

The existing `delete … --all-rows` fall-back is unaffected. Folded into
the ADR-0038 echo effort (the fix that makes `update … --all-rows`
echoable).

## Amendment 5 — `also_valid_sql` fires on *validity*, not just *parse* (2026-05-28)

Amendment 3 introduced the `advanced_mode.also_valid_sql` pointer and
described its gate as *"the same line would parse in advanced mode."*
Issue #1 surfaced that this is too weak: an advanced-mode positional
`INSERT INTO T VALUES (…)` with a value count that doesn't match the
target table's column count **parses** (the SQL grammar accepts any
positional value count) but then fails at the engine. The pointer
fired, telling the user to switch modes, and the same line failed
again — the suggestion was misleading. Recorded with explicit user
approval (2026-05-28).

### Why "parse" was the wrong bar

"Would parse" is a purely syntactic check: the structural grammar
accepts the input. It says nothing about whether subsequent static
checks (type slots, INSERT arity, predicate warnings, schema-existence
diagnostics, …) would also pass. The cross-mode pointer's job is to
direct the user to the mode that **actually** runs the line; a
syntactic-only gate over-promises in every case where a non-grammar
static check would still reject. The Form B positional arity case is
just the first one observed.

### "Valid" — the explicit definition used by this amendment

The pointer fires only when the line is **valid** in advanced mode,
where **valid** is the ADR-0027 sense — operationally, the value
returned by `crate::dsl::walker::input_verdict_in_mode(input, schema,
Mode::Advanced)` is `None`:

- The structural parse succeeds (`WalkOutcome::Match`), **and**
- No diagnostic of `Severity::Warning` or `Severity::Error` is
  produced by any pass that contributes to `input_verdict_in_mode`
  (schema-existence, SQL-predicate warnings, compound-arity errors,
  INSERT-arity errors — Form A *and* Form B — and any further
  diagnostic added to the pipeline in the future).

This is the same condition the in-app `[ERR]` / `[WRN]` validity
indicator uses (ADR-0027 §2). Tying the pointer to it gives one
canonical "is this line OK in this mode?" answer that every surface
shares — by construction, the indicator and the pointer can never
disagree about a given line.

### Decision

- `advanced_alternative_note` (`src/input_render.rs`) reads
  `input_verdict_in_mode(…, Mode::Advanced)`. The pointer fires only
  for verdict `None`.
- Any static diagnostic that participates in the verdict automatically
  participates in the gate. No bespoke pre-flight check lives in the
  pointer helper.
- The Amendment 3 description (*"would parse in advanced mode"*)
  should be read as a synonym for **valid** in this amendment's
  sense, not as a literal syntactic-only test.

### Diagnostic added to close the issue #1 gap

ADR-0033 §8.1's existing `dml_insert_arity_diagnostics` checked only
Form A (column list present). Its own doc-comment recorded the Form B
case as deferred. This amendment closes that gap: the same function
now also checks the Form B case (no column list) by comparing each
VALUES tuple's arity against the target table's full column count from
the schema cache, emitting a new diagnostic key
`diagnostic.insert_arity_mismatch_form_b`. The `[ERR]` validity
indicator now lights up at typing time for the user-reported scenario,
and the cross-mode pointer gate sees the verdict and stays silent.

### Tests

The amendment's behavioural promise is locked down by:

- `dsl::walker::tests::insert_form_b_arity_mismatch_under_supply_fires` —
  the user's reported 4-col / 3-value case produces an Error diagnostic.
- `dsl::walker::tests::insert_form_b_arity_mismatch_over_supply_fires` —
  symmetric case, 5 values for a 4-col table.
- `dsl::walker::tests::insert_form_b_arity_match_is_silent` — happy path
  (every column supplied, autos included) stays silent.
- `dsl::walker::tests::insert_form_b_arity_unknown_table_is_silent` —
  defensive: an unknown table defers to the schema-existence pass.
- `input_render::tests::ambient_hint_omits_advanced_pointer_when_form_b_value_count_wouldnt_match` —
  the pointer is suppressed for the bug case via the validity verdict.
- `app::tests::simple_mode_submit_of_sql_construct_appends_advanced_pointer` —
  multi-row VALUES (a true SQL-only construct) still fires the pointer
  when the table exists in the schema.

### Behaviour preserved

The pointer still fires for every input that is genuinely SQL-only
syntax in simple mode and is valid as advanced SQL — `delete …
returning *`, multi-row VALUES against a known table, `INSERT … ON
CONFLICT …` over a known target, and so on. The narrowing is
**exactly** the set of inputs where switching modes wouldn't help, no
more.

## See also

- ADR-0005 — the ten-type vocabulary INSERT works with.
- ADR-0006 — the auto-snapshot before destructive ops.
- ADR-0014 — the DSL DML model + cascade-summary precedent.
- ADR-0015 — the persistence write-through path.
- ADR-0018 — the DSL INSERT forms (A / B / C).
- ADR-0019 — the friendly-error layer DML errors route through.
- ADR-0027 — the validity indicator surfaces DML diagnostics.
- ADR-0030 — the SQL surface in advanced mode; Phase 3 is this
  ADR's commission.
- ADR-0031 — the SQL expression grammar, consumed by DML
  WHERE / SET / VALUES / RETURNING.
- ADR-0032 — the SQL SELECT grammar; `INSERT … SELECT` and
  RETURNING reuse its projection-list and scope-frame machinery.