c12ed1da9a
Three layered fixes for advanced/simple-mode positional INSERT
value-count mismatches (e.g. `insert into T values (...)` with the
wrong number of values for T's column count), plus ADR-0033
Amendment 5 recording the gate refinement.
Walker diagnostic (dml_insert_arity_diagnostics): the function's own
doc-comment recorded the no-column-list (Form B) case as deferred.
This commit closes that gap. Form B mismatches now emit a new
diagnostic.insert_arity_mismatch_form_b ERROR per offending tuple,
keyed off the target table's column count from the schema cache. The
[ERR] validity indicator (ADR-0027) lights up at typing time for the
reported scenario, no longer needing a submit.
Cross-mode pointer gate (advanced_alternative_note): refactored from
a hand-rolled Form B count check to a single input_verdict_in_mode(
input, schema, Mode::Advanced) call. The pointer fires only when the
verdict is None — the ADR-0027 sense of "valid". Any future static
check added to the verdict pipeline participates automatically; no
per-feature maintenance.
Teaching notes for the value-count cases users can hit before the
indicator turns red:
- simple-mode submit: insert.form_b_extra_values_note covers under-,
in-window, and over-supply against the Form B contract; suppressed
when the cross-mode pointer fires (to avoid parallel advice).
- advanced-mode dispatch pre-flight:
insert.form_b_positional_count_mismatch_note catches a submitted
mismatch with a teaching message before the engine produces its
raw NOT-NULL / type error.
The advanced_mode.also_valid_sql pointer wording was reworked to
"trying to write SQL? switch with `mode advanced`, or prefix `:` to
run once". One insta snapshot regenerated.
ADR-0033 Amendment 5 records the gate change: Amendment 3's "would
parse in advanced mode" now reads as "valid in advanced mode" in the
explicit verdict-is-None sense, with the precise definition spelled
out so future readers can't drift back to the syntactic-only reading.
ADR-0000 index entry updated; docs/requirements.md H1a citation added
listing the three new pedagogical strings.
Tests added (8): four walker arity tests (under-supply, over-supply,
match, unknown-table); two app-level teaching-note tests for the
sibling cases (under-supply, over-supply beyond total); one pointer-
gate unit test pinning the bug-case suppression; one gate-precedence
test ensuring only one advice line per error. Existing
simple_mode_submit_of_sql_construct_appends_advanced_pointer updated
to use a known schema (the new validity gate requires it).
Full suite: 2031 passed, 0 failed, 0 unexpected skips. Clippy clean.
1665 lines
73 KiB
Markdown
1665 lines
73 KiB
Markdown
# ADR-0033: The full SQL DML grammar — `INSERT` / `UPDATE` / `DELETE`
|
||
|
||
## Status
|
||
|
||
Accepted
|
||
|
||
Phase 3 implemented and verified through sub-phase 3k
|
||
(2026-05-23); see `docs/handoff/20260523-phase-3-verification.md`
|
||
for the phase-exit report and the filled cross-cut matrix in
|
||
`docs/plans/20260520-adr-0033-phase-3.md`. Amendments 1–3 below
|
||
are part of this acceptance.
|
||
|
||
> **Forward note (2026-05-26, ADR-0036 Accepted).** ADR-0036 **augments**
|
||
> the §10 model — it does **not** change verbatim execution and does
|
||
> **not** collapse the two-command identity of **Amendment 3** (which
|
||
> stands). It adds a *value-validation* step: `INSERT`/`UPDATE` literal
|
||
> values are parsed to typed `Value`s, **validated** against the column
|
||
> type (sharing the DSL's validators), and **retained** on the command
|
||
> (so a constraint error can show the offending value) — then the
|
||
> statement still executes **verbatim**. `Command::SqlInsert` gains a
|
||
> captured-literals payload; its execution and `plan_shortid_autofill`
|
||
> are unchanged. Expressions, `WHERE`, `INSERT … SELECT`, `RETURNING`,
|
||
> UPSERT, and `SELECT` are untouched. (The serial/shortid auto-fill
|
||
> difference vs simple mode is tracked separately as `requirements.md`
|
||
> X4.)
|
||
|
||
## Context
|
||
|
||
ADR-0030 commissions advanced mode as a body of SQL grammar
|
||
inside the unified grammar tree. Phase 1 ("foundations + first
|
||
`SELECT`") and Phase 2 ("`SELECT` — full") have shipped through
|
||
ADR-0031 and ADR-0032 respectively. The next slice on ADR-0030's
|
||
roadmap is **Phase 3 — DML**: `INSERT` / `UPDATE` / `DELETE` as
|
||
validated SQL.
|
||
|
||
ADR-0030 §3 fixed the headline scope:
|
||
|
||
> **`INSERT`** (single- and multi-row), **`UPDATE`**, **`DELETE`**.
|
||
|
||
ADR-0030 §4 fixed the execute path:
|
||
|
||
> **DML and `SELECT`** → executed as the **validated SQL itself**
|
||
> (re-rendered canonically from the matched parse, or the validated
|
||
> original text). … For DML the worker — knowing the statement kind
|
||
> and target table from the parse — runs the statement and
|
||
> re-persists that table's CSV.
|
||
|
||
ADR-0030 §11 fixed the persistence model. ADR-0030 §12 fixed the
|
||
safety posture: SQL `UPDATE`/`DELETE` without `WHERE` executes as
|
||
written (no `--all-rows` rail).
|
||
|
||
ADR-0030 §3 also flagged this ADR's existence in advance:
|
||
|
||
> Settle multi-row `INSERT` and `shortid` auto-fill on a SQL
|
||
> `INSERT`.
|
||
|
||
This ADR fixes the **grammar shape**, the **dispatch
|
||
architecture** for shared entry words, the **diagnostic surface**
|
||
new to DML, and the **execute-path** Command variants. It does
|
||
not revisit ADR-0030's controlling decisions; references in this
|
||
text to ADR-0030 §X mean "that decision is controlling."
|
||
|
||
### What ADR-0030, ADR-0031, ADR-0032 already fix
|
||
|
||
- One walker, grammar-as-text execution, ambient assistance for
|
||
free (ADR-0030 §1/§4/§6/§8).
|
||
- `sql_expr` is the shared expression grammar — UPDATE `SET`
|
||
expressions, WHERE predicates, INSERT VALUES tuples, RETURNING
|
||
projection items all consume it (ADR-0031, ADR-0032 §5/§6).
|
||
- `Node::ScopedSubgrammar` + `from_scope_stack` carry CTEs and
|
||
per-statement scope into nested constructs (ADR-0032 §10).
|
||
- The diagnostic pipeline — `schema_existence_diagnostics`,
|
||
`sql_predicate_warnings`, `compound_arity_diagnostics`,
|
||
`pending_diagnostics` accumulator — applies uniformly to DML's
|
||
`sql_expr` slots (ADR-0032 §11).
|
||
- The friendly-error layer translates engine messages through the
|
||
`engine.*` catalog (ADR-0019, ADR-0030 §7, ADR-0032 §11.5).
|
||
- The `__rdbms_*` rejection at every table-source slot
|
||
(ADR-0030 §6, ADR-0032 §1/§4).
|
||
- `MAX_SUBGRAMMAR_DEPTH = 64` (ADR-0026) is the recursion budget.
|
||
|
||
## Decision
|
||
|
||
### 1. The top-level statement shapes
|
||
|
||
```
|
||
insert_statement := INSERT INTO table_name
|
||
[ '(' column_name_list ')' ]
|
||
( values_clause | sql_select_compound )
|
||
[ on_conflict_clause ]
|
||
[ returning_clause ]
|
||
[ ';' ]
|
||
|
||
update_statement := UPDATE table_name
|
||
SET assignment_list
|
||
[ where_clause ]
|
||
[ returning_clause ]
|
||
[ ';' ]
|
||
|
||
delete_statement := DELETE FROM table_name
|
||
[ where_clause ]
|
||
[ returning_clause ]
|
||
[ ';' ]
|
||
```
|
||
|
||
Where:
|
||
|
||
```
|
||
values_clause := VALUES value_tuple ( ',' value_tuple )*
|
||
value_tuple := '(' sql_expr ( ',' sql_expr )* ')'
|
||
assignment_list := assignment ( ',' assignment )*
|
||
assignment := column_name '=' sql_expr
|
||
returning_clause := RETURNING projection_list
|
||
on_conflict_clause := ON CONFLICT [ '(' column_name_list ')' ]
|
||
( DO NOTHING
|
||
| DO UPDATE SET assignment_list [ where_clause ] )
|
||
```
|
||
|
||
- `table_name` is the existing `Tables` ident slot, with
|
||
`reject_internal_table` validator (ADR-0030 §6).
|
||
- `column_name_list` is comma-separated `Columns` idents — the
|
||
same shape `data::INSERT` (DSL) uses for Form A. Reuses the
|
||
existing pattern; no new node infrastructure.
|
||
- `where_clause` is `WHERE Subgrammar(&sql_expr::SQL_OR_EXPR)`.
|
||
- `projection_list` is the Phase-2 projection list, re-used
|
||
unchanged.
|
||
|
||
The body of an `INSERT … SELECT` recurses through
|
||
`Node::Subgrammar(&sql_select::SQL_SELECT_COMPOUND)` — the same
|
||
recursion point Phase 2's CTE bodies, scalar subqueries, and
|
||
`IN`/`EXISTS` bodies use. The Phase-2 SELECT grammar already
|
||
admits an optional leading `WITH` (Amendment 2 / commit
|
||
`fd25904`), so `INSERT INTO t WITH x AS (…) SELECT * FROM x`
|
||
works for free.
|
||
|
||
### 2. Dispatch — DSL vs SQL on shared entry words
|
||
|
||
The entry words `insert`, `update`, `delete` are **shared**
|
||
between DSL and SQL. This ADR fixes the dispatch rule **and**
|
||
the mechanism that implements it (a known-risk item — see
|
||
"Open implementation risks" below).
|
||
|
||
**Simple mode → DSL only.** SQL constructs (multi-row VALUES,
|
||
INSERT…SELECT, RETURNING, UPSERT) attempted in simple mode
|
||
produce a `ValidationFailed` "this is SQL" hint, matching the
|
||
gating that `select` / `with` already produce (ADR-0030 §2).
|
||
|
||
**Advanced mode → SQL first, DSL fallback.** Each shared entry
|
||
word's `CommandNode.shape` becomes a `Choice` between the SQL
|
||
shape and the DSL shape, declared in that order:
|
||
|
||
```
|
||
shape: Node::Choice(&[SQL_INSERT_SHAPE, DSL_INSERT_SHAPE])
|
||
```
|
||
|
||
The walker's `Choice` is first-match-wins. The SQL shape's
|
||
constructs are a strict superset of the DSL shape's for these
|
||
statements (multi-row VALUES, INSERT…SELECT, RETURNING, UPSERT
|
||
are SQL-only; DSL's `--all-rows` flag is DSL-only). Any
|
||
structurally-ambiguous input — `delete from t where id = 1` —
|
||
matches the SQL shape first and routes through the SQL
|
||
machinery.
|
||
|
||
This is the design point an earlier draft of ADR-0030 §2 left
|
||
open ("choice order or a disambiguator decides"). The chosen
|
||
disambiguator is `Choice` order: **SQL > DSL in Advanced**. A
|
||
DSL-specific construct (e.g., `--all-rows`) won't match the SQL
|
||
shape and falls through to DSL — accommodating, not surprising.
|
||
|
||
Cascade-summary parity (§7) and `shortid` auto-fill parity (§6)
|
||
mean a DSL-style INSERT/UPDATE/DELETE in Advanced mode produces
|
||
identical observable behaviour whether it routes through the SQL
|
||
or DSL path. The user's mental model can be "Advanced mode runs
|
||
SQL; DSL syntax still works as a familiar fallback."
|
||
|
||
#### Mode-gating mechanism
|
||
|
||
The existing `is_advanced_only` infrastructure
|
||
(`src/dsl/grammar/mod.rs::ADVANCED_ONLY_ENTRIES`) gates by
|
||
**entry word**, fires in the walker's `dispatch_command_node`
|
||
path (`walker/mod.rs:1707`), and emits the "this is SQL"
|
||
ValidationFailed verdict. It can't be reused to gate a Choice
|
||
*branch* by mode.
|
||
|
||
Mechanism options for branch-level gating:
|
||
|
||
- **(a) Mode-aware validator on the SQL branch.** Each
|
||
SQL-shape Seq starts with a no-op `Node::Validated` whose
|
||
validator inspects `ctx.mode`; in Simple mode it returns
|
||
`ValidationError(message_key: "advanced_mode.sql_in_simple")`
|
||
before any structural tokens commit. Trade-off: clean, reuses
|
||
the existing validator path; the failure shape is the same
|
||
`WalkOutcome::ValidationFailed` as the entry-word gate.
|
||
- **(b) Mode flag on `Node::Choice`.** A new walker capability:
|
||
`Node::Choice` branches can carry a min-mode marker. Heavier
|
||
(touches the walker's `walk_choice` driver) but lets the
|
||
grammar declare gating without per-branch boilerplate.
|
||
- **(c) Two CommandNodes per entry word; registry picks by
|
||
mode.** The dispatcher consults `ctx.mode` and skips the SQL
|
||
CommandNode in Simple mode. Heavier (registry semantics
|
||
change) and surfaces a "multiple CommandNodes per entry word"
|
||
pattern the rest of the codebase doesn't use.
|
||
|
||
**Default choice: (a) the validator approach.** Smallest
|
||
mechanism change. **Walker-capability caveat:** validators
|
||
today are an `Ident { validator: Option<fn> }` field — there
|
||
is no standalone "guard" node that runs a validator without
|
||
consuming input. Sub-phase 3a's first job is to add
|
||
`Node::Guard(fn(&WalkContext) -> Result<(), ValidationError>)`
|
||
to the walker — a zero-consumption node whose only effect is
|
||
to fail the Seq with a ValidationError in Simple mode. The
|
||
existing internal-table rejection (`reject_internal_table`)
|
||
shows the wiring pattern; `Node::Guard` is the
|
||
"no-byte-consuming version" of it.
|
||
|
||
### 3. Multi-row `INSERT … VALUES`
|
||
|
||
```
|
||
INSERT INTO orders (customer_id, total) VALUES (1, 99.50), (2, 12.00), (3, 6.25)
|
||
```
|
||
|
||
The grammar's `values_clause` repeats `value_tuple` with a comma
|
||
separator (`Node::Repeated { inner: value_tuple, separator:
|
||
Some(COMMA), min: 1 }`). Each tuple matches the same arity as
|
||
the `(column_name_list)` (see §8 for the arity diagnostic).
|
||
|
||
### 4. `INSERT … SELECT`
|
||
|
||
```
|
||
INSERT INTO archive SELECT * FROM orders WHERE created < '2025-01-01'
|
||
INSERT INTO summary (id, total) WITH t AS (…) SELECT id, sum(total) FROM t GROUP BY id
|
||
```
|
||
|
||
A `Choice` between `values_clause` and
|
||
`Subgrammar(&sql_select::SQL_SELECT_COMPOUND)` after the
|
||
optional column list. The Phase-2 SELECT machinery (CTEs, JOINs,
|
||
subqueries, set ops) applies as the row source for free.
|
||
|
||
Arity verification: at parse time, the walker counts the
|
||
declared column list (or, when absent, the target table's column
|
||
count from the schema cache) against the SELECT's projection
|
||
list arity. Mismatch emits `diagnostic.insert_arity_mismatch`
|
||
(see §8).
|
||
|
||
### 5. `RETURNING`
|
||
|
||
`RETURNING projection_list` admits the Phase-2 projection_list
|
||
unchanged — column refs, `*`, `t.*`, expressions with `AS`
|
||
aliases, computed expressions.
|
||
|
||
Execution: the `Command::Sql{Insert,Update,Delete}` variants
|
||
carry a `returning: bool` flag set by the walker's
|
||
`ast_builder` when a `RETURNING` clause matched. The worker
|
||
reads this flag (no SQL text re-parsing) and routes:
|
||
|
||
- `returning == true` → prepare + step the statement,
|
||
collect rows into a `DataResult` (per-result-column type
|
||
recovery via the same column-origin path Phase 2 introduced,
|
||
ADR-0032 §12 + Amendment 1).
|
||
- `returning == false` → execute the statement,
|
||
collect the affected-row count.
|
||
|
||
The flag is sufficient because the *shape* of the projection
|
||
is already in the SQL the engine prepares; the worker doesn't
|
||
need to re-parse to know which columns to read.
|
||
|
||
### 6. `shortid` auto-fill — worker post-fills
|
||
|
||
Parity with the DSL `do_insert` handler (ADR-0005 / ADR-0018).
|
||
|
||
For an `INSERT` whose `(column_name_list)` omits one or more
|
||
`shortid` columns, the worker:
|
||
|
||
1. Inspects the table metadata to find the auto-generated columns
|
||
(`serial`, `shortid`) NOT in the user's column list.
|
||
2. For each row in the VALUES list (single or multi-row), and for
|
||
`INSERT … SELECT` each yielded row, synthesises a fresh
|
||
`shortid` value (the existing `shortid::generate` path).
|
||
3. Constructs the parameterised INSERT (or executes a rewrite of
|
||
the SQL with the values bound positionally) and runs it.
|
||
|
||
The engine sees a fully-specified row; the user's submitted SQL
|
||
is unmodified in `history.log` (ADR-0030 §11). The walker
|
||
recognises an explicit value provided for an auto-gen column —
|
||
that's an explicit override, not a missing slot — and emits a
|
||
WARNING (§8b) but lets it through.
|
||
|
||
`serial` columns rely on SQLite's `INTEGER PRIMARY KEY` rowid
|
||
auto-increment — no worker post-fill needed; the engine
|
||
populates the value on its own.
|
||
|
||
### 7. Cascade summary on `DELETE`
|
||
|
||
Parity with the DSL `do_delete` handler (ADR-0014).
|
||
|
||
When `Command::SqlDelete { target_table, sql, … }` runs:
|
||
|
||
1. **Pre-execution cascade pre-count.** For each child table
|
||
that has an FK referencing `target_table` with
|
||
`ON DELETE CASCADE` (or `SET NULL`), the worker runs a
|
||
pre-count query of the form `SELECT count(*) FROM <child>
|
||
WHERE <fk_col> IN (SELECT <pk_col> FROM <target_table>
|
||
WHERE <user_where_predicate>)`. The `<user_where_predicate>`
|
||
is extracted from the SQL — see "Predicate extraction" below.
|
||
2. The worker executes the DELETE.
|
||
3. **Post-execution summary.** Per-relationship row counts +
|
||
cascade impact, formatted by the same cascade-summary
|
||
formatter the DSL path uses.
|
||
|
||
**Predicate extraction.** The DSL handler reuses the typed
|
||
`Expr` AST to construct the pre-count subqueries. For SQL
|
||
DELETE the walker captures the WHERE clause's source byte
|
||
range from the matched path and re-injects that text into
|
||
the pre-count SQL. This is grammar-as-text reuse, consistent
|
||
with ADR-0030 §4's posture; no AST construction required.
|
||
|
||
When the SQL DELETE has no WHERE, the pre-count is unbounded
|
||
("all rows of `target_table`"). Cost: one extra count(*) per
|
||
child table; cheaper than the DSL parallel path on a typical
|
||
schema.
|
||
|
||
The cascade-summary formatter itself (`format_cascade_summary`
|
||
in the DSL handler) is shared with `do_sql_delete` — same
|
||
output, same i18n keys.
|
||
|
||
### 8. Diagnostics new to Phase 3
|
||
|
||
Three new keys, classified per ADR-0027's ERROR / WARNING model.
|
||
|
||
#### 8.1. `diagnostic.insert_arity_mismatch` (ERROR)
|
||
|
||
Walker-level pre-flight when the declared `(column_name_list)`'s
|
||
arity disagrees with a `value_tuple`'s arity (or, for `INSERT …
|
||
SELECT`, with the SELECT's projection list arity, counted at
|
||
parse time via the same projection-counting infrastructure
|
||
Phase-2's `compound_arity_diagnostics` uses).
|
||
|
||
For multi-row VALUES, **each** offending row emits its own
|
||
diagnostic on that row's `value_tuple` span. A query with five
|
||
rows and three wrong arities produces three diagnostics; the
|
||
two right rows are silent. Pre-empts the engine's less-helpful
|
||
"X values for Y columns" error.
|
||
|
||
For `INSERT … SELECT`, the diagnostic anchors on the SELECT's
|
||
first `projection_item` span. The Phase-2 projection-count
|
||
infrastructure (the comma-counting at depth = leg-depth used by
|
||
`compound_arity_diagnostics`) is the underlying mechanism.
|
||
|
||
#### 8.2. `diagnostic.auto_column_overridden` (WARNING)
|
||
|
||
Walker-level when an INSERT explicitly assigns a value to a
|
||
column whose type is `serial` or `shortid`. Pedagogical: the
|
||
explicit value bypasses the auto-counter / generator, which can
|
||
collide with future auto-generated values.
|
||
|
||
Catalog wording: `"column \`{column}\` is auto-generated
|
||
(\`{type}\`); providing an explicit value bypasses the
|
||
auto-counter and may collide with later auto-generated values"`.
|
||
|
||
The statement still runs — the warning is advisory, matching
|
||
the Phase-2 predicate-warning posture (ADR-0027 §1 /
|
||
ADR-0032 §11.6).
|
||
|
||
#### 8.3. `diagnostic.not_null_missing` (WARNING)
|
||
|
||
Walker-level when an INSERT's `(column_name_list)` omits a
|
||
column declared `NOT NULL` with no `DEFAULT`. Pedagogical
|
||
nudge — the engine will reject the statement, but the
|
||
walker can pre-flight the missing column at parse time.
|
||
|
||
Catalog wording: `"column \`{column}\` is required (\`NOT NULL\`,
|
||
no default); the statement will fail when run"`.
|
||
|
||
The walker doesn't BLOCK the statement (it's structurally
|
||
valid). The engine does. The friendly layer translates the
|
||
engine error via the existing NOT-NULL path.
|
||
|
||
#### 8.4. Inherited diagnostics
|
||
|
||
The Phase-2 diagnostic passes apply to DML's `sql_expr` slots
|
||
without further work:
|
||
|
||
- `schema_existence_diagnostics` flags unknown columns in WHERE,
|
||
SET, RETURNING, VALUES tuples.
|
||
- `sql_predicate_warnings` flags `= NULL`, `LIKE`-on-numeric,
|
||
type-mismatched comparisons in WHERE and CASE expressions.
|
||
- The `pending_diagnostics` accumulator path stays available
|
||
for in-walk emits.
|
||
|
||
### 9. UPSERT — `ON CONFLICT … DO NOTHING / DO UPDATE`
|
||
|
||
```
|
||
INSERT INTO t (id, name) VALUES (1, 'x')
|
||
ON CONFLICT (id) DO UPDATE SET name = excluded.name
|
||
|
||
INSERT INTO t (id, name) VALUES (1, 'x')
|
||
ON CONFLICT DO NOTHING
|
||
```
|
||
|
||
Grammar shape:
|
||
|
||
```
|
||
on_conflict_clause := ON CONFLICT [ '(' column_name_list ')' ]
|
||
( DO NOTHING
|
||
| DO UPDATE SET assignment_list [ where_clause ] )
|
||
```
|
||
|
||
The optional `(column_name_list)` is the conflict target —
|
||
which columns' unique constraint to react to. Standard SQL allows
|
||
this list to be empty (`ON CONFLICT DO NOTHING` is shorthand for
|
||
"any conflict").
|
||
|
||
The DO UPDATE branch reuses the same `assignment_list` and
|
||
optional `where_clause` shapes as the top-level UPDATE statement.
|
||
|
||
Inside DO UPDATE expressions, the keyword `excluded` is a
|
||
pseudo-table reference to the would-have-been-inserted row.
|
||
**Note:** `excluded` is the SQLite + PostgreSQL spelling;
|
||
standard SQL (`MERGE … WHEN MATCHED`) uses a different model.
|
||
The playground commits to the `excluded` form because it is what
|
||
SQLite supports and what the project's pedagogical target dialect
|
||
uses (ADR-0001 / ADR-0002 — the engine is SQLite, even though
|
||
its name is hidden from user-facing strings). This is a
|
||
deliberate carve-out from ADR-0030 §7's engine-neutral posture,
|
||
justified by the absence of a standard-SQL UPSERT spelling that
|
||
both SQLite and PostgreSQL share.
|
||
|
||
The walker admits `excluded` as a table alias for
|
||
ident-completion purposes inside the DO UPDATE branch — its
|
||
columns are the target table's columns.
|
||
|
||
### 10. Execute path — three typed Command variants
|
||
|
||
`Command` gains three variants:
|
||
|
||
```
|
||
Command::SqlInsert {
|
||
sql: String,
|
||
source: Option<String>, // for history.log
|
||
target_table: String, // for re-persistence
|
||
listed_columns: Option<Vec<String>>, // None if no col-list provided
|
||
}
|
||
|
||
Command::SqlUpdate {
|
||
sql: String,
|
||
source: Option<String>,
|
||
target_table: String,
|
||
}
|
||
|
||
Command::SqlDelete {
|
||
sql: String,
|
||
source: Option<String>,
|
||
target_table: String,
|
||
}
|
||
```
|
||
|
||
The walker's `ast_builder` for each shape extracts
|
||
`target_table` and `listed_columns` from the matched path. The
|
||
`sql` field is the validated source text the engine executes.
|
||
|
||
`Database` gains three request types:
|
||
|
||
```
|
||
Request::RunSqlInsert { command, reply }
|
||
Request::RunSqlUpdate { command, reply }
|
||
Request::RunSqlDelete { command, reply }
|
||
```
|
||
|
||
Each worker handler:
|
||
|
||
1. Applies statement-specific pre-execution work (shortid
|
||
auto-fill for `RunSqlInsert`; cascade pre-count for
|
||
`RunSqlDelete`).
|
||
2. Prepares + executes the statement. If RETURNING is present,
|
||
collects the result set as a `DataResult`.
|
||
3. Re-persists the target table's CSV (ADR-0030 §11) using the
|
||
existing persistence write-through machinery.
|
||
4. Returns the result (DataResult for RETURNING; affected-row
|
||
count + cascade summary for plain DELETE; affected-row count
|
||
for plain INSERT/UPDATE).
|
||
|
||
History logging: the worker writes the literal submitted line
|
||
to `history.log` (ADR-0030 §11) before execution, exactly as
|
||
`run_select` does.
|
||
|
||
### 11. Engine-error translation
|
||
|
||
Three new catalog keys (the friendly-error layer's engine.*
|
||
translation set, ADR-0019 / ADR-0030 §7):
|
||
|
||
| Key | Engine condition |
|
||
|-------------------------------------------|-----------------------------------------------------------------|
|
||
| `engine.unique_constraint_failed` | The engine's `UNIQUE constraint failed: t.c` (already exists; will be re-used as-is for UPSERT-without-conflict-target cases) |
|
||
| `engine.not_null_constraint_failed` | Already exists; covers SQL-side NOT-NULL failures uniformly |
|
||
| `engine.foreign_key_constraint_failed` | Already exists; SQL DELETE that violates FK without cascade |
|
||
|
||
No NEW engine.* keys are required for Phase 3 — the existing FK
|
||
/ UNIQUE / NOT-NULL translation paths cover the engine-side DML
|
||
errors. Phase 3 reuses them.
|
||
|
||
### 12. Result-column type recovery on RETURNING
|
||
|
||
A RETURNING projection that's a bare column reference recovers
|
||
its playground type via the same column-origin metadata path
|
||
Phase 2 introduced (`resolve_select_column_types` in `db.rs`,
|
||
ADR-0032 §12 + Amendment 1).
|
||
|
||
Computed expressions in RETURNING stay typeless. The renderer
|
||
uses neutral alignment for typeless columns, consistent with
|
||
SELECT.
|
||
|
||
### 13. Out of scope
|
||
|
||
| ID | OOS |
|
||
|-------|--------------------------------------------------------------------------------------|
|
||
| OOS-1 | `INSERT INTO t DEFAULT VALUES` (the project's planned seed feature covers this) |
|
||
| OOS-2 | `INSERT OR REPLACE` / `OR IGNORE` / `OR ABORT` / `OR FAIL` / `OR ROLLBACK` (SQLite-specific) |
|
||
| OOS-3 | `UPDATE … FROM other_table` (SQLite/PostgreSQL extension) |
|
||
| OOS-4 | `WITH x AS (…) UPDATE …` / `WITH x AS (…) DELETE …` (deferrable; SELECT-side WITH is in scope) |
|
||
| OOS-5 | Indexed-by / not-indexed table-source modifiers (SQLite-specific) |
|
||
| OOS-6 | Multi-statement batches (already OOS via ADR-0030 §13 OOS-4) |
|
||
|
||
OOS-4 is a deliberate choice: WITH-prefixed DML is standard SQL
|
||
since SQL:2003, but its execution semantics (the WITH-defined
|
||
CTE is visible to the DML's WHERE / SET / RETURNING expressions)
|
||
require additional scope-frame plumbing that doesn't yet exist
|
||
on the DML side. The Phase-2 grammar admits WITH at the
|
||
SELECT level; Phase 3 keeps WITH a SELECT-only prefix. A later
|
||
phase can lift this restriction with a clean ADR Amendment.
|
||
|
||
## Consequences
|
||
|
||
- The walker gains a new node variant `Node::Guard` (§2
|
||
mechanism caveat) — a zero-byte-consumption gating node that
|
||
can fail its enclosing Seq with a `ValidationError`. The
|
||
walker driver's exhaustive `walk_node_inner` match grows one
|
||
arm. Reusable beyond DML (Phase 4 DDL's similar shared-entry
|
||
problem is a likely future consumer).
|
||
- `Command` gains three variants (`SqlInsert`, `SqlUpdate`,
|
||
`SqlDelete`); every exhaustive `match Command` callsite picks
|
||
up three new arms (the recurring ADR-0028/0029 gotcha).
|
||
- The DSL `data::INSERT` / `data::UPDATE` / `data::DELETE`
|
||
CommandNodes' shapes become `Choice(SQL_shape, DSL_shape)`
|
||
with the SQL branch mode-gated via a no-op leading validator
|
||
(R1).
|
||
- Three new walker diagnostics with three new catalog keys
|
||
(`diagnostic.insert_arity_mismatch`,
|
||
`diagnostic.auto_column_overridden`,
|
||
`diagnostic.not_null_missing`).
|
||
- Three new `Request` variants
|
||
(`RunSqlInsert`/`RunSqlUpdate`/`RunSqlDelete`) and three new
|
||
worker handlers
|
||
(`do_sql_insert`/`do_sql_update`/`do_sql_delete`).
|
||
- The persistence write-through path gains per-DML-kind entry
|
||
points; the cascade-affected re-persistence in `do_sql_delete`
|
||
is new (cascade-affected child tables are also re-persisted).
|
||
- `RETURNING` lands a write-statement-returns-rows code path
|
||
the project hasn't had before. Worker handlers either return
|
||
affected-row count or a `DataResult` keyed by the
|
||
`returning: bool` flag.
|
||
- The cascade-summary formatter is reused between the DSL and
|
||
SQL DELETE paths; the predicate-extraction byte-range
|
||
injection (§7) is new code with the R2 mitigation.
|
||
- `shortid` auto-fill code in `do_sql_insert` mirrors the DSL
|
||
`do_insert` mechanism but takes a `listed_columns` slice from
|
||
the parse rather than a typed `Command::Insert`.
|
||
- The auto-snapshot machinery (ADR-0006) is unchanged but
|
||
fires for SQL DML the same way it does for DSL DML — the
|
||
snapshot trigger lives in the worker handlers.
|
||
- All Phase-2 tests stay green (the SQL grammar pieces this ADR
|
||
builds on are unchanged).
|
||
- The DSL-vs-SQL dispatch mechanism (3a) is a foundation other
|
||
projects in this codebase may need (e.g., Phase 4 DDL). Land
|
||
it generically so it's reusable.
|
||
|
||
## Initial DA review
|
||
|
||
A first-pass DA review of this ADR's draft surfaced ten
|
||
critiques. The seven that surfaced real gaps were resolved in
|
||
the body before this status moved to Proposed:
|
||
|
||
1. OOS-4 (WITH-prefixed DML) under-justified → R4 risk
|
||
describes the mechanism gap concretely.
|
||
2. Dispatch mechanism hand-waved → §2's "Mode-gating
|
||
mechanism" subsection enumerates three options and commits
|
||
to the validator approach with R1 as the mitigation if it
|
||
fails.
|
||
3. RETURNING execution hand-waved → §5 commits to a
|
||
`returning: bool` flag on the Command variants.
|
||
4. Cascade-summary code-sharing claim too optimistic → §7's
|
||
"Predicate extraction" subsection details the byte-range
|
||
injection mechanism; R2 budgets for the
|
||
performance-pathology case.
|
||
5. `excluded` keyword's engine-neutrality status → §9 carves
|
||
it out explicitly as a justified non-neutral choice.
|
||
6. Multi-row INSERT arity diagnostic emit pattern unspecified
|
||
→ §8.1 clarifies "each offending row emits its own
|
||
diagnostic" plus the INSERT…SELECT span.
|
||
7. Sub-phase ordering put dispatch (the riskiest piece) last
|
||
→ reordered so 3a lands the mechanism FIRST on a no-op
|
||
shape.
|
||
|
||
The remaining three critiques were either process points (DA
|
||
gates at sub-phase boundaries — added per-sub-phase in
|
||
Implementation notes) or already-covered (verification matrix —
|
||
will live in the plan doc, matching Phase 2's pattern).
|
||
|
||
Per CLAUDE.md / the Phase 2 process lesson, the DA review of
|
||
THIS ADR will get a second pass before status moves to
|
||
Accepted. Pin: do not rubber-stamp.
|
||
|
||
## Open implementation risks
|
||
|
||
Recorded explicitly so the implementation plan can budget for
|
||
them, not to allow silent drift later.
|
||
|
||
### R1. Mode-gated `Choice` branch mechanism unproven
|
||
|
||
§2's option (a) — a no-op validator at the head of the SQL
|
||
shape's Seq, rejecting in Simple mode — is plausible but
|
||
untested. If the walker's `walk_seq` commits the Seq before the
|
||
validator runs, the SQL branch's NoMatch becomes Failed and
|
||
the Choice can't fall through to the DSL branch.
|
||
|
||
**Mitigation:** sub-phase 3a lands the mechanism end-to-end on
|
||
the smallest viable shape (a single `SqlNop` CommandNode whose
|
||
shape is `Choice(SQL_branch[mode-gated-validator + token],
|
||
DSL_branch[different-token])`) before any DML grammar lands. If
|
||
the validator approach doesn't work, the alternatives (a
|
||
mode-aware `Node::Choice` variant, or split-CommandNode
|
||
dispatch) get evaluated before sub-phases 3b/3c/3d depend on
|
||
the mechanism.
|
||
|
||
### R2. Cascade-summary predicate extraction
|
||
|
||
§7 says the worker re-injects the WHERE-clause source bytes into
|
||
pre-count subqueries. This is straightforward when the WHERE
|
||
clause is a top-level predicate, but a future user might write
|
||
WHERE clauses that themselves contain subqueries (e.g.,
|
||
`DELETE FROM orders WHERE customer_id IN (SELECT id FROM customers WHERE …)`).
|
||
The byte-range injection still works structurally, but a
|
||
performance pathology is possible if the subquery is expensive
|
||
and the cascade child count is large (the same subquery runs N+1
|
||
times).
|
||
|
||
**Mitigation:** sub-phase 3c benchmarks the pre-count overhead
|
||
on a representative schema before declaring the path acceptable.
|
||
If the overhead is prohibitive, fall back to "report cascade
|
||
count post-execution only" (less informative, but cheaper).
|
||
|
||
### R3. RETURNING result-column type recovery on multi-row INSERT
|
||
|
||
§5 says result columns get the same column-origin type recovery
|
||
SELECT uses. For a single-row INSERT … RETURNING this works
|
||
identically to a SELECT. For multi-row INSERT, the engine emits
|
||
one result row per inserted row; the column-origin metadata is
|
||
the same for all (per ADR-0032 Amendment 1, column-origin is a
|
||
property of the prepared statement, not the rows).
|
||
|
||
No risk here — calling it out so the implementer doesn't
|
||
re-derive the question. The empty-table type-recovery test
|
||
already pins this invariant.
|
||
|
||
### R4. `INSERT … SELECT` with a CTE row source
|
||
|
||
The Phase-2 SELECT grammar already admits `WITH … SELECT …` at
|
||
the compound level (ADR-0032 §10.3 / Amendment 2). The
|
||
`SQL_SELECT_COMPOUND` recursion point §1 uses pulls this in for
|
||
free. `INSERT INTO archive WITH t AS (…) SELECT * FROM t` parses
|
||
and runs.
|
||
|
||
A user might reasonably expect WITH-prefixed DML to also work
|
||
(see §13 OOS-4). That's explicitly out of scope for Phase 3;
|
||
the discrepancy is documented here so the implementer doesn't
|
||
need to figure it out from the grammar shape.
|
||
|
||
## Implementation notes
|
||
|
||
Phased sub-plan with **explicit exit gates per sub-phase**. The
|
||
implementation plan doc (`docs/plans/<date>-adr-0033-phase-3.md`,
|
||
modelled on `docs/plans/20260520-adr-0032-phase-2.md`) refines
|
||
these and adds the cross-cut verification matrix in 3i.
|
||
|
||
**Ordering rationale.** The dispatch mechanism (3a) is the
|
||
foundation; if it fails to work, every later sub-phase has to
|
||
be reworked. Land it first on a no-op shape, then build the DML
|
||
grammar on top.
|
||
|
||
### 3a — `Node::Guard` + mode-gated Choice mechanism
|
||
|
||
**Scope (in):**
|
||
- Add a new walker node variant
|
||
`Node::Guard(fn(&WalkContext) -> Result<(), ValidationError>)`
|
||
(per §2 mechanism caveat) — zero-byte consumption, fails the
|
||
enclosing Seq with `ValidationFailed` when the validator
|
||
returns Err.
|
||
- Implement `walk_guard` in `src/dsl/walker/driver.rs`.
|
||
- Add the `reject_sql_in_simple_mode` guard function.
|
||
- Build a smoke-test grammar: an experimental `CommandNode`
|
||
with `shape: Choice(Seq[Guard(reject_sql_in_simple_mode),
|
||
SQL_branch_token], DSL_branch_token)`. Verify the Choice
|
||
falls through cleanly in Simple mode (DSL wins) and routes
|
||
SQL-first in Advanced mode.
|
||
|
||
**Scope (out):**
|
||
- Any real DML grammar (3b onwards).
|
||
- Wiring the existing `data::INSERT/UPDATE/DELETE` shapes (3j).
|
||
|
||
**Exit gate (required green tests):**
|
||
- `Node::Guard` admits via `walk_guard`; the walker driver's
|
||
match arm exists and is tested.
|
||
- Smoke-test grammar in Simple mode → DSL branch matches.
|
||
- Smoke-test grammar in Advanced mode + SQL-shape input → SQL
|
||
branch matches.
|
||
- Smoke-test grammar in Simple mode + SQL-shape input →
|
||
`ValidationFailed` with `message_key:
|
||
"advanced_mode.sql_in_simple"`.
|
||
- Smoke-test grammar in Advanced mode + DSL-shape input →
|
||
DSL branch matches (Choice fall-through works even when SQL
|
||
branch is admissible).
|
||
- All Phase-2 tests (1446 baseline) still green.
|
||
- Clippy clean.
|
||
|
||
**DA gate (written):**
|
||
- Does the Guard actually fire BEFORE the Seq commits? Failure
|
||
mode: the Seq processes the Guard, the Guard fails, but the
|
||
Seq has already "committed" to its branch and the Choice
|
||
can't fall through. Test specifically: Advanced mode + an
|
||
input that the DSL branch matches but the SQL branch's
|
||
remaining tokens reject → must fall through to DSL, not
|
||
Failed.
|
||
- Is the Guard mechanism reusable beyond DML (e.g., for
|
||
Phase 4 DDL's similar dispatch)? If yes, document the
|
||
pattern in `src/dsl/grammar/mod.rs` doc comments so the
|
||
next phase doesn't reinvent it.
|
||
- If the Guard mechanism doesn't work (R1's worst case), the
|
||
sub-phase REOPENS with mechanism option (b) or (c)
|
||
evaluated before any DML grammar lands on top of broken
|
||
scaffolding.
|
||
|
||
### 3b — INSERT grammar + minimal execution
|
||
|
||
**Scope (in):**
|
||
- `SQL_INSERT_SHAPE`: single-row + multi-row VALUES; the
|
||
optional `(column_name_list)`; the `__rdbms_*` rejection on
|
||
the table-name slot.
|
||
- `Command::SqlInsert { sql, source, target_table,
|
||
listed_columns, returning: false }` — `returning` always
|
||
false in 3b.
|
||
- `Request::RunSqlInsert` + `do_sql_insert` worker handler
|
||
(no shortid auto-fill yet — explicit values required).
|
||
- Persistence write-through on the target table after
|
||
execution.
|
||
- History-log of the literal submitted line.
|
||
|
||
**Scope (out):**
|
||
- `INSERT … SELECT` (3c).
|
||
- `RETURNING` (3e).
|
||
- `UPSERT` (3f).
|
||
- `shortid` auto-fill (3d).
|
||
- Diagnostics (3g).
|
||
- DSL-vs-SQL dispatch wiring on `data::INSERT` (3h —
|
||
development uses a separate `data::SqlInsert` CommandNode).
|
||
|
||
**Exit gate (required green tests):**
|
||
- **Single-row INSERT** runs end-to-end; affected-row count
|
||
surfaces; CSV is re-persisted.
|
||
- **Multi-row INSERT** (`INSERT INTO t VALUES (1, 'a'), (2, 'b')`)
|
||
runs; both rows land; CSV reflects them.
|
||
- **`(column_name_list)`** variant works (`INSERT INTO t (a, b)
|
||
VALUES (1, 'x')`).
|
||
- **`__rdbms_*` rejection** at the INSERT target-table slot
|
||
(per ADR-0030 §6 / ADR-0032 §1).
|
||
- **History line** is the literal submitted SQL.
|
||
- All Phase-2 tests (1446 baseline) still green.
|
||
|
||
**DA gate (written):**
|
||
- Does an INSERT failing mid-multi-row leave persistence in a
|
||
consistent state? (Engine transactional behaviour applies;
|
||
worker must not re-persist on failure.)
|
||
- Does the `target_table` extraction handle case-folding the
|
||
same way the rest of the codebase does?
|
||
- Are the existing DSL INSERT tests still green? Any DSL-side
|
||
regression means 3b broke something it shouldn't have.
|
||
|
||
### 3c — INSERT … SELECT
|
||
|
||
**Scope (in):**
|
||
- Add the row-source `Choice` to `SQL_INSERT_SHAPE`:
|
||
`values_clause | Subgrammar(&sql_select::SQL_SELECT_COMPOUND)`.
|
||
- Update `do_sql_insert` to handle the SELECT-driven path
|
||
(no special worker work needed beyond preparing the
|
||
composite statement; the engine handles the
|
||
insert-from-query flow).
|
||
|
||
**Exit gate:**
|
||
- `INSERT INTO archive SELECT * FROM orders WHERE …` runs.
|
||
- `INSERT INTO target (a, b) SELECT x, y FROM source` runs.
|
||
- `INSERT INTO t WITH x AS (…) SELECT * FROM x` runs
|
||
(R4 invariant — Phase-2 CTE prefix on SELECT applies).
|
||
- Schema-existence diagnostics fire on the SELECT's projection
|
||
(Phase-2 machinery, free).
|
||
|
||
**DA gate (written):**
|
||
- Does the WITH-prefixed SELECT row source actually parse
|
||
through SQL_SELECT_COMPOUND, or does the INSERT prefix
|
||
break the recursion? Test on a real `WITH … INSERT … SELECT`.
|
||
- Does the cascade-summary path (3e DELETE) accidentally apply
|
||
to INSERTs whose SELECT touches the target table? (It must
|
||
not — INSERTs have no cascade.)
|
||
|
||
### 3d — `shortid` auto-fill (worker)
|
||
|
||
**Scope (in):**
|
||
- Detect omitted `shortid` columns at execution time in
|
||
`do_sql_insert`.
|
||
- Synthesise fresh shortid values; bind them positionally.
|
||
- Cover both VALUES and SELECT row sources (for SELECT, the
|
||
shortid columns are added to the projection's bind list
|
||
before execution).
|
||
|
||
**Exit gate:**
|
||
- `INSERT INTO t (name) VALUES ('x')` where `t` has a `pk
|
||
shortid` column auto-fills that PK.
|
||
- An explicit value for a shortid column is respected
|
||
(override behaviour — warning fires per 3g but the value
|
||
is preserved).
|
||
- Multi-row INSERT produces distinct shortids per row.
|
||
- INSERT … SELECT auto-fills shortids for each yielded row.
|
||
|
||
**DA gate (written):**
|
||
- Are the synthesised shortids actually unique across a
|
||
multi-row INSERT? (The generator should be called per-row,
|
||
not once.)
|
||
- Does the auto-fill respect the column's case-preserved name?
|
||
- What happens if the table has both `serial` and `shortid`
|
||
PK columns? (Spec: serial via SQLite rowid; shortid via
|
||
worker post-fill. Verify no double-fill collision.)
|
||
|
||
### 3e — UPDATE grammar + execution
|
||
|
||
**Scope (in):**
|
||
- `SQL_UPDATE_SHAPE`: target table, SET assignment list,
|
||
optional WHERE.
|
||
- `Command::SqlUpdate`, `Request::RunSqlUpdate`,
|
||
`do_sql_update`.
|
||
- Persistence write-through on the target table.
|
||
|
||
**Exit gate:**
|
||
- Single-column UPDATE with WHERE runs; affected-row count
|
||
surfaces; CSV reflects changes.
|
||
- Multi-column UPDATE (`SET a = 1, b = 2`) runs.
|
||
- UPDATE without WHERE runs (per ADR-0030 §12 — no
|
||
`--all-rows` guard).
|
||
- Schema-existence diagnostic fires on unknown columns in
|
||
SET and WHERE.
|
||
|
||
**DA gate (written):**
|
||
- Does the SET assignment expression accept the full
|
||
`sql_expr` (function calls, subqueries, CASE)? Verify on a
|
||
representative expression.
|
||
- Does an UPDATE that touches an auto-gen column produce the
|
||
3g auto-gen warning?
|
||
|
||
### 3f — DELETE grammar + execution + cascade summary
|
||
|
||
**Scope (in):**
|
||
- `SQL_DELETE_SHAPE`: target table, optional WHERE.
|
||
- `Command::SqlDelete`, `Request::RunSqlDelete`,
|
||
`do_sql_delete`.
|
||
- Cascade-summary pre-count and post-execution formatting
|
||
per §7.
|
||
- Persistence write-through on target + affected child tables
|
||
(cascade-affected tables also need re-persistence).
|
||
|
||
**Exit gate:**
|
||
- DELETE with WHERE runs; affected-row count + cascade
|
||
summary surface; CSVs of target AND cascade-affected
|
||
children are re-persisted.
|
||
- DELETE without WHERE runs (no `--all-rows` rail).
|
||
- Cascade summary matches the DSL `do_delete` output on the
|
||
same schema/data.
|
||
- The cascade-summary code path is shared (no duplicate
|
||
formatter).
|
||
|
||
**DA gate (written):**
|
||
- R2 mitigation: does the predicate-extraction byte-range
|
||
injection work on a WHERE containing a subquery? Test
|
||
`DELETE FROM a WHERE id IN (SELECT id FROM b WHERE …)`.
|
||
- Does the cascade pre-count run in a way that doesn't see
|
||
the post-DELETE state? (Order matters: pre-count BEFORE
|
||
execute.)
|
||
|
||
### 3g — RETURNING
|
||
|
||
**Scope (in):**
|
||
- Add `RETURNING projection_list` to all three top-level
|
||
shapes.
|
||
- Walker captures `returning: bool` on the Command variants.
|
||
- Worker handlers route to `DataResult` collection when
|
||
`returning == true`.
|
||
- Result-column type recovery via the Phase-2 column-origin
|
||
path.
|
||
|
||
**Exit gate:**
|
||
- `INSERT … RETURNING *` returns the inserted row(s) as a
|
||
DataResult; affected-row count is implicit (rows.len()).
|
||
- `UPDATE … RETURNING id, new_value` returns the modified
|
||
columns.
|
||
- `DELETE … RETURNING *` returns what was just deleted
|
||
(BEFORE the row is gone).
|
||
- Result-column types recovered for bare-column refs
|
||
(per ADR-0032 Amendment 1).
|
||
- The cascade summary on a `DELETE … RETURNING` is preserved
|
||
alongside the result set (both surface to the user).
|
||
|
||
**DA gate (written):**
|
||
- Does the RETURNING result-set rendering use the same
|
||
DataResult path SELECT uses? Or did a DML-specific copy
|
||
sneak in?
|
||
- Does `RETURNING expr AS alias` admit aliases the same way
|
||
the projection list does?
|
||
|
||
### 3h — UPSERT (ON CONFLICT … DO NOTHING / DO UPDATE)
|
||
|
||
**Scope (in):**
|
||
- Add `on_conflict_clause` to `SQL_INSERT_SHAPE`.
|
||
- Admit `excluded` as a table alias inside DO UPDATE
|
||
expressions (§9's clarification).
|
||
- Walker captures the UPSERT shape; worker executes as-is
|
||
(no special pre-processing).
|
||
|
||
**Exit gate:**
|
||
- `INSERT … ON CONFLICT (id) DO NOTHING` runs on a
|
||
conflicting row; affected-row count is 0.
|
||
- `INSERT … ON CONFLICT (id) DO UPDATE SET v = excluded.v`
|
||
runs; the conflicting row is updated to the new value.
|
||
- `ON CONFLICT DO NOTHING` (no target spec) handles any
|
||
conflict.
|
||
- `excluded.col` completes / highlights as a column ref to the
|
||
target table's columns inside the DO UPDATE expressions.
|
||
|
||
**DA gate (written):**
|
||
- Does ON CONFLICT … DO UPDATE trigger persistence
|
||
write-through? (Yes; the target row is modified.)
|
||
- Is the `excluded` alias scoped to ONLY the DO UPDATE
|
||
branch, not leaking into the outer INSERT?
|
||
|
||
### 3i — Diagnostics
|
||
|
||
**Scope (in):**
|
||
- `diagnostic.insert_arity_mismatch` (§8.1).
|
||
- `diagnostic.auto_column_overridden` (§8.2).
|
||
- `diagnostic.not_null_missing` (§8.3).
|
||
- Per-key positive + negative tests + catalog entries.
|
||
|
||
**Exit gate:**
|
||
- One positive + one negative test per new key (the
|
||
ADR-0032 §2d exit gate pattern):
|
||
- `insert_arity_mismatch`: `INSERT INTO t (a, b) VALUES
|
||
(1, 2, 3)` fires; matched arity doesn't.
|
||
- `auto_column_overridden`: `INSERT INTO t (id, name)
|
||
VALUES (5, 'x')` (id is `serial`) fires; omitting `id`
|
||
doesn't.
|
||
- `not_null_missing`: `INSERT INTO t (a) VALUES (1)`
|
||
(b is NOT NULL no default) fires; including `b` doesn't.
|
||
- Multi-row arity mismatch emits per-row, not just on the
|
||
first offending row.
|
||
- The inherited Phase-2 diagnostic passes
|
||
(`schema_existence`, `sql_predicate_warnings`,
|
||
`compound_arity_diagnostics`) still fire on DML's
|
||
`sql_expr` slots.
|
||
|
||
**DA gate (written):**
|
||
- Does `auto_column_overridden` produce a Warning, not an
|
||
Error? (Spec: WARNING — the statement still runs.)
|
||
- Does `not_null_missing` produce a Warning? (Same.)
|
||
- Are the catalog wordings engine-neutral?
|
||
|
||
### 3j — Dispatch wiring on shared entry words
|
||
|
||
**Scope (in):**
|
||
- Convert `data::INSERT`, `data::UPDATE`, `data::DELETE`
|
||
shapes to `Choice(SQL_shape, DSL_shape)`.
|
||
- Remove the experimental separate `data::SqlInsert` /
|
||
`SqlUpdate` / `SqlDelete` CommandNodes that 3a-3i used as
|
||
scaffolding.
|
||
- Verify DSL-only inputs still match the DSL branch in both
|
||
modes.
|
||
- Verify SQL-only inputs match the SQL branch in Advanced
|
||
mode and fail with "this is SQL" in Simple mode.
|
||
|
||
**Exit gate:**
|
||
- All existing DSL INSERT/UPDATE/DELETE tests still green
|
||
(unmodified inputs, unmodified outputs).
|
||
- All Phase-3 SQL DML tests still green via the new dispatch
|
||
path (the SQL branch is reached through `data::INSERT` etc.,
|
||
not through `data::SqlInsert`).
|
||
- A structurally-ambiguous input (`delete from t where id =
|
||
1`) in Advanced mode routes through the SQL branch
|
||
(verifiable by checking that the cascade-summary code path
|
||
fired — DSL and SQL paths share the formatter but the entry
|
||
point differs).
|
||
- A DSL-specific input (`delete from t --all-rows`) in
|
||
Advanced mode falls through to the DSL branch.
|
||
|
||
**DA gate (written):**
|
||
- Verify that no test was changed to make 3j pass — only the
|
||
shape's structure changes; the observable behaviour is
|
||
identical.
|
||
- Is the test count higher than before 3j? It should be (the
|
||
scaffolding CommandNodes had their own tests; now the
|
||
shared CommandNodes carry both).
|
||
|
||
### 3k — Verification sweep
|
||
|
||
**Scope (in):**
|
||
- Fill in the cross-cut verification matrix
|
||
(`docs/plans/<date>-adr-0033-phase-3.md`).
|
||
- Produce the phase-exit verification report.
|
||
- DA final review (genuine, not rubber-stamp — see Phase 2's
|
||
process lesson in `docs/handoff/20260520-handoff-28.md` §4).
|
||
|
||
**Exit gate:**
|
||
- Every matrix row green with a concrete `file::function`
|
||
test reference.
|
||
- Final test run: zero failures, zero skips, no regressions
|
||
from the Phase-2 baseline of 1446 / 0 / 1.
|
||
- Clippy clean.
|
||
- Report at `docs/handoff/<date>-phase-3-verification.md`
|
||
documents all autonomous decisions with explicit
|
||
user-confirmation pointers and the DA's written critiques
|
||
(not just the verdict).
|
||
|
||
**DA gate (written):**
|
||
- Did filling the matrix surface gaps that need closing?
|
||
(Phase 2 surfaced THREE; expect at least one for Phase 3.)
|
||
- Are any sub-phase exit gates above weaker than they should
|
||
be in hindsight?
|
||
- Did any sub-phase make an autonomous decision that's not
|
||
in this ADR? If so, flag for retroactive user approval
|
||
before phase-exit.
|
||
|
||
## Out of scope (clarifications already established)
|
||
|
||
- **No engine type names** in or out (ADR-0030 §5 / §7).
|
||
- **No `STRICT`**, no SQLite storage clauses (ADR-0030 §7).
|
||
- **No transaction control** (ADR-0030 §13 OOS-4).
|
||
- **No multi-statement batches** (ADR-0030 §13 OOS-4).
|
||
- **No `EXPLAIN` of SQL DML** (ADR-0030 §13 OOS-2 — the DSL
|
||
`explain` still wraps what it already wraps).
|
||
|
||
## Amendment 1 — Dispatch mechanism: category-grouped dispatch supersedes `Node::Guard` (2026-05-21)
|
||
|
||
This amendment **supersedes §2's "Mode-gating mechanism"
|
||
subsection** and **revises sub-phase 3a's scope**. It was
|
||
written during 3a implementation, after tracing the proposed
|
||
mechanism against the actual walker code, and is recorded with
|
||
explicit user approval before any code landed.
|
||
|
||
### The finding — option (a) does not work as §2 frames it
|
||
|
||
§2 selected option (a): a standalone zero-byte `Node::Guard(fn)`
|
||
placed as the first element of the SQL branch's `Seq`, inside a
|
||
`Choice(SQL_shape, DSL_shape)`. §2 framed this as the "smallest
|
||
mechanism change" that "reuses the existing validator path" with
|
||
the failure surfacing as the same `WalkOutcome::ValidationFailed`
|
||
as the entry-word gate — implying **no `walk_choice` change**.
|
||
|
||
Tracing the four required behaviours against
|
||
`src/dsl/walker/driver.rs` shows that framing is inaccurate, and
|
||
that **any** guard-in-`Choice` mechanism is forced to modify
|
||
`walk_choice`:
|
||
|
||
1. **`walk_choice` returns immediately on a branch `Failed`**
|
||
(`driver.rs` `walk_choice`) — it only falls through to the
|
||
next branch on `NoMatch`. So in **Simple mode**,
|
||
`delete from t where id = 1` (which is *valid DSL*) would hit
|
||
the SQL branch's guard, the guard would fail, and the user
|
||
would see the "this is SQL" hint instead of the statement
|
||
running as DSL. That contradicts §2's own "Simple mode → DSL
|
||
only" promise. Fixing it requires a `walk_choice` change
|
||
regardless of how the guard node is shaped.
|
||
|
||
2. **`walk_seq` treats a `NoMatch` at `idx > 0` as a hard
|
||
`Failed`** (`driver.rs` `walk_seq`). A zero-byte Guard at
|
||
`idx 0` still advances `idx` to 1, pushing the SQL branch's
|
||
first real token to `idx 1`. So in **Advanced mode**,
|
||
`delete from t --all-rows` (DSL-only) cannot fall through to
|
||
DSL — the SQL branch returns `Failed`, and `walk_choice`
|
||
won't try the DSL branch. The literal form needs an
|
||
*additional* `walk_seq` semantics change (`idx == 0` → "no
|
||
bytes consumed yet").
|
||
|
||
This is exactly the R1 risk the ADR budgeted for ("if `walk_seq`
|
||
commits the Seq before the validator runs … the Choice can't
|
||
fall through").
|
||
|
||
### The replacement — category-grouped, mode-aware dispatch
|
||
|
||
Rather than gate a `Choice` branch from inside the recursive
|
||
grammar, the dispatch decision moves up to the command
|
||
dispatcher (`walker::walk`), which is **where mode gating
|
||
already lives**: the existing whole-command gate
|
||
(`is_advanced_only` + the Simple-mode short-circuit in `walk`)
|
||
is precisely this pattern, hardcoded for `select` / `with`. The
|
||
amendment generalises that existing code instead of inventing a
|
||
parallel in-grammar mechanism.
|
||
|
||
Concretely:
|
||
|
||
- **Each registry entry carries a category** —
|
||
`CommandCategory::{Simple, Advanced}`. The category lives at
|
||
the registration site (the `REGISTRY` table), not as a field
|
||
on every `CommandNode` literal, because category is a
|
||
dispatcher concern, not intrinsic to a command's grammar.
|
||
`select` / `with` (and, from 3b, the SQL `insert` / `update` /
|
||
`delete` nodes) are `Advanced`; everything else is `Simple`.
|
||
This subsumes `ADVANCED_ONLY_ENTRIES`: `is_advanced_only(entry)`
|
||
becomes "every registry candidate for this entry word is
|
||
`Advanced`".
|
||
|
||
- **A shared entry word has a node in both groups** — e.g. a
|
||
`Simple` DSL `insert` node and an `Advanced` SQL `insert`
|
||
node, both with entry word `insert`. The entry-word lookup
|
||
returns *all* candidates; the dispatcher selects by mode.
|
||
|
||
- **Mode-aware selection in `walk` (no combinator changes):**
|
||
- **Simple mode** considers `Simple` candidates. If a Simple
|
||
candidate exists, it is committed. If none exists but an
|
||
`Advanced` candidate does (a pure-SQL entry word like
|
||
`select`), the dispatcher emits the
|
||
`advanced_mode.sql_in_simple` `ValidationFailed` — identical
|
||
to today's whole-command gate.
|
||
- **Advanced mode** tries `Advanced` candidate(s) first, then
|
||
the `Simple` candidate as a fallback. The first candidate
|
||
that fully matches wins (`delete from t where id = 1` →
|
||
SQL; `delete from t --all-rows` → falls through to DSL
|
||
because the SQL shape doesn't match `--all-rows`).
|
||
|
||
- **No `Node::Guard`, no `walk_choice` / `walk_seq` change.** The
|
||
recursive combinators are untouched; each candidate is walked
|
||
cleanly and independently. Speculative evaluation runs on a
|
||
scratch `WalkContext`; only the committed candidate is walked
|
||
into the caller's context, so post-walk context state
|
||
(completion / hint accumulators) always reflects the chosen
|
||
candidate.
|
||
|
||
### The two details (user-approved)
|
||
|
||
1. **"This is SQL" hint in Simple mode for SQL-shaped input.**
|
||
For SQL-only input in Simple mode (e.g.
|
||
`delete … returning *`), the Simple candidate won't match.
|
||
To keep parity with `select` / `with`, the dispatcher
|
||
speculatively checks whether the `Advanced` candidate *would*
|
||
match; if so, it emits the `advanced_mode.sql_in_simple`
|
||
hint rather than a generic DSL parse error. Lives entirely
|
||
in `walk`.
|
||
|
||
2. **Advanced-mode completion is SQL-first, DSL as full-line
|
||
fallback.** The `Choice` approach would have unioned SQL and
|
||
DSL continuations mid-typing for free; the grouping approach
|
||
surfaces one candidate's walk for completion / hints. The
|
||
decision is to show SQL completions primarily in Advanced
|
||
mode and fall to DSL only on a full DSL-shaped line — simpler
|
||
and matching the "SQL-first, DSL as familiar fallback"
|
||
intent. No unioned completion.
|
||
|
||
### Consequences of the amendment
|
||
|
||
- **`Node::Guard` is NOT added.** §2's mechanism caveat,
|
||
Consequences bullet 1 ("the walker gains a new node variant
|
||
`Node::Guard`"), and the original 3a scope are withdrawn.
|
||
- **`REGISTRY`'s element type changes** to carry the category
|
||
alongside each `&CommandNode`. The handful of registry
|
||
iteration sites (entry-word listing, `command_for_entry_word`,
|
||
the completion entry-word filters, `note_help`) destructure the
|
||
tuple; no behavioural change for the current single-candidate
|
||
registry.
|
||
- **The `Command` variants (§10), worker handlers, RETURNING
|
||
flag, cascade-summary plan, shortid auto-fill, and diagnostics
|
||
are unaffected** — this amendment is purely about how the
|
||
dispatcher routes a shared entry word to the right grammar.
|
||
- **Sub-phase 3a now validates the dispatch mechanism**, not the
|
||
guard: a smoke registry with a shared entry word (a `Simple`
|
||
and an `Advanced` node using distinguishable tokens) exercises
|
||
the Simple/Advanced selection, the fallback, and the
|
||
"this is SQL" path. The 3a exit gate's four cases map onto the
|
||
new mechanism unchanged in spirit.
|
||
- **3b adds the first real shared entry words** to `REGISTRY`
|
||
(the SQL `insert` / `update` / `delete` nodes). At that point
|
||
the completion entry-word *lists* will need to de-duplicate
|
||
the shared primary (two candidates, one entry word) — tracked
|
||
for 3b, not 3a.
|
||
- The mechanism remains reusable for Phase 4 DDL's analogous
|
||
shared-entry problem (`create`, …): tag the SQL DDL nodes
|
||
`Advanced`; the dispatcher handles the rest.
|
||
|
||
## Amendment 2 — Cascade summary on SQL DELETE: count-diff parity supersedes WHERE-injected pre-count (2026-05-22)
|
||
|
||
This amendment **supersedes §7's "Predicate extraction" subsection
|
||
and the pre-count mechanism in §7 steps 1–3**, and **withdraws Open
|
||
Implementation Risk R2**. It was written during sub-phase 3f, after
|
||
tracing §7's prescribed mechanism against the actual DSL `do_delete`
|
||
handler, and is recorded with explicit user approval before any 3f
|
||
code landed.
|
||
|
||
### The finding — §7's premise about the DSL handler is incorrect
|
||
|
||
§7 prescribes a per-child **pre-execution pre-count** of the form
|
||
`SELECT count(*) FROM <child> WHERE <fk_col> IN (SELECT <pk_col> FROM
|
||
<target_table> WHERE <user_where_predicate>)`, with the
|
||
`<user_where_predicate>` re-injected from the SQL DELETE's
|
||
WHERE-clause source bytes. Its stated justification:
|
||
|
||
> "The DSL handler reuses the typed `Expr` AST to construct the
|
||
> pre-count subqueries."
|
||
|
||
Tracing `do_delete` in `src/db.rs` shows this premise is factually
|
||
wrong. The DSL handler does **not** construct pre-count subqueries
|
||
from the typed `Expr`. It detects cascade effects by **before/after
|
||
row-count diffing**: read the inbound relationships, count each child
|
||
table's rows *before* the DELETE, execute the DELETE inside a
|
||
transaction, count each child *after*, and report the positive
|
||
difference as a `CascadeEffect`. No pre-count query and no
|
||
`Expr`-derived subquery exist anywhere in the path.
|
||
|
||
### Two consequences of the correct mechanism
|
||
|
||
1. **Count-diff reports `ON DELETE CASCADE` row removals only.** `ON
|
||
DELETE SET NULL` does not change a child's row count, so the DSL
|
||
path does not report it (its own code comment notes this).
|
||
Reporting SET NULL would require value-level diffing.
|
||
|
||
2. **§7's pre-count is in tension with the parity promise.** §2
|
||
promises "a DSL-style … DELETE in Advanced mode produces identical
|
||
observable behaviour whether it routes through the SQL or DSL
|
||
path", and the plan's 3f exit gate requires the SQL cascade summary
|
||
to *match* the DSL `do_delete` output. A pre-count that counted
|
||
`ON DELETE CASCADE` *and* `SET NULL` (as §7 step 1 says) would
|
||
report **more** than the DSL path — breaking that parity. To match
|
||
the DSL path it would have to drop SET NULL anyway, leaving
|
||
WHERE-byte extraction and the R2 N+1-subquery pathology as pure
|
||
cost with no benefit over count-diff.
|
||
|
||
### The replacement — count-diff parity
|
||
|
||
`do_sql_delete` mirrors `do_delete` exactly: read inbound
|
||
relationships, snapshot child row counts, run the **verbatim**
|
||
validated DELETE SQL inside a transaction via
|
||
`execute_with_fk_enrichment`, diff the child counts, build the same
|
||
`Vec<CascadeEffect>`, and re-persist the target plus every
|
||
cascade-affected child before commit. Because both handlers return
|
||
the same `DeleteResult` and both route through
|
||
`CommandOutcome::Delete` → `handle_dsl_delete_success`, the
|
||
per-relationship summary formatter is **already shared at the render
|
||
layer** — no formatter refactor and no duplicate path.
|
||
|
||
This dominates the §7 mechanism on every axis the 3f exit gate
|
||
measures:
|
||
|
||
- **Exact parity** with the DSL path — identical detection mechanism,
|
||
not merely a matching formatter.
|
||
- **No WHERE-clause byte extraction.** The verbatim DELETE (including
|
||
any subquery in its WHERE) executes once; the engine cascades; the
|
||
diff observes the result. The R2 invariant case (`DELETE FROM a
|
||
WHERE id IN (SELECT … )`) is correct by construction and carries no
|
||
N+1 cost. **R2 is withdrawn.**
|
||
- **Shared render** with no duplicate formatter, satisfying §7's
|
||
shared-formatter promise structurally.
|
||
|
||
### SET NULL reporting — deferred for both paths (user-confirmed)
|
||
|
||
3f keeps strict DSL parity: `ON DELETE CASCADE` row removals are
|
||
reported; `SET NULL` is not, on either path. Adding SET NULL
|
||
reporting (value-level diffing) is a tracked future enhancement to be
|
||
applied to **both** `do_delete` and `do_sql_delete` together so
|
||
parity is preserved. The user confirmed this deferral.
|
||
|
||
### Consequences of the amendment
|
||
|
||
- §7's "Predicate extraction" subsection and pre-count steps 1–3 are
|
||
superseded by count-diff; the per-relationship summary and
|
||
shared-formatter outcomes are unchanged.
|
||
- **R2 (cascade-summary predicate extraction) is withdrawn** — the
|
||
mechanism it budgeted for is no longer used.
|
||
- `Command::SqlDelete` carries `{ sql, target_table }` only — no
|
||
WHERE-clause field is needed (the worker never inspects the
|
||
predicate).
|
||
- The handoff-31 §4 "WHERE-byte-extraction is tractable for DELETE"
|
||
heads-up is moot — no extraction happens.
|
||
|
||
## Amendment 3 — Command identity is intrinsic; execution-mode is side-channel (deferred) (2026-05-23)
|
||
|
||
This amendment **clarifies the command-identity model** that §2 and
|
||
Amendment 1 implement, **corrects a false premise in the
|
||
implementation plan's sub-phase 3j exit gate**, and **records a
|
||
deferred follow-up** (the execution-time mode side-channel). It was
|
||
written during sub-phase 3j — wiring the shared `insert` / `update` /
|
||
`delete` entry words — after the dispatch model's interaction with the
|
||
existing test suite surfaced a question about what a "command" *is* and
|
||
how input mode relates to it. Recorded with explicit user approval
|
||
before any 3j code landed.
|
||
|
||
### The model — a command is a mode-rooted grammar-path outcome
|
||
|
||
A command is the typed outcome of a grammar path **rooted at the input
|
||
mode**: `simple → update → …` or `advanced → update → …`. Its identity
|
||
is **intrinsic** — determined by *which grammar matched*, not by a mode
|
||
flag carried on one shared command type. In Advanced mode the
|
||
dispatcher (Amendment 1) tries the Advanced (SQL) candidate(s) first and
|
||
**falls back to the Simple (DSL) candidate** when no Advanced branch
|
||
recognizes a token; the fallback produces the **Simple** command.
|
||
|
||
Worked example (the `--all-rows` fall-through): `delete from T
|
||
--all-rows` in Advanced mode — no Advanced (SQL) branch recognizes
|
||
`--all-rows` (the SQL `DELETE` has no expression slot after the table
|
||
to absorb it), so the path falls back to `simple → delete → … →
|
||
all-rows` and yields `Command::Delete { filter: AllRows }` (the DSL
|
||
command), exactly as it would in Simple mode. The mode rooted the path;
|
||
the command that came out is the Simple one.
|
||
|
||
*Counter-example (no fall-through):* `update T set x = 42 --all-rows`
|
||
in Advanced mode does **not** fall back — the SQL `UPDATE`'s
|
||
`SET <expr>` greedily consumes `--all-rows` as the expression
|
||
`42 - -all - rows` (with `all`/`rows` as column refs), so the SQL shape
|
||
matches and `Command::SqlUpdate` is produced. Whether `--all-rows` is a
|
||
DSL flag or part of a SQL expression is decided by which grammar wins,
|
||
not by special-casing the token. (At execution this is harmless: the
|
||
engine treats `--all-rows` as a SQL line comment, so the statement runs
|
||
as `update T set x = 42` — all rows — the same effect as the DSL flag.)
|
||
|
||
### Simple mode commits the DSL candidate; an advanced-mode pointer combines
|
||
|
||
In **Simple mode** a shared entry word **always commits its DSL
|
||
candidate**, so the user sees DSL completion and the *real* DSL error
|
||
(with its position) — not a bare "this is SQL" that discards the
|
||
actionable detail. A plain `ValidationFailed`/`Mismatch` "this is SQL"
|
||
dispatch outcome is reserved for entry words that have **no** DSL form
|
||
(`select` / `with`); for those the DSL surface has nothing to offer, so
|
||
the simple-mode gate points at advanced mode (ADR-0030 §2).
|
||
|
||
To keep the SQL-discoverability the original §2 envisioned *without*
|
||
losing the DSL fix, the rendering layer **combines** them: when a
|
||
simple-mode line is a *definite* DSL error (not merely incomplete) and
|
||
the same line would parse in advanced mode, the DSL error prose is
|
||
suffixed with the `advanced_mode.also_valid_sql` pointer — e.g.
|
||
`for \`Name\`: Type a quoted string … (valid as SQL in advanced mode —
|
||
\`mode advanced\` or prefix \`:\`)`. The DSL detail and the mode hint
|
||
coexist (`input_render::ambient_hint_in_mode` /
|
||
`advanced_alternative_note`). Mid-typing (incomplete) input is not
|
||
suffixed, to avoid noise during normal DSL entry. This supersedes the
|
||
draft of this amendment's earlier suggestion that a SQL-shaped line in
|
||
simple mode emits a *bare* "this is SQL" hint for shared words.
|
||
|
||
### Overlapping inputs produce two distinct commands — both are tested
|
||
|
||
For a **fully-overlapping** shape — `insert into T values (…)` is valid
|
||
in both grammars — Simple mode yields `Command::Insert` (a typed AST,
|
||
DSL execution) and Advanced mode yields `Command::SqlInsert` (validated
|
||
text, SQL execution). This is **correct, not a defect**: the two
|
||
commands *do the same thing but execute differently* (ADR-0030 §4 — the
|
||
DSL lowers to a typed AST; SQL is grammar-as-text run verbatim). Because
|
||
they are distinct commands, **each is tested in the mode that produces
|
||
it** — DSL grammar tests run in Simple mode; the SQL command variants
|
||
are tested in Advanced mode.
|
||
|
||
### Correction to the plan's 3j exit gate
|
||
|
||
The implementation plan's sub-phase 3j exit gate says "all existing DSL
|
||
`INSERT`/`UPDATE`/`DELETE` tests still green — unmodified inputs,
|
||
unmodified outputs", and its scope-out says "observable behaviour must
|
||
be identical before and after 3j". That rested on a **false premise**:
|
||
that the existing DSL DML tests run in Simple mode. They do not — the
|
||
common helpers (`ok`/`err`/`parse`) call `parse_command`, which
|
||
**defaults to Advanced mode**. Today that is harmless because
|
||
`insert`/`update`/`delete` have no SQL competitor, so they route to the
|
||
DSL command even in Advanced mode. Once they become **shared** entry
|
||
words, §2's Advanced-mode **SQL-first** rule routes the overlap to the
|
||
SQL command (`Command::Sql*`), changing the parsed variant those tests
|
||
assert.
|
||
|
||
The corrected invariant: **Simple-mode behaviour is unchanged; Advanced
|
||
mode is SQL-first per §2; the DSL grammar is tested in Simple mode (its
|
||
canonical surface, ADR-0003); both command variants are tested in their
|
||
producing mode.** No *production* behaviour regresses — the §6 (shortid
|
||
auto-fill) and §7 (cascade summary) parity promises keep the two paths
|
||
observably equivalent for overlapping inputs; only the
|
||
implementation-internal `Command` variant differs, and only in Advanced
|
||
mode.
|
||
|
||
### Replay uses the same parser, in advanced mode — and needs no per-line mode
|
||
|
||
A consequence worth stating explicitly, because it is easy to
|
||
over-think (and a prior reading of this work did). **Replay parses
|
||
each recorded line with the same schema-aware parser the interactive
|
||
path uses, in advanced mode** (the full surface) — it skips nothing
|
||
and simplifies nothing. The only difference from an interactive line
|
||
is the mode argument: interactive uses the user's current mode,
|
||
replay fixes it to advanced.
|
||
|
||
This is correct and complete **without** the per-line execution-mode
|
||
side-channel below, because of the §6 (shortid auto-fill) and §7
|
||
(cascade-summary) parity guarantees: a valid overlapping command
|
||
produces *identical effects* whether it lowers to the DSL variant or
|
||
the SQL variant. So replaying a log of valid commands in advanced
|
||
mode is identical to typing those lines interactively in advanced
|
||
mode — and, by parity, identical in effect to the simple-mode entry
|
||
that originally produced them. (Replay only ever sees lines that
|
||
already executed successfully: `history.log` is success-only, and
|
||
ADR-0034's deferred journal replays `ok` lines only.)
|
||
|
||
The one observable nuance is purely in **error reporting for an
|
||
invalid line** — which only arises from a hand-built script, never
|
||
from a real journal. Such a line is still rejected and not applied;
|
||
*where* the rejection lands just follows the grammar, exactly as
|
||
interactively: an `insert … values …` line is SQL in advanced mode,
|
||
so a wrong column-type value is rejected by the engine at execute
|
||
time rather than by a DSL typed slot at parse time. This is **not**
|
||
replay using a lesser parser — it is the same advanced-mode parse a
|
||
user would get typing the line. Replay therefore does **not** depend
|
||
on the deferred side-channel below.
|
||
|
||
### Execution-time mode is side-channel — deferred to its own ADR
|
||
|
||
Independent of command *identity*, every command should — at **execution
|
||
time** — know which of three modes it ran under: `simple`, `advanced`,
|
||
or `advanced-one-shot` (the `:` escape from Simple mode, ADR-0003), so
|
||
execution can adjust **output** without changing **identity** (e.g. a
|
||
Simple-mode `create table` echoing the generated SQL when run in
|
||
Advanced mode, while staying silent in Simple mode).
|
||
|
||
Today this exists only as a **rendering** side-channel
|
||
(`OutputLine.mode_at_submission`, consumed by the echo-line mode tag in
|
||
`ui.rs`). The `Mode` enum is two-way (`Simple` / `Advanced`); the
|
||
one-shot distinction is a transient "effective mode" collapsed at
|
||
submission; and neither `Action::ExecuteDsl` nor the database worker
|
||
carries any mode. Wiring an **execution-time** mode side-channel —
|
||
widening `Mode` to the three-way distinction and threading it through
|
||
the `Action` → worker interface — is **out of scope for Phase 3** and is
|
||
**deferred to its own ADR** (user-confirmed). It is *not* required for
|
||
Phase 3's dispatch to be correct: the routing and the `--all-rows`
|
||
fall-through above are complete on the two-way mode. This amendment
|
||
forward-references that future ADR so the requirement is not lost.
|
||
|
||
### Consequences of the amendment
|
||
|
||
- **No code or grammar change** from this amendment itself — it records
|
||
the model that the dispatch (Amendment 1) already implements and
|
||
corrects the plan's test-mode premise.
|
||
- The 3j test migration **moves the DSL grammar / completion tests to
|
||
Simple mode** (inputs and `Command::Insert`/`Update`/`Delete`
|
||
assertions unchanged), **keeps the `--all-rows` fall-through tests in
|
||
Advanced mode** (they validate the fallback to the Simple command),
|
||
relies on the migrated `tests/sql_*.rs` for the SQL command variants
|
||
in Advanced mode, and adds dispatcher routing tests
|
||
(Advanced + structurally-ambiguous → SQL; Advanced + `--all-rows` →
|
||
DSL fallback; Simple + a SQL-only *entry word* `select` → "this is
|
||
SQL"; Simple + a shared word with a SQL-only *construct* → DSL error,
|
||
carrying the combined pointer at the rendering layer).
|
||
- The **combined pointer** lives in `input_render::ambient_hint_in_mode`
|
||
(live typing) and `App::dispatch_dsl` (submit), both via the shared
|
||
`advanced_alternative_note`, so a Simple-mode definite DSL error that
|
||
would run as SQL gains the `advanced_mode.also_valid_sql` suffix in
|
||
both surfaces (the submit path covers SQL constructs that surface
|
||
only on submit, e.g. `delete … returning`). Found via the `/runda`
|
||
round.
|
||
- **Internal-table rejection symmetrised** (`/runda` finding B,
|
||
ADR-0030 §6): the DSL data-command target slots (`insert` / `update` /
|
||
`delete` / `show data` / `show table`) gained
|
||
`reject_internal_table`, so `__rdbms_*` tables are refused in Simple
|
||
mode too — previously only the advanced SQL grammar rejected them, so
|
||
a simple-mode DML could touch the internal metadata tables.
|
||
- The **execution-time mode side-channel + three-way `Mode`** is tracked
|
||
as a future ADR; `OutputLine.mode_at_submission` remains the only mode
|
||
side-channel until then. No structure built in 3j assumes mode is
|
||
irrelevant to execution.
|
||
|
||
## Amendment 4 — `update … --all-rows` falls back to the DSL; Amendment 3's counter-example was a bug (2026-05-27)
|
||
|
||
Designing the DSL → SQL teaching echo (ADR-0038) re-examined Amendment 3's
|
||
**counter-example** — the claim that `update T set x = 42 --all-rows` in
|
||
Advanced mode does *not* fall back to the DSL because the SQL `UPDATE`'s
|
||
`SET <expr>` "greedily consumes `--all-rows` as the expression
|
||
`42 - -all - rows` (with `all`/`rows` as column refs)." **That is a bug,
|
||
not correct behaviour**, and this amendment reverses it. Recorded with
|
||
explicit user approval (2026-05-27).
|
||
|
||
### Why it is a bug
|
||
|
||
- The walker has **no `--` comment support** (it lexes two minus
|
||
operators); the engine *does* treat `--` as a line comment. So the
|
||
walker's parse (`42 - -all - rows`) **diverges from execution** (which
|
||
runs `update T set x = 42`).
|
||
- Absent columns literally named `all` and `rows`, the walker is
|
||
**silently accepting a `SET` expression over non-existent columns** —
|
||
precisely the case ADR-0027 says to flag ("mark error if we know it
|
||
will fail at runtime"). It only *appears* to succeed because the
|
||
engine's comment leniency masks the divergence.
|
||
- **Trailing SQL comments are not a supported feature** of the surface,
|
||
so relying on the engine to treat `--all-rows` as one is wrong.
|
||
|
||
So the SQL `UPDATE` shape should **not** match `… --all-rows`.
|
||
|
||
### Decision
|
||
|
||
The `--all-rows` token sequence makes the **SQL `UPDATE` shape fail**, so
|
||
dispatch **falls back to the DSL** `Command::Update { filter: AllRows }`
|
||
— restoring symmetry with `delete … --all-rows`, which already falls
|
||
back (the SQL `DELETE` has no slot to absorb the flag). Consequences:
|
||
|
||
- `update … --all-rows` in Advanced mode is the **DSL** command, runs all
|
||
rows via the safe DSL path, and (ADR-0038) gains the teaching echo
|
||
`UPDATE T SET … ` with the `WHERE` omitted.
|
||
- **No `--` comment feature is introduced**; the playground still does
|
||
not support trailing SQL comments (consistent with the rest of the
|
||
surface, engine-neutral).
|
||
|
||
### Mechanism — key on the **adjacent `--`**
|
||
|
||
The DSL `--all-rows` is matched by `Node::Flag("all-rows")` via
|
||
`consume_flag` (`walker/lex_helpers.rs`), which requires an **adjacent
|
||
`--`**. The SQL expression grammar has no flag node, so it consumes the
|
||
same source as `- -all - rows` (two minus operators, `all`/`rows` as
|
||
column refs) and the SQL `UPDATE` shape wins (SQL-first). Crucially,
|
||
`--all-rows` and a legitimate `- -all` tokenise *identically* once split,
|
||
so the **only** robust discriminator is the adjacency of the two dashes:
|
||
|
||
- **The fix:** the SQL expression treats an **adjacent `--`** as a
|
||
boundary it will not consume, so the SQL `UPDATE` shape **fails** there
|
||
and dispatch falls back to the DSL, whose `Node::Flag` matches
|
||
`--all-rows`. (Adjacency is what separates the DSL flag from real
|
||
arithmetic — see the preserved case below.)
|
||
|
||
Verified consequences (probed against the current grammar), which the
|
||
build must lock down test-first:
|
||
|
||
- `update T set x = 42 --all-rows` → DSL `Update { AllRows }` *(the goal;
|
||
inverts `sql_dml_e2e.rs::e2e_update_all_rows_in_advanced_does_not_fall_back_to_dsl`)*.
|
||
- `update T set x = 42 - -3` → **unchanged**, stays `SqlUpdate` (= 45):
|
||
the dashes are space-separated, not an adjacent `--`. This invariant is
|
||
non-negotiable and gets its own test.
|
||
- `update T set x = 42--3` and `update T set x = 42 --col` *(adjacent
|
||
`--` before a number or a real column)* → today they parse `Ok` as
|
||
`SqlUpdate`; after the fix the SQL expression stops at `--` and the DSL
|
||
flag grammar (which accepts only `all-rows`) rejects the rest, so they
|
||
become **parse errors**. This is a deliberate, acceptable behaviour
|
||
change for these contrived adjacent-dash inputs — the playground does
|
||
not support `--` line comments, so there is no valid reading of an
|
||
adjacent `--` here. *(User heads-up flagged 2026-05-27.)*
|
||
|
||
The existing `delete … --all-rows` fall-back is unaffected. Folded into
|
||
the ADR-0038 echo effort (the fix that makes `update … --all-rows`
|
||
echoable).
|
||
|
||
## Amendment 5 — `also_valid_sql` fires on *validity*, not just *parse* (2026-05-28)
|
||
|
||
Amendment 3 introduced the `advanced_mode.also_valid_sql` pointer and
|
||
described its gate as *"the same line would parse in advanced mode."*
|
||
Issue #1 surfaced that this is too weak: an advanced-mode positional
|
||
`INSERT INTO T VALUES (…)` with a value count that doesn't match the
|
||
target table's column count **parses** (the SQL grammar accepts any
|
||
positional value count) but then fails at the engine. The pointer
|
||
fired, telling the user to switch modes, and the same line failed
|
||
again — the suggestion was misleading. Recorded with explicit user
|
||
approval (2026-05-28).
|
||
|
||
### Why "parse" was the wrong bar
|
||
|
||
"Would parse" is a purely syntactic check: the structural grammar
|
||
accepts the input. It says nothing about whether subsequent static
|
||
checks (type slots, INSERT arity, predicate warnings, schema-existence
|
||
diagnostics, …) would also pass. The cross-mode pointer's job is to
|
||
direct the user to the mode that **actually** runs the line; a
|
||
syntactic-only gate over-promises in every case where a non-grammar
|
||
static check would still reject. The Form B positional arity case is
|
||
just the first one observed.
|
||
|
||
### "Valid" — the explicit definition used by this amendment
|
||
|
||
The pointer fires only when the line is **valid** in advanced mode,
|
||
where **valid** is the ADR-0027 sense — operationally, the value
|
||
returned by `crate::dsl::walker::input_verdict_in_mode(input, schema,
|
||
Mode::Advanced)` is `None`:
|
||
|
||
- The structural parse succeeds (`WalkOutcome::Match`), **and**
|
||
- No diagnostic of `Severity::Warning` or `Severity::Error` is
|
||
produced by any pass that contributes to `input_verdict_in_mode`
|
||
(schema-existence, SQL-predicate warnings, compound-arity errors,
|
||
INSERT-arity errors — Form A *and* Form B — and any further
|
||
diagnostic added to the pipeline in the future).
|
||
|
||
This is the same condition the in-app `[ERR]` / `[WRN]` validity
|
||
indicator uses (ADR-0027 §2). Tying the pointer to it gives one
|
||
canonical "is this line OK in this mode?" answer that every surface
|
||
shares — by construction, the indicator and the pointer can never
|
||
disagree about a given line.
|
||
|
||
### Decision
|
||
|
||
- `advanced_alternative_note` (`src/input_render.rs`) reads
|
||
`input_verdict_in_mode(…, Mode::Advanced)`. The pointer fires only
|
||
for verdict `None`.
|
||
- Any static diagnostic that participates in the verdict automatically
|
||
participates in the gate. No bespoke pre-flight check lives in the
|
||
pointer helper.
|
||
- The Amendment 3 description (*"would parse in advanced mode"*)
|
||
should be read as a synonym for **valid** in this amendment's
|
||
sense, not as a literal syntactic-only test.
|
||
|
||
### Diagnostic added to close the issue #1 gap
|
||
|
||
ADR-0033 §8.1's existing `dml_insert_arity_diagnostics` checked only
|
||
Form A (column list present). Its own doc-comment recorded the Form B
|
||
case as deferred. This amendment closes that gap: the same function
|
||
now also checks the Form B case (no column list) by comparing each
|
||
VALUES tuple's arity against the target table's full column count from
|
||
the schema cache, emitting a new diagnostic key
|
||
`diagnostic.insert_arity_mismatch_form_b`. The `[ERR]` validity
|
||
indicator now lights up at typing time for the user-reported scenario,
|
||
and the cross-mode pointer gate sees the verdict and stays silent.
|
||
|
||
### Tests
|
||
|
||
The amendment's behavioural promise is locked down by:
|
||
|
||
- `dsl::walker::tests::insert_form_b_arity_mismatch_under_supply_fires` —
|
||
the user's reported 4-col / 3-value case produces an Error diagnostic.
|
||
- `dsl::walker::tests::insert_form_b_arity_mismatch_over_supply_fires` —
|
||
symmetric case, 5 values for a 4-col table.
|
||
- `dsl::walker::tests::insert_form_b_arity_match_is_silent` — happy path
|
||
(every column supplied, autos included) stays silent.
|
||
- `dsl::walker::tests::insert_form_b_arity_unknown_table_is_silent` —
|
||
defensive: an unknown table defers to the schema-existence pass.
|
||
- `input_render::tests::ambient_hint_omits_advanced_pointer_when_form_b_value_count_wouldnt_match` —
|
||
the pointer is suppressed for the bug case via the validity verdict.
|
||
- `app::tests::simple_mode_submit_of_sql_construct_appends_advanced_pointer` —
|
||
multi-row VALUES (a true SQL-only construct) still fires the pointer
|
||
when the table exists in the schema.
|
||
|
||
### Behaviour preserved
|
||
|
||
The pointer still fires for every input that is genuinely SQL-only
|
||
syntax in simple mode and is valid as advanced SQL — `delete …
|
||
returning *`, multi-row VALUES against a known table, `INSERT … ON
|
||
CONFLICT …` over a known target, and so on. The narrowing is
|
||
**exactly** the set of inputs where switching modes wouldn't help, no
|
||
more.
|
||
|
||
## See also
|
||
|
||
- ADR-0005 — the ten-type vocabulary INSERT works with.
|
||
- ADR-0006 — the auto-snapshot before destructive ops.
|
||
- ADR-0014 — the DSL DML model + cascade-summary precedent.
|
||
- ADR-0015 — the persistence write-through path.
|
||
- ADR-0018 — the DSL INSERT forms (A / B / C).
|
||
- ADR-0019 — the friendly-error layer DML errors route through.
|
||
- ADR-0027 — the validity indicator surfaces DML diagnostics.
|
||
- ADR-0030 — the SQL surface in advanced mode; Phase 3 is this
|
||
ADR's commission.
|
||
- ADR-0031 — the SQL expression grammar, consumed by DML
|
||
WHERE / SET / VALUES / RETURNING.
|
||
- ADR-0032 — the SQL SELECT grammar; `INSERT … SELECT` and
|
||
RETURNING reuse its projection-list and scope-frame machinery.
|