4e16d97fe0
Replaces ADR-0033 §2's original Node::Guard + Choice(SQL,DSL) mechanism,
which was found during 3a to be unworkable: any guard-in-Choice approach
forces a walk_choice change (walk_choice falls through only on NoMatch, so
simple-mode valid-DSL would wrongly surface "this is SQL"), and walk_seq
treats a NoMatch past idx 0 as a hard Failed, breaking advanced-mode DSL
fall-through.
Mechanism (Amendment 1): each REGISTRY entry is tagged
CommandCategory::{Simple, Advanced}, generalising the whole-command
is_advanced_only gate. walk() becomes a thin dispatcher over decide()
(mode-aware candidate selection: simple commits the DSL node or emits the
"this is SQL" hint; advanced tries SQL first, DSL as a full-line fallback)
and an extracted walk_one_command(); speculative match-testing runs on a
scratch WalkContext so the caller's context is only touched by the
committed walk. No Node::Guard, no walk_choice/walk_seq change.
6 dispatch smoke tests on a shared-entry-word smoke registry; 1446 baseline
green; clippy clean.
1194 lines
49 KiB
Markdown
1194 lines
49 KiB
Markdown
# ADR-0033: The full SQL DML grammar — `INSERT` / `UPDATE` / `DELETE`
|
|
|
|
## Status
|
|
|
|
Proposed
|
|
|
|
## Context
|
|
|
|
ADR-0030 commissions advanced mode as a body of SQL grammar
|
|
inside the unified grammar tree. Phase 1 ("foundations + first
|
|
`SELECT`") and Phase 2 ("`SELECT` — full") have shipped through
|
|
ADR-0031 and ADR-0032 respectively. The next slice on ADR-0030's
|
|
roadmap is **Phase 3 — DML**: `INSERT` / `UPDATE` / `DELETE` as
|
|
validated SQL.
|
|
|
|
ADR-0030 §3 fixed the headline scope:
|
|
|
|
> **`INSERT`** (single- and multi-row), **`UPDATE`**, **`DELETE`**.
|
|
|
|
ADR-0030 §4 fixed the execute path:
|
|
|
|
> **DML and `SELECT`** → executed as the **validated SQL itself**
|
|
> (re-rendered canonically from the matched parse, or the validated
|
|
> original text). … For DML the worker — knowing the statement kind
|
|
> and target table from the parse — runs the statement and
|
|
> re-persists that table's CSV.
|
|
|
|
ADR-0030 §11 fixed the persistence model. ADR-0030 §12 fixed the
|
|
safety posture: SQL `UPDATE`/`DELETE` without `WHERE` executes as
|
|
written (no `--all-rows` rail).
|
|
|
|
ADR-0030 §3 also flagged this ADR's existence in advance:
|
|
|
|
> Settle multi-row `INSERT` and `shortid` auto-fill on a SQL
|
|
> `INSERT`.
|
|
|
|
This ADR fixes the **grammar shape**, the **dispatch
|
|
architecture** for shared entry words, the **diagnostic surface**
|
|
new to DML, and the **execute-path** Command variants. It does
|
|
not revisit ADR-0030's controlling decisions; references in this
|
|
text to ADR-0030 §X mean "that decision is controlling."
|
|
|
|
### What ADR-0030, ADR-0031, ADR-0032 already fix
|
|
|
|
- One walker, grammar-as-text execution, ambient assistance for
|
|
free (ADR-0030 §1/§4/§6/§8).
|
|
- `sql_expr` is the shared expression grammar — UPDATE `SET`
|
|
expressions, WHERE predicates, INSERT VALUES tuples, RETURNING
|
|
projection items all consume it (ADR-0031, ADR-0032 §5/§6).
|
|
- `Node::ScopedSubgrammar` + `from_scope_stack` carry CTEs and
|
|
per-statement scope into nested constructs (ADR-0032 §10).
|
|
- The diagnostic pipeline — `schema_existence_diagnostics`,
|
|
`sql_predicate_warnings`, `compound_arity_diagnostics`,
|
|
`pending_diagnostics` accumulator — applies uniformly to DML's
|
|
`sql_expr` slots (ADR-0032 §11).
|
|
- The friendly-error layer translates engine messages through the
|
|
`engine.*` catalog (ADR-0019, ADR-0030 §7, ADR-0032 §11.5).
|
|
- The `__rdbms_*` rejection at every table-source slot
|
|
(ADR-0030 §6, ADR-0032 §1/§4).
|
|
- `MAX_SUBGRAMMAR_DEPTH = 64` (ADR-0026) is the recursion budget.
|
|
|
|
## Decision
|
|
|
|
### 1. The top-level statement shapes
|
|
|
|
```
|
|
insert_statement := INSERT INTO table_name
|
|
[ '(' column_name_list ')' ]
|
|
( values_clause | sql_select_compound )
|
|
[ on_conflict_clause ]
|
|
[ returning_clause ]
|
|
[ ';' ]
|
|
|
|
update_statement := UPDATE table_name
|
|
SET assignment_list
|
|
[ where_clause ]
|
|
[ returning_clause ]
|
|
[ ';' ]
|
|
|
|
delete_statement := DELETE FROM table_name
|
|
[ where_clause ]
|
|
[ returning_clause ]
|
|
[ ';' ]
|
|
```
|
|
|
|
Where:
|
|
|
|
```
|
|
values_clause := VALUES value_tuple ( ',' value_tuple )*
|
|
value_tuple := '(' sql_expr ( ',' sql_expr )* ')'
|
|
assignment_list := assignment ( ',' assignment )*
|
|
assignment := column_name '=' sql_expr
|
|
returning_clause := RETURNING projection_list
|
|
on_conflict_clause := ON CONFLICT [ '(' column_name_list ')' ]
|
|
( DO NOTHING
|
|
| DO UPDATE SET assignment_list [ where_clause ] )
|
|
```
|
|
|
|
- `table_name` is the existing `Tables` ident slot, with
|
|
`reject_internal_table` validator (ADR-0030 §6).
|
|
- `column_name_list` is comma-separated `Columns` idents — the
|
|
same shape `data::INSERT` (DSL) uses for Form A. Reuses the
|
|
existing pattern; no new node infrastructure.
|
|
- `where_clause` is `WHERE Subgrammar(&sql_expr::SQL_OR_EXPR)`.
|
|
- `projection_list` is the Phase-2 projection list, re-used
|
|
unchanged.
|
|
|
|
The body of an `INSERT … SELECT` recurses through
|
|
`Node::Subgrammar(&sql_select::SQL_SELECT_COMPOUND)` — the same
|
|
recursion point Phase 2's CTE bodies, scalar subqueries, and
|
|
`IN`/`EXISTS` bodies use. The Phase-2 SELECT grammar already
|
|
admits an optional leading `WITH` (Amendment 2 / commit
|
|
`fd25904`), so `INSERT INTO t WITH x AS (…) SELECT * FROM x`
|
|
works for free.
|
|
|
|
### 2. Dispatch — DSL vs SQL on shared entry words
|
|
|
|
The entry words `insert`, `update`, `delete` are **shared**
|
|
between DSL and SQL. This ADR fixes the dispatch rule **and**
|
|
the mechanism that implements it (a known-risk item — see
|
|
"Open implementation risks" below).
|
|
|
|
**Simple mode → DSL only.** SQL constructs (multi-row VALUES,
|
|
INSERT…SELECT, RETURNING, UPSERT) attempted in simple mode
|
|
produce a `ValidationFailed` "this is SQL" hint, matching the
|
|
gating that `select` / `with` already produce (ADR-0030 §2).
|
|
|
|
**Advanced mode → SQL first, DSL fallback.** Each shared entry
|
|
word's `CommandNode.shape` becomes a `Choice` between the SQL
|
|
shape and the DSL shape, declared in that order:
|
|
|
|
```
|
|
shape: Node::Choice(&[SQL_INSERT_SHAPE, DSL_INSERT_SHAPE])
|
|
```
|
|
|
|
The walker's `Choice` is first-match-wins. The SQL shape's
|
|
constructs are a strict superset of the DSL shape's for these
|
|
statements (multi-row VALUES, INSERT…SELECT, RETURNING, UPSERT
|
|
are SQL-only; DSL's `--all-rows` flag is DSL-only). Any
|
|
structurally-ambiguous input — `delete from t where id = 1` —
|
|
matches the SQL shape first and routes through the SQL
|
|
machinery.
|
|
|
|
This is the design point an earlier draft of ADR-0030 §2 left
|
|
open ("choice order or a disambiguator decides"). The chosen
|
|
disambiguator is `Choice` order: **SQL > DSL in Advanced**. A
|
|
DSL-specific construct (e.g., `--all-rows`) won't match the SQL
|
|
shape and falls through to DSL — accommodating, not surprising.
|
|
|
|
Cascade-summary parity (§7) and `shortid` auto-fill parity (§6)
|
|
mean a DSL-style INSERT/UPDATE/DELETE in Advanced mode produces
|
|
identical observable behaviour whether it routes through the SQL
|
|
or DSL path. The user's mental model can be "Advanced mode runs
|
|
SQL; DSL syntax still works as a familiar fallback."
|
|
|
|
#### Mode-gating mechanism
|
|
|
|
The existing `is_advanced_only` infrastructure
|
|
(`src/dsl/grammar/mod.rs::ADVANCED_ONLY_ENTRIES`) gates by
|
|
**entry word**, fires in the walker's `dispatch_command_node`
|
|
path (`walker/mod.rs:1707`), and emits the "this is SQL"
|
|
ValidationFailed verdict. It can't be reused to gate a Choice
|
|
*branch* by mode.
|
|
|
|
Mechanism options for branch-level gating:
|
|
|
|
- **(a) Mode-aware validator on the SQL branch.** Each
|
|
SQL-shape Seq starts with a no-op `Node::Validated` whose
|
|
validator inspects `ctx.mode`; in Simple mode it returns
|
|
`ValidationError(message_key: "advanced_mode.sql_in_simple")`
|
|
before any structural tokens commit. Trade-off: clean, reuses
|
|
the existing validator path; the failure shape is the same
|
|
`WalkOutcome::ValidationFailed` as the entry-word gate.
|
|
- **(b) Mode flag on `Node::Choice`.** A new walker capability:
|
|
`Node::Choice` branches can carry a min-mode marker. Heavier
|
|
(touches the walker's `walk_choice` driver) but lets the
|
|
grammar declare gating without per-branch boilerplate.
|
|
- **(c) Two CommandNodes per entry word; registry picks by
|
|
mode.** The dispatcher consults `ctx.mode` and skips the SQL
|
|
CommandNode in Simple mode. Heavier (registry semantics
|
|
change) and surfaces a "multiple CommandNodes per entry word"
|
|
pattern the rest of the codebase doesn't use.
|
|
|
|
**Default choice: (a) the validator approach.** Smallest
|
|
mechanism change. **Walker-capability caveat:** validators
|
|
today are an `Ident { validator: Option<fn> }` field — there
|
|
is no standalone "guard" node that runs a validator without
|
|
consuming input. Sub-phase 3a's first job is to add
|
|
`Node::Guard(fn(&WalkContext) -> Result<(), ValidationError>)`
|
|
to the walker — a zero-consumption node whose only effect is
|
|
to fail the Seq with a ValidationError in Simple mode. The
|
|
existing internal-table rejection (`reject_internal_table`)
|
|
shows the wiring pattern; `Node::Guard` is the
|
|
"no-byte-consuming version" of it.
|
|
|
|
### 3. Multi-row `INSERT … VALUES`
|
|
|
|
```
|
|
INSERT INTO orders (customer_id, total) VALUES (1, 99.50), (2, 12.00), (3, 6.25)
|
|
```
|
|
|
|
The grammar's `values_clause` repeats `value_tuple` with a comma
|
|
separator (`Node::Repeated { inner: value_tuple, separator:
|
|
Some(COMMA), min: 1 }`). Each tuple matches the same arity as
|
|
the `(column_name_list)` (see §8 for the arity diagnostic).
|
|
|
|
### 4. `INSERT … SELECT`
|
|
|
|
```
|
|
INSERT INTO archive SELECT * FROM orders WHERE created < '2025-01-01'
|
|
INSERT INTO summary (id, total) WITH t AS (…) SELECT id, sum(total) FROM t GROUP BY id
|
|
```
|
|
|
|
A `Choice` between `values_clause` and
|
|
`Subgrammar(&sql_select::SQL_SELECT_COMPOUND)` after the
|
|
optional column list. The Phase-2 SELECT machinery (CTEs, JOINs,
|
|
subqueries, set ops) applies as the row source for free.
|
|
|
|
Arity verification: at parse time, the walker counts the
|
|
declared column list (or, when absent, the target table's column
|
|
count from the schema cache) against the SELECT's projection
|
|
list arity. Mismatch emits `diagnostic.insert_arity_mismatch`
|
|
(see §8).
|
|
|
|
### 5. `RETURNING`
|
|
|
|
`RETURNING projection_list` admits the Phase-2 projection_list
|
|
unchanged — column refs, `*`, `t.*`, expressions with `AS`
|
|
aliases, computed expressions.
|
|
|
|
Execution: the `Command::Sql{Insert,Update,Delete}` variants
|
|
carry a `returning: bool` flag set by the walker's
|
|
`ast_builder` when a `RETURNING` clause matched. The worker
|
|
reads this flag (no SQL text re-parsing) and routes:
|
|
|
|
- `returning == true` → prepare + step the statement,
|
|
collect rows into a `DataResult` (per-result-column type
|
|
recovery via the same column-origin path Phase 2 introduced,
|
|
ADR-0032 §12 + Amendment 1).
|
|
- `returning == false` → execute the statement,
|
|
collect the affected-row count.
|
|
|
|
The flag is sufficient because the *shape* of the projection
|
|
is already in the SQL the engine prepares; the worker doesn't
|
|
need to re-parse to know which columns to read.
|
|
|
|
### 6. `shortid` auto-fill — worker post-fills
|
|
|
|
Parity with the DSL `do_insert` handler (ADR-0005 / ADR-0018).
|
|
|
|
For an `INSERT` whose `(column_name_list)` omits one or more
|
|
`shortid` columns, the worker:
|
|
|
|
1. Inspects the table metadata to find the auto-generated columns
|
|
(`serial`, `shortid`) NOT in the user's column list.
|
|
2. For each row in the VALUES list (single or multi-row), and for
|
|
`INSERT … SELECT` each yielded row, synthesises a fresh
|
|
`shortid` value (the existing `shortid::generate` path).
|
|
3. Constructs the parameterised INSERT (or executes a rewrite of
|
|
the SQL with the values bound positionally) and runs it.
|
|
|
|
The engine sees a fully-specified row; the user's submitted SQL
|
|
is unmodified in `history.log` (ADR-0030 §11). The walker
|
|
recognises an explicit value provided for an auto-gen column —
|
|
that's an explicit override, not a missing slot — and emits a
|
|
WARNING (§8b) but lets it through.
|
|
|
|
`serial` columns rely on SQLite's `INTEGER PRIMARY KEY` rowid
|
|
auto-increment — no worker post-fill needed; the engine
|
|
populates the value on its own.
|
|
|
|
### 7. Cascade summary on `DELETE`
|
|
|
|
Parity with the DSL `do_delete` handler (ADR-0014).
|
|
|
|
When `Command::SqlDelete { target_table, sql, … }` runs:
|
|
|
|
1. **Pre-execution cascade pre-count.** For each child table
|
|
that has an FK referencing `target_table` with
|
|
`ON DELETE CASCADE` (or `SET NULL`), the worker runs a
|
|
pre-count query of the form `SELECT count(*) FROM <child>
|
|
WHERE <fk_col> IN (SELECT <pk_col> FROM <target_table>
|
|
WHERE <user_where_predicate>)`. The `<user_where_predicate>`
|
|
is extracted from the SQL — see "Predicate extraction" below.
|
|
2. The worker executes the DELETE.
|
|
3. **Post-execution summary.** Per-relationship row counts +
|
|
cascade impact, formatted by the same cascade-summary
|
|
formatter the DSL path uses.
|
|
|
|
**Predicate extraction.** The DSL handler reuses the typed
|
|
`Expr` AST to construct the pre-count subqueries. For SQL
|
|
DELETE the walker captures the WHERE clause's source byte
|
|
range from the matched path and re-injects that text into
|
|
the pre-count SQL. This is grammar-as-text reuse, consistent
|
|
with ADR-0030 §4's posture; no AST construction required.
|
|
|
|
When the SQL DELETE has no WHERE, the pre-count is unbounded
|
|
("all rows of `target_table`"). Cost: one extra count(*) per
|
|
child table; cheaper than the DSL parallel path on a typical
|
|
schema.
|
|
|
|
The cascade-summary formatter itself (`format_cascade_summary`
|
|
in the DSL handler) is shared with `do_sql_delete` — same
|
|
output, same i18n keys.
|
|
|
|
### 8. Diagnostics new to Phase 3
|
|
|
|
Three new keys, classified per ADR-0027's ERROR / WARNING model.
|
|
|
|
#### 8.1. `diagnostic.insert_arity_mismatch` (ERROR)
|
|
|
|
Walker-level pre-flight when the declared `(column_name_list)`'s
|
|
arity disagrees with a `value_tuple`'s arity (or, for `INSERT …
|
|
SELECT`, with the SELECT's projection list arity, counted at
|
|
parse time via the same projection-counting infrastructure
|
|
Phase-2's `compound_arity_diagnostics` uses).
|
|
|
|
For multi-row VALUES, **each** offending row emits its own
|
|
diagnostic on that row's `value_tuple` span. A query with five
|
|
rows and three wrong arities produces three diagnostics; the
|
|
two right rows are silent. Pre-empts the engine's less-helpful
|
|
"X values for Y columns" error.
|
|
|
|
For `INSERT … SELECT`, the diagnostic anchors on the SELECT's
|
|
first `projection_item` span. The Phase-2 projection-count
|
|
infrastructure (the comma-counting at depth = leg-depth used by
|
|
`compound_arity_diagnostics`) is the underlying mechanism.
|
|
|
|
#### 8.2. `diagnostic.auto_column_overridden` (WARNING)
|
|
|
|
Walker-level when an INSERT explicitly assigns a value to a
|
|
column whose type is `serial` or `shortid`. Pedagogical: the
|
|
explicit value bypasses the auto-counter / generator, which can
|
|
collide with future auto-generated values.
|
|
|
|
Catalog wording: `"column \`{column}\` is auto-generated
|
|
(\`{type}\`); providing an explicit value bypasses the
|
|
auto-counter and may collide with later auto-generated values"`.
|
|
|
|
The statement still runs — the warning is advisory, matching
|
|
the Phase-2 predicate-warning posture (ADR-0027 §1 /
|
|
ADR-0032 §11.6).
|
|
|
|
#### 8.3. `diagnostic.not_null_missing` (WARNING)
|
|
|
|
Walker-level when an INSERT's `(column_name_list)` omits a
|
|
column declared `NOT NULL` with no `DEFAULT`. Pedagogical
|
|
nudge — the engine will reject the statement, but the
|
|
walker can pre-flight the missing column at parse time.
|
|
|
|
Catalog wording: `"column \`{column}\` is required (\`NOT NULL\`,
|
|
no default); the statement will fail when run"`.
|
|
|
|
The walker doesn't BLOCK the statement (it's structurally
|
|
valid). The engine does. The friendly layer translates the
|
|
engine error via the existing NOT-NULL path.
|
|
|
|
#### 8.4. Inherited diagnostics
|
|
|
|
The Phase-2 diagnostic passes apply to DML's `sql_expr` slots
|
|
without further work:
|
|
|
|
- `schema_existence_diagnostics` flags unknown columns in WHERE,
|
|
SET, RETURNING, VALUES tuples.
|
|
- `sql_predicate_warnings` flags `= NULL`, `LIKE`-on-numeric,
|
|
type-mismatched comparisons in WHERE and CASE expressions.
|
|
- The `pending_diagnostics` accumulator path stays available
|
|
for in-walk emits.
|
|
|
|
### 9. UPSERT — `ON CONFLICT … DO NOTHING / DO UPDATE`
|
|
|
|
```
|
|
INSERT INTO t (id, name) VALUES (1, 'x')
|
|
ON CONFLICT (id) DO UPDATE SET name = excluded.name
|
|
|
|
INSERT INTO t (id, name) VALUES (1, 'x')
|
|
ON CONFLICT DO NOTHING
|
|
```
|
|
|
|
Grammar shape:
|
|
|
|
```
|
|
on_conflict_clause := ON CONFLICT [ '(' column_name_list ')' ]
|
|
( DO NOTHING
|
|
| DO UPDATE SET assignment_list [ where_clause ] )
|
|
```
|
|
|
|
The optional `(column_name_list)` is the conflict target —
|
|
which columns' unique constraint to react to. Standard SQL allows
|
|
this list to be empty (`ON CONFLICT DO NOTHING` is shorthand for
|
|
"any conflict").
|
|
|
|
The DO UPDATE branch reuses the same `assignment_list` and
|
|
optional `where_clause` shapes as the top-level UPDATE statement.
|
|
|
|
Inside DO UPDATE expressions, the keyword `excluded` is a
|
|
pseudo-table reference to the would-have-been-inserted row.
|
|
**Note:** `excluded` is the SQLite + PostgreSQL spelling;
|
|
standard SQL (`MERGE … WHEN MATCHED`) uses a different model.
|
|
The playground commits to the `excluded` form because it is what
|
|
SQLite supports and what the project's pedagogical target dialect
|
|
uses (ADR-0001 / ADR-0002 — the engine is SQLite, even though
|
|
its name is hidden from user-facing strings). This is a
|
|
deliberate carve-out from ADR-0030 §7's engine-neutral posture,
|
|
justified by the absence of a standard-SQL UPSERT spelling that
|
|
both SQLite and PostgreSQL share.
|
|
|
|
The walker admits `excluded` as a table alias for
|
|
ident-completion purposes inside the DO UPDATE branch — its
|
|
columns are the target table's columns.
|
|
|
|
### 10. Execute path — three typed Command variants
|
|
|
|
`Command` gains three variants:
|
|
|
|
```
|
|
Command::SqlInsert {
|
|
sql: String,
|
|
source: Option<String>, // for history.log
|
|
target_table: String, // for re-persistence
|
|
listed_columns: Option<Vec<String>>, // None if no col-list provided
|
|
}
|
|
|
|
Command::SqlUpdate {
|
|
sql: String,
|
|
source: Option<String>,
|
|
target_table: String,
|
|
}
|
|
|
|
Command::SqlDelete {
|
|
sql: String,
|
|
source: Option<String>,
|
|
target_table: String,
|
|
}
|
|
```
|
|
|
|
The walker's `ast_builder` for each shape extracts
|
|
`target_table` and `listed_columns` from the matched path. The
|
|
`sql` field is the validated source text the engine executes.
|
|
|
|
`Database` gains three request types:
|
|
|
|
```
|
|
Request::RunSqlInsert { command, reply }
|
|
Request::RunSqlUpdate { command, reply }
|
|
Request::RunSqlDelete { command, reply }
|
|
```
|
|
|
|
Each worker handler:
|
|
|
|
1. Applies statement-specific pre-execution work (shortid
|
|
auto-fill for `RunSqlInsert`; cascade pre-count for
|
|
`RunSqlDelete`).
|
|
2. Prepares + executes the statement. If RETURNING is present,
|
|
collects the result set as a `DataResult`.
|
|
3. Re-persists the target table's CSV (ADR-0030 §11) using the
|
|
existing persistence write-through machinery.
|
|
4. Returns the result (DataResult for RETURNING; affected-row
|
|
count + cascade summary for plain DELETE; affected-row count
|
|
for plain INSERT/UPDATE).
|
|
|
|
History logging: the worker writes the literal submitted line
|
|
to `history.log` (ADR-0030 §11) before execution, exactly as
|
|
`run_select` does.
|
|
|
|
### 11. Engine-error translation
|
|
|
|
Three new catalog keys (the friendly-error layer's engine.*
|
|
translation set, ADR-0019 / ADR-0030 §7):
|
|
|
|
| Key | Engine condition |
|
|
|-------------------------------------------|-----------------------------------------------------------------|
|
|
| `engine.unique_constraint_failed` | The engine's `UNIQUE constraint failed: t.c` (already exists; will be re-used as-is for UPSERT-without-conflict-target cases) |
|
|
| `engine.not_null_constraint_failed` | Already exists; covers SQL-side NOT-NULL failures uniformly |
|
|
| `engine.foreign_key_constraint_failed` | Already exists; SQL DELETE that violates FK without cascade |
|
|
|
|
No NEW engine.* keys are required for Phase 3 — the existing FK
|
|
/ UNIQUE / NOT-NULL translation paths cover the engine-side DML
|
|
errors. Phase 3 reuses them.
|
|
|
|
### 12. Result-column type recovery on RETURNING
|
|
|
|
A RETURNING projection that's a bare column reference recovers
|
|
its playground type via the same column-origin metadata path
|
|
Phase 2 introduced (`resolve_select_column_types` in `db.rs`,
|
|
ADR-0032 §12 + Amendment 1).
|
|
|
|
Computed expressions in RETURNING stay typeless. The renderer
|
|
uses neutral alignment for typeless columns, consistent with
|
|
SELECT.
|
|
|
|
### 13. Out of scope
|
|
|
|
| ID | OOS |
|
|
|-------|--------------------------------------------------------------------------------------|
|
|
| OOS-1 | `INSERT INTO t DEFAULT VALUES` (the project's planned seed feature covers this) |
|
|
| OOS-2 | `INSERT OR REPLACE` / `OR IGNORE` / `OR ABORT` / `OR FAIL` / `OR ROLLBACK` (SQLite-specific) |
|
|
| OOS-3 | `UPDATE … FROM other_table` (SQLite/PostgreSQL extension) |
|
|
| OOS-4 | `WITH x AS (…) UPDATE …` / `WITH x AS (…) DELETE …` (deferrable; SELECT-side WITH is in scope) |
|
|
| OOS-5 | Indexed-by / not-indexed table-source modifiers (SQLite-specific) |
|
|
| OOS-6 | Multi-statement batches (already OOS via ADR-0030 §13 OOS-4) |
|
|
|
|
OOS-4 is a deliberate choice: WITH-prefixed DML is standard SQL
|
|
since SQL:2003, but its execution semantics (the WITH-defined
|
|
CTE is visible to the DML's WHERE / SET / RETURNING expressions)
|
|
require additional scope-frame plumbing that doesn't yet exist
|
|
on the DML side. The Phase-2 grammar admits WITH at the
|
|
SELECT level; Phase 3 keeps WITH a SELECT-only prefix. A later
|
|
phase can lift this restriction with a clean ADR Amendment.
|
|
|
|
## Consequences
|
|
|
|
- The walker gains a new node variant `Node::Guard` (§2
|
|
mechanism caveat) — a zero-byte-consumption gating node that
|
|
can fail its enclosing Seq with a `ValidationError`. The
|
|
walker driver's exhaustive `walk_node_inner` match grows one
|
|
arm. Reusable beyond DML (Phase 4 DDL's similar shared-entry
|
|
problem is a likely future consumer).
|
|
- `Command` gains three variants (`SqlInsert`, `SqlUpdate`,
|
|
`SqlDelete`); every exhaustive `match Command` callsite picks
|
|
up three new arms (the recurring ADR-0028/0029 gotcha).
|
|
- The DSL `data::INSERT` / `data::UPDATE` / `data::DELETE`
|
|
CommandNodes' shapes become `Choice(SQL_shape, DSL_shape)`
|
|
with the SQL branch mode-gated via a no-op leading validator
|
|
(R1).
|
|
- Three new walker diagnostics with three new catalog keys
|
|
(`diagnostic.insert_arity_mismatch`,
|
|
`diagnostic.auto_column_overridden`,
|
|
`diagnostic.not_null_missing`).
|
|
- Three new `Request` variants
|
|
(`RunSqlInsert`/`RunSqlUpdate`/`RunSqlDelete`) and three new
|
|
worker handlers
|
|
(`do_sql_insert`/`do_sql_update`/`do_sql_delete`).
|
|
- The persistence write-through path gains per-DML-kind entry
|
|
points; the cascade-affected re-persistence in `do_sql_delete`
|
|
is new (cascade-affected child tables are also re-persisted).
|
|
- `RETURNING` lands a write-statement-returns-rows code path
|
|
the project hasn't had before. Worker handlers either return
|
|
affected-row count or a `DataResult` keyed by the
|
|
`returning: bool` flag.
|
|
- The cascade-summary formatter is reused between the DSL and
|
|
SQL DELETE paths; the predicate-extraction byte-range
|
|
injection (§7) is new code with the R2 mitigation.
|
|
- `shortid` auto-fill code in `do_sql_insert` mirrors the DSL
|
|
`do_insert` mechanism but takes a `listed_columns` slice from
|
|
the parse rather than a typed `Command::Insert`.
|
|
- The auto-snapshot machinery (ADR-0006) is unchanged but
|
|
fires for SQL DML the same way it does for DSL DML — the
|
|
snapshot trigger lives in the worker handlers.
|
|
- All Phase-2 tests stay green (the SQL grammar pieces this ADR
|
|
builds on are unchanged).
|
|
- The DSL-vs-SQL dispatch mechanism (3a) is a foundation other
|
|
projects in this codebase may need (e.g., Phase 4 DDL). Land
|
|
it generically so it's reusable.
|
|
|
|
## Initial DA review
|
|
|
|
A first-pass DA review of this ADR's draft surfaced ten
|
|
critiques. The seven that surfaced real gaps were resolved in
|
|
the body before this status moved to Proposed:
|
|
|
|
1. OOS-4 (WITH-prefixed DML) under-justified → R4 risk
|
|
describes the mechanism gap concretely.
|
|
2. Dispatch mechanism hand-waved → §2's "Mode-gating
|
|
mechanism" subsection enumerates three options and commits
|
|
to the validator approach with R1 as the mitigation if it
|
|
fails.
|
|
3. RETURNING execution hand-waved → §5 commits to a
|
|
`returning: bool` flag on the Command variants.
|
|
4. Cascade-summary code-sharing claim too optimistic → §7's
|
|
"Predicate extraction" subsection details the byte-range
|
|
injection mechanism; R2 budgets for the
|
|
performance-pathology case.
|
|
5. `excluded` keyword's engine-neutrality status → §9 carves
|
|
it out explicitly as a justified non-neutral choice.
|
|
6. Multi-row INSERT arity diagnostic emit pattern unspecified
|
|
→ §8.1 clarifies "each offending row emits its own
|
|
diagnostic" plus the INSERT…SELECT span.
|
|
7. Sub-phase ordering put dispatch (the riskiest piece) last
|
|
→ reordered so 3a lands the mechanism FIRST on a no-op
|
|
shape.
|
|
|
|
The remaining three critiques were either process points (DA
|
|
gates at sub-phase boundaries — added per-sub-phase in
|
|
Implementation notes) or already-covered (verification matrix —
|
|
will live in the plan doc, matching Phase 2's pattern).
|
|
|
|
Per CLAUDE.md / the Phase 2 process lesson, the DA review of
|
|
THIS ADR will get a second pass before status moves to
|
|
Accepted. Pin: do not rubber-stamp.
|
|
|
|
## Open implementation risks
|
|
|
|
Recorded explicitly so the implementation plan can budget for
|
|
them, not to allow silent drift later.
|
|
|
|
### R1. Mode-gated `Choice` branch mechanism unproven
|
|
|
|
§2's option (a) — a no-op validator at the head of the SQL
|
|
shape's Seq, rejecting in Simple mode — is plausible but
|
|
untested. If the walker's `walk_seq` commits the Seq before the
|
|
validator runs, the SQL branch's NoMatch becomes Failed and
|
|
the Choice can't fall through to the DSL branch.
|
|
|
|
**Mitigation:** sub-phase 3a lands the mechanism end-to-end on
|
|
the smallest viable shape (a single `SqlNop` CommandNode whose
|
|
shape is `Choice(SQL_branch[mode-gated-validator + token],
|
|
DSL_branch[different-token])`) before any DML grammar lands. If
|
|
the validator approach doesn't work, the alternatives (a
|
|
mode-aware `Node::Choice` variant, or split-CommandNode
|
|
dispatch) get evaluated before sub-phases 3b/3c/3d depend on
|
|
the mechanism.
|
|
|
|
### R2. Cascade-summary predicate extraction
|
|
|
|
§7 says the worker re-injects the WHERE-clause source bytes into
|
|
pre-count subqueries. This is straightforward when the WHERE
|
|
clause is a top-level predicate, but a future user might write
|
|
WHERE clauses that themselves contain subqueries (e.g.,
|
|
`DELETE FROM orders WHERE customer_id IN (SELECT id FROM customers WHERE …)`).
|
|
The byte-range injection still works structurally, but a
|
|
performance pathology is possible if the subquery is expensive
|
|
and the cascade child count is large (the same subquery runs N+1
|
|
times).
|
|
|
|
**Mitigation:** sub-phase 3c benchmarks the pre-count overhead
|
|
on a representative schema before declaring the path acceptable.
|
|
If the overhead is prohibitive, fall back to "report cascade
|
|
count post-execution only" (less informative, but cheaper).
|
|
|
|
### R3. RETURNING result-column type recovery on multi-row INSERT
|
|
|
|
§5 says result columns get the same column-origin type recovery
|
|
SELECT uses. For a single-row INSERT … RETURNING this works
|
|
identically to a SELECT. For multi-row INSERT, the engine emits
|
|
one result row per inserted row; the column-origin metadata is
|
|
the same for all (per ADR-0032 Amendment 1, column-origin is a
|
|
property of the prepared statement, not the rows).
|
|
|
|
No risk here — calling it out so the implementer doesn't
|
|
re-derive the question. The empty-table type-recovery test
|
|
already pins this invariant.
|
|
|
|
### R4. `INSERT … SELECT` with a CTE row source
|
|
|
|
The Phase-2 SELECT grammar already admits `WITH … SELECT …` at
|
|
the compound level (ADR-0032 §10.3 / Amendment 2). The
|
|
`SQL_SELECT_COMPOUND` recursion point §1 uses pulls this in for
|
|
free. `INSERT INTO archive WITH t AS (…) SELECT * FROM t` parses
|
|
and runs.
|
|
|
|
A user might reasonably expect WITH-prefixed DML to also work
|
|
(see §13 OOS-4). That's explicitly out of scope for Phase 3;
|
|
the discrepancy is documented here so the implementer doesn't
|
|
need to figure it out from the grammar shape.
|
|
|
|
## Implementation notes
|
|
|
|
Phased sub-plan with **explicit exit gates per sub-phase**. The
|
|
implementation plan doc (`docs/plans/<date>-adr-0033-phase-3.md`,
|
|
modelled on `docs/plans/20260520-adr-0032-phase-2.md`) refines
|
|
these and adds the cross-cut verification matrix in 3i.
|
|
|
|
**Ordering rationale.** The dispatch mechanism (3a) is the
|
|
foundation; if it fails to work, every later sub-phase has to
|
|
be reworked. Land it first on a no-op shape, then build the DML
|
|
grammar on top.
|
|
|
|
### 3a — `Node::Guard` + mode-gated Choice mechanism
|
|
|
|
**Scope (in):**
|
|
- Add a new walker node variant
|
|
`Node::Guard(fn(&WalkContext) -> Result<(), ValidationError>)`
|
|
(per §2 mechanism caveat) — zero-byte consumption, fails the
|
|
enclosing Seq with `ValidationFailed` when the validator
|
|
returns Err.
|
|
- Implement `walk_guard` in `src/dsl/walker/driver.rs`.
|
|
- Add the `reject_sql_in_simple_mode` guard function.
|
|
- Build a smoke-test grammar: an experimental `CommandNode`
|
|
with `shape: Choice(Seq[Guard(reject_sql_in_simple_mode),
|
|
SQL_branch_token], DSL_branch_token)`. Verify the Choice
|
|
falls through cleanly in Simple mode (DSL wins) and routes
|
|
SQL-first in Advanced mode.
|
|
|
|
**Scope (out):**
|
|
- Any real DML grammar (3b onwards).
|
|
- Wiring the existing `data::INSERT/UPDATE/DELETE` shapes (3j).
|
|
|
|
**Exit gate (required green tests):**
|
|
- `Node::Guard` admits via `walk_guard`; the walker driver's
|
|
match arm exists and is tested.
|
|
- Smoke-test grammar in Simple mode → DSL branch matches.
|
|
- Smoke-test grammar in Advanced mode + SQL-shape input → SQL
|
|
branch matches.
|
|
- Smoke-test grammar in Simple mode + SQL-shape input →
|
|
`ValidationFailed` with `message_key:
|
|
"advanced_mode.sql_in_simple"`.
|
|
- Smoke-test grammar in Advanced mode + DSL-shape input →
|
|
DSL branch matches (Choice fall-through works even when SQL
|
|
branch is admissible).
|
|
- All Phase-2 tests (1446 baseline) still green.
|
|
- Clippy clean.
|
|
|
|
**DA gate (written):**
|
|
- Does the Guard actually fire BEFORE the Seq commits? Failure
|
|
mode: the Seq processes the Guard, the Guard fails, but the
|
|
Seq has already "committed" to its branch and the Choice
|
|
can't fall through. Test specifically: Advanced mode + an
|
|
input that the DSL branch matches but the SQL branch's
|
|
remaining tokens reject → must fall through to DSL, not
|
|
Failed.
|
|
- Is the Guard mechanism reusable beyond DML (e.g., for
|
|
Phase 4 DDL's similar dispatch)? If yes, document the
|
|
pattern in `src/dsl/grammar/mod.rs` doc comments so the
|
|
next phase doesn't reinvent it.
|
|
- If the Guard mechanism doesn't work (R1's worst case), the
|
|
sub-phase REOPENS with mechanism option (b) or (c)
|
|
evaluated before any DML grammar lands on top of broken
|
|
scaffolding.
|
|
|
|
### 3b — INSERT grammar + minimal execution
|
|
|
|
**Scope (in):**
|
|
- `SQL_INSERT_SHAPE`: single-row + multi-row VALUES; the
|
|
optional `(column_name_list)`; the `__rdbms_*` rejection on
|
|
the table-name slot.
|
|
- `Command::SqlInsert { sql, source, target_table,
|
|
listed_columns, returning: false }` — `returning` always
|
|
false in 3b.
|
|
- `Request::RunSqlInsert` + `do_sql_insert` worker handler
|
|
(no shortid auto-fill yet — explicit values required).
|
|
- Persistence write-through on the target table after
|
|
execution.
|
|
- History-log of the literal submitted line.
|
|
|
|
**Scope (out):**
|
|
- `INSERT … SELECT` (3c).
|
|
- `RETURNING` (3e).
|
|
- `UPSERT` (3f).
|
|
- `shortid` auto-fill (3d).
|
|
- Diagnostics (3g).
|
|
- DSL-vs-SQL dispatch wiring on `data::INSERT` (3h —
|
|
development uses a separate `data::SqlInsert` CommandNode).
|
|
|
|
**Exit gate (required green tests):**
|
|
- **Single-row INSERT** runs end-to-end; affected-row count
|
|
surfaces; CSV is re-persisted.
|
|
- **Multi-row INSERT** (`INSERT INTO t VALUES (1, 'a'), (2, 'b')`)
|
|
runs; both rows land; CSV reflects them.
|
|
- **`(column_name_list)`** variant works (`INSERT INTO t (a, b)
|
|
VALUES (1, 'x')`).
|
|
- **`__rdbms_*` rejection** at the INSERT target-table slot
|
|
(per ADR-0030 §6 / ADR-0032 §1).
|
|
- **History line** is the literal submitted SQL.
|
|
- All Phase-2 tests (1446 baseline) still green.
|
|
|
|
**DA gate (written):**
|
|
- Does an INSERT failing mid-multi-row leave persistence in a
|
|
consistent state? (Engine transactional behaviour applies;
|
|
worker must not re-persist on failure.)
|
|
- Does the `target_table` extraction handle case-folding the
|
|
same way the rest of the codebase does?
|
|
- Are the existing DSL INSERT tests still green? Any DSL-side
|
|
regression means 3b broke something it shouldn't have.
|
|
|
|
### 3c — INSERT … SELECT
|
|
|
|
**Scope (in):**
|
|
- Add the row-source `Choice` to `SQL_INSERT_SHAPE`:
|
|
`values_clause | Subgrammar(&sql_select::SQL_SELECT_COMPOUND)`.
|
|
- Update `do_sql_insert` to handle the SELECT-driven path
|
|
(no special worker work needed beyond preparing the
|
|
composite statement; the engine handles the
|
|
insert-from-query flow).
|
|
|
|
**Exit gate:**
|
|
- `INSERT INTO archive SELECT * FROM orders WHERE …` runs.
|
|
- `INSERT INTO target (a, b) SELECT x, y FROM source` runs.
|
|
- `INSERT INTO t WITH x AS (…) SELECT * FROM x` runs
|
|
(R4 invariant — Phase-2 CTE prefix on SELECT applies).
|
|
- Schema-existence diagnostics fire on the SELECT's projection
|
|
(Phase-2 machinery, free).
|
|
|
|
**DA gate (written):**
|
|
- Does the WITH-prefixed SELECT row source actually parse
|
|
through SQL_SELECT_COMPOUND, or does the INSERT prefix
|
|
break the recursion? Test on a real `WITH … INSERT … SELECT`.
|
|
- Does the cascade-summary path (3e DELETE) accidentally apply
|
|
to INSERTs whose SELECT touches the target table? (It must
|
|
not — INSERTs have no cascade.)
|
|
|
|
### 3d — `shortid` auto-fill (worker)
|
|
|
|
**Scope (in):**
|
|
- Detect omitted `shortid` columns at execution time in
|
|
`do_sql_insert`.
|
|
- Synthesise fresh shortid values; bind them positionally.
|
|
- Cover both VALUES and SELECT row sources (for SELECT, the
|
|
shortid columns are added to the projection's bind list
|
|
before execution).
|
|
|
|
**Exit gate:**
|
|
- `INSERT INTO t (name) VALUES ('x')` where `t` has a `pk
|
|
shortid` column auto-fills that PK.
|
|
- An explicit value for a shortid column is respected
|
|
(override behaviour — warning fires per 3g but the value
|
|
is preserved).
|
|
- Multi-row INSERT produces distinct shortids per row.
|
|
- INSERT … SELECT auto-fills shortids for each yielded row.
|
|
|
|
**DA gate (written):**
|
|
- Are the synthesised shortids actually unique across a
|
|
multi-row INSERT? (The generator should be called per-row,
|
|
not once.)
|
|
- Does the auto-fill respect the column's case-preserved name?
|
|
- What happens if the table has both `serial` and `shortid`
|
|
PK columns? (Spec: serial via SQLite rowid; shortid via
|
|
worker post-fill. Verify no double-fill collision.)
|
|
|
|
### 3e — UPDATE grammar + execution
|
|
|
|
**Scope (in):**
|
|
- `SQL_UPDATE_SHAPE`: target table, SET assignment list,
|
|
optional WHERE.
|
|
- `Command::SqlUpdate`, `Request::RunSqlUpdate`,
|
|
`do_sql_update`.
|
|
- Persistence write-through on the target table.
|
|
|
|
**Exit gate:**
|
|
- Single-column UPDATE with WHERE runs; affected-row count
|
|
surfaces; CSV reflects changes.
|
|
- Multi-column UPDATE (`SET a = 1, b = 2`) runs.
|
|
- UPDATE without WHERE runs (per ADR-0030 §12 — no
|
|
`--all-rows` guard).
|
|
- Schema-existence diagnostic fires on unknown columns in
|
|
SET and WHERE.
|
|
|
|
**DA gate (written):**
|
|
- Does the SET assignment expression accept the full
|
|
`sql_expr` (function calls, subqueries, CASE)? Verify on a
|
|
representative expression.
|
|
- Does an UPDATE that touches an auto-gen column produce the
|
|
3g auto-gen warning?
|
|
|
|
### 3f — DELETE grammar + execution + cascade summary
|
|
|
|
**Scope (in):**
|
|
- `SQL_DELETE_SHAPE`: target table, optional WHERE.
|
|
- `Command::SqlDelete`, `Request::RunSqlDelete`,
|
|
`do_sql_delete`.
|
|
- Cascade-summary pre-count and post-execution formatting
|
|
per §7.
|
|
- Persistence write-through on target + affected child tables
|
|
(cascade-affected tables also need re-persistence).
|
|
|
|
**Exit gate:**
|
|
- DELETE with WHERE runs; affected-row count + cascade
|
|
summary surface; CSVs of target AND cascade-affected
|
|
children are re-persisted.
|
|
- DELETE without WHERE runs (no `--all-rows` rail).
|
|
- Cascade summary matches the DSL `do_delete` output on the
|
|
same schema/data.
|
|
- The cascade-summary code path is shared (no duplicate
|
|
formatter).
|
|
|
|
**DA gate (written):**
|
|
- R2 mitigation: does the predicate-extraction byte-range
|
|
injection work on a WHERE containing a subquery? Test
|
|
`DELETE FROM a WHERE id IN (SELECT id FROM b WHERE …)`.
|
|
- Does the cascade pre-count run in a way that doesn't see
|
|
the post-DELETE state? (Order matters: pre-count BEFORE
|
|
execute.)
|
|
|
|
### 3g — RETURNING
|
|
|
|
**Scope (in):**
|
|
- Add `RETURNING projection_list` to all three top-level
|
|
shapes.
|
|
- Walker captures `returning: bool` on the Command variants.
|
|
- Worker handlers route to `DataResult` collection when
|
|
`returning == true`.
|
|
- Result-column type recovery via the Phase-2 column-origin
|
|
path.
|
|
|
|
**Exit gate:**
|
|
- `INSERT … RETURNING *` returns the inserted row(s) as a
|
|
DataResult; affected-row count is implicit (rows.len()).
|
|
- `UPDATE … RETURNING id, new_value` returns the modified
|
|
columns.
|
|
- `DELETE … RETURNING *` returns what was just deleted
|
|
(BEFORE the row is gone).
|
|
- Result-column types recovered for bare-column refs
|
|
(per ADR-0032 Amendment 1).
|
|
- The cascade summary on a `DELETE … RETURNING` is preserved
|
|
alongside the result set (both surface to the user).
|
|
|
|
**DA gate (written):**
|
|
- Does the RETURNING result-set rendering use the same
|
|
DataResult path SELECT uses? Or did a DML-specific copy
|
|
sneak in?
|
|
- Does `RETURNING expr AS alias` admit aliases the same way
|
|
the projection list does?
|
|
|
|
### 3h — UPSERT (ON CONFLICT … DO NOTHING / DO UPDATE)
|
|
|
|
**Scope (in):**
|
|
- Add `on_conflict_clause` to `SQL_INSERT_SHAPE`.
|
|
- Admit `excluded` as a table alias inside DO UPDATE
|
|
expressions (§9's clarification).
|
|
- Walker captures the UPSERT shape; worker executes as-is
|
|
(no special pre-processing).
|
|
|
|
**Exit gate:**
|
|
- `INSERT … ON CONFLICT (id) DO NOTHING` runs on a
|
|
conflicting row; affected-row count is 0.
|
|
- `INSERT … ON CONFLICT (id) DO UPDATE SET v = excluded.v`
|
|
runs; the conflicting row is updated to the new value.
|
|
- `ON CONFLICT DO NOTHING` (no target spec) handles any
|
|
conflict.
|
|
- `excluded.col` completes / highlights as a column ref to the
|
|
target table's columns inside the DO UPDATE expressions.
|
|
|
|
**DA gate (written):**
|
|
- Does ON CONFLICT … DO UPDATE trigger persistence
|
|
write-through? (Yes; the target row is modified.)
|
|
- Is the `excluded` alias scoped to ONLY the DO UPDATE
|
|
branch, not leaking into the outer INSERT?
|
|
|
|
### 3i — Diagnostics
|
|
|
|
**Scope (in):**
|
|
- `diagnostic.insert_arity_mismatch` (§8.1).
|
|
- `diagnostic.auto_column_overridden` (§8.2).
|
|
- `diagnostic.not_null_missing` (§8.3).
|
|
- Per-key positive + negative tests + catalog entries.
|
|
|
|
**Exit gate:**
|
|
- One positive + one negative test per new key (the
|
|
ADR-0032 §2d exit gate pattern):
|
|
- `insert_arity_mismatch`: `INSERT INTO t (a, b) VALUES
|
|
(1, 2, 3)` fires; matched arity doesn't.
|
|
- `auto_column_overridden`: `INSERT INTO t (id, name)
|
|
VALUES (5, 'x')` (id is `serial`) fires; omitting `id`
|
|
doesn't.
|
|
- `not_null_missing`: `INSERT INTO t (a) VALUES (1)`
|
|
(b is NOT NULL no default) fires; including `b` doesn't.
|
|
- Multi-row arity mismatch emits per-row, not just on the
|
|
first offending row.
|
|
- The inherited Phase-2 diagnostic passes
|
|
(`schema_existence`, `sql_predicate_warnings`,
|
|
`compound_arity_diagnostics`) still fire on DML's
|
|
`sql_expr` slots.
|
|
|
|
**DA gate (written):**
|
|
- Does `auto_column_overridden` produce a Warning, not an
|
|
Error? (Spec: WARNING — the statement still runs.)
|
|
- Does `not_null_missing` produce a Warning? (Same.)
|
|
- Are the catalog wordings engine-neutral?
|
|
|
|
### 3j — Dispatch wiring on shared entry words
|
|
|
|
**Scope (in):**
|
|
- Convert `data::INSERT`, `data::UPDATE`, `data::DELETE`
|
|
shapes to `Choice(SQL_shape, DSL_shape)`.
|
|
- Remove the experimental separate `data::SqlInsert` /
|
|
`SqlUpdate` / `SqlDelete` CommandNodes that 3a-3i used as
|
|
scaffolding.
|
|
- Verify DSL-only inputs still match the DSL branch in both
|
|
modes.
|
|
- Verify SQL-only inputs match the SQL branch in Advanced
|
|
mode and fail with "this is SQL" in Simple mode.
|
|
|
|
**Exit gate:**
|
|
- All existing DSL INSERT/UPDATE/DELETE tests still green
|
|
(unmodified inputs, unmodified outputs).
|
|
- All Phase-3 SQL DML tests still green via the new dispatch
|
|
path (the SQL branch is reached through `data::INSERT` etc.,
|
|
not through `data::SqlInsert`).
|
|
- A structurally-ambiguous input (`delete from t where id =
|
|
1`) in Advanced mode routes through the SQL branch
|
|
(verifiable by checking that the cascade-summary code path
|
|
fired — DSL and SQL paths share the formatter but the entry
|
|
point differs).
|
|
- A DSL-specific input (`delete from t --all-rows`) in
|
|
Advanced mode falls through to the DSL branch.
|
|
|
|
**DA gate (written):**
|
|
- Verify that no test was changed to make 3j pass — only the
|
|
shape's structure changes; the observable behaviour is
|
|
identical.
|
|
- Is the test count higher than before 3j? It should be (the
|
|
scaffolding CommandNodes had their own tests; now the
|
|
shared CommandNodes carry both).
|
|
|
|
### 3k — Verification sweep
|
|
|
|
**Scope (in):**
|
|
- Fill in the cross-cut verification matrix
|
|
(`docs/plans/<date>-adr-0033-phase-3.md`).
|
|
- Produce the phase-exit verification report.
|
|
- DA final review (genuine, not rubber-stamp — see Phase 2's
|
|
process lesson in `docs/handoff/20260520-handoff-28.md` §4).
|
|
|
|
**Exit gate:**
|
|
- Every matrix row green with a concrete `file::function`
|
|
test reference.
|
|
- Final test run: zero failures, zero skips, no regressions
|
|
from the Phase-2 baseline of 1446 / 0 / 1.
|
|
- Clippy clean.
|
|
- Report at `docs/handoff/<date>-phase-3-verification.md`
|
|
documents all autonomous decisions with explicit
|
|
user-confirmation pointers and the DA's written critiques
|
|
(not just the verdict).
|
|
|
|
**DA gate (written):**
|
|
- Did filling the matrix surface gaps that need closing?
|
|
(Phase 2 surfaced THREE; expect at least one for Phase 3.)
|
|
- Are any sub-phase exit gates above weaker than they should
|
|
be in hindsight?
|
|
- Did any sub-phase make an autonomous decision that's not
|
|
in this ADR? If so, flag for retroactive user approval
|
|
before phase-exit.
|
|
|
|
## Out of scope (clarifications already established)
|
|
|
|
- **No engine type names** in or out (ADR-0030 §5 / §7).
|
|
- **No `STRICT`**, no SQLite storage clauses (ADR-0030 §7).
|
|
- **No transaction control** (ADR-0030 §13 OOS-4).
|
|
- **No multi-statement batches** (ADR-0030 §13 OOS-4).
|
|
- **No `EXPLAIN` of SQL DML** (ADR-0030 §13 OOS-2 — the DSL
|
|
`explain` still wraps what it already wraps).
|
|
|
|
## Amendment 1 — Dispatch mechanism: category-grouped dispatch supersedes `Node::Guard` (2026-05-21)
|
|
|
|
This amendment **supersedes §2's "Mode-gating mechanism"
|
|
subsection** and **revises sub-phase 3a's scope**. It was
|
|
written during 3a implementation, after tracing the proposed
|
|
mechanism against the actual walker code, and is recorded with
|
|
explicit user approval before any code landed.
|
|
|
|
### The finding — option (a) does not work as §2 frames it
|
|
|
|
§2 selected option (a): a standalone zero-byte `Node::Guard(fn)`
|
|
placed as the first element of the SQL branch's `Seq`, inside a
|
|
`Choice(SQL_shape, DSL_shape)`. §2 framed this as the "smallest
|
|
mechanism change" that "reuses the existing validator path" with
|
|
the failure surfacing as the same `WalkOutcome::ValidationFailed`
|
|
as the entry-word gate — implying **no `walk_choice` change**.
|
|
|
|
Tracing the four required behaviours against
|
|
`src/dsl/walker/driver.rs` shows that framing is inaccurate, and
|
|
that **any** guard-in-`Choice` mechanism is forced to modify
|
|
`walk_choice`:
|
|
|
|
1. **`walk_choice` returns immediately on a branch `Failed`**
|
|
(`driver.rs` `walk_choice`) — it only falls through to the
|
|
next branch on `NoMatch`. So in **Simple mode**,
|
|
`delete from t where id = 1` (which is *valid DSL*) would hit
|
|
the SQL branch's guard, the guard would fail, and the user
|
|
would see the "this is SQL" hint instead of the statement
|
|
running as DSL. That contradicts §2's own "Simple mode → DSL
|
|
only" promise. Fixing it requires a `walk_choice` change
|
|
regardless of how the guard node is shaped.
|
|
|
|
2. **`walk_seq` treats a `NoMatch` at `idx > 0` as a hard
|
|
`Failed`** (`driver.rs` `walk_seq`). A zero-byte Guard at
|
|
`idx 0` still advances `idx` to 1, pushing the SQL branch's
|
|
first real token to `idx 1`. So in **Advanced mode**,
|
|
`delete from t --all-rows` (DSL-only) cannot fall through to
|
|
DSL — the SQL branch returns `Failed`, and `walk_choice`
|
|
won't try the DSL branch. The literal form needs an
|
|
*additional* `walk_seq` semantics change (`idx == 0` → "no
|
|
bytes consumed yet").
|
|
|
|
This is exactly the R1 risk the ADR budgeted for ("if `walk_seq`
|
|
commits the Seq before the validator runs … the Choice can't
|
|
fall through").
|
|
|
|
### The replacement — category-grouped, mode-aware dispatch
|
|
|
|
Rather than gate a `Choice` branch from inside the recursive
|
|
grammar, the dispatch decision moves up to the command
|
|
dispatcher (`walker::walk`), which is **where mode gating
|
|
already lives**: the existing whole-command gate
|
|
(`is_advanced_only` + the Simple-mode short-circuit in `walk`)
|
|
is precisely this pattern, hardcoded for `select` / `with`. The
|
|
amendment generalises that existing code instead of inventing a
|
|
parallel in-grammar mechanism.
|
|
|
|
Concretely:
|
|
|
|
- **Each registry entry carries a category** —
|
|
`CommandCategory::{Simple, Advanced}`. The category lives at
|
|
the registration site (the `REGISTRY` table), not as a field
|
|
on every `CommandNode` literal, because category is a
|
|
dispatcher concern, not intrinsic to a command's grammar.
|
|
`select` / `with` (and, from 3b, the SQL `insert` / `update` /
|
|
`delete` nodes) are `Advanced`; everything else is `Simple`.
|
|
This subsumes `ADVANCED_ONLY_ENTRIES`: `is_advanced_only(entry)`
|
|
becomes "every registry candidate for this entry word is
|
|
`Advanced`".
|
|
|
|
- **A shared entry word has a node in both groups** — e.g. a
|
|
`Simple` DSL `insert` node and an `Advanced` SQL `insert`
|
|
node, both with entry word `insert`. The entry-word lookup
|
|
returns *all* candidates; the dispatcher selects by mode.
|
|
|
|
- **Mode-aware selection in `walk` (no combinator changes):**
|
|
- **Simple mode** considers `Simple` candidates. If a Simple
|
|
candidate exists, it is committed. If none exists but an
|
|
`Advanced` candidate does (a pure-SQL entry word like
|
|
`select`), the dispatcher emits the
|
|
`advanced_mode.sql_in_simple` `ValidationFailed` — identical
|
|
to today's whole-command gate.
|
|
- **Advanced mode** tries `Advanced` candidate(s) first, then
|
|
the `Simple` candidate as a fallback. The first candidate
|
|
that fully matches wins (`delete from t where id = 1` →
|
|
SQL; `delete from t --all-rows` → falls through to DSL
|
|
because the SQL shape doesn't match `--all-rows`).
|
|
|
|
- **No `Node::Guard`, no `walk_choice` / `walk_seq` change.** The
|
|
recursive combinators are untouched; each candidate is walked
|
|
cleanly and independently. Speculative evaluation runs on a
|
|
scratch `WalkContext`; only the committed candidate is walked
|
|
into the caller's context, so post-walk context state
|
|
(completion / hint accumulators) always reflects the chosen
|
|
candidate.
|
|
|
|
### The two details (user-approved)
|
|
|
|
1. **"This is SQL" hint in Simple mode for SQL-shaped input.**
|
|
For SQL-only input in Simple mode (e.g.
|
|
`delete … returning *`), the Simple candidate won't match.
|
|
To keep parity with `select` / `with`, the dispatcher
|
|
speculatively checks whether the `Advanced` candidate *would*
|
|
match; if so, it emits the `advanced_mode.sql_in_simple`
|
|
hint rather than a generic DSL parse error. Lives entirely
|
|
in `walk`.
|
|
|
|
2. **Advanced-mode completion is SQL-first, DSL as full-line
|
|
fallback.** The `Choice` approach would have unioned SQL and
|
|
DSL continuations mid-typing for free; the grouping approach
|
|
surfaces one candidate's walk for completion / hints. The
|
|
decision is to show SQL completions primarily in Advanced
|
|
mode and fall to DSL only on a full DSL-shaped line — simpler
|
|
and matching the "SQL-first, DSL as familiar fallback"
|
|
intent. No unioned completion.
|
|
|
|
### Consequences of the amendment
|
|
|
|
- **`Node::Guard` is NOT added.** §2's mechanism caveat,
|
|
Consequences bullet 1 ("the walker gains a new node variant
|
|
`Node::Guard`"), and the original 3a scope are withdrawn.
|
|
- **`REGISTRY`'s element type changes** to carry the category
|
|
alongside each `&CommandNode`. The handful of registry
|
|
iteration sites (entry-word listing, `command_for_entry_word`,
|
|
the completion entry-word filters, `note_help`) destructure the
|
|
tuple; no behavioural change for the current single-candidate
|
|
registry.
|
|
- **The `Command` variants (§10), worker handlers, RETURNING
|
|
flag, cascade-summary plan, shortid auto-fill, and diagnostics
|
|
are unaffected** — this amendment is purely about how the
|
|
dispatcher routes a shared entry word to the right grammar.
|
|
- **Sub-phase 3a now validates the dispatch mechanism**, not the
|
|
guard: a smoke registry with a shared entry word (a `Simple`
|
|
and an `Advanced` node using distinguishable tokens) exercises
|
|
the Simple/Advanced selection, the fallback, and the
|
|
"this is SQL" path. The 3a exit gate's four cases map onto the
|
|
new mechanism unchanged in spirit.
|
|
- **3b adds the first real shared entry words** to `REGISTRY`
|
|
(the SQL `insert` / `update` / `delete` nodes). At that point
|
|
the completion entry-word *lists* will need to de-duplicate
|
|
the shared primary (two candidates, one entry word) — tracked
|
|
for 3b, not 3a.
|
|
- The mechanism remains reusable for Phase 4 DDL's analogous
|
|
shared-entry problem (`create`, …): tag the SQL DDL nodes
|
|
`Advanced`; the dispatcher handles the rest.
|
|
|
|
## See also
|
|
|
|
- ADR-0005 — the ten-type vocabulary INSERT works with.
|
|
- ADR-0006 — the auto-snapshot before destructive ops.
|
|
- ADR-0014 — the DSL DML model + cascade-summary precedent.
|
|
- ADR-0015 — the persistence write-through path.
|
|
- ADR-0018 — the DSL INSERT forms (A / B / C).
|
|
- ADR-0019 — the friendly-error layer DML errors route through.
|
|
- ADR-0027 — the validity indicator surfaces DML diagnostics.
|
|
- ADR-0030 — the SQL surface in advanced mode; Phase 3 is this
|
|
ADR's commission.
|
|
- ADR-0031 — the SQL expression grammar, consumed by DML
|
|
WHERE / SET / VALUES / RETURNING.
|
|
- ADR-0032 — the SQL SELECT grammar; `INSERT … SELECT` and
|
|
RETURNING reuse its projection-list and scope-frame machinery.
|