From 10e5197c197709169e4d64207cca901e3d524b1b Mon Sep 17 00:00:00 2001 From: "claude@clouddev1" Date: Fri, 29 May 2026 20:45:21 +0000 Subject: [PATCH] feat: bring simple-mode insert arity diagnostics to parity with advanced MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit A wrong-count simple-mode insert now shows the friendly per-column arity message at typing time (instead of a bare "expected `,`/`)`") and is blocked from dispatch at submit — unifying simple and advanced mode onto the one ADR-0027 model (structural parse + ERROR diagnostic), where they had diverged. Grammar: a simple-mode-only arity gate (dsl_insert_value_list) routes a wrong-count DSL insert tuple to the type-blind fallback so it matches structurally and the per-tuple arity diagnostic fires. The gate is gated to simple mode, so advanced behaviour is unchanged. count_tuple_values and the target-column selection (insert_target_columns) are now shared by both grammars. Diagnostic: dml_insert_arity_diagnostics is mode-aware — advanced Form B expects all columns; simple Form B/C expects the user-fillable columns (serial/shortid auto-fill). It counts the DSL Form A role and scans the keyword-less Form C tuple. New catalog keys name the fillable/auto split and the all-auto-table case. Submit: a wrong-count DSL insert now parses Ok + carries the ERROR diagnostic, so a unified Ok-arm pre-flight (dsl_insert_count_mismatch_notes) blocks dispatch and teaches; the previous Err-arm note retires. advanced_alternative_note's gate now reads the validity verdict so it still fires for the parse-Ok-with-error shape. Docs: ADR-0036 Amendment 2 (+ README index) and requirements.md H1a. --- docs/adr/0036-typed-dml-values-vs-verbatim.md | 61 +++ docs/adr/README.md | 2 +- docs/requirements.md | 13 +- src/app.rs | 90 ++++- src/dsl/grammar/data.rs | 67 +++- src/dsl/grammar/shared.rs | 152 +++++--- src/dsl/grammar/sql_insert.rs | 56 +-- src/dsl/walker/mod.rs | 360 +++++++++++++++--- src/friendly/keys.rs | 9 + src/friendly/strings/en-US.yaml | 10 +- src/input_render.rs | 94 ++++- tests/typing_surface/insert_form_b.rs | 43 ++- tests/typing_surface/insert_form_c.rs | 25 +- ..._is_invalid@form_b_extra_serial_value.snap | 29 +- ..._at_close_paren@form_b_too_few_values.snap | 12 +- ...e_count_is_invalid@form_c_wrong_count.snap | 29 +- 16 files changed, 812 insertions(+), 240 deletions(-) diff --git a/docs/adr/0036-typed-dml-values-vs-verbatim.md b/docs/adr/0036-typed-dml-values-vs-verbatim.md index a571772..1771ded 100644 --- a/docs/adr/0036-typed-dml-values-vs-verbatim.md +++ b/docs/adr/0036-typed-dml-values-vs-verbatim.md @@ -422,6 +422,67 @@ So the live highlight catches *numeric-shape* mismatches (`int`/`decimal`/ `bool`), not malformed dates. The column-type **hint** still shows for every type. +## Amendment 2 — Simple-mode arity-diagnostic parity (issue #17, 2026-05-29) + +Phase 3b (Amendment 1) introduced, for the **advanced** grammar, an +*arity-gating tuple lookahead* (`tuple_value_list`, `sql_insert.rs`): +a wrong-count `VALUES` tuple is routed to the type-blind fallback so it +**structurally matches** and the §8.1 arity diagnostic +(`dml_insert_arity_diagnostics`) fires its friendly *"N column(s) … M +value(s)"* message instead of the engine's raw error. + +The **simple-mode DSL** insert (`data.rs` → `column_value_list`) never +had that gate: its value list is a fixed-length typed `Seq`, so a +wrong-count tuple **Mismatched** (parse error) and the arity diagnostic +never ran. Simple-mode learners — the primary audience — got a bare +`expected `,`/`)`` while advanced mode got the friendly message. Issue +#1 had papered over this with a *submit-time* teaching note bolted onto +the parse-error path; that workaround existed only because the +diagnostic couldn't fire in simple mode. + +**This amendment brings the two modes onto one model.** + +1. **Grammar (simple-mode only).** A `tuple_value_list`-style gate + (`dsl_insert_value_list`, `data.rs`) routes a wrong-count DSL insert + tuple to the shared type-blind `FALLBACK_VALUE_LIST`, so it matches + and the diagnostic fires. The gate is **gated to `Mode::Simple`**: in + advanced mode the DSL insert node stays strict (a non-SQL shape like + Form C must not spuriously match — advanced inserts are owned by + `sql_insert.rs`), so **advanced behaviour is byte-for-byte + unchanged**. `count_tuple_values` + the target-column selection + (`insert_target_columns`) are now shared by both grammars so the gate + and the typed slots never disagree. + +2. **Diagnostic is mode-aware.** `dml_insert_arity_diagnostics` gains a + `mode` parameter. The expected count differs by mode **because the + modes' auto-fill differs** (the very distinction this ADR preserves): + - **advanced Form B** auto-fills nothing → expects **all** columns + (unchanged); + - **simple Form B/C** auto-fills `serial`/`shortid` (ADR-0018 §3) → + expects the **user-fillable** columns. + It also now counts the DSL Form A column role (`insert_first_item`) + alongside the SQL `insert_column`, and scans the simple-mode Form C + tuple (no `values` keyword). New catalog keys + `diagnostic.insert_arity_mismatch_form_b_simple` (names the fillable + *and* the auto-generated columns) and + `diagnostic.insert_arity_mismatch_all_auto` (every column + auto-generated). + +3. **Unified submit pre-flight.** A wrong-count DSL insert now parses + `Ok` (so the typing-time diagnostic can fire), so dispatch is gated + by an Ok-arm pre-flight (`dsl_insert_count_mismatch_notes`) that + **blocks `ExecuteDsl`** and shows the teaching note — mirroring the + advanced Ok-arm pre-flight (`form_b_positional_count_mismatch_note`). + The issue #1 Err-arm note retires. The user-facing invalidity is the + ADR-0027 validity verdict (the `[ERR]` indicator), which reads the + diagnostic; the structural `InputState` is now `Valid` (it parses). + +**Scope guard.** This is *arity-diagnostic UX parity only*. It does +**not** consolidate value-handling, execution, or auto-fill across modes +— the deliberate mode-distinctness this ADR was narrowed to preserve +stands. The count difference per mode is a *consequence* of that +distinctness, not a violation of it. + ## See also - ADR-0030 §4 / ADR-0033 §10 — the execute-path this ADR **augments** diff --git a/docs/adr/README.md b/docs/adr/README.md index 88eebd2..a23a129 100644 --- a/docs/adr/README.md +++ b/docs/adr/README.md @@ -41,7 +41,7 @@ This directory contains the project's ADRs, recorded per - [ADR-0033 — The full SQL DML grammar (`INSERT` / `UPDATE` / `DELETE`)](0033-sql-dml-grammar.md) — **Accepted** (implemented + verified through sub-phase 3k, 2026-05-23; phase-exit report `docs/handoff/20260523-phase-3-verification.md`), the Phase-3 grammar commissioned by ADR-0030 §3: single- and multi-row `INSERT` (incl. `INSERT … SELECT` recursing through ADR-0032's `SQL_SELECT_COMPOUND`), `UPDATE` with `SET` assignment list, `DELETE`, all three optionally followed by `RETURNING projection_list`, plus full `ON CONFLICT … DO NOTHING / DO UPDATE` UPSERT on INSERT; **fixes the DSL-vs-SQL dispatch architecture for shared entry words (`insert`/`update`/`delete`)**: SQL-first / DSL-fallback in Advanced mode via a `Choice(SQL_shape, DSL_shape)` per shape, gated by a new walker capability `Node::Guard(fn)` — a zero-byte-consumption gating node that fails the enclosing Seq with a `ValidationError`; carries `Command::SqlInsert` / `SqlUpdate` / `SqlDelete` variants and `do_sql_*` worker handlers each of which knows the target table (for re-persistence) and the `returning: bool` flag (for DataResult routing); `shortid` auto-fill mirrors the DSL `do_insert` mechanism via worker post-fill; SQL DELETE produces the same per-relationship cascade summary the DSL DELETE does (ADR-0014 parity); three new walker diagnostics (`insert_arity_mismatch` ERROR, `auto_column_overridden` WARNING, `not_null_missing` WARNING) with positive + negative tests each; OOS list explicitly carves out `DEFAULT VALUES` (the project's planned seed feature), SQLite-specific `OR REPLACE` / `OR IGNORE` / `OR ABORT` / `OR FAIL` / `OR ROLLBACK` prefixes, `UPDATE FROM` multi-table updates, and WITH-prefixed DML; the `excluded` keyword inside `ON CONFLICT DO UPDATE` is a deliberate carve-out from ADR-0030 §7's engine-neutral posture (no standard-SQL UPSERT spelling exists that SQLite and PostgreSQL share); eleven phased sub-phases each with explicit exit gates + written DA gate, opening with the dispatch mechanism before any DML grammar lands; initial DA review recorded seven critiques that were resolved before status moved to Proposed; **Amendment 1 supersedes §2's dispatch mechanism**: the originally-chosen `Node::Guard(fn)` + `Choice(SQL_shape, DSL_shape)` was found during 3a to be unworkable as framed (any guard-in-`Choice` mechanism forces a `walk_choice` change — `walk_choice` only falls through on `NoMatch`, so Simple-mode valid-DSL would wrongly surface "this is SQL", and `walk_seq` treats a `NoMatch` past `idx 0` as a hard `Failed`, breaking Advanced-mode DSL fall-through); replaced by **category-grouped, mode-aware dispatch** in `walker::walk` (each `REGISTRY` entry tagged `CommandCategory::{Simple, Advanced}`, generalising the existing whole-command `is_advanced_only` gate), shared entry words carrying a node in both groups, no `Node::Guard` and no `walk_choice`/`walk_seq` change, advanced-mode completion SQL-first with DSL as a full-line fallback; **Amendment 2 (sub-phase 3f) supersedes §7's cascade mechanism**: the WHERE-injected per-child pre-count rested on a premise that was factually wrong about the DSL handler (which detects cascades by before/after row-count diffing inside a transaction, not by `Expr`-derived pre-count subqueries) and would have broken the §2 parity promise by reporting `SET NULL` the DSL path doesn't; replaced by mirroring `do_delete`'s count-diff exactly (verbatim DELETE executes, child-count diff observes the cascade — `ON DELETE CASCADE` row removals only, SET NULL deferred for both paths to preserve parity), which shares the render-layer formatter for free via `CommandOutcome::Delete` and **withdraws risk R2** (no WHERE-byte extraction, no N+1 subquery); **Amendment 3 (sub-phase 3j) records the command-identity model and defers the execution-mode side-channel**: a command is the typed outcome of a *mode-rooted* grammar path and its identity is intrinsic (Advanced mode tries SQL first, falls back to the *Simple* DSL command when no SQL branch matches a token, e.g. `delete … --all-rows`; note `update … --all-rows` does *not* fall back — the SQL `SET` expression eats `--all-rows`, harmless since the engine treats it as a comment); **Simple mode commits the DSL candidate for shared words** so the *real* DSL error surfaces, and when that line would also run in advanced mode the rendering layer **combines** them — DSL error **plus** an `advanced_mode.also_valid_sql` pointer ("… (valid as SQL in advanced mode)") — keeping the actionable DSL fix while pointing at advanced mode; bare "this is SQL" is reserved for entry words with no DSL form (`select`/`with`); a fully-overlapping input (`insert … values …`) legitimately yields *two distinct commands* (`Command::Insert` typed-AST vs `Command::SqlInsert` validated-text) that do the same thing but execute differently (ADR-0030 §4), so each is tested in the mode that produces it; **corrects the plan's 3j exit-gate premise** that the DSL DML tests run in Simple mode (they call `parse_command`, which defaults to Advanced) — the real invariant is "Simple-mode behaviour unchanged, Advanced mode SQL-first, DSL grammar tested in Simple mode, both variants tested in their producing mode", with §6/§7 parity keeping the paths observably equivalent; and **defers to its own future ADR** the execution-time mode side-channel (three-way `Mode`: simple/advanced/advanced-one-shot threaded through `Action`→worker, for mode-dependent *output* like echoing generated SQL) — today only the *rendering* side-channel `OutputLine.mode_at_submission` exists, and the three-way distinction is not required for Phase 3 dispatch correctness; **Amendment 4, 2026-05-27** (design agreed, pending impl): **reverses Amendment 3's `update … --all-rows` counter-example as a bug** — surfaced by the ADR-0038 echo design. The walker has no `--` comment support (it lexes two minus operators) while the engine treats `--` as a comment, so `update T set x=42 --all-rows` was silently parsed as the expression `42 - -all - rows` over **non-existent columns** `all`/`rows` (an ADR-0027 "flag-if-it-will-fail" case) and matched `SqlUpdate`. Decision: the `--all-rows` sequence makes the SQL `UPDATE` shape **fail**, so dispatch **falls back to the DSL `Update { AllRows }`** — symmetry with `delete … --all-rows`; no `--` comment feature introduced (trailing comments stay unsupported). Inverts `sql_dml_e2e.rs::e2e_update_all_rows_in_advanced_does_not_fall_back_to_dsl`; mechanism settled test-first in the build; folded into the ADR-0038 effort (makes `update … --all-rows` echoable); **Amendment 5, 2026-05-28** (implemented + verified, user-confirmed): `advanced_mode.also_valid_sql` (the cross-mode pointer from Amendment 3) fires on **validity**, not just **parse** — *"valid"* meaning `input_verdict_in_mode(input, schema, Mode::Advanced) == None` in the ADR-0027 sense (parse succeeds *and* no Warning/Error diagnostic from any pass). Surfaced by **issue #1**: a positional `INSERT INTO T VALUES (…)` (no column list) with a value count that didn't match the target's column count parsed in advanced but failed at the engine, so the syntactic-only Amendment-3 gate promised a mode switch that wouldn't help. Closes the gap by (a) extending `dml_insert_arity_diagnostics` (§8.1, previously Form A only — its own doc-comment deferred Form B) to also check the no-column-list form against the schema's column count, emitting a new `diagnostic.insert_arity_mismatch_form_b` ERROR per offending tuple, and (b) refactoring `advanced_alternative_note` to read the validity verdict instead of running its own bespoke check — any static diagnostic added to the pipeline in the future automatically participates in the pointer gate. Side benefit: the `[ERR]` validity indicator now lights up at typing time for the reported scenario, no longer needing a submit to learn the line is wrong. Tests pinned: `insert_form_b_arity_mismatch_under_supply_fires` / `_over_supply_fires` / `_match_is_silent` / `_unknown_table_is_silent` (walker); `ambient_hint_omits_advanced_pointer_when_form_b_value_count_wouldnt_match` (gate); `simple_mode_submit_of_sql_construct_appends_advanced_pointer` (pointer still fires for genuine SQL-only constructs against a known schema). Amendment 3's "would parse in advanced mode" should henceforth be read as a synonym for "valid in advanced mode" in this stricter sense; the user-confirmed behavioural change is exactly the issue #1 bug case (no other input flips its pointer state) - [ADR-0034 — `history.log` as a complete command journal; replay reads success-only](0034-history-journal-and-replay-filter.md) — **Accepted**, resolves a three-way tension in `history.log`'s roles found while implementing ADR-0033 3f: (1) the persistent log is success-only while the in-memory Up/Down recall ring records *every* submission (success or failure, "so users can recall and edit typo'd commands"), and the ring is re-seeded from the log on project open — so **failed commands are recallable within a session but silently lost across sessions**; (2) replay wants the state-building (successful) commands while recall wants everything typed, which one success-only file cannot serve; (3) `replay history.log` never actually worked — `run_replay` parses each whole line through the DSL parser with no understanding of the `||` record shape, so a real log fails on line 1, and **no test ever fed the pipe format to replay** (the `replay_history_log_records_subcommands_only` test only checks what replay *writes*, never replays the log as input). Decision: `history.log` becomes a **complete journal** — every submission recorded, tagged `ok`/`err` via the status field the format already reserved (ADR-0015 §5) — and **each consumer filters**: hydration reads all records (cross-session recall matches in-session), replay reads `ok` only (and learns the journal format, while still accepting bare-command `.commands` scripts; detection by the leading timestamp+status prefix so a `|` inside a bare command isn't misread). Successful commands stay journalled transactionally by the worker; failed commands are journalled `err` best-effort from the runtime/app error path (a parse failure never reaches the worker). Amends ADR-0006's "successfully executed" wording and ADR-0015 §5 ("status always `ok`") / §12 (hydration). Code deferred to two tracked test-first sub-tasks (journal-failures+filtering; replay-parses-journal-format); existing all-`ok` logs need no migration; **implemented 2026-05-24** (plan `docs/plans/20260524-adr-0034-history-journal.md`); **Amendment 1 (2026-05-24): replay filters out app-lifecycle commands** — a working `replay history.log` (the §3 fix) exposed that the journal also records `save as`/`load`/`new`/`export`/`import`/`rebuild`/`mode` (which would panic the worker dispatch or abort the replay), so replay now re-applies **only** schema/data write commands and **skips** every `Command::App` + nested `Command::Replay`; **all skips continue** (never abort — reversing the prior nested-`replay` refusal, so a journal containing a once-run `replay` needs no hand-editing, and the infinite-loop footgun is closed by construction), with a `[skip]` **warning** on `import` and nested-`replay` skips (their omission can leave replayed state incomplete) and silent skips for the rest; `replay.error_nested` removed, `replay.skipped_import`/`replay.skipped_replay` added, `ReplayCompleted` carries `warnings` - [ADR-0035 — Advanced-mode SQL DDL](0035-advanced-mode-sql-ddl.md) — **Accepted** (design agreed 2026-05-24; validated end-to-end by sub-phases 4a/4a.2/4a.3/4b `CREATE TABLE` (incl. foreign keys) + 4c `DROP TABLE [IF EXISTS]` + 4d `CREATE [UNIQUE] INDEX` / `DROP INDEX [IF EXISTS]` + 4e `ALTER TABLE` add/drop/rename column + 4f `ALTER TABLE … ALTER COLUMN TYPE` + 4g `ALTER TABLE` add/drop constraint + add FK + 4h `ALTER TABLE … RENAME TO` + 4i verification sweep (completion merge + simple/advanced completion colour + describe of table-level constraints + self-ref FK indicator + CREATE-TABLE help/usage), implemented 2026-05-25/26 — **Phase 4 complete**; **Amendment 1, 2026-05-26**: drop a composite UNIQUE via a derived, engine-neutral `unique_` name that reuses the existing `DROP CONSTRAINT ` grammar — no new syntax, no metadata, §4g anonymity intact; `describe` shows the name; dropping a UNIQUE-covered *column* now refuses with that name + the drop command), **Phase 4** of the ADR-0030 roadmap (peer of 0031/0032/0033) and **clarifies ADR-0030 §4**. Advanced-mode `CREATE`/`DROP`/`ALTER TABLE` + `CREATE`/`DROP INDEX` get their **own per-statement commands** (`SqlCreateTable`/`SqlAlterTable`/`SqlDropTable`/`SqlCreateIndex`/`SqlDropIndex`), like DML's `Sql*` set — but unlike DML they **execute *structurally*, not verbatim** (raw execution would lose the playground's types, named relationships, and `STRICT`; "verbatim" was a DML convenience, not a rule). Handlers **reuse the low-level schema/metadata helpers** where the operation matches simple mode and **stand alone where the SQL surface is richer** (clarity over forced refactoring); simple mode is untouched (additive). Dispatch: `create`/`drop` reuse ADR-0033 Amendment 1's category-grouped mode-aware dispatch (SQL-first, simple fallback); `alter` is a new advanced-only entry word. Full surface (no pre-emptive cuts, `Q4`): `CREATE TABLE` with column + table constraints, single/compound `PRIMARY KEY`, inline + table-level `FOREIGN KEY` → **named relationships** (one statement = one command = **one undo step**, ADR-0006); `ALTER TABLE` add/drop/rename column, `ALTER COLUMN TYPE`, add/drop constraint, add FK, **`RENAME TO`** (advanced-only table rename — new low-level op renaming the table + its CSV + the relationship and table-CHECK metadata, closing the rename half of `C1`); `CREATE [UNIQUE] INDEX` / `DROP INDEX`. Type slot accepts the ten playground keywords **and** standard-SQL aliases (`integer`→`int`, `varchar`→`text`, `timestamp`→`datetime`, …; length args accepted-and-ignored; no engine type names in/out — ADR-0030 §5). `CHECK`/`DEFAULT` reuse ADR-0031 `sql_expr`. **Pre-implementation `/runda` refinements (2026-05-24, user-confirmed):** `CREATE TABLE`/`DROP TABLE` **admit `IF [NOT] EXISTS`** (no-op-that-succeeds-with-a-note — a near-universal cross-vendor idiom, reclassified *into* scope, not engine-specific); `INTEGER PRIMARY KEY` maps to a **plain `int`** PK, *not* auto-increment (`serial` stays the sole auto-increment type). **Column-type-conversion is unified** (ADR-0017 engine, mode-appropriate policy): clean auto-converts and incompatible/own-type-static cases refuse in both modes, but a **lossy** change refuses-by-default in simple mode (`--force-conversion` opts in) while advanced mode **performs it with a loss note** and relies on **`undo` as the safety net** — no force flag, no dropping to simple mode (a payoff of shipping ADR-0006 first). OOS: views/triggers/txn-control/PRAGMA/etc. (ADR-0030 §3), the Postgres `USING` clause, and the DSL→SQL teaching echo (ADR-0030 Phase 5). Sub-phases 4a–4i, plus **4a.2** (per-column `CHECK`/`DEFAULT` via raw `sql_expr` text — `sql_expr` is validate-only, no `Expr` AST — + composite `UNIQUE(a,b)`; no new internal table) and **4a.3** (table-level/multi-column `CHECK`, landed via the new `__rdbms_playground_table_checks` metadata table because SQLite has no PRAGMA for CHECK; the builder tells a table-level CHECK from a column-level one by element position) and **4b** (foreign keys — inline `REFERENCES` + table-level `FOREIGN KEY` → ADR-0013 named relationships in the create transaction, one undo step; self-references + bare `REFERENCES ` supported, user-confirmed) and **4c** (`DROP TABLE [IF EXISTS]` → `SqlDropTable`, reusing `do_drop_table`; `IF EXISTS` is a no-op-with-note via `DropOutcome::Skipped`) and **4d** (`CREATE [UNIQUE] INDEX [IF NOT EXISTS] [] ON (cols)` → `SqlCreateIndex` and `DROP INDEX [IF EXISTS] ` → `SqlDropIndex`, reusing `do_add_index`/`do_drop_index`; **`CREATE UNIQUE INDEX` admitted** — ADR-0025 **Amendment 1** — via an additive `IndexSchema.unique` flag that round-trips through `project.yaml` and rebuild, with `[unique]` markers in the structure view + items panel, while simple-mode `add unique index` stays deferred; `IF [NOT] EXISTS` reuses the 4c skip path; `create`/`drop` each gain a *second* advanced node, exercising the all-candidates dispatch) and **4e** (`ALTER TABLE` add/drop/rename column → `SqlAlterTable`; `alter` is a new advanced-**only** entry word, runtime-decomposed to the existing `do_add_column`/`do_drop_column`/`do_rename_column` — no new worker layer; `do_add_column` extended to consume raw `default_sql`/`check_sql` so ADD COLUMN reaches CREATE-TABLE constraint parity; drop/rename refuse a column any CHECK references (table-level AND column-level, incl. a column's own self-check on rename) — the 4a.3 deferral, via a raw-CHECK-text tokenizer in the shared executors, so it guards both surfaces and fixes a latent rename-drift bug; SQL DROP COLUMN refuses an index-covered column with no `--cascade` spelling; the column executors + `do_add_index` gained an internal-`__rdbms_*`-table guard — all user-confirmed) and **4f** (`ALTER TABLE … ALTER COLUMN TYPE` → a fourth `AlterTableAction`, runtime-decomposed to the existing `change_column_type` with `ChangeColumnMode::ForceConversion` — which **is** the §7 advanced policy: lossy converts *with a note* (no force flag), incompatible + ADR-0017 static refusals (`↔ blob`, same-type, `date ↔ datetime`, non-`int → serial`) still refuse, while **`int → serial` is allowed** (auto-fills nulls + UNIQUE, ADR-0018 §8 — the §7 "→serial refused" summary is looser than the code); the builder discriminates the fourth branch by the **`type` keyword** (unique — ADD COLUMN's type is an ident), the type slot reuses `SQL_TYPE`; the internal-`__rdbms_*` guard was folded into `do_change_column_type`, closing the simple `change column` exposure too — user-confirmed) and **4g** (`ALTER TABLE … ADD [CONSTRAINT ] (CHECK | UNIQUE | FOREIGN KEY)` + `DROP CONSTRAINT `; ADD = CHECK + composite UNIQUE + FK, with `ADD PRIMARY KEY` and a *named* UNIQUE refused — composite UNIQUE is anonymous in our model; each ADD reuses a low-level path (table-CHECK/UNIQUE rebuild with a dry-run guard; FK → `add_relationship`, bare `REFERENCES

` → parent single-PK), DROP CONSTRAINT resolves the name to a table-CHECK then a child-side FK; **named table-CHECKs round-trip** via a nullable `name` column on `__rdbms_playground_table_checks` (**rebuild-only** arrival — pre-4g projects gain it on `rebuild`, a named add on an un-upgraded project is refused with a friendly "rebuild first" message) *and* a `project.yaml` `check_constraints` extension to an `{expr, name}` mapping (the bare-string form still reads); the internal-`__rdbms_*` guard was folded into `do_add_constraint`/`do_add_relationship`, completing that guard class — all user-confirmed) and **4h** (`ALTER TABLE … RENAME TO` — the one genuinely new low-level op, `do_rename_table`: a native engine rename plus one-transaction reconciliation of every metadata row naming the table (`__rdbms_playground_columns`, **both ends** of `__rdbms_playground_relationships`, `__rdbms_playground_table_checks`), the CSV file (the existing rewrite+delete path — no new persistence method), and **CHECK text that qualifies a column with the old table name** (`T.age`→`U.age`, a planning-`/runda` finding — the engine rewrites the live CHECK but the stored text would drift and break a fresh rebuild; `rewrite_check_table_qualifier` keeps them in step); grammar splits the `rename` verb into one branch with an inner Choice on a distinct second keyword (`column` vs `to`), the new-name slot mirroring the `CREATE TABLE` name slot; refuses same-name / existing-target / `__rdbms_*` / non-existent, with **case-insensitive** collision checks behind an engine-neutral pre-check (a finished-slice `/runda` finding — the engine matches names case-insensitively); auto-named indexes *and* relationships keep their stale names (only table-name columns update — §6 scope); one undo step; advanced-only, closing the rename half of `C1` — all user-confirmed) and **4i** (the verification sweep that completes Phase 4: the shared-entry-word completion merge + the simple-vs-advanced completion colour-when-mixed with Both→Advanced→Simple block ordering; `describe` of table-level composite UNIQUE + table CHECK; the self-ref FK pre-submit indicator fix; and the CREATE-TABLE help/usage skeleton refresh). **All of Phase 4 (4a–4i) is shipped.** Each sub-phase has exit + DA gates; **Amendment 2, 2026-05-27** (design agreed, pending impl): a **standard-first dialect stance** (refines ADR-0030's "standard SQL" posture — ISO spelling is canonical + echoed where one exists; a vendor shorthand may be *accepted* but isn't canonical; where ISO offers none, *one* documented vendor spelling is a deliberate extension) + an `ALTER COLUMN` **constraint gap-fill** surfaced by the ADR-0038 echo design: makes ISO `ALTER COLUMN … SET DATA TYPE` the canonical type-change verb with `TYPE` retained as a synonym (**reverses §4f's "no `SET DATA TYPE`"**), and adds `SET/DROP DEFAULT` (ISO) + `SET/DROP NOT NULL` (the one documented extension — ISO has no in-place NOT-NULL verb; PostgreSQL's chosen for being type-independent), all **rebuild-backed via the existing ADR-0029 `do_add_constraint`/`do_drop_constraint` executors** (dry-run + internal-table guards free, no new worker layer), reaching simple↔advanced constraint-mod **parity for NOT NULL + DEFAULT**; the **rebuild stays hidden** (Category-1 engine detail, ADR-0038). Residual gap left open + flagged: dropping a **column-level (anonymous) UNIQUE/CHECK** (no portable name — same class as Am1's parallel gap), which ADR-0038's catalogue marks "no headline echo" -- [ADR-0036 — Value validation for advanced-mode DML](0036-typed-dml-values-vs-verbatim.md) — **Accepted** (design agreed + `/runda`'d 2026-05-26; mechanism then **deliberately narrowed** from "bind literals via the DSL path" to surgical **"validate-and-retain, execute verbatim"** after the user resisted consolidating the modes and a concrete auto-fill difference confirmed even the single-row literal case isn't identical across modes; **Phases 1–2 implemented** 2026-05-26 — `INSERT … VALUES` and `UPDATE … SET` literal validation + offending-value retention, capture-at-parse with no grammar change; **Phase 3a implemented** 2026-05-26 — live typed-slot hints + numeric-shape highlighting for `UPDATE`/UPSERT `SET col = ` via a boundary-aware lookahead (Amendment 1 corrects this ADR's naive-`Choice` sketch); **Phase 3b implemented** 2026-05-27 — per-position typed slots for `INSERT … VALUES` (single/multi-row, Form A/B) via a new zero-width `Node::SetColumn` primitive + an arity-gating tuple lookahead that preserves the §8.1 arity diagnostic; **fully implemented**). **Augments — does NOT supersede — ADR-0030 §4 / ADR-0033 §10**: execution stays verbatim, ADR-0033 Amendment 3's two-command identity (`Insert` vs `SqlInsert`) **stands**. The problem (investigated 2026-05-26; characterization test `sql_insert.rs::sql_dml_skips_app_level_value_validation_that_the_dsl_enforces` proves it): advanced-mode SQL DML gets **none** of the DSL's value feedback — a malformed `date` like `2025/01/15` is silently written, and the offending value is missing from constraint errors — because literal values are spliced into text and discarded (only `STRICT` storage types check them). **Fix (surgical): validate each literal value against its column type before the verbatim insert, and retain it for error reporting — sharing only the per-type validators (`Value::bind_for_column`/`validate_date`/`shortid::validate`), nothing else.** No binding, no statement reconstruction, no auto-fill change, no command-identity collapse — because the two gaps are closed by validation + retention alone, and executing the user's own text is already safe. The literal set = `NULL`/boolean/string/**signed**-numeric; arithmetic/functions/subqueries/column-refs are expressions (skipped — the engine evaluates them). `WHERE` not validated (it's an expression in general; motivation met by `VALUES`/`SET`); `SELECT`/`INSERT … SELECT`/`RETURNING`/`ON CONFLICT` need no special handling since execution is untouched. Phased: **Phase 1** capture-at-parse + validate + retain for `INSERT … VALUES` (no grammar change, no reparse — closes both proven gaps); **Phase 2** `UPDATE … SET` literals; **Phase 3** completion hinting/highlighting (the only part needing a grammar change — a typed-literal slot vs `sql_expr` reusing the DSL `TypedValueSlot`s at `data.rs:141`/`189`/`269`, discriminated by a **boundary-aware lookahead** not a naive `Choice` per **Amendment 1**; split into **3a** `SET` (done) and **3b** `VALUES` (pending); supersedes only Phase 1/2's literal *detection*, not the validation/enrichment on top). Non-goals: binding/reconstruction, collapsing command identity (Am3 stands), changing `serial`/`shortid` auto-fill (`requirements.md` **X4**, a separate possible-bug), a structural `SELECT`, a full SQL-expression AST. Embodies `requirements.md` **X5** (share a *mechanic*, not a *command*); the neutral "that value" safety net (ADR-0035 Amendment 1) stays correct for genuinely-computed values +- [ADR-0036 — Value validation for advanced-mode DML](0036-typed-dml-values-vs-verbatim.md) — **Accepted** (design agreed + `/runda`'d 2026-05-26; mechanism then **deliberately narrowed** from "bind literals via the DSL path" to surgical **"validate-and-retain, execute verbatim"** after the user resisted consolidating the modes and a concrete auto-fill difference confirmed even the single-row literal case isn't identical across modes; **Phases 1–2 implemented** 2026-05-26 — `INSERT … VALUES` and `UPDATE … SET` literal validation + offending-value retention, capture-at-parse with no grammar change; **Phase 3a implemented** 2026-05-26 — live typed-slot hints + numeric-shape highlighting for `UPDATE`/UPSERT `SET col = ` via a boundary-aware lookahead (Amendment 1 corrects this ADR's naive-`Choice` sketch); **Phase 3b implemented** 2026-05-27 — per-position typed slots for `INSERT … VALUES` (single/multi-row, Form A/B) via a new zero-width `Node::SetColumn` primitive + an arity-gating tuple lookahead that preserves the §8.1 arity diagnostic; **fully implemented**). **Augments — does NOT supersede — ADR-0030 §4 / ADR-0033 §10**: execution stays verbatim, ADR-0033 Amendment 3's two-command identity (`Insert` vs `SqlInsert`) **stands**. The problem (investigated 2026-05-26; characterization test `sql_insert.rs::sql_dml_skips_app_level_value_validation_that_the_dsl_enforces` proves it): advanced-mode SQL DML gets **none** of the DSL's value feedback — a malformed `date` like `2025/01/15` is silently written, and the offending value is missing from constraint errors — because literal values are spliced into text and discarded (only `STRICT` storage types check them). **Fix (surgical): validate each literal value against its column type before the verbatim insert, and retain it for error reporting — sharing only the per-type validators (`Value::bind_for_column`/`validate_date`/`shortid::validate`), nothing else.** No binding, no statement reconstruction, no auto-fill change, no command-identity collapse — because the two gaps are closed by validation + retention alone, and executing the user's own text is already safe. The literal set = `NULL`/boolean/string/**signed**-numeric; arithmetic/functions/subqueries/column-refs are expressions (skipped — the engine evaluates them). `WHERE` not validated (it's an expression in general; motivation met by `VALUES`/`SET`); `SELECT`/`INSERT … SELECT`/`RETURNING`/`ON CONFLICT` need no special handling since execution is untouched. Phased: **Phase 1** capture-at-parse + validate + retain for `INSERT … VALUES` (no grammar change, no reparse — closes both proven gaps); **Phase 2** `UPDATE … SET` literals; **Phase 3** completion hinting/highlighting (the only part needing a grammar change — a typed-literal slot vs `sql_expr` reusing the DSL `TypedValueSlot`s at `data.rs:141`/`189`/`269`, discriminated by a **boundary-aware lookahead** not a naive `Choice` per **Amendment 1**; split into **3a** `SET` (done) and **3b** `VALUES` (pending); supersedes only Phase 1/2's literal *detection*, not the validation/enrichment on top). Non-goals: binding/reconstruction, collapsing command identity (Am3 stands), changing `serial`/`shortid` auto-fill (`requirements.md` **X4**, a separate possible-bug), a structural `SELECT`, a full SQL-expression AST. Embodies `requirements.md` **X5** (share a *mechanic*, not a *command*); the neutral "that value" safety net (ADR-0035 Amendment 1) stays correct for genuinely-computed values; **Amendment 2 (issue #17, 2026-05-29)** brings the §8.1 arity diagnostic to **simple mode** at parity with advanced: a `tuple_value_list`-style gate (`dsl_insert_value_list`, **simple-mode-gated** so advanced is byte-for-byte unchanged) routes a wrong-count DSL insert tuple to the type-blind fallback so it matches and the friendly arity diagnostic fires (instead of a bare `expected `,`/`)``); `dml_insert_arity_diagnostics` is now mode-aware (advanced Form B = all columns, simple Form B/C = user-fillable since serial/shortid auto-fill, ADR-0018 §3), counts the DSL Form A role (`insert_first_item`) and the keyword-less Form C tuple, with new keys `insert_arity_mismatch_form_b_simple` / `_all_auto`; a wrong-count DSL insert now parses `Ok` + carries the ERROR diagnostic (the `[ERR]` verdict), with a unified Ok-arm submit pre-flight (`dsl_insert_count_mismatch_notes`) blocking dispatch + teaching (the issue #1 Err-arm note retires). **Arity-UX parity only — no consolidation of value-handling/execution/auto-fill; the deliberate mode-distinctness stands** - [ADR-0037 — Execution-time mode side-channel (the three-way submission mode)](0037-execution-time-mode-side-channel.md) — **Accepted** (design agreed 2026-05-27; channel **implemented + verified end-to-end** by its motivating consumer — ADR-0038's fully-shipped DSL → SQL teaching echo — across handoff-46 `04c8e42` (channel + first echo slice), handoff-47 `90479cb` (full Bucket A), `275c726` (Bucket B resolved-name + multi-statement renderers), `e6ad1ae` (the category-3 `--dont-convert` caveat — gated on this channel too), and `2aab457` (the §4 styled-runs rendering polish)), **redeems the follow-up deferred by ADR-0033 Amendment 3** (which named this ADR and its motivating consumer). Establishes the channel that lets a command know, **at execution time**, the effective mode it ran under — so execution can adjust **output** without touching **identity** (the motivating case: a DSL-form `create table` echoing the equivalent SQL when run in advanced mode, silent in simple — ADR-0030 §10, realised by ADR-0038). Introduces a **new per-submission enum `SubmissionMode` { Simple, Advanced, AdvancedOneShot }** — *refining* Amendment 3's "widen `Mode`" sketch: the persistent input `Mode` stays **two-way** (`mode.rs` keeps the one-shot `:` out of persistent state), and the three-way distinction lives on the per-submission channel where the transient `:` belongs. Resolved at submit time (Simple+`:` → `AdvancedOneShot`; Advanced `:` is a no-op), threaded through `Action::ExecuteDsl` → worker, **output-only** (no executor branches its *effect* on it — Amendment 3 forbids behavioural mode dependence). The worker builds the teaching echo (+ category-3 expansion data — ADR-0038) for DSL-form commands in advanced/one-shot mode and returns it; the App renders it beneath `[ok]`. Co-located with execution because the echo's harder forms (resolved auto-names, generated `shortid`s, conversion counts) are worker-computed facts, and gating on mode means the work happens only when shown. Alternatives weighed + rejected: widening `Mode` (conflates transient/persistent state); App-side gating with the worker always emitting echo data (computes unconditionally, doesn't generalise, re-opens the render-side framing ruled against). Scope: channel + resolution rule only — the renderer/catalogue/`Value → SQL-literal` are ADR-0038, the `ALTER COLUMN` gap-fill is the ADR-0035 amendment - [ADR-0038 — The DSL → SQL teaching echo](0038-dsl-to-sql-teaching-echo.md) — **Accepted** (design agreed 2026-05-27; **fully implemented + verified** — every catalogue row in §7 Buckets A + B and the §6 category-3 prose round-trips per line through the advanced walker per §1, and the §4 de-emphasised styled-runs polish is wired: handoff-46 `04c8e42` shipped the channel + create-table slice, handoff-47 `90479cb` the full Bucket A expansion + a skeleton contract-gap fix (dropped per-column `DEFAULT`/`CHECK`), `275c726` the Bucket B resolved-name + multi-statement renderers (auto- and user-named `add index`, positional `drop index`, `add`/`drop relationship` in both selector forms, `drop column --cascade`, `add relationship --create-fk`), `e6ad1ae` the last category-3 line — the `change column --dont-convert` *caveat* (shortid + transform notes were already surfaced via pre-existing `client_side.*` keys), and `2aab457` the §4 styled-runs polish: a new `OutputKind::TeachingEcho` custom rendering branch (dimmed `Executing SQL:` prefix + the SQL re-lexed in advanced mode for token-class colouring, same as the input echo) plus a new `OutputStyleClass::Hint` for every cat-3 prose line — caveat *and* the existing illuminating notes, user-confirmed broader scope), **realises ADR-0030 §10** (the teaching bridge) — the Phase-5 echo **ADR-0035 §12 forward-referenced** — building on **ADR-0037** (the `SubmissionMode` gate) and **ADR-0035 Amendment 2** (standard-first dialect + `ALTER COLUMN` gap-fill). When a **DSL-form** command runs in advanced/one-shot mode, the worker emits the equivalent SQL beneath `[ok]` as a de-emphasised styled `OutputLine` (ADR-0028); the App renders it. **Defining invariant — the copy-paste contract:** every echoed line is *runnable advanced-mode SQL* (round-trip-tested: parse the echo → same-effect command; a planned "copy the echo" affordance depends on it). **Type vocabulary = the playground's own keywords** (`serial`/`shortid`/…, accepted by `from_sql_name`, decision (a)); **statement shape = the standard-first dialect** (Am2). **DML uses substituted literals, not `?`** (per-type `Value → SQL-literal`, round-trip-safe; `blob` moot — no literal syntax exists; auto-gen columns omitted to match `do_insert` + X4). **Firing reality — a DDL + `show data` feature:** in advanced mode `insert`/`update`/`delete … where` are SQL-first (`Sql*` = already SQL = nothing to echo per §10); only DSL-*only* spellings echo (DDL + `show data` + the `delete`/`update … --all-rows` fall-throughs — the latter via **ADR-0033 Amendment 4**, a bug-fix folded in here that reverses Amendment 3's `update … --all-rows` misparse). **Three-category framework** for "what happens beyond the literal SQL": **(1) engine-implementation-hiding** (the rebuild, rowid PK, non-PK `serial` MAX+1) — *never surfaced*; **(2) decomposable into advanced SQL** (`drop column --cascade`, `--create-fk` relationship) — *shown as the runnable multi-line sequence, one statement per line*; **(3) playground type-behaviour with no SQL-expressible form** (`shortid` generation — no `shortid()`; type-conversion transforms — no `USING`) — *de-emphasised prose expansion from the worker's `client_side.*` notes*. Carries the **full catalogue** (Buckets A single-statement / B resolved-name + multi-line / C no-echo) mapping every DSL-form command to its echo. OOS: reverse SQL→DSL echo (§13 OOS-5), app commands / `show table` / `explain` / `replay`, a `blob` literal, the column-level UNIQUE/CHECK drop residual (Bucket C until Am2's gap closes), and surfacing any category-1 engine internal - [ADR-0039 — EXPLAIN over advanced-mode SQL queries](0039-explain-over-advanced-sql.md) — **Accepted** (decision recorded 2026-05-27; **implementation deferred** as a follow-up to the ADR-0037/0038 echo effort — not in that pass), **supersedes ADR-0030 §13 OOS-2**. Lets `explain` wrap the advanced SQL commands (`Select`/`SqlInsert`/`SqlUpdate`/`SqlDelete`) in addition to the DSL `ShowData`/`Update`/`Delete` it already covers (ADR-0028), running `EXPLAIN QUERY PLAN` over the validated SQL text through the existing ADR-0028 span-styled plan tree (advanced mode only; DSL `explain` unchanged in both modes). Reframes OOS-2 as a *deferred* scope exclusion (per ADR-0000's new out-of-scope discipline), not a principled rejection — surfaced 2026-05-27 while characterising advanced-mode explain (suspected a bug; it was OOS-2 as written). Self-contained, orthogonal to the echo (the echo renders SQL *from* DSL; this explains SQL the user *wrote*). OOS (deferred): EXPLAIN of DDL (no query plan exists). Built test-first when picked up diff --git a/docs/requirements.md b/docs/requirements.md index 24fcc84..4dab829 100644 --- a/docs/requirements.md +++ b/docs/requirements.md @@ -540,10 +540,15 @@ handoff-14 cleanup; 449 after B2/C2.) names the schema-correct next token (`,` between values, `)` after the last) instead of the type-blind close-paren, and so a wrong-arity closed tuple surfaces the real parse error rather - than a misleading "submit with Enter"). A systematic pass is - still pending; simple-mode wrong-count tuples still get a generic - expected-token message rather than the friendly arity diagnostic - advanced mode shows (issue #17). + than a misleading "submit with Enter"; **issue #17 / ADR-0036 + Amendment 2, 2026-05-29** then brought the §8.1 arity diagnostic + to **simple mode** at parity with advanced — a wrong-count DSL + insert (Form A/B/C) now fires the friendly *"N value(s) for + `col`…"* message at typing time, counted against the user-fillable + columns, with `serial`/`shortid` auto-fill named; new keys + `diagnostic.insert_arity_mismatch_form_b_simple` / + `diagnostic.insert_arity_mismatch_all_auto`). A systematic pass is + still pending. - [ ] **H2** `hint` provides contextual help for the current input or the most recent error. - [ ] **H3** `help` provides general reference and per-command diff --git a/src/app.rs b/src/app.rs index 4919966..49fb9c3 100644 --- a/src/app.rs +++ b/src/app.rs @@ -1361,6 +1361,30 @@ impl App { source: input.to_string(), }]; } + // Issue #17: simple-mode (DSL) counterpart. A wrong-count + // DSL insert now parses `Ok` (so the typing-time arity + // diagnostic can fire), so dispatch is gated here — the + // same teaching the old parse-error path showed, now with + // the insert reliably blocked from reaching the worker. + if let Some(notes) = crate::input_render::dsl_insert_count_mismatch_notes( + input, + &cmd, + &self.schema_cache, + ) { + self.push_output(OutputLine { + text: crate::t!("dsl.running", input = input), + kind: OutputKind::Echo, + mode_at_submission: mode, + styled_runs: None, + }); + for note in notes { + self.note_error(note); + } + self.note_error(render_usage_block(input)); + return vec![Action::JournalFailure { + source: input.to_string(), + }]; + } self.push_output(OutputLine { text: crate::t!("dsl.running", input = input), kind: OutputKind::Echo, @@ -1426,19 +1450,13 @@ impl App { { self.note_error(note); } - // Issue #1 sub-task 2: append a teaching note when the - // Form B `insert into values (…)` line failed - // because the user supplied more values than the - // non-auto-generated columns expect. The parse error - // shows the literal "expected `)`"; the note explains - // *why* fewer values are expected and shows the - // column-list override path. - if mode == Mode::Simple - && let Some(note) = - crate::input_render::form_b_extra_values_note(input, &self.schema_cache) - { - self.note_error(note); - } + // Issue #1 sub-task 2's Form B teaching note used to be + // appended here, because a wrong-count Form B insert + // failed to parse and landed in this Err arm. As of issue + // #17 such tuples parse `Ok` (so the typing-time arity + // diagnostic fires) and the teaching + dispatch block now + // live in the Ok arm's `dsl_insert_count_mismatch_notes` + // pre-flight — a single model shared with advanced mode. // ADR-0021 §2: append the usage block (if a // known command-entry keyword was consumed) or // the available-commands fallback (§5). @@ -3129,6 +3147,52 @@ mod tests { ); } + #[test] + fn simple_mode_submit_of_form_b_count_mismatch_does_not_dispatch() { + // Issue #17 EXECUTION SAFETY. Once simple-mode wrong-count Form B + // tuples parse `Ok` (so the typing-time arity diagnostic can + // fire), the submit path must still NOT dispatch the insert — a + // wrong-count insert would otherwise reach the worker and fail + // (or, worse, write the wrong row). The unified Ok-arm pre-flight + // must block dispatch exactly as the old parse-error path did. + let mut app = App::new(); + install_customers_schema_two_serials(&mut app); + type_str(&mut app, "insert into Customers values ('Oli', 52, 3)"); + let actions = submit(&mut app); + assert!( + !actions.iter().any(|a| matches!(a, Action::ExecuteDsl { .. })), + "simple-mode Form B count mismatch must NOT dispatch; got: {actions:?}", + ); + } + + #[test] + fn simple_mode_submit_of_form_b_undersupply_does_not_dispatch() { + // Companion to the above for under-supply. + let mut app = App::new(); + install_customers_schema_two_serials(&mut app); + type_str(&mut app, "insert into Customers values ('Oli')"); + let actions = submit(&mut app); + assert!( + !actions.iter().any(|a| matches!(a, Action::ExecuteDsl { .. })), + "simple-mode Form B under-supply must NOT dispatch; got: {actions:?}", + ); + } + + #[test] + fn simple_mode_submit_of_form_a_count_mismatch_does_not_dispatch() { + // Form A (explicit column list) wrong count must also not + // dispatch — it previously parse-errored; the unified pre-flight + // must keep it blocked. + let mut app = App::new(); + install_customers_schema_two_serials(&mut app); + type_str(&mut app, "insert into Customers (Name, Age) values ('Oli')"); + let actions = submit(&mut app); + assert!( + !actions.iter().any(|a| matches!(a, Action::ExecuteDsl { .. })), + "simple-mode Form A count mismatch must NOT dispatch; got: {actions:?}", + ); + } + #[test] fn simple_mode_submit_of_sql_construct_appends_advanced_pointer() { // ADR-0033 Amendment 3 (+ Amendment 5): submitting a line in diff --git a/src/dsl/grammar/data.rs b/src/dsl/grammar/data.rs index 7939f39..a9c08c2 100644 --- a/src/dsl/grammar/data.rs +++ b/src/dsl/grammar/data.rs @@ -27,7 +27,10 @@ use crate::dsl::command::{Command, Expr, RowFilter}; use crate::dsl::grammar::{ CommandNode, IdentSource, Node, NumberValidator, ValidationError, Word, expr, - shared::{column_value_list, current_column_value}, + shared::{ + FALLBACK_VALUE_LIST, column_value_list, count_tuple_values, + current_column_value, insert_target_columns, + }, sql_delete, sql_insert, sql_select, sql_update, }; use crate::dsl::walker::context::WalkContext; @@ -141,12 +144,13 @@ static INSERT_COMMA: Node = Node::Punct(','); /// First-paren resolver (ADR-0024 §Phase D Form-C type-awareness). /// Peeks the first token after `(` to route to Form A's /// column-name list or Form C's typed value list. -fn insert_first_paren(_ctx: &WalkContext, source: &str, pos: usize) -> Node { +fn insert_first_paren(ctx: &WalkContext, source: &str, pos: usize) -> Node { if first_paren_item_is_value_literal(source, pos) { - // Form C — bare value list. `column_value_list` with no - // user-listed columns dispatches per non-auto-generated - // column, exactly as Form B does. - Node::DynamicSubgrammar(column_value_list) + // Form C — bare value list. Arity-gated exactly like Form B's + // `values (…)`: a correct-count tuple gets the typed per-column + // slots; a wrong-count tuple routes to the type-blind fallback + // so it still matches and the arity diagnostic fires (issue #17). + dsl_insert_value_list(ctx, source, pos) } else { // Form A (or Form A in progress / empty paren). Node::Repeated { @@ -189,12 +193,51 @@ fn first_paren_item_is_value_literal(source: &str, pos: usize) -> bool { const INSERT_PAREN_LIST: Node = Node::Lookahead(insert_first_paren); -/// Schema-aware value list: when the walker has a populated -/// `current_table_columns`, unfolds to a `Seq` of typed slots -/// per column (`int_slot`, `text_slot`, …). When schemaless, -/// falls back to the pre-Phase-D `Repeated(VALUE_LITERAL, ',', 1)` -/// shape (ADR-0024 §Phase D §column_value_list). -const INSERT_VALUES_LIST: Node = Node::DynamicSubgrammar(column_value_list); +/// Insert value-list arity gate (issue #17) — the simple-mode DSL +/// counterpart of the advanced grammar's `tuple_value_list` +/// (`sql_insert.rs`). Routes a correct-arity tuple to the typed +/// per-column slots ([`column_value_list`]) and a wrong-arity tuple to +/// the type-blind [`FALLBACK_VALUE_LIST`], so the wrong-count tuple +/// still structurally matches and the per-tuple arity diagnostic +/// (ADR-0033 §8.1, made mode-aware for issue #17) fires its friendly +/// message instead of a bare "expected `,`/`)`". +/// +/// Target arity comes from [`insert_target_columns`] — the same source +/// `column_value_list` uses, so gate and slots never disagree. `None` +/// (schemaless / unknown table / all-auto-generated) → fallback: either +/// we can't gate (schemaless) or the all-auto case wants the tuple to +/// match so the diagnostic can explain it. +/// +/// **Simple-mode only.** The fallback routing is what lets a wrong-count +/// tuple structurally match (so the diagnostic fires); that is a +/// simple-mode behaviour. In advanced mode the DSL insert node must stay +/// strict — otherwise a non-SQL shape like Form C (`insert into T +/// (1, 2)`, no `values`) would spuriously match here and be accepted in +/// advanced mode, where SQL requires `values` and the dedicated SQL +/// grammar (`sql_insert.rs`) owns inserts. Keeping advanced strict +/// preserves the pre-#17 advanced behaviour exactly (issue #17). +fn dsl_insert_value_list(ctx: &WalkContext, source: &str, pos: usize) -> Node { + if ctx.mode != crate::mode::Mode::Simple { + return Node::DynamicSubgrammar(column_value_list); + } + let Some(cols) = insert_target_columns(ctx) else { + return FALLBACK_VALUE_LIST; + }; + let (count, closed) = count_tuple_values(source, pos); + let arity_ok = if closed { count == cols.len() } else { count <= cols.len() }; + if arity_ok { + Node::DynamicSubgrammar(column_value_list) + } else { + FALLBACK_VALUE_LIST + } +} + +/// Schema-aware value list, arity-gated (issue #17): a correct-count +/// tuple unfolds to a `Seq` of typed slots per column (`int_slot`, +/// `text_slot`, …); a wrong-count tuple or a schemaless walk falls back +/// to the type-blind `Repeated(VALUE_LITERAL, ',', 1)` shape (ADR-0024 +/// §Phase D §column_value_list). +const INSERT_VALUES_LIST: Node = Node::Lookahead(dsl_insert_value_list); const INSERT_OPTIONAL_VALUES_NODES: &[Node] = &[ Node::Word(Word::keyword("values")), diff --git a/src/dsl/grammar/shared.rs b/src/dsl/grammar/shared.rs index 56e7721..7a6e320 100644 --- a/src/dsl/grammar/shared.rs +++ b/src/dsl/grammar/shared.rs @@ -371,12 +371,116 @@ pub(crate) const FALLBACK_VALUE_LITERAL: Node = Node::Hinted { inner: &FALLBACK_VALUE_LITERAL_INNER, }; -const FALLBACK_VALUE_LIST: Node = Node::Repeated { +/// The type-blind value list. `pub(crate)` so the insert value-list +/// arity gate (`data.rs`, issue #17) can route a wrong-count tuple here +/// — exactly as the advanced grammar's `tuple_value_list` does — so the +/// tuple still structurally matches and the per-tuple arity diagnostic +/// (ADR-0033 §8.1) fires instead of a bare "expected `,`/`)`". +pub(crate) const FALLBACK_VALUE_LIST: Node = Node::Repeated { inner: &FALLBACK_VALUE_LITERAL, separator: Some(&Node::Punct(',')), min: 1, }; +/// The columns an insert value tuple maps onto (ADR-0024 §Phase D). +/// +/// Mirrors `db::do_insert`'s `user_cols` logic. `None` when the walker +/// is schemaless, the table is unknown, or — for Form B/C — every column +/// is auto-generated (callers fall back to the type-blind value list). +/// +/// - **Form A** (`user_listed_columns` set): the listed columns, in the +/// user's order; names the schema doesn't know are dropped. +/// - **Form B/C** (no column list): the table's non-auto-generated +/// columns, in declaration order. `serial` / `shortid` are skipped +/// because the simple-mode dispatch auto-fills them (ADR-0018 §3). +/// +/// This is the single source of truth shared by [`column_value_list`] +/// (which builds the typed slots) and the `data.rs` arity gate (which +/// counts them) so the two never disagree (issue #17). +pub fn insert_target_columns<'c>( + ctx: &'c WalkContext<'_>, +) -> Option> { + let table_cols = ctx.current_table_columns.as_ref()?; + if table_cols.is_empty() { + return None; + } + let cols: Vec<&TableColumn> = ctx.user_listed_columns.as_ref().map_or_else( + || { + table_cols + .iter() + .filter(|c| !matches!(c.user_type, Type::Serial | Type::ShortId)) + .collect() + }, + |user_listed| { + user_listed + .iter() + .filter_map(|name| { + table_cols + .iter() + .find(|c| c.name.eq_ignore_ascii_case(name)) + }) + .collect() + }, + ); + if cols.is_empty() { None } else { Some(cols) } +} + +/// Count the value positions in a `VALUES`/insert tuple whose contents +/// begin at `pos` (just past the opening `(`), and whether the tuple is +/// *closed* (a depth-0 `)` was reached) vs still being typed (scan hit +/// end-of-input first). Depth-aware: commas nested in a function call / +/// subquery (paren depth ≥ 1) or inside a string literal are not +/// separators. Returns `(0, _)` for an empty tuple `()`. +/// +/// Shared by the advanced grammar's `tuple_value_list` (`sql_insert.rs`) +/// and the simple-mode DSL insert arity gate (`data.rs`) so both modes +/// count tuple values identically (issue #17). +pub(crate) fn count_tuple_values(source: &str, pos: usize) -> (usize, bool) { + let bytes = source.as_bytes(); + let mut i = pos; + let mut depth: i32 = 0; + let mut commas = 0usize; + let mut seen_value = false; + let mut closed = false; + while i < bytes.len() { + match bytes[i] { + b'\'' => { + // Skip a single-quoted string literal (`''` escape). + i += 1; + seen_value = true; + while i < bytes.len() { + if bytes[i] == b'\'' { + if bytes.get(i + 1) == Some(&b'\'') { + i += 2; + continue; + } + i += 1; + break; + } + i += 1; + } + continue; + } + b'(' => { + depth += 1; + seen_value = true; + } + b')' => { + if depth == 0 { + closed = true; + break; // tuple close + } + depth -= 1; + } + b',' if depth == 0 => commas += 1, + b if !b.is_ascii_whitespace() => seen_value = true, + _ => {} + } + i += 1; + } + (if seen_value { commas + 1 } else { 0 }, closed) +} + /// Value slot keyed on `WalkContext::current_column`. /// /// Picks the typed slot for the column whose name was most @@ -399,50 +503,12 @@ pub fn current_column_value(ctx: &WalkContext) -> Node { /// `Repeated(VALUE_LITERAL, ',', 1)` shape so existing /// callers/tests continue to work. pub fn column_value_list(ctx: &WalkContext) -> Node { - let Some(table_cols) = ctx.current_table_columns.as_ref() else { + // Target columns per the shared insert mapping (Form A = listed, + // Form B/C = non-auto-generated). `None` → schemaless / unknown + // table / all-auto-generated → the type-blind fallback list. + let Some(target_cols) = insert_target_columns(ctx) else { return FALLBACK_VALUE_LIST; }; - if table_cols.is_empty() { - return FALLBACK_VALUE_LIST; - } - // Three dispatch shapes (ADR-0024 §Phase D §column_value_list, - // matching `db::do_insert`'s user_cols logic): - // - // 1. Form A — user listed explicit columns - // (`insert into T (col1, col2, …) values (…)`): one slot - // per listed column, in the user's order, types resolved - // from the schema. - // 2. Form B — bare values keyword - // (`insert into T values (…)`): one slot per non-auto- - // generated column of T, in declaration order. Serial / - // shortid columns are skipped because the dispatch path - // auto-fills them (ADR-0018 §3). - // 3. Schemaless / fallback: the generic value-literal list. - let target_cols: Vec<&TableColumn> = ctx.user_listed_columns.as_ref().map_or_else( - || { - // Form B — exclude auto-generated columns. - table_cols - .iter() - .filter(|c| !matches!(c.user_type, Type::Serial | Type::ShortId)) - .collect() - }, - |user_listed| { - // Form A — resolve each listed name from the schema. - // Names the schema doesn't know about silently drop; - // the bind-time path catches unknown columns. - user_listed - .iter() - .filter_map(|name| { - table_cols - .iter() - .find(|c| c.name.eq_ignore_ascii_case(name)) - }) - .collect() - }, - ); - if target_cols.is_empty() { - return FALLBACK_VALUE_LIST; - } // Build a Seq of typed slots interleaved with commas. Each // slot embeds its column name so the hint resolver can // mention the column by name ("for `Email`: Type a quoted diff --git a/src/dsl/grammar/sql_insert.rs b/src/dsl/grammar/sql_insert.rs index a1dbc5b..ab3411e 100644 --- a/src/dsl/grammar/sql_insert.rs +++ b/src/dsl/grammar/sql_insert.rs @@ -14,7 +14,7 @@ //! sub-phases. use crate::completion::TableColumn; -use crate::dsl::grammar::shared::SET_VALUE; +use crate::dsl::grammar::shared::{SET_VALUE, count_tuple_values}; use crate::dsl::grammar::sql_expr; use crate::dsl::grammar::sql_select::{ RETURNING_CLAUSE, SQL_SELECT_COMPOUND, WHERE_CLAUSE, reject_internal_table, @@ -127,57 +127,9 @@ fn target_value_columns(ctx: &WalkContext) -> Vec { ) } -/// Count the value positions in the `VALUES` tuple whose contents begin -/// at `pos` (just past the opening `(`), and whether the tuple is -/// *closed* (a depth-0 `)` was reached) vs still being typed (scan hit -/// end-of-input first). Depth-aware: commas nested in a function call / -/// subquery (paren depth ≥ 1) or inside a string literal are not -/// separators. Returns `(0, _)` for an empty tuple `()`. -fn count_tuple_values(source: &str, pos: usize) -> (usize, bool) { - let bytes = source.as_bytes(); - let mut i = pos; - let mut depth: i32 = 0; - let mut commas = 0usize; - let mut seen_value = false; - let mut closed = false; - while i < bytes.len() { - match bytes[i] { - b'\'' => { - // Skip a single-quoted string literal (`''` escape). - i += 1; - seen_value = true; - while i < bytes.len() { - if bytes[i] == b'\'' { - if bytes.get(i + 1) == Some(&b'\'') { - i += 2; - continue; - } - i += 1; - break; - } - i += 1; - } - continue; - } - b'(' => { - depth += 1; - seen_value = true; - } - b')' => { - if depth == 0 { - closed = true; - break; // tuple close - } - depth -= 1; - } - b',' if depth == 0 => commas += 1, - b if !b.is_ascii_whitespace() => seen_value = true, - _ => {} - } - i += 1; - } - (if seen_value { commas + 1 } else { 0 }, closed) -} +// `count_tuple_values` moved to `grammar::shared` (issue #17) so the +// simple-mode DSL insert arity gate can share it; the advanced grammar +// imports it above. /// Tuple value-list lookahead (ADR-0036 Phase 3b). Gates the typed /// per-column path on arity so the typed `Seq` is used only where it diff --git a/src/dsl/walker/mod.rs b/src/dsl/walker/mod.rs index e44dd68..bf6d34e 100644 --- a/src/dsl/walker/mod.rs +++ b/src/dsl/walker/mod.rs @@ -1465,10 +1465,16 @@ fn dml_auto_column_diagnostics( fn dml_insert_arity_diagnostics( path: &MatchedPath, schema: Option<&crate::completion::SchemaCache>, + mode: crate::mode::Mode, ) -> Vec { use crate::dsl::grammar::IdentSource; + use crate::dsl::types::Type; use outcome::{Diagnostic, MatchedKind, Severity}; + // Form A column count: the explicit `(col, …)` list. The SQL + // (advanced) grammar tags these `insert_column`; the DSL (simple) + // grammar tags them `insert_first_item` (issue #17) — count both so + // a DSL Form A insert isn't mis-classified as Form B. let col_arity = path .items .iter() @@ -1477,31 +1483,70 @@ fn dml_insert_arity_diagnostics( it.kind, MatchedKind::Ident { source: IdentSource::Columns, - role: "insert_column", + role: "insert_column" | "insert_first_item", } ) }) .count(); - // Resolve the expected arity + which message template to use. - // Form A: explicit column list → its own length. - // Form B: no list → the target table's column count, *if* we know - // it. Without a schema or a recognised target the pass goes - // silent (the unknown-table case is flagged by the schema- - // existence pass instead). - let (expected, message_key): (usize, &'static str) = if col_arity > 0 { - (col_arity, "diagnostic.insert_arity_mismatch") - } else { - let Some(schema) = schema else { - return Vec::new(); - }; - let Some(target) = path.items.iter().find_map(|it| match it.kind { + // Is this an INSERT? `into` is insert-exclusive in both grammars + // (update uses `set`, delete `from`, show `data`), so it tells the + // DSL insert apart from other commands that also tag a `table_name`. + let is_insert = path + .items + .iter() + .any(|it| matches!(it.kind, MatchedKind::Word("into"))); + // Insert target table. The SQL grammar tags it `insert_target_table` + // (insert-specific); the DSL grammar reuses the generic `table_name` + // role, so only trust that when `into` confirmed an insert (issue + // #17). + let target_table: Option<&str> = path + .items + .iter() + .find_map(|it| match it.kind { MatchedKind::Ident { source: IdentSource::Tables, role: "insert_target_table", } => Some(it.text.as_str()), + MatchedKind::Ident { + source: IdentSource::Tables, + role: "table_name", + } if is_insert => Some(it.text.as_str()), _ => None, - }) else { + }); + + // Resolve the expected arity + a message builder. The builder + // captures the per-case args because the message key (and its + // placeholders) differ by form and mode. + // + // - **Form A** (both modes): the listed-column count, mode-neutral. + // - **Advanced Form B**: every column needs a value (auto-fills + // nothing, ADR-0036) → the full table count. + // - **Simple Form B** (issue #17): the dispatch auto-fills + // serial/shortid (ADR-0018 §3), so only the user-fillable columns + // take values. When some columns are skipped the message names + // both sets; when none are skipped the count equals the table's + // and the plain Form B wording is accurate; when *every* column is + // auto-generated no value belongs at all. + type MsgFn<'a> = Box String + 'a>; + let (expected, make_message): (usize, MsgFn) = if col_arity > 0 { + ( + col_arity, + Box::new(move |actual| { + crate::friendly::translate( + "diagnostic.insert_arity_mismatch", + &[ + ("expected", &col_arity as &dyn std::fmt::Display), + ("actual", &actual as &dyn std::fmt::Display), + ], + ) + }), + ) + } else { + let Some(schema) = schema else { + return Vec::new(); + }; + let Some(target) = target_table else { return Vec::new(); }; let Some(cols) = schema.table_columns.get(target) else { @@ -1510,35 +1555,123 @@ fn dml_insert_arity_diagnostics( if cols.is_empty() { return Vec::new(); } - (cols.len(), "diagnostic.insert_arity_mismatch_form_b") + let is_auto = |t: Type| matches!(t, Type::Serial | Type::ShortId); + let fillable: Vec<&str> = cols + .iter() + .filter(|c| !is_auto(c.user_type)) + .map(|c| c.name.as_str()) + .collect(); + if mode == crate::mode::Mode::Advanced || fillable.len() == cols.len() { + // Advanced Form B (all columns), or simple Form B over a + // table with no auto-generated columns — "all N" is accurate. + let expected = if mode == crate::mode::Mode::Advanced { + cols.len() + } else { + fillable.len() + }; + ( + expected, + Box::new(move |actual| { + crate::friendly::translate( + "diagnostic.insert_arity_mismatch_form_b", + &[ + ("expected", &expected as &dyn std::fmt::Display), + ("actual", &actual as &dyn std::fmt::Display), + ], + ) + }), + ) + } else if fillable.is_empty() { + // Simple Form B, every column auto-generated → no value + // belongs at all. + let table = target.to_string(); + ( + 0, + Box::new(move |actual| { + crate::friendly::translate( + "diagnostic.insert_arity_mismatch_all_auto", + &[ + ("table", &table as &dyn std::fmt::Display), + ("actual", &actual as &dyn std::fmt::Display), + ], + ) + }), + ) + } else { + // Simple Form B with a mix — name the user-fillable and the + // auto-generated columns so the learner understands the skip. + let expected = fillable.len(); + let columns = fillable + .iter() + .map(|n| format!("`{n}`")) + .collect::>() + .join(", "); + let skipped = cols + .iter() + .filter(|c| is_auto(c.user_type)) + .map(|c| format!("`{}`", c.name)) + .collect::>() + .join(", "); + ( + expected, + Box::new(move |actual| { + crate::friendly::translate( + "diagnostic.insert_arity_mismatch_form_b_simple", + &[ + ("expected", &expected as &dyn std::fmt::Display), + ("columns", &columns as &dyn std::fmt::Display), + ("skipped", &skipped as &dyn std::fmt::Display), + ("actual", &actual as &dyn std::fmt::Display), + ], + ) + }), + ) + } }; - // Index of the row-source keyword (first VALUES / SELECT / WITH). - let Some(kw_idx) = path + // Locate the row source to scan. Form A/B and INSERT…SELECT carry a + // `values`/`select`/`with` keyword. The DSL Form C (`insert into T + // (1, 2)` — bare value list, no `values`) has none: its single value + // tuple follows the target table directly, with no column-name list + // (col_arity == 0 here), so scan from just after the table and treat + // it with the VALUES tuple logic (issue #17). + let (kw, tail): (&str, &[outcome::MatchedItem]) = if let Some(kw_idx) = path .items .iter() .position(|it| matches!(it.kind, MatchedKind::Word("values" | "select" | "with"))) - else { - return Vec::new(); - }; - let MatchedKind::Word(kw) = path.items[kw_idx].kind else { + { + let MatchedKind::Word(kw) = path.items[kw_idx].kind else { + return Vec::new(); + }; + (kw, &path.items[kw_idx + 1..]) + } else if is_insert && mode == crate::mode::Mode::Simple { + // DSL Form C (`insert into T (1, 2)` — bare value list, no + // `values` keyword) is simple-mode only; advanced SQL always + // carries a `values`/`select`/`with` keyword, so this branch + // stays out of the advanced path (its arity is handled above). + let Some(tbl_idx) = path.items.iter().position(|it| { + matches!( + it.kind, + MatchedKind::Ident { + source: IdentSource::Tables, + role: "insert_target_table" | "table_name", + } + ) + }) else { + return Vec::new(); + }; + ("values", &path.items[tbl_idx + 1..]) + } else { return Vec::new(); }; let mut diagnostics = Vec::new(); - let tail = &path.items[kw_idx + 1..]; let emit = |span: (usize, usize), actual: usize, diagnostics: &mut Vec| { diagnostics.push(Diagnostic { severity: Severity::Error, span, - message: crate::friendly::translate( - message_key, - &[ - ("expected", &expected as &dyn std::fmt::Display), - ("actual", &actual as &dyn std::fmt::Display), - ], - ), + message: make_message(actual), }); }; @@ -1600,7 +1733,8 @@ fn dml_insert_arity_diagnostics( } } if proj_arity != expected { - emit(anchor.unwrap_or(path.items[kw_idx].span), proj_arity, &mut diagnostics); + let fallback = tail.first().map_or((0, 0), |it| it.span); + emit(anchor.unwrap_or(fallback), proj_arity, &mut diagnostics); } } diagnostics @@ -2872,7 +3006,7 @@ fn walk_one_command<'a>( // INSERT…SELECT projection. Form A uses the column list's // length; Form B uses the schema's column count for the // target table. - d.extend(dml_insert_arity_diagnostics(&path, ctx.schema)); + d.extend(dml_insert_arity_diagnostics(&path, ctx.schema, ctx.mode)); // ADR-0033 §8.3 — WARNING when an INSERT's column list omits // a NOT-NULL-no-default (non-auto-gen) column. d.extend(dml_not_null_missing_diagnostics(&path, ctx.schema)); @@ -4252,23 +4386,30 @@ mod tests { #[test] fn phase_d_insert_form_b_skips_serial_column() { - // Form B: `insert into values (…)` excludes - // auto-generated columns from the value list. Supplying - // a value for the serial column is a count mismatch. + // Form B: `insert into values (…)` excludes auto-generated + // columns from the value list. Supplying a value for the serial + // column is a count mismatch. As of issue #17 such a wrong-count + // tuple parses **structurally** (routed to the type-blind + // fallback) so the friendly arity diagnostic can fire — the + // mismatch is now reported as an ERROR diagnostic rather than a + // bare parse error, matching advanced mode. Dispatch is gated by + // the submit pre-flight, not by a parse failure. let schema = schema_with( "Customers", &[("id", Type::Serial), ("Name", Type::Text)], ); - // Two values where Form B expects one (Name only): - let err = parse_command_with_schema( + // Two values where Form B expects one (Name only): structurally + // parses, but the simple-mode arity diagnostic flags it (Form B + // expects 1 value for `Name`; `id` is auto-generated). + let diags = diag_keys_simple( "insert into Customers values (1, 'Alice')", &schema, - ) - .expect_err("Form B should reject user-supplied serial"); - match err { - crate::dsl::ParseError::Invalid { .. } => {} - other => panic!("expected Invalid, got {other:?}"), - } + ); + assert!( + diags.iter().any(|d| d.contains("1 value(s)") && d.contains("2 given")), + "Form B serial-skip count mismatch must fire the arity \ + diagnostic (expected 1, got 2); got {diags:?}", + ); } #[test] @@ -4703,6 +4844,139 @@ mod tests { .collect() } + /// Simple-mode counterpart of [`diag_keys`] — the DSL surface + /// (ADR-0003). Issue #17: the arity diagnostic must fire in simple + /// mode too, with the user-fillable (serial-skipped) Form B count. + fn diag_keys_simple(source: &str, schema: &SchemaCache) -> Vec { + let mut ctx = super::context::WalkContext::with_schema(schema); + ctx.mode = crate::mode::Mode::Simple; + let (result, _cmd) = + super::walk(source, super::outcome::WalkBound::EndOfInput, &mut ctx); + result.map_or_else(Vec::new, |r| { + r.diagnostics.into_iter().map(|d| d.message).collect() + }) + } + + #[test] + fn insert_arity_mismatch_simple_form_b_uses_user_fillable_count() { + // Issue #17: simple-mode Form B skips serial/shortid (auto- + // filled), so `Customers(id serial, Name, Age, SerNo)` expects + // 2 values (Name, Age). Three values is a mismatch — and simple + // mode must surface the friendly arity diagnostic (as advanced + // mode already does), counted against the user-fillable columns, + // not the full table. + let schema = schema_with( + "Customers", + &[ + ("id", Type::Serial), + ("Name", Type::Text), + ("Age", Type::Int), + ("SerNo", Type::Serial), + ], + ); + let diags = diag_keys_simple( + "insert into Customers values ('Oli', 52, 3)", + &schema, + ); + assert!( + diags.iter().any(|d| d.contains("2 value(s)") && d.contains("3 given")), + "simple Form B must fire arity diagnostic with user-fillable \ + count (2) and actual (3); got {diags:?}", + ); + // Pedagogical: names the user-fillable and the auto-generated + // columns so the learner understands the skip. + assert!( + diags.iter().any(|d| d.contains("Name") + && d.contains("Age") + && d.contains("id") + && d.contains("SerNo")), + "message should name fillable (Name, Age) and auto-gen (id, \ + SerNo) columns; got {diags:?}", + ); + } + + #[test] + fn insert_arity_mismatch_simple_form_a_uses_listed_count() { + // Simple Form A: the explicit column list sets the expected + // count (mode-neutral). `(Name, Age)` lists 2 columns; one value + // is a mismatch. The DSL Form A column role is `insert_first_item` + // (issue #17 — the diagnostic counts it as well as the SQL + // `insert_column`). + let schema = schema_with( + "Customers", + &[("id", Type::Serial), ("Name", Type::Text), ("Age", Type::Int)], + ); + let diags = diag_keys_simple( + "insert into Customers (Name, Age) values ('Oli')", + &schema, + ); + assert!( + diags.iter().any(|d| d.contains("names 2 column(s)") && d.contains("1 value(s)")), + "simple Form A must fire the column-list arity diagnostic; got {diags:?}", + ); + } + + #[test] + fn insert_arity_mismatch_simple_form_c_uses_user_fillable_count() { + // Simple Form C (`insert into T (vals)`, no `values` keyword) + // shares Form B's count semantics — the diagnostic must fire even + // though there is no `values` keyword to anchor on (issue #17). + let schema = schema_with( + "Customers", + &[("id", Type::Serial), ("Name", Type::Text), ("Age", Type::Int)], + ); + let diags = diag_keys_simple( + "insert into Customers ('Oli', 52, 3)", + &schema, + ); + assert!( + diags.iter().any(|d| d.contains("2 value(s)") && d.contains("3 given")), + "simple Form C must fire the arity diagnostic (expected 2, got 3); got {diags:?}", + ); + } + + #[test] + fn insert_arity_mismatch_simple_all_auto_table() { + // Edge: every column auto-generated (all serial/shortid). The + // user-fillable count is 0, so any supplied value is a mismatch — + // the tailored "all auto-generated" message fires (issue #17). + let schema = schema_with( + "Counters", + &[("id", Type::Serial), ("seq", Type::Serial)], + ); + let diags = diag_keys_simple( + "insert into Counters values (1)", + &schema, + ); + assert!( + diags.iter().any(|d| d.contains("all auto-generated") + || d.contains("auto-generated, so no values")), + "all-auto table must fire the tailored message; got {diags:?}", + ); + } + + #[test] + fn insert_arity_advanced_form_b_still_uses_full_column_count() { + // Guard: issue #17 made the diagnostic mode-aware, but advanced + // Form B must keep the full-column-count semantics (auto-fills + // nothing, ADR-0036) — `('Oli', 52, 3)` for a 4-column table + // reports "all 4 column(s)", not the simple-mode user-fillable 2. + let schema = schema_with( + "Customers", + &[ + ("id", Type::Serial), + ("Name", Type::Text), + ("Age", Type::Int), + ("SerNo", Type::Serial), + ], + ); + let diags = diag_keys("insert into Customers values ('Oli', 52, 3)", &schema); + assert!( + diags.iter().any(|d| d.contains("all 4 column(s)") && d.contains("3 value(s)")), + "advanced Form B must keep the full-column count (4); got {diags:?}", + ); + } + #[test] fn unknown_qualifier_in_qualified_ref_is_error() { let schema = two_table_schema(); diff --git a/src/friendly/keys.rs b/src/friendly/keys.rs index 45ed99a..313a657 100644 --- a/src/friendly/keys.rs +++ b/src/friendly/keys.rs @@ -51,6 +51,15 @@ pub const KEYS_AND_PLACEHOLDERS: &[(&str, &[&str])] = &[ "diagnostic.insert_arity_mismatch_form_b", &["expected", "actual"], ), + // ADR-0036 Amendment 1 / issue #17: simple-mode Form B variants. + ( + "diagnostic.insert_arity_mismatch_form_b_simple", + &["expected", "columns", "skipped", "actual"], + ), + ( + "diagnostic.insert_arity_mismatch_all_auto", + &["table", "actual"], + ), ("diagnostic.not_null_missing", &["column"]), ("diagnostic.like_numeric", &["column", "type"]), ("diagnostic.projection_alias_misplaced", &["alias", "clause"]), diff --git a/src/friendly/strings/en-US.yaml b/src/friendly/strings/en-US.yaml index 14087e1..ee0b254 100644 --- a/src/friendly/strings/en-US.yaml +++ b/src/friendly/strings/en-US.yaml @@ -577,8 +577,16 @@ diagnostic: # ADR-0033 §8 — Phase-3 DML diagnostic keys. auto_column_overridden: "column `{column}` is auto-generated (`{type}`); providing an explicit value bypasses the auto-counter and may collide with later auto-generated values" insert_arity_mismatch: "the column list names {expected} column(s) but {actual} value(s) are given" - # ADR-0033 §8.1 / Amendment 5: Form B (no column list) variant. + # ADR-0033 §8.1 / Amendment 5: Form B (no column list) variant + # (advanced mode — auto-fills nothing, so every column needs a value). insert_arity_mismatch_form_b: "with no column list, all {expected} column(s) need a value but {actual} value(s) are given" + # ADR-0036 Amendment 1 / issue #17: simple-mode Form B. The DSL + # auto-fills serial/shortid columns, so only the user-fillable columns + # take values — name both sets so the learner understands the skip. + insert_arity_mismatch_form_b_simple: "without a column list, supply {expected} value(s) for {columns} — {skipped} auto-generated; {actual} given" + # ADR-0036 Amendment 1 / issue #17: simple-mode Form B where every + # column is auto-generated, so the values list takes nothing. + insert_arity_mismatch_all_auto: "every column of `{table}` is auto-generated, so no values are needed, but {actual} value(s) are given" not_null_missing: "column `{column}` is required (`NOT NULL`, no default); the statement will fail when run" # Engine-error translations: an engine-rejected SQL statement diff --git a/src/input_render.rs b/src/input_render.rs index 89fc860..ac49d1d 100644 --- a/src/input_render.rs +++ b/src/input_render.rs @@ -323,10 +323,20 @@ pub fn advanced_alternative_note( input: &str, cache: &crate::completion::SchemaCache, ) -> Option { - let definite_dsl_error = matches!( - classify_input_with_schema_in_mode(input, cache, Mode::Simple), - InputState::DefiniteErrorAt(_) - ); + // The line must be *definitely* invalid in simple mode — a definite + // parse error, or (issue #17) a parse that succeeds structurally but + // carries a blocking ERROR diagnostic such as a value-count + // mismatch. Incomplete input (still being typed) and empty input are + // excluded so the pointer doesn't flicker mid-keystroke. + let definite_dsl_error = match classify_input_with_schema_in_mode(input, cache, Mode::Simple) + { + InputState::DefiniteErrorAt(_) => true, + InputState::Valid => { + crate::dsl::walker::input_verdict_in_mode(input, Some(cache), Mode::Simple) + == Some(crate::dsl::walker::outcome::Severity::Error) + } + InputState::Empty | InputState::IncompleteAtEof => false, + }; if !definite_dsl_error { return None; } @@ -363,6 +373,82 @@ pub fn advanced_alternative_note( /// without adding pedagogy. The cross-mode pointer wins because it /// only fires when switching modes actually works (issue #1 sub-task /// 1's gate); when it doesn't fire, this note steps in. +/// Submit-time pre-flight for a simple-mode (DSL) `Command::Insert` +/// whose positional value count doesn't match the expected count +/// (issue #17). Returns the advice line(s) to display when there is a +/// mismatch — the caller (`dispatch_dsl`) blocks dispatch whenever this +/// is `Some`, so a wrong-count insert never reaches the worker. `None` +/// when the command isn't an insert, the table is unknown, or the count +/// already matches. +/// +/// This is the simple-mode counterpart of the advanced Ok-arm pre-flight +/// (`form_b_positional_count_mismatch_note`). Both modes now parse a +/// wrong-count insert as `Ok` (so the typing-time arity diagnostic can +/// fire — issue #17), so dispatch is gated here, uniformly, rather than +/// by a parse error. +/// +/// Expected count: Form A (explicit `(col, …)`) → the listed count; +/// Form B/C (no list) → the user-fillable (non-auto-generated) count, +/// since the dispatch auto-fills serial/shortid (ADR-0018 §3). +/// +/// Advice selection mirrors the previous Err-arm logic: the cross-mode +/// pointer wins when the same text is valid in advanced mode; otherwise +/// Form B/C shows the rich teaching note (names the fillable + auto +/// columns and the Form-A override) and Form A shows the column-list +/// arity message. +#[must_use] +pub fn dsl_insert_count_mismatch_notes( + input: &str, + cmd: &crate::dsl::command::Command, + cache: &crate::completion::SchemaCache, +) -> Option> { + use crate::dsl::command::Command; + use crate::dsl::types::Type; + let Command::Insert { + table, + columns, + values, + } = cmd + else { + return None; + }; + let table_cols = cache.table_columns.get(table)?; + let is_auto = |t: Type| matches!(t, Type::Serial | Type::ShortId); + let expected = columns.as_ref().map_or_else( + || table_cols.iter().filter(|c| !is_auto(c.user_type)).count(), + Vec::len, + ); + if values.len() == expected { + return None; // counts match — nothing to flag, dispatch proceeds + } + // Count mismatch → the caller blocks dispatch. Build the advice. + // The cross-mode pointer is the single most useful line when the + // same text is valid in advanced mode, so it suppresses the rest. + if let Some(pointer) = advanced_alternative_note(input, cache) { + return Some(vec![pointer]); + } + let note = if columns.is_some() { + // Form A: the column-list arity message. + crate::t!( + "diagnostic.insert_arity_mismatch", + expected = expected, + actual = values.len() + ) + } else { + // Form B/C: the rich teaching note. Falls back to the all-auto + // explanation for a table whose columns are all auto-generated + // (the override note doesn't apply there). + form_b_extra_values_note(input, cache).unwrap_or_else(|| { + crate::t!( + "diagnostic.insert_arity_mismatch_all_auto", + table = table, + actual = values.len() + ) + }) + }; + Some(vec![note]) +} + #[must_use] pub fn form_b_extra_values_note( input: &str, diff --git a/tests/typing_surface/insert_form_b.rs b/tests/typing_surface/insert_form_b.rs index 49ca8e2..b994f6f 100644 --- a/tests/typing_surface/insert_form_b.rs +++ b/tests/typing_surface/insert_form_b.rs @@ -154,16 +154,21 @@ fn form_b_in_progress_without_closing_paren_is_incomplete() { #[test] fn form_b_with_too_few_values_is_invalid_at_close_paren() { let schema = schema_serial_pk(); - let a = assess_at_end("insert into Customers values ('Alice')", &schema); - // Only one value supplied; Form B for Customers needs two. - // The grammar's typed slot list expects another `,` - // before the `)`. Classify as DefiniteError or Incomplete - // (which one depends on whether the closing `)` is past - // the missing slot). - assert!( - !matches!(a.state, InputState::Valid), - "input with too-few values must NOT be Valid, got {:?}", - a.state, + let input = "insert into Customers values ('Alice')"; + let a = assess_at_end(input, &schema); + // Only one value supplied; Form B for Customers needs two. As of + // issue #17 a wrong-count tuple parses *structurally* (so the + // friendly arity diagnostic can fire) — the user-facing invalidity + // is the validity verdict (the `[ERR]` indicator), not the + // structural `state`. + assert_eq!( + rdbms_playground::dsl::walker::input_verdict_in_mode( + input, + Some(&schema), + rdbms_playground::mode::Mode::Simple, + ), + Some(rdbms_playground::dsl::walker::Severity::Error), + "too-few values must light the [ERR] verdict", ); crate::snap!("form_b_too_few_values", a); } @@ -174,14 +179,16 @@ fn form_b_with_extra_value_for_serial_column_is_invalid() { // (treating it as the first slot) means an extra value // overall — Customers has 3 columns but Form B accepts 2. let schema = schema_serial_pk(); - let a = assess_at_end( - "insert into Customers values (1, 'Alice', 'a@b.c')", - &schema, - ); - assert!( - !matches!(a.state, InputState::Valid), - "Form B with a value-for-serial must be invalid, got {:?}", - a.state, + let input = "insert into Customers values (1, 'Alice', 'a@b.c')"; + let a = assess_at_end(input, &schema); + assert_eq!( + rdbms_playground::dsl::walker::input_verdict_in_mode( + input, + Some(&schema), + rdbms_playground::mode::Mode::Simple, + ), + Some(rdbms_playground::dsl::walker::Severity::Error), + "Form B with a value-for-serial must light the [ERR] verdict", ); crate::snap!("form_b_extra_serial_value", a); } diff --git a/tests/typing_surface/insert_form_c.rs b/tests/typing_surface/insert_form_c.rs index 43ff4ab..1499b65 100644 --- a/tests/typing_surface/insert_form_c.rs +++ b/tests/typing_surface/insert_form_c.rs @@ -85,18 +85,21 @@ fn form_c_rejects_number_for_text_column() { #[test] fn form_c_wrong_value_count_is_invalid() { - // Customers Form C expects exactly two values (id:serial - // skipped). Three values is a count mismatch — caught at - // parse time now. + // Customers Form C expects exactly two values (id:serial skipped). + // Three values is a count mismatch. As of issue #17 it parses + // structurally (so the friendly arity diagnostic fires); the + // user-facing invalidity is the validity verdict (`[ERR]`). let schema = schema_serial_pk(); - let a = assess_at_end( - "insert into Customers ('Alice', 'a@b.c', 'extra')", - &schema, - ); - assert!( - !matches!(a.state, InputState::Valid), - "Form C with too many values must be invalid, got {:?}", - a.state, + let input = "insert into Customers ('Alice', 'a@b.c', 'extra')"; + let a = assess_at_end(input, &schema); + assert_eq!( + rdbms_playground::dsl::walker::input_verdict_in_mode( + input, + Some(&schema), + rdbms_playground::mode::Mode::Simple, + ), + Some(rdbms_playground::dsl::walker::Severity::Error), + "Form C with too many values must light the [ERR] verdict", ); crate::snap!("form_c_wrong_count", a); } diff --git a/tests/typing_surface/snapshots/typing_surface_matrix__typing_surface__insert_form_b__form_b_with_extra_value_for_serial_column_is_invalid@form_b_extra_serial_value.snap b/tests/typing_surface/snapshots/typing_surface_matrix__typing_surface__insert_form_b__form_b_with_extra_value_for_serial_column_is_invalid@form_b_extra_serial_value.snap index a70709f..0bfecb4 100644 --- a/tests/typing_surface/snapshots/typing_surface_matrix__typing_surface__insert_form_b__form_b_with_extra_value_for_serial_column_is_invalid@form_b_extra_serial_value.snap +++ b/tests/typing_surface/snapshots/typing_surface_matrix__typing_surface__insert_form_b__form_b_with_extra_value_for_serial_column_is_invalid@form_b_extra_serial_value.snap @@ -1,37 +1,20 @@ --- source: tests/typing_surface/insert_form_b.rs -assertion_line: 186 +assertion_line: 193 description: "input=\"insert into Customers values (1, 'Alice', 'a@b.c')\" cursor=50" expression: "& a" --- Assessment { input: "insert into Customers values (1, 'Alice', 'a@b.c')", cursor: 50, - state: DefiniteErrorAt( - 30, - ), + state: Valid, hint: Some( Prose( - "for `Name`: Type a quoted string (e.g. 'Alice') or null (`id` auto-generated — skipped here; list columns explicitly, e.g. `insert into T (...) values (...)`, to set it.) (trying to write SQL? switch with `mode advanced`, or prefix `:` to run once)", + "without a column list, supply 2 value(s) for `Name`, `Email` — `id` auto-generated; 3 given (trying to write SQL? switch with `mode advanced`, or prefix `:` to run once)", ), ), - completion: Some( - Completion { - replaced_range: ( - 50, - 50, - ), - partial_prefix: "", - candidates: [ - Candidate { - text: "null", - kind: Keyword, - mode: Both, - }, - ], - }, - ), - parse_result: Err( - "Invalid(definite)", + completion: None, + parse_result: Ok( + "Insert", ), } diff --git a/tests/typing_surface/snapshots/typing_surface_matrix__typing_surface__insert_form_b__form_b_with_too_few_values_is_invalid_at_close_paren@form_b_too_few_values.snap b/tests/typing_surface/snapshots/typing_surface_matrix__typing_surface__insert_form_b__form_b_with_too_few_values_is_invalid_at_close_paren@form_b_too_few_values.snap index f302076..9f1850d 100644 --- a/tests/typing_surface/snapshots/typing_surface_matrix__typing_surface__insert_form_b__form_b_with_too_few_values_is_invalid_at_close_paren@form_b_too_few_values.snap +++ b/tests/typing_surface/snapshots/typing_surface_matrix__typing_surface__insert_form_b__form_b_with_too_few_values_is_invalid_at_close_paren@form_b_too_few_values.snap @@ -1,22 +1,20 @@ --- source: tests/typing_surface/insert_form_b.rs -assertion_line: 168 +assertion_line: 173 description: "input=\"insert into Customers values ('Alice')\" cursor=38" expression: "& a" --- Assessment { input: "insert into Customers values ('Alice')", cursor: 38, - state: DefiniteErrorAt( - 37, - ), + state: Valid, hint: Some( Prose( - "after `insert into Customers values ('Alice'`, expected `,` — usage: insert into [([, ...])] [values] ([, ...])", + "without a column list, supply 2 value(s) for `Name`, `Email` — `id` auto-generated; 1 given", ), ), completion: None, - parse_result: Err( - "Invalid(definite)", + parse_result: Ok( + "Insert", ), } diff --git a/tests/typing_surface/snapshots/typing_surface_matrix__typing_surface__insert_form_c__form_c_wrong_value_count_is_invalid@form_c_wrong_count.snap b/tests/typing_surface/snapshots/typing_surface_matrix__typing_surface__insert_form_c__form_c_wrong_value_count_is_invalid@form_c_wrong_count.snap index c8f1c22..38b2ec3 100644 --- a/tests/typing_surface/snapshots/typing_surface_matrix__typing_surface__insert_form_c__form_c_wrong_value_count_is_invalid@form_c_wrong_count.snap +++ b/tests/typing_surface/snapshots/typing_surface_matrix__typing_surface__insert_form_c__form_c_wrong_value_count_is_invalid@form_c_wrong_count.snap @@ -1,22 +1,35 @@ --- source: tests/typing_surface/insert_form_c.rs -assertion_line: 101 +assertion_line: 104 description: "input=\"insert into Customers ('Alice', 'a@b.c', 'extra')\" cursor=49" expression: "& a" --- Assessment { input: "insert into Customers ('Alice', 'a@b.c', 'extra')", cursor: 49, - state: DefiniteErrorAt( - 39, - ), + state: Valid, hint: Some( Prose( - "after `insert into Customers ('Alice', 'a@b.c'`, expected `)` — usage: insert into
[([, ...])] [values] ([, ...])", + "without a column list, supply 2 value(s) for `Name`, `Email` — `id` auto-generated; 3 given", ), ), - completion: None, - parse_result: Err( - "Invalid(definite)", + completion: Some( + Completion { + replaced_range: ( + 49, + 49, + ), + partial_prefix: "", + candidates: [ + Candidate { + text: "values", + kind: Keyword, + mode: Both, + }, + ], + }, + ), + parse_result: Ok( + "Insert", ), }