feat: bring simple-mode insert arity diagnostics to parity with advanced

A wrong-count simple-mode insert now shows the friendly per-column arity
message at typing time (instead of a bare "expected `,`/`)`") and is
blocked from dispatch at submit — unifying simple and advanced mode onto
the one ADR-0027 model (structural parse + ERROR diagnostic), where they
had diverged.

Grammar: a simple-mode-only arity gate (dsl_insert_value_list) routes a
wrong-count DSL insert tuple to the type-blind fallback so it matches
structurally and the per-tuple arity diagnostic fires. The gate applies
only in simple mode, so advanced behaviour is unchanged. count_tuple_values
and the target-column selection (insert_target_columns) are now shared
by both grammars.
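
A minimal sketch of the routing decision described above, not the code from this commit. `Mode`, `Route`, and `route_tuple` are hypothetical stand-ins for `crate::mode::Mode`, the grammar `Node` values, and `dsl_insert_value_list`. The sketch takes the tuple count and closed flag as plain arguments rather than scanning source text, and it assumes the target arity was already resolved (`None` meaning schemaless, unknown table, or all-auto-generated).

```rust
/// Hypothetical stand-in for the editor mode.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Mode {
    Simple,
    Advanced,
}

/// Hypothetical stand-in for the two grammar shapes the gate chooses between.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Route {
    /// Typed per-column slots (the strict path).
    TypedSlots,
    /// Type-blind value list, so a wrong-count tuple still matches.
    TypeBlindFallback,
}

/// Pick the value-list shape for an insert tuple.
///
/// `count` is the number of values seen so far; `closed` says whether the
/// tuple's closing paren has been typed.
fn route_tuple(mode: Mode, target_arity: Option<usize>, count: usize, closed: bool) -> Route {
    // Outside simple mode the gate does nothing: the strict path is kept.
    if mode != Mode::Simple {
        return Route::TypedSlots;
    }
    // No known target arity: nothing to gate against, use the fallback.
    let Some(target) = target_arity else {
        return Route::TypeBlindFallback;
    };
    // A closed tuple must match exactly; an open one may still grow.
    let arity_ok = if closed { count == target } else { count <= target };
    if arity_ok {
        Route::TypedSlots
    } else {
        Route::TypeBlindFallback
    }
}
```

Under these assumptions, a closed three-value tuple against a two-column target takes the fallback route, while an open one-value tuple stays on the typed path because the user may still be typing.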

Diagnostic: dml_insert_arity_diagnostics is mode-aware — advanced Form B
expects all columns; simple Form B/C expects the user-fillable columns
(serial/shortid auto-fill). It counts the DSL Form A role and scans the
keyword-less Form C tuple. New catalog keys name the fillable/auto split
and the all-auto-table case.
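
As an illustration of the mode-aware expectation, the sketch below computes how many values a tuple should carry. It is an assumption-laden simplification: `Mode`, `ColType`, and `expected_arity` are hypothetical names, and the Form A rule (expect as many values as listed columns) is inferred from the description rather than taken from `dml_insert_arity_diagnostics` itself.

```rust
/// Hypothetical stand-in for the editor mode.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Mode {
    Simple,
    Advanced,
}

/// Hypothetical column types; only the auto-generated ones matter here.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ColType {
    Int,
    Text,
    Serial,
    ShortId,
}

/// Number of values an insert tuple is expected to carry.
///
/// `listed` is `Some(n)` when the user wrote an explicit column list
/// (Form A) and `None` for Form B/C.
fn expected_arity(mode: Mode, listed: Option<usize>, table: &[ColType]) -> usize {
    // Form A (assumed): one value per listed column, in either mode.
    if let Some(n) = listed {
        return n;
    }
    match mode {
        // Advanced Form B: every column of the table.
        Mode::Advanced => table.len(),
        // Simple Form B/C: only user-fillable columns; serial and
        // shortid are auto-filled and therefore not counted.
        Mode::Simple => table
            .iter()
            .filter(|t| !matches!(t, ColType::Serial | ColType::ShortId))
            .count(),
    }
}
```

For a table declared as serial, text, int, this yields 3 in advanced mode and 2 in simple mode; a table with only auto-generated columns yields 0 in simple mode, which corresponds to the all-auto-table case.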

Submit: a wrong-count DSL insert now parses Ok and carries the ERROR
diagnostic, so a unified Ok-arm pre-flight (dsl_insert_count_mismatch_notes)
blocks dispatch and teaches; the previous Err-arm note retires.
advanced_alternative_note's gate now reads the validity verdict so it
still fires for the parse-Ok-with-error shape.
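
The submit-time behaviour can be pictured with the sketch below: a parse that succeeded but carries an ERROR diagnostic is stopped before dispatch. All types and the `preflight` function are hypothetical; the real `dsl_insert_count_mismatch_notes` and its catalog-driven messages are not reproduced here.

```rust
/// Hypothetical diagnostic severity.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Severity {
    Error,
    Warning,
}

/// Hypothetical diagnostic attached to a successful parse.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Diagnostic {
    severity: Severity,
    message: String,
}

/// Hypothetical parse result for the Ok arm: the command parsed
/// structurally, but may still carry diagnostics.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Parsed {
    diagnostics: Vec<Diagnostic>,
}

/// Pre-flight check run on the Ok arm before dispatch.
///
/// Returns `Ok(())` when the command may be dispatched, or the list of
/// notes to show the user when any ERROR diagnostic is present.
fn preflight(parsed: &Parsed) -> Result<(), Vec<String>> {
    let notes: Vec<String> = parsed
        .diagnostics
        .iter()
        .filter(|d| d.severity == Severity::Error)
        .map(|d| d.message.clone())
        .collect();
    if notes.is_empty() {
        Ok(())
    } else {
        Err(notes)
    }
}
```

In this sketch warnings alone never block dispatch; only an ERROR diagnostic does, which mirrors the parse-Ok-with-error shape described above.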

Docs: ADR-0036 Amendment 2 (+ README index) and requirements.md H1a.
claude@clouddev1
2026-05-29 20:45:21 +00:00
parent 7cccf4eabb
commit 10e5197c19
16 changed files with 812 additions and 240 deletions
+55 -12
@@ -27,7 +27,10 @@
use crate::dsl::command::{Command, Expr, RowFilter};
use crate::dsl::grammar::{
CommandNode, IdentSource, Node, NumberValidator, ValidationError, Word, expr,
shared::{column_value_list, current_column_value},
shared::{
FALLBACK_VALUE_LIST, column_value_list, count_tuple_values,
current_column_value, insert_target_columns,
},
sql_delete, sql_insert, sql_select, sql_update,
};
use crate::dsl::walker::context::WalkContext;
@@ -141,12 +144,13 @@ static INSERT_COMMA: Node = Node::Punct(',');
/// First-paren resolver (ADR-0024 §Phase D Form-C type-awareness).
/// Peeks the first token after `(` to route to Form A's
/// column-name list or Form C's typed value list.
fn insert_first_paren(_ctx: &WalkContext, source: &str, pos: usize) -> Node {
fn insert_first_paren(ctx: &WalkContext, source: &str, pos: usize) -> Node {
if first_paren_item_is_value_literal(source, pos) {
// Form C — bare value list. `column_value_list` with no
// user-listed columns dispatches per non-auto-generated
// column, exactly as Form B does.
Node::DynamicSubgrammar(column_value_list)
// Form C — bare value list. Arity-gated exactly like Form B's
// `values (…)`: a correct-count tuple gets the typed per-column
// slots; a wrong-count tuple routes to the type-blind fallback
// so it still matches and the arity diagnostic fires (issue #17).
dsl_insert_value_list(ctx, source, pos)
} else {
// Form A (or Form A in progress / empty paren).
Node::Repeated {
@@ -189,12 +193,51 @@ fn first_paren_item_is_value_literal(source: &str, pos: usize) -> bool {
const INSERT_PAREN_LIST: Node = Node::Lookahead(insert_first_paren);
/// Schema-aware value list: when the walker has a populated
/// `current_table_columns`, unfolds to a `Seq` of typed slots
/// per column (`int_slot`, `text_slot`, …). When schemaless,
/// falls back to the pre-Phase-D `Repeated(VALUE_LITERAL, ',', 1)`
/// shape (ADR-0024 §Phase D §column_value_list).
const INSERT_VALUES_LIST: Node = Node::DynamicSubgrammar(column_value_list);
/// Insert value-list arity gate (issue #17) — the simple-mode DSL
/// counterpart of the advanced grammar's `tuple_value_list`
/// (`sql_insert.rs`). Routes a correct-arity tuple to the typed
/// per-column slots ([`column_value_list`]) and a wrong-arity tuple to
/// the type-blind [`FALLBACK_VALUE_LIST`], so the wrong-count tuple
/// still structurally matches and the per-tuple arity diagnostic
/// (ADR-0033 §8.1, made mode-aware for issue #17) fires its friendly
/// message instead of a bare "expected `,`/`)`".
///
/// Target arity comes from [`insert_target_columns`] — the same source
/// `column_value_list` uses, so gate and slots never disagree. `None`
/// (schemaless / unknown table / all-auto-generated) → fallback: either
/// we can't gate (schemaless) or the all-auto case wants the tuple to
/// match so the diagnostic can explain it.
///
/// **Simple-mode only.** The fallback routing is what lets a wrong-count
/// tuple structurally match (so the diagnostic fires); that is a
/// simple-mode behaviour. In advanced mode the DSL insert node must stay
/// strict — otherwise a non-SQL shape like Form C (`insert into T
/// (1, 2)`, no `values`) would spuriously match here and be accepted in
/// advanced mode, where SQL requires `values` and the dedicated SQL
/// grammar (`sql_insert.rs`) owns inserts. Keeping advanced strict
/// preserves the pre-#17 advanced behaviour exactly (issue #17).
fn dsl_insert_value_list(ctx: &WalkContext, source: &str, pos: usize) -> Node {
if ctx.mode != crate::mode::Mode::Simple {
return Node::DynamicSubgrammar(column_value_list);
}
let Some(cols) = insert_target_columns(ctx) else {
return FALLBACK_VALUE_LIST;
};
let (count, closed) = count_tuple_values(source, pos);
let arity_ok = if closed { count == cols.len() } else { count <= cols.len() };
if arity_ok {
Node::DynamicSubgrammar(column_value_list)
} else {
FALLBACK_VALUE_LIST
}
}
/// Schema-aware value list, arity-gated (issue #17): a correct-count
/// tuple unfolds to a `Seq` of typed slots per column (`int_slot`,
/// `text_slot`, …); a wrong-count tuple or a schemaless walk falls back
/// to the type-blind `Repeated(VALUE_LITERAL, ',', 1)` shape (ADR-0024
/// §Phase D §column_value_list).
const INSERT_VALUES_LIST: Node = Node::Lookahead(dsl_insert_value_list);
const INSERT_OPTIONAL_VALUES_NODES: &[Node] = &[
Node::Word(Word::keyword("values")),
+109 -43
@@ -371,12 +371,116 @@ pub(crate) const FALLBACK_VALUE_LITERAL: Node = Node::Hinted {
inner: &FALLBACK_VALUE_LITERAL_INNER,
};
const FALLBACK_VALUE_LIST: Node = Node::Repeated {
/// The type-blind value list. `pub(crate)` so the insert value-list
/// arity gate (`data.rs`, issue #17) can route a wrong-count tuple here
/// — exactly as the advanced grammar's `tuple_value_list` does — so the
/// tuple still structurally matches and the per-tuple arity diagnostic
/// (ADR-0033 §8.1) fires instead of a bare "expected `,`/`)`".
pub(crate) const FALLBACK_VALUE_LIST: Node = Node::Repeated {
inner: &FALLBACK_VALUE_LITERAL,
separator: Some(&Node::Punct(',')),
min: 1,
};
/// The columns an insert value tuple maps onto (ADR-0024 §Phase D).
///
/// Mirrors `db::do_insert`'s `user_cols` logic. `None` when the walker
/// is schemaless, the table is unknown, or — for Form B/C — every column
/// is auto-generated (callers fall back to the type-blind value list).
///
/// - **Form A** (`user_listed_columns` set): the listed columns, in the
/// user's order; names the schema doesn't know are dropped.
/// - **Form B/C** (no column list): the table's non-auto-generated
/// columns, in declaration order. `serial` / `shortid` are skipped
/// because the simple-mode dispatch auto-fills them (ADR-0018 §3).
///
/// This is the single source of truth shared by [`column_value_list`]
/// (which builds the typed slots) and the `data.rs` arity gate (which
/// counts them) so the two never disagree (issue #17).
pub fn insert_target_columns<'c>(
ctx: &'c WalkContext<'_>,
) -> Option<Vec<&'c TableColumn>> {
let table_cols = ctx.current_table_columns.as_ref()?;
if table_cols.is_empty() {
return None;
}
let cols: Vec<&TableColumn> = ctx.user_listed_columns.as_ref().map_or_else(
|| {
table_cols
.iter()
.filter(|c| !matches!(c.user_type, Type::Serial | Type::ShortId))
.collect()
},
|user_listed| {
user_listed
.iter()
.filter_map(|name| {
table_cols
.iter()
.find(|c| c.name.eq_ignore_ascii_case(name))
})
.collect()
},
);
if cols.is_empty() { None } else { Some(cols) }
}
/// Count the value positions in a `VALUES`/insert tuple whose contents
/// begin at `pos` (just past the opening `(`), and whether the tuple is
/// *closed* (a depth-0 `)` was reached) vs still being typed (scan hit
/// end-of-input first). Depth-aware: commas nested in a function call /
/// subquery (paren depth ≥ 1) or inside a string literal are not
/// separators. Returns `(0, _)` for an empty tuple `()`.
///
/// Shared by the advanced grammar's `tuple_value_list` (`sql_insert.rs`)
/// and the simple-mode DSL insert arity gate (`data.rs`) so both modes
/// count tuple values identically (issue #17).
pub(crate) fn count_tuple_values(source: &str, pos: usize) -> (usize, bool) {
let bytes = source.as_bytes();
let mut i = pos;
let mut depth: i32 = 0;
let mut commas = 0usize;
let mut seen_value = false;
let mut closed = false;
while i < bytes.len() {
match bytes[i] {
b'\'' => {
// Skip a single-quoted string literal (`''` escape).
i += 1;
seen_value = true;
while i < bytes.len() {
if bytes[i] == b'\'' {
if bytes.get(i + 1) == Some(&b'\'') {
i += 2;
continue;
}
i += 1;
break;
}
i += 1;
}
continue;
}
b'(' => {
depth += 1;
seen_value = true;
}
b')' => {
if depth == 0 {
closed = true;
break; // tuple close
}
depth -= 1;
}
b',' if depth == 0 => commas += 1,
b if !b.is_ascii_whitespace() => seen_value = true,
_ => {}
}
i += 1;
}
(if seen_value { commas + 1 } else { 0 }, closed)
}
/// Value slot keyed on `WalkContext::current_column`.
///
/// Picks the typed slot for the column whose name was most
@@ -399,50 +503,12 @@ pub fn current_column_value(ctx: &WalkContext) -> Node {
/// `Repeated(VALUE_LITERAL, ',', 1)` shape so existing
/// callers/tests continue to work.
pub fn column_value_list(ctx: &WalkContext) -> Node {
let Some(table_cols) = ctx.current_table_columns.as_ref() else {
// Target columns per the shared insert mapping (Form A = listed,
// Form B/C = non-auto-generated). `None` → schemaless / unknown
// table / all-auto-generated → the type-blind fallback list.
let Some(target_cols) = insert_target_columns(ctx) else {
return FALLBACK_VALUE_LIST;
};
if table_cols.is_empty() {
return FALLBACK_VALUE_LIST;
}
// Three dispatch shapes (ADR-0024 §Phase D §column_value_list,
// matching `db::do_insert`'s user_cols logic):
//
// 1. Form A — user listed explicit columns
// (`insert into T (col1, col2, …) values (…)`): one slot
// per listed column, in the user's order, types resolved
// from the schema.
// 2. Form B — bare values keyword
// (`insert into T values (…)`): one slot per non-auto-
// generated column of T, in declaration order. Serial /
// shortid columns are skipped because the dispatch path
// auto-fills them (ADR-0018 §3).
// 3. Schemaless / fallback: the generic value-literal list.
let target_cols: Vec<&TableColumn> = ctx.user_listed_columns.as_ref().map_or_else(
|| {
// Form B — exclude auto-generated columns.
table_cols
.iter()
.filter(|c| !matches!(c.user_type, Type::Serial | Type::ShortId))
.collect()
},
|user_listed| {
// Form A — resolve each listed name from the schema.
// Names the schema doesn't know about silently drop;
// the bind-time path catches unknown columns.
user_listed
.iter()
.filter_map(|name| {
table_cols
.iter()
.find(|c| c.name.eq_ignore_ascii_case(name))
})
.collect()
},
);
if target_cols.is_empty() {
return FALLBACK_VALUE_LIST;
}
// Build a Seq of typed slots interleaved with commas. Each
// slot embeds its column name so the hint resolver can
// mention the column by name ("for `Email`: Type a quoted
+4 -52
@@ -14,7 +14,7 @@
//! sub-phases.
use crate::completion::TableColumn;
use crate::dsl::grammar::shared::SET_VALUE;
use crate::dsl::grammar::shared::{SET_VALUE, count_tuple_values};
use crate::dsl::grammar::sql_expr;
use crate::dsl::grammar::sql_select::{
RETURNING_CLAUSE, SQL_SELECT_COMPOUND, WHERE_CLAUSE, reject_internal_table,
@@ -127,57 +127,9 @@ fn target_value_columns(ctx: &WalkContext) -> Vec<TableColumn> {
)
}
/// Count the value positions in the `VALUES` tuple whose contents begin
/// at `pos` (just past the opening `(`), and whether the tuple is
/// *closed* (a depth-0 `)` was reached) vs still being typed (scan hit
/// end-of-input first). Depth-aware: commas nested in a function call /
/// subquery (paren depth ≥ 1) or inside a string literal are not
/// separators. Returns `(0, _)` for an empty tuple `()`.
fn count_tuple_values(source: &str, pos: usize) -> (usize, bool) {
let bytes = source.as_bytes();
let mut i = pos;
let mut depth: i32 = 0;
let mut commas = 0usize;
let mut seen_value = false;
let mut closed = false;
while i < bytes.len() {
match bytes[i] {
b'\'' => {
// Skip a single-quoted string literal (`''` escape).
i += 1;
seen_value = true;
while i < bytes.len() {
if bytes[i] == b'\'' {
if bytes.get(i + 1) == Some(&b'\'') {
i += 2;
continue;
}
i += 1;
break;
}
i += 1;
}
continue;
}
b'(' => {
depth += 1;
seen_value = true;
}
b')' => {
if depth == 0 {
closed = true;
break; // tuple close
}
depth -= 1;
}
b',' if depth == 0 => commas += 1,
b if !b.is_ascii_whitespace() => seen_value = true,
_ => {}
}
i += 1;
}
(if seen_value { commas + 1 } else { 0 }, closed)
}
// `count_tuple_values` moved to `grammar::shared` (issue #17) so the
// simple-mode DSL insert arity gate can share it; the advanced grammar
// imports it above.
/// Tuple value-list lookahead (ADR-0036 Phase 3b). Gates the typed
/// per-column path on arity so the typed `Seq` is used only where it