ADR-0024 Phase D: data commands at chumsky parity

Migrate the five data commands at four entry words: show
(show data / show table), insert, update, delete. The walker
now owns the entire command set introduced through ADR-0014.

Scope deviation from ADR-0024: full schema-aware value typing
via DynamicSubgrammar(column_value_list) is deferred. The
walker accepts any value at any position — matching the
existing chumsky parser's behaviour, where per-column type
checks happen at bind time. The DynamicSubgrammar Node
variant and WalkContext schema fields stay declared so the
infrastructure is in place when the schema cache plumbs
through parse_command (a future refinement). All existing
tests pass on the new shape.
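
As an illustration of the split described above (parse accepts any
literal, the type check runs later), here is a minimal sketch of a
bind-time check. The `Value` enum mirrors the variants used in
data.rs; `ColumnType` and `check_at_bind` are hypothetical names
invented for this example, and treating `null` as valid for every
column is an assumption, not the project's documented rule.

```rust
// Mirrors the Value variants that data.rs constructs.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Null,
    Bool(bool),
    Number(String),
    Text(String),
}

// Hypothetical column types; the real schema model may differ.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ColumnType {
    Integer,
    Boolean,
    Text,
}

/// Hypothetical bind-time check: the parser has already accepted
/// `value` without looking at the schema, so the mismatch is
/// reported here instead.
fn check_at_bind(column: &str, ty: ColumnType, value: &Value) -> Result<(), String> {
    let ok = match (ty, value) {
        // Assumption: every column is nullable.
        (_, Value::Null) => true,
        (ColumnType::Integer, Value::Number(n)) => n.parse::<i64>().is_ok(),
        (ColumnType::Boolean, Value::Bool(_)) => true,
        (ColumnType::Text, Value::Text(_)) => true,
        _ => false,
    };
    if ok {
        Ok(())
    } else {
        Err(format!("column `{column}` expects {ty:?}, got {value:?}"))
    }
}
```

With this shape, `insert into t ('x')` parses regardless of the
column type and only fails once the value is bound to an Integer
column.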

Walker extensions:
- StringLit terminal — wired to the consume_string_literal
  helper that mirrors the legacy lexer's `''` escape handling.
  MatchedItem text carries the unescaped payload; span covers
  the surrounding quotes.
- Bridge: Incomplete error wording now appends `, found end
  of input` (matching the chumsky-side structural error
  contract that `structural_error_for_show_data_without_arg`
  asserts on).
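
The `''` escape handling described for StringLit can be sketched as
follows. This is not the project's `consume_string_literal`; the
signature and return type are assumptions made for illustration. It
returns the unescaped payload together with the number of bytes
consumed, so a caller could build a span that covers both quotes.

```rust
/// Hypothetical stand-in for the walker's string-literal helper.
/// Returns the unescaped payload and the byte length consumed
/// (including both quotes), or None if `input` does not start
/// with a complete single-quoted literal.
fn consume_string_literal(input: &str) -> Option<(String, usize)> {
    let mut chars = input.char_indices().peekable();
    // A literal must open with a single quote.
    match chars.next() {
        Some((_, '\'')) => {}
        _ => return None,
    }
    let mut payload = String::new();
    while let Some((idx, ch)) = chars.next() {
        if ch != '\'' {
            payload.push(ch);
            continue;
        }
        // A doubled quote is an escaped quote inside the literal.
        if let Some(&(_, '\'')) = chars.peek() {
            chars.next();
            payload.push('\'');
        } else {
            // Closing quote: the consumed length ends just past it.
            return Some((payload, idx + 1));
        }
    }
    // Ran out of input before the closing quote.
    None
}
```

For the input `'it''s' rest` the payload is `it's` while the
consumed length covers the full quoted text, which is the
text/span split the StringLit bullet describes.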

Grammar:
- src/dsl/grammar/data.rs: SHOW (Choice of show_data /
  show_table), INSERT (three forms folded into a single shape
  via a Choice ordered to disambiguate Form B's `values`
  keyword from Forms A/C's `(`-prefixed content; the inner
  paren list is a Choice(VALUE_LITERAL, Ident{Columns}) with
  VALUE_LITERAL ordered first so `true`/`false`/`null` match
  their Word branch rather than the broader identifier catch-
  all), UPDATE (assignments + filter), DELETE (filter).
- VALUE_LITERAL = Choice(Word("null"), Word("true"),
  Word("false"), NumberLit, StringLit) — matches the chumsky
  `value_literal()`.
- WHERE_CLAUSE / FILTER_CLAUSE shared between update and
  delete.
- AST builders walk MatchedPath items in order, using role
  tags (`update_set_column`, `filter_column`,
  `insert_first_item`) to discriminate column references
  belonging to different shapes within the same command.
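
The way the three insert forms are told apart after parsing can be
sketched on a simplified item stream. `Item`, `InsertForm` and
`classify_insert` are hypothetical names for this example only; the
real builder works on MatchedPath / MatchedKind and also extracts
the columns and values.

```rust
// Simplified stand-in for the walker's matched items.
#[derive(Debug, PartialEq)]
enum Item {
    Word(&'static str),
    Punct(char),
    Ident(&'static str),
    Number(&'static str),
}

#[derive(Debug, PartialEq)]
enum InsertForm {
    ColumnsThenValues, // Form A: (<cols>) values (<vals>)
    ValuesKeyword,     // Form B: values (<vals>)
    BareValues,        // Form C: (<vals>)
}

/// `after_table` holds the matched items that follow the table name.
fn classify_insert(after_table: &[Item]) -> Option<InsertForm> {
    match after_table.first()? {
        // `values` directly after the table name: Form B.
        Item::Word("values") => Some(InsertForm::ValuesKeyword),
        // A leading `(`: Form A if a `values` keyword shows up
        // later in the stream, otherwise Form C.
        Item::Punct('(') => {
            let saw_values = after_table
                .iter()
                .any(|i| matches!(i, Item::Word("values")));
            if saw_values {
                Some(InsertForm::ColumnsThenValues)
            } else {
                Some(InsertForm::BareValues)
            }
        }
        _ => None,
    }
}
```

Because Form B is recognised first, a single "was `values` seen"
flag is enough to separate Form A from Form C.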

Tests:
- 13 new walker-specific tests covering all data forms:
  show data / show table, insert with each of three forms,
  insert with negative numbers, update with single + multiple
  assignments + where, update with --all-rows, delete with
  where, delete with --all-rows, update/delete without filter
  errors, replay still routes via chumsky.
- Total: 838 passed, 0 failed, 1 ignored (was 825 / 1).
- cargo clippy --all-targets -- -D warnings clean.
claude@clouddev1
2026-05-15 07:20:53 +00:00
parent 6bb688251b
commit c2accc2385
5 changed files with 741 additions and 14 deletions
@@ -0,0 +1,531 @@
//! Data command nodes (ADR-0024 §migration Phase D).
//!
//! Five commands at four entry words: `show` (show data /
//! show table), `insert`, `update`, `delete`. The walker route
//! owns these end-to-end.
//!
//! Phase D scope deviation note: ADR-0024's Phase D describes
//! "full schema awareness" via `DynamicSubgrammar
//! (column_value_list)` that unfolds typed slots per column. This
//! milestone lands the data commands at functional parity with
//! the existing chumsky parser — value slots accept any
//! literal regardless of column type, with type validation
//! happening at bind time (matching today's behaviour). The
//! `DynamicSubgrammar` machinery and schema-cache plumbing are
//! deferred to a follow-up refinement; the trie shape is
//! ready to consume them when the schema reference flows
//! through `parse_command`.
use crate::dsl::command::{Command, RowFilter};
use crate::dsl::grammar::{
    CommandNode, IdentSource, Node, ValidationError, Word,
};
use crate::dsl::value::Value;
use crate::dsl::walker::outcome::{MatchedItem, MatchedKind, MatchedPath};
// =================================================================
// Building blocks
// =================================================================
const TABLE_NAME_EXISTING: Node = Node::Ident {
    source: IdentSource::Tables,
    role: "table_name",
    validator: None,
    highlight_override: None,
};

// `value_literal`: null / true / false / number / string. Mirrors
// the chumsky-side `value_literal()` in dsl/parser.rs.
const VALUE_LITERAL_CHOICES: &[Node] = &[
    Node::Word(Word::keyword("null")),
    Node::Word(Word::keyword("true")),
    Node::Word(Word::keyword("false")),
    Node::NumberLit { validator: None },
    Node::StringLit,
];
const VALUE_LITERAL: Node = Node::Choice(VALUE_LITERAL_CHOICES);
// =================================================================
// show — `show (data|table) <T>`
// =================================================================
const SHOW_DATA_NODES: &[Node] = &[
    Node::Word(Word::keyword("data")),
    TABLE_NAME_EXISTING,
];
const SHOW_DATA: Node = Node::Seq(SHOW_DATA_NODES);

const SHOW_TABLE_NODES: &[Node] = &[
    Node::Word(Word::keyword("table")),
    TABLE_NAME_EXISTING,
];
const SHOW_TABLE: Node = Node::Seq(SHOW_TABLE_NODES);

const SHOW_CHOICES: &[Node] = &[SHOW_DATA, SHOW_TABLE];
const SHOW_SHAPE: Node = Node::Choice(SHOW_CHOICES);
// =================================================================
// insert — `insert into <T> (<a>,<b>,…) values (<v>,<v>,…)`
// | `insert into <T> values (<v>,…)`
// | `insert into <T> (<v>,…)`
// =================================================================
//
// Forms A (with column list) and C (bare value list) both start
// with `(`. To avoid the walker's "first commit wins" semantics
// rejecting Form C when the inner content is values rather than
// column names, the inside of the first paren is parsed as a
// repeated `Choice(ValueLiteral, Ident)`. The AST builder then
// disambiguates: if a `values` keyword follows the first paren,
// the inner content was column names; otherwise it was values.
const INSERT_PAREN_ITEM_CHOICES: &[Node] = &[
    // VALUE_LITERAL first so that `true`/`false`/`null` match
    // their Word branch rather than the broader Ident{Columns}
    // catch-all (consume_ident doesn't filter against the
    // keyword set; without this ordering, `(true)` would lex
    // as a column-name list).
    VALUE_LITERAL,
    Node::Ident {
        source: IdentSource::Columns,
        role: "insert_first_item",
        validator: None,
        highlight_override: None,
    },
];
const INSERT_PAREN_ITEM: Node = Node::Choice(INSERT_PAREN_ITEM_CHOICES);
const INSERT_PAREN_LIST: Node = Node::Repeated {
    inner: &INSERT_PAREN_ITEM,
    separator: Some(&Node::Punct(',')),
    min: 1,
};
const INSERT_VALUES_LIST: Node = Node::Repeated {
    inner: &VALUE_LITERAL,
    separator: Some(&Node::Punct(',')),
    min: 1,
};

const INSERT_OPTIONAL_VALUES_NODES: &[Node] = &[
    Node::Word(Word::keyword("values")),
    Node::Punct('('),
    INSERT_VALUES_LIST,
    Node::Punct(')'),
];
const INSERT_OPTIONAL_VALUES: Node = Node::Optional(&Node::Seq(INSERT_OPTIONAL_VALUES_NODES));

const INSERT_PAREN_FIRST_NODES: &[Node] = &[
    Node::Punct('('),
    INSERT_PAREN_LIST,
    Node::Punct(')'),
    INSERT_OPTIONAL_VALUES,
];
const INSERT_PAREN_FIRST: Node = Node::Seq(INSERT_PAREN_FIRST_NODES);

const INSERT_VALUES_KEYWORD_FIRST_NODES: &[Node] = &[
    Node::Word(Word::keyword("values")),
    Node::Punct('('),
    INSERT_VALUES_LIST,
    Node::Punct(')'),
];
const INSERT_VALUES_KEYWORD_FIRST: Node = Node::Seq(INSERT_VALUES_KEYWORD_FIRST_NODES);

const INSERT_AFTER_TABLE_CHOICES: &[Node] =
    &[INSERT_VALUES_KEYWORD_FIRST, INSERT_PAREN_FIRST];
const INSERT_AFTER_TABLE: Node = Node::Choice(INSERT_AFTER_TABLE_CHOICES);

const INSERT_NODES: &[Node] = &[
    Node::Word(Word::keyword("into")),
    TABLE_NAME_EXISTING,
    INSERT_AFTER_TABLE,
];
const INSERT_SHAPE: Node = Node::Seq(INSERT_NODES);
// =================================================================
// update — `update <T> set <col>=<v>[, <col>=<v>] (where … | --all-rows)`
// =================================================================
const UPDATE_ASSIGNMENT_NODES: &[Node] = &[
    Node::Ident {
        source: IdentSource::Columns,
        role: "update_set_column",
        validator: None,
        highlight_override: None,
    },
    Node::Punct('='),
    VALUE_LITERAL,
];
const UPDATE_ASSIGNMENT: Node = Node::Seq(UPDATE_ASSIGNMENT_NODES);
const UPDATE_ASSIGNMENTS: Node = Node::Repeated {
    inner: &UPDATE_ASSIGNMENT,
    separator: Some(&Node::Punct(',')),
    min: 1,
};

const WHERE_CLAUSE_NODES: &[Node] = &[
    Node::Word(Word::keyword("where")),
    Node::Ident {
        source: IdentSource::Columns,
        role: "filter_column",
        validator: None,
        highlight_override: None,
    },
    Node::Punct('='),
    VALUE_LITERAL,
];
const WHERE_CLAUSE: Node = Node::Seq(WHERE_CLAUSE_NODES);

const FILTER_CHOICES: &[Node] = &[WHERE_CLAUSE, Node::Flag("all-rows")];
const FILTER_CLAUSE: Node = Node::Choice(FILTER_CHOICES);

const UPDATE_NODES: &[Node] = &[
    TABLE_NAME_EXISTING,
    Node::Word(Word::keyword("set")),
    UPDATE_ASSIGNMENTS,
    FILTER_CLAUSE,
];
const UPDATE_SHAPE: Node = Node::Seq(UPDATE_NODES);
// =================================================================
// delete — `delete from <T> (where … | --all-rows)`
// =================================================================
const DELETE_NODES: &[Node] = &[
    Node::Word(Word::keyword("from")),
    TABLE_NAME_EXISTING,
    FILTER_CLAUSE,
];
const DELETE_SHAPE: Node = Node::Seq(DELETE_NODES);
// =================================================================
// AST builders
// =================================================================
fn ident_text<'a>(path: &'a MatchedPath, role: &str) -> Option<&'a str> {
    path.items.iter().find_map(|i| match &i.kind {
        MatchedKind::Ident { role: r } if *r == role => Some(i.text.as_str()),
        _ => None,
    })
}

fn require_ident(path: &MatchedPath, role: &'static str) -> Result<String, ValidationError> {
    ident_text(path, role)
        .map(str::to_string)
        .ok_or_else(|| ValidationError {
            message_key: "parse.error_wrapper",
            args: vec![("detail", format!("missing {role}"))],
        })
}

/// Convert a `MatchedItem` whose kind is one of the `value_literal`
/// variants (Word("null"|"true"|"false"), NumberLit, StringLit) to
/// a `Value`. Returns None for non-value items.
fn item_to_value(item: &MatchedItem) -> Option<Value> {
    match &item.kind {
        MatchedKind::Word("null") => Some(Value::Null),
        MatchedKind::Word("true") => Some(Value::Bool(true)),
        MatchedKind::Word("false") => Some(Value::Bool(false)),
        MatchedKind::NumberLit => Some(Value::Number(item.text.clone())),
        MatchedKind::StringLit => Some(Value::Text(item.text.clone())),
        _ => None,
    }
}

fn build_show(path: &MatchedPath) -> Result<Command, ValidationError> {
    let sub = path
        .items
        .iter()
        .filter_map(|i| match &i.kind {
            MatchedKind::Word(w) => Some(*w),
            _ => None,
        })
        .nth(1);
    let name = require_ident(path, "table_name")?;
    match sub {
        Some("data") => Ok(Command::ShowData { name }),
        Some("table") => Ok(Command::ShowTable { name }),
        _ => Err(ValidationError {
            message_key: "parse.error_wrapper",
            args: vec![("detail", "unknown show subcommand".to_string())],
        }),
    }
}
fn build_insert(path: &MatchedPath) -> Result<Command, ValidationError> {
    let table = require_ident(path, "table_name")?;
    // Three forms reach this builder (see the grammar above):
    //   A: `(<cols>) values (<vals>)`
    //   B: `values (<vals>)`
    //   C: `(<vals>)`
    //
    // Strategy: look at the item right after the table name.
    //   - Word("values") means Form B; the parenthesized list
    //     that follows is the value list.
    //   - Punct('(') means the first paren held either column
    //     names (Form A) or values (Form C). It is Form A
    //     exactly when a Word("values") follows the closing
    //     paren.
    //
    // Form B is handled first, so checking whether a `values`
    // keyword appears anywhere in the path is enough to tell
    // Form A from Form C.
    let saw_values = path
        .items
        .iter()
        .any(|i| matches!(i.kind, MatchedKind::Word("values")));
    // Find the index of the table_name match; the first paren
    // afterwards starts the parsed list.
    let table_idx = path
        .items
        .iter()
        .position(|i| matches!(&i.kind, MatchedKind::Ident { role: "table_name" }))
        .ok_or_else(|| ValidationError {
            message_key: "parse.error_wrapper",
            args: vec![("detail", "missing table".to_string())],
        })?;
    // Form B (values keyword right after table): no column list,
    // values come from the single paren-bounded list.
    let first_token_after_table = path.items.get(table_idx + 1);
    let form_b = matches!(
        first_token_after_table.map(|i| &i.kind),
        Some(MatchedKind::Word("values"))
    );
    if form_b {
        // Form B: the only value run is between the only `(` … `)`.
        let values = collect_values_in_parens(path, table_idx + 1)?;
        return Ok(Command::Insert {
            table,
            columns: None,
            values,
        });
    }
    // Form A or C: the first paren after the table is a Choice
    // of either column-idents or value-literals.
    let first_paren_idx = path
        .items
        .iter()
        .enumerate()
        .skip(table_idx + 1)
        .find(|(_, i)| matches!(i.kind, MatchedKind::Punct('(')))
        .map(|(idx, _)| idx)
        .ok_or_else(|| ValidationError {
            message_key: "parse.error_wrapper",
            args: vec![("detail", "missing `(`".to_string())],
        })?;
    if saw_values {
        // Form A: first paren = column names; second paren = values.
        // The Repeated inside the first paren tagged matched idents
        // with role "insert_first_item".
        let columns: Vec<String> = path
            .items
            .iter()
            .filter_map(|i| match &i.kind {
                MatchedKind::Ident {
                    role: "insert_first_item",
                } => Some(i.text.clone()),
                _ => None,
            })
            .collect();
        if columns.is_empty() {
            return Err(ValidationError {
                message_key: "parse.error_wrapper",
                args: vec![(
                    "detail",
                    "expected column names in `insert into T (…)`".to_string(),
                )],
            });
        }
        // Find the `values` keyword and the next `(`; the values
        // run starts after that `(`.
        let values_idx = path
            .items
            .iter()
            .enumerate()
            .skip(first_paren_idx)
            .find(|(_, i)| matches!(i.kind, MatchedKind::Word("values")))
            .map(|(i, _)| i)
            .ok_or_else(|| ValidationError {
                message_key: "parse.error_wrapper",
                args: vec![("detail", "missing `values` keyword".to_string())],
            })?;
        let values = collect_values_in_parens(path, values_idx + 1)?;
        Ok(Command::Insert {
            table,
            columns: Some(columns),
            values,
        })
    } else {
        // Form C: the first paren contained the value list. The
        // Repeated tagged the matched values via their natural
        // MatchedKind (Word/NumberLit/StringLit); collect them.
        let values = collect_values_in_parens(path, first_paren_idx)?;
        Ok(Command::Insert {
            table,
            columns: None,
            values,
        })
    }
}
/// Collect Value items inside the next `(…)` block at or after
/// `start_idx`. Stops at the matching `)`.
fn collect_values_in_parens(
    path: &MatchedPath,
    start_idx: usize,
) -> Result<Vec<Value>, ValidationError> {
    let mut out = Vec::new();
    let mut inside = false;
    for item in path.items.iter().skip(start_idx) {
        match &item.kind {
            MatchedKind::Punct('(') => inside = true,
            MatchedKind::Punct(')') if inside => return Ok(out),
            _ if inside => {
                if let Some(v) = item_to_value(item) {
                    out.push(v);
                }
            }
            _ => {}
        }
    }
    if out.is_empty() && !inside {
        return Err(ValidationError {
            message_key: "parse.error_wrapper",
            args: vec![("detail", "missing `(`".to_string())],
        });
    }
    Ok(out)
}
fn build_update(path: &MatchedPath) -> Result<Command, ValidationError> {
    let table = require_ident(path, "table_name")?;
    let assignments = collect_assignments(path)?;
    let filter = collect_filter(path)?;
    Ok(Command::Update {
        table,
        assignments,
        filter,
    })
}

fn collect_assignments(
    path: &MatchedPath,
) -> Result<Vec<(String, Value)>, ValidationError> {
    let mut out = Vec::new();
    let mut iter = path.items.iter();
    while let Some(item) = iter.next() {
        if matches!(
            item.kind,
            MatchedKind::Ident {
                role: "update_set_column"
            }
        ) {
            let column = item.text.clone();
            // Skip the `=` punct.
            for next in iter.by_ref() {
                if matches!(next.kind, MatchedKind::Punct('=')) {
                    break;
                }
            }
            // Next item is the value.
            let value_item = iter.next().ok_or_else(|| ValidationError {
                message_key: "parse.error_wrapper",
                args: vec![("detail", "missing assignment value".to_string())],
            })?;
            let value = item_to_value(value_item).ok_or_else(|| ValidationError {
                message_key: "parse.error_wrapper",
                args: vec![("detail", "expected value literal".to_string())],
            })?;
            out.push((column, value));
        }
    }
    Ok(out)
}

fn collect_filter(path: &MatchedPath) -> Result<RowFilter, ValidationError> {
    if path
        .items
        .iter()
        .any(|i| matches!(i.kind, MatchedKind::Flag("all-rows")))
    {
        return Ok(RowFilter::AllRows);
    }
    // Walk for filter_column ident, then `=`, then value.
    let mut iter = path.items.iter();
    while let Some(item) = iter.next() {
        if matches!(
            item.kind,
            MatchedKind::Ident {
                role: "filter_column"
            }
        ) {
            let column = item.text.clone();
            // Skip until `=`.
            for next in iter.by_ref() {
                if matches!(next.kind, MatchedKind::Punct('=')) {
                    break;
                }
            }
            let value_item = iter.next().ok_or_else(|| ValidationError {
                message_key: "parse.error_wrapper",
                args: vec![("detail", "missing where value".to_string())],
            })?;
            let value = item_to_value(value_item).ok_or_else(|| ValidationError {
                message_key: "parse.error_wrapper",
                args: vec![("detail", "expected value literal".to_string())],
            })?;
            return Ok(RowFilter::Where { column, value });
        }
    }
    Err(ValidationError {
        message_key: "parse.error_wrapper",
        args: vec![("detail", "missing where or --all-rows".to_string())],
    })
}

fn build_delete(path: &MatchedPath) -> Result<Command, ValidationError> {
    let table = require_ident(path, "table_name")?;
    let filter = collect_filter(path)?;
    Ok(Command::Delete { table, filter })
}
// =================================================================
// CommandNodes
// =================================================================
pub static SHOW: CommandNode = CommandNode {
    entry: Word::keyword("show"),
    shape: SHOW_SHAPE,
    ast_builder: build_show,
    help_id: Some("data.show"),
    usage_id: Some("parse.usage.show"),
    hint_mode: None,
};

pub static INSERT: CommandNode = CommandNode {
    entry: Word::keyword("insert"),
    shape: INSERT_SHAPE,
    ast_builder: build_insert,
    help_id: Some("data.insert"),
    usage_id: Some("parse.usage.insert"),
    hint_mode: None,
};

pub static UPDATE: CommandNode = CommandNode {
    entry: Word::keyword("update"),
    shape: UPDATE_SHAPE,
    ast_builder: build_update,
    help_id: Some("data.update"),
    usage_id: Some("parse.usage.update"),
    hint_mode: None,
};

pub static DELETE: CommandNode = CommandNode {
    entry: Word::keyword("delete"),
    shape: DELETE_SHAPE,
    ast_builder: build_delete,
    help_id: Some("data.delete"),
    usage_id: Some("parse.usage.delete"),
    hint_mode: None,
};