ADR-0024 Phase D (full): schema-aware value typing

Schema-aware typed value slots — the central design claim of
ADR-0024 §Phase D. Insert / update / delete value slots now
dispatch on the user-facing column type at parse time, rejecting
mis-shaped input with localised wording instead of waiting for
the bind-time error.

What changed:

**SchemaCache extension** (`src/completion.rs`):
- New `TableColumn { name, user_type }` for per-table column
  metadata.
- `SchemaCache.table_columns: HashMap<String, Vec<TableColumn>>`.
- `SchemaCache::columns_for_table(name)` — case-insensitive
  lookup, mirrors the walker's case-insensitive entry-word
  resolution.
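
A minimal, self-contained sketch of the lookup described above. The `Type` enum here is a two-variant stand-in for `crate::dsl::types::Type` (its real variants are not shown in this commit message), and `sample_cache` is a hypothetical helper that exists only for the example.

```rust
use std::collections::HashMap;

/// Stand-in for `crate::dsl::types::Type`; the variant names are assumed.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Type {
    Int,
    Text,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct TableColumn {
    pub name: String,
    pub user_type: Type,
}

#[derive(Debug, Clone, Default)]
pub struct SchemaCache {
    pub table_columns: HashMap<String, Vec<TableColumn>>,
}

impl SchemaCache {
    /// Linear scan over the map so the comparison can ignore ASCII case.
    pub fn columns_for_table(&self, table: &str) -> Option<&[TableColumn]> {
        self.table_columns
            .iter()
            .find(|(name, _)| name.eq_ignore_ascii_case(table))
            .map(|(_, cols)| cols.as_slice())
    }
}

/// Hypothetical helper: a cache holding one table with two columns.
pub fn sample_cache() -> SchemaCache {
    let mut cache = SchemaCache::default();
    cache.table_columns.insert(
        "Customers".to_string(),
        vec![
            TableColumn { name: "Id".to_string(), user_type: Type::Int },
            TableColumn { name: "Name".to_string(), user_type: Type::Text },
        ],
    );
    cache
}
```

The scan is O(number of tables) rather than a hashed lookup. If two keys differed only by case, which one is returned would depend on `HashMap` iteration order.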

**WalkContext schema plumbing** (`src/dsl/walker/context.rs`):
- `WalkContext<'a>` gains a lifetime and a `schema: Option<&'a
  SchemaCache>`. `WalkContext::new()` keeps the schemaless
  default; `with_schema(s)` is the new schema-aware constructor.

**Parser entry point** (`src/dsl/parser.rs`):
- `parse_command_with_schema(input, schema)` is the new public
  schema-aware variant. `parse_command(input)` becomes a thin
  wrapper that delegates with `None` for back-compat.
- Internal `try_walker_route` accepts an `Option<&SchemaCache>`
  and threads it into the WalkContext.
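
The context and parser bullets above amount to a thin-wrapper pattern, roughly sketched below. This is an illustration under assumptions: `Parsed` is a hypothetical placeholder for the real parse result, the `SchemaCache` is reduced to one field, and the exact parameter type of `with_schema` is a guess.

```rust
#[derive(Debug, Default)]
pub struct SchemaCache {
    pub tables: Vec<String>,
}

/// Walk context that may borrow a schema for the duration of one walk.
pub struct WalkContext<'a> {
    pub schema: Option<&'a SchemaCache>,
}

impl<'a> WalkContext<'a> {
    /// Schemaless default.
    pub fn new() -> Self {
        WalkContext { schema: None }
    }

    /// Schema-aware constructor (parameter type assumed).
    pub fn with_schema(schema: &'a SchemaCache) -> Self {
        WalkContext { schema: Some(schema) }
    }
}

/// Hypothetical placeholder for the real parse result.
#[derive(Debug, PartialEq, Eq)]
pub struct Parsed {
    pub input: String,
    pub schema_aware: bool,
}

pub fn parse_command_with_schema(input: &str, schema: Option<&SchemaCache>) -> Parsed {
    // Thread the optional schema into the context, as `try_walker_route` is described to do.
    let ctx = match schema {
        Some(s) => WalkContext::with_schema(s),
        None => WalkContext::new(),
    };
    Parsed { input: input.to_string(), schema_aware: ctx.schema.is_some() }
}

/// Back-compat wrapper: existing callers keep the schemaless behaviour.
pub fn parse_command(input: &str) -> Parsed {
    parse_command_with_schema(input, None)
}
```

Keeping `parse_command` as a delegate means existing call sites compile unchanged, and only callers that hold a cache need to opt in.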

**Node::Ident writes_table/writes_column** (`src/dsl/grammar/mod.rs`):
- Two new fields on `Node::Ident`. When `writes_table: true` and
  `source: Tables`, the walker writes the matched ident's name
  into `current_table` and resolves `current_table_columns`
  against the schema cache. When `writes_column: true` and
  `source: Columns`, the walker writes the resolved
  `TableColumn` into `current_column`.
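
One way to picture the flag-driven writes is the sketch below. It is not the walker's actual code: `WalkState` and `apply_ident_writes` are hypothetical names, the schema is reduced to a plain map, `user_type` is a string stand-in, and case-insensitive column matching is an assumption.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct TableColumn {
    pub name: String,
    pub user_type: String, // stand-in for the real `Type`
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum IdentSource {
    Tables,
    Columns,
}

/// Hypothetical holder for the three context fields named above.
#[derive(Default)]
pub struct WalkState {
    pub current_table: Option<String>,
    pub current_table_columns: Option<Vec<TableColumn>>,
    pub current_column: Option<TableColumn>,
}

pub fn apply_ident_writes(
    state: &mut WalkState,
    schema: &HashMap<String, Vec<TableColumn>>,
    source: IdentSource,
    writes_table: bool,
    writes_column: bool,
    ident: &str,
) {
    match source {
        // Table slot: record the name and resolve its columns (None on a miss).
        IdentSource::Tables if writes_table => {
            state.current_table = Some(ident.to_string());
            state.current_table_columns = schema
                .iter()
                .find(|(name, _)| name.eq_ignore_ascii_case(ident))
                .map(|(_, cols)| cols.clone());
        }
        // Column slot: resolve against the columns of the current table.
        IdentSource::Columns if writes_column => {
            state.current_column = state
                .current_table_columns
                .as_ref()
                .and_then(|cols| cols.iter().find(|c| c.name.eq_ignore_ascii_case(ident)))
                .cloned();
        }
        _ => {}
    }
}
```

When either lookup misses, the corresponding field stays `None`, which is what lets later value slots fall back to schemaless behaviour.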

**Walker driver DynamicSubgrammar dispatch** (`src/dsl/walker/driver.rs`):
- The `Node::DynamicSubgrammar(factory)` branch now resolves the
  factory at walk time and `Box::leak`s the result so its inner
  static-slice fields (Choice/Seq) have the lifetime the walker
  expects (per ADR-0024 §sub-grammars). Each walk leaks memory
  bounded by the command-shape complexity; a per-walk arena is a
  future optimisation.
- `walk_ident` extends to perform the schema writes when the
  flags are set.
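
The `Box::leak` step can be illustrated with a much-reduced `Node`. Everything below is a sketch under assumptions: the real enum has more variants and fields, and `resolve_dynamic` and `value_list` are hypothetical names.

```rust
/// Reduced stand-in for the grammar `Node`; note the `'static` slice in `Seq`.
pub enum Node {
    Literal(&'static str),
    Seq(&'static [Node]),
    DynamicSubgrammar(fn(&WalkContext) -> Node),
}

/// Reduced stand-in for the walk context.
pub struct WalkContext {
    pub column_count: usize,
}

/// Resolve a dynamic node at walk time. The factory's result is boxed and
/// leaked so the returned reference is `'static`, matching static grammar nodes.
pub fn resolve_dynamic(node: &'static Node, ctx: &WalkContext) -> &'static Node {
    match node {
        Node::DynamicSubgrammar(factory) => {
            let leaked: &'static Node = Box::leak(Box::new(factory(ctx)));
            leaked
        }
        other => other,
    }
}

/// Hypothetical factory: one placeholder slot per column in the context.
pub fn value_list(ctx: &WalkContext) -> Node {
    let slots: Vec<Node> = (0..ctx.column_count)
        .map(|_| Node::Literal("value"))
        .collect();
    // The inner slice must also be `'static`, so it is leaked as well.
    Node::Seq(Box::leak(slots.into_boxed_slice()))
}
```

Each call to `resolve_dynamic` on a dynamic node allocates memory that is never freed, which is the trade-off the bullet above describes. An arena dropped at the end of the walk would bound it, at the cost of threading a shorter lifetime through the walker.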

**Typed value slot factories + dynamic sub-grammars** (`src/dsl/grammar/shared.rs`):
- `int_slot` / `real_slot` / `decimal_slot` / `bool_slot` /
  `text_slot` / `date_slot` / `datetime_slot` / `blob_slot` —
  one per `Type`. Each accepts the appropriate literal kind plus
  `null`. The integer-only validator rejects `3.14` at int columns;
  the decimal validator pins the numeric shape.
- `slot_for_type(ty) -> Node` is the dispatcher.
- `current_column_value(ctx) -> Node` is the dynamic sub-grammar
  for `set col = …` and `where col = …` values; reads
  `current_column` and dispatches via `slot_for_type`.
- `column_value_list(ctx) -> Node` is the dynamic sub-grammar
  for `insert into T values (…)`; reads `current_table_columns`
  and unfolds a Seq of typed slots separated by commas.
- Both fall back to the schemaless `VALUE_LITERAL` choice when
  the context lacks the schema-resolved entries — keeps
  schemaless `parse_command` callers (tests, replay path)
  working.
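
To make the dispatch and fallback concrete, here is a simplified acceptance check rather than the real `Node`-returning factories. The `Type` subset, the literal shapes (optional leading `-`, single-quoted text, lowercase `true`/`false`), and the name `slot_accepts` are all assumptions for illustration.

```rust
/// Stand-in for a subset of the user-facing `Type` enum (variant names assumed).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Type {
    Int,
    Decimal,
    Bool,
    Text,
}

fn is_digits(s: &str) -> bool {
    !s.is_empty() && s.bytes().all(|b| b.is_ascii_digit())
}

/// Integer-only shape: optional sign, then digits, no fractional part.
fn is_integer(s: &str) -> bool {
    is_digits(s.strip_prefix('-').unwrap_or(s))
}

/// Decimal shape: an integer, optionally followed by `.` and more digits.
fn is_decimal(s: &str) -> bool {
    let body = s.strip_prefix('-').unwrap_or(s);
    match body.split_once('.') {
        Some((whole, frac)) => is_digits(whole) && is_digits(frac),
        None => is_digits(body),
    }
}

fn is_quoted_text(s: &str) -> bool {
    s.len() >= 2 && s.starts_with('\'') && s.ends_with('\'')
}

/// `None` models the schemaless fallback: any literal kind is accepted.
pub fn slot_accepts(column_type: Option<Type>, literal: &str) -> bool {
    // `null` is accepted at every slot.
    if literal.eq_ignore_ascii_case("null") {
        return true;
    }
    match column_type {
        Some(Type::Int) => is_integer(literal),
        Some(Type::Decimal) => is_decimal(literal),
        Some(Type::Bool) => matches!(literal, "true" | "false"),
        Some(Type::Text) => is_quoted_text(literal),
        None => {
            is_decimal(literal) || matches!(literal, "true" | "false") || is_quoted_text(literal)
        }
    }
}
```

With these assumed rules, `3.14` is rejected at an int column and accepted at a decimal column or in the schemaless case, mirroring the behaviour the tests below are said to cover.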

**Data-command grammar wires the new types** (`src/dsl/grammar/data.rs`):
- `TABLE_NAME_INSERT` / `TABLE_NAME_WRITES` (new): table-name
  slots that set `writes_table: true`. Used by insert / update /
  delete to populate `current_table_columns`.
- `SET_COLUMN` / `FILTER_COLUMN` (new): column-name slots in
  `set col=…` / `where col=…` set `writes_column: true`.
- `INSERT_VALUES_LIST` becomes `DynamicSubgrammar(column_value_list)`.
- `UPDATE_ASSIGNMENT` and `WHERE_CLAUSE` use
  `PER_COLUMN_VALUE = DynamicSubgrammar(current_column_value)`.

**Runtime plumbs schema-with-types** (`src/runtime.rs`):
- `refresh_schema_cache` calls `describe_table` for each table
  and populates `SchemaCache::table_columns` with
  `TableColumn { name, user_type }` entries. Best-effort: a
  `describe_table` miss leaves that table unpopulated and the
  walker falls back to schemaless dispatch.
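
The best-effort population might look like the sketch below. The closure parameter stands in for the runtime's `describe_table`, whose real signature and error type are not shown here; `collect_table_columns` is a hypothetical name and `user_type` is a string stand-in.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct TableColumn {
    pub name: String,
    pub user_type: String, // stand-in for the real `Type`
}

/// Describe every table, skipping the ones whose description fails.
pub fn collect_table_columns<F>(
    tables: &[&str],
    describe_table: F,
) -> HashMap<String, Vec<TableColumn>>
where
    F: Fn(&str) -> Result<Vec<TableColumn>, String>,
{
    let mut out = HashMap::new();
    for &table in tables {
        // A miss leaves the table absent, so the walker sees no typed columns for it.
        if let Ok(cols) = describe_table(table) {
            out.insert(table.to_string(), cols);
        }
    }
    out
}
```

Swallowing the error keeps one broken table from blocking typed dispatch for the others; the cost is that the failure is silent unless it is logged elsewhere.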

**App dispatches with schema** (`src/app.rs`):
- `dispatch_dsl` routes through `parse_command_with_schema`,
  passing `&self.schema_cache`, so live typing/dispatch sees the
  typed slots. The replay path stays schemaless (deferred; replay
  bind-time errors still catch type mismatches).

**Catalog** (`src/friendly/strings/en-US.yaml`, `src/friendly/keys.rs`):
- New `parse.custom.bind_type_mismatch` entry with `{found}` and
  `{expected}` placeholders. Surfaced by the int_slot /
  decimal_slot validators.
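
As an illustration of how the two placeholders could be filled, the sketch below does plain string substitution. The function name and the template wording used with it are hypothetical; the actual catalog text and rendering code are not shown in this commit message.

```rust
/// Fill the `{found}` and `{expected}` placeholders of a catalog template.
/// `{expected}` is substituted first so that user-typed `found` text
/// containing the characters `{expected}` is left untouched.
pub fn render_bind_type_mismatch(template: &str, found: &str, expected: &str) -> String {
    template
        .replace("{expected}", expected)
        .replace("{found}", found)
}
```

For example, with the made-up template `expected {expected}, found {found}`, a `3.14` typed at an int column would render as `expected int, found 3.14`.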

Tests:
- 11 new walker-side Phase D tests cover insert / update /
  delete with schemas — typed acceptance per column, decimal
  rejection at int columns, null acceptance at any slot,
  multi-assignment per-column dispatch, schemaless fallback.
- The pre-existing `parse_command(input)` test suite (no
  schema) still passes — the fallback path is behaviour-
  preserving.
- 828 passing total, 0 failing, 1 ignored. Clippy clean.
claude@clouddev1
2026-05-15 17:45:56 +00:00
parent 85817791dc
commit abebd7944f
14 changed files with 754 additions and 74 deletions
+40 -3
@@ -29,16 +29,30 @@ use crate::dsl::{ParseError, parse_command};
/// `add 1:n relationship`) — adding more is a one-line edit.
const COMPOSITE_CANDIDATES: &[(&str, &str)] = &[("1", "1:n")];
/// Per-project schema lookup cache (ADR-0022 §9, ADR-0024 §Phase D).
///
/// Held by `App::schema_cache` and consulted by the completion
/// engine for identifier slots and by the walker for schema-aware
/// value-slot dispatch (Phase D full). Empty by default; the
/// runtime refreshes on project load and after successful DDL.
#[derive(Debug, Clone, Default)]
pub struct SchemaCache {
pub tables: Vec<String>,
pub columns: Vec<String>,
pub relationships: Vec<String>,
/// Per-table column metadata with user-facing types
/// (ADR-0024 §Phase D). Keyed by table name; lookup is
/// case-insensitive in `columns_for_table` so the walker
/// can resolve `Customers` regardless of how it was typed.
pub table_columns: std::collections::HashMap<String, Vec<TableColumn>>,
}
/// One column's user-facing type info, scoped to a table
/// (ADR-0024 §Phase D, §WalkContext).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct TableColumn {
pub name: String,
pub user_type: crate::dsl::types::Type,
}
impl SchemaCache {
@@ -54,6 +68,24 @@ impl SchemaCache {
IdentSource::NewName | IdentSource::Types | IdentSource::Free => &[],
}
}
/// Per-table column metadata lookup. Case-insensitive on
/// the table name so the walker can resolve identifiers
/// the user typed in either case (ADR-0009 — keywords are
/// case-insensitive, identifiers preserve case; this helper
/// matches the walker's case-insensitive entry-word lookup
/// rather than the strict-case `tables` Vec).
///
/// Returns `None` when no table matches, and `Some` of an
/// empty slice when the table exists but has no columns (rare:
/// CSV-empty tables still carry PK columns in metadata).
#[must_use]
pub fn columns_for_table(&self, table: &str) -> Option<&[TableColumn]> {
self.table_columns
.iter()
.find(|(name, _)| name.eq_ignore_ascii_case(table))
.map(|(_, cols)| cols.as_slice())
}
}
/// What the grammar would accept at the end of `leading`,
@@ -1051,6 +1083,7 @@ mod tests {
tables: vec!["Customers".to_string(), "Orders".to_string()],
columns: vec![],
relationships: vec![],
..SchemaCache::default()
};
// After `show data ` the parser expects a table name.
let cs = cands_with("show data ", 10, &cache);
@@ -1063,6 +1096,7 @@ mod tests {
tables: vec!["Customers".to_string()],
columns: vec!["Email".to_string(), "Name".to_string()],
relationships: vec![],
..SchemaCache::default()
};
// After `drop column from Customers: ` the parser
// expects a column name (existing).
@@ -1076,6 +1110,7 @@ mod tests {
tables: vec![],
columns: vec![],
relationships: vec!["cust_orders".to_string(), "ord_items".to_string()],
..SchemaCache::default()
};
// After `drop relationship ` the parser expects either
// an identifier (relationship name) or `from`. Schema
@@ -1092,6 +1127,7 @@ mod tests {
tables: vec!["Customers".to_string(), "Orders".to_string()],
columns: vec![],
relationships: vec![],
..SchemaCache::default()
};
// Typed `Cu` after `show data ` — only `Customers`
// matches.
@@ -1224,6 +1260,7 @@ mod tests {
tables: vec!["Existing".to_string()],
columns: vec!["AlsoExisting".to_string()],
relationships: vec![],
..SchemaCache::default()
};
let cs = cands_with("create table ", 13, &cache);
assert!(cs.is_empty(), "got {cs:?}");