13 KiB
Session handoff — 2026-05-15 (13)
Thirteenth handover. This session executed handoff-12 §1: the systematic typing-surface test matrix. The matrix was built across all thirteen command families and, in the process, surfaced and fixed five production bugs that the 859-test suite had missed — exactly the class of issue §1 was designed to catch.
The headline: the matrix works. It found real bugs on its first three Form A cells and kept finding them through the DDL families. Every bug is now fixed and pinned by a regression cell. See §2 for the bug list and §4 for what's still open.
State at handoff
Branch: main. Working tree clean. Local HEAD is ahead of
origin/main by twenty-four commits across four sessions (the
user pushes asynchronously; nothing here is pushed).
Commits since handoff-12 (3b1955c):
0b15ce0 Walker + parser: surface mid-typing after separators and Form C/A ambiguity
24e641b Matrix: typing-surface infrastructure + insert Form A coverage
fffb44f input_render: schema-aware classify_input for wrong-count value lists
a9a04cf Matrix: insert Form B coverage
37db2f5 Matrix: insert Form C + update + delete coverage
216e7ba DDL grammar: writes_table on table-name slots for column narrowing
c7ecc64 Matrix: create table, DDL, and app-command coverage
(Plus 3b1955c itself — the handoff-12 commit — which was the
prior HEAD.)
Tests: 989 passing, 0 failing, 1 ignored (up from 859).
The 130 new tests are the typing-surface matrix
(tests/typing_surface_matrix.rs + tests/typing_surface/).
The ignored test is the long-standing ```ignore doc-test
in src/friendly/mod.rs.
Clippy: clean with nursery lints + -D warnings.
§1. What shipped — the typing-surface matrix
Per handoff-12 §1. The matrix is a folder of per-command test
files under tests/typing_surface/, driven by a single test
binary (tests/typing_surface_matrix.rs).
Infrastructure (tests/typing_surface/mod.rs)
- Five canonical schema shapes (handoff-12 §1
CANONICAL_SCHEMA_SHAPES):
schema_empty,schema_serial_pk(Customers: id:serial, Name:text, Email:text),schema_text_pk(Items: Code:text, Title:text),schema_multi_table(Customers + Orders, disjoint column names — the leakage-detection shape),schema_every_type(Things: one column perTypevariant). assess(input, cursor, schema) -> Assessment— runs the four typing-surface APIs at one cell:classify_input_with_schema,ambient_hint,candidates_at_cursor,parse_command_with_schema.- Property-assertion helpers —
assert_no_candidate_named(leakage detection),assert_candidate_present,hint_prose_contains, etc. snap!macro — wrapsinsta::assert_debug_snapshot!with a stable per-cell suffix. Every cell has both a snapshot (drift detection) and explicit property assertions (bug-class regressions). This dual approach is deliberate: the four handoff-12 bugs were properties (no leakage, X offered, not red), not snapshots.
Coverage — all 13 command families
| File | Tests | Notes |
|---|---|---|
insert_form_a.rs |
23 | explicit column list |
insert_form_b.rs |
12 | bare values keyword |
insert_form_c.rs |
8 | bare value list |
update_with_where.rs |
12 | |
update_all_rows.rs |
4 | |
delete_with_where.rs |
7 | |
delete_all_rows.rs |
3 | |
create_table.rs |
11 | |
drop_column.rs |
6 | drove the §2.2 fix |
drop_relationship.rs |
7 | endpoints + by-name |
add_relationship.rs |
9 | |
rename_change_column.rs |
8 | |
app_commands.rs |
17 | quit/help/save/… |
| (infrastructure smoke) | 3 | in mod.rs |
§2. Bugs the matrix found — all fixed
Five production bugs, each now pinned by a matrix cell plus (for the parser/walker ones) the regression survives in the matrix.
Bug 1 — walk_repeated rolled back mid-typing after a
separator. insert into T (a, at EOF rolled the matched ,
back, the outer (-list then expected ), saw the ,, and
reported DefiniteErrorAt at the comma. The input pane flashed
red on every comma. Fix (src/dsl/walker/driver.rs): when the
separator consumed and the inner failed at post-separator EOF,
propagate Incomplete instead of rolling back. Commit 0b15ce0.
Bug 2 — Form C accepted column-shaped items. insert into T (Name) (no values keyword) walked to a complete match and
produced Insert { columns: None, values: [] } — Form C's value
collector silently drops ident-shaped items. The user meant
Form A. Fix (src/dsl/grammar/data.rs build_insert): reject
with a ValidationError (parse.custom.insert_form_a_missing_values)
naming the Form-A continuation; the input now classifies as
IncompleteAtEof. Commit 0b15ce0.
Bug 3 — ValidationFailed lost the expected set.
completion_probe / expected_at_input returned an empty
expected set for ValidationFailed outcomes, so Tab had nothing
to offer at the new Form-A flag point. Fix
(src/dsl/walker/mod.rs): surface result.tail_expected (the
skipped-Optional expectations captured before validation fired).
Commit 0b15ce0.
Bug 4 — classify_input was schemaless. Wrong-count Form B
value lists (insert into Customers values ('Alice') against a
3-column table) showed as Valid until submit. Fix
(src/input_render.rs): added classify_input_with_schema,
wired render_input_runs to it. Schemaless classify_input
kept public for schema-independent regression tests. Commit
fffb44f.
Bug 5 — DDL table-name slots didn't writes_table (this is
handoff-12 §2.2). drop column from Customers: offered every
table's columns. Same for rename/change column and the
relationship qualified-column slots (from T.col). Fix
(src/dsl/grammar/ddl.rs): writes_table: true on
TABLE_NAME_EXISTING and on DR_PARENT/DR_CHILD/AR_PARENT/
AR_CHILD table idents. The deliberately-documenting test
completion::tests::drop_column_from_offers_only_current_table_columns
now asserts per-table narrowing. Commit 216e7ba.
§3. CRITICAL: open finding needing a user decision
Partial entry words classify as DefiniteErrorAt. While
typing the first word of a command — q, qu, qui — the
input is classified DefiniteErrorAt(0), identical to a
genuinely unknown command (frobulate). Cause: the walker only
engages once a complete registered entry word is present
(try_walker_route returns None for qu, and the router
emits the synthetic unknown-command error). So the input pane
shows qu in error-red even though it is a valid prefix of
quit.
Tab-completion still works (qu → quit is offered), so the
user can recover. But the red overlay while typing the very
first word is jarring.
This is documented — not silently asserted as correct — in
app_commands::partial_entry_word_classifies_as_definite_error_but_completes.
The next agent (or the user) should decide: is partial-entry
red overlay acceptable, or should a partial that prefixes a
registered entry word classify as IncompleteAtEof? A fix
would need the router to recognise "this token is a prefix of
≥1 entry word, no other token typed yet" and classify
accordingly. Not done this session — flagged per CLAUDE.md's
"escalate, don't decide" rule.
§4. Honest scope disclosures
Two places where this session's matrix is lighter than handoff-12 §1's literal text. Neither was confirmed with the user as a deferral, so treat them as open:
4.1 Cursor positions: meaningful transitions, not every byte
offset. Handoff-12 §1's pseudo-code said
EVERY_TOKEN_BOUNDARY_AND_MID_TOKEN. The matrix covers
end-of-input positions at each meaningful typing transition,
plus partial-keyword mid-token cases (insert, qu,
create table Customers with pk). It does not exhaustively
walk every interior byte offset with the cursor. The judgment
call: end-of-input is where the user actually pauses and reads
the hint; interior-cursor behaviour is mostly the cursor-split
rendering logic, already covered by input_render tests. If the
user wants literal every-offset coverage, that is a follow-up.
4.2 Assertion (5) is parse-layer, not a live dispatch
differential. Handoff-12 §1 assertion (5) wanted
"parse_command_with_schema matches dispatch behavior — what
the grammar accepts equals what do_insert/do_update accept."
The matrix records parse_result (Ok/Err + command kind) in
every cell's snapshot, so grammar-acceptance is pinned. It does
not run the actual db::do_insert / do_update against the
same input and diff the two. A true differential would need a
test DB per cell (testcontainers-grade) — heavier. The
Phase-D-aligned grammar (Form B skips auto-gen, typed slots)
already mirrors do_insert's user_cols contract by
construction, so the risk is low, but this is a genuine gap
versus the literal spec.
§5. Deferred items — status vs. handoff-12 §2
§2.2 items — addressed this session
- DDL
writes_table— DONE (Bug 5 above). - Add/drop-relationship qualified-column narrowing — DONE
(
writes_tableon the endpoint table idents; the.colslots now narrow). Handoff-12 called this "low priority"; it fell out for free once the pattern was established.
§2.2 items — still open
- Form C is type-unaware. Confirmed and pinned by
insert_form_c::form_c_type_unaware_grammar_accepts_decimal_for_int_column—insert into Customers (3.14, …)parses Valid even though3.14lands in a non-real column slot; bind time catches it. Handoff-12 said "defer until users care." Still deferred — the next agent should confirm with the user whether to make Form C typed (requires Form A/C lookahead disambiguation in the walker). value_slot_serial/shortidForm-A-only wording. The pedagogical "Skippingid(auto-generated)…" Prose hint atinsert into T values (for an auto-gen table is still not added. Unchanged.
§2.1 carry-forwards — all still open, untouched this session
CommandNode.help_idnot consumed.Box::leakmemory growth per dynamic walk (the matrix's schema-aware walks exercise this more than before, but it's still bounded per walk; no arena yet).- Replay path is schemaless — note
classify_inputis now schema-aware (Bug 4), butruntime::run_replaystill calls the schemalessparse_command. Per-line schema re-fetch during replay is still unbuilt. - HintMode is detection-based, not node-attached.
- Ranker is scaffolding-only.
- Dead
parse.token.*catalog entries.
§6. New API surface (delta vs. handoff-12)
input_render::classify_input_with_schema(input, &SchemaCache) -> InputState— schema-aware classification.render_input_runsuses it. Schemalessclassify_inputretained.parse.custom.insert_form_a_missing_valuescatalog key (placeholdercolumns).- Walker
walk_repeatedIncomplete-on-mid-typing semantics (driver.rs) — see Bug 1. completion_probe/expected_at_inputsurfacetail_expectedforValidationFailedoutcomes — see Bug 3.
§7. How to take over
- Read this file.
- Read handoff-12, then 11 if you lack prior context.
- Read
CLAUDE.md— especially "Out of scope and non-blocking require user confirmation" and "Test-first debugging." This session followed both: every bug was a failing matrix cell first. - Run
cargo test— 989 passing, 0 failing, 1 ignored. - Run
cargo clippy --all-targets -- -D warnings— clean. - Decide §3 (partial-entry-word red overlay) with the user — it is the one open finding from this session.
- Confirm the §4 scope disclosures with the user — do they want exhaustive cursor coverage and/or a live dispatch differential, or is the current matrix sufficient?
- Then pick from §5 — Form C typing, the §2.1
carry-forwards (
help_id, arena, schemaless replay, node-attached HintMode).
Working with the matrix
- The matrix is one test binary. Run a single family with
cargo test --test typing_surface_matrix <family>. - New cells: add a
#[test]to the relevanttests/typing_surface/<family>.rs, callassess/assess_at_end, assert properties, end withcrate::snap!("<name>", a). - First run of a new snapshot fails (no baseline). Generate
baselines with
INSTA_UPDATE=always cargo test --test typing_surface_matrix <family>, then review the.snapfiles before committing. - When a matrix cell fails after a grammar change: if the new behaviour is correct, update the snapshot; if it's a regression, the snapshot just earned its keep.
§8. Commit narrative
0b15ce0— three coupled fixes (walk_repeated rollback, Form C/A disambiguation, ValidationFailed expected-set) that the Form A matrix surfaced.24e641b— matrix infrastructure + Form A (23 cells).fffb44f— schema-awareclassify_input(Bug 4), surfaced by Form B.a9a04cf— Form B (12 cells).37db2f5— Form C + update + delete (34 cells, no new bugs — the data-mutation family was already well-shaped).216e7ba— DDLwrites_tablefix (Bug 5 / §2.2).c7ecc64— create table + DDL + app coverage (55 cells).
The pattern, again: the bugs all came from the matrix, not the pre-existing suite — which is the whole point handoff-12 §1 was making.