Files
rdbms-playground/docs/handoff/20260523-handoff-32.md

348 lines
19 KiB
Markdown
Raw Permalink Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Session handoff — 2026-05-23 (32)
Thirty-second handover. A long, productive session that pushed
**ADR-0033 Phase 3 from 3f through 3i** (DELETE, RETURNING, UPSERT,
the three DML diagnostics), wrote **ADR-0034** (history journal),
fixed a pre-existing self-referential-cascade bug, and closed with a
`/runda` adversarial round that found **six** latent
INSERT-target-scope false-positives.
The next session should **start sub-phase 3j** (dispatch wiring —
make `insert`/`update`/`delete` shared DSL/SQL entry words and remove
the dev scaffold words), then **3k** (verification sweep). See §4.
## §1. State at handoff
**Branch:** `main`. **Tests: 1613 passing, 0 failing, 1 ignored.**
**Clippy:** clean (`cargo clippy --all-targets -- -D warnings`).
**Commits this session** (newest first). `origin/main` is at
`6b8888f` (3h), so the **six 3i commits are local-only**:
```
8d17583 walker: 3i /runda DA round — fix INSERT-target scope confusion (6 cases) (local)
4fa0aa0 db+walker: 3i DA pass — not_null PK false-positive fix + arity hardening (local)
cfd925c grammar+db: 3i — DML column-existence + cross-cut verification (local)
2d1112d grammar+db: 3i — not_null_missing diagnostic + TableColumn constraints (local)
6db8253 grammar+db: 3i — insert_arity_mismatch diagnostic (ADR-0033 §8.1) (local)
be63315 grammar+db: 3i — auto_column_overridden diagnostic (ADR-0033 §8.2) (local)
6b8888f grammar+db: 3h — UPSERT ON CONFLICT DO NOTHING / DO UPDATE (origin/main)
fd8b74b grammar+db: 3g — RETURNING on INSERT/UPDATE/DELETE (ADR-0033 §5)
b935090 docs: ADR-0034 — history.log as a complete journal; replay reads ok-only
62f09be db: fix self-referential cascade over-count + SQL-delete render test
2c86a13 grammar+db: 3f — SQL DELETE + cascade summary (ADR-0033 §1/§7)
```
(Unpushed commits are a normal working state; pushing is the user's
step — do not prompt about it.)
## §2. Process lessons (read first)
1. **`/runda` (a forced extra DA round) earned its keep.** After I
reported 3i "done with a thorough DA pass," the user invoked
`/runda`. Probing the **interaction** of the new 3i diagnostic
passes with the *existing* `schema_existence` pass under
`INSERT … SELECT` surfaced **six** false-positive bugs I had
missed (§3i below). The lesson: my first DA pass tested each new
pass in isolation; the bugs lived in the **cross-cut** between new
and old passes on a compound input. **Always exercise new
diagnostics against the awkward compound forms** (`INSERT … SELECT
… ON CONFLICT … DO UPDATE … RETURNING`), not just the simple ones.
2. **Probe-then-assert works.** Each `/runda` bug was confirmed with
a throwaway test asserting the *correct* behaviour, watching it
FAIL, then fixed, then the test kept. Don't fix on suspicion.
3. **Escalate ADR-mechanism mismatches; don't silently "improve".**
Three times this session the implementation revealed the ADR's
prescribed mechanism was wrong or insufficient (3f cascade,
3h `excluded` scoping, history logging). Each was escalated to the
user and resolved by ADR amendment / a new ADR / a recorded
decision — never silent drift. The user values this; keep doing it.
## §3. What shipped this session (detail)
### 3f — SQL `DELETE` + cascade (`2c86a13`) + self-ref fix (`62f09be`)
- `src/dsl/grammar/sql_delete.rs` (`FROM <table> [WHERE] [;]`),
`Command::SqlDelete`, `Request::RunSqlDelete`, `do_sql_delete`.
- **ADR-0033 Amendment 2** (written this session): §7's WHERE-injected
pre-count rested on a *false premise* (it claimed the DSL
`do_delete` builds pre-counts from the typed `Expr` — it does not;
it uses **before/after child row-count diffing**). The pre-count
would also have broken §2's parity promise (reporting `SET NULL`
the DSL path doesn't). Replaced by mirroring `do_delete`'s
count-diff exactly → exact parity, no WHERE extraction, **risk R2
withdrawn**. `ON DELETE CASCADE` row removals only; SET NULL
deferred for both paths. Render shared via `CommandOutcome::Delete`.
- **Self-ref cascade bug** (pre-existing in `do_delete`, surfaced by
3f's DA): a self-referential `ON DELETE CASCADE` FK conflated the
directly-deleted rows with cascade rows. Fixed in **both**
`do_delete` and `do_sql_delete`: when `rel.other_table == target`,
`rows_changed -= rows_affected`.
### 3g — `RETURNING` (`fd8b74b`)
- Shared `RETURNING_CLAUSE` (reuses Phase-2 `PROJECTION_LIST`, made
`pub(crate)`); optional tail on all three SQL DML shapes.
- `returning: bool` on the three `Command::Sql*` variants (per ADR
§5), set by the ast-builders, threaded Request→client→worker.
- `run_returning` (db.rs) collects the returned rows as a
`DataResult` (a RETURNING DML mutates *and* yields in one
`query_map`); reuses `resolve_select_column_types`. `DeleteResult`
gained a `data` field, rendered alongside the cascade summary.
- **Follow-set fix:** `returning` added to the table-source and
projection bare-alias follow-sets (`sql_select.rs`), so an
`INSERT … SELECT … FROM t RETURNING …` stops before RETURNING
instead of reading it as `t`'s alias.
- **Auto-fill × RETURNING:** `build_sql_insert`'s `row_source` now
stops before the first trailing clause (ON CONFLICT or RETURNING),
and the shortid-auto-fill rewrite re-appends the whole trailing
tail so generated ids surface in `RETURNING *`.
### 3h — UPSERT `ON CONFLICT` (`6b8888f`)
- `on_conflict_clause` on `SQL_INSERT_SHAPE`: optional `(col,…)`
target (distinct `conflict_target_column` role so it never enters
`listed_columns`), `DO NOTHING` / `DO UPDATE SET … [WHERE …]`.
- **`do` is factored OUT of the action Choice** (`DO ( NOTHING |
UPDATE … )`) because a Choice with branches sharing a `do` prefix
trips the `walk_seq`/`walk_choice` interaction (ADR-0033
Amendment 1) — a branch matching `do` then failing its 2nd token
returns a hard `Failed`, blocking fall-through. **Re-learn:** any
`Choice` whose branches share a leading token must factor it out.
- Worker runs the UPSERT verbatim (SQLite-native); no new exec path.
- **`excluded` pseudo-table:** resolves to the target's columns
inside the DO UPDATE action and completes at `excluded.|`, but stays
flagged outside it. **Mechanism deviation (recorded):** the plan
prescribed a `from_scope` `TableBinding` push; I used **byte-range
scoping** in the diagnostic pass (`upsert_excluded: (target,
do_update_start, returning_or_end)`) + `current_table_columns` in
completion. Same behaviour, no walker scope-frame change.
### 3i — three DML diagnostics + cross-cut (`be63315`→`8d17583`)
All under ADR-0033 §8. Each diagnostic is a separate post-walk pass in
`src/dsl/walker/mod.rs`, wired into `walk` next to
`schema_existence_diagnostics`. Catalog keys in
`src/friendly/keys.rs` + `src/friendly/strings/en-US.yaml`.
- **`auto_column_overridden`** (WARN, §8.2) — `dml_auto_column_diagnostics`:
an explicit value for a `serial`/`shortid` listed column.
- **`insert_arity_mismatch`** (ERROR, §8.1) — `dml_insert_arity_diagnostics`:
per-tuple VALUES arity (each offending row emits its own diagnostic;
the walk stops at the first depth-0 keyword so `ON CONFLICT`/`RETURNING`
parens aren't mis-counted) + INSERT…SELECT projection arity (plain
SELECT; WITH-prefixed deferred to avoid false positives).
- **`not_null_missing`** (WARN, §8.3) — `dml_not_null_missing_diagnostics`:
omitted NOT-NULL-no-default column; **excludes serial/shortid
(auto-filled) and defaulted columns**. Needed a **data-model
extension**: `TableColumn` (in `SchemaCache`) gained `not_null` +
`has_default` (with a `TableColumn::new` constructor), populated in
`build_schema_cache` from `ColumnDescription`. **DA fix:** the cache
builder must use `c.notnull` ALONE, *not* `|| c.primary_key` — an
`int` PK is an `INTEGER PRIMARY KEY` rowid alias that auto-fills, so
treating any PK as required false-flags it.
- **`dml_target_column_diagnostics`** (ERROR) — validates
`insert_column` / `update_set_column` / `conflict_target_column`
against the INSERT target (the cross-cut "schema-existence on INSERT
VALUES" gate item; also closed DA finding #12, the UPSERT DO UPDATE
SET divergence).
#### The `/runda` round — INSERT-target scope confusion (commit `8d17583`)
**One root cause, six manifestations, all pre-existing
false-positives.** The INSERT target is recorded under role
`insert_target_table`, NOT as a diagnostic `bindings` entry. So refs
that should resolve to the *target* row were checked against the
statement's `bindings` — which for an `INSERT … SELECT` are the
SELECT's *source* tables. New helper **`bare_ref_insert_target`**
re-scopes a ref onto the target when it sits in a target-referencing
region (the DO UPDATE byte range, or an INSERT's RETURNING list).
Applied to: (1) insert column list, (2) ON CONFLICT target,
(3) DO UPDATE SET-RHS / WHERE bare refs (also closed #12's residual),
(4) RETURNING bare refs, (5) target-qualified `t.col`, (6)
target-qualified `t.*`. Each has positive + negative tests.
## §4. Sub-phase 3j — the NEXT job (dispatch wiring)
Read `docs/plans/20260520-adr-0033-phase-3.md` "Sub-phase 3j" and
ADR-0033 **Amendment 1** (the category-grouped dispatch mechanism —
this is the machinery 3j uses).
**Goal:** make `insert` / `update` / `delete` **shared DSL/SQL entry
words** and **remove the dev scaffold words** (`sqlinsert`,
`sql_update`, `sql_delete`).
**Current state of the scaffold** (what 3j removes/rewires):
- `src/dsl/grammar/data.rs`: `SQL_INSERT` (entry word `sqlinsert`),
`SQL_UPDATE` (`sql_update`), `SQL_DELETE` (`sql_delete`)
CommandNodes, registered `Advanced` in `REGISTRY`
(`src/dsl/grammar/mod.rs`). Their ast-builders `build_sql_insert` /
`build_sql_update` / `build_sql_delete` reconstruct the real keyword
(`format!("insert {tail}")` etc.) — **once the real entry word is
shared, these collapse to `source.trim()`** (the dev-word prefix
reconstruction goes away).
- The DSL CommandNodes `data::INSERT` / `UPDATE` / `DELETE` (entry
words `insert`/`update`/`delete`) are `Simple`.
**The mechanism (ADR-0033 Amendment 1):** dispatch is
**category-grouped** in `walker::walk`. Each `REGISTRY` entry carries
a `CommandCategory::{Simple, Advanced}`. A **shared entry word has a
node in BOTH groups** — e.g. a `Simple` DSL `insert` node and an
`Advanced` SQL `insert` node, both entry word `insert`. Selection:
- **Simple mode** → only `Simple` candidates; if none but an
`Advanced` exists, emit the `advanced_mode.sql_in_simple` hint.
- **Advanced mode** → try `Advanced` candidate(s) first, then the
`Simple` candidate as fallback (`delete from t where id=1` → SQL;
`delete from t --all-rows` → falls through to DSL).
**3j scope (from the plan):**
1. Register the SQL shapes under the real entry words `insert` /
`update` / `delete` with category `Advanced` (alongside the
existing `Simple` DSL nodes). The dispatcher already supports two
candidates per entry word (Amendment 1 / how `select`+`with` work).
2. Remove the `sqlinsert` / `sql_update` / `sql_delete` dev words +
their CommandNodes; simplify the three `build_sql_*` ast-builders
to `source.trim()` (no keyword reconstruction).
3. **De-duplicate the completion entry-word lists** — once a word
appears twice in `REGISTRY` (Simple + Advanced), the entry-word
candidate lists must not show it twice (flagged in ADR-0033
Amendment 1 consequences; the `command_for_entry_word` /
completion filters in `mod.rs` are the sites).
4. Verify: existing DSL insert/update/delete tests still green
(unchanged inputs/outputs); SQL inputs route via the shared word in
Advanced; a structurally-ambiguous input (`delete from t where id =
1`) routes SQL-first; a DSL-only input (`delete from t --all-rows`)
falls through to DSL.
**3j heads-up — the diagnostics keyed on roles will keep working:**
the 3i passes match on roles (`insert_target_table`, `insert_column`,
etc.) that come from the SQL shapes, so they fire only when the SQL
shape matched. A DSL insert (different roles) won't trip them. But
**re-run the full diagnostic test set after 3j** — the dev-word →
real-word switch changes how the path is built (the entry-word offset
/ no keyword-prefix reconstruction), and the ast-builders' `source`
handling changes.
**Tests that hard-code the dev words** must be migrated in 3j:
`tests/sql_insert.rs`, `tests/sql_update.rs`, `tests/sql_delete.rs`
use `sqlinsert …` / `sql_update …` / `sql_delete …` inputs and parse
helpers; the walker tests in `mod.rs` use `sqlinsert …` inputs. After
3j they become `insert …` / `update …` / `delete …` (Advanced mode).
This is a sizeable but mechanical migration — consider an Explore/
general-purpose sub-agent for the input-string swaps, compile-verified.
## §5. Sub-phase 3k — verification sweep (after 3j)
Per the plan: fill the cross-cut verification matrix, produce the
phase-exit verification report at `docs/handoff/<date>-phase-3-
verification.md`, final DA review (genuine, not rubber-stamp), confirm
zero failures / zero skips / no regression from the Phase-2 baseline
(1446/0/1 — note this session's count is 1613 and climbing as tests
were added per sub-phase). Clippy clean.
## §6. Decisions settled this session (do not re-litigate)
- **3f cascade = count-diff parity** (ADR-0033 **Amendment 2**),
user-confirmed. SET NULL reporting **deferred for
both DSL and SQL paths** (preserves parity); a future enhancement
touches both `do_delete` and `do_sql_delete` together.
- **`returning: bool`** on the Command variants (ADR §5), not a
text field — the auto-fill rewrite preserves the trailing tail
instead.
- **`excluded` scoping = byte-range** (not the plan's TableBinding
push) — recorded as a mechanism deviation; behaviour matches §9.
- **History journal = ADR-0034** (written, **Accepted**): see §7.
- **not_null = `c.notnull` only** (not `|| primary_key`) — int-PK
false-positive avoidance.
## §7. Tracked deferred items (nothing lost)
These are **parked, not dropped** — pick up after Phase 3 (3k) per the
user's "we'll take care of the extra tasks when phase 3 is complete".
- **TASK #8 — Implement ADR-0034 (history journal).** `history.log`
is success-only, but the in-memory Up/Down recall ring records every
submission, and the ring is re-seeded from `history.log` on open —
so failed commands are recallable in-session but **lost across
sessions**. ADR-0034 (`docs/adr/0034-history-journal-and-replay-
filter.md`, **Accepted**): make `history.log` a complete journal
tagged `ok`/`err`; hydration reads all, replay reads `ok` only.
Code = two sub-tasks (journal-failures+filtering; replay-format).
Successful commands stay journalled transactionally by the worker;
failures journalled `err` best-effort from the runtime/app error
path (a parse failure never reaches the worker). Existing all-`ok`
logs need no migration.
- **TASK #9 — Fix broken replay + add the missing test.** `run_replay`
(`src/runtime.rs`) parses each whole trimmed line through the DSL
parser with NO understanding of the `<ts>|<status>|<source>` format,
so `replay history.log` fails on line 1 — despite ADR-0006 / the
`resolve_replay_path` comment promising it works. **No test feeds
the pipe format to replay** (the `replay_history_log_records_
subcommands_only` test only checks what replay *writes*). Fix:
`run_replay` accepts BOTH a journal record (extract+unescape source,
skip unless `status==ok`) and a bare command line; detection by the
leading timestamp+status prefix. Test-first with a real
`history.log`.
### Residuals / observations (lower priority, all discussed)
- **#12 residual is CLOSED** — DO UPDATE SET-RHS / WHERE bare refs are
now validated against the target (the `/runda` round). Task #12 is
marked completed.
- **`excluded` completion soft-leak:** completion offers the target's
columns for `excluded.|` even outside DO UPDATE (the diagnostic
pass strictly scopes it; completion is the softer surface). Untested,
advisory, acceptable.
- **Old-project migration** (from handoff 30 §4): the pre-`check_expr`
3-column `__rdbms_playground_columns` schema; deferred to the
migration framework (ADR-0015 Iter 6).
- **Two-line hint box** (ADR-0022 Amendment 2) — deferred.
- **`writes_user_listed_column` flag** stays `false` (handoff 31 §7) —
VALUES-against-listed-columns completion narrowing, nothing needs it.
## §8. Process pins (unchanged, still binding)
- **Confirm every commit.** Propose the message; wait for the
go-ahead. **No auto-commit at sub-phase gates** (this overrides the
plan's standing commit pre-authorization — the user confirmed each
commit this whole session).
- **Push is the user's step.** Never push; never prompt about it.
- **No AI attribution** in commits (global rule).
- **Test-first.** Reproduce a bug with a FAILING test before fixing;
for features, exit-gate tests before "done". The `/runda` round is
the model: probe → confirm fail → fix → keep the test.
- **Core walker changes** (`walk_seq` / `walk_choice` /
`walk_repeated`) need explicit user OK before coding. (New *node
variants* and new *post-walk passes* do not — those were added
freely this session.)
- **Escalate ambiguity / ADR-mechanism mismatches; never classify
work out of scope without user confirmation.** Resolve mechanism
changes by ADR amendment BEFORE coding (3f, 3h, history all did).
- **`/runda`** (the extra-DA-round skill) is a user-invoked tool — do
not invoke it yourself, but internalise its lesson: re-check
cross-cut interactions.
## §9. How to take over
1. **Read, in order:** this file →
`docs/adr/0033-sql-dml-grammar.md` (esp. **Amendment 1** —
dispatch — and **Amendment 2** — cascade) →
`docs/plans/20260520-adr-0033-phase-3.md` "Sub-phase 3j" + "3k" →
`docs/adr/0034-history-journal-and-replay-filter.md` (parked) →
`CLAUDE.md`.
2. **Baseline:**
```
cargo test # expect 1613 passing / 0 failing / 1 ignored
cargo clippy --all-targets -- -D warnings # clean
```
3. **Start 3j** per §4. The dispatch machinery (Amendment 1,
category-grouped) already exists and is how `select`/`with` work —
3j adds the SQL `insert`/`update`/`delete` nodes as `Advanced`
alongside the `Simple` DSL nodes, removes the dev words, simplifies
the three `build_sql_*` builders, de-dups completion entry-word
lists, and migrates the dev-word test inputs. Expect a sizeable but
mechanical test migration (the three `tests/sql_*.rs` + walker
tests hard-code `sqlinsert`/`sql_update`/`sql_delete`).
4. **Re-run the full diagnostic suite after 3j** (§4 heads-up) — the
path-shape change is the risk.
5. Then **3k** (verification sweep + phase-exit report).
6. **Escalate** anything not settled in ADR-0033 / the plan; the user
wants mechanism mismatches surfaced, not silently fixed.