diff --git a/docs/adr/0048-seed-fake-data-generation.md b/docs/adr/0048-seed-fake-data-generation.md index 0226e2c..9dfd1ed 100644 --- a/docs/adr/0048-seed-fake-data-generation.md +++ b/docs/adr/0048-seed-fake-data-generation.md @@ -2,7 +2,7 @@ ## Status -**Accepted (2026-06-11); Phase 1 implemented (2026-06-11).** Design +**Accepted (2026-06-11); Phase 1 + Phase 2 implemented (2026-06-11).** Design settled with the user across an extended fork dialogue (every decision below was escalated and user-chosen), then hardened by a pre-build `/runda` Devil's-Advocate pass that found six blockers — undo @@ -26,11 +26,39 @@ ADR decisions (D5/D15/D16/D17 + atomicity + zero-count), all closed. **Implemented in Phase 1:** the whole-row `seed [count] [--seed ]` form and every D1–D18 decision *except* the two -deferred-to-Phase-2 surfaces below. **Deferred to Phase 2** (designed -here, not yet built): the **`set` override clause** (D2) and the -**`
.` column-fill** form (D1 form 2). Further SD2 -increments (custom user generators, NULL injection, multi-locale, -recursive parent auto-seed) remain out of scope (see Out of scope). +Phase-2 surfaces. + +**Phase 2 implemented (2026-06-11):** both remaining surfaces — the +**`set` override clause** (D2: fixed value / pick-list / named +generator / range, quoted literals, type-aware) and the +**`
.` column-fill** form (D1 form 2: an UPDATE over +existing rows, refusing PK/autogen targets, empty-table no-op, one undo +step). The named-generator vocabulary (D9) lives in `src/seed` +(`KNOWN_GENERATORS` / `generator_for_name`); a new range `Generator` +(`src/seed/generators.rs`) backs `between`; the override clause is +folded from the flat matched path (`build_seed_overrides`, +`src/dsl/grammar/data.rs`) and applied to the per-column plan +(`apply_seed_overrides`, `src/db.rs`), with column-fill in +`do_seed_column_fill`. Full ambient wiring: completion (the generator +vocabulary after `as`, the `set`/`.col` column slots), highlighting +(`HighlightClass::Function` → `tok_function`, the generator slot), the +validity indicator (`IdentSource::Generators` — an unknown name flagged +`[ERR]`), help, and parse-error pedagogy rows. The D13 advisory now +carries its Phase-2/3 wording (points at `set` and the column-fill +repair). A post-implementation `/runda` pass then added one +user-chosen refinement: a **bounded override on a UNIQUE column** (a +fixed value / too-short pick-list) is now a **friendly error** rather +than a silent uniqueness cap (see D2). **2400 tests pass / 0 fail / 0 +skip; clippy clean.** Two +implementation refinements vs. this ADR's wording, both met the +user-facing contract: dates in the range form are **quoted** (the D2 +amendment, above — no date-literal token exists); and the `set` value +slots reuse `update`'s typed `current_column_value` (no spurious +column-ref match) rather than the raw expression operand. + +Further SD2 increments (custom user generators, NULL injection, +multi-locale, recursive parent auto-seed) remain out of scope (see Out +of scope). Closes `requirements.md` **SD1** and delivers the core of **SD2** (per-type generators, determinism, the `fake`-backed catalogue). It @@ -158,15 +186,37 @@ is nothing new to learn: | Fixed value | `set status = 'pending'` | every row gets the constant | | Pick-from-list | `set role in ('admin', 'editor', 'viewer')` | uniform random choice from the list | | Explicit generator | `set work_addr as email` | force a named generator (D9) | -| Range | `set price between 10 and 100` | uniform in range; **also dates** — `set signup between 2023-01-01 and 2024-12-31` | +| Range | `set price between 10 and 100` | uniform in range; **also dates** — `set signup between '2023-01-01' and '2024-12-31'` | Multiple clauses combine: `seed users 20 set role in ('admin', -'user'), status = 'active', signup between 2023-01-01 and 2024-12-31`. +'user'), status = 'active', signup between '2023-01-01' and +'2024-12-31'`. + +**Override × UNIQUE capacity (post-implementation `/runda`, user-chosen: +"friendly error").** A *bounded* override — a fixed value, or a +pick-list — on a **single-column-UNIQUE** target (a `UNIQUE` column or a +single-column PK) that offers fewer **distinct** values than the row +count cannot fill the run; rather than let the D10 uniqueness machinery +silently cap it (e.g. `seed users 100 set email = 'x'` → 1 row), seed +**refuses up front** with a friendly error pointing at the fixes (use a +generator, or a longer list). Generators and ranges are treated as +effectively unbounded sources — if one genuinely exhausts, the D14 +distinct-combination cap still applies. Compound uniqueness is exempt +(the *other* key columns can still vary). **Quoting (fork, user-chosen: "quoted, grammar-consistent").** Text values and list items are **quoted string literals** (`'admin'`), -exactly as everywhere else in the DSL — numbers and dates stay -unquoted. This reuses the ADR-0026 expression grammar **unchanged**: +exactly as everywhere else in the DSL — only **numbers** stay +unquoted. **Amendment (2026-06-11, Phase 2 build):** the original +wording said "numbers *and dates* stay unquoted", but this DSL has +**no date-literal token** — `Value` is `Number`/`Text` only, and a +date is a **quoted string** validated by `bind_date` (`'2023-01-01'`) +everywhere else (insert / update / `where`). An unquoted `2023-01-01` +lexes as `2023`,`-`,`01`,… and cannot parse. So **dates in the range +form are quoted** (`between '2023-01-01' and '2024-12-31'`) — which is +in fact *more* faithful to this decision's own "quoted, +grammar-consistent" principle. Numbers remain unquoted (`NumberLit`). +This reuses the ADR-0026 expression grammar **unchanged**: the DA pass confirmed that the `in (...)` form's operands are typed value slots, so a *bare* `admin` would parse as a **column reference** (→ "unknown column"), not a string. Quoting is therefore not a style @@ -521,23 +571,26 @@ ADR-0045 showed "claimed verified" is not verified): Design is whole; the **implementation** is phased into reviewable, test-first commits: -1. **Core whole-row seed** — grammar/AST/executor; type-based - generation + the `fake`-backed name heuristics (D7/D8/D11); - identifier uniqueness (D10) + constraint uniqueness; FK sampling - (joint tuples) + empty-parent error + junction distinct-combos - (D14); `--seed` determinism (D4); default count + cap + zero-no-op - (D6/D1); required-column block guard (D1); **undo batch (D15)**; - **replay-as-data-write classification (D16)**; **CHECK derive / - friendly-fail (D17)**; **capped auto-show (D18)**; the enum/CHECK - advisory in its **Phase-1 wording** (D12/D13); full ambient wiring; - both modes. -2. **The `set` override clause** (D2) — value / list / generator / - range, type-aware, with completion + highlight + validity for the - generator-name slot. -3. **Column-fill mode** (`seed
.`, D1 form 2) — the - UPDATE path. +1. **Core whole-row seed** *(done, Phase 1)* — grammar/AST/executor; + type-based generation + the `fake`-backed name heuristics + (D7/D8/D11); identifier uniqueness (D10) + constraint uniqueness; FK + sampling (joint tuples) + empty-parent error + junction + distinct-combos (D14); `--seed` determinism (D4); default count + cap + + zero-no-op (D6/D1); required-column block guard (D1); **undo batch + (D15)**; **replay-as-data-write classification (D16)**; **CHECK + derive / friendly-fail (D17)**; **capped auto-show (D18)**; the + enum/CHECK advisory in its **Phase-1 wording** (D12/D13); full + ambient wiring; both modes. +2. **The `set` override clause** (D2) *(done, Phase 2)* — value / list / + generator / range, type-aware, with completion + highlight + + validity for the generator-name slot. +3. **Column-fill mode** (`seed
.`, D1 form 2) *(done, + Phase 2)* — the UPDATE path. -Each phase is independently green before the next. +Each phase is independently green before the next. (Phases 2 and 3 +landed together — they share the `set`-override executor machinery, so +splitting them risked a state where `set` parsed but column-fill +silently no-op'd.) ## Testing (ADR-0008 tiers 1–3; test-first) diff --git a/docs/adr/README.md b/docs/adr/README.md index 5eb8778..101ee05 100644 --- a/docs/adr/README.md +++ b/docs/adr/README.md @@ -53,4 +53,4 @@ This directory contains the project's ADRs, recorded per - [ADR-0045 — `create m:n relationship` convenience command (C4)](0045-mn-convenience.md) — **Accepted + implemented 2026-06-10** (closes `requirements.md` **C4**; all forks user-confirmed + a `/runda` DA pass that verified the `do_create_table` reuse against code and corrected the "no PK-less tables" assumption — advanced SQL `create table t (a int)` has none, so a parent-PK guard is retained). Implementation corrected a second ADR premise: "the walker already dispatches multiple nodes per entry word" held only in *advanced* mode — two simple-mode spots (dispatcher `decide`, completion continuation-merge) assumed ≤1 DSL form per entry word and were generalized **behaviour-preservingly** (dispatch reduces to the old single-candidate commit; completion merge gated on `simple_count > 1`). Junction echo wired (`render_create_m2n`, round-trips as SQL). `create m:n relationship from to [as ]` generates a junction table with one FK column per parent PK column, a **compound PK over all the FK columns** (the textbook junction — the pair is unique, no duplicate links), and **two 1:n relationships**, all in **one transaction = one undo step** (built by reusing `do_create_table`, which already takes `foreign_keys` + writes relationship metadata — no batch bracketing). Forks all user-chosen: junction PK = compound-over-FKs (vs surrogate serial / no PK); referential actions = **`CASCADE`** on delete+update (vs NO ACTION / RESTRICT); naming = auto `{T1}_{T2}` + optional `as` (vs auto-only); available in **both modes** (Simple-category DSL, like the sibling relationship commands). FK columns named `{parent_table}_{pk_column}` (disambiguates shared `id`; generalises to compound parents via ADR-0043), typed via `fk_target_type` (ADR-0011). A distinct `Command::CreateM2nRelationship` (not lowered to `CreateTable`) preserves command identity (X5) and lets the teaching echo speak in m:n terms. Cross-cutting wiring enumerated: separate `CREATE_M2N` `CommandNode` (own `help_id`/`usage_ids`), `("m","m:n")` completion composite, `HintMode`s, grammar-driven highlighting, `help`/`help create`, `parse_error_pedagogy` near-miss matrix, teaching echo. OOS: **self-referential m:n** (`from T to T`) refused outright (user-confirmed "full stop" — directional column-naming is more than this beginner convenience warrants); per-relationship action overrides; extra junction payload columns; m:n diagram echo; renaming the auto-generated relationships - [ADR-0046 — Schema sidebar focus/navigation mode and responsive input & hint layout (UI #20/#21/#23)](0046-sidebar-navigation-and-responsive-input-hint.md) — **Accepted + implemented 2026-06-10, phased A→B→C** (8 commits `9f5f76b`…`22bec61`; closes Gitea **#20** hint jumpiness, **#21** left-column improvements, **#23** long input — all forks user-confirmed, including the persistent show/hide toggle which is **deferred**: the Ctrl-O peek covers #21's "keystroke to show and hide"). Two decisions landed differently from the draft (recorded inline): relationship data on **`App`** not `SchemaCache` (DB2); the nav overlay clears **only the sidebar strip + a one-column gutter**, panels staying visible behind (DC2). Treats the three UI issues as one coupled decision because they share the terminal's width/height budget. **Phase A (input & hint):** the hint panel's height becomes a function of **terminal geometry, fixed between resizes** (not of hint content), eliminating the #20 jump at its source — measured catalog shows ≥ ~54-col right-column width never needs > 2 hint lines, so 3 lines is a rare narrow-terminal-only case; height buckets `H<40` compact (input 1 row + horizontal scroll / hint 2) vs `H≥40` comfortable (input 2 rows soft-wrap / hint 2), output `Min(5)` honoured first under degradation; input gains horizontal scroll (`input_scroll_offset`, single logical `String` — **not** I1 multi-line) and 2-row soft-wrap display when tall, preserving ADR-0027's 6-col indicator reserve. **Phase B (sidebar):** the 26-col Tables column is **kept but made optional and richer** (not deleted — pedagogy wins ties) — **width-derived session-only** visibility (visible iff width > 90 or a Ctrl-O peek is active — no stored field; hides at width ≤ 90 so the 90-col screencasts drop it; ADR-0015 format untouched), plus a **relationships panel** rendered narrow with endpoints broken at the arrow, ellipsized — a **separate sibling panel** that **overrides S2**'s nested-list extension model (relationships are cross-table). the full records live on a new **`App.relationships`** field (revised from the ADR's original `SchemaCache.relationship_details` at implementation — `SchemaCache` is walker-facing and needs only the names, kept in `relationships: Vec`; details are UI-only, so `App` mirrors `app.tables` and avoids ~23 fixture edits), delivered by `Database::read_all_relationships` + an `AppEvent::RelationshipsRefreshed`; the two left panels split vertically with the relationships panel floored at 5 rows ("(none)" when empty) and capped at 50 % of the column (DB4). **Phase C (navigation mode):** **`Ctrl-O`** enters a focus cycle (Input → Tables → Relationships → Input; `Esc` exits) orthogonal to the ADR-0003 input mode — **`Ctrl-B` was rejected on review as the default tmux prefix** (unreachable inside tmux); the focused panel **expands to ~40–50 cols as a `Clear` overlay** (right panels stay unchanging underneath) and scrolls via **Up/Down (line) + PageUp/PageDown (page)** (context-rebind, reusing the output-scroll viewport mechanism), with an accent focus border; all non-nav keys inert in nav mode (and nav keys inert while a modal is open). Forks all user-chosen: keep-optional-richer (vs remove/narrow); navigation-mode (vs modeless modifier scroll); `Ctrl-O` (Ctrl-B rejected = tmux prefix); overlay (vs layout re-split); inert-non-nav-keys; geometry-fixed hint height; `H<40/≥40` thresholds; session-only persistence; Up/Down line-scroll; **separate relationships panel overriding S2**; **no hint-area toggle** (S4's stale "keyboard-toggleable" claim struck — never implemented, unwanted). A pre-build `/runda` DA pass drove these corrections: caught the `Ctrl-B`/tmux collision, the `SchemaCache` retype that would have broken completion, the 2-row-input/indicator placement, the missing nav-mode key disposition + modal gate, and three unreferenced requirements (S1 evolved, S2 overridden, S4 corrected); also cross-checked open issue **#22** (overlay/annotation layer — separate ADR, adjacent). OOS: true multi-line input (I1); readline shortcuts (I1b); cross-session sidebar persistence; output as a third nav focus; relationship search/edit from the panel; hint-area toggle; #22's annotation layer. Accepted consequence: the 90-col visibility threshold makes a terminal's output *narrower* when widened across the boundary (sidebar appears) - [ADR-0047 — Demonstration overlay layer (keystroke badges + step captions)](0047-demonstration-overlay-layer.md) — **Accepted 2026-06-10; implemented 2026-06-11, phased A→B→C (closes Gitea #22)** (commits `f879d54`→`2d0f4b2`; no `requirements.md` item — tracked by issue + ADR per convention; all forks user-confirmed + a pre-build `/runda` pass that produced 10 tightening findings and a whole-implementation `/runda` pass that returned PASS, no blockers). An in-app **demonstration mode** (`--demo` flag / `RDBMS_PLAYGROUND_DEMO` env, **off by default, zero footprint when off**) that renders two transient overlays so `autocast` screencasts — and live teaching, and a future guided-lesson system — can show otherwise-invisible interactions. **Keystroke badges** (`[TAB]`, `[ENTER]`, `[UP]`, …): **automatic, app-detected** over a fixed set of glyph-less keys (the app already sees every key, so it re-records for free), label via a pure `demo_badge_label(&KeyEvent)`; the badge **auto-expires on a ~1.5 s timer** that extends the runtime's existing time-boxed-`recv` arm condition (`debounce.is_armed() || badge_pending`; expiry `Instant` in the runtime, `App.demo_badge` the render mirror — mirroring the `input` vs `input_indicator` split). **Step captions**: a **stealth, control-code-delimited input buffer** toggled by **`Ctrl+]`** (byte `0x1D` → arrives as `Char('5')+CONTROL`, verified against crossterm 0.29 `parse.rs:110-113`; chosen over `Ctrl+!`, which is **not a single ASCII byte so autocast cannot send it** — the same wall as arrow keys, R4) — typed characters accumulate **invisibly** (prompt untouched, no echo/history), `Backspace` edits, other keys inert, a second `Ctrl+]` **commits** to the caption box (empty commit dismisses); lives in pure-sync `App::update()`, **intercepted before the modal gate** so captions/badges work **over the load picker** (the `#24` projects cast). Both render as **floating flat black-on-yellow rectangles** (solid fill, **no border glyphs** — a one-cell text margin, deliberately unlike the app's bordered panels; user decision post-build, `2d0f4b2`) **at the output panel's inner bottom-right**, drawn **last over modals**, badge **stacked above** the caption, **no layout reflow**; caption **word-wraps to ≤ 3 lines** (3–5 rows), badge fixed 3 rows; clamp/skip guard for tiny terminals; a new **`App.last_output_area: Rect`** (set in `render_output_panel`) gives the top-level draw the anchor. Caption persists **until the next keystroke**; badge suppressed while capturing. Forks all user-chosen: `--demo` activation (vs hidden command / chord); automatic badges (vs scripted); stealth buffer (vs typed-command / preloaded-file); floating bottom-right boxes (vs HUD / banner / subtitle); `Ctrl+]` trigger; wrap-to-3-line captions; ~1.5 s badge / next-keystroke caption timing. Tested test-first across Tier 1 (label fn, capture state machine incl. over-modal + demo-off gate, nearest-deadline helper), Tier 2 (insta snapshots: badge/caption/both-stacked at 90×26 light+dark, short-terminal clamp), Tier 3 (`--demo` plumbing, badge set/suppressed, caption-without-input wiring), CLI (`--demo` parse + env fallback) — with an **honest limit** noted: the `tokio` timer wiring inside `run_loop` is exercised via the pure pieces + Tier-3 plumbing, not a standalone integration test of the timeout (same posture as the existing `IndicatorDebounce`). One intentional, user-acknowledged behaviour: `Ctrl-C` is inert while capturing (every non-`Ctrl+]` key is, by spec). Final tally **2290 passing / 0 failing / 0 skipped** (1 long-standing ignored doctest), clippy clean. OOS: scripted/manual badge push; badges for glyph keys; configurable styling/placement; the guided-lesson system itself (own ADR); cross-session/-switch persistence; localised caption content; arrow-only cast interactions (output-pane scroll); wiring the overlays into the website `casts.mjs` scripts (website-branch follow-up). Implementation phased **A** (`--demo` plumbing) → **B** (badges) → **C** (captions) + a flat-rectangle restyle -- [ADR-0048 — `seed` fake-data generation command](0048-seed-fake-data-generation.md) — **Accepted 2026-06-11; Phase 1 implemented 2026-06-11** (commits `202e25a`→`fbd219b`; design settled with the user across an extended fork dialogue, hardened by a pre-build `/runda` pass (six blockers folded in) and a post-implementation `/runda` pass (eight gaps closed — FK/shortid determinism so **D4 holds with no exceptions**, plus six untested ADR decisions); **2358 tests pass, clippy clean**). Closes `requirements.md` **SD1** and the core of **SD2**; closes the `seed` half of **A1**. **Phase 1 shipped:** whole-row `seed
[count] [--seed ]` with realistic name-aware generation (the `fake` crate + a type-gated heuristic catalogue, table-context name disambiguation, hand-rolled `product` generator, bounded dates), identifier + constraint uniqueness incl. junction distinct-combos, FK sampling from existing parent rows (empty-parent error), `IN`-CHECK derivation + complex-CHECK advisory, a required-column block guard, `--seed` reproducibility (serial/FK/shortid all deterministic), undo as one batch step, replay as a data write, a capped auto-show preview, the enum/CHECK advisory, and an O(N) single-transaction insert path. **Deferred to Phase 2** (designed, not built): the `set` override clause (D2) and the `
.` column-fill form. Further SD2 increments (custom generators, NULL injection, multi-locale, recursive auto-seed) out of scope. Closes `requirements.md` **SD1** and the core of **SD2**; closes the `seed` half of **A1** (the other being `hint`/**H2**). A dedicated `seed` command (own AST variant + `do_seed` executor, **both modes**) generating **realistic, name-aware** fake data. Two forms: **`seed
[count]`** (new rows, default **20**, capped) and **`seed
.`** (fill a column on existing rows, an UPDATE). Generation adds the **`fake` crate** (v5, English) driven by a **type-gated, token-matched name-heuristic catalogue** (~30 patterns, documented false-positive guards), with **table-context** disambiguating the `name`/`title` family (`products.name`→product, `users.name`→person, `vendors.name`→company), a **hand-rolled `product` generator** (`fake` has no commerce module), **bounded dates** (`date`/`timestamp`/`dob`/`*_at` recognised, recent windows — never "all of history"), the **identifier family** (`id`/`code`/`ref`/`number`, non-FK/non-PK) → **unique sequential**, and **enum-ish names** (`role`/`status`/`type`/…) left generic + a **post-seed Hint advisory** pointing at `set … in (…)`. A **`set` override clause** — `= value` / `in (a,b,c)` / `as ` / `between a and b` (numeric **and** date), reusing ADR-0026 operators — answers the heuristic-miss case. **`--seed `** makes runs reproducible (and enables exact-value tests). **FK** columns sampled uniformly from existing parent rows (**empty parent → friendly error**, no recursion v1); **junction/compound-PK** tables seeded with **distinct combinations**, capped + noted (SD1). A **required-column block guard** refuses rather than NULL-violate a `NOT NULL` column it can't fill (e.g. `NOT NULL blob`). Full ambient wiring (completion incl. a new generator-name vocabulary highlighted as `tok_function`, hints, `help seed`, ADR-0042 near-miss matrix, ADR-0027 validity); **no DSL→SQL teaching echo** (seed is a utility command, not a SQL twin). Honours **X5** — `do_seed` reuses insert/update *mechanics as helpers*, not by emitting `Command::Insert`. Implementation phased: (1) core whole-row seed → (2) `set` overrides → (3) column-fill. Deferred (future SD2): recursive auto-seed, NULL injection, multi-locale, user-defined custom generators, full per-column report +- [ADR-0048 — `seed` fake-data generation command](0048-seed-fake-data-generation.md) — **Accepted 2026-06-11; Phase 1 + Phase 2 implemented 2026-06-11** (Phase 1 commits `202e25a`→`fbd219b`; design settled with the user across an extended fork dialogue, hardened by a pre-build `/runda` pass (six blockers folded in), a post-implementation `/runda` pass (eight gaps closed — FK/shortid determinism so **D4 holds with no exceptions**, plus six untested ADR decisions), and a Phase-2 pre-build `/runda` pass (which caught the no-date-literal-token reality → the D2 quoted-dates amendment), and a post-implementation `/runda` pass (which added a friendly error for a bounded override on a UNIQUE column — see D2); **2400 tests pass, clippy clean**). Closes `requirements.md` **SD1** and the core of **SD2**; closes the `seed` half of **A1**. **Phase 1 shipped:** whole-row `seed
[count] [--seed ]` with realistic name-aware generation (the `fake` crate + a type-gated heuristic catalogue, table-context name disambiguation, hand-rolled `product` generator, bounded dates), identifier + constraint uniqueness incl. junction distinct-combos, FK sampling from existing parent rows (empty-parent error), `IN`-CHECK derivation + complex-CHECK advisory, a required-column block guard, `--seed` reproducibility (serial/FK/shortid all deterministic), undo as one batch step, replay as a data write, a capped auto-show preview, the enum/CHECK advisory, and an O(N) single-transaction insert path. **Phase 2 shipped (2026-06-11):** the `set` override clause (D2 — fixed value / pick-list / `as ` / `between` range, **quoted** dates per the D2 amendment, type-aware, override drops the column from the advisory) and the `
.` column-fill form (D1 form 2 — an UPDATE over existing rows, refusing PK/autogen targets, empty-table no-op, FK/unique-respecting, one undo step), with the new `KNOWN_GENERATORS` vocabulary (D9), a range `Generator`, full completion/highlight (`HighlightClass::Function`)/validity (`IdentSource::Generators`)/help/pedagogy wiring, and the D13 advisory's Phase-2/3 wording. Further SD2 increments (custom generators, NULL injection, multi-locale, recursive auto-seed) out of scope. Closes `requirements.md` **SD1** and the core of **SD2**; closes the `seed` half of **A1** (the other being `hint`/**H2**). A dedicated `seed` command (own AST variant + `do_seed` executor, **both modes**) generating **realistic, name-aware** fake data. Two forms: **`seed
[count]`** (new rows, default **20**, capped) and **`seed
.`** (fill a column on existing rows, an UPDATE). Generation adds the **`fake` crate** (v5, English) driven by a **type-gated, token-matched name-heuristic catalogue** (~30 patterns, documented false-positive guards), with **table-context** disambiguating the `name`/`title` family (`products.name`→product, `users.name`→person, `vendors.name`→company), a **hand-rolled `product` generator** (`fake` has no commerce module), **bounded dates** (`date`/`timestamp`/`dob`/`*_at` recognised, recent windows — never "all of history"), the **identifier family** (`id`/`code`/`ref`/`number`, non-FK/non-PK) → **unique sequential**, and **enum-ish names** (`role`/`status`/`type`/…) left generic + a **post-seed Hint advisory** pointing at `set … in (…)`. A **`set` override clause** — `= value` / `in (a,b,c)` / `as ` / `between a and b` (numeric **and** date), reusing ADR-0026 operators — answers the heuristic-miss case. **`--seed `** makes runs reproducible (and enables exact-value tests). **FK** columns sampled uniformly from existing parent rows (**empty parent → friendly error**, no recursion v1); **junction/compound-PK** tables seeded with **distinct combinations**, capped + noted (SD1). A **required-column block guard** refuses rather than NULL-violate a `NOT NULL` column it can't fill (e.g. `NOT NULL blob`). Full ambient wiring (completion incl. a new generator-name vocabulary highlighted as `tok_function`, hints, `help seed`, ADR-0042 near-miss matrix, ADR-0027 validity); **no DSL→SQL teaching echo** (seed is a utility command, not a SQL twin). Honours **X5** — `do_seed` reuses insert/update *mechanics as helpers*, not by emitting `Command::Insert`. Implementation phased: (1) core whole-row seed → (2) `set` overrides → (3) column-fill. Deferred (future SD2): recursive auto-seed, NULL injection, multi-locale, user-defined custom generators, full per-column report diff --git a/docs/handoff/20260611-handoff-66.md b/docs/handoff/20260611-handoff-66.md new file mode 100644 index 0000000..70f8562 --- /dev/null +++ b/docs/handoff/20260611-handoff-66.md @@ -0,0 +1,145 @@ +# Session handoff — 2026-06-11 (66) + +Sixty-sixth handover. Continues from handoff-65 (ADR-0048 `seed` +Phase 1). This session built **ADR-0048 Phase 2** end to end: the +**`set` override clause** (D2) and the **`
.` +column-fill** form (D1 form 2) — the two surfaces Phase 1 deliberately +deferred. Designed-then-DA-vetted (a `/runda` pass that caught a real +ADR-vs-grammar conflict), then built test-first. + +## §1. State at handoff + +**Branch:** `main`. All Phase-2 work is in the working tree; +**commits are pending the user's approval** (see §6). Unpushed is the +normal working state. + +**Tests: 2400 passing / 0 failing / 0 skipped / 1 ignored** (the +long-standing `friendly` doctest). **Clippy clean** (nursery, all +targets). +42 over handoff-65's 2358. + +## §2. What landed (read ADR-0048 — Status + D1/D2/D9/D13) + +`seed [.] [count] [set ] [--seed ]`. + +- **`set` override clause (D2):** four forms, comma-separated — + `status = 'active'` (fixed), `role in ('a','b')` (pick-list), + `work_addr as email` (named generator), `price between 10 and 100` + (range; numeric **and quoted dates**). Type-aware; an override + **drops its column from the generic-fill advisory** (D13). Value + slots reuse `update`'s typed `current_column_value` (quoting + enforced structurally — a bare word is rejected). +- **Column-fill (D1 form 2):** `seed users.email [set …]` fills one + column across **existing** rows (an UPDATE). Refuses PK / autogen + (`serial`/`shortid`/`blob`) targets; **empty table → friendly + no-op**; FK target samples the parent; UNIQUE/identifier target gets + collision-free values; **one undo step**; `set` may only adjust the + filled column; a row count is refused. +- **Named-generator vocabulary (D9):** `src/seed/vocabulary.rs` — + `KNOWN_GENERATORS` + `generator_for_name` + `is_known_generator_prefix`, + the single source of truth for completion, validity, and the executor. +- **Range generator:** `Generator::Range { low, high }` in + `src/seed/generators.rs`, interpreted per destination type; + `range_bounds_reason` validates compatibility before generation. +- **Ambient wiring:** completion (generator names after `as`, the + `set ` and `.col` column slots, the `set` keyword); highlight + (new `HighlightClass::Function` → existing `tok_function`); validity + (new `IdentSource::Generators` — unknown generator flagged `[ERR]`; + unknown column in `set`/`.col` flagged via the existing Columns + path); help (`help.data.seed`); parse-error pedagogy near-miss rows; + the D13 advisory's **Phase-2/3 wording** (points at `set` and the + column-fill repair). Both modes (D5). + +## §3. The ADR amendment (a real DA find) + +The pre-build `/runda` pass found that **ADR-0048 D2's "dates stay +unquoted" was impossible** — this DSL has **no date-literal token** +(`Value` is `Number`/`Text`; dates are quoted strings validated by +`bind_date`). Escalated to the user, who chose **quoted dates + +amend the ADR** (the grammar-consistent option). D2 now carries a +dated amendment; the range form uses `between '2023-01-01' and +'2024-12-31'`. This was the only divergence from the ADR text; numbers +remain unquoted. + +## §4. Where the code lives + +- **`src/dsl/command.rs`** — `Command::Seed` gains `target_column: + Option` + `overrides: Vec`; new `SeedOverride` + / `SeedOverrideKind`. +- **`src/dsl/grammar/data.rs`** — `SEED_SET_CLAUSE` + `SEED_DOT_COLUMN` + grammar; `SEED_GENERATOR` slot (`IdentSource::Generators`, + `HighlightClass::Function`); `build_seed` + the override fold + (`build_seed_overrides` / `parse_seed_override_tail`). +- **`src/dsl/grammar/mod.rs`** — `IdentSource::Generators` + + `HighlightClass::Function`. +- **`src/db.rs`** — `apply_seed_overrides` / `seed_override_plan` / + `seed_override_literal`; `do_seed_column_fill`; `do_seed` + + `Database::seed` + worker wiring threaded with the new params. +- **`src/seed/`** — `vocabulary.rs` (new); `generators.rs` (range + generator + `range_bounds_reason`); `mod.rs` (`Generator::Range`). +- **`src/completion.rs`** — generator candidates after `as`; generator + validity. **`src/input_render.rs`** — `"generator"` invalid-ident + kind. **`src/theme.rs`** — `Function → tok_function`. +- **Catalog** — `help.data.seed`, `parse.usage.seed`, + `seed.advisory_generic` (Phase-2/3 wording) in `en-US.yaml`; + `keys.rs` placeholders updated. +- **Tests** — `tests/it/seed.rs` (+~30: builder fold, executor + set/column-fill, undo, advanced mode), `src/seed/{vocabulary, + generators}.rs` (range + vocabulary units), `src/completion.rs` + (generator + column validity), `src/dsl/walker/highlight.rs`, + `tests/typing_surface/mod.rs` (completion slots), + `tests/it/parse_error_pedagogy.rs` (near-miss rows). + +## §5. Two implementation refinements vs. the ADR (both met the contract) + +- **Quoted dates** (the D2 amendment, §3). +- **Value slots reuse `current_column_value`** (the `update … set` + typed slot) rather than the raw ADR-0026 expression operand — no + spurious column-ref match, typed narrowing, consistent with + `update`. The user-facing contract (quoted literals, type-aware) is + fully met. + +The `seed_take_value` / `seed_set_error` builder paths are +drift-guards (the typed slots only ever match value literals, so a bare +word is rejected at the grammar level) — they use the generic +`parse.error_wrapper`, mirroring `expr::build_expr`. + +## §6. How to take over / next steps + +1. Read handoffs 64 → 65 → 66, `CLAUDE.md`, `docs/requirements.md`, + `docs/adr/0048-…md` (Status block + D1/D2/D9/D13 + the amendment). +2. **Seed is feature-complete (SD1 + SD2).** `requirements.md`: **SD1 + `[x]`, SD2 `[x]`**. The only open A1 gap is `hint`/**H2** (own ADR). +3. **Commits pending approval.** Suggested split: + - `feat(seed): set override clause + column-fill (ADR-0048 Phase 2)` + — all `src/` + `tests/` changes. + - `docs: ADR-0048 Phase 2 implemented + handoff 66` — ADR / README / + requirements / this file. +4. Next options (user's call): **H2 `hint`** (closes A1); **TT5 CI**; + the larger **V4 journal** / **tutorial** ADRs; or Tier-4 PTY (TT4). +5. Consider a `cargo sweep` at this milestone (`target/` grows). + +## §7. Post-implementation `/runda` pass (done this session) + +A DA pass over the completed code found **no correctness bugs and no +dropped requirements**; all D1–D18 acceptance criteria verified met, +tests confirmed to catch regressions. One **design fork** was surfaced +and **resolved by the user**: + +- **Bounded override × UNIQUE column** — a fixed value / too-short + pick-list on a single-column-UNIQUE target used to silently cap the + run (e.g. `seed users 100 set email = 'x'` → 1 row). Now a **friendly + error** up front (`seed_override_capacity_guard`, `src/db.rs`), for + both whole-row and column-fill; generators/ranges stay cap-based + (unbounded sources). ADR-0048 D2 documents it; two tests pin it. + +Remaining **non-blocking** edges (noted, not bugs): + +- Overriding an **FK column** with a literal: the override wins (D2); a + non-parent value fails safely through the FK-error layer. +- **Column-fill of one column of a *compound* FK** samples that column + independently → an invalid tuple fails safely (UPDATE rejected, + rollback), never corrupts. Single-column FKs / non-FK columns are + exact. +- The generator slot uses the **default candidate-ladder hint** (offers + the vocabulary), not a dedicated prose intro — discoverability is met + by completion; a prose intro is optional polish. diff --git a/docs/requirements.md b/docs/requirements.md index 28a5369..2222f11 100644 --- a/docs/requirements.md +++ b/docs/requirements.md @@ -677,22 +677,26 @@ since ADR-0027.) empty-parent friendly error), `IN`-CHECK derivation, a required-column block guard, undo as one step, replay as a data write, a capped auto-show + enum/CHECK advisory, and an O(N) - single-transaction path. 25 integration + ~40 unit tests; two - `/runda` passes. The `set` override clause and `
.` - column-fill are SD2 Phase-2, deferred.)* -- [/] **SD2** Detailed seeding rules (per-type generators, + single-transaction path. The `set` override clause and + `
.` column-fill landed in SD2 Phase 2, below.)* +- [x] **SD2** Detailed seeding rules (per-type generators, locale, determinism, override hooks). - *(Core done 2026-06-11 via **ADR-0048**: type-gated name-aware - per-type generators with a `fake`-backed catalogue + - table-context disambiguation, **`--seed` determinism** + *(Done 2026-06-11 via **ADR-0048** (Phase 1 + Phase 2). Phase 1: + type-gated name-aware per-type generators with a `fake`-backed + catalogue + table-context disambiguation, **`--seed` determinism** (serial/FK/shortid all reproducible — D4 holds with no - exceptions), English-only locale (X2). **Still open (Phase 2):** - the `set` override clause — value / pick-from-list / explicit - generator / range (the "override hooks" core, designed in - ADR-0048 D2 but not yet built) and the `
.` - column-fill form. Deferred SD2 increments: user-defined custom - generators, NULL injection, multi-locale, recursive parent - auto-seed.)* + exceptions), English-only locale (X2). **Phase 2 (the "override + hooks" core):** the `set` override clause — fixed value / + pick-from-list / `as ` / `between` range (numeric and + **quoted** dates, type-aware; an override drops the column from + the generic-fill advisory) — and the `
.` + column-fill form (an UPDATE over existing rows, refusing + PK/autogen targets, empty-table no-op, FK/unique-respecting, one + undo step). Adds the `KNOWN_GENERATORS` vocabulary (D9), a range + `Generator`, and full completion / highlight / validity / help / + parse-error-pedagogy wiring. Deferred SD2 increments: + user-defined custom generators, NULL injection, multi-locale, + recursive parent auto-seed.)* ## Query analysis