From 78c38e8b333be3fc1aa10aebb194e0ee5a43d4cd Mon Sep 17 00:00:00 2001 From: "claude@clouddev1" Date: Thu, 11 Jun 2026 21:49:06 +0000 Subject: [PATCH] docs: ADR-0048 Phase 1 accepted/implemented + handoff 65 - ADR-0048 status -> Accepted; Phase 1 implemented (commits 202e25a..fbd219b), with the pre-build and post-implementation /runda passes and the 2358-test green state recorded; index entry updated. - requirements.md: SD1 [x] (whole-row seed + FK/junction, both modes, --seed reproducibility with no exceptions), SD2 [/] (core generators / determinism done; the set override clause + column-fill are Phase 2), A1 14/15 (only hint/H2 remains unregistered). - Handoff 65: the full seed Phase-1 build, the two /runda passes, where the code lives, and Phase-2 / next steps. --- docs/adr/0048-seed-fake-data-generation.md | 45 +++++-- docs/adr/README.md | 2 +- docs/handoff/20260611-handoff-65.md | 144 +++++++++++++++++++++ docs/requirements.md | 50 ++++--- 4 files changed, 208 insertions(+), 33 deletions(-) create mode 100644 docs/handoff/20260611-handoff-65.md diff --git a/docs/adr/0048-seed-fake-data-generation.md b/docs/adr/0048-seed-fake-data-generation.md index 9a35949..0226e2c 100644 --- a/docs/adr/0048-seed-fake-data-generation.md +++ b/docs/adr/0048-seed-fake-data-generation.md @@ -2,22 +2,39 @@ ## Status -**Proposed (2026-06-11).** Design settled with the user across an -extended fork dialogue (every decision below was escalated and -user-chosen), then hardened by a `/runda` Devil's-Advocate pass that -found six blockers — undo integration (D15), replay semantics (D16), -`set` value quoting (D2), CHECK-constraint handling (D17), a -phase-ordering bug in the advisory (D13), and auto-show flooding -(D18) — plus refinements (state-relative reproducibility, compound-FK -tuple sampling, column-fill constraint rules, the `fake` dependency -scan). All are folded in below, the three genuine forks among them -re-escalated and user-resolved. Pending **phased, test-first -implementation**; this status flips to *Accepted / implemented* once -that lands. +**Accepted (2026-06-11); Phase 1 implemented (2026-06-11).** Design +settled with the user across an extended fork dialogue (every decision +below was escalated and user-chosen), then hardened by a pre-build +`/runda` Devil's-Advocate pass that found six blockers — undo +integration (D15), replay semantics (D16), `set` value quoting (D2), +CHECK-constraint handling (D17), a phase-ordering bug in the advisory +(D13), and auto-show flooding (D18) — plus refinements (state-relative +reproducibility, compound-FK tuple sampling, column-fill constraint +rules, the `fake` dependency scan), all folded in. + +**Phase 1 shipped** test-first across commits `202e25a` (generation +library + `fake` dependency) → `f1e9484` (command skeleton) → +`73493fa` (FK sampling) → `9c13501` (uniqueness / junction / IN-CHECK) +→ `0b3ab3c` (`SeedResult` / preview / advisory / count cap) → +`e6ff63d` (single-transaction O(N) path) → `fbd219b` (`--seed` flag, +ambient wiring, and a whole-implementation `/runda` pass). The +post-implementation `/runda` found eight gaps — FK-sampling +determinism (now `ORDER BY`), shortid reproducibility (now from the +seeded RNG, so **D4 holds with no exceptions**), and six untested +ADR decisions (D5/D15/D16/D17 + atomicity + zero-count), all closed. +**2358 tests pass / 0 fail / 0 skip; clippy clean.** + +**Implemented in Phase 1:** the whole-row `seed [count] +[--seed ]` form and every D1–D18 decision *except* the two +deferred-to-Phase-2 surfaces below. **Deferred to Phase 2** (designed +here, not yet built): the **`set` override clause** (D2) and the +**`
.` column-fill** form (D1 form 2). Further SD2 +increments (custom user generators, NULL injection, multi-locale, +recursive parent auto-seed) remain out of scope (see Out of scope). Closes `requirements.md` **SD1** and delivers the core of **SD2** -(per-type generators, determinism, the override surface). It also -closes one of the two remaining gaps in **A1** ("all canonical +(per-type generators, determinism, the `fake`-backed catalogue). It +also closes one of the two remaining gaps in **A1** ("all canonical app-level commands") — `seed`; the other, `hint` (**H2**), is separate. diff --git a/docs/adr/README.md b/docs/adr/README.md index 1744c6f..5eb8778 100644 --- a/docs/adr/README.md +++ b/docs/adr/README.md @@ -53,4 +53,4 @@ This directory contains the project's ADRs, recorded per - [ADR-0045 — `create m:n relationship` convenience command (C4)](0045-mn-convenience.md) — **Accepted + implemented 2026-06-10** (closes `requirements.md` **C4**; all forks user-confirmed + a `/runda` DA pass that verified the `do_create_table` reuse against code and corrected the "no PK-less tables" assumption — advanced SQL `create table t (a int)` has none, so a parent-PK guard is retained). Implementation corrected a second ADR premise: "the walker already dispatches multiple nodes per entry word" held only in *advanced* mode — two simple-mode spots (dispatcher `decide`, completion continuation-merge) assumed ≤1 DSL form per entry word and were generalized **behaviour-preservingly** (dispatch reduces to the old single-candidate commit; completion merge gated on `simple_count > 1`). Junction echo wired (`render_create_m2n`, round-trips as SQL). `create m:n relationship from to [as ]` generates a junction table with one FK column per parent PK column, a **compound PK over all the FK columns** (the textbook junction — the pair is unique, no duplicate links), and **two 1:n relationships**, all in **one transaction = one undo step** (built by reusing `do_create_table`, which already takes `foreign_keys` + writes relationship metadata — no batch bracketing). Forks all user-chosen: junction PK = compound-over-FKs (vs surrogate serial / no PK); referential actions = **`CASCADE`** on delete+update (vs NO ACTION / RESTRICT); naming = auto `{T1}_{T2}` + optional `as` (vs auto-only); available in **both modes** (Simple-category DSL, like the sibling relationship commands). FK columns named `{parent_table}_{pk_column}` (disambiguates shared `id`; generalises to compound parents via ADR-0043), typed via `fk_target_type` (ADR-0011). A distinct `Command::CreateM2nRelationship` (not lowered to `CreateTable`) preserves command identity (X5) and lets the teaching echo speak in m:n terms. Cross-cutting wiring enumerated: separate `CREATE_M2N` `CommandNode` (own `help_id`/`usage_ids`), `("m","m:n")` completion composite, `HintMode`s, grammar-driven highlighting, `help`/`help create`, `parse_error_pedagogy` near-miss matrix, teaching echo. OOS: **self-referential m:n** (`from T to T`) refused outright (user-confirmed "full stop" — directional column-naming is more than this beginner convenience warrants); per-relationship action overrides; extra junction payload columns; m:n diagram echo; renaming the auto-generated relationships - [ADR-0046 — Schema sidebar focus/navigation mode and responsive input & hint layout (UI #20/#21/#23)](0046-sidebar-navigation-and-responsive-input-hint.md) — **Accepted + implemented 2026-06-10, phased A→B→C** (8 commits `9f5f76b`…`22bec61`; closes Gitea **#20** hint jumpiness, **#21** left-column improvements, **#23** long input — all forks user-confirmed, including the persistent show/hide toggle which is **deferred**: the Ctrl-O peek covers #21's "keystroke to show and hide"). Two decisions landed differently from the draft (recorded inline): relationship data on **`App`** not `SchemaCache` (DB2); the nav overlay clears **only the sidebar strip + a one-column gutter**, panels staying visible behind (DC2). Treats the three UI issues as one coupled decision because they share the terminal's width/height budget. **Phase A (input & hint):** the hint panel's height becomes a function of **terminal geometry, fixed between resizes** (not of hint content), eliminating the #20 jump at its source — measured catalog shows ≥ ~54-col right-column width never needs > 2 hint lines, so 3 lines is a rare narrow-terminal-only case; height buckets `H<40` compact (input 1 row + horizontal scroll / hint 2) vs `H≥40` comfortable (input 2 rows soft-wrap / hint 2), output `Min(5)` honoured first under degradation; input gains horizontal scroll (`input_scroll_offset`, single logical `String` — **not** I1 multi-line) and 2-row soft-wrap display when tall, preserving ADR-0027's 6-col indicator reserve. **Phase B (sidebar):** the 26-col Tables column is **kept but made optional and richer** (not deleted — pedagogy wins ties) — **width-derived session-only** visibility (visible iff width > 90 or a Ctrl-O peek is active — no stored field; hides at width ≤ 90 so the 90-col screencasts drop it; ADR-0015 format untouched), plus a **relationships panel** rendered narrow with endpoints broken at the arrow, ellipsized — a **separate sibling panel** that **overrides S2**'s nested-list extension model (relationships are cross-table). the full records live on a new **`App.relationships`** field (revised from the ADR's original `SchemaCache.relationship_details` at implementation — `SchemaCache` is walker-facing and needs only the names, kept in `relationships: Vec`; details are UI-only, so `App` mirrors `app.tables` and avoids ~23 fixture edits), delivered by `Database::read_all_relationships` + an `AppEvent::RelationshipsRefreshed`; the two left panels split vertically with the relationships panel floored at 5 rows ("(none)" when empty) and capped at 50 % of the column (DB4). **Phase C (navigation mode):** **`Ctrl-O`** enters a focus cycle (Input → Tables → Relationships → Input; `Esc` exits) orthogonal to the ADR-0003 input mode — **`Ctrl-B` was rejected on review as the default tmux prefix** (unreachable inside tmux); the focused panel **expands to ~40–50 cols as a `Clear` overlay** (right panels stay unchanging underneath) and scrolls via **Up/Down (line) + PageUp/PageDown (page)** (context-rebind, reusing the output-scroll viewport mechanism), with an accent focus border; all non-nav keys inert in nav mode (and nav keys inert while a modal is open). Forks all user-chosen: keep-optional-richer (vs remove/narrow); navigation-mode (vs modeless modifier scroll); `Ctrl-O` (Ctrl-B rejected = tmux prefix); overlay (vs layout re-split); inert-non-nav-keys; geometry-fixed hint height; `H<40/≥40` thresholds; session-only persistence; Up/Down line-scroll; **separate relationships panel overriding S2**; **no hint-area toggle** (S4's stale "keyboard-toggleable" claim struck — never implemented, unwanted). A pre-build `/runda` DA pass drove these corrections: caught the `Ctrl-B`/tmux collision, the `SchemaCache` retype that would have broken completion, the 2-row-input/indicator placement, the missing nav-mode key disposition + modal gate, and three unreferenced requirements (S1 evolved, S2 overridden, S4 corrected); also cross-checked open issue **#22** (overlay/annotation layer — separate ADR, adjacent). OOS: true multi-line input (I1); readline shortcuts (I1b); cross-session sidebar persistence; output as a third nav focus; relationship search/edit from the panel; hint-area toggle; #22's annotation layer. Accepted consequence: the 90-col visibility threshold makes a terminal's output *narrower* when widened across the boundary (sidebar appears) - [ADR-0047 — Demonstration overlay layer (keystroke badges + step captions)](0047-demonstration-overlay-layer.md) — **Accepted 2026-06-10; implemented 2026-06-11, phased A→B→C (closes Gitea #22)** (commits `f879d54`→`2d0f4b2`; no `requirements.md` item — tracked by issue + ADR per convention; all forks user-confirmed + a pre-build `/runda` pass that produced 10 tightening findings and a whole-implementation `/runda` pass that returned PASS, no blockers). An in-app **demonstration mode** (`--demo` flag / `RDBMS_PLAYGROUND_DEMO` env, **off by default, zero footprint when off**) that renders two transient overlays so `autocast` screencasts — and live teaching, and a future guided-lesson system — can show otherwise-invisible interactions. **Keystroke badges** (`[TAB]`, `[ENTER]`, `[UP]`, …): **automatic, app-detected** over a fixed set of glyph-less keys (the app already sees every key, so it re-records for free), label via a pure `demo_badge_label(&KeyEvent)`; the badge **auto-expires on a ~1.5 s timer** that extends the runtime's existing time-boxed-`recv` arm condition (`debounce.is_armed() || badge_pending`; expiry `Instant` in the runtime, `App.demo_badge` the render mirror — mirroring the `input` vs `input_indicator` split). **Step captions**: a **stealth, control-code-delimited input buffer** toggled by **`Ctrl+]`** (byte `0x1D` → arrives as `Char('5')+CONTROL`, verified against crossterm 0.29 `parse.rs:110-113`; chosen over `Ctrl+!`, which is **not a single ASCII byte so autocast cannot send it** — the same wall as arrow keys, R4) — typed characters accumulate **invisibly** (prompt untouched, no echo/history), `Backspace` edits, other keys inert, a second `Ctrl+]` **commits** to the caption box (empty commit dismisses); lives in pure-sync `App::update()`, **intercepted before the modal gate** so captions/badges work **over the load picker** (the `#24` projects cast). Both render as **floating flat black-on-yellow rectangles** (solid fill, **no border glyphs** — a one-cell text margin, deliberately unlike the app's bordered panels; user decision post-build, `2d0f4b2`) **at the output panel's inner bottom-right**, drawn **last over modals**, badge **stacked above** the caption, **no layout reflow**; caption **word-wraps to ≤ 3 lines** (3–5 rows), badge fixed 3 rows; clamp/skip guard for tiny terminals; a new **`App.last_output_area: Rect`** (set in `render_output_panel`) gives the top-level draw the anchor. Caption persists **until the next keystroke**; badge suppressed while capturing. Forks all user-chosen: `--demo` activation (vs hidden command / chord); automatic badges (vs scripted); stealth buffer (vs typed-command / preloaded-file); floating bottom-right boxes (vs HUD / banner / subtitle); `Ctrl+]` trigger; wrap-to-3-line captions; ~1.5 s badge / next-keystroke caption timing. Tested test-first across Tier 1 (label fn, capture state machine incl. over-modal + demo-off gate, nearest-deadline helper), Tier 2 (insta snapshots: badge/caption/both-stacked at 90×26 light+dark, short-terminal clamp), Tier 3 (`--demo` plumbing, badge set/suppressed, caption-without-input wiring), CLI (`--demo` parse + env fallback) — with an **honest limit** noted: the `tokio` timer wiring inside `run_loop` is exercised via the pure pieces + Tier-3 plumbing, not a standalone integration test of the timeout (same posture as the existing `IndicatorDebounce`). One intentional, user-acknowledged behaviour: `Ctrl-C` is inert while capturing (every non-`Ctrl+]` key is, by spec). Final tally **2290 passing / 0 failing / 0 skipped** (1 long-standing ignored doctest), clippy clean. OOS: scripted/manual badge push; badges for glyph keys; configurable styling/placement; the guided-lesson system itself (own ADR); cross-session/-switch persistence; localised caption content; arrow-only cast interactions (output-pane scroll); wiring the overlays into the website `casts.mjs` scripts (website-branch follow-up). Implementation phased **A** (`--demo` plumbing) → **B** (badges) → **C** (captions) + a flat-rectangle restyle -- [ADR-0048 — `seed` fake-data generation command](0048-seed-fake-data-generation.md) — **Proposed 2026-06-11** (design settled with the user across an extended fork dialogue; every decision escalated + user-chosen; then a `/runda` DA pass found six blockers — undo integration, replay semantics, `set` value quoting, CHECK handling, an advisory phase-ordering bug, auto-show flooding — all folded in, the genuine forks re-escalated and resolved; phased test-first implementation to follow). Closes `requirements.md` **SD1** and the core of **SD2**; closes the `seed` half of **A1** (the other being `hint`/**H2**). A dedicated `seed` command (own AST variant + `do_seed` executor, **both modes**) generating **realistic, name-aware** fake data. Two forms: **`seed
[count]`** (new rows, default **20**, capped) and **`seed
.`** (fill a column on existing rows, an UPDATE). Generation adds the **`fake` crate** (v5, English) driven by a **type-gated, token-matched name-heuristic catalogue** (~30 patterns, documented false-positive guards), with **table-context** disambiguating the `name`/`title` family (`products.name`→product, `users.name`→person, `vendors.name`→company), a **hand-rolled `product` generator** (`fake` has no commerce module), **bounded dates** (`date`/`timestamp`/`dob`/`*_at` recognised, recent windows — never "all of history"), the **identifier family** (`id`/`code`/`ref`/`number`, non-FK/non-PK) → **unique sequential**, and **enum-ish names** (`role`/`status`/`type`/…) left generic + a **post-seed Hint advisory** pointing at `set … in (…)`. A **`set` override clause** — `= value` / `in (a,b,c)` / `as ` / `between a and b` (numeric **and** date), reusing ADR-0026 operators — answers the heuristic-miss case. **`--seed `** makes runs reproducible (and enables exact-value tests). **FK** columns sampled uniformly from existing parent rows (**empty parent → friendly error**, no recursion v1); **junction/compound-PK** tables seeded with **distinct combinations**, capped + noted (SD1). A **required-column block guard** refuses rather than NULL-violate a `NOT NULL` column it can't fill (e.g. `NOT NULL blob`). Full ambient wiring (completion incl. a new generator-name vocabulary highlighted as `tok_function`, hints, `help seed`, ADR-0042 near-miss matrix, ADR-0027 validity); **no DSL→SQL teaching echo** (seed is a utility command, not a SQL twin). Honours **X5** — `do_seed` reuses insert/update *mechanics as helpers*, not by emitting `Command::Insert`. Implementation phased: (1) core whole-row seed → (2) `set` overrides → (3) column-fill. Deferred (future SD2): recursive auto-seed, NULL injection, multi-locale, user-defined custom generators, full per-column report +- [ADR-0048 — `seed` fake-data generation command](0048-seed-fake-data-generation.md) — **Accepted 2026-06-11; Phase 1 implemented 2026-06-11** (commits `202e25a`→`fbd219b`; design settled with the user across an extended fork dialogue, hardened by a pre-build `/runda` pass (six blockers folded in) and a post-implementation `/runda` pass (eight gaps closed — FK/shortid determinism so **D4 holds with no exceptions**, plus six untested ADR decisions); **2358 tests pass, clippy clean**). Closes `requirements.md` **SD1** and the core of **SD2**; closes the `seed` half of **A1**. **Phase 1 shipped:** whole-row `seed
[count] [--seed ]` with realistic name-aware generation (the `fake` crate + a type-gated heuristic catalogue, table-context name disambiguation, hand-rolled `product` generator, bounded dates), identifier + constraint uniqueness incl. junction distinct-combos, FK sampling from existing parent rows (empty-parent error), `IN`-CHECK derivation + complex-CHECK advisory, a required-column block guard, `--seed` reproducibility (serial/FK/shortid all deterministic), undo as one batch step, replay as a data write, a capped auto-show preview, the enum/CHECK advisory, and an O(N) single-transaction insert path. **Deferred to Phase 2** (designed, not built): the `set` override clause (D2) and the `
.` column-fill form. Further SD2 increments (custom generators, NULL injection, multi-locale, recursive auto-seed) out of scope. Closes `requirements.md` **SD1** and the core of **SD2**; closes the `seed` half of **A1** (the other being `hint`/**H2**). A dedicated `seed` command (own AST variant + `do_seed` executor, **both modes**) generating **realistic, name-aware** fake data. Two forms: **`seed
[count]`** (new rows, default **20**, capped) and **`seed
.`** (fill a column on existing rows, an UPDATE). Generation adds the **`fake` crate** (v5, English) driven by a **type-gated, token-matched name-heuristic catalogue** (~30 patterns, documented false-positive guards), with **table-context** disambiguating the `name`/`title` family (`products.name`→product, `users.name`→person, `vendors.name`→company), a **hand-rolled `product` generator** (`fake` has no commerce module), **bounded dates** (`date`/`timestamp`/`dob`/`*_at` recognised, recent windows — never "all of history"), the **identifier family** (`id`/`code`/`ref`/`number`, non-FK/non-PK) → **unique sequential**, and **enum-ish names** (`role`/`status`/`type`/…) left generic + a **post-seed Hint advisory** pointing at `set … in (…)`. A **`set` override clause** — `= value` / `in (a,b,c)` / `as ` / `between a and b` (numeric **and** date), reusing ADR-0026 operators — answers the heuristic-miss case. **`--seed `** makes runs reproducible (and enables exact-value tests). **FK** columns sampled uniformly from existing parent rows (**empty parent → friendly error**, no recursion v1); **junction/compound-PK** tables seeded with **distinct combinations**, capped + noted (SD1). A **required-column block guard** refuses rather than NULL-violate a `NOT NULL` column it can't fill (e.g. `NOT NULL blob`). Full ambient wiring (completion incl. a new generator-name vocabulary highlighted as `tok_function`, hints, `help seed`, ADR-0042 near-miss matrix, ADR-0027 validity); **no DSL→SQL teaching echo** (seed is a utility command, not a SQL twin). Honours **X5** — `do_seed` reuses insert/update *mechanics as helpers*, not by emitting `Command::Insert`. Implementation phased: (1) core whole-row seed → (2) `set` overrides → (3) column-fill. Deferred (future SD2): recursive auto-seed, NULL injection, multi-locale, user-defined custom generators, full per-column report diff --git a/docs/handoff/20260611-handoff-65.md b/docs/handoff/20260611-handoff-65.md new file mode 100644 index 0000000..07bd865 --- /dev/null +++ b/docs/handoff/20260611-handoff-65.md @@ -0,0 +1,144 @@ +# Session handoff — 2026-06-11 (65) + +Sixty-fifth handover. Continues from handoff-64 (ADR-0047 demo +overlays). This session designed and shipped **ADR-0048 — the `seed` +fake-data generation command (SD1)**, Phase 1, end to end: an ADR with +an extended fork dialogue + two `/runda` passes, then a phased +test-first build. + +## §1. State at handoff + +**Branch:** `main`. **HEAD will be the doc-wrap-up commit** (see §6) — +all seed work committed, nothing pending. Unpushed (push is the user's +step; normal working state). + +**Tests: 2358 passing / 0 failing / 0 skipped / 1 ignored** (the long +-standing `friendly` doctest). **Clippy clean** (nursery, all targets). ++68 over handoff-64's 2290. + +**`cargo sweep` run** at wrap-up: `target/` 1.6 G → 183 M. + +**This session's commits:** +``` +202e25a feat(seed): fake-data generation library + fake dependency (P1.1) +f1e9484 feat(seed): command plumbing + walking skeleton (P1.2) +73493fa feat(seed): FK sampling, empty-parent error, block guard (P1.3a) +9c13501 feat(seed): uniqueness, junction distinct-combos, IN-CHECK (P1.3b) +0b3ab3c feat(seed): SeedResult outcome, capped preview, advisory, count cap (P1.3c) +e6ff63d perf(seed): single-transaction multi-row insert path (P1.3d) +fbd219b feat(seed): --seed flag, ambient wiring, and /runda hardening (P1.4 + DA) +``` +(plus the earlier `4d0ae77` multi-tab-scope withdrawal and `0af7f56` +ADR-0048 doc, and the wrap-up doc commit.) + +## §2. What `seed` does (Phase 1 — read ADR-0048) + +`seed
[count] [--seed ]` — populate a table with realistic +fake data. **Available in both modes** (A1). + +- **Realistic, name-aware generation:** the **`fake` crate** (v5, + English) driven by a **type-gated heuristic catalogue** (`src/seed/ + heuristics.rs`) — `email`→email, `first_name`→first name, `price`→ + currency, etc., each only firing when the column *type* is + compatible. **Table-context** disambiguates `name`/`title` + (`products.name`→a hand-rolled **product** name, `users.name`→person, + `vendors.name`→company). **Bounded dates** (`dob`/`created_at`/ + `date`/`timestamp` → recent windows, never "all of history", anchored + to a fixed reference epoch for reproducibility). Type-based fallback + otherwise. +- **Uniqueness (D10):** the user-fillable PK, compound UNIQUE + constraints, single-column UNIQUE, and identifier-named columns + (`id`/`code`/…) stay distinct across the batch and vs existing rows; + **junction tables** get **distinct FK combinations** (capped at the + available product, reported). Identifier ints get a monotonic + sequence. +- **FK (D14):** every FK column samples an existing parent row (compound + FK reads one consistent parent row); **empty parent → friendly + error**. +- **`IN`-CHECK (D17):** a simple `col IN ('a','b')` CHECK becomes the + value source (enum-as-CHECK just works); complex CHECKs are flagged in + the advisory and best-effort generated (a violation rolls the batch + back). +- **Reproducibility (D4):** `--seed ` → identical data on the same DB + state. **Holds with no exceptions** — serial (rowid/MAX+1), FK + (`ORDER BY`), **shortid (seeded RNG)**, all generators. +- **Output:** the seeded-row count, a **capped preview** (first 20 + rows), and a **Hint-styled advisory** naming enum-ish / underivable- + CHECK columns filled generically. Count cap 10 000; `seed t 0` no-op. +- **Safety:** one **undo** step (snapshot wraps the whole seed); + **replay** re-runs it as a data write; the insert path is a single + transaction (O(N), atomic, commit-db-last preserved). + +## §3. Where the code lives + +- **`src/seed/`** — the pure generation library (no DB): `mod.rs` + (`ColumnSpec`, `Generator`, `SeedRng`, `make_rng`), `heuristics.rs` + (`choose_generator` + the catalogue + `is_enum_ish`), `generators.rs` + (`generate_value` + the `product` generator + bounded dates), + `check.rs` (`parse_in_check_values`). ~40 Tier-1 tests, deterministic. +- **`src/db.rs`** — `do_seed` (+ `SeedColPlan`, `sample_parent_key_ + tuples`, `seed_value_list_key`, `seed_max_int`, `SeedResult`, + `DEFAULT_SEED_COUNT`/`MAX_SEED_COUNT`/`SEED_PREVIEW_CAP`), the new + **`insert_one_row`** core extracted from `do_insert` (shared, no + tx/persist — so seed runs N rows in one tx), and the `Request::Seed` / + `Database::seed` / worker wiring. +- **`src/dsl/grammar/data.rs`** — `SEED` `CommandNode`, `build_seed`, + the `--seed` flag grammar (`Seq[Flag("seed"), NumberLit]`, the first + DSL flag with a value). `Command::Seed` in `command.rs`. +- **Runtime/render** — `CommandOutcome::Seed`, `AppEvent:: + DslSeedSucceeded`, `App::handle_dsl_seed_success`. Catalog keys + `ok.rows_seeded` / `seed.capped` / `seed.advisory_generic` / + `help.data.seed` / `parse.usage.seed`. +- **Tests** — `tests/it/seed.rs` (25 integration tests), + `tests/typing_surface/mod.rs` (`seed_completion_and_validity`), + `tests/it/parse_error_pedagogy.rs` (bare-`seed` near-miss row), + `src/app.rs` (two render tests), `src/dsl/shortid.rs` + (`generate_with_rng`). + +## §4. Process notes (the two `/runda` passes) + +- **Pre-build `/runda`** (on the ADR) found six blockers — undo + integration (D15), replay semantics (D16), `set`-value quoting (D2), + CHECK handling (D17), an advisory phase-ordering bug (D13), auto-show + flooding (D18) — all folded into ADR-0048 before any code; the three + genuine forks re-escalated and user-resolved. +- **Post-implementation `/runda`** (on the whole implementation) found + **eight gaps**, all closed: FK-sampling determinism (→ `ORDER BY`), + **shortid not reproducible** (→ seeded RNG, fixed not documented — the + user chose the fix), and six **untested ADR decisions** (D5 advanced + mode, D15 undo, D16 replay, D17 complex-CHECK advisory, atomic + rollback, zero-count) — tests added for each. + +## §5. Phase 2 (deferred — designed in ADR-0048, NOT built) + +These are the only seed pieces left; both have full designs in +ADR-0048: + +1. **The `set` override clause (D2)** — `seed t 20 set role in + ('a','b'), status = 'x', work_addr as email, price between 10 and + 100`. Value / pick-from-list / explicit-generator / range, **quoted + literals** (grammar-consistent). This is the SD2 "override hooks" + core. The `ColumnSpec.check_in_values` → `PickFrom` plumbing and the + `Generator` vocabulary already exist; this adds the grammar + a `set` + clause that overrides the per-column plan. +2. **Column-fill (`seed
.`, D1 form 2)** — fill one + column across *existing* rows (an UPDATE). Refuses PK/autogen targets; + empty-table no-op. + +`requirements.md`: **SD1 `[x]`**, **SD2 `[/]`** (core done; the two +above open), **A1 14/15** (only `hint`/**H2** unregistered). + +## §6. How to take over + +1. Read handoffs 63 → 64 → 65, `CLAUDE.md`, `docs/requirements.md`, + `docs/adr/0048-seed-fake-data-generation.md` (the whole thing — D1 + –D18 + the as-built status block). +2. **Seed is feature-complete for Phase 1; nothing pending.** Next + options (user's call): seed **Phase 2** (`set` clause + column-fill); + **H2 `hint`** (closes A1) — own ADR; **TT5 CI**; or the larger + **V4 journal** / **tutorial** ADRs. +3. Two minor, user-deferred observations (non-blocking): the uniqueness + retry cap (`MAX_ATTEMPTS=200`) can cap a *medium* unique domain + slightly below its true size (junction/small domains are exact); + `literal_to_value` doesn't type-check an IN-CHECK literal vs a numeric + column (a malformed `int IN ('a')` CHECK fails cleanly at bind). diff --git a/docs/requirements.md b/docs/requirements.md index 2d24adb..28a5369 100644 --- a/docs/requirements.md +++ b/docs/requirements.md @@ -246,13 +246,12 @@ since ADR-0027.) available in both modes: `save`, `save as`, `load`, `new`, `rebuild`, `export`, `import`, `seed`, `replay`, `undo`, `redo`, `mode`, `help`, `hint`, `quit`. - *(Partial, verified 2026-06-07: 13 of 15 implemented and - available in both modes — `quit`/`q`, `mode simple|advanced`, - `help`, `save`, `save as`, `load`, `new`, `rebuild`, `export`, - `import`, `replay`, `undo`, `redo` (REGISTRY in - `grammar/app.rs:249-333`). **Missing: `seed`** (tracked as SD1) - **and `hint`** (tracked as H2) — neither is registered. A1 - closes when SD1 + H2 land.)* + *(Partial: **14 of 15** implemented and available in both modes — + `quit`/`q`, `mode simple|advanced`, `help`, `save`, `save as`, + `load`, `new`, `rebuild`, `export`, `import`, `replay`, `undo`, + `redo`, and now **`seed`** (ADR-0048 / SD1, done 2026-06-11). + **Only `hint`** (tracked as H2) remains unregistered. A1 closes + when H2 lands.)* ## DSL data commands @@ -665,20 +664,35 @@ since ADR-0027.) ## Sample data / seeding -- [ ] **SD1** `seed
[count]` generates plausible fake +- [x] **SD1** `seed
[count]` generates plausible fake data; junction tables are seeded with valid foreign-key references drawn from existing parent rows. - *(Taken up 2026-06-11 — specified by **ADR-0048**; phased - test-first implementation in progress.)* -- [ ] **SD2** Detailed seeding rules (per-type generators, + *(Done 2026-06-11 via **ADR-0048** (commits `202e25a`→`fbd219b`). + Whole-row `seed
[count] [--seed ]` with realistic + name-aware generation (`fake` crate + a type-gated heuristic + catalogue, table-context name disambiguation, hand-rolled + `product` generator, bounded dates), identifier + constraint + uniqueness, **junction tables seeded with valid FK references + drawn from existing parent rows** (distinct combinations, capped; + empty-parent friendly error), `IN`-CHECK derivation, a + required-column block guard, undo as one step, replay as a data + write, a capped auto-show + enum/CHECK advisory, and an O(N) + single-transaction path. 25 integration + ~40 unit tests; two + `/runda` passes. The `set` override clause and `
.` + column-fill are SD2 Phase-2, deferred.)* +- [/] **SD2** Detailed seeding rules (per-type generators, locale, determinism, override hooks). - *(Taken up 2026-06-11 — `[~]`→`[ ]`. **ADR-0048** settles the - design: type-gated name-aware generators with a `fake`-backed - catalogue + table-context disambiguation, `--seed` determinism, - English-only locale (X2), and a `set` override clause (the - "override hooks" core). Deferred to future SD2 increments: - user-defined custom generators, NULL injection, multi-locale, - recursive parent auto-seed.)* + *(Core done 2026-06-11 via **ADR-0048**: type-gated name-aware + per-type generators with a `fake`-backed catalogue + + table-context disambiguation, **`--seed` determinism** + (serial/FK/shortid all reproducible — D4 holds with no + exceptions), English-only locale (X2). **Still open (Phase 2):** + the `set` override clause — value / pick-from-list / explicit + generator / range (the "override hooks" core, designed in + ADR-0048 D2 but not yet built) and the `
.` + column-fill form. Deferred SD2 increments: user-defined custom + generators, NULL injection, multi-locale, recursive parent + auto-seed.)* ## Query analysis