78c38e8b33
- ADR-0048 status -> Accepted; Phase 1 implemented (commits 202e25a..fbd219b), with the pre-build and post-implementation /runda passes and the 2358-test green state recorded; index entry updated.
- requirements.md: SD1 [x] (whole-row seed + FK/junction, both modes, --seed reproducibility with no exceptions), SD2 [/] (core generators / determinism done; the set override clause + column-fill are Phase 2), A1 14/15 (only hint/H2 remains unregistered).
- Handoff 65: the full seed Phase-1 build, the two /runda passes, where the code lives, and Phase-2 / next steps.

# Session handoff — 2026-06-11 (65)

Sixty-fifth handoff. Continues from handoff-64 (ADR-0047 demo
overlays). This session designed and shipped **ADR-0048, the `seed`
fake-data generation command (SD1)**, Phase 1, end to end: an ADR with
an extended fork dialogue and two `/runda` passes, then a phased
test-first build.

## §1. State at handoff

**Branch:** `main`. **HEAD will be the doc-wrap-up commit** (see §6);
all seed work is committed, nothing pending. Unpushed (push is the
user's step; normal working state).

**Tests: 2358 passing / 0 failing / 0 skipped / 1 ignored** (the
long-standing `friendly` doctest). **Clippy clean** (nursery, all
targets). +68 over handoff-64's 2290.

**`cargo sweep` run** at wrap-up: `target/` 1.6 G → 183 M.

**This session's commits:**

```
202e25a feat(seed): fake-data generation library + fake dependency (P1.1)
f1e9484 feat(seed): command plumbing + walking skeleton (P1.2)
73493fa feat(seed): FK sampling, empty-parent error, block guard (P1.3a)
9c13501 feat(seed): uniqueness, junction distinct-combos, IN-CHECK (P1.3b)
0b3ab3c feat(seed): SeedResult outcome, capped preview, advisory, count cap (P1.3c)
e6ff63d perf(seed): single-transaction multi-row insert path (P1.3d)
fbd219b feat(seed): --seed flag, ambient wiring, and /runda hardening (P1.4 + DA)
```

(plus the earlier `4d0ae77` multi-tab-scope withdrawal and `0af7f56`
ADR-0048 doc, and the wrap-up doc commit.)

## §2. What `seed` does (Phase 1; read ADR-0048)

`seed <table> [count] [--seed <n>]`: populate a table with realistic
fake data. **Available in both modes** (A1).

- **Realistic, name-aware generation:** the **`fake` crate** (v5,
  English) driven by a **type-gated heuristic catalogue**
  (`src/seed/heuristics.rs`): `email`→email, `first_name`→first name,
  `price`→currency, etc., each firing only when the column *type* is
  compatible. **Table-context** disambiguates `name`/`title`
  (`products.name`→a hand-rolled **product** name, `users.name`→person,
  `vendors.name`→company). **Bounded dates**
  (`dob`/`created_at`/`date`/`timestamp` → recent windows, never "all
  of history", anchored to a fixed reference epoch for
  reproducibility). Type-based fallback otherwise.
- **Uniqueness (D10):** the user-fillable PK, compound UNIQUE
  constraints, single-column UNIQUE, and identifier-named columns
  (`id`/`code`/…) stay distinct across the batch and against existing
  rows; **junction tables** get **distinct FK combinations** (capped at
  the size of the available Cartesian product, with the cap reported).
  Identifier ints get a monotonic sequence.
- **FK (D14):** every FK column samples an existing parent row (a
  compound FK reads one consistent parent row); **empty parent →
  friendly error**.
- **`IN`-CHECK (D17):** a simple `col IN ('a','b')` CHECK becomes the
  value source (enum-as-CHECK just works); complex CHECKs are flagged
  in the advisory and generated best-effort (a violation rolls the
  batch back).
- **Reproducibility (D4):** `--seed <n>` → identical data on the same
  DB state. **Holds with no exceptions**: serial (rowid/MAX+1), FK
  (`ORDER BY`), **shortid (seeded RNG)**, all generators.
- **Output:** the seeded-row count, a **capped preview** (first 20
  rows), and a **Hint-styled advisory** naming enum-ish /
  underivable-CHECK columns filled generically. Count cap 10 000;
  `seed t 0` is a no-op.
- **Safety:** one **undo** step (a snapshot wraps the whole seed);
  **replay** re-runs it as a data write; the insert path is a single
  transaction (O(N), atomic, commit-db-last preserved).

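
The type-gating and table-context rules above can be pictured as a single lookup. The sketch below is illustrative only: `ColType`, `GeneratorKind`, and `pick_generator` are hypothetical names, not the real `choose_generator` API in `src/seed/heuristics.rs`, and both the type-compatibility rules and the fallback for unknown tables are assumptions.

```rust
/// Hypothetical column types; the real catalogue's type model may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ColType {
    Text,
    Integer,
    Real,
    Date,
}

/// Hypothetical generator vocabulary, named after the examples in the handoff.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum GeneratorKind {
    Email,
    FirstName,
    PersonName,
    CompanyName,
    ProductName,
    Currency,
    RecentDate,
    Fallback,
}

/// Pick a generator from the column name, gated on the column type.
/// A name match with an incompatible type falls through to `Fallback`.
fn pick_generator(table: &str, column: &str, ty: ColType) -> GeneratorKind {
    match (column, ty) {
        ("email", ColType::Text) => GeneratorKind::Email,
        ("first_name", ColType::Text) => GeneratorKind::FirstName,
        // Assumption: numeric columns are the compatible types for `price`.
        ("price", ColType::Real | ColType::Integer) => GeneratorKind::Currency,
        ("dob" | "created_at", ColType::Date) => GeneratorKind::RecentDate,
        // `name`/`title` are ambiguous, so the table name decides.
        ("name" | "title", ColType::Text) => match table {
            "products" => GeneratorKind::ProductName,
            "users" => GeneratorKind::PersonName,
            "vendors" => GeneratorKind::CompanyName,
            // Assumption: unknown tables get the type-based fallback.
            _ => GeneratorKind::Fallback,
        },
        _ => GeneratorKind::Fallback,
    }
}
```

In this sketch `pick_generator("users", "email", ColType::Integer)` returns `Fallback`, because an integer column that happens to be called `email` fails the type gate.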
## §3. Where the code lives

- **`src/seed/`** is the pure generation library (no DB): `mod.rs`
  (`ColumnSpec`, `Generator`, `SeedRng`, `make_rng`), `heuristics.rs`
  (`choose_generator` + the catalogue + `is_enum_ish`), `generators.rs`
  (`generate_value` + the `product` generator + bounded dates),
  `check.rs` (`parse_in_check_values`). ~40 Tier-1 tests, deterministic.
- **`src/db.rs`** holds `do_seed` (+ `SeedColPlan`,
  `sample_parent_key_tuples`, `seed_value_list_key`, `seed_max_int`,
  `SeedResult`,
  `DEFAULT_SEED_COUNT`/`MAX_SEED_COUNT`/`SEED_PREVIEW_CAP`), the new
  **`insert_one_row`** core extracted from `do_insert` (shared, no
  tx/persist, so seed runs N rows in one tx), and the `Request::Seed` /
  `Database::seed` / worker wiring.
- **`src/dsl/grammar/data.rs`** has the `SEED` `CommandNode`,
  `build_seed`, and the `--seed` flag grammar
  (`Seq[Flag("seed"), NumberLit]`, the first DSL flag with a value).
  `Command::Seed` is in `command.rs`.
- **Runtime/render:** `CommandOutcome::Seed`,
  `AppEvent::DslSeedSucceeded`, `App::handle_dsl_seed_success`. Catalog
  keys `ok.rows_seeded` / `seed.capped` / `seed.advisory_generic` /
  `help.data.seed` / `parse.usage.seed`.
- **Tests:** `tests/it/seed.rs` (25 integration tests),
  `tests/typing_surface/mod.rs` (`seed_completion_and_validity`),
  `tests/it/parse_error_pedagogy.rs` (bare-`seed` near-miss row),
  `src/app.rs` (two render tests), `src/dsl/shortid.rs`
  (`generate_with_rng`).

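
`parse_in_check_values` in `check.rs` is the piece that turns a simple `IN` CHECK into a value list (D17). Its real signature is not given in this handoff, so the following is only a minimal sketch of the idea under a hypothetical name, `parse_in_values`. It assumes single-quoted literals with no embedded quotes or commas and treats everything else as a complex CHECK.

```rust
/// Extract the literals from a simple `col IN ('a','b')` CHECK expression.
/// Returns `None` for anything that is not exactly that shape, so the
/// caller can treat it as a complex CHECK.
fn parse_in_values(check: &str) -> Option<Vec<String>> {
    // ASCII upper-casing keeps byte offsets identical to the original string.
    let upper = check.to_ascii_uppercase();
    let in_pos = upper.find(" IN ")?;

    // The left-hand side must be a bare column name, nothing more.
    let column = check[..in_pos].trim();
    let is_identifier = !column.is_empty()
        && column.chars().all(|c| c.is_ascii_alphanumeric() || c == '_');
    if !is_identifier {
        return None;
    }

    // The right-hand side must be one parenthesised list and nothing after it.
    let list = check[in_pos + 4..].trim();
    let inner = list.strip_prefix('(')?.strip_suffix(')')?;

    let mut values = Vec::new();
    for part in inner.split(',') {
        let literal = part.trim();
        let body = literal.strip_prefix('\'')?.strip_suffix('\'')?;
        // Embedded or escaped quotes are out of scope for this sketch.
        if body.contains('\'') {
            return None;
        }
        values.push(body.to_string());
    }
    Some(values)
}
```

Returning `None` instead of guessing matches the behaviour described in §2: only the simple form becomes a value source, and anything else is left to the advisory and best-effort generation.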
## §4. Process notes (the two `/runda` passes)

- **Pre-build `/runda`** (on the ADR) found six blockers: undo
  integration (D15), replay semantics (D16), `set`-value quoting (D2),
  CHECK handling (D17), an advisory phase-ordering bug (D13), and
  auto-show flooding (D18). All were folded into ADR-0048 before any
  code; the three genuine forks were re-escalated and user-resolved.
- **Post-implementation `/runda`** (on the whole implementation) found
  **eight gaps**, all closed: FK-sampling determinism (→ `ORDER BY`),
  **shortid not reproducible** (→ seeded RNG; fixed rather than
  documented as a limitation, at the user's choice), and six
  **untested ADR decisions** (D5 advanced mode, D15 undo, D16 replay,
  D17 complex-CHECK advisory, atomic rollback, zero-count), with tests
  added for each.

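
The FK-sampling fix is easy to picture in isolation: a seeded RNG only reproduces the same picks if the candidate list arrives in a stable order, which is what `ORDER BY` provides. The sketch below is not the code in `src/db.rs`. `SplitMix64` and `sample_parent_keys` are hypothetical stand-ins, with a sort playing the role of `ORDER BY` and a small hand-written PRNG in place of the real `SeedRng`.

```rust
/// Tiny deterministic PRNG (SplitMix64), used here only as a stand-in
/// for whatever seeded RNG the real implementation uses.
struct SplitMix64(u64);

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

/// Sample `count` parent keys for an FK column.
/// Sorting first makes the result independent of the order in which the
/// database happened to return the parent rows.
fn sample_parent_keys(
    mut parent_keys: Vec<i64>,
    count: usize,
    seed: u64,
) -> Result<Vec<i64>, String> {
    if parent_keys.is_empty() {
        // Mirrors the "empty parent -> friendly error" rule (wording assumed).
        return Err("parent table has no rows to reference".to_string());
    }
    parent_keys.sort_unstable(); // stand-in for ORDER BY

    let mut rng = SplitMix64(seed);
    let n = parent_keys.len() as u64;
    let picks: Vec<i64> = (0..count)
        .map(|_| parent_keys[(rng.next_u64() % n) as usize])
        .collect();
    Ok(picks)
}
```

With the sort in place, two calls that receive the same parent keys in different orders return identical picks for the same seed. Without it, the same seed would index into differently ordered lists and the results would diverge.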
## §5. Phase 2 (deferred; designed in ADR-0048, NOT built)

These are the only seed pieces left; both have full designs in
ADR-0048:

1. **The `set` override clause (D2):** `seed t 20 set role in
   ('a','b'), status = 'x', work_addr as email, price between 10 and
   100`. Four forms (value, pick-from-list, explicit generator, range),
   with **quoted literals** (grammar-consistent). This is the SD2
   "override hooks" core. The `ColumnSpec.check_in_values` → `PickFrom`
   plumbing and the `Generator` vocabulary already exist; this adds the
   grammar plus a `set` clause that overrides the per-column plan.
2. **Column-fill (`seed <table>.<column>`, D1 form 2):** fill one
   column across *existing* rows (an UPDATE). Refuses PK/autogen
   targets; an empty table is a no-op.

`requirements.md`: **SD1 `[x]`**, **SD2 `[/]`** (core done; the two
above open), **A1 14/15** (only `hint`/**H2** unregistered).

## §6. How to take over

1. Read handoffs 63 → 64 → 65, `CLAUDE.md`, `docs/requirements.md`,
   and `docs/adr/0048-seed-fake-data-generation.md` (the whole thing:
   D1–D18 plus the as-built status block).
2. **Seed is feature-complete for Phase 1; nothing pending.** Next
   options (user's call): seed **Phase 2** (`set` clause +
   column-fill); **H2 `hint`** (closes A1), which needs its own ADR;
   **TT5 CI**; or the larger **V4 journal** / **tutorial** ADRs.
3. Two minor, user-deferred observations (non-blocking): the uniqueness
   retry cap (`MAX_ATTEMPTS=200`) can cap a *medium* unique domain
   slightly below its true size (junction/small domains are exact);
   and `literal_to_value` doesn't type-check an IN-CHECK literal
   against a numeric column (a malformed `int IN ('a')` CHECK fails
   cleanly at bind).

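
The first observation in item 3 follows from how retry-capped rejection sampling behaves as a domain fills up. The sketch below is an assumption about the general shape of that strategy, not the implementation in `src/db.rs`: `unique_draws` and `SplitMix64` are hypothetical, and only the `MAX_ATTEMPTS = 200` figure comes from this handoff.

```rust
use std::collections::HashSet;

/// Retry budget per value; the figure is taken from the handoff.
const MAX_ATTEMPTS: usize = 200;

/// Tiny deterministic PRNG (SplitMix64), a stand-in for the real seeded RNG.
struct SplitMix64(u64);

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

/// Draw up to `wanted` distinct values from `0..domain` by rejection
/// sampling, giving up once a single value needs more than
/// `MAX_ATTEMPTS` tries.
fn unique_draws(domain: u64, wanted: usize, seed: u64) -> Vec<u64> {
    let mut out = Vec::new();
    if domain == 0 {
        return out; // nothing to draw from
    }
    let mut rng = SplitMix64(seed);
    let mut seen = HashSet::new();
    while out.len() < wanted {
        let mut found = false;
        for _ in 0..MAX_ATTEMPTS {
            let candidate = rng.next_u64() % domain;
            // `insert` returns true only for a value not seen before.
            if seen.insert(candidate) {
                out.push(candidate);
                found = true;
                break;
            }
        }
        if !found {
            break; // retry budget exhausted: stop short of `wanted`
        }
    }
    out
}
```

Under this model, when one value of a 1000-value domain is still unused, each attempt hits it with probability 1/1000, so all 200 attempts miss with probability (1 - 1/1000)^200 ≈ 0.82 and the batch usually stops short. For a 10-value domain the same figure is 0.9^200, roughly 7 × 10⁻¹⁰, which is consistent with small domains coming out exact.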