- ADR-0048 status -> Accepted; Phase 1 implemented (commits 202e25a..fbd219b), with the pre-build and post-implementation /runda passes and the 2358-test green state recorded; index entry updated.
- requirements.md: SD1 [x] (whole-row seed + FK/junction, both modes, --seed reproducibility with no exceptions), SD2 [/] (core generators / determinism done; the set override clause + column-fill are Phase 2), A1 14/15 (only hint/H2 remains unregistered).
- Handoff 65: the full seed Phase-1 build, the two /runda passes, where the code lives, and Phase-2 / next steps.
Session handoff — 2026-06-11 (65)
Sixty-fifth handover. Continues from handoff-64 (ADR-0047 demo
overlays). This session designed and shipped ADR-0048 — the seed
fake-data generation command (SD1), Phase 1, end to end: an ADR with
an extended fork dialogue + two /runda passes, then a phased
test-first build.
§1. State at handoff
Branch: main. HEAD will be the doc-wrap-up commit (see §6) —
all seed work committed, nothing pending. Unpushed (push is the user's
step; normal working state).
Tests: 2358 passing / 0 failing / 0 skipped / 1 ignored (the
long-standing friendly doctest). Clippy clean (nursery, all targets).
+68 over handoff-64's 2290.
cargo sweep run at wrap-up: target/ 1.6 G → 183 M.
This session's commits:
202e25a feat(seed): fake-data generation library + fake dependency (P1.1)
f1e9484 feat(seed): command plumbing + walking skeleton (P1.2)
73493fa feat(seed): FK sampling, empty-parent error, block guard (P1.3a)
9c13501 feat(seed): uniqueness, junction distinct-combos, IN-CHECK (P1.3b)
0b3ab3c feat(seed): SeedResult outcome, capped preview, advisory, count cap (P1.3c)
e6ff63d perf(seed): single-transaction multi-row insert path (P1.3d)
fbd219b feat(seed): --seed flag, ambient wiring, and /runda hardening (P1.4 + DA)
(plus the earlier 4d0ae77 multi-tab-scope withdrawal and 0af7f56
ADR-0048 doc, and the wrap-up doc commit.)
§2. What seed does (Phase 1 — read ADR-0048)
seed <table> [count] [--seed <n>] — populate a table with realistic
fake data. Available in both modes (A1).
- Realistic, name-aware generation: the fake crate (v5, English) driven by a type-gated heuristic catalogue (src/seed/heuristics.rs): email → email, first_name → first name, price → currency, etc., each firing only when the column type is compatible. Table context disambiguates name/title (products.name → a hand-rolled product name, users.name → person, vendors.name → company). Bounded dates (dob/created_at/date/timestamp → recent windows, never "all of history", anchored to a fixed reference epoch for reproducibility). Type-based fallback otherwise.
- Uniqueness (D10): the user-fillable PK, compound UNIQUE constraints, single-column UNIQUE, and identifier-named columns (id/code/…) stay distinct across the batch and vs existing rows; junction tables get distinct FK combinations (capped at the available product, reported). Identifier ints get a monotonic sequence.
- FK (D14): every FK column samples an existing parent row (compound FK reads one consistent parent row); empty parent → friendly error.
- IN-CHECK (D17): a simple col IN ('a','b') CHECK becomes the value source (enum-as-CHECK just works); complex CHECKs are flagged in the advisory and best-effort generated (a violation rolls the batch back).
- Reproducibility (D4): --seed <n> → identical data on the same DB state. Holds with no exceptions: serial (rowid/MAX+1), FK (ORDER BY), shortid (seeded RNG), all generators.
- Output: the seeded-row count, a capped preview (first 20 rows), and a Hint-styled advisory naming enum-ish / underivable-CHECK columns filled generically. Count cap 10 000; seed t 0 is a no-op.
- Safety: one undo step (snapshot wraps the whole seed); replay re-runs it as a data write; the insert path is a single transaction (O(N), atomic, commit-db-last preserved).
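The type-gated, table-aware choice in the first bullet can be sketched as below. This is a minimal illustration, not the contents of src/seed/heuristics.rs: ColType, Gen, choose, and the exact set of compatible types per column name are assumptions made for the example.

```rust
/// Simplified column types used for the gate (hypothetical).
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ColType { Text, Int, Real, Date }

/// A small, hypothetical subset of a generator vocabulary.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Gen {
    Email,
    FirstName,
    PersonName,
    CompanyName,
    ProductName,
    Currency,
    RecentDate,
    FallbackForType,
}

/// Pick a generator from the column name, but only when the column type is
/// compatible; the table name disambiguates a generic column like `name`.
fn choose(table: &str, column: &str, ty: ColType) -> Gen {
    let tbl = table.to_ascii_lowercase();
    let col = column.to_ascii_lowercase();
    match (col.as_str(), ty) {
        ("email", ColType::Text) => Gen::Email,
        ("first_name", ColType::Text) => Gen::FirstName,
        // Assumption: a price may be stored as an integer or a real.
        ("price", ColType::Int | ColType::Real) => Gen::Currency,
        ("dob" | "created_at", ColType::Date) => Gen::RecentDate,
        ("name", ColType::Text) => match tbl.as_str() {
            "products" => Gen::ProductName,
            "users" => Gen::PersonName,
            "vendors" => Gen::CompanyName,
            _ => Gen::FallbackForType,
        },
        // Name matched nothing, or the type gate rejected it.
        _ => Gen::FallbackForType,
    }
}
```

With this shape, choose("users", "email", ColType::Int) falls through to the type-based fallback, because the name matches but the type gate does not.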
§3. Where the code lives
- src/seed/: the pure generation library (no DB): mod.rs (ColumnSpec, Generator, SeedRng, make_rng), heuristics.rs (choose_generator + the catalogue + is_enum_ish), generators.rs (generate_value + the product generator + bounded dates), check.rs (parse_in_check_values). ~40 Tier-1 tests, deterministic.
- src/db.rs: do_seed (+ SeedColPlan, sample_parent_key_tuples, seed_value_list_key, seed_max_int, SeedResult, DEFAULT_SEED_COUNT / MAX_SEED_COUNT / SEED_PREVIEW_CAP), the new insert_one_row core extracted from do_insert (shared, no tx/persist, so seed runs N rows in one tx), and the Request::Seed / Database::seed / worker wiring.
- src/dsl/grammar/data.rs: SEED CommandNode, build_seed, the --seed flag grammar (Seq[Flag("seed"), NumberLit], the first DSL flag with a value). Command::Seed in command.rs.
- Runtime/render: CommandOutcome::Seed, AppEvent::DslSeedSucceeded, App::handle_dsl_seed_success. Catalog keys ok.rows_seeded / seed.capped / seed.advisory_generic / help.data.seed / parse.usage.seed.
- Tests: tests/it/seed.rs (25 integration tests), tests/typing_surface/mod.rs (seed_completion_and_validity), tests/it/parse_error_pedagogy.rs (bare-seed near-miss row), src/app.rs (two render tests), src/dsl/shortid.rs (generate_with_rng).
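For orientation on what check.rs is for, here is a rough sketch of recognising a simple col IN ('a','b') CHECK and extracting its literals. It is not the real parse_in_check_values: the function name parse_in_list, its signature, and the rule that anything else counts as a complex CHECK are assumptions for illustration.

```rust
/// Returns the quoted literals when `expr` has exactly the shape
/// `<column> IN ('a', 'b', ...)`, and `None` for anything else
/// (which a caller could treat as a "complex" CHECK).
fn parse_in_list(expr: &str, column: &str) -> Option<Vec<String>> {
    let rest = expr.trim();
    // The expression must start with the column name (case-insensitive).
    if !rest.get(..column.len())?.eq_ignore_ascii_case(column) {
        return None;
    }
    let rest = rest[column.len()..].trim_start();
    // Then the IN keyword.
    if !rest.get(..2)?.eq_ignore_ascii_case("in") {
        return None;
    }
    let rest = rest[2..].trim_start();
    // Then one parenthesised list spanning the remainder of the expression.
    let inner = rest.strip_prefix('(')?.strip_suffix(')')?;

    let mut values = Vec::new();
    let mut chars = inner.chars().peekable();
    loop {
        while chars.peek().map_or(false, |c| c.is_whitespace()) {
            chars.next();
        }
        // Each item must be a single-quoted literal.
        if chars.next()? != '\'' {
            return None;
        }
        let mut lit = String::new();
        loop {
            let c = chars.next()?; // unterminated literal -> None
            if c == '\'' {
                if chars.peek() == Some(&'\'') {
                    chars.next(); // '' is an escaped quote
                    lit.push('\'');
                } else {
                    break;
                }
            } else {
                lit.push(c);
            }
        }
        values.push(lit);
        while chars.peek().map_or(false, |c| c.is_whitespace()) {
            chars.next();
        }
        match chars.next() {
            Some(',') => continue,
            None => return Some(values),
            Some(_) => return None,
        }
    }
}
```

The sketch deliberately rejects compound expressions such as status IN ('a') OR status IS NULL, which matches the idea that only the simple form becomes a value source.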
§4. Process notes (the two /runda passes)
- Pre-build /runda (on the ADR) found six blockers: undo integration (D15), replay semantics (D16), set-value quoting (D2), CHECK handling (D17), an advisory phase-ordering bug (D13), and auto-show flooding (D18). All were folded into ADR-0048 before any code; the three genuine forks were re-escalated and user-resolved.
- Post-implementation /runda (on the whole implementation) found eight gaps, all closed: FK-sampling determinism (→ ORDER BY), shortid not reproducible (→ seeded RNG; fixed rather than merely documented, the user chose the fix), and six untested ADR decisions (D5 advanced mode, D15 undo, D16 replay, D17 complex-CHECK advisory, atomic rollback, zero-count), with tests added for each.
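The FK-sampling determinism gap is easy to picture with a small sketch: a seeded RNG alone is not enough if the parent keys arrive in an unspecified order, so the keys are ordered first. Everything here is hypothetical (SketchRng, sample_parents, the SplitMix64 generator standing in for whatever RNG the project uses); sorting stands in for the ORDER BY.

```rust
/// Tiny deterministic PRNG (SplitMix64), used only for this sketch.
struct SketchRng(u64);

impl SketchRng {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

/// Sample `count` parent keys reproducibly for a given seed.
/// An empty parent table is an error rather than a silent no-op.
fn sample_parents(parent_keys: &[i64], count: usize, seed: u64) -> Result<Vec<i64>, String> {
    if parent_keys.is_empty() {
        return Err("parent table is empty; seed it first".to_string());
    }
    // Fix the order first (the sketch's stand-in for ORDER BY), so the
    // same seed picks the same rows however the keys were returned.
    let mut ordered = parent_keys.to_vec();
    ordered.sort_unstable();

    let mut rng = SketchRng(seed);
    // Modulo introduces a slight bias; acceptable for a sketch.
    let picks: Vec<i64> = (0..count)
        .map(|_| ordered[(rng.next_u64() % ordered.len() as u64) as usize])
        .collect();
    Ok(picks)
}
```

Calling sample_parents with the same seed on [3, 1, 2] and on [2, 3, 1] yields the same picks, which is the property the ORDER BY fix restores.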
§5. Phase 2 (deferred — designed in ADR-0048, NOT built)
These are the only seed pieces left; both have full designs in ADR-0048:
- The set override clause (D2): seed t 20 set role in ('a','b'), status = 'x', work_addr as email, price between 10 and 100. Value / pick-from-list / explicit-generator / range, quoted literals (grammar-consistent). This is the SD2 "override hooks" core. The ColumnSpec.check_in_values → PickFrom plumbing and the Generator vocabulary already exist; this adds the grammar + a set clause that overrides the per-column plan.
- Column-fill (seed <table>.<column>, D1 form 2): fill one column across existing rows (an UPDATE). Refuses PK/autogen targets; empty-table no-op.
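As a thinking aid for the set clause, the four override forms could be modelled roughly as below. This is not the ADR-0048 design and nothing like it is built yet: Source, apply_overrides, and the choice to reject unknown columns are all assumptions made for the sketch.

```rust
use std::collections::BTreeMap;

/// Where a column's values come from (hypothetical model).
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Source {
    Heuristic,                  // default: whatever the name/type heuristics chose
    Value(String),              // status = 'x'
    PickFrom(Vec<String>),      // role in ('a','b')
    Generator(String),          // work_addr as email
    Range { lo: f64, hi: f64 }, // price between 10 and 100
}

/// Start from the default plan and let each `set` item replace one column's source.
/// Assumption: naming a column that does not exist is an error.
fn apply_overrides(
    columns: &[&str],
    overrides: &[(&str, Source)],
) -> Result<BTreeMap<String, Source>, String> {
    let mut plan: BTreeMap<String, Source> = columns
        .iter()
        .map(|c| (c.to_string(), Source::Heuristic))
        .collect();
    for (col, src) in overrides {
        match plan.get_mut(*col) {
            Some(slot) => *slot = src.clone(),
            None => return Err(format!("unknown column in set clause: {col}")),
        }
    }
    Ok(plan)
}
```

Columns not named in the clause keep their heuristic source, which is the "overrides the per-column plan" behaviour described above.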
requirements.md: SD1 [x], SD2 [/] (core done; the two
above open), A1 14/15 (only hint/H2 unregistered).
§6. How to take over
- Read handoffs 63 → 64 → 65, CLAUDE.md, docs/requirements.md, docs/adr/0048-seed-fake-data-generation.md (the whole thing: D1–D18 + the as-built status block).
- Seed is feature-complete for Phase 1; nothing pending. Next options (user's call): seed Phase 2 (set clause + column-fill); H2 hint (closes A1), which needs its own ADR; TT5 CI; or the larger V4 journal / tutorial ADRs.
- Two minor, user-deferred observations (non-blocking): the uniqueness retry cap (MAX_ATTEMPTS=200) can cap a medium unique domain slightly below its true size (junction/small domains are exact); literal_to_value doesn't type-check an IN-CHECK literal vs a numeric column (a malformed int IN ('a') CHECK fails cleanly at bind).
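The retry-cap observation can be reproduced with a small rejection-sampling sketch. It assumes the cap applies per value (up to MAX_ATTEMPTS draws before giving up), which may not match do_seed exactly; SketchRng and distinct_values are hypothetical names, and SplitMix64 is only a stand-in RNG.

```rust
use std::collections::HashSet;

const MAX_ATTEMPTS: u32 = 200;

/// Tiny deterministic PRNG (SplitMix64), used only for this sketch.
struct SketchRng(u64);

impl SketchRng {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

/// Draw up to `want` distinct values from 0..domain by rejection sampling,
/// stopping once a single value needs more than MAX_ATTEMPTS draws.
fn distinct_values(domain: u64, want: usize, seed: u64) -> Vec<u64> {
    let mut out = Vec::new();
    if domain == 0 {
        return out;
    }
    let mut rng = SketchRng(seed);
    let mut seen = HashSet::new();
    'rows: while out.len() < want {
        for _ in 0..MAX_ATTEMPTS {
            let v = rng.next_u64() % domain;
            if seen.insert(v) {
                out.push(v);
                continue 'rows;
            }
        }
        break; // retry budget exhausted: return fewer than `want`
    }
    out
}
```

With a domain of 1000 and one value left, each draw succeeds with probability 1/1000, so all 200 attempts miss with probability about 0.999^200 ≈ 0.82. A request for the full domain therefore usually stops a little short, while a request far below the domain size is unaffected.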