feat(seed): fake-data generation library + fake dependency (ADR-0048 P1.1)
The pure generation half of `seed` (no command wiring yet):

- src/seed/: ColumnSpec + Generator model and a seeded StdRng; the
  type-gated name-heuristic catalogue (D7) with documented false-positive
  guards; table-context name disambiguation (D11); identifier (D10) and
  enum-ish (D12) detection; per-type + bounded-date generators (D8); the
  hand-rolled product generator (D9); and PickFrom for IN-CHECK / enum
  lists.
- Adds the `fake` crate (v5, default features). Verified: single
  rand 0.10.1 (no duplication), determinism via one seeded StdRng driving
  both fake and the hand-rolled generators, security-clean across
  osv/grype/trivy.
- ADR-0048 D3 updated to record the dependency verification.

32 Tier-1 tests (exact-value via fixed --seed); 1673 lib tests pass,
clippy all-targets clean.
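As a rough illustration of two pieces named in the message, PickFrom for IN-CHECK / enum lists and the bounded-date generators (D8), here is a minimal std-only sketch. It is not the code in src/seed/: `SplitMix64`, `pick_from`, and `bounded_day` are hypothetical names, and the small PRNG merely stands in for the seeded `StdRng` so the example compiles without external crates.

```rust
/// Tiny deterministic PRNG (SplitMix64). A std-only stand-in for the seeded
/// `StdRng` mentioned in the commit message; hypothetical, not the project's type.
pub struct SplitMix64(u64);

impl SplitMix64 {
    pub fn new(seed: u64) -> Self {
        SplitMix64(seed)
    }

    pub fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    /// Value in `0..n`. Plain modulo, so slightly biased; acceptable for a sketch.
    pub fn below(&mut self, n: u64) -> u64 {
        assert!(n > 0, "n must be non-zero");
        self.next_u64() % n
    }
}

/// Pick one value from a fixed list, as for an `IN (...)` CHECK or enum column.
/// Returns `None` when the list is empty.
pub fn pick_from<'a>(rng: &mut SplitMix64, allowed: &[&'a str]) -> Option<&'a str> {
    if allowed.is_empty() {
        return None;
    }
    let idx = rng.below(allowed.len() as u64) as usize;
    Some(allowed[idx])
}

/// Pick a day number (days since an arbitrary epoch) inside the inclusive
/// window `first..=last`, so generated dates never leave the bounds.
pub fn bounded_day(rng: &mut SplitMix64, first: i64, last: i64) -> i64 {
    assert!(first <= last, "window must not be empty");
    let span = (last - first) as u64 + 1;
    first + rng.below(span) as i64
}
```

Two generators built from the same seed yield the same picks and the same day numbers, which is the property that exact-value tests under a fixed `--seed` rely on.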
@@ -170,23 +170,29 @@ companies, phone numbers, lorem text, dates. Generation is driven by a
 per-column **generator** chosen by the heuristics (D7) or the override
 (D2), falling back to **type-based** generation (D8).

-**Two open implementation-time verifications** (flagged honestly, to
-be resolved when the dependency is locked, not assumed here):
+**Implementation-time verifications (resolved 2026-06-11 when the
+dependency was added):**

-- **`rand` de-duplication.** The project is on `rand 0.10.1`; `fake`
-brings its own `rand`. Confirm a single `rand` version resolves (a
-duplicate is harmless but should be a conscious outcome, and
-`shortid.rs` + the seed RNG must share the version we standardise
-on).
-- **`fake` module inventory.** Confirm which generators v5 actually
-ships (strong prior: it has Name/Internet/Address/Company/Lorem/
-Chrono/Currency/Job/Color but **no commerce/product module** — see
-D9), and the minimal feature-flag set needed (derive, chrono-backed
-dates).
-- **Security (new-dependency posture).** `fake` and its transitive
-tree must be scanned (`trivy fs`, `grype`, `osv-scanner`) before
-merge, per the global new-dependency rule; findings documented, not
-silently accepted.
+- **`rand` de-duplication — clean.** `fake` 5.1.0 depends on
+`rand = "0.10"`, the **same major** as the project's `rand 0.10.1`,
+so `cargo tree -e normal` resolves a **single** `rand 0.10.1` (no
+runtime duplication; the `rand 0.8.6` visible to `cargo tree -i
+rand` is only `fake`'s own dev-dependency, never compiled for us).
+Consequence for D4: one seeded `rand 0.10` `StdRng` can drive
+**both** `fake`'s `fake_with_rng` and the hand-rolled generators —
+determinism is single-RNG, single-version, and shares `shortid.rs`'s
+`rand` version.
+- **`fake` module inventory / features — confirmed.** Default features
+(`["either"]`) cover the core string fakers used here
+(Name/Internet/Address/Company/Lorem/PhoneNumber); `fake`'s `chrono`
+feature is **deliberately omitted** (dates generated in-house for
+D8's bounded windows). No commerce/product module exists → `product`
+is hand-rolled (D9). (The exact faker call sites are pinned when the
+generation library is built.)
+- **Security (new-dependency posture) — clean.** The `fake` tree (296
+packages total) scanned clean by **all three** mandated scanners:
+`osv-scanner` (no issues), `grype` (no vulnerabilities), `trivy fs
+--scanners vuln` (0). No findings to document or accept.

 ### D4 — Determinism: `--seed <n>` (fork, user-chosen: "optional flag")

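The "Consequence for D4" point in the hunk above (one seeded RNG driving both the library-backed fakers and the hand-rolled generators) can be sketched as follows. This is an illustration under stated assumptions, not the project's code: to stay std-only it substitutes a small SplitMix64 PRNG for `rand`'s `StdRng`, and `library_style_name`, `hand_rolled_product`, and `generate_rows` are hypothetical names. In the real crate the first function's role would be played by `fake`'s `fake_with_rng`.

```rust
/// Std-only stand-in for a seeded `StdRng` (SplitMix64); hypothetical.
pub struct SplitMix64(u64);

impl SplitMix64 {
    pub fn new(seed: u64) -> Self {
        SplitMix64(seed)
    }

    pub fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    /// Value in `0..n` (plain modulo; slightly biased, fine for a sketch).
    pub fn below(&mut self, n: u64) -> u64 {
        assert!(n > 0, "n must be non-zero");
        self.next_u64() % n
    }
}

// Illustrative word lists only; not data from the project.
const FIRST_NAMES: [&str; 3] = ["Ada", "Grace", "Linus"];
const ADJECTIVES: [&str; 3] = ["Compact", "Rugged", "Silent"];
const NOUNS: [&str; 3] = ["Lamp", "Kettle", "Backpack"];

/// Stands in for a library-backed faker that borrows the caller's RNG.
fn library_style_name(rng: &mut SplitMix64) -> String {
    FIRST_NAMES[rng.below(FIRST_NAMES.len() as u64) as usize].to_string()
}

/// Stands in for the hand-rolled product generator (D9).
fn hand_rolled_product(rng: &mut SplitMix64) -> String {
    let adj = ADJECTIVES[rng.below(ADJECTIVES.len() as u64) as usize];
    let noun = NOUNS[rng.below(NOUNS.len() as u64) as usize];
    format!("{} {}", adj, noun)
}

/// One RNG is created from the seed and threaded through every generator,
/// so the whole output is a pure function of `seed` and `n`.
pub fn generate_rows(seed: u64, n: usize) -> Vec<(String, String)> {
    let mut rng = SplitMix64::new(seed);
    (0..n)
        .map(|_| (library_style_name(&mut rng), hand_rolled_product(&mut rng)))
        .collect()
}
```

Because both generator families draw from the same stream in a fixed order, rerunning with the same seed reproduces every row, and a longer run begins with exactly the rows of a shorter one. Giving each family its own RNG would also be deterministic, but it would need a rule for deriving the second seed, which the single-RNG design avoids.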