Two additive D7 catalogue rules, surfaced while writing the website seed docs. No change to the type fallback, executor, or grammar. #33 — year-like int columns. `published`/`birth_year` were just `int`, so they fell to the unbounded int path and produced nonsense (`9419`). Add an int-gated year rule (after the quantity rule, so `year_count` stays a count): `year`/`*_year`/`published`/`founded` -> a bounded 1950-2025 year (new `YearRecent`), or the dob-style birth window 1945-2007 for `birth`/`born`/`dob` (new `YearBirth`). Plain int; not added to the D9 named-generator vocabulary. #34 — conventional choice sets. A few enum-ish names have a near-canonical small set that reads far better than lorem text. Add a type-gated PickFrom lookup (reusing the existing generator): priority/prio, severity, rating/stars. `status` is deliberately excluded (values too domain-specific) and keeps the D12 advisory; a user IN-CHECK still wins. `priority` leaves ENUM_TOKENS. ADR-0048 Amendment 1; +8 tests (incl. a column-fill integration test that also closes a pre-existing gap on that path).
This commit is contained in:
@@ -317,6 +317,8 @@ with the implementation):
|
||||
| `url`/`website`/`homepage` · `color`/`colour` | URL / hex colour | text |
|
||||
| `price`/`amount`/`cost`/`salary`/`balance`/`total` | currency-range number | numeric |
|
||||
| `age` · `quantity`/`qty`/`stock`/`count` | 18–80 · small int | numeric |
|
||||
| `year`/`*_year`/`published`/`founded` (Amendment 1) | bounded year (birth window for `birth`/`born`/`dob`, else 1950–2025) | int |
|
||||
| `priority`/`prio` · `severity` · `rating`/`stars` (Amendment 1) | built-in `PickFrom` value set | text/int |
|
||||
| `date`/`*_date` | date, recent ~3 yr window | date |
|
||||
| `dob`/`birthday` | date, adult window (18–80 yr ago) | date |
|
||||
| `timestamp`/`datetime` · `created_at`/`updated_at`/`*_at` | datetime, recent window (`updated_at` ≥ `created_at`) | datetime |
|
||||
@@ -675,3 +677,66 @@ the regression floor.
|
||||
derive-`IN`-else-friendly-fail tier.
|
||||
- **`set`-driven NULL / per-column report / recursive parent seed:**
|
||||
deferred — see Out of scope.
|
||||
|
||||
## Amendment 1 — year-as-int + conventional choice sets (2026-06-12)
|
||||
|
||||
Two SD2-style refinements to the D7 catalogue, surfaced while writing
|
||||
the website `seed` docs. Both are additive name rules; no change to D8
|
||||
(type fallback), the executor, or the grammar.
|
||||
|
||||
### Issue #33 — year-like `int` columns
|
||||
|
||||
A column such as `published` or `birth_year` was just an `int`, so it
|
||||
fell through to the unbounded type-based `int` path (D8) and produced
|
||||
nonsense like `9419` or `1426` — implausible as years, undercutting the
|
||||
"realistic data" pedagogy. Added an **`int`-gated** year rule, placed
|
||||
*after* the quantity rule (so `year_count` stays a count):
|
||||
|
||||
- `year` / `*_year` / `published` / `founded` → **`YearRecent`**, a
|
||||
bounded window of **1950–2025** (75 years relative to the fixed
|
||||
`REF_YEAR`, wide enough for published books / founding years /
|
||||
release years; matches the issue's own `between 1950 and 2020`
|
||||
workaround).
|
||||
- the same with a `birth` / `born` / `dob` token (e.g. `birth_year`) →
|
||||
**`YearBirth`**, mirroring the existing `dob → DateAdult` adult birth
|
||||
window as years (**1945–2007**).
|
||||
|
||||
Both emit a plain `int`. `published` / `founded` are included
|
||||
(user-confirmed): an `int` so named is almost always a year (a flag
|
||||
would be `is_published`). The generators are **not** added to the D9
|
||||
named-generator vocabulary — explicit control stays with `set <col>
|
||||
between <lo> and <hi>`.
|
||||
|
||||
### Issue #34 — built-in value sets for conventional choice names
|
||||
|
||||
D12 deliberately does not guess values for enum-ish names. For a few,
|
||||
though, there is a near-canonical small set that reads far better than
|
||||
lorem text. Added a **type-gated `PickFrom`** lookup (reusing the
|
||||
existing generator — no new machinery), placed ahead of the enum-ish
|
||||
fallthrough:
|
||||
|
||||
| Name (tokens) | text | int |
|
||||
|---|---|---|
|
||||
| `priority` / `prio` | `low`/`medium`/`high` | `1`/`2`/`3` |
|
||||
| `severity` | `low`/`medium`/`high`/`critical` | `1`/`2`/`3`/`4` |
|
||||
| `rating` / `stars` | — | `1`–`5` |
|
||||
|
||||
A user-declared `IN`-CHECK (D17) still wins — it is resolved before the
|
||||
heuristics. Any name that gains a set is **removed from the enum-ish
|
||||
advisory trigger** (`priority` left `ENUM_TOKENS`); since the advisory
|
||||
(D13) only fires on `Generator::Generic`, a `PickFrom` name is excluded
|
||||
either way, but the removal keeps `is_enum_ish` semantically "names seed
|
||||
still can't guess".
|
||||
|
||||
**`status` is deliberately excluded** (user-confirmed on the issue): its
|
||||
real values are too domain-specific (`active/inactive`,
|
||||
`open/closed/pending`, `draft/published`, …), so it keeps the D12
|
||||
"don't guess" stance — generic text + the advisory pointing at `set
|
||||
status in (…)`. `state` stays its US-state-name generator (D7);
|
||||
`type`/`kind`/`category`/`stage`/`gender` and `size`/`tier`/`plan` were
|
||||
considered and left to the advisory.
|
||||
|
||||
**Website follow-up** (tracked on the `website` branch, not here): the
|
||||
`seed` cast exercises a `tickets` table with `priority`; it should be
|
||||
re-recorded so the table tightens once `priority` collapses to a short
|
||||
value — likely subsumed by the pre-publication cast sweep.
|
||||
|
||||
+1
-1
File diff suppressed because one or more lines are too long
Reference in New Issue
Block a user