docs(website): document the seed command, with cast (ADR-0048)

New Reference page "Generating sample data" (captured output + a two-table seed cast showing generation, FK sampling context, and a `set` override); cross-links from inserting-and-editing-data and columns; seed added to the rdbms highlight grammar; querying/sql-queries renumbered. Cast stance in STYLE.md revised to "justify the absence". Refs #33, #34.
2026-06-12 14:41:37 +00:00
parent 4691d7950a
commit 77c55fa669
9 changed files with 933 additions and 19 deletions
@@ -37,7 +37,9 @@ The new column goes at the end of the table and is empty for existing rows.
 After the change, the updated structure is shown automatically. If you add a
 `serial` or `shortid` column — or change a column *to* one of those types —
 the empty cells are filled with freshly generated values in the same step (see
-[Types](/reference/types/)).
+[Types](/reference/types/)). To fill the new column on existing rows with
+generated data, use
+[`seed books.page_count`](/reference/generating-sample-data/#filling-one-column).

 ## Rename a column

@@ -0,0 +1,309 @@
+---
+title: Generating sample data
+description: Fill tables with realistic, name-aware fake rows using seed — with foreign keys, reproducible runs, and per-column overrides.
+sidebar:
+  order: 8
+---
+
+import Demo from '../../../components/Demo.astro';
+
+Once a schema is in place you usually want rows to query against — enough of
+them to make `where`, `group by`, `order by`, and `limit` interesting. Typing
+those by hand is tedious. `seed` fills a table with **plausible, generated
+data** in one command, so you can get to querying straight away. It works in
+both [simple and advanced mode](/getting-started/modes/), with the same
+syntax.
+
+<Demo src="/casts/seed.cast" title="One line fills a table with realistic, ready-to-query rows — seed reads each column's name to decide what to make." />
+
+The examples use the [example library](/getting-started/example-library/).
+
+## Filling a table
+
+`seed <table> <count>` generates rows and inserts them. If you leave out the
+count, seed makes **20** rows:
+
+```rdbms
+seed members 6
+```
+
+```
+  6 row(s) seeded into members
+┌───────────┬─────────────────┬────────────┐
+│ member_id │ name            │ joined     │
+├───────────┼─────────────────┼────────────┤
+│         1 │ Bret Leffler    │ 2023-10-16 │
+│         2 │ Santa Nicolas   │ 2024-03-14 │
+│         3 │ Vivienne Barton │ 2024-03-04 │
+│         4 │ Fatima Rippin   │ 2022-11-14 │
+│         5 │ Lola Cole       │ 2025-05-29 │
+│         6 │ Reina Waters    │ 2023-11-25 │
+└───────────┴─────────────────┴────────────┘
+```
+
+The data is **random**, so your rows will differ — run it again for a fresh
+set, or see [Reproducible runs](#reproducible-runs) to pin it. `member_id`
+was filled automatically (it is a `serial` column), exactly as it is for an
+[`insert`](/reference/inserting-and-editing-data/); seed leaves `serial` and
+`shortid` columns to the database.
+
+Notice the values are not random noise: `name` produced believable people and
+`joined` produced recent dates. That is because seed reads each **column's
+name** to decide what to generate.
+
+## How columns are filled
+
+A column's **name** chooses a generator, but only when the column's
+[type](/reference/types/) fits — a column called `email` typed `int` will not
+get an address. Matching is case-insensitive and looks at the name's parts
+(`first_name`, `signup_date`, `is_active`). A representative set:
+
+| Column name looks like… | You get | For types |
+|---|---|---|
+| `name`, `first_name`, `last_name`, `full_name` | a person's name | `text` |
+| `email` | an email address | `text` |
+| `username`, `login`, `handle` | a username | `text` |
+| `phone`, `mobile`, `tel` | a phone number | `text` |
+| `city`, `country`, `state`, `street`, `zip` | address parts | `text` |
+| `company`, `employer` · `job`, `position` | a company / job title | `text` |
+| `description`, `bio`, `notes`, `comment` | a sentence or paragraph | `text` |
+| `url`, `website` · `color` | a URL / hex colour | `text` |
+| `price`, `amount`, `cost`, `salary` | a money amount | `int`, `real`, `decimal` |
+| `age` · `quantity`, `qty`, `stock` | a plausible age / small number | `int` |
+| `date`, `dob`, `created_at`, `updated_at` | a recent date (or, for a birth-date name like `dob`, a date in a plausible birth range) | `date`, `datetime` |
+| `is_active`, `has_*`, `enabled` | `true` / `false` | `bool` |
+
+When a column's name **isn't** recognised, seed falls back to its **type**:
+placeholder words for `text`, a number for `int` and `real`, a recent value
+for `date`. So a column like `published` (just an `int` to the database) gets
+an arbitrary number, not a sensible year. That is exactly what the
+[`set` clause](#choosing-values-yourself) is for — pin those columns
+explicitly.
+
+Two name families are handled specially:
+
+- **Identifier-like names** that are *not* a foreign key or the primary key —
+  `code`, `sku`, `ref`, `barcode`, a `*_id` that isn't a relationship — get
+  **unique** values, so they read like real identifiers and never collide.
+- **Choice-like names** — `status`, `role`, `type`, `category`, `priority`,
+  and similar — have no sensible generic value, so seed fills them with
+  placeholder text and then [tells you](#columns-seed-cant-guess) to choose
+  the real values yourself.
+
+Any column with a `unique` [constraint](/reference/constraints/) always gets
+collision-free values, whatever its name — that is a correctness guarantee,
+not a guess.
+
+## Foreign keys
+
+Seed respects [relationships](/reference/relationships/). A foreign-key column
+is filled by **sampling from the rows that already exist** in the parent
+table, so every generated reference is valid. Seed the parent first:
+
+```rdbms
+seed authors 5
+seed books 6 set published between 1950 and 2020
+```
+
+```
+  6 row(s) seeded into books
+┌─────────┬────────────────────┬───────────┬───────────┬─────────────────────────────┐
+│ book_id │ title              │ author_id │ published │ isbn                        │
+├─────────┼────────────────────┼───────────┼───────────┼─────────────────────────────┤
+│       1 │ Austen Wuckert     │         4 │      1960 │ nihil molestiae             │
+│       2 │ Dayne Cremin       │         4 │      1978 │ repellat rerum              │
+│       3 │ Jayda Hagenes      │         1 │      1987 │ corrupti perspiciatis earum │
+│       4 │ Bethany VonRueden  │         5 │      1961 │ laborum deserunt facere     │
+│       5 │ Maximillian Hammes │         2 │      2018 │ fuga in                     │
+│       6 │ Skylar Cassin      │         2 │      2020 │ alias qui                   │
+└─────────┴────────────────────┴───────────┴───────────┴─────────────────────────────┘
+```
+
+Every `author_id` points at a real author (1–5). Duplicates are expected and
+correct, since one author can have many books. (Neither `title` nor `isbn`
+maps to a book-specific generator: in this run `title` came out as
+person-like names and `isbn` as placeholder text. Pin them with
+[`set`](#choosing-values-yourself) if you want something specific.)
+
+If a parent table is **empty**, seed refuses rather than inventing a reference
+that would break the relationship:
+
+```
+cannot seed `books`: parent table `authors` (referenced by `author_id`) has
+no rows. Seed or insert into `authors` first.
+```
+
+This mirrors the order you would insert data by hand, and quietly teaches
+foreign-key dependency order. A junction table linking two parents (a
+many-to-many bridge) is filled with **distinct combinations** of the parents'
+keys; if you ask for more rows than there are combinations, seed makes as many
+as it can and tells you.
+
+## Reproducible runs
+
+Add `--seed <n>` to make a run **repeatable**: the same number produces the
+same data, so a teacher can hand out one dataset and a demo stays stable.
+
+```rdbms
+seed members 6 --seed 42
+```
+
+```
+  6 row(s) seeded into members
+┌───────────┬─────────────────┬────────────┐
+│ member_id │ name            │ joined     │
+├───────────┼─────────────────┼────────────┤
+│         1 │ Bret Leffler    │ 2023-10-16 │
+│         2 │ Santa Nicolas   │ 2024-03-14 │
+│         3 │ Vivienne Barton │ 2024-03-04 │
+│         4 │ Fatima Rippin   │ 2022-11-14 │
+│         5 │ Lola Cole       │ 2025-05-29 │
+│         6 │ Reina Waters    │ 2023-11-25 │
+└───────────┴─────────────────┴────────────┘
+```
+
+Run that again and you get the very same six members. "The same data" is
+relative to the table's current contents: because foreign keys and unique
+values read the rows already present, reproducibility assumes the same
+starting point.
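+
+As a sketch of what "the same starting point" means in practice, assuming
+`authors` and `books` are both empty before each attempt, repeating these two
+commands in the same order should rebuild the same pair of tables (the seed
+number 42 is arbitrary):
+
+```rdbms
+seed authors 5 --seed 42
+seed books 6 set published between 1950 and 2020 --seed 42
+```
+
+Running the second command against a different set of authors would change
+which `author_id` values can be sampled, so the books would differ even with
+the same `--seed`.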
+
+## Choosing values yourself
+
+Seed's guesses are a starting point. The optional `set` clause pins how one or
+more columns are filled. It reuses syntax you already know from
+[`where`](/reference/querying-and-inspecting/) and `update`, so there is
+little new to learn. It takes four forms:
+
+| Form | Example | Meaning |
+|---|---|---|
+| Fixed value | `set status = 'active'` | every row gets the same value |
+| Pick from a list | `set role in ('admin', 'editor', 'viewer')` | a random choice from the list |
+| Named generator | `set contact as email` | force a specific generator |
+| Range | `set price between 10 and 100` | a value in the range (also dates) |
+
+Combine them with commas:
+
+```rdbms
+seed tickets 6 set status in ('open', 'pending', 'closed'), priority in ('low', 'high')
+```
+
+```
+  6 row(s) seeded into tickets
+┌───────────┬──────────────────────────┬─────────┬──────────┐
+│ ticket_id │ subject                  │ status  │ priority │
+├───────────┼──────────────────────────┼─────────┼──────────┤
+│         7 │ atque libero             │ pending │ high     │
+│         8 │ culpa maiores et         │ open    │ low      │
+│         9 │ natus rerum animi        │ open    │ high     │
+│        10 │ sapiente rem             │ closed  │ low      │
+│        11 │ placeat blanditiis quasi │ closed  │ high     │
+│        12 │ sed exercitationem       │ closed  │ low      │
+└───────────┴──────────────────────────┴─────────┴──────────┘
+```
+
+Text values and list items are **quoted** (`'admin'`), exactly as elsewhere;
+only numbers are bare. Dates in a range are quoted too
+(`set joined between '2023-01-01' and '2024-12-31'`). A range on a number
+column takes numeric bounds and a range on a date column takes date bounds;
+a mismatched bound is rejected with a friendly error.
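+
+A few illustrative commands that follow these quoting rules, using columns
+from the example library (the fixed value `'Test Member'` is arbitrary, and
+the `books` line assumes `authors` already has rows):
+
+```rdbms
+seed members 6 set joined between '2023-01-01' and '2024-12-31'
+seed books 6 set published between 1950 and 2020
+seed members 6 set name = 'Test Member'
+```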
+
+The named generators you can use after `as` are:
+
+`age`, `bool`, `city`, `color`, `company`, `country`, `date`, `datetime`,
+`email`, `first_name`, `job`, `last_name`, `name`, `paragraph`, `password`,
+`phone`, `price`, `product`, `sentence`, `state`, `street`, `url`, `username`,
+`zip`.
+
+:::note
+If you pin a `unique` column (or a single-column primary key) to a fixed value
+or a list that is too short to fill every row, seed stops and explains — it
+cannot make 20 distinct rows from three choices. Use a generator or a longer
+list.
+:::
+
+## Filling one column
+
+`seed <table>.<column>` fills **one column across the rows that already
+exist**, rather than adding new rows — the natural follow-up to
+[`add column`](/reference/columns/), and the way to repair a single
+column seed guessed wrongly. Combined with `set`, it sets that column
+deliberately:
+
+```rdbms
+seed tickets.status set status in ('open', 'closed')
+```
+
+```
+  12 row(s) seeded into tickets
+┌───────────┬───────────────────────────┬────────┬──────────────────────────────────┐
+│ ticket_id │ subject                   │ status │ priority                         │
+├───────────┼───────────────────────────┼────────┼──────────────────────────────────┤
+│         1 │ ad natus                  │ closed │ sed iusto                        │
+│         2 │ voluptas iure aut         │ closed │ eveniet consequatur consequuntur │
+│         3 │ rerum nulla reprehenderit │ closed │ est quibusdam et                 │
+│         4 │ cumque autem voluptas     │ open   │ facere maxime                    │
+│         5 │ sit harum                 │ open   │ eveniet commodi reprehenderit    │
+│         6 │ rerum deserunt            │ closed │ mollitia ut repellendus          │
+└───────────┴───────────────────────────┴────────┴──────────────────────────────────┘
+```
+
+Only `status` changed; the other columns are untouched. (The listing shows six
+of the 12 rows that were filled.) Column-fill **refuses** primary-key and
+autogenerated (`serial` / `shortid`) columns, because you do not "fill in" an
+identity column, and on an empty table it is a no-op.
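+
+As a sketch of the `add column` follow-up, assuming `books` has just gained an
+`int` column named `page_count` (the range bounds are arbitrary), a ranged
+column-fill would populate it on every existing row:
+
+```rdbms
+seed books.page_count set page_count between 80 and 900
+```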
+
+## Columns seed can't guess
+
+Choice-like columns — `status`, `role`, `type`, and the like — get placeholder
+text, because there is no sensible generic value for them. After a seed, the
+playground points this out:
+
+```rdbms
+seed tickets 6
+```
+
+```
+  6 row(s) seeded into tickets
+┌───────────┬───────────────────────────┬──────────────────────────────┬──────────────────────────────────┐
+│ ticket_id │ subject                   │ status                       │ priority                         │
+├───────────┼───────────────────────────┼──────────────────────────────┼──────────────────────────────────┤
+│         1 │ ad natus                  │ temporibus eos rerum         │ sed iusto                        │
+│         2 │ voluptas iure aut         │ repudiandae commodi possimus │ eveniet consequatur consequuntur │
+│         3 │ rerum nulla reprehenderit │ earum culpa                  │ est quibusdam et                 │
+│         4 │ cumque autem voluptas     │ ea praesentium pariatur      │ facere maxime                    │
+│         5 │ sit harum                 │ et et laboriosam             │ eveniet commodi reprehenderit    │
+│         6 │ rerum deserunt            │ qui voluptate                │ mollitia ut repellendus          │
+└───────────┴───────────────────────────┴──────────────────────────────┴──────────────────────────────────┘
+```
+
+> `status, priority` filled with generic text — they look like fixed value
+> sets. Pin them next time with `set status in ('…', '…')`, or fix these rows
+> with `seed tickets.status set status in ('…', '…')`.
+
+The two fixes it suggests are the [`set` clause](#choosing-values-yourself) on
+the next seed, and [column-fill](#filling-one-column) to repair the rows you
+just made. If a `check` constraint restricts a column to a list of values
+(`check status in ('open', 'closed')`), seed reads that list and uses it
+automatically — no override needed.
+
+## Limits
+
+- The most you can seed at once is **10,000** rows; more is a friendly error
+  (a guard against a typo like `seed members 1000000`). Seed in smaller
+  batches if you genuinely need more.
+- `seed members 0` does nothing.
+- If seed cannot produce a value for a `not null` column (the only real case
+  is a `not null blob`), it refuses the whole command and names the column,
+  rather than failing partway through.
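+
+To go past the cap, one approach is simply to repeat the command; each run
+below stays within the 10,000-row limit (the counts are illustrative):
+
+```rdbms
+seed members 10000
+seed members 10000
+```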
+
+A whole `seed` is a **single step** in the history: one [`undo`](/using-the-playground/undo-and-history/)
+removes every row it added, not one row at a time.
+
+## Syntax
+
+```rdbms-syntax
+seed <Table> [<count>] [set <col> (= <value> | in (<value>, ...) | as <generator> | between <low> and <high>)[, ...]] [--seed <n>]
+seed <Table>.<column> [set ...] [--seed <n>]
+```
+
+See also [Inserting & editing data](/reference/inserting-and-editing-data/),
+[Relationships](/reference/relationships/), [Columns](/reference/columns/), and
+[Constraints](/reference/constraints/).
@@ -113,5 +113,8 @@ update <Table> set <col>=<value>[, ...] (where <expr> | --all-rows)
 delete from <Table> (where <expr> | --all-rows)
 ```

+To fill a table with many rows at once instead of typing each one, see
+[Generating sample data](/reference/generating-sample-data/).
+
 See also [Querying & inspecting](/reference/querying-and-inspecting/) and
 [Constraints](/reference/constraints/).
@@ -2,7 +2,7 @@
 title: Querying & inspecting
 description: View rows and schema, run SQL queries with joins, and explain how a query runs.
 sidebar:
-  order: 8
+  order: 9
 ---

 This page covers reading what is in your project: the rows in a table, the
@@ -2,7 +2,7 @@
 title: SQL queries
 description: The advanced-mode SQL query surface — DISTINCT, GROUP BY/HAVING, set operations, subqueries, CTEs, and expressions.
 sidebar:
-  order: 9
+  order: 10
 ---

 [Querying & inspecting](/reference/querying-and-inspecting/) covers viewing rows
@@ -53,7 +53,7 @@ const repository = {
 	},
 	keyword: {
 		match:
-			'(?i)\\b(create|table|tables|drop|add|column|with|pk|to|from|into|values|insert|update|set|where|delete|show|data|rename|change|alter|relationship|relationships|index|indexes|on|as|references|constraint|not|unique|default|check|primary|key|cascade|restrict|and|or|in|between|like|is|explain|replay|undo|redo|save|new|load|rebuild|export|import|copy|mode|help|hint|quit|messages|all|last|types|simple|advanced)\\b',
+			'(?i)\\b(create|table|tables|drop|add|column|with|pk|to|from|into|values|insert|update|seed|set|where|delete|show|data|rename|change|alter|relationship|relationships|index|indexes|on|as|references|constraint|not|unique|default|check|primary|key|cascade|restrict|and|or|in|between|like|is|explain|replay|undo|redo|save|new|load|rebuild|export|import|copy|mode|help|hint|quit|messages|all|last|types|simple|advanced)\\b',
 		name: 'keyword.control.rdbms',
 	},
 };