docs(website): document the seed command, with cast (ADR-0048)

New Reference page "Generating sample data" (captured output + a
two-table seed cast showing generation, FK sampling context, and a
`set` override); cross-links from inserting-and-editing-data and
columns; seed added to the rdbms highlight grammar;
querying/sql-queries renumbered. Cast stance in STYLE.md revised to
"justify the absence". Refs #33, #34.
claude@clouddev1
2026-06-12 14:41:37 +00:00
parent 4691d7950a
commit 77c55fa669
9 changed files with 933 additions and 19 deletions
@@ -37,7 +37,9 @@ The new column goes at the end of the table and is empty for existing rows.
After the change, the updated structure is shown automatically. If you add a
`serial` or `shortid` column — or change a column *to* one of those types —
the empty cells are filled with freshly generated values in the same step (see
-[Types](/reference/types/)).
+[Types](/reference/types/)). To fill the new column on existing rows with
+generated data, use
+[`seed books.page_count`](/reference/generating-sample-data/#filling-one-column).
## Rename a column
@@ -0,0 +1,309 @@
---
title: Generating sample data
description: Fill tables with realistic, name-aware fake rows using seed — with foreign keys, reproducible runs, and per-column overrides.
sidebar:
order: 8
---
import Demo from '../../../components/Demo.astro';
Once a schema is in place you usually want rows to query against — enough of
them to make `where`, `group by`, `order by`, and `limit` interesting. Typing
those by hand is tedious. `seed` fills a table with **plausible, generated
data** in one command, so you can get to querying straight away. It works in
both [simple and advanced mode](/getting-started/modes/), with the same
syntax.
<Demo src="/casts/seed.cast" title="One line fills a table with realistic, ready-to-query rows — seed reads each column's name to decide what to make." />
The examples use the [example library](/getting-started/example-library/).
## Filling a table
`seed <table> <count>` generates rows and inserts them. If you leave out the
count, seed makes **20** rows:
```rdbms
seed members 6
```
```
6 row(s) seeded into members
┌───────────┬─────────────────┬────────────┐
│ member_id │ name │ joined │
├───────────┼─────────────────┼────────────┤
│ 1 │ Bret Leffler │ 2023-10-16 │
│ 2 │ Santa Nicolas │ 2024-03-14 │
│ 3 │ Vivienne Barton │ 2024-03-04 │
│ 4 │ Fatima Rippin │ 2022-11-14 │
│ 5 │ Lola Cole │ 2025-05-29 │
│ 6 │ Reina Waters │ 2023-11-25 │
└───────────┴─────────────────┴────────────┘
```
The data is **random**, so your rows will differ — run it again for a fresh
set, or see [Reproducible runs](#reproducible-runs) to pin it. `member_id`
was filled automatically (it is a `serial` column), exactly as it is for an
[`insert`](/reference/inserting-and-editing-data/); seed leaves `serial` and
`shortid` columns to the database.
Notice the values are not random noise: `name` produced believable people and
`joined` produced recent dates. That is because seed reads each **column's
name** to decide what to generate.
## How columns are filled
A column's **name** chooses a generator, but only when the column's
[type](/reference/types/) fits — a column called `email` typed `int` will not
get an address. Matching is case-insensitive and looks at the name's parts
(`first_name`, `signup_date`, `is_active`). A representative set:
| Column name looks like… | You get | For types |
|---|---|---|
| `name`, `first_name`, `last_name`, `full_name` | a person's name | `text` |
| `email` | an email address | `text` |
| `username`, `login`, `handle` | a username | `text` |
| `phone`, `mobile`, `tel` | a phone number | `text` |
| `city`, `country`, `state`, `street`, `zip` | address parts | `text` |
| `company`, `employer` · `job`, `position` | a company / job title | `text` |
| `description`, `bio`, `notes`, `comment` | a sentence or paragraph | `text` |
| `url`, `website` · `color` | a URL / hex colour | `text` |
| `price`, `amount`, `cost`, `salary` | a money amount | `int`, `real`, `decimal` |
| `age` · `quantity`, `qty`, `stock` | a plausible age / small number | `int` |
| `date`, `dob`, `created_at`, `updated_at` | a recent (or birth-window) date | `date`, `datetime` |
| `is_active`, `has_*`, `enabled` | `true` / `false` | `bool` |
When a column's name **isn't** recognised, seed falls back to its **type**:
placeholder words for `text`, a number for `int` and `real`, a recent value
for `date`. So a column like `published` (just an `int` to the database) gets
an arbitrary number, not a sensible year. That is exactly what the
[`set` clause](#choosing-values-yourself) is for — pin those columns
explicitly.
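The name-then-type lookup described above can be sketched in a few lines. This is only an illustration, not seed's actual code: `NAME_RULES`, `TYPE_FALLBACK`, and `pick_generator` are hypothetical names, and the rule table copies just a few rows of the table above.

```python
import re

# Hypothetical rule table: name part -> (generator label, types it fits).
# Only a handful of rows from the table above are reproduced here.
NAME_RULES = {
    "email": ("email", {"text"}),
    "name": ("name", {"text"}),
    "price": ("price", {"int", "real", "decimal"}),
    "age": ("age", {"int"}),
    "date": ("date", {"date", "datetime"}),
}

# Type-based fallbacks, following the paragraph above.
TYPE_FALLBACK = {
    "text": "placeholder words",
    "int": "number",
    "real": "number",
    "date": "recent date",
}


def pick_generator(column_name: str, column_type: str) -> str:
    """Return a generator label: a name match that fits the type, else a type fallback."""
    # Case-insensitive match on the name's parts (first_name -> first, name).
    parts = re.split(r"[_\W]+", column_name.lower())
    for part in parts:
        rule = NAME_RULES.get(part)
        if rule and column_type in rule[1]:
            return rule[0]
    return TYPE_FALLBACK.get(column_type, "unsupported")
```

In this sketch `pick_generator("email", "int")` falls through to the `int` fallback because the type does not fit, which is the behaviour the section describes for a mistyped `email` column.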
Two name families are handled specially:
- **Identifier-like names** that are *not* a foreign key or the primary key —
`code`, `sku`, `ref`, `barcode`, a `*_id` that isn't a relationship — get
**unique** values, so they read like real identifiers and never collide.
- **Choice-like names** — `status`, `role`, `type`, `category`, `priority`,
and similar — have no sensible generic value, so seed fills them with
placeholder text and then [tells you](#columns-seed-cant-guess) to choose
the real values yourself.
Any column with a `unique` [constraint](/reference/constraints/) always gets
collision-free values, whatever its name — that is a correctness guarantee,
not a guess.
## Foreign keys
Seed respects [relationships](/reference/relationships/). A foreign-key column
is filled by **sampling from the rows that already exist** in the parent
table, so every generated reference is valid. Seed the parent first:
```rdbms
seed authors 5
seed books 6 set published between 1950 and 2020
```
```
6 row(s) seeded into books
┌─────────┬────────────────────┬───────────┬───────────┬─────────────────────────────┐
│ book_id │ title │ author_id │ published │ isbn │
├─────────┼────────────────────┼───────────┼───────────┼─────────────────────────────┤
│ 1 │ Austen Wuckert │ 4 │ 1960 │ nihil molestiae │
│ 2 │ Dayne Cremin │ 4 │ 1978 │ repellat rerum │
│ 3 │ Jayda Hagenes │ 1 │ 1987 │ corrupti perspiciatis earum │
│ 4 │ Bethany VonRueden │ 5 │ 1961 │ laborum deserunt facere │
│ 5 │ Maximillian Hammes │ 2 │ 2018 │ fuga in │
│ 6 │ Skylar Cassin │ 2 │ 2020 │ alias qui │
└─────────┴────────────────────┴───────────┴───────────┴─────────────────────────────┘
```
Every `author_id` points at a real author (1–5). Duplicates are expected and
correct — one author has many books. (`title` and `isbn` here are placeholder
text, since neither name maps to a real-world generator; pin them with
[`set`](#choosing-values-yourself) if you want something specific.)
If a parent table is **empty**, seed refuses rather than inventing a reference
that would break the relationship:
```
cannot seed `books`: parent table `authors` (referenced by `author_id`) has
no rows. Seed or insert into `authors` first.
```
This mirrors the order you would insert data by hand, and quietly teaches
foreign-key dependency order. A junction table linking two parents (a
many-to-many bridge) is filled with **distinct combinations** of the parents'
keys; if you ask for more rows than there are combinations, seed makes as many
as it can and tells you.
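The two foreign-key behaviours in this section (sampling existing parent keys, and distinct combinations for a junction table) can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the error text is modelled on the message shown above, and seed's real implementation may differ.

```python
import itertools
import random


def sample_foreign_keys(parent_keys, count, rng,
                        table="books", parent="authors", column="author_id"):
    """Pick `count` parent keys with replacement; refuse if the parent is empty."""
    if not parent_keys:
        raise ValueError(
            f"cannot seed `{table}`: parent table `{parent}` "
            f"(referenced by `{column}`) has no rows."
        )
    # Duplicates are fine: one parent row can have many children.
    return [rng.choice(parent_keys) for _ in range(count)]


def junction_rows(left_keys, right_keys, count, rng):
    """Pick up to `count` distinct (left, right) pairs for a bridge table."""
    combos = list(itertools.product(left_keys, right_keys))
    # Never ask for more pairs than exist; the caller can report the shortfall.
    return rng.sample(combos, min(count, len(combos)))
```

Sampling with replacement keeps every reference valid without limiting how many child rows can be made, while the junction helper caps the result at the number of available combinations.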
## Reproducible runs
Add `--seed <n>` to make a run **repeatable**: the same number produces the
same data, so a teacher can hand out one dataset and a demo stays stable.
```rdbms
seed members 6 --seed 42
```
```
6 row(s) seeded into members
┌───────────┬─────────────────┬────────────┐
│ member_id │ name │ joined │
├───────────┼─────────────────┼────────────┤
│ 1 │ Bret Leffler │ 2023-10-16 │
│ 2 │ Santa Nicolas │ 2024-03-14 │
│ 3 │ Vivienne Barton │ 2024-03-04 │
│ 4 │ Fatima Rippin │ 2022-11-14 │
│ 5 │ Lola Cole │ 2025-05-29 │
│ 6 │ Reina Waters │ 2023-11-25 │
└───────────┴─────────────────┴────────────┘
```
Run that again and you get the very same six members. "The same data" is
relative to the table's current contents: because foreign keys and unique
values read the rows already present, reproducibility assumes the same
starting point.
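A minimal sketch of why a seed number makes a run repeatable, assuming a seeded pseudo-random generator underneath (this page does not say how seed implements it). `NAMES` and `generate` are hypothetical and exist only for illustration.

```python
import random

# Illustrative pool only; not seed's real data.
NAMES = ["Bret", "Santa", "Vivienne", "Fatima", "Lola", "Reina"]


def generate(count, parent_keys, seed=None):
    """Return (name, parent_key) pairs; seed=None gives a fresh run each time."""
    rng = random.Random(seed)
    return [(rng.choice(NAMES), rng.choice(parent_keys)) for _ in range(count)]
```

Calling `generate(6, [1, 2, 3], seed=42)` twice returns identical lists. Passing a different `parent_keys` list with the same seed changes what the foreign-key draw can return, which is why reproducibility assumes the same starting point.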
## Choosing values yourself
Seed's guesses are a starting point. The optional `set` clause pins how one or
more columns are filled. It reuses syntax you already know from
[`where`](/reference/querying-and-inspecting/) and `update`, so there is
nothing new to learn — four forms:
| Form | Example | Meaning |
|---|---|---|
| Fixed value | `set status = 'active'` | every row gets the same value |
| Pick from a list | `set role in ('admin', 'editor', 'viewer')` | a random choice from the list |
| Named generator | `set contact as email` | force a specific generator |
| Range | `set price between 10 and 100` | a value in the range (also dates) |
Combine them with commas:
```rdbms
seed tickets 6 set status in ('open', 'pending', 'closed'), priority in ('low', 'high')
```
```
6 row(s) seeded into tickets
┌───────────┬──────────────────────────┬─────────┬──────────┐
│ ticket_id │ subject │ status │ priority │
├───────────┼──────────────────────────┼─────────┼──────────┤
│ 7 │ atque libero │ pending │ high │
│ 8 │ culpa maiores et │ open │ low │
│ 9 │ natus rerum animi │ open │ high │
│ 10 │ sapiente rem │ closed │ low │
│ 11 │ placeat blanditiis quasi │ closed │ high │
│ 12 │ sed exercitationem │ closed │ low │
└───────────┴──────────────────────────┴─────────┴──────────┘
```
Text values and list items are **quoted** (`'admin'`), exactly as elsewhere;
only numbers are bare. Dates in a range are quoted too
(`set joined between '2023-01-01' and '2024-12-31'`). A range on a number
column takes numeric bounds, a range on a date column takes date bounds — a
mismatched bound is a friendly error.
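To make the forms concrete, here is a rough sketch of three of them (fixed value, list, and range) as small value factories. It is an assumption-laden illustration, not seed's code: the helper names are hypothetical, numeric ranges are limited to integers, and dates are taken as ISO `YYYY-MM-DD` strings.

```python
import random
from datetime import date, timedelta


def fixed(value):
    """`set col = value`: every row gets the same value."""
    return lambda rng: value


def pick_from(choices):
    """`set col in (...)`: a random choice from the list."""
    return lambda rng: rng.choice(choices)


def between(low, high):
    """`set col between low and high`: integer bounds or quoted date bounds."""
    if isinstance(low, str) != isinstance(high, str):
        raise ValueError("range bounds must both be numbers or both be dates")
    if isinstance(low, str):
        start, end = date.fromisoformat(low), date.fromisoformat(high)
        span = (end - start).days
        return lambda rng: start + timedelta(days=rng.randint(0, span))
    return lambda rng: rng.randint(low, high)
```

Each factory returns a function of a random generator, so a caller can build one producer per pinned column and then call it once per row.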
The named generators you can use after `as` are:
`age`, `bool`, `city`, `color`, `company`, `country`, `date`, `datetime`,
`email`, `first_name`, `job`, `last_name`, `name`, `paragraph`, `password`,
`phone`, `price`, `product`, `sentence`, `state`, `street`, `url`, `username`,
`zip`.
:::note
If you pin a `unique` column (or a single-column primary key) to a fixed value
or a list that is too short to fill every row, seed stops and explains — it
cannot make 20 distinct rows from three choices. Use a generator or a longer
list.
:::
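The check behind that note amounts to counting distinct values before any row is written. A hedged sketch, with a hypothetical function name and error wording:

```python
def check_unique_pin(column, choices, row_count):
    """Raise if a pinned list cannot supply `row_count` distinct values."""
    distinct = len(set(choices))
    if distinct < row_count:
        raise ValueError(
            f"`{column}` is unique: {row_count} rows need {row_count} "
            f"distinct values, but the list offers only {distinct}"
        )
```

A fixed value behaves like a list of one, so it fails this check for any request of two or more rows.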
## Filling one column
`seed <table>.<column>` fills **one column across the rows that already
exist**, rather than adding new rows — the natural follow-up to
[`add column`](/reference/columns/), and the way to repair a single
column seed guessed wrongly. Combined with `set`, it sets that column
deliberately:
```rdbms
seed tickets.status set status in ('open', 'closed')
```
```
6 row(s) seeded into tickets
┌───────────┬───────────────────────────┬────────┬──────────────────────────────────┐
│ ticket_id │ subject │ status │ priority │
├───────────┼───────────────────────────┼────────┼──────────────────────────────────┤
│ 1 │ ad natus │ closed │ sed iusto │
│ 2 │ voluptas iure aut │ closed │ eveniet consequatur consequuntur │
│ 3 │ rerum nulla reprehenderit │ closed │ est quibusdam et │
│ 4 │ cumque autem voluptas │ open │ facere maxime │
│ 5 │ sit harum │ open │ eveniet commodi reprehenderit │
│ 6 │ rerum deserunt │ closed │ mollitia ut repellendus │
└───────────┴───────────────────────────┴────────┴──────────────────────────────────┘
```
Only `status` changed; the other columns are untouched. Column-fill **refuses**
primary-key and autogenerated (`serial` / `shortid`) columns — you do not
"fill in" an identity column — and on an empty table it is a no-op.
## Columns seed can't guess
Choice-like columns — `status`, `role`, `type`, and the like — get placeholder
text, because there is no sensible generic value for them. After a seed, the
playground points this out:
```rdbms
seed tickets 6
```
```
6 row(s) seeded into tickets
┌───────────┬───────────────────────────┬──────────────────────────────┬──────────────────────────────────┐
│ ticket_id │ subject │ status │ priority │
├───────────┼───────────────────────────┼──────────────────────────────┼──────────────────────────────────┤
│ 1 │ ad natus │ temporibus eos rerum │ sed iusto │
│ 2 │ voluptas iure aut │ repudiandae commodi possimus │ eveniet consequatur consequuntur │
│ 3 │ rerum nulla reprehenderit │ earum culpa │ est quibusdam et │
│ 4 │ cumque autem voluptas │ ea praesentium pariatur │ facere maxime │
│ 5 │ sit harum │ et et laboriosam │ eveniet commodi reprehenderit │
│ 6 │ rerum deserunt │ qui voluptate │ mollitia ut repellendus │
└───────────┴───────────────────────────┴──────────────────────────────┴──────────────────────────────────┘
```
> `status, priority` filled with generic text — they look like fixed value
> sets. Pin them next time with `set status in ('…', '…')`, or fix these rows
> with `seed tickets.status set status in ('…', '…')`.
The two fixes it suggests are the [`set` clause](#choosing-values-yourself) on
the next seed, and [column-fill](#filling-one-column) to repair the rows you
just made. If a `check` constraint restricts a column to a list of values
(`check status in ('open', 'closed')`), seed reads that list and uses it
automatically — no override needed.
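As an illustration of reading a value list out of a `check` constraint, the sketch below pulls the quoted items from a `<column> in (...)` expression with a regular expression. This is an assumption about one possible approach, not a description of seed's parser, and it ignores edge cases such as values containing a closing parenthesis or an escaped quote.

```python
import re


def choices_from_check(check_sql, column):
    """Return the quoted values of `<column> in (...)`, or None if absent."""
    pattern = rf"\b{re.escape(column)}\s+in\s*\(([^)]*)\)"
    match = re.search(pattern, check_sql, flags=re.IGNORECASE)
    if not match:
        return None
    return re.findall(r"'([^']*)'", match.group(1))
```

When the function returns a list, a seeding routine could treat it exactly like an explicit `set ... in (...)` override.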
## Limits
- The most you can seed at once is **10,000** rows; more is a friendly error
(a guard against a typo like `seed members 1000000`). Seed in smaller
batches if you genuinely need more.
- `seed members 0` does nothing.
- If seed cannot produce a value for a `not null` column (the only real case
is a `not null blob`), it refuses the whole command and names the column,
rather than failing partway through.
A whole `seed` is a **single step** in the history: one [`undo`](/using-the-playground/undo-and-history/)
removes every row it added, not one row at a time.
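The count rules listed under Limits can be summarised in a short sketch. The constants come from this page (a default of 20 rows and a ceiling of 10,000); the function name and error text are hypothetical, and negative counts are not covered because the page does not describe them.

```python
DEFAULT_COUNT = 20   # used when the count is left out
MAX_COUNT = 10_000   # largest batch allowed in one command


def resolve_count(count=None):
    """Return the number of rows to generate, or raise if it is too large."""
    if count is None:
        return DEFAULT_COUNT
    if count > MAX_COUNT:
        raise ValueError(
            f"cannot seed {count} rows at once; the limit is {MAX_COUNT}"
        )
    return count  # 0 is valid and simply generates nothing
```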
## Syntax
```rdbms-syntax
seed <Table> [<count>] [set <col> = <value> | in (<value>, ...) | as <generator> | between <low> and <high>][, ...] [--seed <n>]
seed <Table>.<column> [set ...] [--seed <n>]
```
See also [Inserting & editing data](/reference/inserting-and-editing-data/),
[Relationships](/reference/relationships/), [Columns](/reference/columns/), and
[Constraints](/reference/constraints/).
@@ -113,5 +113,8 @@ update <Table> set <col>=<value>[, ...] (where <expr> | --all-rows)
delete from <Table> (where <expr> | --all-rows)
```
To fill a table with many rows at once instead of typing each one, see
[Generating sample data](/reference/generating-sample-data/).
See also [Querying & inspecting](/reference/querying-and-inspecting/) and
[Constraints](/reference/constraints/).
@@ -2,7 +2,7 @@
title: Querying & inspecting
description: View rows and schema, run SQL queries with joins, and explain how a query runs.
sidebar:
-  order: 8
+  order: 9
---
This page covers reading what is in your project: the rows in a table, the
@@ -2,7 +2,7 @@
title: SQL queries
description: The advanced-mode SQL query surface — DISTINCT, GROUP BY/HAVING, set operations, subqueries, CTEs, and expressions.
sidebar:
-  order: 9
+  order: 10
---
[Querying & inspecting](/reference/querying-and-inspecting/) covers viewing rows
@@ -53,7 +53,7 @@ const repository = {
},
keyword: {
match:
-'(?i)\\b(create|table|tables|drop|add|column|with|pk|to|from|into|values|insert|update|set|where|delete|show|data|rename|change|alter|relationship|relationships|index|indexes|on|as|references|constraint|not|unique|default|check|primary|key|cascade|restrict|and|or|in|between|like|is|explain|replay|undo|redo|save|new|load|rebuild|export|import|copy|mode|help|hint|quit|messages|all|last|types|simple|advanced)\\b',
+'(?i)\\b(create|table|tables|drop|add|column|with|pk|to|from|into|values|insert|update|seed|set|where|delete|show|data|rename|change|alter|relationship|relationships|index|indexes|on|as|references|constraint|not|unique|default|check|primary|key|cascade|restrict|and|or|in|between|like|is|explain|replay|undo|redo|save|new|load|rebuild|export|import|copy|mode|help|hint|quit|messages|all|last|types|simple|advanced)\\b',
name: 'keyword.control.rdbms',
},
};