Files
rdbms-playground/docs/adr/0029-column-constraints.md
T
claude@clouddev1 a049ff9aa0 docs: handoff 23 — ADR-0029 complete; tick C3
ADR-0029 (column constraints — NOT NULL / UNIQUE / CHECK /
DEFAULT) is fully implemented across the handoff-22 and
handoff-23 sessions. Ticks requirement C3, and corrects
ADR §10's CHECK-error wording to the compiled-SQL form per
the §7 storage deviation.
2026-05-19 18:56:50 +00:00

20 KiB

ADR-0029: Column constraints (NOT NULL / UNIQUE / CHECK / DEFAULT)

Status

Accepted

Context

The requirements checklist (C3) commits to the full schema- constraint surface. Most of it has landed across earlier ADRs: primary keys, single and compound, at create-table time; foreign keys with ON DELETE / ON UPDATE actions (ADR-0013); indexes (ADR-0025). What remains are the four column-level constraints every relational schema relies on: NOT NULL, UNIQUE, CHECK, and DEFAULT.

They are core pedagogy. A learner cannot model a real schema without saying "this column is required", "these values must be distinct", "this value must be in range", or "use this when none is given". Until now the playground can only teach the primary key — which bundles NOT NULL + UNIQUE invisibly — and the relational shape of foreign keys. The four standalone constraints are the missing half of "what a column is".

Three things make this the right moment, and shape the decision:

  1. The rebuild-table primitive exists (ADR-0013). SQLite's ALTER TABLE cannot add NOT NULL / UNIQUE / CHECK to an existing column; the rebuild dance is the accepted recipe, and it is already load-bearing for change-column- type and every relationship operation.
  2. The per-cell dry-run model exists (ADR-0017). Applying a constraint to a populated column is the same shape of problem as a type change: scan the existing data, classify it, and refuse with a learner-friendly table of offending rows before any SQL writes. ADR-0029 reuses that ethos.
  3. The WHERE-expression grammar exists (ADR-0026). A CHECK constraint is a boolean expression over the column. Rather than invent a second expression language, CHECK reuses the ADR-0026 Expr grammar through the Subgrammar node — the same expression language the learner already meets in where filters.

PRIMARY KEY is not revisited here. It keeps its dedicated with pk … clause: it is table-level (compound keys span columns), it is declared once, and folding it into the per-column constraint suffix would muddy both surfaces. A PK column is implicitly NOT NULL and UNIQUE; this ADR makes that explicit in §9 by rejecting redundant declarations.

Decision

1. The constraint vocabulary

Four column-level constraints, each appearing at most once per column in v1:

Constraint Payload Meaning
not null the column rejects NULL
unique all non-NULL values must be distinct
default a value literal value used when an insert omits the column
check a parenthesised Expr every row must satisfy the boolean expression

The payload literal for default is an ordinary value literal per the ADR-0014 value grammar. The check expression is an ADR-0026 Expr, reached through Subgrammar — see §2.

unique follows SQLite's standard semantics: multiple NULLs do not collide (SQL's "NULLs are distinct" rule). This is stated explicitly because it is a common learner surprise and the friendly-error and dry-run text both depend on it.

2. Grammar — declaring constraints

2.1 The constraint suffix (create table, add column)

A column spec gains an optional, repeatable constraint suffix after the (type) group:

create table Books with pk isbn(text) check (isbn like '978%')

add column to Books: title (text) not null
add column to Books: stock (int) default 0 check (stock >= 0)

Where each constraint is useful. The simple-mode create table … with pk … declares only primary-key columns — every column in the with pk list is part of the primary key (non-PK columns are added afterward with add column; see docs/simple-mode-limitations.md). Since a PK column is already NOT NULL and UNIQUE, §9 rejects those two as redundant there — so on a create table column the suffix is useful for default / check. not null / unique come into their own on add column (non-PK columns) and on add constraint. The suffix grammar is nonetheless shared verbatim across all three surfaces; §9 does the rejecting.

  • Standard SQL writes constraints after the data type (email TEXT NOT NULL UNIQUE). The playground brackets the type as email(text) — a pre-existing convention this ADR does not relitigate — so the faithful adaptation is the same keyword order after the bracketed type.
  • Constraint keywords are case-insensitive (ADR-0009) and may appear in any order. Each kind at most once; a repeat (not null not null) is a friendly error naming the duplicated constraint.
  • default <literal> takes one value literal.
  • check ( <expr> ) parenthesises its expression — matching SQL's CHECK (…), and giving the parser an unambiguous end for the expression before the next constraint or the column-separating comma. The expression is the ADR-0026 Expr grammar via Node::Subgrammar(&expr::OR_EXPR).
  • The column-separating comma of the with pk list follows the last constraint of a column, exactly as today.

The constraint suffix is one shared grammar fragment, used by both create table's column list and add column.

2.2 Constraints on an existing column (add constraint … / drop constraint …)

Modifying the constraints of an already-created column is a first-class v1 surface. It reuses the established add / drop verb families and the dotted <Table>.<column> reference (ADR-0013's add 1:n relationship from <P>.<col>):

add constraint not null to Users.email
add constraint unique to Users.email
add constraint default 18 to Users.age
add constraint check (age >= 0) to Users.age

drop constraint not null from Users.email
drop constraint unique from Users.email
drop constraint default from Users.age
drop constraint check from Users.age
  • add constraint <constraint> to <T>.<col> and drop constraint <constraint-kind> from <T>.<col> join the existing add / drop command Choices as new forms. The form word constraint discriminates them — exactly as column / index / relationship discriminate the other add / drop forms. Every add / drop stays uniformly <verb> <noun> …, which keeps the grammar hierarchy clean.
  • After add constraint, the <constraint> is exactly the §2.1 suffix vocabulary: not null, unique, default <literal>, check ( <expr> ). The constraint grammar fragment is shared a third time.
  • After drop constraint, only the kind is named — not null, unique, default, check — since at most one of each exists per column.

3. AST

ColumnSpec (src/dsl/command.rs) gains the four constraint slots — each kind is at-most-one, so they are fields, not a Vec:

pub struct ColumnSpec {
    pub name: String,
    pub ty: Type,
    pub not_null: bool,
    pub unique: bool,
    pub default: Option<Value>,
    pub check: Option<Expr>,
}

Command::AddColumn carries the same four slots alongside its table / column / ty.

Two new commands cover §2.2:

Command::AddConstraint   { table: String, column: String, constraint: Constraint }
Command::DropConstraint  { table: String, column: String, kind: ConstraintKind }

pub enum Constraint     { NotNull, Unique, Default(Value), Check(Expr) }
pub enum ConstraintKind { NotNull, Unique, Default, Check }

Constraint carries the payload (Default / Check); ConstraintKind is payload-free, for drop.

4. SQLite mapping and enforcement

Every constraint maps directly to native SQLite column-DDL — nothing is enforced application-side that SQLite can enforce itself, consistent with the STRICT-tables posture (ADR-0002):

"email" TEXT NOT NULL UNIQUE
"age"   INTEGER DEFAULT 18 CHECK ("age" >= 0)
  • not nullNOT NULL; uniqueUNIQUE; default vDEFAULT <literal>.
  • checkCHECK (<sql-expr>). The Expr is compiled to SQL the same way WHERE expressions are (compile_expr), except DDL admits no ? parameters, so literals are inlined — the inlining helper from ADR-0028 §3 (inline_params_for_display) already does exactly this. Identifiers are double-quoted; the emitted SQL names no engine product, so the ADR-0002 rule holds.

Constraint DDL flows through the existing column-DDL assembly (schema_to_ddl), so the rebuild-table primitive and every ALTER path pick it up automatically once the constraint data reaches ReadColumn.

5. Applying a constraint to a populated column — the dry-run

add constraint not null / unique / check against a column that already holds data can be violated by that data. Per the ADR-0017 ethos, the violation is caught by a dry-run pass that runs before any SQL writes, and reported as a learner-friendly table — never as a raw engine error.

Command Pre-flight check
add constraint not null SELECT … WHERE "col" IS NULL — any row is a violation
add constraint unique non-NULL values grouped; any group of size > 1 collides
add constraint check (expr) SELECT … WHERE NOT (expr) — any row is a violation
add constraint default none — DEFAULT never touches existing rows
drop constraint <any> none — removing a constraint cannot violate data

On a violation the command is refused and the offending rows are rendered through the pretty-table renderer (ADR-0016 / ADR-0017 §7) — rows identified by primary-key value(s), capped at 100 with a trailing … and N more row. For add constraint unique, colliding rows are grouped by their shared value, mirroring ADR-0017 §7's uniqueness-collision table.

When the dry-run passes, the constraint is applied through the rebuild-table primitive (ADR-0013) — SQLite cannot ALTER a constraint onto an existing column. An empty table passes the dry-run trivially.

There is no --force override: a constraint that the data violates is not something the learner should be able to wave through. They fix the data (with update / delete) and retry. This is stricter than ADR-0017's --force-conversion because a constraint is a rule, not a lossy transform — a forced-through violation would leave the schema lying about the data.

6. DEFAULT semantics

  • A default lets an insert omit the column in the explicit-column form (insert into T (a, b) values …); the omitted column takes its default. The natural-order short form (insert into T values (…)) still requires every non-auto-generated column positionally — there is no positional "skip", exactly as in SQL.
  • add constraint default affects only future inserts; existing rows are untouched (SQL-standard behaviour, and why §5 needs no dry-run for it).
  • default is rejected on a serial or shortid column: those carry their own auto-fill contract (ADR-0018), and a second source of "the value when none is given" would be ambiguous. The error says so.
  • add column … not null on a non-empty table requires a default in the same spec — the new column is NULL in every existing row, which NOT NULL forbids. Without a default the command is refused with a message that explains the rule and points at default. On an empty table it is fine. (This is SQLite's own ADD COLUMN rule, surfaced as a friendly error rather than a raw one.)
  • add column … unique default <v> on a table with more than one row is refused: every existing row would receive the same default value, colliding immediately. The error explains the collision.

7. Storage and round-trip

playground.db is a derived artifact (ADR-0004 / ADR-0015); constraints must round-trip through project.yaml or they vanish on rebuild / export / import.

The CHECK expression is stored — everywhere — as the compiled SQL form: the parsed Expr run through compile_expr with literals inlined (§4). Identifiers are double-quoted, exactly as ADR-0028's explain display SQL already renders them, so the form is consistent with what the learner already meets. (The original plan stored canonical DSL text; that needs both an Expr→text renderer and a text→Expr re-parser for the round-trip, while the rebuild path needs the SQL form regardless — storing SQL once removes both.)

  • project.yaml — the ColumnSchema record gains not_null: bool, default: Option<String>, and check: Option<String>. (unique: bool already exists, from ADR-0018's serial / shortid contract.) check holds the compiled SQL.
  • Metadata tableNOT NULL, UNIQUE, and DEFAULT are all recoverable from SQLite itself (pragma_table_info's notnull and dflt_value; pragma_index_list origin u), so they need no metadata row. CHECK is not exposed by any pragma. So __rdbms_playground_columns carries a nullable check_expr TEXT column holding the compiled SQL, which schema_to_ddl and describe echo verbatim. It is part of the internal table's CREATE TABLE definition — there are no existing databases to migrate.

8. Structure rendering

describe_table already reports primary_key and notnull; ColumnDescription gains unique: bool, default: Option<String>, and check: Option<String>. constraints_display (output_render.rs) extends to list every constraint a column carries:

┌───────┬────────┬─────────────────────────────────┐
│ Name  │ Type   │ Constraints                     │
├───────┼────────┼─────────────────────────────────┤
│ id    │ serial │ PK                              │
│ email │ text   │ NOT NULL, UNIQUE                │
│ age   │ int    │ DEFAULT 18, CHECK ("age" >= 0)  │
└───────┴────────┴─────────────────────────────────┘

The CHECK renders in its compiled-SQL form (§7) — the same double-quoted-identifier style as ADR-0028's explain SQL.

9. PK columns — redundant and impossible constraints

A primary-key column is always implicitly NOT NULL. A single-column primary key is additionally UNIQUE; the columns of a compound primary key are not individually unique — only the combination is.

This makes the rules precise:

  • Declaring not null on any PK column — in the suffix or via add constraint … to — is a friendly error: the PK already implies it. The message says so, rather than silently ignoring it, so the learner learns why.
  • Declaring unique on a single-column-PK column is likewise a friendly error. Declaring unique on a column of a compound PK is allowed and meaningful — it adds an individual-uniqueness rule the compound PK does not provide.
  • drop constraint not null from any PK column, and drop constraint unique from a single-column-PK column, are friendly errors: the primary key still enforces them, so there is nothing to drop.
  • default and check on a PK column are always allowed — the PK implies neither.

Friendly errors here are deliberate teaching moments, not silent no-ops: the playground favours clarity over permissiveness even where SQLite itself would tolerate the redundant declaration.

10. Friendly errors

The friendly-error layer (ADR-0019) already enriches NOT NULL and UNIQUE violations from insert / update. Two additions:

  • CHECK violation — the engine reports CHECK constraint failed: <col>. The runtime enrichment resolves the column, the offending value, and the column's CHECK expression; the catalog entry renders "the value <v> breaks the rule <check-expr>". The rule is shown in its compiled-SQL form (§7) — the same form stored and displayed everywhere, since deviation §7 removed the DSL-text representation.
  • The §5 / §6 refusals — populated-column dry-run failures, the not null / unique / serial+default rejections — are all friendly catalog strings, with the dry-run tables rendered per ADR-0016.

11. Out of scope

  • OOS-1. Table-level CHECK constraints spanning multiple columns. v1's check is per-column; its expression typically references the column it sits on. A table-level check (…) clause is a clean follow-up.
  • OOS-2. More than one CHECK per column. v1 caps at one; SQL allows several.
  • OOS-3. Non-literal DEFAULTs — CURRENT_TIMESTAMP, expressions. v1 takes a value literal only.
  • OOS-4. Named constraints (CONSTRAINT <name> CHECK …). The playground's constraints are anonymous; only indexes and relationships carry user names.
  • OOS-5. Inline REFERENCES foreign keys in the column suffix. Relationships keep their dedicated add 1:n relationship command (ADR-0013).
  • OOS-6. --force-style overrides for §5 — deliberately omitted (see §5).

Consequences

  • ColumnSpec and Command::AddColumn grow four constraint slots; two new commands (AddConstraint / DropConstraint) and two new AST enums (Constraint / ConstraintKind) land. Every exhaustive match Command gains arms.
  • A shared constraint-suffix grammar fragment is used in three places: the create table column list, add column, and the add constraint … to command. CHECK reuses the ADR-0026 Expr grammar through Subgrammar.
  • The rebuild-table primitive (ADR-0013) becomes the path for add constraint … / drop constraint … on populated columns, joining its existing change-column-type and relationship users.
  • A pre-flight dry-run (§5), in the ADR-0017 style, guards every populated-column constraint addition; its violation tables go through the ADR-0016 pretty-table renderer.
  • The internal metadata table __rdbms_playground_columns carries a new check_expr column — its first change since ADR-0012 — holding the compiled-SQL form of the CHECK.
  • project.yaml's ColumnSchema grows three fields; the format stays backward-compatible (the new keys default to "absent" — not_null: false, no default, no check).
  • The friendly-error layer gains a CHECK-violation entry.

Implementation notes

A sensible build order, each step test-guarded:

  1. AST + the constraint grammar fragment. ColumnSpec / Command changes; the shared constraint-suffix grammar; create table and add column parse constraints into the AST. No execution yet — but every exhaustive match compiles. (Combine with step 2 if the match breakage makes a green commit impossible, exactly as ADR-0028 did.)
  2. Execution at create / add time. schema_to_ddl emits the constraint DDL; do_create_table / do_add_column honour it, including the §6 add column rules. The Expr → SQL compile for CHECK.
  3. Storage round-trip. ColumnSchema fields, the check_expr metadata column, and the project.yaml read/write paths.
  4. add constraint … / drop constraint …. The two commands, the rebuild-table path, and the §5 dry-run with its pretty-table refusals.
  5. Rendering + friendly errors. constraints_display, the CHECK-violation catalog entry, the structure view.
  6. Typing-surface matrix cells for the new grammar.

See also

  • ADR-0002 — STRICT tables; "no engine name in user-facing strings" (the compiled CHECK SQL names no product).
  • ADR-0009 — DSL syntax conventions; keyword-led clauses.
  • ADR-0012 — internal column metadata; __rdbms_playground_columns.
  • ADR-0013 — the rebuild-table primitive, reused by §5; the dotted <Table>.<column> reference reused by §2.2.
  • ADR-0014 — the value-literal grammar, reused by default.
  • ADR-0016 — pretty-table rendering for the §5 dry-run tables.
  • ADR-0017 — the per-cell dry-run ethos §5 follows.
  • ADR-0018 — serial / shortid auto-fill, why default is rejected on those columns; the existing unique field.
  • ADR-0026 — the Expr grammar CHECK reuses via Subgrammar.
  • ADR-0028 — the literal-inlining helper reused to compile CHECK expressions into parameter-free DDL.