# ADR-0029: Column constraints (NOT NULL / UNIQUE / CHECK / DEFAULT) ## Status Accepted ## Context The requirements checklist (`C3`) commits to the full schema- constraint surface. Most of it has landed across earlier ADRs: primary keys, single and compound, at create-table time; foreign keys with `ON DELETE` / `ON UPDATE` actions (ADR-0013); indexes (ADR-0025). What remains are the four *column-level* constraints every relational schema relies on: `NOT NULL`, `UNIQUE`, `CHECK`, and `DEFAULT`. They are core pedagogy. A learner cannot model a real schema without saying "this column is required", "these values must be distinct", "this value must be in range", or "use this when none is given". Until now the playground can only teach the primary key — which bundles NOT NULL + UNIQUE invisibly — and the relational shape of foreign keys. The four standalone constraints are the missing half of "what a column is". Three things make this the right moment, and shape the decision: 1. **The rebuild-table primitive exists** (ADR-0013). SQLite's `ALTER TABLE` cannot add `NOT NULL` / `UNIQUE` / `CHECK` to an existing column; the rebuild dance is the accepted recipe, and it is already load-bearing for change-column- type and every relationship operation. 2. **The per-cell dry-run model exists** (ADR-0017). Applying a constraint to a *populated* column is the same shape of problem as a type change: scan the existing data, classify it, and refuse with a learner-friendly table of offending rows *before* any SQL writes. ADR-0029 reuses that ethos. 3. **The WHERE-expression grammar exists** (ADR-0026). A `CHECK` constraint is a boolean expression over the column. Rather than invent a second expression language, `CHECK` *reuses* the ADR-0026 `Expr` grammar through the `Subgrammar` node — the same expression language the learner already meets in `where` filters. `PRIMARY KEY` is **not** revisited here. It keeps its dedicated `with pk …` clause: it is table-level (compound keys span columns), it is declared once, and folding it into the per-column constraint suffix would muddy both surfaces. A PK column is implicitly `NOT NULL` and `UNIQUE`; this ADR makes that explicit in §9 by rejecting redundant declarations. ## Decision ### 1. The constraint vocabulary Four column-level constraints, each appearing **at most once** per column in v1: | Constraint | Payload | Meaning | |--------------|------------------------|-------------------------------------------| | `not null` | — | the column rejects `NULL` | | `unique` | — | all non-`NULL` values must be distinct | | `default` | a value literal | value used when an `insert` omits the column | | `check` | a parenthesised `Expr` | every row must satisfy the boolean expression | The payload literal for `default` is an ordinary value literal per the ADR-0014 value grammar. The `check` expression is an ADR-0026 `Expr`, reached through `Subgrammar` — see §2. `unique` follows SQLite's standard semantics: multiple `NULL`s do **not** collide (SQL's "NULLs are distinct" rule). This is stated explicitly because it is a common learner surprise and the friendly-error and dry-run text both depend on it. ### 2. Grammar — declaring constraints #### 2.1 The constraint suffix (`create table`, `add column`) A column spec gains an optional, repeatable constraint suffix **after** the `(type)` group: ``` create table Books with pk isbn(text) check (isbn like '978%') add column to Books: title (text) not null add column to Books: stock (int) default 0 check (stock >= 0) ``` **Where each constraint is useful.** The simple-mode `create table … with pk …` declares *only* primary-key columns — every column in the `with pk` list is part of the primary key (non-PK columns are added afterward with `add column`; see `docs/simple-mode-limitations.md`). Since a PK column is already `NOT NULL` and `UNIQUE`, §9 rejects those two as redundant there — so on a `create table` column the suffix is useful for `default` / `check`. `not null` / `unique` come into their own on `add column` (non-PK columns) and on `add constraint`. The suffix grammar is nonetheless shared verbatim across all three surfaces; §9 does the rejecting. - Standard SQL writes constraints after the data type (`email TEXT NOT NULL UNIQUE`). The playground brackets the type as `email(text)` — a pre-existing convention this ADR does not relitigate — so the faithful adaptation is the same keyword order *after* the bracketed type. - Constraint keywords are case-insensitive (ADR-0009) and may appear in any order. Each kind at most once; a repeat (`not null not null`) is a friendly error naming the duplicated constraint. - `default ` takes one value literal. - `check ( )` parenthesises its expression — matching SQL's `CHECK (…)`, and giving the parser an unambiguous end for the expression before the next constraint or the column-separating comma. The expression is the ADR-0026 `Expr` grammar via `Node::Subgrammar(&expr::OR_EXPR)`. - The column-separating comma of the `with pk` list follows the last constraint of a column, exactly as today. The constraint suffix is one shared grammar fragment, used by both `create table`'s column list and `add column`. #### 2.2 Constraints on an existing column (`add constraint …` / `drop constraint …`) Modifying the constraints of an *already-created* column is a first-class v1 surface. It reuses the established `add` / `drop` verb families and the dotted `.` reference (ADR-0013's `add 1:n relationship from

.

`): ``` add constraint not null to Users.email add constraint unique to Users.email add constraint default 18 to Users.age add constraint check (age >= 0) to Users.age drop constraint not null from Users.email drop constraint unique from Users.email drop constraint default from Users.age drop constraint check from Users.age ``` - `add constraint to .` and `drop constraint from .` join the existing `add` / `drop` command `Choice`s as new forms. The form word `constraint` discriminates them — exactly as `column` / `index` / `relationship` discriminate the other `add` / `drop` forms. Every `add` / `drop` stays uniformly ` …`, which keeps the grammar hierarchy clean. - After `add constraint`, the `` is exactly the §2.1 suffix vocabulary: `not null`, `unique`, `default `, `check ( )`. The constraint grammar fragment is shared a third time. - After `drop constraint`, only the *kind* is named — `not null`, `unique`, `default`, `check` — since at most one of each exists per column. ### 3. AST `ColumnSpec` (`src/dsl/command.rs`) gains the four constraint slots — each kind is at-most-one, so they are fields, not a `Vec`: ```rust pub struct ColumnSpec { pub name: String, pub ty: Type, pub not_null: bool, pub unique: bool, pub default: Option, pub check: Option, } ``` `Command::AddColumn` carries the same four slots alongside its `table` / `column` / `ty`. Two new commands cover §2.2: ```rust Command::AddConstraint { table: String, column: String, constraint: Constraint } Command::DropConstraint { table: String, column: String, kind: ConstraintKind } pub enum Constraint { NotNull, Unique, Default(Value), Check(Expr) } pub enum ConstraintKind { NotNull, Unique, Default, Check } ``` `Constraint` carries the payload (`Default` / `Check`); `ConstraintKind` is payload-free, for `drop`. ### 4. SQLite mapping and enforcement Every constraint maps directly to native SQLite column-DDL — nothing is enforced application-side that SQLite can enforce itself, consistent with the STRICT-tables posture (ADR-0002): ```sql "email" TEXT NOT NULL UNIQUE "age" INTEGER DEFAULT 18 CHECK ("age" >= 0) ``` - `not null` → `NOT NULL`; `unique` → `UNIQUE`; `default v` → `DEFAULT `. - `check` → `CHECK ()`. The `Expr` is compiled to SQL the same way WHERE expressions are (`compile_expr`), except DDL admits no `?` parameters, so literals are inlined — the inlining helper from ADR-0028 §3 (`inline_params_for_display`) already does exactly this. Identifiers are double-quoted; the emitted SQL names no engine product, so the ADR-0002 rule holds. Constraint DDL flows through the existing column-DDL assembly (`schema_to_ddl`), so the rebuild-table primitive and every ALTER path pick it up automatically once the constraint data reaches `ReadColumn`. ### 5. Applying a constraint to a populated column — the dry-run `add constraint not null` / `unique` / `check` against a column that already holds data can be *violated by that data*. Per the ADR-0017 ethos, the violation is caught by a **dry-run pass that runs before any SQL writes**, and reported as a learner-friendly table — never as a raw engine error. | Command | Pre-flight check | |--------------------------------|-----------------------------------------------------------| | `add constraint not null` | `SELECT … WHERE "col" IS NULL` — any row is a violation | | `add constraint unique` | non-`NULL` values grouped; any group of size > 1 collides | | `add constraint check (expr)` | `SELECT … WHERE NOT (expr)` — any row is a violation | | `add constraint default` | none — `DEFAULT` never touches existing rows | | `drop constraint ` | none — removing a constraint cannot violate data | On a violation the command is **refused** and the offending rows are rendered through the pretty-table renderer (ADR-0016 / ADR-0017 §7) — rows identified by primary-key value(s), capped at 100 with a trailing `… and N more` row. For `add constraint unique`, colliding rows are grouped by their shared value, mirroring ADR-0017 §7's uniqueness-collision table. When the dry-run passes, the constraint is applied through the rebuild-table primitive (ADR-0013) — SQLite cannot `ALTER` a constraint onto an existing column. An empty table passes the dry-run trivially. There is no `--force` override: a constraint that the data violates is not something the learner should be able to wave through. They fix the data (with `update` / `delete`) and retry. This is stricter than ADR-0017's `--force-conversion` because a constraint is a *rule*, not a lossy transform — a forced-through violation would leave the schema lying about the data. ### 6. DEFAULT semantics - A `default` lets an `insert` **omit** the column in the explicit-column form (`insert into T (a, b) values …`); the omitted column takes its default. The natural-order short form (`insert into T values (…)`) still requires every non-auto-generated column positionally — there is no positional "skip", exactly as in SQL. - `add constraint default` affects only *future* inserts; existing rows are untouched (SQL-standard behaviour, and why §5 needs no dry-run for it). - `default` is rejected on a `serial` or `shortid` column: those carry their own auto-fill contract (ADR-0018), and a second source of "the value when none is given" would be ambiguous. The error says so. - `add column … not null` on a **non-empty** table requires a `default` in the same spec — the new column is `NULL` in every existing row, which `NOT NULL` forbids. Without a default the command is refused with a message that explains the rule and points at `default`. On an empty table it is fine. (This is SQLite's own `ADD COLUMN` rule, surfaced as a friendly error rather than a raw one.) - `add column … unique default ` on a table with more than one row is refused: every existing row would receive the same default value, colliding immediately. The error explains the collision. ### 7. Storage and round-trip `playground.db` is a derived artifact (ADR-0004 / ADR-0015); constraints must round-trip through `project.yaml` or they vanish on `rebuild` / `export` / `import`. The `CHECK` expression is stored — everywhere — as the **compiled SQL** form: the parsed `Expr` run through `compile_expr` with literals inlined (§4). Identifiers are double-quoted, exactly as ADR-0028's `explain` display SQL already renders them, so the form is consistent with what the learner already meets. (The original plan stored canonical *DSL text*; that needs both an `Expr`→text renderer and a text→`Expr` re-parser for the round-trip, while the rebuild path needs the SQL form regardless — storing SQL once removes both.) - **`project.yaml`** — the `ColumnSchema` record gains `not_null: bool`, `default: Option`, and `check: Option`. (`unique: bool` already exists, from ADR-0018's `serial` / `shortid` contract.) `check` holds the compiled SQL. - **Metadata table** — `NOT NULL`, `UNIQUE`, and `DEFAULT` are all recoverable from SQLite itself (`pragma_table_info`'s `notnull` and `dflt_value`; `pragma_index_list` origin `u`), so they need no metadata row. `CHECK` is *not* exposed by any pragma. So `__rdbms_playground_columns` carries a nullable `check_expr TEXT` column holding the compiled SQL, which `schema_to_ddl` and `describe` echo verbatim. It is part of the internal table's `CREATE TABLE` definition — there are no existing databases to migrate. ### 8. Structure rendering `describe_table` already reports `primary_key` and `notnull`; `ColumnDescription` gains `unique: bool`, `default: Option`, and `check: Option`. `constraints_display` (`output_render.rs`) extends to list every constraint a column carries: ``` ┌───────┬────────┬─────────────────────────────────┐ │ Name │ Type │ Constraints │ ├───────┼────────┼─────────────────────────────────┤ │ id │ serial │ PK │ │ email │ text │ NOT NULL, UNIQUE │ │ age │ int │ DEFAULT 18, CHECK ("age" >= 0) │ └───────┴────────┴─────────────────────────────────┘ ``` The `CHECK` renders in its compiled-SQL form (§7) — the same double-quoted-identifier style as ADR-0028's `explain` SQL. ### 9. PK columns — redundant and impossible constraints A primary-key column is always implicitly `NOT NULL`. A *single-column* primary key is additionally `UNIQUE`; the columns of a *compound* primary key are **not** individually unique — only the combination is. This makes the rules precise: - Declaring `not null` on any PK column — in the suffix or via `add constraint … to` — is a **friendly error**: the PK already implies it. The message says so, rather than silently ignoring it, so the learner learns *why*. - Declaring `unique` on a single-column-PK column is likewise a friendly error. Declaring `unique` on a column of a *compound* PK is **allowed and meaningful** — it adds an individual-uniqueness rule the compound PK does not provide. - `drop constraint not null` from any PK column, and `drop constraint unique` from a single-column-PK column, are friendly errors: the primary key still enforces them, so there is nothing to drop. - `default` and `check` on a PK column are always allowed — the PK implies neither. Friendly errors here are deliberate teaching moments, not silent no-ops: the playground favours clarity over permissiveness even where SQLite itself would tolerate the redundant declaration. ### 10. Friendly errors The friendly-error layer (ADR-0019) already enriches `NOT NULL` and `UNIQUE` violations from `insert` / `update`. Two additions: - **`CHECK` violation** — the engine reports `CHECK constraint failed: `. The runtime enrichment resolves the column, the offending value, and the column's `CHECK` expression; the catalog entry renders "the value `` breaks the rule ``". The rule is shown in its **compiled-SQL form** (§7) — the same form stored and displayed everywhere, since deviation §7 removed the DSL-text representation. - **The §5 / §6 refusals** — populated-column dry-run failures, the `not null` / `unique` / `serial+default` rejections — are all friendly catalog strings, with the dry-run tables rendered per ADR-0016. ### 11. Out of scope - **OOS-1.** Table-level `CHECK` constraints spanning multiple columns. v1's `check` is per-column; its expression typically references the column it sits on. A table-level `check (…)` clause is a clean follow-up. - **OOS-2.** More than one `CHECK` per column. v1 caps at one; SQL allows several. - **OOS-3.** Non-literal `DEFAULT`s — `CURRENT_TIMESTAMP`, expressions. v1 takes a value literal only. - **OOS-4.** Named constraints (`CONSTRAINT CHECK …`). The playground's constraints are anonymous; only indexes and relationships carry user names. - **OOS-5.** Inline `REFERENCES` foreign keys in the column suffix. Relationships keep their dedicated `add 1:n relationship` command (ADR-0013). - **OOS-6.** `--force`-style overrides for §5 — deliberately omitted (see §5). ## Consequences - `ColumnSpec` and `Command::AddColumn` grow four constraint slots; two new commands (`AddConstraint` / `DropConstraint`) and two new AST enums (`Constraint` / `ConstraintKind`) land. Every exhaustive `match Command` gains arms. - A shared constraint-suffix grammar fragment is used in three places: the `create table` column list, `add column`, and the `add constraint … to` command. `CHECK` reuses the ADR-0026 `Expr` grammar through `Subgrammar`. - The rebuild-table primitive (ADR-0013) becomes the path for `add constraint …` / `drop constraint …` on populated columns, joining its existing change-column-type and relationship users. - A pre-flight dry-run (§5), in the ADR-0017 style, guards every populated-column constraint addition; its violation tables go through the ADR-0016 pretty-table renderer. - The internal metadata table `__rdbms_playground_columns` carries a new `check_expr` column — its first change since ADR-0012 — holding the compiled-SQL form of the `CHECK`. - `project.yaml`'s `ColumnSchema` grows three fields; the format stays backward-compatible (the new keys default to "absent" — `not_null: false`, no `default`, no `check`). - The friendly-error layer gains a `CHECK`-violation entry. ## Implementation notes A sensible build order, each step test-guarded: 1. **AST + the constraint grammar fragment.** `ColumnSpec` / `Command` changes; the shared constraint-suffix grammar; `create table` and `add column` parse constraints into the AST. No execution yet — but every exhaustive `match` compiles. (Combine with step 2 if the `match` breakage makes a green commit impossible, exactly as ADR-0028 did.) 2. **Execution at create / add time.** `schema_to_ddl` emits the constraint DDL; `do_create_table` / `do_add_column` honour it, including the §6 `add column` rules. The `Expr` → SQL compile for `CHECK`. 3. **Storage round-trip.** `ColumnSchema` fields, the `check_expr` metadata column, and the `project.yaml` read/write paths. 4. **`add constraint …` / `drop constraint …`.** The two commands, the rebuild-table path, and the §5 dry-run with its pretty-table refusals. 5. **Rendering + friendly errors.** `constraints_display`, the `CHECK`-violation catalog entry, the structure view. 6. **Typing-surface matrix cells** for the new grammar. ## See also - ADR-0002 — STRICT tables; "no engine name in user-facing strings" (the compiled `CHECK` SQL names no product). - ADR-0009 — DSL syntax conventions; keyword-led clauses. - ADR-0012 — internal column metadata; `__rdbms_playground_columns`. - ADR-0013 — the rebuild-table primitive, reused by §5; the dotted `
.` reference reused by §2.2. - ADR-0014 — the value-literal grammar, reused by `default`. - ADR-0016 — pretty-table rendering for the §5 dry-run tables. - ADR-0017 — the per-cell dry-run ethos §5 follows. - ADR-0018 — `serial` / `shortid` auto-fill, why `default` is rejected on those columns; the existing `unique` field. - ADR-0026 — the `Expr` grammar `CHECK` reuses via `Subgrammar`. - ADR-0028 — the literal-inlining helper reused to compile `CHECK` expressions into parameter-free DDL.