# ADR-0035: Advanced-mode SQL DDL

## Status

Proposed. Design agreed with the user (2026-05-24); implementation
is phased and pending (§13). This is **Phase 4** of the ADR-0030
roadmap (the advanced-mode SQL surface), the peer of ADR-0031
(expression grammar), ADR-0032 (`SELECT`), and ADR-0033 (DML). It
**clarifies ADR-0030 §4** on how DDL is represented and executed.

**Refinements (2026-05-24, pre-implementation `/runda` round,
user-confirmed).** Two open micro-calls were settled before 4a:
(1) `IF [NOT] EXISTS` is **admitted** as a no-op-that-succeeds-with-a-note
rather than refused — it is a near-universal cross-vendor idiom
(PostgreSQL, MySQL/MariaDB, SQLite, Oracle 23ai), not an
engine-specific spelling, so it belongs in the standard surface
(§3/§4/§12/§13); (2) `INTEGER PRIMARY KEY` maps to a **plain `int`**
primary key, *not* auto-increment — `serial` remains the sole
auto-increment type (§3).

## Context

ADR-0030 fixed the *architecture* of advanced mode — SQL authored as
grammar in the unified tree (not a separate batch parser), with the
playground's own type vocabulary and metadata model — and noted that
each large grammar piece gets its own focused ADR. Phases 1–3 shipped:
the SQL expression grammar (ADR-0031), full `SELECT` (ADR-0032), and
DML — `INSERT`/`UPDATE`/`DELETE` (ADR-0033). Phase 4 is **DDL**:
`CREATE` / `DROP` / `ALTER TABLE` and `CREATE` / `DROP INDEX`.

Two things from the earlier phases shape this one:

1. **The advanced surface gets its *own* commands.** ADR-0033
   established that a SQL statement produces a distinct command
   (`SqlInsert` / `SqlUpdate` / `SqlDelete`), separate from the
   simple-mode typed command for the same verb. Those DML commands
   execute as **validated SQL run verbatim** — possible only because
   DML changes no schema and touches no metadata.
2. **DDL cannot run verbatim.** If `CREATE TABLE Orders (id INTEGER)`
   executed as-is, the engine would make the table, but the
   playground would lose what the user meant: that `id` is `serial`,
   that a `REFERENCES` clause is a *named relationship*, that `STRICT`
   applies, that the ten-type vocabulary governs. Recovering that
   needs the parsed statement either way.

ADR-0030 §4 said "DDL → a `Command` … run the typed executor." That
remains right in spirit — DDL is *structurally* executed, not raw —
but it predates the DML build and read as "reuse the simple-mode
`CreateTable` variant." This ADR clarifies it: **DDL gets its own
advanced commands too**, executed structurally (not verbatim). The
"verbatim" execution of the DML commands is an implementation
convenience available only because nothing about DML required
otherwise — not an architectural rule.

Requirements touched: realizes `Q4` for DDL; closes the advanced-mode
side of table/column/index/constraint/relationship operations; lands
the table-rename half of `C1` (advanced mode only).

## Decision

### 1. Own per-statement SQL DDL commands (clarifies ADR-0030 §4)

New `Command` variants, one per statement kind — granularity mirrors
the DML phase:

- `SqlCreateTable`
- `SqlAlterTable`
- `SqlDropTable`
- `SqlCreateIndex`
- `SqlDropIndex`

They are produced by the unified grammar's `ast_builder`s in advanced
mode. Unlike the DML `Sql*` commands they **execute structurally**:
the handler reads the parsed structure and performs the schema change
through the playground's metadata-maintaining machinery — writing
`__rdbms_playground_columns` / `__rdbms_playground_relationships`,
applying `STRICT`, using the ten-type vocabulary — so an
advanced-mode-created object is a first-class playground object,
identical to a simple-mode-created one (ADR-0030 §5).

**Simple mode is untouched.** The existing typed commands
(`CreateTable`, `AddColumn`, `AddRelationship`, …) and their grammar
are unchanged; advanced SQL DDL is purely additive.

**Execution sharing (per the user's steer).** The SQL DDL handlers
**reuse the low-level schema/metadata helpers** — the table builder,
the metadata writers, the rebuild-table primitive (ADR-0013) — where
the underlying operation is genuinely the same, so the two surfaces
cannot drift. Where the SQL path is genuinely different (e.g. a
`CREATE TABLE` that declares several inline foreign keys, which has no
simple-mode shape), it is implemented directly **for clarity rather
than bending the simple-mode command shapes to absorb it**. Shared
where it works; separate where it doesn't.

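As an illustration of the split this section describes, the sketch below lists the five DDL variants next to the ADR-0033 DML variants and tags each with how it executes. It is a minimal Rust sketch under assumptions: the `SqlCommand` enum, the `Execution` type, and the `execution` method are hypothetical names, and the real `Command` variants carry the parsed statement rather than being unit variants.

```rust
/// How a command reaches the database (hypothetical helper type).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Execution {
    /// Validated SQL text is run as submitted (the ADR-0033 DML commands).
    Verbatim,
    /// The handler reads the parsed structure and drives the
    /// metadata-maintaining helpers.
    Structural,
}

/// Advanced-mode commands with payloads elided (assumption: the real
/// variants hold the parsed statement).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum SqlCommand {
    SqlInsert,
    SqlUpdate,
    SqlDelete,
    SqlCreateTable,
    SqlAlterTable,
    SqlDropTable,
    SqlCreateIndex,
    SqlDropIndex,
}

impl SqlCommand {
    /// DML runs verbatim; every DDL variant executes structurally.
    pub fn execution(self) -> Execution {
        match self {
            SqlCommand::SqlInsert | SqlCommand::SqlUpdate | SqlCommand::SqlDelete => {
                Execution::Verbatim
            }
            SqlCommand::SqlCreateTable
            | SqlCommand::SqlAlterTable
            | SqlCommand::SqlDropTable
            | SqlCommand::SqlCreateIndex
            | SqlCommand::SqlDropIndex => Execution::Structural,
        }
    }
}
```

The exhaustive `match` means that adding a further statement kind forces an explicit choice of execution strategy at compile time.
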
### 2. Dispatch — shared entry words, advanced-only `alter`

`create` and `drop` are already simple-mode entry words. They reuse
the **category-grouped, mode-aware dispatch** from ADR-0033
Amendment 1: each appears in both the `Simple` and `Advanced` groups
of the `REGISTRY`; in advanced mode the SQL node is tried first and
falls back to the simple node when the SQL shape doesn't match. So in
advanced mode `CREATE TABLE T (id serial)` parses as SQL while
`create table T with pk id(serial)` still parses as the simple form —
exactly as `insert` behaves today. `alter` is a **new advanced-only
entry word** (`CommandCategory::Advanced`); simple mode keeps its
`add column` / `drop column` / `rename column` / `change column`
verbs and gains no `alter`.

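The try-SQL-first, fall-back-to-simple rule can be sketched as follows. This is a toy under stated assumptions, not the project's `REGISTRY`: `Mode`, `Parsed`, `dispatch`, and both matcher functions are hypothetical, and the matchers inspect only the first four words instead of walking the grammar. The simple matcher keys on the `with` keyword purely because the one example in this section uses it.

```rust
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Mode {
    Simple,
    Advanced,
}

/// Which node accepted the line, tagged with the command it would build.
#[derive(Debug, PartialEq, Eq)]
pub enum Parsed {
    Sql(&'static str),
    Simple(&'static str),
}

/// First four whitespace-separated words, or `None` if the line is shorter.
fn first_four(line: &str) -> Option<(&str, &str, &str, &str)> {
    let mut w = line.split_whitespace();
    Some((w.next()?, w.next()?, w.next()?, w.next()?))
}

fn is_create_table(verb: &str, object: &str) -> bool {
    verb.eq_ignore_ascii_case("create") && object.eq_ignore_ascii_case("table")
}

/// Toy SQL node: `create table <name> ( ...`.
fn try_sql(line: &str) -> Option<Parsed> {
    let (verb, object, _name, next) = first_four(line)?;
    if is_create_table(verb, object) && next.starts_with('(') {
        Some(Parsed::Sql("SqlCreateTable"))
    } else {
        None
    }
}

/// Toy simple node: `create table <name> with ...` (assumed shape).
fn try_simple(line: &str) -> Option<Parsed> {
    let (verb, object, _name, next) = first_four(line)?;
    if is_create_table(verb, object) && next.eq_ignore_ascii_case("with") {
        Some(Parsed::Simple("CreateTable"))
    } else {
        None
    }
}

/// Advanced mode tries the SQL node first and falls back to the simple
/// node; simple mode never sees the SQL node.
pub fn dispatch(mode: Mode, line: &str) -> Option<Parsed> {
    match mode {
        Mode::Advanced => try_sql(line).or_else(|| try_simple(line)),
        Mode::Simple => try_simple(line),
    }
}
```

With this toy, `dispatch(Mode::Advanced, "CREATE TABLE T (id serial)")` selects the SQL node, while `create table T with pk id(serial)` falls through to the simple node in either mode.
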
### 3. Type vocabulary (restates ADR-0030 §5)

The type-name slot accepts the playground keywords directly (`text`,
`int`, `real`, `decimal`, `bool`, `date`, `datetime`, `blob`,
`serial`, `shortid`) **and** standard-SQL aliases mapped onto them:
`integer`/`smallint`/`bigint` → `int`; `varchar`/`char` → `text`;
`boolean` → `bool`; `timestamp` → `datetime`; `numeric` → `decimal`;
`float`/`double precision` → `real`; `binary`/`varbinary` → `blob`. A
length/precision argument (`varchar(255)`, `numeric(10,2)`) is
**accepted and ignored** — the playground's types are
unparameterised. Engine storage-type names are neither accepted as
input nor shown (§9).

The map is purely **lexical**: `INTEGER PRIMARY KEY` becomes a plain
`int` primary key — it is **not** treated as auto-increment, unlike
the engine's rowid-alias idiom. Auto-increment is reached only through
the explicit `serial` type (`id serial primary key`). This keeps the
engine's storage behaviour from leaking into the standard surface and
matches ADR-0005's single-auto-increment-type model.

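The lexical map above can be sketched as a single lookup. This is a minimal Rust sketch, not the grammar's actual type-name slot: `PlaygroundType` and `map_type_name` are hypothetical names, case-insensitive matching is an assumption, and the sketch discards everything from the first `(` onward without checking that the argument list is well formed.

```rust
/// The ten playground types (enum name is hypothetical).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum PlaygroundType {
    Text,
    Int,
    Real,
    Decimal,
    Bool,
    Date,
    Datetime,
    Blob,
    Serial,
    Shortid,
}

/// Map a type name as typed by the user onto a playground type.
/// Returns `None` for anything outside the vocabulary and its aliases.
pub fn map_type_name(raw: &str) -> Option<PlaygroundType> {
    use PlaygroundType::*;

    // Discard a length/precision argument such as `(255)` or `(10,2)`.
    let head = match raw.find('(') {
        Some(i) => &raw[..i],
        None => raw,
    };

    // Normalise case and inner whitespace so the two-word alias
    // `double precision` matches however it is spaced.
    let name = head
        .split_whitespace()
        .collect::<Vec<_>>()
        .join(" ")
        .to_ascii_lowercase();

    Some(match name.as_str() {
        "text" | "varchar" | "char" => Text,
        "int" | "integer" | "smallint" | "bigint" => Int,
        "real" | "float" | "double precision" => Real,
        "decimal" | "numeric" => Decimal,
        "bool" | "boolean" => Bool,
        "date" => Date,
        "datetime" | "timestamp" => Datetime,
        "blob" | "binary" | "varbinary" => Blob,
        "serial" => Serial,
        "shortid" => Shortid,
        _ => return None,
    })
}
```

Because the function sees only the type name, `INTEGER` followed by `PRIMARY KEY` still yields `Int`; nothing in the lookup can promote it to `Serial`, which mirrors the "purely lexical" rule.
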
### 4. The DDL surface (full; `Q4`, no pre-emptive cuts)

**`CREATE TABLE <name> ( <element>, … )`**

- **Column elements**: `<name> <type> [constraints…]`, where the
  column constraints are the ADR-0029 set spelled in SQL: `NOT NULL`,
  `UNIQUE`, `PRIMARY KEY`, `DEFAULT <expr>`, `CHECK (<expr>)`, and an
  inline `REFERENCES <T>(<col>) [ON DELETE …] [ON UPDATE …]` (§5).
- **Table elements**: `PRIMARY KEY (<col>, …)` (single **and
  compound**), `UNIQUE (<col>, …)`, `CHECK (<expr>)`,
  `[CONSTRAINT <name>] FOREIGN KEY (<col>) REFERENCES <T>(<col>)
  [ON DELETE …] [ON UPDATE …]` (§5).
- `CHECK` and `DEFAULT` expressions reuse the ADR-0031 `sql_expr`
  grammar (the same fragment `WHERE`/`HAVING`/projections use).
- `CREATE TABLE IF NOT EXISTS <name> …` is admitted: when the table
  already exists the statement is a **no-op that succeeds with a note**
  ("table already exists — skipped") instead of the plain-form
  "table already exists" error. `IF NOT EXISTS` is a near-universal
  cross-vendor idiom, not an engine-specific spelling, so it is part of
  the standard surface (refines §12).

**`DROP TABLE [IF EXISTS] <name>`** → `SqlDropTable`. Cascade of inbound
relationships follows the existing `drop table` semantics. `IF EXISTS`
is admitted (universal across the major engines): dropping an absent
table is then a **no-op that succeeds with a note** instead of the
plain-form "no such table" error.

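The `IF NOT EXISTS` / `IF EXISTS` behaviour amounts to a three-way outcome: applied, skipped with a note, or refused. The Rust sketch below models that against an in-memory set of table names. Everything in it is an assumption for illustration: `Outcome`, `create_table`, and `drop_table` are hypothetical, the real handlers work through the metadata machinery rather than a `BTreeSet`, and the note text is left to the caller.

```rust
use std::collections::BTreeSet;

/// Result of a DDL statement that may be guarded by IF [NOT] EXISTS.
#[derive(Debug, PartialEq, Eq)]
pub enum Outcome {
    /// The schema changed.
    Applied,
    /// Nothing changed; the statement still succeeds and the user gets
    /// a note (wording as given in this section).
    SkippedWithNote,
}

/// `CREATE TABLE [IF NOT EXISTS] <name>`.
pub fn create_table(
    tables: &mut BTreeSet<String>,
    name: &str,
    if_not_exists: bool,
) -> Result<Outcome, String> {
    if tables.contains(name) {
        return if if_not_exists {
            Ok(Outcome::SkippedWithNote)
        } else {
            Err("table already exists".to_string())
        };
    }
    tables.insert(name.to_string());
    Ok(Outcome::Applied)
}

/// `DROP TABLE [IF EXISTS] <name>`.
pub fn drop_table(
    tables: &mut BTreeSet<String>,
    name: &str,
    if_exists: bool,
) -> Result<Outcome, String> {
    if tables.remove(name) {
        return Ok(Outcome::Applied);
    }
    if if_exists {
        Ok(Outcome::SkippedWithNote)
    } else {
        Err("no such table".to_string())
    }
}
```

Keeping the skipped case a success value rather than an error is what lets a guarded statement appear in a replayed history without aborting the replay.
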
**`ALTER TABLE <name> <action>`** → `SqlAlterTable`, where `<action>`
covers the following, each mapping to an existing low-level operation
unless marked new:

| SQL action | Underlying operation |
|---|---|
| `ADD COLUMN <name> <type> [constraints]` | add-column (ADR-0013 rebuild where needed) |
| `DROP COLUMN <name>` | drop-column |
| `RENAME COLUMN <old> TO <new>` | rename-column |
| `ALTER COLUMN <name> TYPE <type>` | change-column-type (§7 conversion) |
| `ADD [CONSTRAINT <name>] <table-constraint>` | add-constraint / add-relationship (FK) |
| `DROP CONSTRAINT <name>` | drop-constraint |
| `RENAME TO <new>` | **table rename (§6, new low-level op)** |

**`CREATE [UNIQUE] INDEX [<name>] ON <table> (<col>, …)`** →
`SqlCreateIndex`, mapped to the ADR-0025 index machinery; `UNIQUE`
sets the index's uniqueness (a small extension to ADR-0025's index
model if it does not already carry the flag, called out in §13).

**`DROP INDEX <name>`** → `SqlDropIndex`.

### 5. Foreign keys → named relationships

A `REFERENCES` / `FOREIGN KEY` clause is the SQL spelling of an
ADR-0013 relationship. Because `SqlCreateTable` is its own command
carrying the whole parsed structure, a `CREATE TABLE` that declares
FK columns **creates the table and its relationship metadata
together** — one statement, one command, one transaction, **one undo
step** (§10). No decomposition into separate commands is needed.

- `ON DELETE` / `ON UPDATE` → the ADR-0013 referential actions.
- A `CONSTRAINT <name> FOREIGN KEY …` names the relationship; an
  unnamed FK is auto-named by the existing ADR-0013 convention.
- `ALTER TABLE child ADD [CONSTRAINT <name>] FOREIGN KEY (<col>)
  REFERENCES <P>(<col>) …` adds a relationship to an existing table
  (the clean 1:1 with add-relationship).
- FK column type compatibility follows `Type::fk_target_type`
  (ADR-0011) unchanged.

### 6. Table rename — advanced mode only (`C1`)

`ALTER TABLE <old> RENAME TO <new>` is **advanced-mode only**; there
is no simple-mode rename-table verb. It needs a genuinely new
low-level operation (none exists today): within one transaction,
rename the table in the database, rename its `data/<table>.csv` file,
and update every metadata row that names it — the column-metadata
rows, and **both ends of any relationship** in
`__rdbms_playground_relationships` that references the old name. Name
validation and `__rdbms_*` rejection apply to the target. This closes
the rename half of `C1` for the advanced surface.

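The metadata half of the rename can be sketched against an in-memory model. This is a minimal Rust sketch under assumptions, not the new low-level op itself: the row structs, their field names, `rename_table_metadata`, and `csv_path` are all hypothetical, the validation shown is reduced to the `__rdbms_` prefix check, and the database rename, the file rename, and the surrounding transaction are left out.

```rust
/// One row of column metadata (field names are assumptions).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ColumnRow {
    pub table: String,
    pub column: String,
}

/// One relationship row; either end may name the renamed table.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct RelationshipRow {
    pub name: String,
    pub child_table: String,
    pub parent_table: String,
}

#[derive(Debug, Default)]
pub struct Metadata {
    pub columns: Vec<ColumnRow>,
    pub relationships: Vec<RelationshipRow>,
}

/// Rewrite every metadata row that names `old` so it names `new`.
pub fn rename_table_metadata(meta: &mut Metadata, old: &str, new: &str) -> Result<(), String> {
    // Reserved-prefix rejection applies to the target name.
    if new.is_empty() || new.starts_with("__rdbms_") {
        return Err(format!("invalid table name: {}", new));
    }
    for row in meta.columns.iter_mut().filter(|r| r.table == old) {
        row.table = new.to_string();
    }
    // Check both ends independently so a self-referencing relationship
    // is updated on both sides.
    for rel in meta.relationships.iter_mut() {
        if rel.child_table == old {
            rel.child_table = new.to_string();
        }
        if rel.parent_table == old {
            rel.parent_table = new.to_string();
        }
    }
    Ok(())
}

/// CSV location for a table, following the `data/<table>.csv` layout.
pub fn csv_path(table: &str) -> String {
    format!("data/{}.csv", table)
}
```

Validating the target before touching any row keeps a rejected rename from leaving the metadata half-updated, which matters once the same steps run inside one transaction.
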
### 7. Column type conversion — one engine, mode-appropriate policy

The per-cell classification of ADR-0017 (clean / lossy / incompatible,
plus static refusals for playground-type-specific targets such as
`→ serial` and `↔ blob`) is a property of the **type set**, shared by
both modes. The policy on the *lossy* tier differs by mode:

| Tier | Simple mode | Advanced mode (`ALTER COLUMN … TYPE`) |
|---|---|---|
| **clean** | auto-convert | auto-convert |
| **incompatible** | refuse (friendly) | refuse (friendly) — real SQL errors too |
| **static-refused** (`→serial`, `↔blob`, …) | refuse | refuse — our own types have no SQL meaning to mirror |
| **lossy** (`3.14`→`3`) | **refuse by default**; `--force-conversion` opts in | **perform it** (what SQL does), with a post-op "N values converted with loss" note; **no force flag** |

Rationale: **simple mode protects up front; advanced mode trusts the
user like SQL does and lets `undo` catch regrets.** A lossy advanced
conversion is snapshotted (§10), so it is one `undo` away — there is
no silent *irreversible* loss, and no need to drop to simple mode to
"force". Conversions that exist only in the playground's vocabulary
stay protected in both modes. The simple-mode `--force-conversion` /
`--dont-convert` flags are unchanged and have **no SQL spelling**
(advanced mode always performs the conversion); the Postgres `USING
<expr>` clause is **not** adopted (§12).

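The policy table above reduces to one decision function over the tier, the mode, and the simple-mode force flag. The Rust sketch below is an illustration under assumptions: `Tier`, `Mode`, `Decision`, and `decide` are hypothetical names, the tier is taken as the already-computed worst classification for the column (the per-cell work of ADR-0017 is not shown), and `--dont-convert` is omitted.

```rust
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Tier {
    Clean,
    Lossy,
    Incompatible,
    StaticRefused,
}

#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Mode {
    Simple,
    Advanced,
}

#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Decision {
    /// Convert without further comment.
    Convert,
    /// Convert, then report how many values lost information.
    ConvertWithLossNote,
    /// Refuse through the friendly-error layer.
    Refuse,
}

/// `force_conversion` models the simple-mode `--force-conversion` flag;
/// it has no SQL spelling, so advanced mode ignores it.
pub fn decide(tier: Tier, mode: Mode, force_conversion: bool) -> Decision {
    match (tier, mode) {
        (Tier::Clean, _) => Decision::Convert,
        (Tier::Incompatible, _) | (Tier::StaticRefused, _) => Decision::Refuse,
        (Tier::Lossy, Mode::Advanced) => Decision::ConvertWithLossNote,
        (Tier::Lossy, Mode::Simple) if force_conversion => Decision::Convert,
        (Tier::Lossy, Mode::Simple) => Decision::Refuse,
    }
}
```

Only the lossy row branches on mode, which is the sense in which the two surfaces share one conversion engine and differ in policy alone. How simple mode reports a forced lossy conversion is outside this sketch.
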
### 8. Constraints

Column- and table-level constraints map to the ADR-0029 model:
`NOT NULL`, `UNIQUE`, `PRIMARY KEY` (incl. compound, table-level),
`DEFAULT <expr>`, `CHECK (<expr>)`. A populated-column constraint
addition reuses ADR-0029's pre-flight dry-run guard. `CHECK` /
`DEFAULT` expressions are stored as the SQL the user could re-enter in
advanced mode (ADR-0030 §11) — one syntax, not a third.

### 9. Engine neutrality (ADR-0030 §7)

No engine type names in or out (§3). `STRICT` is applied internally by
the create path; it is not in the authored grammar, so typing it is an
ordinary parse error, not a surfaced engine feature. Parse errors,
out-of-subset refusals, and execution failures route through the
friendly-error layer (ADR-0019) with engine-neutral wording.

### 10. Persistence, metadata, history, replay, undo

- Structural execution keeps `project.yaml`, the metadata tables, and
  the CSV layer correct with the same guarantees as the simple-mode
  path (ADR-0015 §6 ordering preserved).
- `history.log` records the **literal submitted SQL line**; replay
  re-runs it through the one walker with the advanced view active.
  `create` / `drop` / `alter` are **schema-write entry words, not in
  ADR-0034 Amendment 1's app-lifecycle skip set**, so SQL DDL
  **replays as a write** (re-applied) with **no replay-filter change**
  — unlike `undo` / `redo`, which had to be added to that skip set.
- **Undo (ADR-0006):** each SQL DDL statement is a user mutation
  carrying a `source`, so it is snapshotted by the worker hook and is
  **one undo step** — including a `CREATE TABLE` with foreign keys,
  precisely because it is a single command (§5) rather than a
  decomposed sequence.

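The replay point in the second bullet can be pictured as a membership test on the entry word. The Rust sketch below is an assumption-heavy illustration: `REPLAY_SKIP` and `skipped_on_replay` are hypothetical, and the set holds only `undo` and `redo` because those are the only members this ADR names (the full set is defined by ADR-0034 Amendment 1).

```rust
/// Entry words a replay skips. Only the two members named in this ADR
/// are listed; the real skip set may contain more.
const REPLAY_SKIP: &[&str] = &["undo", "redo"];

/// True when a history line starts with a skipped entry word.
/// Lines that are not skipped are re-run through the walker.
pub fn skipped_on_replay(line: &str) -> bool {
    match line.split_whitespace().next() {
        Some(word) => REPLAY_SKIP.iter().any(|s| word.eq_ignore_ascii_case(s)),
        None => false,
    }
}
```

Since `create`, `drop`, and `alter` are absent from the set, a logged `ALTER TABLE` line is re-applied on replay without any change to the filter.
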
### 11. Ambient assistance comes for free (ADR-0030 §8)

Because the DDL is grammar in the unified tree, the walker
**mechanisms** apply with no DDL-specific assistance code: syntax
highlighting, the `[ERR]`/`[WRN]` validity indicator (ADR-0027), the
per-command parse-error usage skeleton (ADR-0021), and the completion
engine.

What each grammar node still **authors** (this is writing the grammar,
not bolting assistance on afterwards): the correct `IdentSource` on
every schema-name slot — so `ALTER TABLE`/`DROP TABLE`/`DROP INDEX`
and `REFERENCES T(col)` / `CREATE INDEX ON T (cols)` complete from the
`SchemaCache`; the per-node hint + usage catalog keys (as the
app-command nodes carry `help_id` / `usage_ids`); and the
DDL-specific walker diagnostics with their catalog keys — the DDL
peers of the DML diagnostics ADR-0033 added (e.g. unknown type,
column-already-exists, FK column-type mismatch, the §7 lossy-conversion
note). The integration is structural, not free of authoring.

### 12. Out of scope

- Per ADR-0030 §3: views, triggers, transaction control, `PRAGMA`,
  `ATTACH`/`DETACH`, `VACUUM`, virtual tables, multi-statement
  batches. One statement per submission; a trailing `;` is tolerated.
- The Postgres `USING <expr>` conversion clause (§7) — heavy
  (per-row expression evaluation), dialect-specific, and unable to
  express playground-type targets.
- The simple-mode `--dont-convert` semantics have no SQL form
  (advanced `ALTER COLUMN TYPE` always converts).
- The **DSL → SQL teaching echo** (ADR-0030 §10) is Phase 5, a
  separate ADR — not this one.
- Engine-specific DDL spellings (`AUTOINCREMENT`, `WITHOUT ROWID`,
  collations) — the grammar admits the standard surface; extras are
  ordinary parse errors. (`IF [NOT] EXISTS` was **reclassified into
  scope** — see §4 — as a near-universal cross-vendor idiom rather
  than an engine-specific spelling.)

### 13. Phased implementation plan

Sub-phases, each opening with the smallest end-to-end slice and each
with an explicit exit gate + a written Devil's-Advocate gate, mirroring
ADR-0033's structure:

- **4a — Dispatch + `CREATE TABLE` core.** Advanced `create`
  dispatch; `SqlCreateTable` for columns + types (the §3 map, incl. the
  two-word `double precision` and discarded length args) + the
  **clean-reuse column constraints only** — `NOT NULL` / `UNIQUE` /
  column-level `PRIMARY KEY` — + single/compound table-level
  `PRIMARY KEY`, plus `IF NOT EXISTS` (no-op-with-note, §4). Reuses
  `do_create_table` (extended so a `serial` sole-PK inlines `PRIMARY
  KEY` in a multi-column table, preserving auto-increment). **No FK**
  (4b); **no `DEFAULT`/`CHECK`/table-level `UNIQUE`** (4a.2).
- **4a.2 — The constraint slice.** Split out (2026-05-24,
  user-confirmed) for the constraints that are *not* a clean reuse:
  (1) **`CHECK`/`DEFAULT`** via the full `sql_expr` surface stored as
  **raw SQL text** — needed because `sql_expr` is validate-only and
  yields no `Expr` AST for `compile_check_sql`/`ColumnSpec`, so it is a
  separate execution path; (2) **composite `UNIQUE(a,b)` and
  multi-column table `CHECK`** — the first structures `TableSchema`
  cannot already represent, needing a model + YAML round-trip +
  `read_schema` detection + `do_create_table` emission extension, with
  save/load/rebuild tests. Until then 4a rejects all of these with a
  "not yet supported" error. (The general rule: a DDL feature needs new
  model/execution work only when it introduces a structure simple mode
  could never produce, or an expression the structural helper cannot
  consume — cf. the `UNIQUE`-index flag in 4d and the rename op in 4h.)
- **4b — Foreign keys in `CREATE TABLE`.** Inline `REFERENCES` +
  table-level `FOREIGN KEY` → relationship metadata, one undo step.
- **4c — `DROP TABLE [IF EXISTS]`** → `SqlDropTable` (cascade parity;
  `IF EXISTS` no-op-with-note, §4).
- **4d — `CREATE [UNIQUE] INDEX` / `DROP INDEX`** → `SqlCreateIndex`
  / `SqlDropIndex` (ADR-0025; the `UNIQUE` flag extension if needed).
- **4e — `ALTER TABLE` add/drop/rename column.**
- **4f — `ALTER TABLE … ALTER COLUMN TYPE`** (the §7 conversion
  model + the lossy-with-note path).
- **4g — `ALTER TABLE` add/drop constraint, add foreign key.**
- **4h — `ALTER TABLE … RENAME TO`** (the §6 new low-level op).
- **4i — Verification sweep.** Typing-surface + matrix coverage,
  engine-neutral error pass, undo-parity check (one step per
  statement), `help`/usage for the new forms.

## Consequences

- Advanced mode reaches DDL parity with simple mode and adds
  table-rename, so a learner can build and evolve a whole schema in
  standard SQL with the playground's types, metadata, and safety
  intact.
- The command set grows by five `Sql*` DDL variants; the worker gains
  their handlers, which lean on shared low-level helpers where the
  operation matches the simple-mode path and stand alone where the
  SQL surface is genuinely richer (multi-FK `CREATE TABLE`).
- One genuinely new capability — table rename — adds a low-level op
  that the simple mode does not have; it must keep the CSV file name
  and the relationship metadata in step with the table name.
- ADR-0030 §4 is clarified (own `Sql*` DDL commands, structurally
  executed); no behaviour of the shipped DML/`SELECT` phases changes.
- The conversion model unifies simple and advanced without a force
  flag in SQL, relying on `undo` (ADR-0006) as the advanced-mode
  safety net — a concrete payoff of having shipped undo first.

## See also

- **ADR-0030** — the advanced-mode architecture; this is its Phase 4
  and clarifies §4 (DDL representation) and restates §5 (types) / §7
  (neutrality) / §8 (assistance) / §11 (persistence).
- **ADR-0033** — the DML phase; source of the category-grouped
  mode-aware dispatch (Amendment 1) reused for shared entry words.
- **ADR-0031** — `sql_expr`, reused for `CHECK` / `DEFAULT`.
- **ADR-0013** — relationships + the rebuild-table primitive that the
  `ALTER`/FK handlers build on.
- **ADR-0017** — the column type-change classification §7 shares.
- **ADR-0029** — column constraints; **ADR-0025** — indexes;
  **ADR-0011** — FK column-type compatibility; **ADR-0005** — the
  ten-type vocabulary.
- **ADR-0006** — undo; each DDL statement is one undo step (§10).