rdbms-playground/docs/adr/0035-advanced-mode-sql-ddl.md
claude@clouddev1 a079200b17 docs: ADR-0035 — advanced-mode SQL DDL (Phase 4)
Phase 4 of the ADR-0030 roadmap; clarifies §4. Advanced-mode
CREATE/DROP/ALTER TABLE + CREATE/DROP INDEX get their own
per-statement Sql* commands, executed structurally (not verbatim)
so the playground's types, named relationships, and STRICT stay
intact. Full surface (no pre-emptive cuts): constraints, compound
PK, FK -> named relationships (one statement = one undo step),
ALTER incl. advanced-only table rename (C1), [UNIQUE] indexes.
Unified column-type-conversion: lossy refuses in simple mode but
proceeds-with-a-note in advanced, with undo as the safety net.
Integration (parser/hint/completion/diagnostics/history/replay/undo)
is structural via the unified grammar; replay treats DDL as a write.
Nine sub-phases (4a-4i). Updates the ADR README index.

Status: Proposed (design agreed; implementation pending).
2026-05-24 22:14:30 +00:00

# ADR-0035: Advanced-mode SQL DDL
## Status
Proposed. Design agreed with the user (2026-05-24); implementation
is phased and pending (§13). This is **Phase 4** of the ADR-0030
roadmap (the advanced-mode SQL surface), the peer of ADR-0031
(expression grammar), ADR-0032 (`SELECT`), and ADR-0033 (DML). It
**clarifies ADR-0030 §4** on how DDL is represented and executed.
## Context
ADR-0030 fixed the *architecture* of advanced mode — SQL authored as
grammar in the unified tree (not a separate batch parser), with the
playground's own type vocabulary and metadata model — and noted that
each large grammar piece gets its own focused ADR. Phases 1-3 shipped:
the SQL expression grammar (ADR-0031), full `SELECT` (ADR-0032), and
DML — `INSERT`/`UPDATE`/`DELETE` (ADR-0033). Phase 4 is **DDL**:
`CREATE` / `DROP` / `ALTER TABLE` and `CREATE` / `DROP INDEX`.
Two things from the earlier phases shape this one:
1. **The advanced surface gets its *own* commands.** ADR-0033
established that a SQL statement produces a distinct command
(`SqlInsert` / `SqlUpdate` / `SqlDelete`), separate from the
simple-mode typed command for the same verb. Those DML commands
execute as **validated SQL run verbatim** — possible only because
DML changes no schema and touches no metadata.
2. **DDL cannot run verbatim.** If `CREATE TABLE Orders (id INTEGER)`
executed as-is, the engine would make the table, but the
playground would lose what the user meant: that `id` is `serial`,
that a `REFERENCES` clause is a *named relationship*, that `STRICT`
applies, that the ten-type vocabulary governs. Recovering that
needs the parsed statement either way.
ADR-0030 §4 said "DDL → a `Command` … run the typed executor." That
remains right in spirit — DDL is *structurally* executed, not raw —
but it predates the DML build and could be read as "reuse the simple-mode
`CreateTable` variant." This ADR clarifies it: **DDL gets its own
advanced commands too**, executed structurally (not verbatim). The
"verbatim" execution of the DML commands is an implementation
convenience available only because nothing about DML required
otherwise — not an architectural rule.
Requirements touched: realizes `Q4` for DDL; closes the advanced-mode
side of table/column/index/constraint/relationship operations; lands
the table-rename half of `C1` (advanced mode only).
## Decision
### 1. Own per-statement SQL DDL commands (clarifies ADR-0030 §4)
New `Command` variants, one per statement kind — granularity mirrors
the DML phase:
- `SqlCreateTable`
- `SqlAlterTable`
- `SqlDropTable`
- `SqlCreateIndex`
- `SqlDropIndex`
They are produced by the unified grammar's `ast_builder`s in advanced
mode. Unlike the DML `Sql*` commands they **execute structurally**:
the handler reads the parsed structure and performs the schema change
through the playground's metadata-maintaining machinery — writing
`__rdbms_playground_columns` / `__rdbms_playground_relationships`,
applying `STRICT`, using the ten-type vocabulary — so an
advanced-mode-created object is a first-class playground object,
identical to a simple-mode-created one (ADR-0030 §5).
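
As a rough illustration of the split described above, the five DDL variants and the structural-versus-verbatim distinction might be sketched as follows. The variant payloads (bare names instead of a parsed statement) and the `ExecutionStyle` type are assumptions made for this sketch, not the project's actual definitions.

```rust
/// How a command's handler runs. Hypothetical type, for illustration only.
#[derive(Debug, PartialEq, Eq)]
enum ExecutionStyle {
    /// Validated SQL handed to the engine as written (the DML `Sql*` commands).
    Verbatim,
    /// The handler walks the parsed structure and maintains playground metadata.
    Structural,
}

/// A trimmed-down `Command`; the real variants would carry the parsed statement.
#[allow(dead_code)]
#[derive(Debug)]
enum Command {
    SqlInsert,
    SqlUpdate,
    SqlDelete,
    SqlCreateTable { table: String },
    SqlAlterTable { table: String },
    SqlDropTable { table: String },
    SqlCreateIndex { index: Option<String> },
    SqlDropIndex { index: String },
}

impl Command {
    /// DML runs verbatim; every DDL variant runs structurally.
    fn execution_style(&self) -> ExecutionStyle {
        match self {
            Command::SqlInsert | Command::SqlUpdate | Command::SqlDelete => {
                ExecutionStyle::Verbatim
            }
            Command::SqlCreateTable { .. }
            | Command::SqlAlterTable { .. }
            | Command::SqlDropTable { .. }
            | Command::SqlCreateIndex { .. }
            | Command::SqlDropIndex { .. } => ExecutionStyle::Structural,
        }
    }
}
```

Listing the DDL arms explicitly rather than using a wildcard means a future `Sql*` variant would force a deliberate choice of execution style at compile time.
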
**Simple mode is untouched.** The existing typed commands
(`CreateTable`, `AddColumn`, `AddRelationship`, …) and their grammar
are unchanged; advanced SQL DDL is purely additive.
**Execution sharing (per the user's steer).** The SQL DDL handlers
**reuse the low-level schema/metadata helpers** — the table builder,
the metadata writers, the rebuild-table primitive (ADR-0013) — where
the underlying operation is genuinely the same, so the two surfaces
cannot drift. Where the SQL path is genuinely different (e.g. a
`CREATE TABLE` that declares several inline foreign keys, which has no
simple-mode shape), it is implemented directly **for clarity rather
than bending the simple-mode command shapes to absorb it**. Shared
where it works; separate where it doesn't.
### 2. Dispatch — shared entry words, advanced-only `alter`
`create` and `drop` are already simple-mode entry words. They reuse
the **category-grouped, mode-aware dispatch** from ADR-0033
Amendment 1: each appears in both the `Simple` and `Advanced` groups
of the `REGISTRY`; in advanced mode the SQL node is tried first and
falls back to the simple node when the SQL shape doesn't match. So in
advanced mode `CREATE TABLE T (id serial)` parses as SQL while
`create table T with pk id(serial)` still parses as the simple form —
exactly as `insert` behaves today. `alter` is a **new advanced-only
entry word** (`CommandCategory::Advanced`); simple mode keeps its
`add column` / `drop column` / `rename column` / `change column`
verbs and gains no `alter`.
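
The try-SQL-first, fall-back-to-simple behaviour can be sketched in miniature. The two `try_*` functions below are crude stand-ins for the real grammar nodes, and `Mode`, `Parsed`, and `dispatch` are names invented for this sketch; only the ordering of attempts reflects the decision above.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Mode {
    Simple,
    Advanced,
}

#[derive(Debug, PartialEq, Eq)]
enum Parsed {
    Sql(&'static str),
    Simple(&'static str),
}

/// Splits `create table <name> <tail>` into the lower-cased tail after the name.
fn tail_after_name(line: &str) -> Option<String> {
    let lower = line.trim().to_ascii_lowercase();
    let rest = lower.strip_prefix("create table ")?.trim_start();
    let name_len = rest
        .find(|c: char| !(c.is_ascii_alphanumeric() || c == '_'))
        .unwrap_or(rest.len());
    if name_len == 0 {
        return None;
    }
    Some(rest[name_len..].trim_start().to_string())
}

/// Stand-in for the SQL node: the name is followed by a parenthesised list.
fn try_sql_create(line: &str) -> Option<Parsed> {
    let tail = tail_after_name(line)?;
    if tail.starts_with('(') {
        Some(Parsed::Sql("SqlCreateTable"))
    } else {
        None
    }
}

/// Stand-in for the simple-mode node: the name is followed by `with`.
fn try_simple_create(line: &str) -> Option<Parsed> {
    let tail = tail_after_name(line)?;
    if tail.starts_with("with") {
        Some(Parsed::Simple("CreateTable"))
    } else {
        None
    }
}

/// Advanced mode tries the SQL node first and falls back to the simple node.
fn dispatch(mode: Mode, line: &str) -> Option<Parsed> {
    match mode {
        Mode::Advanced => try_sql_create(line).or_else(|| try_simple_create(line)),
        Mode::Simple => try_simple_create(line),
    }
}
```

With these stand-ins, `dispatch(Mode::Advanced, "CREATE TABLE T (id serial)")` yields the SQL command, while `dispatch(Mode::Advanced, "create table T with pk id(serial)")` falls through to the simple form.
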
### 3. Type vocabulary (restates ADR-0030 §5)
The type-name slot accepts the playground keywords directly (`text`,
`int`, `real`, `decimal`, `bool`, `date`, `datetime`, `blob`,
`serial`, `shortid`) **and** standard-SQL aliases mapped onto them:
`integer`/`smallint`/`bigint` → `int`; `varchar`/`char` → `text`;
`boolean` → `bool`; `timestamp` → `datetime`; `numeric` → `decimal`;
`float`/`double precision` → `real`; `binary`/`varbinary` → `blob`. A
length/precision argument (`varchar(255)`, `numeric(10,2)`) is
**accepted and ignored** — the playground's types are
unparameterised. Engine storage-type names are neither accepted as
input nor shown (§9).
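
The alias map above is small enough to write out. The following is a minimal sketch, assuming a hypothetical `PlaygroundType` enum and `resolve_type` function, and assuming type names are matched case-insensitively; the real grammar would also validate the parenthesised argument rather than simply cutting it off.

```rust
/// The ten playground types. The enum name and variant spellings are assumed.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PlaygroundType {
    Text,
    Int,
    Real,
    Decimal,
    Bool,
    Date,
    DateTime,
    Blob,
    Serial,
    ShortId,
}

/// Maps a type spelling (playground keyword or standard-SQL alias) onto a
/// playground type. Returns `None` for anything outside the accepted set.
fn resolve_type(spelling: &str) -> Option<PlaygroundType> {
    use PlaygroundType::*;

    // A length/precision argument such as `(255)` is accepted and ignored.
    let base = match spelling.find('(') {
        Some(i) => &spelling[..i],
        None => spelling,
    };
    // Normalise case and inner whitespace so `DOUBLE  PRECISION` still matches.
    let key = base
        .split_whitespace()
        .collect::<Vec<_>>()
        .join(" ")
        .to_ascii_lowercase();

    Some(match key.as_str() {
        "text" | "varchar" | "char" => Text,
        "int" | "integer" | "smallint" | "bigint" => Int,
        "real" | "float" | "double precision" => Real,
        "decimal" | "numeric" => Decimal,
        "bool" | "boolean" => Bool,
        "date" => Date,
        "datetime" | "timestamp" => DateTime,
        "blob" | "binary" | "varbinary" => Blob,
        "serial" => Serial,
        "shortid" => ShortId,
        _ => return None,
    })
}
```

Under these assumptions `resolve_type("varchar(255)")` and `resolve_type("text")` both resolve to the same unparameterised `Text` type, which is the "accepted and ignored" behaviour described above.
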
### 4. The DDL surface (full; `Q4`, no pre-emptive cuts)
**`CREATE TABLE <name> ( <element>, … )`**
- **Column elements**: `<name> <type> [constraints…]`, where the
column constraints are the ADR-0029 set spelled in SQL: `NOT NULL`,
`UNIQUE`, `PRIMARY KEY`, `DEFAULT <expr>`, `CHECK (<expr>)`, and an
inline `REFERENCES <T>(<col>) [ON DELETE …] [ON UPDATE …]` (§5).
- **Table elements**: `PRIMARY KEY (<col>, …)` (single **and
compound**), `UNIQUE (<col>, …)`, `CHECK (<expr>)`,
`[CONSTRAINT <name>] FOREIGN KEY (<col>) REFERENCES <T>(<col>)
[ON DELETE …] [ON UPDATE …]` (§5).
- `CHECK` and `DEFAULT` expressions reuse the ADR-0031 `sql_expr`
grammar (the same fragment `WHERE`/`HAVING`/projections use).
**`DROP TABLE <name>`** → `SqlDropTable`. Cascade of inbound
relationships follows the existing `drop table` semantics.
**`ALTER TABLE <name> <action>`** → `SqlAlterTable`, where `<action>`
is one of the following, each mapping to a low-level operation
(existing unless marked new):
| SQL action | Underlying operation |
|---|---|
| `ADD COLUMN <name> <type> [constraints]` | add-column (ADR-0013 rebuild where needed) |
| `DROP COLUMN <name>` | drop-column |
| `RENAME COLUMN <old> TO <new>` | rename-column |
| `ALTER COLUMN <name> TYPE <type>` | change-column-type (§7 conversion) |
| `ADD [CONSTRAINT <name>] <table-constraint>` | add-constraint / add-relationship (FK) |
| `DROP CONSTRAINT <name>` | drop-constraint |
| `RENAME TO <new>` | **table rename (§6, new low-level op)** |
**`CREATE [UNIQUE] INDEX [<name>] ON <table> (<col>, …)`** →
`SqlCreateIndex`, mapped to the ADR-0025 index machinery; `UNIQUE`
sets the index's uniqueness (a small extension to ADR-0025's index
model if it does not already carry the flag, called out in §13).
**`DROP INDEX <name>`** → `SqlDropIndex`.
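
The `ALTER TABLE` action table can be read as a closed set of cases. The sketch below models it as an enum plus a lookup; `AlterAction`, `TableConstraint`, `underlying_operation`, and the field layouts are illustrative assumptions, and the returned strings are just the operation labels from the table, not real function names.

```rust
/// Table-level constraints an `ADD [CONSTRAINT …]` action can carry (assumed shape).
#[allow(dead_code)]
enum TableConstraint {
    PrimaryKey(Vec<String>),
    Unique(Vec<String>),
    Check(String),
    ForeignKey { column: String, parent: String, parent_column: String },
}

/// One parsed `ALTER TABLE <name> <action>` action (assumed shape).
#[allow(dead_code)]
enum AlterAction {
    AddColumn { name: String, type_name: String },
    DropColumn { name: String },
    RenameColumn { old: String, new: String },
    AlterColumnType { name: String, type_name: String },
    AddConstraint { name: Option<String>, constraint: TableConstraint },
    DropConstraint { name: String },
    RenameTo { new: String },
}

/// Returns the label of the low-level operation each action maps to.
fn underlying_operation(action: &AlterAction) -> &'static str {
    match action {
        AlterAction::AddColumn { .. } => "add-column",
        AlterAction::DropColumn { .. } => "drop-column",
        AlterAction::RenameColumn { .. } => "rename-column",
        AlterAction::AlterColumnType { .. } => "change-column-type",
        // A foreign key is a relationship, not an ordinary constraint.
        AlterAction::AddConstraint {
            constraint: TableConstraint::ForeignKey { .. },
            ..
        } => "add-relationship",
        AlterAction::AddConstraint { .. } => "add-constraint",
        AlterAction::DropConstraint { .. } => "drop-constraint",
        AlterAction::RenameTo { .. } => "table-rename",
    }
}
```

The foreign-key arm sits before the general `AddConstraint` arm so that `ADD … FOREIGN KEY` routes to the relationship path while every other table constraint routes to add-constraint.
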
### 5. Foreign keys → named relationships
A `REFERENCES` / `FOREIGN KEY` clause is the SQL spelling of an
ADR-0013 relationship. Because `SqlCreateTable` is its own command
carrying the whole parsed structure, a `CREATE TABLE` that declares
FK columns **creates the table and its relationship metadata
together** — one statement, one command, one transaction, **one undo
step** (§10). No decomposition into separate commands is needed.
- `ON DELETE` / `ON UPDATE` → the ADR-0013 referential actions.
- A `CONSTRAINT <name> FOREIGN KEY …` names the relationship; an
unnamed FK is auto-named by the existing ADR-0013 convention.
- `ALTER TABLE child ADD [CONSTRAINT <name>] FOREIGN KEY (<col>)
REFERENCES <P>(<col>) …` adds a relationship to an existing table
(the clean 1:1 with add-relationship).
- FK column type compatibility follows `Type::fk_target_type`
(ADR-0011) unchanged.
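
To make the "one statement, one command, one undo step" point concrete, here is a toy in-memory model. Everything in it is an assumption for illustration: the `Playground`, `Schema`, and `ForeignKey` shapes, the whole-schema snapshot, and in particular the `<table>_<column>_fk` fallback name, which is a placeholder and not the ADR-0013 naming convention.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
struct Relationship {
    name: String,
    child: String,
    child_column: String,
    parent: String,
    parent_column: String,
}

#[derive(Debug, Clone, Default, PartialEq, Eq)]
struct Schema {
    tables: Vec<String>,
    relationships: Vec<Relationship>,
}

struct ForeignKey {
    name: Option<String>, // from `CONSTRAINT <name>`, if given
    column: String,
    parent: String,
    parent_column: String,
}

struct SqlCreateTable {
    table: String,
    foreign_keys: Vec<ForeignKey>,
}

#[derive(Default)]
struct Playground {
    schema: Schema,
    undo_stack: Vec<Schema>,
}

impl Playground {
    /// Applies the whole statement after taking exactly one snapshot.
    fn apply(&mut self, cmd: &SqlCreateTable) {
        self.undo_stack.push(self.schema.clone());
        self.schema.tables.push(cmd.table.clone());
        for fk in &cmd.foreign_keys {
            // Placeholder auto-name; the real convention lives in ADR-0013.
            let name = fk
                .name
                .clone()
                .unwrap_or_else(|| format!("{}_{}_fk", cmd.table, fk.column));
            self.schema.relationships.push(Relationship {
                name,
                child: cmd.table.clone(),
                child_column: fk.column.clone(),
                parent: fk.parent.clone(),
                parent_column: fk.parent_column.clone(),
            });
        }
    }

    /// Restores the snapshot taken before the most recent command.
    fn undo(&mut self) -> bool {
        match self.undo_stack.pop() {
            Some(previous) => {
                self.schema = previous;
                true
            }
            None => false,
        }
    }
}
```

Because the table and both relationships are added under a single snapshot, one `undo()` removes all of them together, which is the property a decomposed command sequence would not give.
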
### 6. Table rename — advanced mode only (`C1`)
`ALTER TABLE <old> RENAME TO <new>` is **advanced-mode only**; there
is no simple-mode rename-table verb. It needs a genuinely new
low-level operation (none exists today): within one transaction,
rename the table in the database, rename its `data/<table>.csv` file,
and update every metadata row that names it — the column-metadata
rows, and **both ends of any relationship** in
`__rdbms_playground_relationships` that references the old name. Name
validation and `__rdbms_*` rejection apply to the target. This closes
the rename half of `C1` for the advanced surface.
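
The metadata side of the rename is where mistakes are easiest to make, so a small sketch may help. It assumes a hypothetical `RelationshipRow` shape with one table name per end, and helper names (`check_target`, `csv_path`, `rename_in_relationships`) invented here; the reserved-prefix check is treated as case-insensitive, which is also an assumption.

```rust
/// One row of the relationships metadata table (assumed shape).
#[derive(Debug, Clone, PartialEq, Eq)]
struct RelationshipRow {
    name: String,
    child_table: String,
    parent_table: String,
}

/// Rejects empty names and the reserved `__rdbms_*` prefix for the target.
fn check_target(new: &str) -> Result<(), String> {
    if new.is_empty() {
        return Err("table name is empty".to_string());
    }
    if new.to_ascii_lowercase().starts_with("__rdbms_") {
        return Err(format!("`{}` is a reserved name", new));
    }
    Ok(())
}

/// Path of a table's CSV file, following the `data/<table>.csv` layout.
fn csv_path(table: &str) -> String {
    format!("data/{}.csv", table)
}

/// Rewrites both ends of every relationship that names `old`.
/// Returns how many rows were touched.
fn rename_in_relationships(rows: &mut [RelationshipRow], old: &str, new: &str) -> usize {
    let mut touched = 0;
    for row in rows.iter_mut() {
        let mut changed = false;
        if row.child_table == old {
            row.child_table = new.to_string();
            changed = true;
        }
        if row.parent_table == old {
            row.parent_table = new.to_string();
            changed = true;
        }
        if changed {
            touched += 1;
        }
    }
    touched
}
```

Checking the child and parent ends independently matters for self-referential relationships, where the same row names the renamed table twice.
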
### 7. Column type conversion — one engine, mode-appropriate policy
The per-cell classification of ADR-0017 (clean / lossy / incompatible,
plus static refusals for playground-type-specific targets such as
`→ serial` and `↔ blob`) is a property of the **type set**, shared by
both modes. The policy on the *lossy* tier differs by mode:
| Tier | Simple mode | Advanced mode (`ALTER COLUMN … TYPE`) |
|---|---|---|
| **clean** | auto-convert | auto-convert |
| **incompatible** | refuse (friendly) | refuse (friendly) — real SQL errors too |
| **static-refused** (`→serial`, `↔blob`, …) | refuse | refuse — our own types have no SQL meaning to mirror |
| **lossy** (`3.14`→`3`) | **refuse by default**; `--force-conversion` opts in | **perform it** (what SQL does), with a post-op "N values converted with loss" note; **no force flag** |
Rationale: **simple mode protects up front; advanced mode trusts the
user like SQL does and lets `undo` catch regrets.** A lossy advanced
conversion is snapshotted (§10), so it is one `undo` away — there is
no silent *irreversible* loss, and no need to drop to simple mode to
"force". Conversions that exist only in the playground's vocabulary
stay protected in both modes. The simple-mode `--force-conversion` /
`--dont-convert` flags are unchanged and have **no SQL spelling**
(advanced mode always performs the conversion); the Postgres `USING
<expr>` clause is **not** adopted (§12).
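
The policy table reduces to a small decision function. This is a sketch under assumptions: `Mode`, `Tier`, `Outcome`, and `decide` are names chosen here, `--dont-convert` is not modelled, and whether a forced simple-mode conversion also emits a loss note is not stated above, so the sketch returns a plain `Convert` for it.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Mode {
    Simple,
    Advanced,
}

/// The ADR-0017 classification of a requested conversion.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Tier {
    Clean,
    Lossy,
    Incompatible,
    StaticRefused,
}

#[derive(Debug, PartialEq, Eq)]
enum Outcome {
    Convert,
    /// Convert, then report "N values converted with loss".
    ConvertWithLossNote,
    Refuse,
}

/// `force_conversion` models simple mode's `--force-conversion`; advanced
/// mode has no SQL spelling for it, so the flag is ignored there.
fn decide(mode: Mode, tier: Tier, force_conversion: bool) -> Outcome {
    match (tier, mode) {
        (Tier::Clean, _) => Outcome::Convert,
        (Tier::Incompatible, _) | (Tier::StaticRefused, _) => Outcome::Refuse,
        (Tier::Lossy, Mode::Simple) => {
            if force_conversion {
                // Assumption: no loss note is modelled for the forced path.
                Outcome::Convert
            } else {
                Outcome::Refuse
            }
        }
        (Tier::Lossy, Mode::Advanced) => Outcome::ConvertWithLossNote,
    }
}
```

Only the lossy row depends on the mode; the other three tiers behave identically in both, which is what "one engine, mode-appropriate policy" amounts to.
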
### 8. Constraints
Column- and table-level constraints map to the ADR-0029 model:
`NOT NULL`, `UNIQUE`, `PRIMARY KEY` (incl. compound, table-level),
`DEFAULT <expr>`, `CHECK (<expr>)`. A populated-column constraint
addition reuses ADR-0029's pre-flight dry-run guard. `CHECK` /
`DEFAULT` expressions are stored as the SQL the user could re-enter in
advanced mode (ADR-0030 §11) — one syntax, not a third.
### 9. Engine neutrality (ADR-0030 §7)
No engine type names in or out (§3). `STRICT` is applied internally by
the create path; it is not in the authored grammar, so typing it is an
ordinary parse error, not a surfaced engine feature. Parse errors,
out-of-subset refusals, and execution failures route through the
friendly-error layer (ADR-0019) with engine-neutral wording.
### 10. Persistence, metadata, history, replay, undo
- Structural execution keeps `project.yaml`, the metadata tables, and
the CSV layer correct with the same guarantees as the simple-mode
path (ADR-0015 §6 ordering preserved).
- `history.log` records the **literal submitted SQL line**; replay
re-runs it through the one walker with the advanced view active.
`create` / `drop` / `alter` are **schema-write entry words, not in
ADR-0034 Amendment 1's app-lifecycle skip set**, so SQL DDL
**replays as a write** (re-applied) with **no replay-filter change**
— unlike `undo` / `redo`, which had to be added to that skip set.
- **Undo (ADR-0006):** each SQL DDL statement is a user mutation
carrying a `source`, so it is snapshotted by the worker hook and is
**one undo step** — including a `CREATE TABLE` with foreign keys,
precisely because it is a single command (§5) rather than a
decomposed sequence.
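
The replay point can be illustrated with a toy filter. The sketch models only the app-lifecycle skip set, and only the two members this ADR names (`undo` and `redo`); the real ADR-0034 set may contain more, and any other replay filtering is not modelled. `REPLAY_SKIP`, `entry_word`, and `is_replayed` are hypothetical names.

```rust
/// Entry words skipped on replay. Only `undo` and `redo` are named in this
/// ADR; the actual skip set may hold more entries (assumption).
const REPLAY_SKIP: &[&str] = &["undo", "redo"];

/// First whitespace-delimited word of a history line, lower-cased.
fn entry_word(line: &str) -> Option<String> {
    line.split_whitespace().next().map(|w| w.to_ascii_lowercase())
}

/// True when the line's entry word is not in the lifecycle skip set,
/// so replay re-applies the line.
fn is_replayed(line: &str) -> bool {
    match entry_word(line) {
        Some(word) => !REPLAY_SKIP.iter().any(|skip| *skip == word.as_str()),
        None => false,
    }
}
```

Since `create`, `drop`, and `alter` are absent from the skip set, lines such as `ALTER TABLE T RENAME TO U` pass through this filter unchanged, which is why no replay-filter change is needed.
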
### 11. Ambient assistance comes for free (ADR-0030 §8)
Because the DDL is grammar in the unified tree, the walker
**mechanisms** apply with no DDL-specific assistance code: syntax
highlighting, the `[ERR]`/`[WRN]` validity indicator (ADR-0027), the
per-command parse-error usage skeleton (ADR-0021), and the completion
engine.
What each grammar node still **authors** (this is writing the grammar,
not bolting assistance on afterwards): the correct `IdentSource` on
every schema-name slot — so `ALTER TABLE`/`DROP TABLE`/`DROP INDEX`
and `REFERENCES T(col)` / `CREATE INDEX ON T (cols)` complete from the
`SchemaCache`; the per-node hint + usage catalog keys (as the
app-command nodes carry `help_id` / `usage_ids`); and the
DDL-specific walker diagnostics with their catalog keys — the DDL
peers of the DML diagnostics ADR-0033 added (e.g. unknown type,
column-already-exists, FK column-type mismatch, the §7 lossy-conversion
note). The integration is structural, not free of authoring.
### 12. Out of scope
- Per ADR-0030 §3: views, triggers, transaction control, `PRAGMA`,
`ATTACH`/`DETACH`, `VACUUM`, virtual tables, multi-statement
batches. One statement per submission; a trailing `;` is tolerated.
- The Postgres `USING <expr>` conversion clause (§7) — heavy
(per-row expression evaluation), dialect-specific, and unable to
express playground-type targets.
- The simple-mode `--dont-convert` semantics have no SQL form
(advanced `ALTER COLUMN TYPE` always converts).
- The **DSL → SQL teaching echo** (ADR-0030 §10) is Phase 5, a
separate ADR — not this one.
- Engine-specific DDL spellings (`AUTOINCREMENT`, `WITHOUT ROWID`,
collations, `IF [NOT] EXISTS` if judged out-of-subset) — the
grammar admits the standard surface; extras are ordinary parse
errors.
### 13. Phased implementation plan
Sub-phases, each opening with the smallest end-to-end slice and each
with an explicit exit gate + a written Devil's-Advocate gate, mirroring
ADR-0033's structure:
- **4a — Dispatch + `CREATE TABLE` core.** Advanced `create`
dispatch; `SqlCreateTable` for columns + types (the §3 map) +
column constraints + single/compound `PRIMARY KEY`. No FK yet.
- **4b — Foreign keys in `CREATE TABLE`.** Inline `REFERENCES` +
table-level `FOREIGN KEY` → relationship metadata, one undo step.
- **4c — `DROP TABLE`** → `SqlDropTable` (cascade parity).
- **4d — `CREATE [UNIQUE] INDEX` / `DROP INDEX`** → `SqlCreateIndex`
/ `SqlDropIndex` (ADR-0025; the `UNIQUE` flag extension if needed).
- **4e — `ALTER TABLE` add/drop/rename column.**
- **4f — `ALTER TABLE … ALTER COLUMN TYPE`** (the §7 conversion
model + the lossy-with-note path).
- **4g — `ALTER TABLE` add/drop constraint, add foreign key.**
- **4h — `ALTER TABLE … RENAME TO`** (the §6 new low-level op).
- **4i — Verification sweep.** Typing-surface + matrix coverage,
engine-neutral error pass, undo-parity check (one step per
statement), `help`/usage for the new forms.
## Consequences
- Advanced mode reaches DDL parity with simple mode and adds
table-rename, so a learner can build and evolve a whole schema in
standard SQL with the playground's types, metadata, and safety
intact.
- The command set grows by five `Sql*` DDL variants; the worker gains
their handlers, which lean on shared low-level helpers where the
operation matches the simple-mode path and stand alone where the
SQL surface is genuinely richer (multi-FK `CREATE TABLE`).
- One genuinely new capability — table rename — adds a low-level op
that the simple mode does not have; it must keep the CSV file name
and the relationship metadata in step with the table name.
- ADR-0030 §4 is clarified (own `Sql*` DDL commands, structurally
executed); no behaviour of the shipped DML/`SELECT` phases changes.
- The conversion model unifies simple and advanced without a force
flag in SQL, relying on `undo` (ADR-0006) as the advanced-mode
safety net — a concrete payoff of having shipped undo first.
## See also
- **ADR-0030** — the advanced-mode architecture; this is its Phase 4
and clarifies §4 (DDL representation) and restates §5 (types) / §7
(neutrality) / §8 (assistance) / §11 (persistence).
- **ADR-0033** — the DML phase; source of the category-grouped
mode-aware dispatch (Amendment 1) reused for shared entry words.
- **ADR-0031** — `sql_expr`, reused for `CHECK` / `DEFAULT`.
- **ADR-0013** — relationships + the rebuild-table primitive that the
`ALTER`/FK handlers build on.
- **ADR-0017** — the column type-change classification §7 shares.
- **ADR-0029** — column constraints; **ADR-0025** — indexes;
**ADR-0011** — FK column-type compatibility; **ADR-0005** — the
ten-type vocabulary.
- **ADR-0006** — undo; each DDL statement is one undo step (§10).