docs: ADR-0035 — advanced-mode SQL DDL (Phase 4)

Phase 4 of the ADR-0030 roadmap; clarifies §4. Advanced-mode
CREATE/DROP/ALTER TABLE + CREATE/DROP INDEX get their own
per-statement Sql* commands, executed structurally (not verbatim)
so the playground's types, named relationships, and STRICT stay
intact. Full surface (no pre-emptive cuts): constraints, compound
PK, FK -> named relationships (one statement = one undo step),
ALTER incl. advanced-only table rename (C1), [UNIQUE] indexes.
Unified column-type-conversion: lossy refuses in simple mode but
proceeds-with-a-note in advanced, with undo as the safety net.
Integration (parser/hint/completion/diagnostics/history/replay/undo)
is structural via the unified grammar; replay treats DDL as a write.
Nine sub-phases (4a-4i). Updates the ADR README index.

Status: Proposed (design agreed; implementation pending).
claude@clouddev1
2026-05-24 22:14:30 +00:00
parent df6aa69155
commit a079200b17
2 changed files with 331 additions and 0 deletions
@@ -0,0 +1,330 @@
# ADR-0035: Advanced-mode SQL DDL
## Status
Proposed. Design agreed with the user (2026-05-24); implementation
is phased and pending (§13). This is **Phase 4** of the ADR-0030
roadmap (the advanced-mode SQL surface), the peer of ADR-0031
(expression grammar), ADR-0032 (`SELECT`), and ADR-0033 (DML). It
**clarifies ADR-0030 §4** on how DDL is represented and executed.
## Context
ADR-0030 fixed the *architecture* of advanced mode — SQL authored as
grammar in the unified tree (not a separate batch parser), with the
playground's own type vocabulary and metadata model — and noted that
each large grammar piece gets its own focused ADR. Phases 1–3 shipped:
the SQL expression grammar (ADR-0031), full `SELECT` (ADR-0032), and
DML — `INSERT`/`UPDATE`/`DELETE` (ADR-0033). Phase 4 is **DDL**:
`CREATE` / `DROP` / `ALTER TABLE` and `CREATE` / `DROP INDEX`.
Two things from the earlier phases shape this one:
1. **The advanced surface gets its *own* commands.** ADR-0033
established that a SQL statement produces a distinct command
(`SqlInsert` / `SqlUpdate` / `SqlDelete`), separate from the
simple-mode typed command for the same verb. Those DML commands
execute as **validated SQL run verbatim** — possible only because
DML changes no schema and touches no metadata.
2. **DDL cannot run verbatim.** If `CREATE TABLE Orders (id INTEGER)`
executed as-is, the engine would make the table, but the
playground would lose what the user meant: that `id` is `serial`,
that a `REFERENCES` clause is a *named relationship*, that `STRICT`
applies, that the ten-type vocabulary governs. Recovering that
needs the parsed statement either way.
ADR-0030 §4 said "DDL → a `Command` … run the typed executor." That
remains right in spirit — DDL is *structurally* executed, not raw —
but it predates the DML build and could be read as "reuse the simple-mode
`CreateTable` variant." This ADR clarifies it: **DDL gets its own
advanced commands too**, executed structurally (not verbatim). The
"verbatim" execution of the DML commands is an implementation
convenience available only because nothing about DML required
otherwise — not an architectural rule.
Requirements touched: realizes `Q4` for DDL; closes the advanced-mode
side of table/column/index/constraint/relationship operations; lands
the table-rename half of `C1` (advanced mode only).
## Decision
### 1. Own per-statement SQL DDL commands (clarifies ADR-0030 §4)
New `Command` variants, one per statement kind — granularity mirrors
the DML phase:
- `SqlCreateTable`
- `SqlAlterTable`
- `SqlDropTable`
- `SqlCreateIndex`
- `SqlDropIndex`
They are produced by the unified grammar's `ast_builder`s in advanced
mode. Unlike the DML `Sql*` commands they **execute structurally**:
the handler reads the parsed structure and performs the schema change
through the playground's metadata-maintaining machinery — writing
`__rdbms_playground_columns` / `__rdbms_playground_relationships`,
applying `STRICT`, using the ten-type vocabulary — so an
advanced-mode-created object is a first-class playground object,
identical to a simple-mode-created one (ADR-0030 §5).
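
The split between the two execution styles can be sketched as follows. This is a minimal illustration, not the project's actual types: `SqlCommandKind` and `Execution` are hypothetical names, and the real `Command` variants carry parsed payloads that are omitted here.

```rust
/// How a command's handler runs (hypothetical enum for illustration).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Execution {
    /// Validated SQL is handed to the engine as written (the DML commands).
    Verbatim,
    /// The handler reads the parsed structure and drives the
    /// metadata-maintaining helpers (the DDL commands).
    Structural,
}

/// Payload-free view of the advanced-mode command variants named in the ADRs.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SqlCommandKind {
    SqlInsert,
    SqlUpdate,
    SqlDelete,
    SqlCreateTable,
    SqlAlterTable,
    SqlDropTable,
    SqlCreateIndex,
    SqlDropIndex,
}

impl SqlCommandKind {
    /// DML runs verbatim; every DDL variant runs structurally.
    pub fn execution(self) -> Execution {
        match self {
            Self::SqlInsert | Self::SqlUpdate | Self::SqlDelete => Execution::Verbatim,
            Self::SqlCreateTable
            | Self::SqlAlterTable
            | Self::SqlDropTable
            | Self::SqlCreateIndex
            | Self::SqlDropIndex => Execution::Structural,
        }
    }
}
```

An exhaustive `match` like this means that adding a sixth DDL variant will not compile until its execution style has been chosen.
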
**Simple mode is untouched.** The existing typed commands
(`CreateTable`, `AddColumn`, `AddRelationship`, …) and their grammar
are unchanged; advanced SQL DDL is purely additive.
**Execution sharing (per the user's steer).** The SQL DDL handlers
**reuse the low-level schema/metadata helpers** — the table builder,
the metadata writers, the rebuild-table primitive (ADR-0013) — where
the underlying operation is genuinely the same, so the two surfaces
cannot drift. Where the SQL path is genuinely different (e.g. a
`CREATE TABLE` that declares several inline foreign keys, which has no
simple-mode shape), it is implemented directly **for clarity rather
than bending the simple-mode command shapes to absorb it**. Shared
where it works; separate where it doesn't.
### 2. Dispatch — shared entry words, advanced-only `alter`
`create` and `drop` are already simple-mode entry words. They reuse
the **category-grouped, mode-aware dispatch** from ADR-0033
Amendment 1: each appears in both the `Simple` and `Advanced` groups
of the `REGISTRY`; in advanced mode the SQL node is tried first and
falls back to the simple node when the SQL shape doesn't match. So in
advanced mode `CREATE TABLE T (id serial)` parses as SQL while
`create table T with pk id(serial)` still parses as the simple form —
exactly as `insert` behaves today. `alter` is a **new advanced-only
entry word** (`CommandCategory::Advanced`); simple mode keeps its
`add column` / `drop column` / `rename column` / `change column`
verbs and gains no `alter`.
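
A rough sketch of the "SQL first, simple fallback" rule for a shared entry word. The two recognisers are deliberately crude stand-ins for the real grammar nodes (they only inspect the fourth word of the line), and `Mode`, `Parsed`, and `dispatch` are hypothetical names, not the `REGISTRY` API.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Mode {
    Simple,
    Advanced,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Parsed {
    SqlCreateTable,
    SimpleCreateTable,
}

/// True when the line is `create table <name> <word> ...` and `pred` accepts <word>.
fn create_table_then(line: &str, pred: impl Fn(&str) -> bool) -> bool {
    let words: Vec<&str> = line.split_whitespace().collect();
    match words.as_slice() {
        [verb, noun, _name, rest, ..] => {
            verb.eq_ignore_ascii_case("create")
                && noun.eq_ignore_ascii_case("table")
                && pred(*rest)
        }
        _ => false,
    }
}

/// Stand-in for the SQL node: the element list opens with a parenthesis.
fn parse_sql(line: &str) -> Option<Parsed> {
    if create_table_then(line, |w| w.starts_with('(')) {
        Some(Parsed::SqlCreateTable)
    } else {
        None
    }
}

/// Stand-in for the simple-mode node: `create table T with ...`.
fn parse_simple(line: &str) -> Option<Parsed> {
    if create_table_then(line, |w| w.eq_ignore_ascii_case("with")) {
        Some(Parsed::SimpleCreateTable)
    } else {
        None
    }
}

/// Advanced mode tries the SQL node first and falls back to the simple node.
pub fn dispatch(mode: Mode, line: &str) -> Option<Parsed> {
    match mode {
        Mode::Simple => parse_simple(line),
        Mode::Advanced => parse_sql(line).or_else(|| parse_simple(line)),
    }
}
```

With these stand-ins, `dispatch(Mode::Advanced, "CREATE TABLE T (id serial)")` yields the SQL form, while `create table T with pk id(serial)` still yields the simple form in either mode.
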
### 3. Type vocabulary (restates ADR-0030 §5)
The type-name slot accepts the playground keywords directly (`text`,
`int`, `real`, `decimal`, `bool`, `date`, `datetime`, `blob`,
`serial`, `shortid`) **and** standard-SQL aliases mapped onto them:
`integer`/`smallint`/`bigint` → `int`; `varchar`/`char` → `text`;
`boolean` → `bool`; `timestamp` → `datetime`; `numeric` → `decimal`;
`float`/`double precision` → `real`; `binary`/`varbinary` → `blob`. A
length/precision argument (`varchar(255)`, `numeric(10,2)`) is
**accepted and ignored** — the playground's types are
unparameterised. Engine storage-type names are neither accepted as
input nor shown (§9).
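
The alias map above is small enough to sketch as a single lookup. The function name `playground_type` is hypothetical, and case-insensitive matching is an assumption (the ADR does not state the casing rule); the parenthesised length/precision argument is dropped before the lookup, matching "accepted and ignored".

```rust
/// Maps a type name as typed in SQL onto one of the ten playground types.
/// Returns `None` for anything outside the accepted vocabulary.
pub fn playground_type(sql_name: &str) -> Option<&'static str> {
    // Drop any "(255)" / "(10,2)" argument: the playground types are unparameterised.
    let base = sql_name.split('(').next().unwrap_or("");
    // Assumption: names are matched case-insensitively, with runs of
    // whitespace collapsed so that "double   precision" is recognised.
    let normalized = base
        .to_ascii_lowercase()
        .split_whitespace()
        .collect::<Vec<_>>()
        .join(" ");
    let ty = match normalized.as_str() {
        "text" | "varchar" | "char" => "text",
        "int" | "integer" | "smallint" | "bigint" => "int",
        "real" | "float" | "double precision" => "real",
        "decimal" | "numeric" => "decimal",
        "bool" | "boolean" => "bool",
        "date" => "date",
        "datetime" | "timestamp" => "datetime",
        "blob" | "binary" | "varbinary" => "blob",
        "serial" => "serial",
        "shortid" => "shortid",
        _ => return None,
    };
    Some(ty)
}
```

For example, `playground_type("varchar(255)")` and `playground_type("text")` both resolve to `text`, while a name outside the map yields `None` and would surface as an unknown-type diagnostic.
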
### 4. The DDL surface (full; `Q4`, no pre-emptive cuts)
**`CREATE TABLE <name> ( <element>, … )`**
- **Column elements**: `<name> <type> [constraints…]`, where the
column constraints are the ADR-0029 set spelled in SQL: `NOT NULL`,
`UNIQUE`, `PRIMARY KEY`, `DEFAULT <expr>`, `CHECK (<expr>)`, and an
inline `REFERENCES <T>(<col>) [ON DELETE …] [ON UPDATE …]` (§5).
- **Table elements**: `PRIMARY KEY (<col>, …)` (single **and
compound**), `UNIQUE (<col>, …)`, `CHECK (<expr>)`,
`[CONSTRAINT <name>] FOREIGN KEY (<col>) REFERENCES <T>(<col>)
[ON DELETE …] [ON UPDATE …]` (§5).
- `CHECK` and `DEFAULT` expressions reuse the ADR-0031 `sql_expr`
grammar (the same fragment `WHERE`/`HAVING`/projections use).
**`DROP TABLE <name>`** → `SqlDropTable`. Cascade of inbound
relationships follows the existing `drop table` semantics.
**`ALTER TABLE <name> <action>`** → `SqlAlterTable`, where `<action>`
is one of the actions below, each mapped to a low-level operation
(existing, except the new table rename):
| SQL action | Underlying operation |
|---|---|
| `ADD COLUMN <name> <type> [constraints]` | add-column (ADR-0013 rebuild where needed) |
| `DROP COLUMN <name>` | drop-column |
| `RENAME COLUMN <old> TO <new>` | rename-column |
| `ALTER COLUMN <name> TYPE <type>` | change-column-type (§7 conversion) |
| `ADD [CONSTRAINT <name>] <table-constraint>` | add-constraint / add-relationship (FK) |
| `DROP CONSTRAINT <name>` | drop-constraint |
| `RENAME TO <new>` | **table rename (§6, new low-level op)** |
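
The action-to-operation table could be carried in code roughly like this. The enum names and payload fields are hypothetical; only the seven actions and their targets come from the table.

```rust
/// The `<action>` part of `ALTER TABLE <name> <action>` (payloads are illustrative).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum AlterAction {
    AddColumn { name: String, type_name: String },
    DropColumn { name: String },
    RenameColumn { old: String, new: String },
    AlterColumnType { name: String, type_name: String },
    AddConstraint { name: Option<String>, is_foreign_key: bool },
    DropConstraint { name: String },
    RenameTo { new: String },
}

/// The low-level operation each action is routed to.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LowLevelOp {
    AddColumn,
    DropColumn,
    RenameColumn,
    ChangeColumnType,
    AddConstraint,
    AddRelationship,
    DropConstraint,
    RenameTable,
}

pub fn underlying_op(action: &AlterAction) -> LowLevelOp {
    match action {
        AlterAction::AddColumn { .. } => LowLevelOp::AddColumn,
        AlterAction::DropColumn { .. } => LowLevelOp::DropColumn,
        AlterAction::RenameColumn { .. } => LowLevelOp::RenameColumn,
        AlterAction::AlterColumnType { .. } => LowLevelOp::ChangeColumnType,
        // A foreign-key constraint is a named relationship, not a plain constraint.
        AlterAction::AddConstraint { is_foreign_key: true, .. } => LowLevelOp::AddRelationship,
        AlterAction::AddConstraint { is_foreign_key: false, .. } => LowLevelOp::AddConstraint,
        AlterAction::DropConstraint { .. } => LowLevelOp::DropConstraint,
        AlterAction::RenameTo { .. } => LowLevelOp::RenameTable,
    }
}
```
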
**`CREATE [UNIQUE] INDEX [<name>] ON <table> (<col>, …)`** →
`SqlCreateIndex`, mapped to the ADR-0025 index machinery; `UNIQUE`
sets the index's uniqueness (a small extension to ADR-0025's index
model if it does not already carry the flag, called out in §13).
**`DROP INDEX <name>`** → `SqlDropIndex`.
### 5. Foreign keys → named relationships
A `REFERENCES` / `FOREIGN KEY` clause is the SQL spelling of an
ADR-0013 relationship. Because `SqlCreateTable` is its own command
carrying the whole parsed structure, a `CREATE TABLE` that declares
FK columns **creates the table and its relationship metadata
together** — one statement, one command, one transaction, **one undo
step** (§10). No decomposition into separate commands is needed.
- `ON DELETE` / `ON UPDATE` → the ADR-0013 referential actions.
- A `CONSTRAINT <name> FOREIGN KEY …` names the relationship; an
unnamed FK is auto-named by the existing ADR-0013 convention.
- `ALTER TABLE child ADD [CONSTRAINT <name>] FOREIGN KEY (<col>)
REFERENCES <P>(<col>) …` adds a relationship to an existing table
(the clean 1:1 with add-relationship).
- FK column type compatibility follows `Type::fk_target_type`
(ADR-0011) unchanged.
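
To illustrate how one `CREATE TABLE` can yield all of its relationship metadata in a single step, here is a sketch that turns parsed FK clauses into relationship rows. The struct shapes are hypothetical, and the auto-naming rule is passed in as a closure because this ADR does not restate the ADR-0013 convention.

```rust
/// One `REFERENCES` / `FOREIGN KEY` clause as parsed (illustrative shape).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ForeignKeyClause {
    /// `CONSTRAINT <name>`, when the user supplied one.
    pub constraint_name: Option<String>,
    pub column: String,
    pub parent_table: String,
    pub parent_column: String,
}

/// One named relationship to be written to the metadata (illustrative shape).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Relationship {
    pub name: String,
    pub child_table: String,
    pub child_column: String,
    pub parent_table: String,
    pub parent_column: String,
}

/// Builds every relationship declared by one statement. `auto_name` stands in
/// for the existing naming convention, which is not reproduced here.
pub fn relationships_for<F>(
    child_table: &str,
    fks: &[ForeignKeyClause],
    auto_name: F,
) -> Vec<Relationship>
where
    F: Fn(&str, &ForeignKeyClause) -> String,
{
    fks.iter()
        .map(|fk| Relationship {
            name: fk
                .constraint_name
                .clone()
                .unwrap_or_else(|| auto_name(child_table, fk)),
            child_table: child_table.to_string(),
            child_column: fk.column.clone(),
            parent_table: fk.parent_table.clone(),
            parent_column: fk.parent_column.clone(),
        })
        .collect()
}
```

Because the whole list is derived from one parsed statement, the handler can create the table and write these rows inside the same transaction, which is what makes the statement a single undo step.
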
### 6. Table rename — advanced mode only (`C1`)
`ALTER TABLE <old> RENAME TO <new>` is **advanced-mode only**; there
is no simple-mode rename-table verb. It needs a genuinely new
low-level operation (none exists today): within one transaction,
rename the table in the database, rename its `data/<table>.csv` file,
and update every metadata row that names it — the column-metadata
rows, and **both ends of any relationship** in
`__rdbms_playground_relationships` that references the old name. Name
validation and `__rdbms_*` rejection apply to the target. This closes
the rename half of `C1` for the advanced surface.
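
The bookkeeping that the rename must keep in step can be sketched on an in-memory model. Everything here is a stand-in: `Project` and its fields are hypothetical, the existence checks are an assumption about what "name validation" includes, and working on a clone is only a simple way to model the all-or-nothing transaction.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct RelationshipRow {
    pub name: String,
    pub child_table: String,
    pub parent_table: String,
}

/// In-memory stand-in for the database, the CSV layer, and the metadata tables.
#[derive(Debug, Clone, PartialEq, Eq, Default)]
pub struct Project {
    pub tables: Vec<String>,
    pub csv_files: Vec<String>,
    /// (table, column) pairs, standing in for the column-metadata rows.
    pub column_rows: Vec<(String, String)>,
    pub relationships: Vec<RelationshipRow>,
}

fn csv_path(table: &str) -> String {
    format!("data/{}.csv", table)
}

fn swap(name: &mut String, old: &str, new: &str) {
    if name.as_str() == old {
        *name = new.to_string();
    }
}

/// Returns the renamed project, or an error leaving the input untouched.
pub fn rename_table(project: &Project, old: &str, new: &str) -> Result<Project, String> {
    if new.starts_with("__rdbms_") {
        return Err(format!("'{}' is a reserved name", new));
    }
    if !project.tables.iter().any(|t| t == old) {
        return Err(format!("no table named '{}'", old));
    }
    if project.tables.iter().any(|t| t == new) {
        return Err(format!("a table named '{}' already exists", new));
    }
    let mut next = project.clone();
    for t in &mut next.tables {
        swap(t, old, new);
    }
    for f in &mut next.csv_files {
        swap(f, &csv_path(old), &csv_path(new));
    }
    for (table, _column) in &mut next.column_rows {
        swap(table, old, new);
    }
    // Both ends: the table may be the child, the parent, or (self-reference) both.
    for r in &mut next.relationships {
        swap(&mut r.child_table, old, new);
        swap(&mut r.parent_table, old, new);
    }
    Ok(next)
}
```
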
### 7. Column type conversion — one engine, mode-appropriate policy
The per-cell classification of ADR-0017 (clean / lossy / incompatible,
plus static refusals for playground-type-specific targets such as
`→ serial` and `↔ blob`) is a property of the **type set**, shared by
both modes. The policy on the *lossy* tier differs by mode:
| Tier | Simple mode | Advanced mode (`ALTER COLUMN … TYPE`) |
|---|---|---|
| **clean** | auto-convert | auto-convert |
| **incompatible** | refuse (friendly) | refuse (friendly) — real SQL errors too |
| **static-refused** (`→serial`, `↔blob`, …) | refuse | refuse — our own types have no SQL meaning to mirror |
| **lossy** (`3.14`→`3`) | **refuse by default**; `--force-conversion` opts in | **perform it** (what SQL does), with a post-op "N values converted with loss" note; **no force flag** |
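
The policy table reduces to one function of tier and mode. This is a sketch with hypothetical names (`Tier`, `ConversionMode`, `Outcome`); how simple mode reports a forced lossy conversion is not stated in this ADR, so it is modelled here as a plain conversion.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Tier {
    Clean,
    Lossy,
    Incompatible,
    StaticRefused,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ConversionMode {
    /// Simple mode, with or without `--force-conversion`.
    Simple { force_conversion: bool },
    /// Advanced mode: `ALTER COLUMN ... TYPE`, which has no force flag.
    Advanced,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Outcome {
    Convert,
    /// Convert, then report "N values converted with loss".
    ConvertWithLossNote,
    Refuse,
}

pub fn policy(tier: Tier, mode: ConversionMode) -> Outcome {
    match (tier, mode) {
        (Tier::Clean, _) => Outcome::Convert,
        (Tier::Incompatible, _) | (Tier::StaticRefused, _) => Outcome::Refuse,
        (Tier::Lossy, ConversionMode::Simple { force_conversion: false }) => Outcome::Refuse,
        // Assumption: a forced simple-mode conversion is modelled without a note.
        (Tier::Lossy, ConversionMode::Simple { force_conversion: true }) => Outcome::Convert,
        (Tier::Lossy, ConversionMode::Advanced) => Outcome::ConvertWithLossNote,
    }
}
```

Only the lossy row depends on the mode; the other three tiers return the same outcome whichever surface issued the change.
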
Rationale: **simple mode protects up front; advanced mode trusts the
user like SQL does and lets `undo` catch regrets.** A lossy advanced
conversion is snapshotted (§10), so it is one `undo` away — there is
no silent *irreversible* loss, and no need to drop to simple mode to
"force". Conversions that exist only in the playground's vocabulary
stay protected in both modes. The simple-mode `--force-conversion` /
`--dont-convert` flags are unchanged and have **no SQL spelling**
(advanced mode always performs the conversion); the Postgres `USING
<expr>` clause is **not** adopted (§12).
### 8. Constraints
Column- and table-level constraints map to the ADR-0029 model:
`NOT NULL`, `UNIQUE`, `PRIMARY KEY` (incl. compound, table-level),
`DEFAULT <expr>`, `CHECK (<expr>)`. A populated-column constraint
addition reuses ADR-0029's pre-flight dry-run guard. `CHECK` /
`DEFAULT` expressions are stored as the SQL the user could re-enter in
advanced mode (ADR-0030 §11) — one syntax, not a third.
### 9. Engine neutrality (ADR-0030 §7)
No engine type names in or out (§3). `STRICT` is applied internally by
the create path; it is not in the authored grammar, so typing it is an
ordinary parse error, not a surfaced engine feature. Parse errors,
out-of-subset refusals, and execution failures route through the
friendly-error layer (ADR-0019) with engine-neutral wording.
### 10. Persistence, metadata, history, replay, undo
- Structural execution keeps `project.yaml`, the metadata tables, and
the CSV layer correct with the same guarantees as the simple-mode
path (ADR-0015 §6 ordering preserved).
- `history.log` records the **literal submitted SQL line**; replay
re-runs it through the one walker with the advanced view active.
`create` / `drop` / `alter` are **schema-write entry words, not in
ADR-0034 Amendment 1's app-lifecycle skip set**, so SQL DDL
**replays as a write** (re-applied) with **no replay-filter change**
— unlike `undo` / `redo`, which had to be added to that skip set.
- **Undo (ADR-0006):** each SQL DDL statement is a user mutation
carrying a `source`, so it is snapshotted by the worker hook and is
**one undo step** — including a `CREATE TABLE` with foreign keys,
precisely because it is a single command (§5) rather than a
decomposed sequence.
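
The replay rule for DDL can be illustrated with a filter keyed on the entry word. This is a sketch only: `replay_action` is a hypothetical function, and the skip set is passed in by the caller because its full contents are defined in ADR-0034, not here (the text above confirms only that `undo` and `redo` are in it).

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ReplayAction {
    /// Re-run the logged line (a write).
    Reapply,
    /// Leave the line out of the replay (app-lifecycle words, blank lines).
    Skip,
}

/// Decides from the first word of a `history.log` line whether replay re-runs it.
pub fn replay_action(line: &str, skip_set: &[&str]) -> ReplayAction {
    match line.split_whitespace().next() {
        None => ReplayAction::Skip,
        Some(entry) if skip_set.iter().any(|w| w.eq_ignore_ascii_case(entry)) => {
            ReplayAction::Skip
        }
        Some(_) => ReplayAction::Reapply,
    }
}
```

Since `create`, `drop`, and `alter` are not in the skip set, a logged `ALTER TABLE Orders RENAME TO Purchases` is re-applied without any change to the filter.
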
### 11. Ambient assistance comes for free (ADR-0030 §8)
Because the DDL is grammar in the unified tree, the walker
**mechanisms** apply with no DDL-specific assistance code: syntax
highlighting, the `[ERR]`/`[WRN]` validity indicator (ADR-0027), the
per-command parse-error usage skeleton (ADR-0021), and the completion
engine.
What each grammar node still **authors** (this is writing the grammar,
not bolting assistance on afterwards): the correct `IdentSource` on
every schema-name slot — so `ALTER TABLE`/`DROP TABLE`/`DROP INDEX`
and `REFERENCES T(col)` / `CREATE INDEX ON T (cols)` complete from the
`SchemaCache`; the per-node hint + usage catalog keys (as the
app-command nodes carry `help_id` / `usage_ids`); and the
DDL-specific walker diagnostics with their catalog keys — the DDL
peers of the DML diagnostics ADR-0033 added (e.g. unknown type,
column-already-exists, FK column-type mismatch, the §7 lossy-conversion
note). The integration is structural, not free of authoring.
### 12. Out of scope
- Per ADR-0030 §3: views, triggers, transaction control, `PRAGMA`,
`ATTACH`/`DETACH`, `VACUUM`, virtual tables, multi-statement
batches. One statement per submission; a trailing `;` is tolerated.
- The Postgres `USING <expr>` conversion clause (§7) — heavy
(per-row expression evaluation), dialect-specific, and unable to
express playground-type targets.
- The simple-mode `--dont-convert` semantics have no SQL form
(advanced `ALTER COLUMN TYPE` always converts).
- The **DSL → SQL teaching echo** (ADR-0030 §10) is Phase 5, a
separate ADR — not this one.
- Engine-specific DDL spellings (`AUTOINCREMENT`, `WITHOUT ROWID`,
collations, `IF [NOT] EXISTS` if judged out-of-subset) — the
grammar admits the standard surface; extras are ordinary parse
errors.
### 13. Phased implementation plan
Sub-phases, each opening with the smallest end-to-end slice and each
with an explicit exit gate + a written Devil's-Advocate gate, mirroring
ADR-0033's structure:
- **4a — Dispatch + `CREATE TABLE` core.** Advanced `create`
dispatch; `SqlCreateTable` for columns + types (the §3 map) +
column constraints + single/compound `PRIMARY KEY`. No FK yet.
- **4b — Foreign keys in `CREATE TABLE`.** Inline `REFERENCES` +
table-level `FOREIGN KEY` → relationship metadata, one undo step.
- **4c — `DROP TABLE`** → `SqlDropTable` (cascade parity).
- **4d — `CREATE [UNIQUE] INDEX` / `DROP INDEX`** → `SqlCreateIndex`
/ `SqlDropIndex` (ADR-0025; the `UNIQUE` flag extension if needed).
- **4e — `ALTER TABLE` add/drop/rename column.**
- **4f — `ALTER TABLE … ALTER COLUMN TYPE`** (the §7 conversion
model + the lossy-with-note path).
- **4g — `ALTER TABLE` add/drop constraint, add foreign key.**
- **4h — `ALTER TABLE … RENAME TO`** (the §6 new low-level op).
- **4i — Verification sweep.** Typing-surface + matrix coverage,
engine-neutral error pass, undo-parity check (one step per
statement), `help`/usage for the new forms.
## Consequences
- Advanced mode reaches DDL parity with simple mode and adds
table-rename, so a learner can build and evolve a whole schema in
standard SQL with the playground's types, metadata, and safety
intact.
- The command set grows by five `Sql*` DDL variants; the worker gains
their handlers, which lean on shared low-level helpers where the
operation matches the simple-mode path and stand alone where the
SQL surface is genuinely richer (multi-FK `CREATE TABLE`).
- One genuinely new capability — table rename — adds a low-level op
that the simple mode does not have; it must keep the CSV file name
and the relationship metadata in step with the table name.
- ADR-0030 §4 is clarified (own `Sql*` DDL commands, structurally
executed); no behaviour of the shipped DML/`SELECT` phases changes.
- The conversion model unifies simple and advanced without a force
flag in SQL, relying on `undo` (ADR-0006) as the advanced-mode
safety net — a concrete payoff of having shipped undo first.
## See also
- **ADR-0030** — the advanced-mode architecture; this is its Phase 4
and clarifies §4 (DDL representation) and restates §5 (types) / §7
(neutrality) / §8 (assistance) / §11 (persistence).
- **ADR-0033** — the DML phase; source of the category-grouped
mode-aware dispatch (Amendment 1) reused for shared entry words.
- **ADR-0031** — `sql_expr`, reused for `CHECK` / `DEFAULT`.
- **ADR-0013** — relationships + the rebuild-table primitive that the
`ALTER`/FK handlers build on.
- **ADR-0017** — the column type-change classification §7 shares.
- **ADR-0029** — column constraints; **ADR-0025** — indexes;
**ADR-0011** — FK column-type compatibility; **ADR-0005** — the
ten-type vocabulary.
- **ADR-0006** — undo; each DDL statement is one undo step (§10).