rdbms-playground/docs/adr/0035-advanced-mode-sql-ddl.md
claude@clouddev1 19d3cd3306 docs: ADR-0035 — record two /runda refinements (IF [NOT] EXISTS, INTEGER PRIMARY KEY)
Pre-implementation /runda round settled two open micro-calls before 4a,
both user-confirmed:

- IF [NOT] EXISTS admitted (no-op-that-succeeds-with-a-note), not
  refused — a near-universal cross-vendor idiom (PostgreSQL, MySQL,
  SQLite, Oracle 23ai), reclassified into scope rather than treated as
  an engine-specific spelling. Touches §3/§4/§12/§13 (4a, 4c).
- INTEGER PRIMARY KEY maps to a plain int PK, not auto-increment;
  serial stays the sole auto-increment type (§3).

README index updated in the same edit per the lockstep rule.
2026-05-24 22:31:44 +00:00


ADR-0035: Advanced-mode SQL DDL

Status

Proposed. Design agreed with the user (2026-05-24); implementation is phased and pending (§13). This is Phase 4 of the ADR-0030 roadmap (the advanced-mode SQL surface), the peer of ADR-0031 (expression grammar), ADR-0032 (SELECT), and ADR-0033 (DML). It clarifies ADR-0030 §4 on how DDL is represented and executed.

Refinements (2026-05-24, pre-implementation /runda round, user-confirmed). Two open micro-calls were settled before 4a: (1) IF [NOT] EXISTS is admitted as a no-op-that-succeeds-with-a-note rather than refused — it is a near-universal cross-vendor idiom (PostgreSQL, MySQL/MariaDB, SQLite, Oracle 23ai), not an engine-specific spelling, so it belongs in the standard surface (§3/§4/§12/§13); (2) INTEGER PRIMARY KEY maps to a plain int primary key, not auto-increment — serial remains the sole auto-increment type (§3).

Context

ADR-0030 fixed the architecture of advanced mode: SQL authored as grammar in the unified tree (not a separate batch parser), with the playground's own type vocabulary and metadata model. It also noted that each large grammar piece gets its own focused ADR. Phases 1–3 shipped: the SQL expression grammar (ADR-0031), full SELECT (ADR-0032), and DML, that is INSERT/UPDATE/DELETE (ADR-0033). Phase 4 is DDL: CREATE / DROP / ALTER TABLE and CREATE / DROP INDEX.

Two things from the earlier phases shape this one:

  1. The advanced surface gets its own commands. ADR-0033 established that a SQL statement produces a distinct command (SqlInsert / SqlUpdate / SqlDelete), separate from the simple-mode typed command for the same verb. Those DML commands execute as validated SQL run verbatim — possible only because DML changes no schema and touches no metadata.
  2. DDL cannot run verbatim. If CREATE TABLE Orders (id INTEGER) executed as-is, the engine would make the table, but the playground would lose what the user meant: that id is serial, that a REFERENCES clause is a named relationship, that STRICT applies, that the ten-type vocabulary governs. Recovering that needs the parsed statement either way.

ADR-0030 §4 said "DDL → a Command … run the typed executor." That remains right in spirit — DDL is structurally executed, not raw — but it predates the DML build and read as "reuse the simple-mode CreateTable variant." This ADR clarifies it: DDL gets its own advanced commands too, executed structurally (not verbatim). The "verbatim" execution of the DML commands is an implementation convenience available only because nothing about DML required otherwise — not an architectural rule.

Requirements touched: realizes Q4 for DDL; closes the advanced-mode side of table/column/index/constraint/relationship operations; lands the table-rename half of C1 (advanced mode only).

Decision

1. Own per-statement SQL DDL commands (clarifies ADR-0030 §4)

New Command variants, one per statement kind — granularity mirrors the DML phase:

  • SqlCreateTable
  • SqlAlterTable
  • SqlDropTable
  • SqlCreateIndex
  • SqlDropIndex

They are produced by the unified grammar's ast_builders in advanced mode. Unlike the DML Sql* commands they execute structurally: the handler reads the parsed structure and performs the schema change through the playground's metadata-maintaining machinery — writing __rdbms_playground_columns / __rdbms_playground_relationships, applying STRICT, using the ten-type vocabulary — so an advanced-mode-created object is a first-class playground object, identical to a simple-mode-created one (ADR-0030 §5).
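As a rough illustration of the split described above, the sketch below models the five DDL variants next to one DML variant and tags each with how it is executed. The payload fields, the `Execution` enum, and the `execution` method are assumptions made for the example; only the variant names come from this ADR and ADR-0033.

```rust
/// Hypothetical sketch: payload fields are illustrative, not the project's real types.
#[derive(Debug, Clone, PartialEq)]
pub enum Command {
    // DML (ADR-0033): carries validated SQL text and runs it verbatim.
    SqlInsert { sql: String },
    // DDL (this ADR): carries parsed structure and is executed structurally.
    SqlCreateTable { table: String, if_not_exists: bool },
    SqlAlterTable { table: String },
    SqlDropTable { table: String, if_exists: bool },
    SqlCreateIndex { table: String, columns: Vec<String>, unique: bool },
    SqlDropIndex { index: String },
}

/// Assumed helper enum naming the two execution styles discussed in the text.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Execution {
    Verbatim,
    Structural,
}

impl Command {
    /// DML runs as-is; every DDL variant goes through the metadata-maintaining path.
    pub fn execution(&self) -> Execution {
        match self {
            Command::SqlInsert { .. } => Execution::Verbatim,
            Command::SqlCreateTable { .. }
            | Command::SqlAlterTable { .. }
            | Command::SqlDropTable { .. }
            | Command::SqlCreateIndex { .. }
            | Command::SqlDropIndex { .. } => Execution::Structural,
        }
    }
}
```

The exhaustive `match` means a future DDL variant cannot be added without deciding its execution style.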

Simple mode is untouched. The existing typed commands (CreateTable, AddColumn, AddRelationship, …) and their grammar are unchanged; advanced SQL DDL is purely additive.

Execution sharing (per the user's steer). The SQL DDL handlers reuse the low-level schema/metadata helpers — the table builder, the metadata writers, the rebuild-table primitive (ADR-0013) — where the underlying operation is genuinely the same, so the two surfaces cannot drift. Where the SQL path is genuinely different (e.g. a CREATE TABLE that declares several inline foreign keys, which has no simple-mode shape), it is implemented directly for clarity rather than bending the simple-mode command shapes to absorb it. Shared where it works; separate where it doesn't.

2. Dispatch — shared entry words, advanced-only alter

create and drop are already simple-mode entry words. They reuse the category-grouped, mode-aware dispatch from ADR-0033 Amendment 1: each appears in both the Simple and Advanced groups of the REGISTRY; in advanced mode the SQL node is tried first and falls back to the simple node when the SQL shape doesn't match. So in advanced mode CREATE TABLE T (id serial) parses as SQL while create table T with pk id(serial) still parses as the simple form — exactly as insert behaves today. alter is a new advanced-only entry word (CommandCategory::Advanced); simple mode keeps its add column / drop column / rename column / change column verbs and gains no alter.
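The try-SQL-first, fall-back-to-simple behaviour can be sketched as below. This is a toy model, not the REGISTRY dispatch itself: `Mode`, `Route`, `route`, and the shape test in `looks_like_sql_create` are all hypothetical, and `drop` is omitted because it would follow the same pattern as `create`.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Mode {
    Simple,
    Advanced,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Route {
    Sql,
    Simple,
    Unknown,
}

/// Toy shape test (an assumption): a SQL CREATE TABLE has its element list,
/// starting with '(', as the fourth whitespace-separated word.
fn looks_like_sql_create(line: &str) -> bool {
    line.split_whitespace()
        .nth(3)
        .map_or(false, |word| word.starts_with('('))
}

/// Route a line by mode and entry word.
pub fn route(mode: Mode, line: &str) -> Route {
    let first = match line.split_whitespace().next() {
        Some(word) => word.to_ascii_lowercase(),
        None => return Route::Unknown,
    };
    match (mode, first.as_str()) {
        // `alter` is an advanced-only entry word.
        (Mode::Advanced, "alter") => Route::Sql,
        // Shared entry word: the SQL node is tried first, then the simple node.
        (Mode::Advanced, "create") => {
            if looks_like_sql_create(line) {
                Route::Sql
            } else {
                Route::Simple
            }
        }
        // Simple mode only ever sees the simple node.
        (Mode::Simple, "create") => Route::Simple,
        _ => Route::Unknown,
    }
}
```

With these toy rules, `route(Mode::Advanced, "CREATE TABLE T (id serial)")` yields `Route::Sql`, while `create table T with pk id(serial)` in the same mode yields `Route::Simple`.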

3. Type vocabulary (restates ADR-0030 §5)

The type-name slot accepts the playground keywords directly (text, int, real, decimal, bool, date, datetime, blob, serial, shortid) and standard-SQL aliases mapped onto them: integer/smallint/bigint → int; varchar/char → text; boolean → bool; timestamp → datetime; numeric → decimal; float/double precision → real; binary/varbinary → blob. A length/precision argument (varchar(255), numeric(10,2)) is accepted and ignored, because the playground's types are unparameterised. Engine storage-type names are neither accepted as input nor shown (§9).

The map is purely lexical: INTEGER PRIMARY KEY becomes a plain int primary key — it is not treated as auto-increment, unlike the engine's rowid-alias idiom. Auto-increment is reached only through the explicit serial type (id serial primary key). This keeps the engine's storage behaviour from leaking into the standard surface and matches ADR-0005's single-auto-increment-type model.
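A minimal sketch of the lexical map follows, assuming a `Type` enum with one variant per playground type and a hypothetical `resolve_type` function. Case-insensitive matching and whitespace normalisation are assumptions of the sketch, not rules stated in this ADR.

```rust
/// Assumed shape of the ten-type vocabulary.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Type {
    Text,
    Int,
    Real,
    Decimal,
    Bool,
    Date,
    Datetime,
    Blob,
    Serial,
    Shortid,
}

/// Map a type spelling to a playground type; `None` means "unknown type".
pub fn resolve_type(spelling: &str) -> Option<Type> {
    // Accept and ignore a length/precision argument such as "(255)" or "(10,2)".
    let base = spelling.split('(').next().unwrap_or("");
    // Collapse whitespace and lowercase so "DOUBLE  PRECISION" matches.
    let key = base
        .split_whitespace()
        .collect::<Vec<_>>()
        .join(" ")
        .to_ascii_lowercase();
    let ty = match key.as_str() {
        "text" | "varchar" | "char" => Type::Text,
        // INTEGER is a plain int; it never becomes auto-increment.
        "int" | "integer" | "smallint" | "bigint" => Type::Int,
        "real" | "float" | "double precision" => Type::Real,
        "decimal" | "numeric" => Type::Decimal,
        "bool" | "boolean" => Type::Bool,
        "date" => Type::Date,
        "datetime" | "timestamp" => Type::Datetime,
        "blob" | "binary" | "varbinary" => Type::Blob,
        // Auto-increment is reachable only through this explicit keyword.
        "serial" => Type::Serial,
        "shortid" => Type::Shortid,
        _ => return None,
    };
    Some(ty)
}
```

Because the function looks only at the type spelling, a following PRIMARY KEY constraint cannot change the result, which is the point of keeping the map purely lexical.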

4. The DDL surface (full; Q4, no pre-emptive cuts)

CREATE TABLE <name> ( <element>, … )

  • Column elements: <name> <type> [constraints…], where the column constraints are the ADR-0029 set spelled in SQL: NOT NULL, UNIQUE, PRIMARY KEY, DEFAULT <expr>, CHECK (<expr>), and an inline REFERENCES <T>(<col>) [ON DELETE …] [ON UPDATE …] (§5).
  • Table elements: PRIMARY KEY (<col>, …) (single and compound), UNIQUE (<col>, …), CHECK (<expr>), [CONSTRAINT <name>] FOREIGN KEY (<col>) REFERENCES <T>(<col>) [ON DELETE …] [ON UPDATE …] (§5).
  • CHECK and DEFAULT expressions reuse the ADR-0031 sql_expr grammar (the same fragment WHERE/HAVING/projections use).
  • CREATE TABLE IF NOT EXISTS <name> … is admitted: when the table already exists the statement is a no-op that succeeds with a note ("table already exists — skipped") instead of the plain-form "table already exists" error. IF NOT EXISTS is a near-universal cross-vendor idiom, not an engine-specific spelling, so it is part of the standard surface (refines §12).

DROP TABLE [IF EXISTS] <name> → SqlDropTable. Cascade of inbound relationships follows the existing drop table semantics. IF EXISTS is admitted (universal across the major engines): dropping an absent table is then a no-op that succeeds with a note instead of the plain-form "no such table" error.
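The IF [NOT] EXISTS behaviour for both statements can be modelled as below. The sketch stands a `BTreeSet` of table names in for the real schema, and `Outcome`, `DdlError`, `create_table`, and `drop_table` are hypothetical names; the wording of the note and of the errors is left to the friendly-error layer.

```rust
use std::collections::BTreeSet;

#[derive(Debug, PartialEq, Eq)]
pub enum Outcome {
    /// The schema changed.
    Applied,
    /// Nothing changed; the statement still succeeds and a note is shown.
    SkippedWithNote,
}

#[derive(Debug, PartialEq, Eq)]
pub enum DdlError {
    TableAlreadyExists,
    NoSuchTable,
}

/// CREATE TABLE [IF NOT EXISTS]: an existing table is an error only in the plain form.
pub fn create_table(
    tables: &mut BTreeSet<String>,
    name: &str,
    if_not_exists: bool,
) -> Result<Outcome, DdlError> {
    if tables.contains(name) {
        return if if_not_exists {
            Ok(Outcome::SkippedWithNote)
        } else {
            Err(DdlError::TableAlreadyExists)
        };
    }
    tables.insert(name.to_string());
    Ok(Outcome::Applied)
}

/// DROP TABLE [IF EXISTS]: an absent table is an error only in the plain form.
pub fn drop_table(
    tables: &mut BTreeSet<String>,
    name: &str,
    if_exists: bool,
) -> Result<Outcome, DdlError> {
    if tables.remove(name) {
        return Ok(Outcome::Applied);
    }
    if if_exists {
        Ok(Outcome::SkippedWithNote)
    } else {
        Err(DdlError::NoSuchTable)
    }
}
```

Returning a distinct `SkippedWithNote` outcome, rather than folding it into `Applied`, lets a caller tell a real schema change from a no-op.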

ALTER TABLE <name> <action> → SqlAlterTable, where <action> covers the following forms, each mapping to an existing low-level operation:

| SQL action | Underlying operation |
| --- | --- |
| ADD COLUMN <name> <type> [constraints] | add-column (ADR-0013 rebuild where needed) |
| DROP COLUMN <name> | drop-column |
| RENAME COLUMN <old> TO <new> | rename-column |
| ALTER COLUMN <name> TYPE <type> | change-column-type (§7 conversion) |
| ADD [CONSTRAINT <name>] <table-constraint> | add-constraint / add-relationship (FK) |
| DROP CONSTRAINT <name> | drop-constraint |
| RENAME TO <new> | table rename (§6, new low-level op) |
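The action-to-operation mapping can be written as one exhaustive match. The enum and function names here are hypothetical, and the returned strings are only labels for the operations named in the mapping, not real function names.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TableConstraint {
    PrimaryKey,
    Unique,
    Check,
    ForeignKey,
}

/// One variant per ALTER TABLE action; payloads are omitted for brevity.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AlterAction {
    AddColumn,
    DropColumn,
    RenameColumn,
    AlterColumnType,
    AddConstraint(TableConstraint),
    DropConstraint,
    RenameTo,
}

/// Label of the low-level operation an action maps to.
pub fn low_level_op(action: AlterAction) -> &'static str {
    match action {
        AlterAction::AddColumn => "add-column",
        AlterAction::DropColumn => "drop-column",
        AlterAction::RenameColumn => "rename-column",
        AlterAction::AlterColumnType => "change-column-type",
        // A foreign key is a relationship, not an ordinary constraint.
        AlterAction::AddConstraint(TableConstraint::ForeignKey) => "add-relationship",
        AlterAction::AddConstraint(_) => "add-constraint",
        AlterAction::DropConstraint => "drop-constraint",
        AlterAction::RenameTo => "table-rename",
    }
}
```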

CREATE [UNIQUE] INDEX [<name>] ON <table> (<col>, …) → SqlCreateIndex, mapped to the ADR-0025 index machinery; UNIQUE sets the index's uniqueness (a small extension to ADR-0025's index model if it does not already carry the flag, called out in §13).

DROP INDEX <name> → SqlDropIndex.

5. Foreign keys → named relationships

A REFERENCES / FOREIGN KEY clause is the SQL spelling of an ADR-0013 relationship. Because SqlCreateTable is its own command carrying the whole parsed structure, a CREATE TABLE that declares FK columns creates the table and its relationship metadata together — one statement, one command, one transaction, one undo step (§10). No decomposition into separate commands is needed.

  • ON DELETE / ON UPDATE → the ADR-0013 referential actions.
  • A CONSTRAINT <name> FOREIGN KEY … names the relationship; an unnamed FK is auto-named by the existing ADR-0013 convention.
  • ALTER TABLE child ADD [CONSTRAINT <name>] FOREIGN KEY (<col>) REFERENCES <P>(<col>) … adds a relationship to an existing table (the clean 1:1 with add-relationship).
  • FK column type compatibility follows Type::fk_target_type (ADR-0011) unchanged.
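To make the "one statement, one command" point concrete, the sketch below shows a single parsed CREATE TABLE carrying both inline REFERENCES clauses and table-level FOREIGN KEY elements, and collecting them into one list of pending relationships. All struct and field names are assumptions; an unnamed foreign key keeps `name: None` so the existing ADR-0013 naming convention can fill it in later.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ForeignKey {
    /// `CONSTRAINT <name>` if given; `None` is auto-named downstream.
    pub name: Option<String>,
    pub column: String,
    pub parent_table: String,
    pub parent_column: String,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ColumnDef {
    pub name: String,
    /// Inline `REFERENCES <T>(<col>)`, stored as (table, column).
    pub references: Option<(String, String)>,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SqlCreateTable {
    pub table: String,
    pub columns: Vec<ColumnDef>,
    pub table_foreign_keys: Vec<ForeignKey>,
}

impl SqlCreateTable {
    /// Every relationship this single command must create, inline ones first.
    pub fn foreign_keys(&self) -> Vec<ForeignKey> {
        let inline = self.columns.iter().filter_map(|col| {
            col.references.as_ref().map(|(table, parent_col)| ForeignKey {
                name: None,
                column: col.name.clone(),
                parent_table: table.clone(),
                parent_column: parent_col.clone(),
            })
        });
        inline.chain(self.table_foreign_keys.iter().cloned()).collect()
    }
}
```

A handler working from this list can create the table and write all relationship rows inside one transaction, which is what makes the statement a single undo step.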

6. Table rename — advanced mode only (C1)

ALTER TABLE <old> RENAME TO <new> is advanced-mode only; there is no simple-mode rename-table verb. It needs a genuinely new low-level operation (none exists today): within one transaction, rename the table in the database, rename its data/<table>.csv file, and update every metadata row that names it — the column-metadata rows, and both ends of any relationship in __rdbms_playground_relationships that references the old name. Name validation and __rdbms_* rejection apply to the target. This closes the rename half of C1 for the advanced surface.
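The metadata side of the rename can be sketched as follows, under stated assumptions: `Relationship` is a simplified stand-in for a row of __rdbms_playground_relationships, `validate_target` checks only the reserved prefix and emptiness (the real name-validation rules are not restated here), and the database rename, column-metadata update, and transaction handling are left out.

```rust
use std::path::PathBuf;

/// Simplified stand-in for one relationship metadata row.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Relationship {
    pub name: String,
    pub child_table: String,
    pub parent_table: String,
}

/// Reject empty names and the reserved `__rdbms_` prefix (assumed rules).
pub fn validate_target(new: &str) -> Result<(), String> {
    if new.is_empty() {
        return Err("table name is empty".to_string());
    }
    if new.starts_with("__rdbms_") {
        return Err(format!("'{new}' is a reserved name"));
    }
    Ok(())
}

/// Path of a table's CSV file, following the `data/<table>.csv` layout.
pub fn csv_path(table: &str) -> PathBuf {
    PathBuf::from("data").join(format!("{table}.csv"))
}

/// Rewrite both ends of every relationship that names `old`.
/// Returns how many relationship rows were touched.
pub fn rename_in_relationships(
    rels: &mut [Relationship],
    old: &str,
    new: &str,
) -> Result<usize, String> {
    validate_target(new)?;
    let mut touched = 0;
    for rel in rels.iter_mut() {
        let mut changed = false;
        if rel.child_table == old {
            rel.child_table = new.to_string();
            changed = true;
        }
        if rel.parent_table == old {
            rel.parent_table = new.to_string();
            changed = true;
        }
        if changed {
            touched += 1;
        }
    }
    Ok(touched)
}
```

Checking both ends independently matters for a self-referencing relationship, where the renamed table is child and parent at once.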

7. Column type conversion — one engine, mode-appropriate policy

The per-cell classification of ADR-0017 (clean / lossy / incompatible, plus static refusals for playground-type-specific targets such as → serial and ↔ blob) is a property of the type set, shared by both modes. The policy on the lossy tier differs by mode:

| Tier | Simple mode | Advanced mode (ALTER COLUMN … TYPE) |
| --- | --- | --- |
| clean | auto-convert | auto-convert |
| incompatible | refuse (friendly) | refuse (friendly); real SQL errors too |
| static-refused (→ serial, ↔ blob, …) | refuse | refuse; our own types have no SQL meaning to mirror |
| lossy (3.14 → 3) | refuse by default; --force-conversion opts in | perform it (what SQL does), with a post-op "N values converted with loss" note; no force flag |

Rationale: simple mode protects up front; advanced mode trusts the user like SQL does and lets undo catch regrets. A lossy advanced conversion is snapshotted (§10), so it is one undo away — there is no silent irreversible loss, and no need to drop to simple mode to "force". Conversions that exist only in the playground's vocabulary stay protected in both modes. The simple-mode --force-conversion / --dont-convert flags are unchanged and have no SQL spelling (advanced mode always performs the conversion); the Postgres USING <expr> clause is not adopted (§12).
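The policy reduces to a small decision function over tier and mode. The names below are hypothetical, the simple-mode --dont-convert flag is left out, and the sketch decides only whether a conversion proceeds, not how the loss note is reported.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Mode {
    Simple,
    Advanced,
}

/// Classification tiers shared by both modes (ADR-0017).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Tier {
    Clean,
    Lossy,
    Incompatible,
    StaticRefused,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Decision {
    Convert,
    /// Conversion proceeds although some values lose information.
    ConvertLossy,
    Refuse,
}

/// `force_conversion` models the simple-mode flag; advanced mode ignores it.
pub fn decide(mode: Mode, tier: Tier, force_conversion: bool) -> Decision {
    match (tier, mode) {
        (Tier::Clean, _) => Decision::Convert,
        // Refused in both modes, regardless of any flag.
        (Tier::Incompatible, _) | (Tier::StaticRefused, _) => Decision::Refuse,
        // Advanced mode performs the lossy conversion, as SQL does.
        (Tier::Lossy, Mode::Advanced) => Decision::ConvertLossy,
        // Simple mode protects up front unless the user opts in.
        (Tier::Lossy, Mode::Simple) => {
            if force_conversion {
                Decision::ConvertLossy
            } else {
                Decision::Refuse
            }
        }
    }
}
```

Only the lossy tier depends on the mode; the other three tiers are decided by the type set alone.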

8. Constraints

Column- and table-level constraints map to the ADR-0029 model: NOT NULL, UNIQUE, PRIMARY KEY (incl. compound, table-level), DEFAULT <expr>, CHECK (<expr>). A populated-column constraint addition reuses ADR-0029's pre-flight dry-run guard. CHECK / DEFAULT expressions are stored as the SQL the user could re-enter in advanced mode (ADR-0030 §11) — one syntax, not a third.

9. Engine neutrality (ADR-0030 §7)

No engine type names in or out (§3). STRICT is applied internally by the create path; it is not in the authored grammar, so typing it is an ordinary parse error, not a surfaced engine feature. Parse errors, out-of-subset refusals, and execution failures route through the friendly-error layer (ADR-0019) with engine-neutral wording.

10. Persistence, metadata, history, replay, undo

  • Structural execution keeps project.yaml, the metadata tables, and the CSV layer correct with the same guarantees as the simple-mode path (ADR-0015 §6 ordering preserved).
  • history.log records the literal submitted SQL line; replay re-runs it through the one walker with the advanced view active. create / drop / alter are schema-write entry words, not in ADR-0034 Amendment 1's app-lifecycle skip set, so SQL DDL replays as a write (re-applied) with no replay-filter change — unlike undo / redo, which had to be added to that skip set.
  • Undo (ADR-0006): each SQL DDL statement is a user mutation carrying a source, so it is snapshotted by the worker hook and is one undo step — including a CREATE TABLE with foreign keys, precisely because it is a single command (§5) rather than a decomposed sequence.
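The replay point above can be illustrated with a toy filter. `REPLAY_SKIP` and `is_replayed` are hypothetical names, the skip set lists only the two entry words named in this section (the real set in ADR-0034 Amendment 1 may contain more), and case-insensitive matching of the entry word is an assumption.

```rust
/// Entry words skipped on replay. Only `undo` and `redo` are named in this ADR;
/// any further members of the real skip set are not reproduced here.
const REPLAY_SKIP: &[&str] = &["undo", "redo"];

/// True when a history line is re-applied during replay.
pub fn is_replayed(history_line: &str) -> bool {
    let word = match history_line.split_whitespace().next() {
        Some(word) => word,
        None => return false, // blank lines carry nothing to replay
    };
    let lower = word.to_ascii_lowercase();
    !REPLAY_SKIP.contains(&lower.as_str())
}
```

Since create, drop, and alter are absent from the skip set, SQL DDL lines pass through this filter unchanged and are re-applied.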

11. Ambient assistance comes for free (ADR-0030 §8)

Because the DDL is grammar in the unified tree, the walker mechanisms apply with no DDL-specific assistance code: syntax highlighting, the [ERR]/[WRN] validity indicator (ADR-0027), the per-command parse-error usage skeleton (ADR-0021), and the completion engine.

What each grammar node still authors (this is writing the grammar, not bolting assistance on afterwards): the correct IdentSource on every schema-name slot — so ALTER TABLE/DROP TABLE/DROP INDEX and REFERENCES T(col) / CREATE INDEX ON T (cols) complete from the SchemaCache; the per-node hint + usage catalog keys (as the app-command nodes carry help_id / usage_ids); and the DDL-specific walker diagnostics with their catalog keys — the DDL peers of the DML diagnostics ADR-0033 added (e.g. unknown type, column-already-exists, FK column-type mismatch, the §7 lossy-conversion note). The integration is structural, not free of authoring.

12. Out of scope

  • Per ADR-0030 §3: views, triggers, transaction control, PRAGMA, ATTACH/DETACH, VACUUM, virtual tables, multi-statement batches. One statement per submission; a trailing ; is tolerated.
  • The Postgres USING <expr> conversion clause (§7) — heavy (per-row expression evaluation), dialect-specific, and unable to express playground-type targets.
  • The simple-mode --dont-convert semantics have no SQL form (advanced ALTER COLUMN TYPE always converts).
  • The DSL → SQL teaching echo (ADR-0030 §10) is Phase 5, a separate ADR — not this one.
  • Engine-specific DDL spellings (AUTOINCREMENT, WITHOUT ROWID, collations) — the grammar admits the standard surface; extras are ordinary parse errors. (IF [NOT] EXISTS was reclassified into scope — see §4 — as a near-universal cross-vendor idiom rather than an engine-specific spelling.)

13. Phased implementation plan

Sub-phases, each opening with the smallest end-to-end slice and each with an explicit exit gate + a written Devil's-Advocate gate, mirroring ADR-0033's structure:

  • 4a — Dispatch + CREATE TABLE core. Advanced create dispatch; SqlCreateTable for columns + types (the §3 map) + column constraints + single/compound PRIMARY KEY, plus IF NOT EXISTS (no-op-with-note, §4). No FK yet.
  • 4b — Foreign keys in CREATE TABLE. Inline REFERENCES + table-level FOREIGN KEY → relationship metadata, one undo step.
  • 4c — DROP TABLE [IF EXISTS] → SqlDropTable (cascade parity; IF EXISTS no-op-with-note, §4).
  • 4d — CREATE [UNIQUE] INDEX / DROP INDEX → SqlCreateIndex / SqlDropIndex (ADR-0025; the UNIQUE flag extension if needed).
  • 4e — ALTER TABLE add/drop/rename column.
  • 4f — ALTER TABLE … ALTER COLUMN TYPE (the §7 conversion model + the lossy-with-note path).
  • 4g — ALTER TABLE add/drop constraint, add foreign key.
  • 4h — ALTER TABLE … RENAME TO (the §6 new low-level op).
  • 4i — Verification sweep. Typing-surface + matrix coverage, engine-neutral error pass, undo-parity check (one step per statement), help/usage for the new forms.

Consequences

  • Advanced mode reaches DDL parity with simple mode and adds table-rename, so a learner can build and evolve a whole schema in standard SQL with the playground's types, metadata, and safety intact.
  • The command set grows by five Sql* DDL variants; the worker gains their handlers, which lean on shared low-level helpers where the operation matches the simple-mode path and stand alone where the SQL surface is genuinely richer (multi-FK CREATE TABLE).
  • One genuinely new capability — table rename — adds a low-level op that the simple mode does not have; it must keep the CSV file name and the relationship metadata in step with the table name.
  • ADR-0030 §4 is clarified (own Sql* DDL commands, structurally executed); no behaviour of the shipped DML/SELECT phases changes.
  • The conversion model unifies simple and advanced without a force flag in SQL, relying on undo (ADR-0006) as the advanced-mode safety net — a concrete payoff of having shipped undo first.

See also

  • ADR-0030 — the advanced-mode architecture; this is its Phase 4 and clarifies §4 (DDL representation) and restates §5 (types) / §7 (neutrality) / §8 (assistance) / §11 (persistence).
  • ADR-0033 — the DML phase; source of the category-grouped mode-aware dispatch (Amendment 1) reused for shared entry words.
  • ADR-0031 — sql_expr, reused for CHECK / DEFAULT.
  • ADR-0013 — relationships + the rebuild-table primitive that the ALTER/FK handlers build on.
  • ADR-0017 — the column type-change classification §7 shares.
  • ADR-0029 — column constraints; ADR-0025 — indexes; ADR-0011 — FK column-type compatibility; ADR-0005 — the ten-type vocabulary.
  • ADR-0006 — undo; each DDL statement is one undo step (§10).