diff --git a/docs/adr/0043-compound-pk-foreign-key-references.md b/docs/adr/0043-compound-pk-foreign-key-references.md new file mode 100644 index 0000000..a7c01cc --- /dev/null +++ b/docs/adr/0043-compound-pk-foreign-key-references.md @@ -0,0 +1,260 @@ +# ADR-0043: Compound-primary-key foreign-key references (T3) + +## Status + +**Accepted** — 2026-06-09. All four genuine forks confirmed by the +user at the recommended option: **F-A** full PK in order, **F-B** +house-style uniform column lists (no migration; back-compat not +required), **F-C** parenthesized DSL lists, **F-D** bare table-level +SQL FK auto-expands to the parent's full PK. Closes the one open +leg of +`requirements.md` **T3** ("compound primary keys handled +end-to-end (DSL, storage, display, **FK reference**)"): a foreign +key that *references* a compound (multi-column) primary key. + +Cross-references **ADR-0011** (FK column type compatibility — +`Type::fk_target_type`), **ADR-0013** (relationships, naming, the +rebuild-table strategy, and the `__rdbms_playground_relationships` +metadata table), **ADR-0035 §4b** (the SQL `FOREIGN KEY` surface), +**ADR-0004 / ADR-0015** (`project.yaml` as the authoritative +format; `playground.db` is a derived artifact), and **ADR-0009** +(DSL surface conventions). + +## Context + +Compound PRIMARY KEYs are declared, stored, and displayed today +(`create table T with pk a(int), b(int)` → `primary_key: +Vec<String>`). The missing leg is the *reference*: a child table +whose foreign key points at a parent's compound PK. A 2026-06-09 +codebase audit found single-column FK is a pervasive assumption — +~15–20 sites across 6+ files: + +- **Metadata** — `__rdbms_playground_relationships` stores scalar + `parent_column TEXT` / `child_column TEXT` + (`PRIMARY KEY (child_table, child_column)`). +- **Persistence** — `RelationshipSchema { parent_column: String, + child_column: String }`; `project.yaml` `RawEndpoint { table, + column }`. +- **Grammar** — `add 1:n relationship … from <table>.<col> to + <child>.<col>` (one ident per side); SQL `FOREIGN KEY (<col>) + REFERENCES <table>(<col>)` (parens that hold exactly one ident). +- **AST** — `Command::AddRelationship { parent_column: String, + child_column: String }`; `SqlForeignKey { child_column: String, + parent_column: Option<String> }`. +- **Executor** — `schema_to_ddl` emits a single-column + `FOREIGN KEY (c) REFERENCES P(p)`; `check_fk_type_compat` + compares one parent type to one child type; bare + `REFERENCES <table>` on a compound-PK parent is refused as + ambiguous (`resolve_create_table_fks`, + `do_alter_add_foreign_key`). +- **Display** — `RelationshipEnd { other_column: String, + local_column: String }`. + +This is not a sweep-sized change, which is why it earns an ADR +rather than an inline build. The decisions below also turn the +audit's worst-case framing (a metadata-schema + yaml-format +migration via the F3 framework) into a **no-migration** change. + +### Why no migration is needed + +**Decision input (user, 2026-06-09): back-compatibility with +existing saved projects is not required.** The project is +pre-release; there is no installed base of `project.yaml` / +`playground.db` files to preserve. This removes the only force +that would have demanded an F3 migrator or a version bump, and — +more importantly — it lets the representation be chosen for +*cleanliness and consistency* rather than for byte-identical +back-compat. The consequence is explicit and accepted: a +`project.yaml` written before this change that contains +relationships will not load under the new format. + +Freed of back-compat, the storage follows the convention the file +**already uses** for ordered column lists rather than inventing a +new one: + +- `project.yaml` already writes `primary_key: [id]` (a compound PK + is `primary_key: [a, b]`) and index `columns: [a, b]` + (`RawIndex { columns: Vec<String> }`). The relationship endpoint + is the lone multi-column-capable slot still using a scalar + `column:`. It joins the house style (D5). +- The metadata columns are `TEXT`; SQLite has no array type, so a + list lives in a text cell as JSON regardless. That JSON is now a + *uniform* encoding (a one-element array for the single-column + case), not a "bare-name-or-JSON, sniff which" fallback — the + fallback only existed to keep old rows identical, which is no + longer a goal. + +So this is not a clever back-compat dodge; it is "use the existing +list convention, uniformly." 
No version bump, no F3 migrator. + +## Decision + +Support a foreign key that references a parent's **full** compound +primary key, matched **positionally** to an equal-length child +column list, with per-pair type compatibility — across both the +DSL and SQL surfaces — using uniform list storage that needs no +migration. + +### D1 — Matching policy: the full PK, in order + +A compound-PK FK references **all** columns of the parent's +primary key, in PK declaration order, matched 1:1 to the child's +column list (same length). Referencing a *subset* of a compound PK +is **out of scope**: SQL/SQLite require FK parent columns to form a +PK or UNIQUE key, and a strict subset of a compound PK is not +itself unique unless separately constrained. Teaching-clean rule: +*a foreign key to a compound key names every column of that key.* + +A length mismatch (child supplies N columns, parent PK has M ≠ N) +is a friendly error naming both counts. + +### D2 — Type compatibility: per pair, positional + +Each child column's type must satisfy +`parent_pk_col.fk_target_type() == child_col` for the +corresponding pair (the existing ADR-0011 rule, applied +element-wise in order). `check_fk_type_compat` generalises to walk +the pairs and report the **first** offending pair with the same +wording it uses today. + +### D3 — DSL syntax: parenthesized column lists + +`add 1:n relationship [as <name>] + from <table>.(<p1>, <p2>) to <child>.(<c1>, <c2>) + [on delete …] [on update …] [--create-fk]` + +The single-column form `from <table>.<col> to <child>.<col>` is unchanged +(no parens) — back-compatible and the common case. The +parenthesized list is the multi-column form. Both sides must use +the same arity (enforced as a D1 length check). Parentheses mirror +the existing compound-PK *declaration* syntax (`with pk a(int), +b(int)` uses parens around the per-column type; the FK list uses +parens around the column names) and the SQL `FOREIGN KEY (…)` +shape, so the surface stays internally consistent. + +### D4 — SQL syntax: extend the existing lists + +`FOREIGN KEY (<c1>, <c2>) REFERENCES <table>(<p1>, <p2>)` — the grammar's +child and parent column slots become comma-separated **lists** +(today capped at one). Inline `<col> REFERENCES <table>(<p1>, +<p2>)` stays single-child-column (one inline column can't match a +2-column key) — a compound FK uses the table-level form. Bare +table-level `FOREIGN KEY (x, y) REFERENCES <table>` (no parent +columns) **auto-expands to the parent's full PK** when the arities +match; bare inline `<col> REFERENCES <table>` on a compound-PK parent +keeps today's friendly refusal, with the message pointing at the +table-level multi-column form. + +### D5 — Storage: uniform column lists, matching the house style + +Both stores hold an **ordered column list**, uniformly (a +one-element list for the single-column case), following the +convention `project.yaml` already uses for `primary_key` and index +`columns`. + +- **`project.yaml`**: `RawEndpoint` becomes `{ table, columns: + Vec<String> }` and writes `columns: [a, b]` (single-column → + `columns: [id]`), exactly parallel to `primary_key: [id]`. No + scalar `column:` form, no dual-shape reader. +- **Metadata** (`__rdbms_playground_relationships`): no + `CREATE TABLE` change (the `TEXT` columns and + `PRIMARY KEY (child_table, child_column)` are untouched). + `parent_column` / `child_column` store the list as a JSON array + string — uniformly, including `["id"]` for a single column + (SQLite has no array type, so a text cell is where a list lives). + The actual enforced FK lives on the rebuilt child table's DDL + (`FOREIGN KEY (a, b) REFERENCES P(x, y)`), emitted by + `schema_to_ddl`, exactly as the single-column FK is today via the + rebuild-table primitive (ADR-0013) — one relationship, one undo + step. + +### D6 — In-memory model: `Vec<String>` column lists + +`Command::AddRelationship`, `SqlForeignKey`, `RelationshipSchema`, +the internal `ReadForeignKey`, and `RelationshipEnd` (display) all +carry `parent_columns: Vec<String>` / `child_columns: Vec<String>` +(or `Option<Vec<String>>` for the bare-SQL parent case). A +one-element vec is the single-column case; nothing about the +single-column UX changes. + +## Genuine forks (escalated for sign-off) + +These are decisions, not facts. Recommendations are marked; the +user confirmed each at the recommended option (see Status). + +- **F-A — matching policy.** Full PK only (D1, *recommended*) vs. + allow a subset (needs a separate UNIQUE key; larger, less + teaching-clean). 
+- **F-B — storage encoding.** Uniform column lists in the existing + house style — `columns: [a, b]` in yaml (like `primary_key`), + JSON-array in the unchanged metadata `TEXT` columns; no + back-compat, no migration (D5, *recommended*) vs. a normalized + relationship-columns child table (more "correct" but a schema + change with joins on read, no learner-visible payoff). Premise: + no existing projects to preserve (confirmed). +- **F-C — DSL multi-column syntax.** `from P.(a, b) to C.(x, y)` + parenthesized (D3, *recommended*) vs. a repeated-dotted form + (`from P.a, P.b to C.x, C.y`, more ambiguous to parse and read). +- **F-D — bare table-level SQL FK auto-expansion.** Auto-expand + `FOREIGN KEY (x,y) REFERENCES P` to P's full PK when arities + match (D4, *recommended*) vs. always require explicit parent + columns. + +## Implementation sketch (change sites) + +Grouped; each lands behind tests. No migration step. + +1. **AST** — `AddRelationship` + `SqlForeignKey` column fields → + `Vec<String>` / `Option<Vec<String>>` (`command.rs`). +2. **Grammar** — DSL endpoint column slot → optional + parenthesized list (`ddl.rs`); SQL child/parent column slots → + comma lists (`sql_create_table.rs`). Builders collect lists. +3. **Metadata** — `insert_relationship_metadata` / + `read_all_relationships` encode/decode the uniform JSON array + of D5 (`db.rs`); no `CREATE TABLE` change. +4. **Persistence** — `RelationshipSchema` → `Vec<String>`; + `RawEndpoint` becomes `{ table, columns: Vec<String> }`, written + `columns: [a, b]` like `primary_key` + (`persistence/mod.rs`, `persistence/yaml.rs`). +5. **Executor** — `do_add_relationship` / + `resolve_create_table_fks` / `do_alter_add_foreign_key` walk + column lists; `schema_to_ddl` emits multi-column `FOREIGN KEY + (…) REFERENCES P(…)`; `check_fk_type_compat` loops pairs; + bare-reference paths auto-expand to the full PK (D4) or refuse + with the improved message (`db.rs`). +6. 
**Display** — `RelationshipEnd` → column lists; `describe` / + echo render `(a, b) → (x, y)` (`db.rs`, `echo.rs`). +7. **Tests** — parse (DSL + SQL, single still works, multi parses, + arity mismatch errors); worker round-trip (declare a 2-col FK, + rebuild, FK enforced, type-mismatch refused); persistence + round-trip (yaml `columns:` reads + writes; a single-column + relationship round-trips as `columns: [id]`); display. + +## Consequences + +- T3 closes; a learner can model a real composite-key relationship + end to end. +- No migration, and the on-disk representation gets *more* + consistent: the relationship endpoint joins the `primary_key: + [...]` / index `columns: [...]` list convention. The in-app + single-column UX is untouched (one-element vecs). +- Accepted trade-off (user, 2026-06-09): a `project.yaml` written + before this change that contains relationships will not load + under the new format. There is no installed base to preserve, so + this is a clean cutover, not data loss. +- The relationship model becomes list-based throughout, which is + the natural foundation if subset/UNIQUE-targeted FKs are ever + wanted (explicitly OOS here). +- A modest, broad refactor (the `Vec` field change ripples through + the 6 layers) — methodical, not deep; locked by tests at each + layer. + +## Out of scope + +- Subset/non-PK FK targets (referencing a UNIQUE key that isn't + the PK) — possible later on this list-based foundation. +- Any change to single-column behaviour, the rebuild-table + primitive, or the undo model (one relationship = one undo step + stands). +- A `project.yaml` version bump or F3 migrator (not needed — + no installed base to migrate; clean cutover per D5). 
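For illustration, the positional matching described in D1 and D2 can be sketched as below. This is a minimal sketch under stated assumptions, not the project's code: `Type`, `FkError`, and `pair_fk_columns` are hypothetical names, and the `fk_target_type` mapping shown here (a `Serial` key referenced by an `Int` column) is an assumed simplification of the ADR-0011 rule.

```rust
/// Hypothetical, simplified column types (stand-ins for the project's `Type`).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Type {
    Int,
    Serial,
    Text,
}

impl Type {
    /// Assumed for illustration only: an auto-incrementing key is
    /// referenced by a plain integer; every other type references itself.
    fn fk_target_type(self) -> Type {
        match self {
            Type::Serial => Type::Int,
            other => other,
        }
    }
}

/// Hypothetical error type covering the two refusals in D1 and D2.
#[derive(Debug, PartialEq, Eq)]
enum FkError {
    ArityMismatch { child: usize, parent_pk: usize },
    TypeMismatch { child_column: String, parent_column: String },
}

/// Pair each child column with the parent PK column at the same position.
/// Returns `(child, parent)` name pairs, or the first offending pair.
fn pair_fk_columns(
    parent_pk: &[(&str, Type)],
    child: &[(&str, Type)],
) -> Result<Vec<(String, String)>, FkError> {
    // D1: the child list must cover the full PK, so the lengths must match.
    if child.len() != parent_pk.len() {
        return Err(FkError::ArityMismatch {
            child: child.len(),
            parent_pk: parent_pk.len(),
        });
    }
    let mut pairs = Vec::with_capacity(child.len());
    // D2: check compatibility element-wise, in PK declaration order.
    for (&(p_name, p_ty), &(c_name, c_ty)) in parent_pk.iter().zip(child.iter()) {
        if p_ty.fk_target_type() != c_ty {
            return Err(FkError::TypeMismatch {
                child_column: c_name.to_string(),
                parent_column: p_name.to_string(),
            });
        }
        pairs.push((c_name.to_string(), p_name.to_string()));
    }
    Ok(pairs)
}

fn main() {
    let parent_pk = [("order_id", Type::Serial), ("line_no", Type::Int)];
    let good = [("order_ref", Type::Int), ("line_ref", Type::Int)];
    let bad_type = [("order_ref", Type::Int), ("line_ref", Type::Text)];

    println!("{:?}", pair_fk_columns(&parent_pk, &good));
    println!("{:?}", pair_fk_columns(&parent_pk, &good[..1]));
    println!("{:?}", pair_fk_columns(&parent_pk, &bad_type));
}
```

Under the same assumptions, the D4 bare table-level case would reuse this check: the caller passes the parent's full PK as `parent_pk`, so auto-expansion and the arity refusal come from a single code path.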
diff --git a/docs/adr/README.md b/docs/adr/README.md index b41111f..e3ffcf2 100644 --- a/docs/adr/README.md +++ b/docs/adr/README.md @@ -48,3 +48,4 @@ This directory contains the project's ADRs, recorded per - [ADR-0040 — A per-command completion marker (✓/✗) replaces the `[ok]` summary line](0040-completion-marker-replaces-ok-summary.md) — **Accepted 2026-05-30 (issue #9)**, amends ADR-0014 / ADR-0028 / ADR-0019 output conventions, builds on ADR-0037's mode-tagged echo. An audit of the whole command surface found the `[ok] ` summary line duplicates the echo line above it (verb+subject) everywhere; its only unique contribution is the success-vs-error signal (and `explain select` even rendered `[ok] explain` with an empty subject post-ADR-0039). Decision: drop the `[ok]` line and the symmetric `"…" failed:` prefix; the echo line gains a trailing inline **✓** (green, success) / **✗** (red, failure) — `running:` becomes a pending state that resolves to ` ✓/✗` on completion (status set via the existing `rfind(Echo)` lookup). Content (row counts, structure, data, plan tree, teaching echo) unchanged. Scoped to the DSL/data/SQL family that has the redundant echo+`[ok]` pair; app-command `[ok]` lines (`rebuild`/`export`/`now editing`) are payload-bearing, have no echo to mark, and stay as-is. `ok.summary` retired; `dsl.failed` reduced to the rendered reason. Broad but mechanical snapshot churn. OOS: app-command `[ok]` lines, the `[WRN]` validity indicator, and the tag colours (issue #10) - [ADR-0041 — Copy the output panel to the system clipboard](0041-copy-output-to-clipboard.md) — **Accepted 2026-06-02 (issue #11)**, amends ADR-0003's app-command registry (adds **`copy`** / `copy all` / `copy last`). The friction it removes: filing a bug report meant terminal-selecting the output panel and fighting wrapping/borders. New **app-level command** (sigil-free, both modes): `copy` / `copy all` copy the whole panel; `copy last` copies from the most recent echo line to the end. 
**Mechanism — OSC 52 *and* native (`arboard`), always both**, because OSC 52 acceptance is undetectable (no terminal ack), so a true "fall back when unsupported" can't be built: emit the OSC 52 escape (no new dep — `base64`+`crossterm`; works over SSH; tmux-passthrough-wrapped via `$TMUX`), then a best-effort native write whose failure is ignored (headless host — OSC 52 carried it); the two carry identical content. **Format — plain text verbatim as rendered** (tags, `✓`/`✗`, box-drawing) joined by `\n`, without viewport padding/wrapping; a drift-lock test pins `OutputLine::plain_text` to `render_output_line`. `arboard` added **`--no-default-features`** (drops the `image` crate; X11-only on Linux — `wayland-data-control` deliberately omitted as it ~doubles the dep tree and OSC 52 covers native-Wayland). Security: write-only, scans clean for arboard's tree (cargo audit / osv-scanner / grype), 1Password-maintained, minimal surface. OOS: Markdown export, selection/range, a keybinding, OSC 52 read, `screen` passthrough - [ADR-0042 — H1a parse-error pedagogy in the grammar-tree era](0042-h1a-parse-error-pedagogy-grammar-tree.md) — **Accepted 2026-06-03.** Continues **H1a** from ADR-0021 against the ADR-0024 grammar tree (ADR-0021's chumsky mechanism is dead). Records the **baseline already shipped** — per-command `usage:` block (38 `parse.usage.*` templates), available-commands fallback, structural "after `…`, expected …" wording, source-derived ident slot labels ("table name"/"column name"), curated `parse.custom.*` near-miss messages, and the ADR-0027/0033/0036 schema-aware `[ERR]` diagnostics — so H1a is *substantially* delivered at the intent level. 
Defines the remaining work as **(1)** a verified per-command **near-miss matrix** (`tests/typing_surface/` + `tests/it/parse_error_pedagogy.rs`) as the definition of done, test-first; **(2)** **friendlier literal expectation labels** — optional prose glosses on `Word`/`Punct`/`Flag` positions that *add* role context while always keeping the exact literal visible (e.g. "a filter clause: `where …` or `--all-rows`"); **(3)** **advanced-mode SQL** near-miss parity (RETURNING scope, CTE-arity positioning, `CROSS JOIN … ON`, INSERT…SELECT count) — **in scope**, kept distinct from ADR-0019 §OOS-2 which covers advanced-SQL *engine*-error sanitisation, a different layer. Catalog/anchor-phrase discipline (ADR-0019) preserved; no public API change. OOS: I3/I4, spell-correction, multi-error reporting, verbosity-gating the usage block +- [ADR-0043 — Compound-primary-key foreign-key references (T3)](0043-compound-pk-foreign-key-references.md) — **Accepted 2026-06-09** (all four forks confirmed at the recommended option: full-PK matching, house-style uniform lists, parenthesized DSL syntax, bare-SQL-FK auto-expansion). Closes the open leg of `requirements.md` **T3**: a foreign key that *references* a parent's compound primary key. A 2026-06-09 audit found single-column FK woven through ~15–20 sites (metadata table, `RelationshipSchema`, `project.yaml` `RawEndpoint`, both grammar surfaces, executor FK-DDL emission, per-column type-compat, display) — earns an ADR, not an inline build. **Decision:** reference the parent's **full** compound PK, matched **positionally** to an equal-length child column list, per-pair `fk_target_type` compat (ADR-0011, element-wise); DSL `from <table>.(a, b) to <child>.(x, y)` (single form unchanged), SQL `FOREIGN KEY (x, y) REFERENCES P(a, b)` (extend the existing one-cap lists; bare table-level FK auto-expands to the parent PK when arities match). **Storage — no migration (back-compat not required, user-confirmed 2026-06-09; no installed base):** the relationship endpoint joins the list convention `project.yaml` *already* uses — `columns: [a, b]` like `primary_key: [id]` and index `columns: [...]` (the endpoint was the lone scalar `column:` holdout); the metadata `TEXT` columns are unchanged and store the list as a uniform JSON array (`["id"]` even for single — SQLite has no array type). No F3 migrator, no version bump; accepted trade-off is that a pre-change `project.yaml` with relationships won't load (clean cutover). In-memory model goes list-based (`Vec<String>`) through all six layers; the enforced FK is the rebuilt child-table DDL (`FOREIGN KEY (a,b) REFERENCES P(x,y)`), one relationship = one undo step (ADR-0013). Genuine forks escalated: matching policy (full-PK vs subset), storage (house-style uniform lists vs normalized table), DSL syntax (parenthesized vs repeated-dotted), bare-SQL-FK auto-expansion. OOS: subset/non-PK (UNIQUE-targeted) FK references; any single-column behaviour change diff --git a/docs/requirements.md b/docs/requirements.md index b74373c..6537f08 100644 --- a/docs/requirements.md +++ b/docs/requirements.md @@ -405,10 +405,24 @@ since ADR-0027.) (`with pk a(int),b(int)`), **storage** (`primary_key: Vec<String>`), and **display** are present and tested. **Missing: a FK that *references* a compound PK** — - `db.rs:6822-6836` enforces a single `parent_column: String`; - a bare `REFERENCES parent` on a compound-PK table is refused as - ambiguous, and multi-column FK target syntax is not in the - grammar. 
This is the one open end-to-end leg of T3.)* + `db.rs` resolve/alter FK paths enforce a single + `parent_column: String`; a bare `REFERENCES parent` on a + compound-PK table is refused as ambiguous, and multi-column FK + target syntax is not in the grammar. This is the one open + end-to-end leg of T3 — but a **codebase audit (2026-06-09) + found it is not a small finish**: single-column FK is woven + through ~15–20 sites across 6+ files — the + `__rdbms_playground_relationships` table schema, the + `RelationshipSchema` struct, the **`project.yaml` relationship + format** (`RawEndpoint { column }`), both grammar surfaces + (`add 1:n relationship` + SQL `FOREIGN KEY`), the executor's FK + DDL emission, and the per-column type-compat check. It needs a + **migration** (the metadata-table + yaml-format change, F3) and + an **ADR** to settle the design forks: compound-PK matching + policy (must an FK reference *all* PK columns, or a subset?), + per-pair type-compat semantics, the yaml multi-column shape, and + back-compat for existing single-column projects. So this leg is + ADR-first, not a sweep item.)* ## Visualizations