
# ADR-0025: Indexes

## Status

Accepted. **Amendment 1 (2026-05-25):** UNIQUE indexes are admitted on
the **advanced-mode** SQL surface (`CREATE UNIQUE INDEX`) — see
*Amendment 1* below and ADR-0035 §4d. The original *Out of scope*
exclusion stands for the **simple-mode DSL** (`add unique index` remains
deferred).

## Context

The requirements checklist (`C3`) commits to indexes as part
of the schema-constraint surface, and `S2` commits to the
items list showing "tables and per-table indexes". Neither is
implemented yet.

Indexes are the natural next teaching topic after relationships:
they are the structure that makes `EXPLAIN QUERY PLAN` (`QA1`)
pedagogically interesting — the plan for a filtered query
visibly changes from a full scan to an index search once an
index exists. `QA1` itself is a deliberate follow-up (it needs
its own rendering ADR and a query worth explaining); this ADR
sets it up by giving the playground real indexes.

Three design problems shape the decision:

1. **SQLite owns the index namespace.** Unlike foreign keys —
   which have no name slot, the problem ADR-0013 solved with an
   internal metadata table — an index in SQLite *is* a named
   object. `sqlite_master` and `PRAGMA index_list` /
   `index_info` carry the name, table, column list, and
   uniqueness natively. There is nothing app-specific to store.
2. **`DROP TABLE` silently drops a table's indexes.** The
   rebuild-table primitive (ADR-0013) — used by change-column-
   type and every relationship operation — drops and recreates
   the table. Once indexes exist, every such operation would
   erase them unless the primitive is taught to preserve them.
3. **`playground.db` is a derived artifact** (ADR-0004 /
   ADR-0015). Indexes must round-trip through `project.yaml`
   or they vanish on `rebuild`, `export`, and `import`.

## Decision

### Grammar

Indexes are declared and removed via DSL commands following
ADR-0009 (required clauses keyword-based; optional names
introduced by `as` per the ADR-0013 convention; `--` flags for
opt-ins):

```
add index [as <name>] on <Table> (<col>[, <col>...])

drop index <name>
drop index on <Table> (<col>[, <col>...])
```

- `add index` is a third branch of the existing `add`
  command, alongside `add column` and `add 1:n relationship`;
  `drop index` is a new branch of the existing `drop` command.
- `as <name>` is optional. The `as` keyword introduces the
  name, matching `add 1:n relationship [as <name>]` (ADR-0013
  established `as` as the convention for optional names).
- `on <Table>` uses the keyword `on` — the SQL-natural word
  for `CREATE INDEX ... ON table`, and pedagogically aligned.
- The column list is parenthesised and comma-separated, the
  same shape as `create table` and `insert`. One or more
  columns; multiple columns produce a composite index in the
  given order. An empty list `()` is a parse error.
- Column-list completion resolves against the named table,
  reusing the dynamic-subgrammar mechanism that already drives
  `insert into T (...)` column candidates.

`add unique index` is **not** part of this ADR — see
*Out of scope*.

### Auto-name format

When `as <name>` is omitted, the executor generates
`<Table>_<col1>[_<col2>...]_idx`, mirroring the descriptive,
subject-first style of ADR-0013's relationship auto-names.

Examples:

- `add index on Customers (email)` → `Customers_email_idx`
- `add index on Orders (CustId, Date)` → `Orders_CustId_Date_idx`

If the generated name is already taken — which happens exactly
when the same columns of the same table are already indexed —
the command is refused with a friendly error naming the
existing index (a second index on an identical column set is
redundant). A duplicate *explicit* name is likewise a friendly
error.
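
The naming rule can be sketched in a few lines of Rust. This is a minimal illustration, not the executor's actual code: the function name `auto_index_name` is hypothetical, and the sketch assumes the table and column identifiers are used exactly as the user typed them.

```rust
/// Build the default index name `<Table>_<col1>[_<col2>...]_idx`.
/// `auto_index_name` is a hypothetical helper used for illustration only.
fn auto_index_name(table: &str, columns: &[&str]) -> String {
    // Table first, then each column in declaration order, then the suffix.
    let mut parts: Vec<&str> = Vec::with_capacity(columns.len() + 2);
    parts.push(table);
    parts.extend_from_slice(columns);
    parts.push("idx");
    parts.join("_")
}
```

Under these assumptions, `auto_index_name("Orders", &["CustId", "Date"])` yields `Orders_CustId_Date_idx`, matching the second example above.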

### Drop forms

`drop index` accepts two forms, mirroring `drop relationship`:

- `drop index <name>` — for users who named the index or know
  the generated name.
- `drop index on <Table> (<col>...)` — the positional form,
  resolved by matching the table and exact column set against
  the table's indexes. No match is a friendly error; more than
  one match is an ambiguity error listing the candidates and
  advising the user to drop by name.
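
The three outcomes of the positional form (one match, no match, several matches) could be modelled roughly as follows. The type and function names are hypothetical, and the sketch assumes that "exact column set" means the same columns in the same order, compared case-sensitively; the real resolver may differ on both points.

```rust
/// Hypothetical structural view of an index (name, table, ordered columns).
struct IndexDef {
    name: String,
    table: String,
    columns: Vec<String>,
}

/// Outcome of resolving `drop index on <Table> (<col>...)`.
#[derive(Debug, PartialEq)]
enum DropTarget {
    Found(String),
    NoMatch,
    Ambiguous(Vec<String>),
}

/// Match on table plus the ordered column list (ordering is an assumption).
fn resolve_positional_drop(indexes: &[IndexDef], table: &str, columns: &[&str]) -> DropTarget {
    let mut matches: Vec<String> = indexes
        .iter()
        .filter(|ix| ix.table == table)
        .filter(|ix| ix.columns.iter().map(String::as_str).eq(columns.iter().copied()))
        .map(|ix| ix.name.clone())
        .collect();
    match matches.len() {
        0 => DropTarget::NoMatch,
        1 => DropTarget::Found(matches.remove(0)),
        // More than one candidate: the caller lists them and advises drop-by-name.
        _ => DropTarget::Ambiguous(matches),
    }
}
```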

### Storage — no metadata table

Indexes do **not** get a `__rdbms_playground_indexes` table.
SQLite stores everything the application needs natively:

- `PRAGMA index_list(<table>)` — index name, uniqueness, and
  `origin` (`c` = `CREATE INDEX`, `u` = UNIQUE constraint,
  `pk` = primary key).
- `PRAGMA index_info(<index>)` — the ordered column list.

The application reads indexes through these pragmas. Only
`origin = 'c'` indexes are treated as user indexes; the
automatic indexes SQLite creates to back primary keys and
UNIQUE constraints are not surfaced as user indexes.

This is a deliberate divergence from the ADR-0013 relationship
precedent. Relationships needed a metadata table because SQL
foreign keys have no name slot; indexes have one, so the
divergence is justified — adding a metadata table would
duplicate state SQLite already owns and create a consistency
hazard.

The in-memory representation is a small structural value
(`name`, `table`, ordered `columns`) carried by `db.rs`,
`persistence`, and the renderer.
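
As an illustration of the `origin = 'c'` filter, the sketch below works on rows already read from `PRAGMA index_list`. The `IndexListRow` type and `user_index_names` function are hypothetical stand-ins; the actual database access in `db.rs` is not shown here.

```rust
/// Hypothetical reduction of one `PRAGMA index_list(<table>)` row
/// to the two fields the filter needs.
struct IndexListRow {
    name: String,
    origin: String,
}

/// Keep only indexes created by `CREATE INDEX` (origin "c"); the automatic
/// indexes backing primary keys ("pk") and UNIQUE constraints ("u") are skipped.
fn user_index_names(rows: &[IndexListRow]) -> Vec<&str> {
    rows.iter()
        .filter(|row| row.origin == "c")
        .map(|row| row.name.as_str())
        .collect()
}
```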

### `project.yaml` persistence

A top-level `indexes:` list is added to `project.yaml`,
mirroring `relationships:`. Each entry records the index name,
its table, and its ordered column list:

```yaml
indexes:
  - name: Customers_email_idx
    table: Customers
    columns: [email]
```

- `version:` stays `1`. The field is additive and optional:
  the `serde_yml` reader marks it `#[serde(default)]`, so
  project files written before this change parse unchanged. No
  migrator is required (the ADR-0015 §F3 framework stays
  empty).
- The hand-rolled writer emits `indexes: []` when there are
  none, consistent with how `tables`/`relationships` render.
- `SchemaSnapshot` gains an `indexes` vector alongside
  `tables` and `relationships`.
- `rebuild_from_text` recreates each index (via
  `CREATE INDEX`) after the tables are built. `export` /
  `import` carry indexes because they operate on the text
  artifacts.
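
A hand-rolled writer for this section might look like the following sketch. The names `IndexDef` and `render_indexes_yaml` are hypothetical, and the sketch assumes identifiers never need YAML quoting or escaping, which the real writer would have to handle.

```rust
/// Hypothetical structural view of an index (name, table, ordered columns).
struct IndexDef {
    name: String,
    table: String,
    columns: Vec<String>,
}

/// Render the top-level `indexes:` section of `project.yaml`.
/// Assumes identifiers are plain words that need no YAML quoting.
fn render_indexes_yaml(indexes: &[IndexDef]) -> String {
    // An empty list is still written explicitly, as `indexes: []`.
    if indexes.is_empty() {
        return "indexes: []\n".to_string();
    }
    let mut out = String::from("indexes:\n");
    for ix in indexes {
        out.push_str(&format!("  - name: {}\n", ix.name));
        out.push_str(&format!("    table: {}\n", ix.table));
        out.push_str(&format!("    columns: [{}]\n", ix.columns.join(", ")));
    }
    out
}
```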

### Rebuild-table interaction

The `rebuild_table` primitive (ADR-0013) is extended so it no
longer loses indexes:

1. **Before** the `DROP TABLE`, capture the table's user
   indexes structurally (name + ordered columns) via
   `PRAGMA index_list` / `index_info`, filtered to
   `origin = 'c'`.
2. **After** the `ALTER TABLE ... RENAME`, recreate them with
   `CREATE INDEX`.

Recreation is parameterised by an optional column-rename map
and a set of dropped columns, so the same primitive serves
every caller:

- **add / drop relationship**, **change column type** — the
  column set is unchanged; indexes are recreated verbatim.
- **rename column** — an index referencing the old column name
  is regenerated with the new name; the index keeps its own
  name. No error.
- **drop column** — see below.

Because indexes are captured structurally (not as raw SQL
text), regeneration after a rename is a clean substitution
rather than SQL string-munging.
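
To show why structural capture makes regeneration a simple substitution, here is a minimal sketch. `IndexDef`, `quote_ident`, and `recreate_sql` are hypothetical names, and double-quoting every identifier is an assumption about how the generated SQL is written.

```rust
use std::collections::HashMap;

/// Hypothetical structural view of a captured index.
struct IndexDef {
    name: String,
    table: String,
    columns: Vec<String>,
}

/// Quote an identifier for SQL, doubling any embedded double quotes.
fn quote_ident(id: &str) -> String {
    format!("\"{}\"", id.replace('"', "\"\""))
}

/// Regenerate `CREATE INDEX`, substituting renamed columns by lookup
/// in `renames` (old name -> new name). The index keeps its own name.
fn recreate_sql(ix: &IndexDef, renames: &HashMap<String, String>) -> String {
    let cols: Vec<String> = ix
        .columns
        .iter()
        .map(|c| renames.get(c).unwrap_or(c))
        .map(|c| quote_ident(c))
        .collect();
    format!(
        "CREATE INDEX {} ON {} ({})",
        quote_ident(&ix.name),
        quote_ident(&ix.table),
        cols.join(", ")
    )
}
```

With an empty rename map the statement reproduces the captured index verbatim, which is the case for the relationship and change-column-type callers.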

### Drop / rename column interaction

- **rename column** and **change column type** preserve any
  covering index transparently, per the rebuild rules above.
- **drop column** is refused by default when an index covers
  the dropped column. The error names the offending index(es)
  and advises dropping them first. This matches the existing
  conservative posture of `drop column`, which already refuses
  primary-key and FK-involved columns.
- A new `--cascade` flag on `drop column` opts in to the
  cascading behaviour: covering indexes are dropped
  automatically and each is reported in the result note.

```
drop column <col> from table <Table> [--cascade]
```

`Command::DropColumn` gains a `cascade: bool`. `--cascade` is
the first destructive cascade flag in the DSL; future
cascading drops should follow the same opt-in `--` pattern.
Indexes that do *not* cover the dropped column are recreated
normally regardless of the flag.
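
The refuse-or-cascade decision can be sketched as a pure function over the table's indexes. The names `IndexDef`, `DropColumnPlan`, and `plan_drop_column` are hypothetical; only the `cascade: bool` input corresponds to a field named in this ADR.

```rust
/// Hypothetical structural view of an index.
struct IndexDef {
    name: String,
    table: String,
    columns: Vec<String>,
}

/// What `drop column` should do with respect to indexes.
#[derive(Debug, PartialEq)]
enum DropColumnPlan {
    /// No index covers the column; drop it directly.
    Proceed,
    /// Refuse and name the offending indexes in the error.
    Refuse(Vec<String>),
    /// Drop these indexes first and report each in the result note.
    Cascade(Vec<String>),
}

fn plan_drop_column(indexes: &[IndexDef], table: &str, column: &str, cascade: bool) -> DropColumnPlan {
    // An index "covers" the column when the column appears in its column list.
    let covering: Vec<String> = indexes
        .iter()
        .filter(|ix| ix.table == table && ix.columns.iter().any(|c| c == column))
        .map(|ix| ix.name.clone())
        .collect();
    if covering.is_empty() {
        DropColumnPlan::Proceed
    } else if cascade {
        DropColumnPlan::Cascade(covering)
    } else {
        DropColumnPlan::Refuse(covering)
    }
}
```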

### Display — structure view

`render_structure` (the table-structure view in the output
panel) gains an `Indexes:` section, rendered after the
relationship sections and only when the table has at least one
user index:

```
Customers
  Id    [serial PK]
  Email [text]
  Name  [text]
  Indexes:
    Customers_email_idx (Email)
    cust_lookup (Email, Name)
```

`add index` and `drop index` return the affected table's
description, so the auto-show pattern (ADR-0014) displays the
updated structure — including this section — after the
command, the same as `add column` and `add relationship`.

### Display — items list (S2)

The items list (left panel) becomes a nested list: each table,
with its indexes indented beneath it.

```
Tables
  Customers
    Customers_email_idx
    cust_lookup
  Orders
    Orders_date_idx
```

This satisfies `S2` ("the items list shows tables and
per-table indexes; designed to extend to additional element
kinds … without restructuring") — the nested model *is* that
extensible structure; future kinds (relationships, views) slot
in as further child rows.

The panel's data model changes from a flat `Vec<String>` of
table names to a structured list (table name plus its index
names), populated by a schema refresh that now also reads
indexes. Index rows are display-only: the current-table
highlight behaviour is unchanged, and selecting an index row
carries no new action in this ADR.

### Errors and edge cases

All user-facing strings obey the ADR-0002 rule — "the
database" / "the engine", never the engine product name.

- `add index` on a non-existent table → friendly error.
- `add index` naming a column the table does not have →
  friendly error naming the column.
- Duplicate explicit index name, or an auto-name collision
  (same table + column set) → friendly error naming the
  existing index.
- `drop index <name>` for an unknown name → friendly error.
- `drop index on T(cols)` with no match → friendly error;
  with multiple matches → ambiguity error listing candidates.
- Internal `__rdbms_*` tables are not user tables, so the
  table identifier never resolves to one.
- `add index` / `drop index` are DSL DDL commands, available
  in simple mode, appended to `history.log`, and replayable —
  consistent with `add column` / `add relationship`.

### Out of scope

Explicitly excluded from this ADR:

- **UNIQUE indexes** (`add unique index`). A unique index is
  also a constraint; UNIQUE is tracked as its own `C3`
  sub-item and is a distinct teaching concern.
- **Partial indexes** (`CREATE INDEX ... WHERE`), **expression
  / computed indexes**, and per-column **`DESC` / collation**
  modifiers — advanced features beyond the playground's
  pedagogical aim. Plain column-list indexes only.
- **`EXPLAIN QUERY PLAN` / `QA1`** — the deliberate follow-up.
  It needs its own rendering ADR (`QA2`) and builds on the
  indexes this ADR delivers.

## Consequences

- The playground gains real, persistent indexes, advancing the
  index portion of `C3` and satisfying `S2`.
- The rebuild-table primitive now preserves indexes. This also
  closes a latent bug: once indexes exist, column rename /
  type-change would otherwise silently drop them — there are no
  indexes today, so the bug is latent rather than live, but the
  fix ships with the feature that would trigger it.
- A new structural index representation threads through
  `db.rs`, `persistence`, and `output_render`.
- No new internal table — a deliberate divergence from the
  ADR-0013 relationship precedent, justified by SQLite owning
  the index namespace natively.
- The items panel is no longer a flat list; the nested model
  is the `S2`-mandated extension point for future element
  kinds.
- `drop column --cascade` establishes the opt-in `--` flag
  pattern for destructive cascades.
- `EXPLAIN QUERY PLAN` (`QA1`) becomes worthwhile: once it
  lands, `show data <T> where <col> = <val>` is a query whose
  plan visibly changes when an index on `<col>` exists.

## Implementation notes

Two details settled differently from the sketch above, recorded
here so the decision text and the code agree:

- **`rename column` / `drop column` do not use the rebuild
  primitive.** Both run native `ALTER TABLE` (the playground
  targets SQLite 3.25+ for `RENAME COLUMN` and 3.35+ for
  `DROP COLUMN`). `ALTER TABLE … RENAME COLUMN`
  already rewrites index definitions that reference the renamed
  column, so rename needs no index code at all. `drop column`
  detects covering indexes directly and either refuses or, with
  `--cascade`, issues `DROP INDEX` before the column drop. Only
  `change column` (and the relationship operations) go through
  the rebuild primitive, and there the column set is unchanged,
  so the captured indexes are recreated verbatim — no
  column-rename map or dropped-column set is needed.

- **The items list keeps a flat `app.tables` plus a cache
  map.** Rather than restructuring `app.tables` and the
  `TablesRefreshed` event payload, per-table index names ride
  in `SchemaCache::table_indexes`, populated by the existing
  schema-cache refresh. The panel renders the ordered table
  list with each table's indexes indented beneath — the
  `S2` nested view — reading the two together.
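
Reading the flat table list and the index map together might look like the sketch below. It assumes `SchemaCache::table_indexes` behaves like a map from table name to an ordered list of index names; the `HashMap` type, the `render_items` function, and the two-space indentation are illustrative assumptions, not the panel's actual code.

```rust
use std::collections::HashMap;

/// Produce the items-list lines: each table in order, with its index
/// names indented beneath it. Tables absent from the map have no child rows.
fn render_items(tables: &[String], table_indexes: &HashMap<String, Vec<String>>) -> Vec<String> {
    let mut lines = vec!["Tables".to_string()];
    for table in tables {
        lines.push(format!("  {}", table));
        if let Some(index_names) = table_indexes.get(table) {
            for index_name in index_names {
                lines.push(format!("    {}", index_name));
            }
        }
    }
    lines
}
```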

## Amendment 1 — UNIQUE indexes in advanced mode (2026-05-25)

This ADR's *Out of scope* excluded UNIQUE indexes (`add unique index`)
on the grounds that a unique index conflates two concepts the playground
teaches separately — an index (a performance structure) and a UNIQUE
*constraint* (an integrity rule, tracked as its own `C3` sub-item). That
reasoning was written on 2026-05-16, when the **simple-mode DSL was the
only input surface**, and it still holds there: simple mode teaches the
two concepts separately, so `add unique index` stays deferred.

ADR-0035 (advanced-mode SQL DDL) introduced a second surface whose
explicit posture is to "trust the user like SQL does" (ADR-0035 §7). On
that surface `CREATE UNIQUE INDEX` is standard SQL a learner types
verbatim, and the concept-separation argument does not transfer — so
ADR-0035 §4 lists `CREATE [UNIQUE] INDEX` and **§4d supersedes this
ADR's exclusion for the advanced surface**. The constraint track this
ADR deferred *to* (ADR-0018 → ADR-0029 → ADR-0035 §4a.2) has since
shipped, so there is no remaining dependency.

Mechanically, the index model gains an `IndexSchema.unique` flag — an
additive, `#[serde(default)]` `project.yaml` field (`version` stays
`1`); the engine already reports uniqueness via `pragma_index_list`
(origin `c`), so **no `__rdbms_*` metadata table is added** (the §Storage
decision is unchanged — the divergence from the relationship precedent
stands). The rebuild primitive re-emits `CREATE UNIQUE INDEX`; the
structure view and items panel mark a unique index `[unique]`
(ADR-0035 §4d). The redundant-column-set guard keys on `(columns,
unique)` so a plain and a unique index over the same columns are not
mutual duplicates.
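
A guard keyed on `(columns, unique)` could be sketched as follows. `IndexSchema` and its `unique` flag are named in this amendment; the other fields and the `find_redundant` function are hypothetical, and the ordered, case-sensitive column comparison is an assumption.

```rust
/// Index model with the `unique` flag; fields other than `unique`
/// are assumed for illustration.
struct IndexSchema {
    name: String,
    table: String,
    columns: Vec<String>,
    unique: bool,
}

/// Return the name of an existing index on the same table with the same
/// ordered columns AND the same uniqueness, if any. A plain and a unique
/// index over the same columns therefore do not collide.
fn find_redundant<'a>(
    existing: &'a [IndexSchema],
    table: &str,
    columns: &[String],
    unique: bool,
) -> Option<&'a str> {
    existing
        .iter()
        .find(|ix| ix.table == table && ix.columns.as_slice() == columns && ix.unique == unique)
        .map(|ix| ix.name.as_str())
}
```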

The amendment also hardened the shared `do_add_index` executor to refuse
an internal `__rdbms_*` table as "no such table" (consistent with the
app-wide opacity of internal tables) — closing a latent exposure on
*both* the simple `add index` and the new SQL `CREATE INDEX` surfaces,
which both reach `do_add_index`.

## See also

- ADR-0004 / ADR-0015 (project file format and storage runtime)
- ADR-0009 (DSL command syntax conventions)
- ADR-0012 (internal column metadata — and why indexes diverge
  from that precedent)
- ADR-0013 (relationships, the rebuild-table primitive, and the
  `as <name>` convention)
- ADR-0014 (auto-show after writes)
- ADR-0023 / ADR-0024 (the unified grammar tree the new
  commands plug into)