rdbms-playground/docs/adr/0028-query-plans.md

# ADR-0028: Query plans (`EXPLAIN QUERY PLAN`)

## Status

Accepted

## Context

`QA1` commits to running `EXPLAIN QUERY PLAN` on demand and
rendering the result as an annotated tree that highlights
full table scans, index use, and join order. `QA2` — the
rendering specifics (tree layout, annotation taxonomy,
colour scheme) — was deferred pending this ADR. This ADR
covers both: the `explain` command and how its output is
rendered.

Two earlier decisions make this the right moment:

- **ADR-0025** gave the playground real, persistent
  indexes.
- **ADR-0026** designed `show data … where <expr>` — a
  *filtered* query.

Both matter, because a query plan is only pedagogically
interesting for a filtered query: an unfiltered
`SELECT * FROM t` is always a full scan, and an index can
never appear. The teaching payoff — a plan that visibly
flips from a scan to an index search once an index exists
— needs a `WHERE`. So a genuinely interesting `explain`
depends on ADR-0026 being implemented; this ADR, like
ADR-0026 and ADR-0027, is design-ahead.

`explain` also continues the arc the WHERE-expression work
began: it is one more step from the simple DSL toward real
SQL, introducing the actual `EXPLAIN QUERY PLAN` concept
and the engine's own plan vocabulary (`SCAN`, `SEARCH`,
`USING INDEX`).

One rendering obstacle shapes the decision. The output
panel today stores each line as plain text with a single
`OutputKind`, and `render_output_line` colours the whole
line by that kind. A useful plan wants finer colour —
marking the *parts* of a step that signal a scan versus an
index. ADR-0016 deferred exactly this ("per-cell theming …
V4 territory") but explicitly anticipated it (its OOS-3:
"sets up the architecture, defers the colours until the
query DSL ships"). This ADR realizes it.

## Decision

### 1. Invocation — the `explain` prefix

A plan is requested by prefixing a query with `explain`:

```
explain show data <T> [where <expr>] [limit <n>]
explain update <T> set … (where <expr> | --all-rows)
explain delete from <T>  (where <expr> | --all-rows)
```

- `explain` mirrors SQL's own `EXPLAIN QUERY PLAN <stmt>`;
  the prefix teaches the real concept directly.
- It applies to the three commands that issue a query with
  a row-finding step: `show data`, `update`, `delete`.
- **`EXPLAIN QUERY PLAN` does not execute the statement.**
  So `explain update …` and `explain delete …` show how
  the engine *would* locate the rows and change nothing —
  a safe way to inspect a destructive operation's plan.
  This is stated explicitly because it is pedagogically
  useful and not self-evident.
- The wrapped command must be well-formed: a complete
  `update` / `delete` still needs its `where` or
  `--all-rows`. `explain` of an incomplete command is the
  same parse error the command alone would be.

**Grammar.** `explain` is a new top-level command. After
the `explain` keyword its shape is a `Choice` over the
three explainable commands' shapes, reached through the
`Subgrammar` node (ADR-0026) — the `show` / `update` /
`delete` grammars are *referenced*, not duplicated, so the
explained command is parsed, completed, hinted, and
highlighted exactly as it is on its own.

**AST.** `Command::Explain { query: Box<Command> }`. The
inner `Command` is the ordinary parsed command; the
runtime recognizes the `Explain` wrapper and routes it to
the plan path instead of normal execution.
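
A minimal sketch of the wrapper and its routing, to make the shape concrete. Only `Command::Explain { query: Box<Command> }` comes from this ADR; the other variants are trimmed to a table name, and `Route`, `explain`, and `route` are hypothetical names used for illustration.

```rust
/// Trimmed-down command AST. Only `Explain` mirrors the ADR; the other
/// variants are placeholders that carry just a table name.
#[derive(Debug, Clone, PartialEq)]
pub enum Command {
    ShowData { table: String },
    Update { table: String },
    Delete { table: String },
    Explain { query: Box<Command> },
}

/// Where the runtime sends a parsed command (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum Route<'a> {
    /// Normal execution of the command as typed.
    Execute(&'a Command),
    /// Plan capture for the wrapped command; nothing is executed.
    Plan(&'a Command),
}

/// Wrap an already-parsed command. Only the three explainable commands are
/// accepted, matching the grammar's `Choice`; a nested `explain` is rejected.
pub fn explain(inner: Command) -> Result<Command, String> {
    match inner {
        Command::ShowData { .. } | Command::Update { .. } | Command::Delete { .. } => {
            Ok(Command::Explain { query: Box::new(inner) })
        }
        Command::Explain { .. } => Err("explain cannot wrap explain".to_string()),
    }
}

/// Recognize the wrapper and pick the plan path instead of execution.
pub fn route(cmd: &Command) -> Route<'_> {
    match cmd {
        Command::Explain { query } => Route::Plan(&**query),
        other => Route::Execute(other),
    }
}
```

Because the inner value is an ordinary `Command`, the plan path can hand it to the same SQL-building code the execution path uses.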

### 2. Capturing the plan

The database worker prepares `EXPLAIN QUERY PLAN <sql>`,
where `<sql>` is exactly the SQL the inner command would
otherwise have run, and reads back the result rows. Each
row is `(id, parent, notused, detail)` — the same
read-only, multi-row shape as the existing `PRAGMA`-backed
reads (`read_table_indexes`).

- The inner command's SQL is produced by the *same*
  construction logic that builds it for real execution, so
  the plan is the plan of the actual query, not an
  approximation. This requires separating the SQL-building
  step from execution for the explainable commands.
- `EXPLAIN QUERY PLAN` determines the plan from the
  statement's structure and the schema, not from parameter
  *values*. The statement is still prepared with its `?`
  placeholders, and the inner command's parameters are
  bound as usual (so a parameterised `WHERE` goes through
  the same prepare-and-bind path); the bound values do not
  affect the plan.
- A new `Request::ExplainPlan` / `do_explain_plan`, and a
  `QueryPlan` result type carrying the tree, flow back
  through `CommandOutcome` / `AppEvent` like other command
  results.
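
The separation described above can be sketched as follows: one builder returns the SQL text and its parameters, and the plan path only adds a prefix. `Value`, `BuiltSql`, `build_delete`, and `to_explain` are hypothetical names; the actual builders and the database call are not shown in this ADR, and the sketch assumes identifiers were validated before reaching the builder.

```rust
/// A value bound to a `?` placeholder (illustrative subset).
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Integer(i64),
    Text(String),
}

/// SQL text plus its parameters, built once and shared by both paths.
#[derive(Debug, Clone, PartialEq)]
pub struct BuiltSql {
    pub sql: String,
    pub params: Vec<Value>,
}

/// Hypothetical builder for `delete from <T> where <col> = <value>`.
/// Assumes `table` and `column` are already-validated identifiers.
pub fn build_delete(table: &str, column: &str, value: Value) -> BuiltSql {
    BuiltSql {
        sql: format!("DELETE FROM \"{}\" WHERE \"{}\" = ?", table, column),
        params: vec![value],
    }
}

/// The plan path: the same SQL and parameters, prefixed so the engine
/// reports a plan instead of executing the statement.
pub fn to_explain(built: &BuiltSql) -> BuiltSql {
    BuiltSql {
        sql: format!("EXPLAIN QUERY PLAN {}", built.sql),
        params: built.params.clone(),
    }
}
```

Real execution would prepare `built.sql` directly; `do_explain_plan` would prepare `to_explain(&built).sql`, so the two paths cannot drift apart.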

### 3. The plan tree

The `id` / `parent` columns form a tree. It is rendered
indented, with box-drawing connectors (`├─`, `└─`, `│`),
the way a file tree is drawn. Each node's text is the
engine's `detail` string **verbatim** — `SCAN Customers`,
`SEARCH Customers USING INDEX Customers_email_idx
(Email=?)`, and so on. Nothing is reworded.
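
As an illustration of the layout, the sketch below turns `(id, parent, detail)` rows into indented lines. `PlanRow` and `render_tree` are hypothetical names, and the sketch assumes top-level rows carry `parent == 0`, that ids are non-zero, and that the rows contain no cycles.

```rust
/// One plan row: `id`, `parent`, and the engine's verbatim `detail` text.
#[derive(Debug, Clone)]
pub struct PlanRow {
    pub id: i64,
    pub parent: i64,
    pub detail: String,
}

/// Render the rows as one display line per node, drawn like a file tree.
/// Assumption: rows with `parent == 0` are the roots.
pub fn render_tree(rows: &[PlanRow]) -> Vec<String> {
    let mut out = Vec::new();
    walk(rows, 0, "", &mut out);
    out
}

fn walk(rows: &[PlanRow], parent: i64, prefix: &str, out: &mut Vec<String>) {
    // Children keep the order in which the engine returned them.
    let children: Vec<&PlanRow> = rows.iter().filter(|r| r.parent == parent).collect();
    for (i, row) in children.iter().enumerate() {
        let last = i + 1 == children.len();
        let connector = if last { "└─ " } else { "├─ " };
        // The detail string is appended untouched.
        out.push(format!("{}{}{}", prefix, connector, row.detail));
        // Below a non-last child, a vertical bar keeps the branch visually open.
        let child_prefix = format!("{}{}", prefix, if last { "   " } else { "│  " });
        walk(rows, row.id, &child_prefix, out);
    }
}
```

Each node yields exactly one line, which is what keeps the output panel's line count equal to the node count.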

Verbatim text is a deliberate pedagogical choice: it
teaches the real vocabulary a learner meets in every
database tool. The `detail` strings name no engine
*product*, so the ADR-0002 "no engine name in user-facing
strings" rule is satisfied as-is.

The block is preceded by the usual command echo, and the
SQL being explained is shown above the tree — seeing the
generated SQL beside its plan is itself part of the
simple → advanced bridge.

The displayed SQL is **standard SQL**, rendered to read as
a complete, copy-pasteable query: identifiers are
double-quoted (the ISO delimited-identifier form); `WHERE`
literals are shown *inline* (`WHERE "Email" =
'alice@example.com'`) rather than as the `?` placeholders
the statement is actually prepared with (§2 — execution
and plan capture keep the parameters); and inequality is
written `<>` even when the user typed `!=`. The one clause
the display carries that the user did not type is the
implicit `ORDER BY <pk>` that `limit` adds (ADR-0026 §5)
— itself a worthwhile lesson.
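
The display rules above can be sketched as three small helpers. All names here (`Literal`, `quote_ident`, `render_literal`, `display_operator`, `display_comparison`) are hypothetical; the sketch covers a single comparison only and is used for display, never for execution.

```rust
/// Literal values the display form can inline (illustrative subset).
pub enum Literal {
    Integer(i64),
    Text(String),
}

/// ISO delimited identifier: wrap in double quotes and double any
/// embedded double quote.
pub fn quote_ident(name: &str) -> String {
    format!("\"{}\"", name.replace('"', "\"\""))
}

/// Standard SQL literal: text is single-quoted with embedded single
/// quotes doubled; integers are written as-is.
pub fn render_literal(lit: &Literal) -> String {
    match lit {
        Literal::Integer(n) => n.to_string(),
        Literal::Text(s) => format!("'{}'", s.replace('\'', "''")),
    }
}

/// Inequality is always displayed as `<>`, even if the user typed `!=`.
pub fn display_operator(op: &str) -> &str {
    if op == "!=" { "<>" } else { op }
}

/// Display form of one comparison, with the literal shown inline.
pub fn display_comparison(column: &str, op: &str, lit: &Literal) -> String {
    format!("{} {} {}", quote_ident(column), display_operator(op), render_literal(lit))
}
```

For example, `display_comparison("Email", "=", &Literal::Text("alice@example.com".to_string()))` yields `"Email" = 'alice@example.com'`, while the statement that is actually prepared keeps its `?` placeholder.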

### 4. Annotation taxonomy

Each node's `detail` string is classified, by matching it
against a small table of substring patterns, into one of:

| Category | Recognised by | Reading |
|---|---|---|
| Full scan | `SCAN <t>`, no index | every row read — expensive |
| Index search | `SEARCH … USING INDEX …` | indexed lookup — efficient |
| Covering index | `USING COVERING INDEX` | indexed, no table fetch |
| Primary-key lookup | `USING INTEGER PRIMARY KEY` | direct row lookup |
| Automatic index | `USING AUTOMATIC … INDEX` | the engine built a *temporary* index because none existed — the strongest "add an index here" signal |
| Temp B-tree | `USE TEMP B-TREE FOR …` | sorting / grouping with no index to lean on |
| Neutral | anything else / structural | — |

A `detail` that matches no pattern renders neutral rather
than failing — the engine's plan vocabulary may grow.

The **automatic index** category is the most important
teaching moment: it is the case where the learner *should*
have added an index and the engine quietly compensated. It
is called out distinctly, not folded in with plain scans.
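
One possible shape for the pattern table, as a sketch. `PlanCategory`, `PATTERNS`, and `classify` are hypothetical names, and the ordering (most specific pattern first, automatic index before covering index) is an assumption of this sketch rather than something the ADR fixes.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PlanCategory {
    FullScan,
    IndexSearch,
    CoveringIndex,
    PrimaryKeyLookup,
    AutomaticIndex,
    TempBTree,
    Neutral,
}

/// (required prefix, required substring, category). An empty prefix matches
/// any detail string. Order matters: the first matching row wins, so the
/// more specific patterns are listed first.
const PATTERNS: &[(&str, &str, PlanCategory)] = &[
    ("", "USING AUTOMATIC", PlanCategory::AutomaticIndex),
    ("", "USE TEMP B-TREE FOR", PlanCategory::TempBTree),
    ("", "USING COVERING INDEX", PlanCategory::CoveringIndex),
    ("", "USING INTEGER PRIMARY KEY", PlanCategory::PrimaryKeyLookup),
    ("SEARCH", "USING INDEX", PlanCategory::IndexSearch),
];

/// Classify one verbatim `detail` string. Unknown text is `Neutral`, never
/// an error, so new plan vocabulary degrades gracefully.
pub fn classify(detail: &str) -> PlanCategory {
    for (prefix, needle, category) in PATTERNS {
        if detail.starts_with(prefix) && detail.contains(needle) {
            return *category;
        }
    }
    // "SCAN <t>, no index": a scan whose detail names no access path.
    if detail.starts_with("SCAN ") && !detail.contains(" USING ") {
        return PlanCategory::FullScan;
    }
    PlanCategory::Neutral
}
```

Keeping the full-scan test outside the table is a choice made here because it is the only rule with a negative condition.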

### 5. Styled output lines (the rendering mechanism)

Span-level colour needs the output panel to colour *parts*
of a line. Today an `OutputLine` is
`{ text: String, kind: OutputKind }` and
`render_output_line` colours the whole line by `kind` —
except simple-mode echo lines, which already render
multi-span from re-lexed token runs
(`input_render::lex_to_runs` → runs of
`{ byte_range, style }`).

This ADR adds, to `OutputLine`, an optional **styled-runs**
payload — a list of `{ byte_range, style-class }` over the
line's text, the same shape `lex_to_runs` already
produces. `render_output_line` gains one branch: when the
payload is present, render the text as spans per the runs;
otherwise fall back to the existing whole-line `kind`
styling. The echo path is unchanged.

The runs carry a *semantic style class*, resolved to a
concrete colour at render time from the active theme — not
a baked-in colour — so the styling stays correct
regardless of theme.
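
A rough sketch of the payload and the extra branch follows. The field name `styled_runs`, the `StyledRun`, `StyleClass`, and `Span` types, and the `OutputKind` variants are assumptions made for illustration; the real `render_output_line` produces terminal spans, which are modelled here as plain borrowed slices. The sketch also assumes runs are sorted and non-overlapping, and that text outside any run renders with a neutral class.

```rust
use std::ops::Range;

/// Semantic style classes; a theme maps these to colours at render time.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum StyleClass {
    Neutral,
    Efficient,
    Expensive,
}

#[derive(Debug, Clone, PartialEq)]
pub struct StyledRun {
    pub byte_range: Range<usize>,
    pub class: StyleClass,
}

/// Placeholder variants; the real enum has its own set.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum OutputKind {
    System,
    Error,
}

pub struct OutputLine {
    pub text: String,
    pub kind: OutputKind,
    /// `None` keeps the existing whole-line styling.
    pub styled_runs: Option<Vec<StyledRun>>,
}

/// What the renderer would draw: a slice of the text and how to style it.
#[derive(Debug, PartialEq)]
pub enum Span<'a> {
    Kind(&'a str, OutputKind),
    Class(&'a str, StyleClass),
}

pub fn render_output_line(line: &OutputLine) -> Vec<Span<'_>> {
    let text = line.text.as_str();
    let runs = match &line.styled_runs {
        // Existing behaviour: one span, coloured by the line's kind.
        None => return vec![Span::Kind(text, line.kind)],
        Some(runs) => runs,
    };
    let mut spans = Vec::new();
    let mut cursor = 0;
    for run in runs {
        let (start, end) = (run.byte_range.start, run.byte_range.end);
        // Skip empty, overlapping, out-of-range, or mid-character runs.
        if start < cursor || start >= end || end > text.len() {
            continue;
        }
        if !text.is_char_boundary(start) || !text.is_char_boundary(end) {
            continue;
        }
        if start > cursor {
            spans.push(Span::Class(&text[cursor..start], StyleClass::Neutral));
        }
        spans.push(Span::Class(&text[start..end], run.class));
        cursor = end;
    }
    if cursor < text.len() {
        spans.push(Span::Class(&text[cursor..], StyleClass::Neutral));
    }
    spans
}
```

Ranges are byte offsets, so the box-drawing connectors (three bytes each in UTF-8) shift where a keyword starts; the boundary checks keep a bad range from panicking on a slice.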

This is a **general** capability: any output line may now
carry rich styling. The plan renderer is its first and,
for now, only consumer; the existing renderers
(`render_structure`, `render_data_table`) keep producing
plain lines. The mechanism is the per-span styling
ADR-0016 anticipated, and V4's session-log work will reuse
it — the same "general mechanism, single current consumer"
shape as ADR-0027's diagnostics model.

Scroll math is unaffected: one display row per plan node;
styled spans do not change the line count.

### 6. Colour scheme

Beyond neutral text the plan needs an "efficient" colour
and an "expensive" colour:

- **Efficient** — index search, covering index, primary-key
  lookup — green.
- **Expensive** — full scan, temp B-tree — amber, reusing
  the `warning` colour introduced by ADR-0027.
- **Automatic index** — amber as well, but with a distinct
  marker (an icon or short tag) so it reads as "you should
  add an index", not merely "this is slow".
- Connector glyphs and table / index *names* stay neutral;
  only the category-bearing keywords of the `detail` string
  carry the category colour.

The efficient colour is a plan-specific green, distinct
from `theme.system`'s green (the existing "normal output"
colour), so that green does not mean two things. The exact
theme fields are an implementation detail; the requirement
is that the scheme is legible on both light and dark
backgrounds (NFR-5, NFR-7).
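
The mapping in this section could look like the sketch below. Everything in it is illustrative: the `[add index]` tag text, the `plan_efficient` field name, and the `Rgb` type are assumptions, and no concrete colour values are implied.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PlanCategory {
    FullScan,
    IndexSearch,
    CoveringIndex,
    PrimaryKeyLookup,
    AutomaticIndex,
    TempBTree,
    Neutral,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum StyleClass {
    Neutral,
    Efficient,
    Expensive,
}

/// Hypothetical short tag that sets an automatic index apart from a
/// merely slow step.
pub const ADD_INDEX_TAG: &str = "[add index]";

/// Category to (style class, optional marker).
pub fn style_for(category: PlanCategory) -> (StyleClass, Option<&'static str>) {
    match category {
        PlanCategory::IndexSearch
        | PlanCategory::CoveringIndex
        | PlanCategory::PrimaryKeyLookup => (StyleClass::Efficient, None),
        PlanCategory::FullScan | PlanCategory::TempBTree => (StyleClass::Expensive, None),
        // Same colour as other expensive steps, plus the distinct marker.
        PlanCategory::AutomaticIndex => (StyleClass::Expensive, Some(ADD_INDEX_TAG)),
        PlanCategory::Neutral => (StyleClass::Neutral, None),
    }
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Rgb(pub u8, pub u8, pub u8);

/// Only the fields this sketch needs; field names are assumptions.
pub struct Theme {
    pub text: Rgb,
    pub plan_efficient: Rgb,
    pub warning: Rgb,
}

impl Theme {
    /// Resolve a semantic class to a concrete colour at render time.
    pub fn resolve(&self, class: StyleClass) -> Rgb {
        match class {
            StyleClass::Neutral => self.text,
            StyleClass::Efficient => self.plan_efficient,
            StyleClass::Expensive => self.warning,
        }
    }
}
```

Resolving through the theme at render time means a light and a dark theme can each supply their own values for the same three classes.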

### 7. Out of scope

- **Explaining raw advanced-mode SQL.** There is no SQL
  parser yet (`Q1`); `explain` covers the simple-mode DSL
  queries. When SQL parsing lands, `explain` extends to it.
- **`EXPLAIN`** (the bytecode form) — only
  `EXPLAIN QUERY PLAN`. The bytecode dump is not a teaching
  surface.
- **Cost estimates / row-count predictions** —
  `EXPLAIN QUERY PLAN` does not provide them and this ADR
  does not invent them.
- **Re-styling existing output.** The styled-line mechanism
  (§5) is available to all output, but this ADR only wires
  the plan renderer to it; `render_structure` /
  `render_data_table` are untouched.
- **A plan history, or multiple plan tabs.**

## Consequences

- A new `explain` command — `Command::Explain`,
  `Request::ExplainPlan` + `do_explain_plan`, a `QueryPlan`
  result type, and `CommandOutcome` / `AppEvent` variants.
- `explain` covers `show data` / `update` / `delete`;
  because `EXPLAIN QUERY PLAN` never executes, explaining a
  destructive command is safe.
- The SQL-construction step for the explainable commands is
  separated from execution, so the same SQL feeds both real
  execution and `EXPLAIN QUERY PLAN`.
- `OutputLine` gains an optional styled-runs payload, and
  `render_output_line` a branch to honour it — a general
  per-span output-styling capability (ADR-0016's OOS-3
  realized), with the plan renderer as its first consumer.
- A new `render_explain_plan` in `output_render.rs`
  producing the styled tree; a small substring-pattern
  table for the annotation taxonomy.
- Theme gains plan colours (an efficient colour distinct
  from `system`; `warning` reused for expensive).
- Depends on ADR-0026: a plan that flips between a scan and
  an index search needs `show data … where`. The feature
  works against whatever queries exist; it is fully
  realised once C5a is implemented.
- Builds toward `Q1`: when advanced-mode SQL lands,
  `explain` extends to cover it.

## Implementation notes

A sensible order, each step test-guarded:

1. The styled-output-line mechanism — the `OutputLine`
   styled-runs payload and the `render_output_line` branch.
   No user-visible change on its own.
2. The `explain` grammar (the prefix plus the
   `Subgrammar`-referenced query shapes) and
   `Command::Explain` with its AST builder.
3. Separating SQL construction from execution for
   `show data` / `update` / `delete`; `Request::ExplainPlan`
   / `do_explain_plan`; the `QueryPlan` result and its
   `CommandOutcome` / `AppEvent` wiring.
4. `render_explain_plan` — the tree layout, the annotation
   taxonomy, and the styled runs; the theme colours.
5. Typing-surface matrix cells for `explain`.

## See also

- ADR-0002 — database engine; the "no engine name in
  user-facing strings" rule (plan `detail` strings name no
  product, so verbatim text complies).
- ADR-0016 — pretty table rendering; its OOS-3 anticipated
  the per-span output styling §5 realizes.
- ADR-0024 — the unified grammar tree the `explain` command
  plugs into.
- ADR-0025 — indexes; what makes a query plan
  pedagogically interesting.
- ADR-0026 — complex WHERE expressions; the filtered query
  worth explaining, and the `Subgrammar` node `explain`
  reuses to reference the query grammars.
- ADR-0027 — the diagnostics model; the same "general
  mechanism, single current consumer" shape as §5.