docs: ADR-0030 — advanced mode standard-SQL surface
Decides the architecture for SQL in advanced mode (Q1/Q2/Q4): SQL is authored as grammar within the unified grammar tree (ADR-0024) and parsed by the existing walker — not a separate batch parser — so SQL gets the same completion, highlighting, hints, and parse-error reporting as the DSL. Mode gates the SQL forms. DDL routes through the typed Command executor (metadata and the playground type vocabulary preserved); DML and SELECT execute as validated SQL. Engine-neutral posture; DSL→SQL teaching echo; phased plan. Supersedes ADR-0001's sqlparser-rs reservation. Ticks Q4; updates the ADR index and the Q1/Q2 notes. handoff-24 orients the implementation session at Phase 1.
@@ -0,0 +1,382 @@
# ADR-0030: Advanced mode — the standard-SQL surface

## Status

Accepted

## Context

ADR-0003 split the input field into two modes. **Simple mode**
(the default) takes the teaching DSL; **advanced mode** was
specified to take "raw SQL, including DDL and queries". The DSL
half is fully built (ADR-0009, ADR-0023/0024, and everything
since); advanced mode is still a **placeholder**: a submitted
line is echoed back unexecuted.

Requirement `Q1` commits to a *defined* SQL subset and `Q2` to
rejecting out-of-subset syntax clearly; `Q4` is the subset
specification, which is this ADR. Two constraints shape every
decision below; both come from how this project already works.

1. **The engine is an implementation detail.** ADR-0002
   established that the database product is never named in
   user-facing strings. Advanced mode must *extend* that
   posture: it is a way to work with **standard SQL**, as
   independent of the storage engine as we can make it, not a
   console onto the engine. The engine's type names, its
   `STRICT` keyword, its dialect quirks, and its raw error text
   must not surface. And handing typed text straight to the
   engine would bypass the typed executor that keeps the
   internal metadata tables (ADR-0012/0013) in sync, writes
   `project.yaml` + CSV (ADR-0015), and preserves the
   playground's rich type vocabulary (ADR-0005).

2. **Assistance comes from one place.** Completion, syntax
   highlighting, hint-panel prose, the `[ERR]`/`[WRN]`
   indicator, and per-command parse-error usage all derive
   from a single **unified grammar tree** walked incrementally
   (ADR-0022/0023/0024, explicitly "the single source of
   truth"). A *batch* SQL parser, the kind `sqlparser-rs`
   (reserved in ADR-0001) is, produces an AST and nothing
   else: it cannot say what is valid at the cursor, so it
   cannot drive completion, highlighting, or hints. Parsing SQL
   with such a library would leave advanced mode either
   *without* the ambient assistance the DSL has, or dependent
   on a second, parallel assistance system; both are contrary
   to ADR-0023/0024.

The decision: **SQL is not parsed by a separate library. SQL
becomes additional grammar within the unified tree**, walked by
the same walker as the DSL. Advanced mode is not a different
parser; it is the same parser with more grammar unlocked.

## Decision

### 1. SQL lives in the unified grammar tree

SQL statements are authored as `CommandNode` / `Node` grammar
in the ADR-0024 tree and parsed by the existing walker. The
consequence is the whole point: completion, highlighting,
hint prose, the validity indicator, and parse-error usage
**work for SQL exactly as for the DSL, for free**, because
they are all walker outputs (§8).

`sqlparser-rs` is therefore *not* used as the parser;
ADR-0001's reservation of it is superseded. (An implementer
may retain it narrowly as a test oracle, parsing the same SQL
and comparing results, but it is not on the execution path.)

The honest cost: the supported SQL is exactly what we author
into the tree, so we are, in effect, writing a SQL grammar. This
is the project's largest single feature to date. The target is
the full teaching-relevant standard-SQL surface (§3); scope is
cut only on *demonstrated* difficulty, as a deliberate
escalation to the user, never silently.

### 2. Mode gates the grammar

There is one grammar tree. **Simple mode** exposes the DSL
subset of it; **advanced mode** additionally exposes the SQL
forms.

- Shared entry words (`create`, `drop`, `insert`, `update`,
  `delete`) carry both a DSL form and a SQL form as `Choice`
  branches under one `CommandNode` (mechanically how `add`
  already holds four sub-commands today). `select` is a new,
  SQL-only entry word.
- SQL branches are mode-tagged; the walker presents the
  DSL-only view in simple mode and the full view in advanced.
- The `:` one-shot escape unlocks the SQL view for a single
  line and `mode advanced` unlocks it persistently; both are
  unchanged from ADR-0003.
- Because the grammar *knows* a node is SQL (it is tagged,
  merely gated), a simple-mode line that matches a gated SQL
  form yields a precise hint ("this is SQL; switch with
  `mode advanced`, or prefix the line with `:`") rather than a
  generic parse error. This satisfies `M1`'s "recognised as
  SQL" promise.

The DSL stays usable in advanced mode (the superset rule):
nothing a learner already knows stops working.
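
The gating rule above can be sketched in a few lines of Rust. This is a minimal illustration only: `Mode`, `Surface`, `Branch`, the `visible` method, and `gated_sql_hint` are hypothetical names invented for the sketch, and the `CommandNode` here is a stand-in, not the ADR-0024 type.

```rust
/// The two input modes from ADR-0003.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Mode {
    Simple,
    Advanced,
}

/// Hypothetical tag carried by each grammar branch.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Surface {
    Dsl,
    Sql,
}

/// Stand-in for one `Choice` branch under an entry word.
#[derive(Clone, Debug)]
pub struct Branch {
    pub label: &'static str,
    pub surface: Surface,
}

/// Stand-in for a command node; the real type carries far more.
#[derive(Clone, Debug)]
pub struct CommandNode {
    pub entry_word: &'static str,
    pub branches: Vec<Branch>,
}

impl CommandNode {
    /// Simple mode sees only DSL branches; advanced mode sees all of them,
    /// so the DSL stays usable in advanced mode (the superset rule).
    pub fn visible(&self, mode: Mode) -> Vec<&Branch> {
        self.branches
            .iter()
            .filter(|b| mode == Mode::Advanced || b.surface == Surface::Dsl)
            .collect()
    }
}

/// A line that matched a gated SQL branch in simple mode gets a precise hint
/// instead of a generic parse error.
pub fn gated_sql_hint(mode: Mode, matched: Surface) -> Option<&'static str> {
    match (mode, matched) {
        (Mode::Simple, Surface::Sql) => {
            Some("this is SQL; switch with `mode advanced`, or prefix the line with `:`")
        }
        _ => None,
    }
}

/// Example node: the shared entry word `create` with one branch per surface.
pub fn example_create_node() -> CommandNode {
    CommandNode {
        entry_word: "create",
        branches: vec![
            Branch { label: "DSL create table", surface: Surface::Dsl },
            Branch { label: "SQL CREATE TABLE", surface: Surface::Sql },
        ],
    }
}
```

With this shape, the `:` one-shot escape would amount to walking a single line with `Mode::Advanced`, which is one plausible reading of "unlock the SQL view for a line".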

### 3. The supported SQL surface (`Q4`)

The target is the teaching-relevant standard-SQL surface,
authored into the tree with **no pre-emptive cuts**:

- **`SELECT`**: the full query surface, covering projection,
  `WHERE`, inner/outer `JOIN`s, `GROUP BY` / `HAVING`, aggregate
  functions, `ORDER BY`, `LIMIT` / `OFFSET`, scalar and
  correlated subqueries, `UNION` / `INTERSECT` / `EXCEPT`, and
  common table expressions (`WITH`).
- **`INSERT`** (single- and multi-row), **`UPDATE`**,
  **`DELETE`**.
- **`CREATE` / `DROP` / `ALTER TABLE`**,
  **`CREATE` / `DROP INDEX`**.
- A **SQL expression grammar** (arithmetic, function calls,
  `CASE`, the comparison / `LIKE` / `IN` / `BETWEEN` /
  `IS NULL` predicate set, subquery expressions): the superset
  of ADR-0026's `WHERE` grammar, shared by `WHERE`, `HAVING`,
  `CHECK`, `SELECT` projections, and `DEFAULT`.

Out of the surface: views, triggers, transaction control
(`BEGIN`/`COMMIT`/…), `PRAGMA`, `ATTACH`/`DETACH`, `VACUUM`,
virtual tables, multi-statement batches. One statement per
submission; a trailing `;` is tolerated.

The **SQL expression grammar** and the **full `SELECT`
grammar** are each large enough to warrant their own focused
ADR when implemented; the precedent is ADR-0026 for the
`WHERE` grammar. ADR-0030 fixes the *architecture*; those
ADRs fix the detailed grammar.

### 4. Execution: DDL through `Command`, DML and `SELECT` as validated SQL

The walker parsing a SQL statement yields a matched parse.
From it:

- **DDL** → a `Command` (`CreateTable`, `DropTable`,
  `AddColumn`, `AddConstraint`, `AddIndex`, …). DDL *must* run
  through the typed executor, because that is what keeps the
  metadata tables, the playground type vocabulary, and `STRICT`
  intact. The `CommandNode`'s `ast_builder` is the SQL →
  `Command` translator.
- **DML and `SELECT`** → executed as the **validated SQL
  itself** (re-rendered canonically from the matched parse, or
  the validated original text). They change no schema, so
  modelling their structure as a typed `Command` buys nothing
  (§6's `Command::Select` only carries the validated SQL). For
  DML the worker, knowing the statement kind and target table
  from the parse, runs the statement and re-persists that
  table's CSV; `SELECT` is read-only, so it is simply run and
  rendered (§6).

This split is also what makes advanced mode genuinely *full*.
Because DML / `SELECT` / `CHECK` expressions are **not**
lowered into the DSL's deliberately-limited `Expr` (ADR-0026),
advanced mode delivers the full SQL expression surface
(arithmetic, functions, subqueries, nested boolean operands)
that `docs/simple-mode-limitations.md` records as the inverse
of the simple-mode subset. The DSL `Expr` is the *DSL's*
representation; the SQL surface does not round-trip through it.
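
As a rough sketch of the execution split in this section, the routing decision could look like the following. Everything here is assumed for illustration: `StatementKind`, `MatchedSql`, `Route`, and `route` are hypothetical names, and the real walker output and worker requests will differ.

```rust
/// Statement kinds a matched SQL parse might report (illustrative names).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum StatementKind {
    CreateTable,
    DropTable,
    AlterTable,
    CreateIndex,
    DropIndex,
    Insert,
    Update,
    Delete,
    Select,
}

/// Hypothetical summary of what the walker matched.
#[derive(Clone, Debug)]
pub struct MatchedSql {
    pub kind: StatementKind,
    /// Target table for DML; `None` for `SELECT`.
    pub target_table: Option<String>,
    /// The validated SQL text (canonical re-render or validated original).
    pub sql: String,
}

/// Where a matched statement goes next.
#[derive(Clone, Debug, PartialEq, Eq)]
pub enum Route {
    /// DDL: build a typed `Command` and run it through the typed executor.
    TypedCommand(StatementKind),
    /// DML: run the validated SQL, then re-persist this table's CSV.
    ValidatedDml { sql: String, repersist_table: String },
    /// `SELECT`: run the validated SQL read-only and render the rows.
    ReadOnlySelect { sql: String },
}

pub fn route(parsed: &MatchedSql) -> Result<Route, String> {
    match parsed.kind {
        StatementKind::CreateTable
        | StatementKind::DropTable
        | StatementKind::AlterTable
        | StatementKind::CreateIndex
        | StatementKind::DropIndex => Ok(Route::TypedCommand(parsed.kind)),
        StatementKind::Insert | StatementKind::Update | StatementKind::Delete => {
            match &parsed.target_table {
                Some(table) => Ok(Route::ValidatedDml {
                    sql: parsed.sql.clone(),
                    repersist_table: table.clone(),
                }),
                // The re-persist step needs to know which CSV to rewrite.
                None => Err("DML parse did not report a target table".to_string()),
            }
        }
        StatementKind::Select => Ok(Route::ReadOnlySelect { sql: parsed.sql.clone() }),
    }
}
```

The point of the sketch is that only the DDL arm needs a structured translation; the other two arms pass the validated text through unchanged.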

### 5. Type vocabulary: the playground's, not the engine's

Advanced-mode DDL uses the playground's own ten-type
vocabulary (ADR-0005). There is **no fallback to engine
storage types**: a column created in advanced mode is a
first-class `serial` / `decimal` / `date` / … exactly as a
DSL-created one, with the same metadata row.

The type-name slot accepts the playground keywords directly
(`text`, `int`, `real`, `decimal`, `bool`, `date`,
`datetime`, `blob`, `serial`, `shortid`) and standard-SQL
aliases that map onto them:

- `integer` / `smallint` / `bigint` → `int`
- `varchar` / `char` → `text`
- `boolean` → `bool`
- `timestamp` → `datetime`
- `numeric` → `decimal`
- `float` / `double precision` → `real`
- `binary` / `varbinary` → `blob`

A length / precision argument (`varchar(255)`) is accepted
and ignored, because the playground's types are
unparameterised. The engine's own type names are an internal
mapping and are neither accepted as input nor shown.
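
The alias mapping in this section is small enough to sketch directly. The following is an illustration under assumptions, not the project's implementation: `PlaygroundType` and `resolve_type_name` are hypothetical names, case-insensitive matching is assumed rather than stated by this ADR, and the sketch leaves parenthesis validation to the grammar.

```rust
/// The ten playground types (ADR-0005); the enum itself is illustrative.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum PlaygroundType {
    Text,
    Int,
    Real,
    Decimal,
    Bool,
    Date,
    DateTime,
    Blob,
    Serial,
    ShortId,
}

/// Resolve a typed type name (keyword or standard-SQL alias) to a playground
/// type. Returns `None` for anything outside the accepted vocabulary.
pub fn resolve_type_name(raw: &str) -> Option<PlaygroundType> {
    // Accept and ignore a length / precision argument such as `(255)`.
    let base = match raw.find('(') {
        Some(idx) => &raw[..idx],
        None => raw,
    };
    // Normalise case and inner whitespace so `DOUBLE   PRECISION` matches.
    let name = base
        .split_whitespace()
        .collect::<Vec<_>>()
        .join(" ")
        .to_ascii_lowercase();
    let ty = match name.as_str() {
        "text" | "varchar" | "char" => PlaygroundType::Text,
        "int" | "integer" | "smallint" | "bigint" => PlaygroundType::Int,
        "real" | "float" | "double precision" => PlaygroundType::Real,
        "decimal" | "numeric" => PlaygroundType::Decimal,
        "bool" | "boolean" => PlaygroundType::Bool,
        "date" => PlaygroundType::Date,
        "datetime" | "timestamp" => PlaygroundType::DateTime,
        "blob" | "binary" | "varbinary" => PlaygroundType::Blob,
        "serial" => PlaygroundType::Serial,
        "shortid" => PlaygroundType::ShortId,
        _ => return None,
    };
    Some(ty)
}
```

For example, `resolve_type_name("varchar(255)")` yields `Some(PlaygroundType::Text)`, while a name outside the vocabulary yields `None` and would surface as an ordinary parse error at the type-name slot.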

### 6. `SELECT`: the read-only query path

`SELECT` touches no metadata, no persistence, no types. It is
carried as `Command::Select` holding the validated SQL; the
worker (`Request::RunSelect`) prepares and runs it, producing
the existing `DataResult`, which renders through the existing
data-table renderer (the one `show data` uses, ADR-0016).
Columns that carry no playground type (computed expressions,
joined columns) render with neutral alignment; the result is
capped like `show data`, with `LIMIT` suggested for large
outputs. A reference to an internal `__rdbms_*` table is
rejected by the grammar (those tables are not in scope).
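
Two of the rules in this section, the internal-table rejection and the result cap, can be illustrated with a short sketch. The helper names, the case-insensitive prefix check, the note wording, and the cap being a parameter are all assumptions made for the example; the ADR only fixes the `__rdbms_` prefix and the "capped like `show data`" behaviour.

```rust
/// Prefix of the internal metadata tables named in this ADR.
pub const INTERNAL_TABLE_PREFIX: &str = "__rdbms_";

/// True when a table reference names an internal table and should be
/// rejected at the table-name slot. Case-insensitive matching is an
/// assumption of this sketch.
pub fn is_internal_table(name: &str) -> bool {
    name.to_ascii_lowercase().starts_with(INTERNAL_TABLE_PREFIX)
}

/// Cap a result set and, when rows were dropped, return a note suggesting
/// `LIMIT`. The note text is illustrative.
pub fn cap_rows<T>(mut rows: Vec<T>, cap: usize) -> (Vec<T>, Option<&'static str>) {
    if rows.len() > cap {
        rows.truncate(cap);
        (rows, Some("output capped; add LIMIT to narrow the result"))
    } else {
        (rows, None)
    }
}
```

In the design this ADR describes, the prefix check would live in the grammar's table-name slot rather than in a post-parse filter, so the rejection shows up as a normal parse error with the usual assistance.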

### 7. Engine neutrality

- **No engine type names** in or out (§5).
- **No `STRICT`**, no storage options. `STRICT` is applied
  internally by `do_create_table`; the user neither writes nor
  sees it. It is simply not part of the authored grammar, so
  typing it is an ordinary parse error, not a SQLite feature
  surfaced to the learner.
- **Engine-neutral errors.** SQL parse errors, out-of-subset
  refusals, and execution failures all route through the
  friendly-error layer (ADR-0019); the engine's raw message
  and product name never appear.
- **Honest limitation.** The grammar enforces the *structural*
  subset exactly. *Expression-level* neutrality is best-effort:
  an exotic engine-specific function the grammar admits and the
  engine then rejects surfaces an engine-neutral error rather
  than being caught up front. A function allowlist is a
  possible future hardening (§13).

### 8. Ambient assistance comes for free

Because SQL is grammar in the unified tree (§1), the walker
gives SQL the same assistance as the DSL, with no
SQL-specific assistance code:

- **Syntax highlighting** of SQL keywords, identifiers, and
  literals.
- **Tab completion** of SQL keywords, and of schema names
  (tables, columns) drawn from the same `SchemaCache` the DSL
  completion already uses.
- **Hint-panel prose** at each grammar slot.
- The **`[ERR]`/`[WRN]` validity indicator** (ADR-0027).
- **Per-command parse-error usage** (ADR-0021).

This is the reason for §1: assistance and a batch parser are
incompatible; assistance and the unified grammar tree are the
same thing.

### 9. Parse errors and the unsupported surface (`Q2`)

A construct not in the authored grammar is an ordinary walker
parse error; the ADR-0021 per-command usage machinery and the
ADR-0027 indicator apply, with engine-neutral wording. There
is no separate "valid SQL but unsupported" classifier, since
that would require the batch parser §1 dropped; the walker's
expected-set drives the message instead.
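
To make "the expected-set drives the message" concrete, here is a minimal sketch of turning an expected-set into engine-neutral wording. The function name and the exact phrasing are hypothetical; the real messages come from the ADR-0021 usage machinery and the ADR-0019 friendly-error layer.

```rust
/// Build a parse-error message from the token found (if any) and the
/// walker's expected-set at that position. Wording is illustrative.
pub fn expected_set_message(found: Option<&str>, expected: &[&str]) -> String {
    let what = match found {
        Some(tok) => format!("unexpected `{}`", tok),
        None => "unexpected end of input".to_string(),
    };
    // Sort and de-duplicate so the message is stable across runs.
    let mut options: Vec<&str> = expected.to_vec();
    options.sort_unstable();
    options.dedup();
    match options.as_slice() {
        [] => what,
        [only] => format!("{}; expected {}", what, only),
        many => format!("{}; expected one of: {}", what, many.join(", ")),
    }
}
```

Under this sketch a typed `STRICT` needs no special case: it is simply a token outside the expected-set, and the message never has to name the engine or classify the input as "valid SQL elsewhere".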

### 10. The DSL → SQL teaching bridge

When a **DSL** command runs **in advanced mode**, its output
includes the equivalent SQL, so a learner who knows the
simple-mode form can read off how to express it in SQL.

- It is a `Command` → SQL renderer: the inverse of §4's DDL
  translator.
- It fires only for commands entered via the DSL form, and
  only in advanced mode (a command the user already typed as
  SQL is not echoed back; simple mode is left uncluttered).
- It renders as a distinct, de-emphasised output line beneath
  the `[ok]` summary, using the `OutputLine` styled-runs
  mechanism (ADR-0028).
- App-level commands have no SQL form and are not echoed.
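
The firing rules for the echo can be sketched as a small function. This is an assumption-laden illustration: the `Command` enum below is a cut-down stand-in with invented field names, `EnteredAs` and `teaching_echo` are hypothetical, and identifiers are rendered unquoted for brevity.

```rust
/// Cut-down stand-in for the project's `Command`; field names are invented.
#[derive(Clone, Debug)]
pub enum Command {
    DropTable { table: String },
    AddColumn { table: String, column: String, type_name: String },
    /// Stands in for any app-level command, which has no SQL form.
    AppLevel,
}

/// Which surface the user typed the command in.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum EnteredAs {
    Dsl,
    Sql,
}

/// `Command` to SQL: the inverse direction of the DDL translator.
pub fn render_sql(cmd: &Command) -> Option<String> {
    match cmd {
        Command::DropTable { table } => Some(format!("DROP TABLE {};", table)),
        Command::AddColumn { table, column, type_name } => Some(format!(
            "ALTER TABLE {} ADD COLUMN {} {};",
            table, column, type_name
        )),
        Command::AppLevel => None,
    }
}

/// The echo fires only for DSL-entered commands, and only in advanced mode.
pub fn teaching_echo(cmd: &Command, entered_as: EnteredAs, advanced_mode: bool) -> Option<String> {
    if advanced_mode && entered_as == EnteredAs::Dsl {
        render_sql(cmd)
    } else {
        None
    }
}
```

A returned `Some(sql)` would then be wrapped in a de-emphasised `OutputLine` beneath the `[ok]` summary; that rendering step is outside the sketch.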

### 11. Persistence, metadata, history, replay

- **DDL** → `Command` → the typed executor, so `project.yaml`,
  the metadata tables, and `history.log` stay correct with no
  new code (§4).
- **DML** → the worker re-persists the affected table's CSV
  after running the statement.
- **`history.log`** records the **literal submitted line**: a
  statement typed as SQL is logged as that SQL. The replay
  format is therefore always app-enterable syntax, with no
  divergence between what is logged and what can be typed.
- **Replay** re-runs each log line through the one walker with
  the advanced view active, so a project whose history mixes
  DSL and SQL replays faithfully.
- **`project.yaml`** stays a structured schema snapshot; its
  embedded expressions (a column `CHECK`) are stored as SQL
  the user could re-enter in advanced mode: one syntax, not a
  third.

### 12. Safety in advanced mode

Advanced mode carries **fewer rails** by design. The DSL's
`WHERE`-or-`--all-rows` guard on `update`/`delete` (ADR-0014)
is a simple-mode teaching aid; a SQL `DELETE FROM t` with no
`WHERE` executes as written. The safety net is the
auto-snapshot before destructive operations (ADR-0006), which
fires regardless of which surface produced the statement; the
mode's visual distinction (ADR-0003) is the user's cue, before
anything runs, that the rails are off.

### 13. Out of scope

- **OOS-1.** `CREATE VIEW` / `TRIGGER`. Views are anticipated
  by the items panel's design (`S2`) but need their own model.
- **OOS-2.** `EXPLAIN` of advanced-mode SQL queries. The DSL
  `explain` (ADR-0028) still works for what it already wraps.
- **OOS-3.** A function/expression allowlist for full
  expression-level engine neutrality (§7); best-effort for now.
- **OOS-4.** Multi-statement batches and transaction control.
- **OOS-5.** A SQL → DSL echo (the reverse of §10).

## Consequences

- The unified grammar tree gains a large body of SQL grammar.
  The `Node` taxonomy and the walker may need extension to
  carry it (e.g. deeper recursion for subqueries / CTEs); this
  is a known risk, addressed per phase.
- `sqlparser-rs` is **not** adopted as the parser; ADR-0001's
  reservation is superseded. `Q1`'s wording ("SQL parsed via
  `sqlparser-rs`") is likewise superseded: SQL is parsed by the
  unified walker.
- `Command` gains a `Select` variant; every exhaustive
  `match Command` gains an arm (the recurring ADR-0028/0029
  gotcha).
- The `Database` worker gains a `RunSelect` request and a
  "run validated DML, re-persist the table" request; DDL
  reuses the existing typed requests unchanged.
- Mode-gating is added to the grammar / walker.
- The metadata, persistence, and type machinery is reused
  unchanged for DDL, which is the payoff of routing DDL through
  `Command`.
- This is the project's largest single feature so far. The
  phased plan keeps each step independently shippable;
  scope-cutting, if a slice proves disproportionate, is an
  explicit escalation, never a silent trim.
- `Q4` is satisfied by this ADR; `Q1` / `Q2` are unblocked and
  reframed around the unified walker; `M1` gains its
  "recognised as SQL" hint.

## Implementation notes

Phased; each phase independently shippable and test-guarded.
The two large grammar slices each warrant their own focused
ADR when taken up (ADR-0026-style).

1. **Foundations + first `SELECT`.** Mode-gate the grammar
   (advanced unlocks the SQL nodes). Author the core SQL
   **expression grammar** (the ADR-0026 superset) as its own
   ADR. A single-table `SELECT` (projection, `WHERE`,
   `ORDER BY`, `LIMIT`) as a SQL `CommandNode` →
   `Command::Select` → worker `RunSelect` → the existing
   renderer. Replace the placeholder echo; add the simple-mode
   "this is SQL" hint. This proves the path end-to-end *with
   full walker assistance*.
2. **`SELECT`, full.** `JOIN`s, `GROUP BY`/`HAVING`,
   aggregates, subqueries, `UNION`, CTEs. The big grammar
   phase, with its own ADR.
3. **DML.** `INSERT` / `UPDATE` / `DELETE` grammar; the
   execute-as-validated-SQL path; the worker re-persist step;
   settle multi-row `INSERT` and `shortid` auto-fill on a SQL
   `INSERT`.
4. **DDL.** `CREATE` / `DROP` / `ALTER TABLE`, `CREATE` /
   `DROP INDEX` grammar → `Command`; the §5 type-name map; FK
   clauses → `AddRelationship`; may land table-rename (`C1`).
5. **The DSL → SQL teaching echo** (§10).
6. **Polish.** `help sql`; an engine-neutral error sweep;
   typing-surface / matrix coverage; the `DOC1` SQL-surface
   reference page.

## See also

- ADR-0001: reserved `sqlparser-rs`; that reservation is
  superseded here (§1).
- ADR-0002: the engine is an implementation detail; "no
  engine name in user-facing strings". §7 extends it.
- ADR-0003: the simple / advanced mode model this builds on.
- ADR-0005: the ten-type vocabulary advanced DDL uses (§5).
- ADR-0009: the DSL conventions; the DSL stays usable in
  advanced mode.
- ADR-0012 / ADR-0013: the metadata tables the `Command` core
  keeps in sync, inherited for free (§4, §11).
- ADR-0014: the data-operation model and the `--all-rows`
  guard advanced mode deliberately relaxes (§12).
- ADR-0015: persistence write-through and replay, reused and
  made surface-agnostic (§11).
- ADR-0016: the data-table renderer `SELECT` results reuse
  (§6).
- ADR-0019: the friendly-error layer all SQL errors route
  through (§7, §9).
- ADR-0021: per-command parse-error usage, free for SQL (§9).
- ADR-0022: ambient typing assistance; §8 is its extension to
  SQL.
- ADR-0023 / ADR-0024: the unified grammar tree SQL becomes
  part of (§1, §2).
- ADR-0026: the `WHERE` expression grammar the SQL expression
  grammar is the superset of (§3).
- ADR-0027: the validity indicator, free for SQL (§8).
- ADR-0028: the `OutputLine` styled-runs the teaching echo
  uses (§10).
@@ -35,3 +35,4 @@ This directory contains the project's ADRs, recorded per
- [ADR-0027 — Input-field validity indicator](0027-input-validity-indicator.md) — **Accepted**, a debounced `[ERR]` / `[WRN]` marker at the input row's right edge, backed by a walker diagnostics-severity model (parse-outcome + schema-existence); advisory, never blocks submission (`S6`); Amendment 1 adds a `LIKE`-on-numeric-column WARNING
- [ADR-0028 — Query plans (`EXPLAIN QUERY PLAN`)](0028-query-plans.md) — **Accepted**, an `explain` prefix command over `show data` / `update` / `delete`; an annotated, span-styled plan tree; introduces the `OutputLine` styled-runs mechanism (ADR-0016's deferred per-span styling) (`QA1` / `QA2`)
- [ADR-0029 — Column constraints (NOT NULL / UNIQUE / CHECK / DEFAULT)](0029-column-constraints.md) — **Accepted**, the four column-level constraints declared in the column-spec suffix (`create table` / `add column`) and modified on existing columns via `add constraint …` / `drop constraint …`; a pre-flight dry-run guards populated columns; `CHECK` reuses the ADR-0026 expression grammar via `Subgrammar` (`C3`)
- [ADR-0030 — Advanced mode: the standard-SQL surface](0030-advanced-mode-sql-surface.md) — **Accepted**, SQL added as grammar *within the unified grammar tree* (ADR-0024), not a separate batch parser — so SQL gets the same completion / highlighting / hints / parse-errors as the DSL; mode gates the SQL forms; DDL routes through the typed `Command` executor (metadata + type vocabulary preserved), DML and `SELECT` execute as validated SQL; engine-neutral posture, the DSL→SQL teaching echo; supersedes ADR-0001's `sqlparser-rs` reservation; phased plan (`Q1` / `Q2` / `Q4`)