# ADR-0030: Advanced mode — the standard-SQL surface ## Status Accepted ## Context ADR-0003 split the input field into two modes. **Simple mode** (the default) takes the teaching DSL; **advanced mode** was specified to take "raw SQL, including DDL and queries". The DSL half is fully built (ADR-0009, ADR-0023/0024, and everything since); advanced mode is still a **placeholder** — a submitted line is echoed back unexecuted. Requirement `Q1` commits to a *defined* SQL subset, `Q2` to rejecting out-of-subset syntax clearly, `Q4` is the subset specification — this ADR. Two constraints shape every decision below; both come from how this project already works. 1. **The engine is an implementation detail.** ADR-0002 established that the database product is never named in user-facing strings. Advanced mode must *extend* that posture: it is a way to work with **standard SQL**, as independent of the storage engine as we can make it — not a console onto the engine. The engine's type names, its `STRICT` keyword, its dialect quirks, and its raw error text must not surface. And handing typed text straight to the engine would bypass the typed executor that keeps the internal metadata tables (ADR-0012/0013) in sync, writes `project.yaml` + CSV (ADR-0015), and preserves the playground's rich type vocabulary (ADR-0005). 2. **Assistance comes from one place.** Completion, syntax highlighting, hint-panel prose, the `[ERR]`/`[WRN]` indicator, and per-command parse-error usage all derive from a single **unified grammar tree** walked incrementally (ADR-0022/0023/0024 — explicitly "the single source of truth"). A *batch* SQL parser — the kind `sqlparser-rs` (reserved in ADR-0001) is — produces an AST and nothing else: it cannot say what is valid at the cursor, cannot drive completion, highlighting, or hints. Parsing SQL with such a library would leave advanced mode either *without* the ambient assistance the DSL has, or dependent on a second, parallel assistance system — both contrary to ADR-0023/0024. The decision: **SQL is not parsed by a separate library. SQL becomes additional grammar within the unified tree**, walked by the same walker as the DSL. Advanced mode is not a different parser — it is the same parser with more grammar unlocked. ## Decision ### 1. SQL lives in the unified grammar tree SQL statements are authored as `CommandNode` / `Node` grammar in the ADR-0024 tree and parsed by the existing walker. The consequence is the whole point: completion, highlighting, hint prose, the validity indicator, and parse-error usage **work for SQL exactly as for the DSL, for free**, because they are all walker outputs (§8). `sqlparser-rs` is therefore *not* used as the parser; ADR-0001's reservation of it is superseded. (An implementer may retain it narrowly as a test oracle — parse the same SQL, compare — but it is not on the execution path.) The honest cost: the supported SQL is exactly what we author into the tree — we are, in effect, writing a SQL grammar. This is the project's largest single feature to date. The target is the full teaching-relevant standard-SQL surface (§3); scope is cut only on *demonstrated* difficulty, as a deliberate escalation to the user, never silently. ### 2. Mode gates the grammar There is one grammar tree. **Simple mode** exposes the DSL subset of it; **advanced mode** additionally exposes the SQL forms. - Shared entry words — `create`, `drop`, `insert`, `update`, `delete` — carry both a DSL form and a SQL form as `Choice` branches under one `CommandNode` (mechanically how `add` already holds four sub-commands today). `select` is a new, SQL-only entry word. - SQL branches are mode-tagged; the walker presents the DSL-only view in simple mode and the full view in advanced. - The `:` one-shot escape and `mode advanced` unlock the SQL view for a line / persistently — unchanged from ADR-0003. - Because the grammar *knows* a node is SQL (it is tagged, merely gated), a simple-mode line that matches a gated SQL form yields a precise hint — "this is SQL; switch with `mode advanced`, or prefix the line with `:`" — rather than a generic parse error. This satisfies `M1`'s "recognised as SQL" promise. The DSL stays usable in advanced mode (the superset rule): nothing a learner already knows stops working. ### 3. The supported SQL surface (`Q4`) The target is the teaching-relevant standard-SQL surface, authored into the tree with **no pre-emptive cuts**: - **`SELECT`** — the full query surface: projection, `WHERE`, inner/outer `JOIN`s, `GROUP BY` / `HAVING`, aggregate functions, `ORDER BY`, `LIMIT` / `OFFSET`, scalar and correlated subqueries, `UNION` / `INTERSECT` / `EXCEPT`, and common table expressions (`WITH`). - **`INSERT`** (single- and multi-row), **`UPDATE`**, **`DELETE`**. - **`CREATE` / `DROP` / `ALTER TABLE`**, **`CREATE` / `DROP INDEX`**. - A **SQL expression grammar** — arithmetic, function calls, `CASE`, the comparison / `LIKE` / `IN` / `BETWEEN` / `IS NULL` predicate set, subquery expressions — the superset of ADR-0026's `WHERE` grammar, shared by `WHERE`, `HAVING`, `CHECK`, `SELECT` projections, and `DEFAULT`. Out of the surface: views, triggers, transaction control (`BEGIN`/`COMMIT`/…), `PRAGMA`, `ATTACH`/`DETACH`, `VACUUM`, virtual tables, multi-statement batches. One statement per submission; a trailing `;` is tolerated. The **SQL expression grammar** and the **full `SELECT` grammar** are each large enough to warrant their own focused ADR when implemented — the precedent is ADR-0026 for the `WHERE` grammar. ADR-0030 fixes the *architecture*; those ADRs fix the detailed grammar. ### 4. Execution — DDL through `Command`, DML and `SELECT` as validated SQL The walker parsing a SQL statement yields a matched parse. From it: - **DDL** → a `Command` (`CreateTable`, `DropTable`, `AddColumn`, `AddConstraint`, `AddIndex`, …). DDL *must* run the typed executor, because that is what keeps the metadata tables, the playground type vocabulary, and `STRICT` intact. The `CommandNode`'s `ast_builder` is the SQL → `Command` translator. - **DML and `SELECT`** → executed as the **validated SQL itself** (re-rendered canonically from the matched parse, or the validated original text). They change no schema, so modelling them as a typed `Command` buys nothing. For DML the worker — knowing the statement kind and target table from the parse — runs the statement and re-persists that table's CSV; `SELECT` is read-only, run and rendered (§6). This split is also what makes advanced mode genuinely *full*. Because DML / `SELECT` / `CHECK` expressions are **not** lowered into the DSL's deliberately-limited `Expr` (ADR-0026), advanced mode delivers the full SQL expression surface — arithmetic, functions, subqueries, nested boolean operands — that `docs/simple-mode-limitations.md` records as the inverse of the simple-mode subset. The DSL `Expr` is the *DSL's* representation; the SQL surface does not round-trip through it. ### 5. Type vocabulary — the playground's, not the engine's Advanced-mode DDL uses the playground's own ten-type vocabulary (ADR-0005). There is **no fallback to engine storage types**: a column created in advanced mode is a first-class `serial` / `decimal` / `date` / … exactly as a DSL-created one, with the same metadata row. The type-name slot accepts the playground keywords directly (`text`, `int`, `real`, `decimal`, `bool`, `date`, `datetime`, `blob`, `serial`, `shortid`) and standard-SQL aliases that map onto them — `integer`/`smallint`/`bigint` → `int`; `varchar`/`char` → `text`; `boolean` → `bool`; `timestamp` → `datetime`; `numeric` → `decimal`; `float`/`double precision` → `real`; `binary`/`varbinary` → `blob`. A length / precision argument (`varchar(255)`) is accepted and ignored — the playground's types are unparameterised. The engine's own type names are an internal mapping and are neither accepted as input nor shown. ### 6. `SELECT` — the read-only query path `SELECT` touches no metadata, no persistence, no types. It is carried as `Command::Select` holding the validated SQL; the worker (`Request::RunSelect`) prepares and runs it, producing the existing `DataResult`, which renders through the existing data-table renderer (the one `show data` uses, ADR-0016). Columns that carry no playground type — computed expressions, joined columns — render with neutral alignment; the result is capped like `show data`, with `LIMIT` suggested for large outputs. A reference to an internal `__rdbms_*` table is rejected by the grammar (those tables are not in scope). ### 7. Engine neutrality - **No engine type names** in or out (§5). - **No `STRICT`**, no storage options. `STRICT` is applied internally by `do_create_table`; the user neither writes nor sees it. It is simply not part of the authored grammar, so typing it is an ordinary parse error — not a SQLite feature surfaced to the learner. - **Engine-neutral errors.** SQL parse errors, out-of-subset refusals, and execution failures all route through the friendly-error layer (ADR-0019); the engine's raw message and product name never appear. - **Honest limitation.** The grammar enforces the *structural* subset exactly. *Expression-level* neutrality is best-effort: an exotic engine-specific function the grammar admits and the engine then rejects surfaces an engine-neutral error rather than being caught up front. A function allowlist is a possible future hardening (§13). ### 8. Ambient assistance comes for free Because SQL is grammar in the unified tree (§1), the walker gives SQL — with no SQL-specific assistance code — the same as the DSL: - **Syntax highlighting** of SQL keywords, identifiers, literals. - **Tab completion** of SQL keywords, and of schema names (tables, columns) drawn from the same `SchemaCache` the DSL completion already uses. - **Hint-panel prose** at each grammar slot. - The **`[ERR]`/`[WRN]` validity indicator** (ADR-0027). - **Per-command parse-error usage** (ADR-0021). This is the reason for §1: assistance and a batch parser are incompatible; assistance and the unified grammar tree are the same thing. ### 9. Parse errors and the unsupported surface (`Q2`) A construct not in the authored grammar is an ordinary walker parse error; the ADR-0021 per-command usage machinery and the ADR-0027 indicator apply, with engine-neutral wording. There is no separate "valid SQL but unsupported" classifier — that would require the batch parser §1 dropped; the walker's expected-set drives the message instead. ### 10. The DSL → SQL teaching bridge When a **DSL** command runs **in advanced mode**, its output includes the equivalent SQL — so a learner who knows the simple-mode form reads off how to express it in SQL. - It is a `Command` → SQL renderer: the inverse of §4's DDL translator. - It fires only for commands entered via the DSL form, and only in advanced mode (a command the user already typed as SQL is not echoed back; simple mode is left uncluttered). - It renders as a distinct, de-emphasised output line beneath the `[ok]` summary, using the `OutputLine` styled-runs mechanism (ADR-0028). - App-level commands have no SQL form and are not echoed. ### 11. Persistence, metadata, history, replay - **DDL** → `Command` → the typed executor, so `project.yaml`, the metadata tables, and `history.log` stay correct with no new code (§4). - **DML** → the worker re-persists the affected table's CSV after running the statement. - **`history.log`** records the **literal submitted line** — a statement typed as SQL is logged as that SQL. The replay format is therefore app-enterable syntax, no divergence. - **Replay** re-runs each log line through the one walker with the advanced view active, so a project whose history mixes DSL and SQL replays faithfully. - **`project.yaml`** stays a structured schema snapshot; its embedded expressions (a column `CHECK`) are stored as SQL the user could re-enter in advanced mode — one syntax, not a third. ### 12. Safety in advanced mode Advanced mode carries **fewer rails** by design. The DSL's `WHERE`-or-`--all-rows` guard on `update`/`delete` (ADR-0014) is a simple-mode teaching aid; a SQL `DELETE FROM t` with no `WHERE` executes as written. The safety net is the auto-snapshot before destructive operations (ADR-0006), which fires regardless of which surface produced the statement; the mode's visual distinction (ADR-0003) is the user's signal until then. ### 13. Out of scope - **OOS-1.** `CREATE VIEW` / `TRIGGER`. Views are anticipated by the items panel's design (`S2`) but need their own model. - **OOS-2.** `EXPLAIN` of advanced-mode SQL queries. The DSL `explain` (ADR-0028) still works for what it already wraps. - **OOS-3.** A function/expression allowlist for full expression-level engine neutrality (§7) — best-effort now. - **OOS-4.** Multi-statement batches and transaction control. - **OOS-5.** A SQL → DSL echo (the reverse of §10). ## Consequences - The unified grammar tree gains a large body of SQL grammar. The `Node` taxonomy and the walker may need extension to carry it (e.g. deeper recursion for subqueries / CTEs) — a known risk, addressed per phase. - `sqlparser-rs` is **not** adopted as the parser; ADR-0001's reservation is superseded. `Q1`'s wording ("SQL parsed via `sqlparser-rs`") is superseded — SQL is parsed by the unified walker. - `Command` gains a `Select` variant; every exhaustive `match Command` gains an arm (the recurring ADR-0028/0029 gotcha). - The `Database` worker gains a `RunSelect` request and a "run validated DML, re-persist the table" request; DDL reuses the existing typed requests unchanged. - Mode-gating is added to the grammar / walker. - The metadata, persistence, and type machinery is reused unchanged for DDL — the payoff of routing DDL through `Command`. - This is the project's largest single feature so far. The phased plan keeps each step independently shippable; scope-cutting, if a slice proves disproportionate, is an explicit escalation, never a silent trim. - `Q4` is satisfied by this ADR; `Q1` / `Q2` are unblocked and reframed around the unified walker; `M1` gains its "recognised as SQL" hint. ## Implementation notes Phased; each phase independently shippable and test-guarded. The two large grammar slices each warrant their own focused ADR when taken up (ADR-0026-style). 1. **Foundations + first `SELECT`.** Mode-gate the grammar (advanced unlocks the SQL nodes). Author the core SQL **expression grammar** — the ADR-0026 superset — as its own ADR. A single-table `SELECT` (projection, `WHERE`, `ORDER BY`, `LIMIT`) as a SQL `CommandNode` → `Command::Select` → worker `RunSelect` → the existing renderer. Replace the placeholder echo; add the simple-mode "this is SQL" hint. This proves the path end-to-end *with full walker assistance*. 2. **`SELECT` — full.** `JOIN`s, `GROUP BY`/`HAVING`, aggregates, subqueries, `UNION`, CTEs. The big grammar phase — its own ADR. 3. **DML.** `INSERT` / `UPDATE` / `DELETE` grammar; the execute-as-validated-SQL path; the worker re-persist step; settle multi-row `INSERT` and `shortid` auto-fill on a SQL `INSERT`. 4. **DDL.** `CREATE` / `DROP` / `ALTER TABLE`, `CREATE` / `DROP INDEX` grammar → `Command`; the §5 type-name map; FK clauses → `AddRelationship`; may land table-rename (`C1`). 5. **The DSL → SQL teaching echo** (§10). 6. **Polish.** `help sql`; an engine-neutral error sweep; typing-surface / matrix coverage; the `DOC1` SQL-surface reference page. ## See also - ADR-0001 — reserved `sqlparser-rs`; that reservation is superseded here (§1). - ADR-0002 — the engine is an implementation detail; "no engine name in user-facing strings" — §7 extends it. - ADR-0003 — the simple / advanced mode model this builds on. - ADR-0005 — the ten-type vocabulary advanced DDL uses (§5). - ADR-0009 — the DSL conventions; the DSL stays usable in advanced mode. - ADR-0012 / ADR-0013 — the metadata tables the `Command` core keeps in sync, inherited for free (§4, §11). - ADR-0014 — the data-operation model and the `--all-rows` guard advanced mode deliberately relaxes (§12). - ADR-0015 — persistence write-through and replay, reused and made surface-agnostic (§11). - ADR-0016 — the data-table renderer `SELECT` results reuse (§6). - ADR-0019 — the friendly-error layer all SQL errors route through (§7, §9). - ADR-0021 — per-command parse-error usage, free for SQL (§9). - ADR-0022 — ambient typing assistance; §8 is its extension to SQL. - ADR-0023 / ADR-0024 — the unified grammar tree SQL becomes part of (§1, §2). - ADR-0026 — the `WHERE` expression grammar the SQL expression grammar is the superset of (§3). - ADR-0027 — the validity indicator, free for SQL (§8). - ADR-0028 — the `OutputLine` styled-runs the teaching echo uses (§10).