ADR-0030: Advanced mode — the standard-SQL surface
Status
Accepted
Context
ADR-0003 split the input field into two modes. Simple mode (the default) takes the teaching DSL; advanced mode was specified to take "raw SQL, including DDL and queries". The DSL half is fully built (ADR-0009, ADR-0023/0024, and everything since); advanced mode is still a placeholder — a submitted line is echoed back unexecuted.
Requirement Q1 commits to a defined SQL subset and Q2 to
rejecting out-of-subset syntax clearly; Q4, the subset
specification, is this ADR. Two constraints shape every decision
below; both come from how this project already works.
- The engine is an implementation detail. ADR-0002 established that the database product is never named in user-facing strings. Advanced mode must extend that posture: it is a way to work with standard SQL, as independent of the storage engine as we can make it, not a console onto the engine. The engine's type names, its STRICT keyword, its dialect quirks, and its raw error text must not surface. And handing typed text straight to the engine would bypass the typed executor that keeps the internal metadata tables (ADR-0012/0013) in sync, writes project.yaml + CSV (ADR-0015), and preserves the playground's rich type vocabulary (ADR-0005).
- Assistance comes from one place. Completion, syntax highlighting, hint-panel prose, the [ERR]/[WRN] indicator, and per-command parse-error usage all derive from a single unified grammar tree walked incrementally (ADR-0022/0023/0024, explicitly "the single source of truth"). A batch SQL parser such as sqlparser-rs (reserved in ADR-0001) produces an AST and nothing else: it cannot say what is valid at the cursor, and cannot drive completion, highlighting, or hints. Parsing SQL with such a library would leave advanced mode either without the ambient assistance the DSL has, or dependent on a second, parallel assistance system; both are contrary to ADR-0023/0024.
The decision: SQL is not parsed by a separate library. SQL becomes additional grammar within the unified tree, walked by the same walker as the DSL. Advanced mode is not a different parser — it is the same parser with more grammar unlocked.
Decision
1. SQL lives in the unified grammar tree
SQL statements are authored as CommandNode / Node grammar
in the ADR-0024 tree and parsed by the existing walker. The
consequence is the whole point: completion, highlighting,
hint prose, the validity indicator, and parse-error usage
work for SQL exactly as for the DSL, for free, because
they are all walker outputs (§8).
sqlparser-rs is therefore not used as the parser;
ADR-0001's reservation of it is superseded. (An implementer
may retain it narrowly as a test oracle — parse the same SQL,
compare — but it is not on the execution path.)
The honest cost: the supported SQL is exactly what we author into the tree — we are, in effect, writing a SQL grammar. This is the project's largest single feature to date. The target is the full teaching-relevant standard-SQL surface (§3); scope is cut only on demonstrated difficulty, as a deliberate escalation to the user, never silently.
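To make the contrast with a batch parser concrete, here is a minimal sketch of a grammar tree walked token by token that reports what is expected at the point where input runs out. It is not the project's walker: the Node shape, the Step type, drop_grammar, and expected_next are simplified, hypothetical stand-ins, and the real CommandNode / Node taxonomy is certainly richer.

```rust
/// Hypothetical, simplified grammar node (the real taxonomy is not shown in this ADR).
#[derive(Debug)]
enum Node {
    Word(&'static str),
    Ident,
    Seq(Vec<Node>),
    Choice(Vec<Node>),
}

/// Result of walking one node starting at a token position.
enum Step {
    Matched(usize),
    Stuck { at: usize, expected: Vec<String> },
}

fn walk(node: &Node, toks: &[&str], pos: usize) -> Step {
    match node {
        Node::Word(w) => match toks.get(pos) {
            Some(t) if t.eq_ignore_ascii_case(w) => Step::Matched(pos + 1),
            _ => Step::Stuck { at: pos, expected: vec![w.to_string()] },
        },
        // Any token is accepted as a name in this sketch.
        Node::Ident => match toks.get(pos) {
            Some(_) => Step::Matched(pos + 1),
            None => Step::Stuck { at: pos, expected: vec!["<name>".to_string()] },
        },
        Node::Seq(items) => {
            let mut p = pos;
            for item in items {
                match walk(item, toks, p) {
                    Step::Matched(next) => p = next,
                    stuck => return stuck,
                }
            }
            Step::Matched(p)
        }
        Node::Choice(branches) => {
            // Keep the expectations of whichever branch got furthest.
            let mut furthest = pos;
            let mut expected = Vec::new();
            for branch in branches {
                match walk(branch, toks, pos) {
                    Step::Matched(next) => return Step::Matched(next),
                    Step::Stuck { at, expected: e } => {
                        if at > furthest {
                            furthest = at;
                            expected = e;
                        } else if at == furthest {
                            expected.extend(e);
                        }
                    }
                }
            }
            Step::Stuck { at: furthest, expected }
        }
    }
}

/// Illustrative grammar: drop (table <name> | index <name>).
fn drop_grammar() -> Node {
    Node::Seq(vec![
        Node::Word("drop"),
        Node::Choice(vec![
            Node::Seq(vec![Node::Word("table"), Node::Ident]),
            Node::Seq(vec![Node::Word("index"), Node::Ident]),
        ]),
    ])
}

/// What may come next; empty when the grammar matched.
fn expected_next(grammar: &Node, input: &str) -> Vec<String> {
    let toks: Vec<&str> = input.split_whitespace().collect();
    match walk(grammar, &toks, 0) {
        Step::Matched(_) => Vec::new(),
        Step::Stuck { expected, .. } => expected,
    }
}
```

The same expected set can feed completion candidates, hint prose, and parse-error wording, which is the property an AST-only parser does not offer.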
2. Mode gates the grammar
There is one grammar tree. Simple mode exposes the DSL subset of it; advanced mode additionally exposes the SQL forms.
- Shared entry words (create, drop, insert, update, delete) carry both a DSL form and a SQL form as Choice branches under one CommandNode (mechanically how add already holds four sub-commands today). select is a new, SQL-only entry word.
- SQL branches are mode-tagged; the walker presents the DSL-only view in simple mode and the full view in advanced.
- The : one-shot escape and mode advanced unlock the SQL view for one line or persistently, respectively; both are unchanged from ADR-0003.
- Because the grammar knows a node is SQL (it is tagged, merely gated), a simple-mode line that matches a gated SQL form yields a precise hint ("this is SQL; switch with mode advanced, or prefix the line with :") rather than a generic parse error. This satisfies M1's "recognised as SQL" promise.
The DSL stays usable in advanced mode (the superset rule): nothing a learner already knows stops working.
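The gating rules above can be sketched as two small functions. This is an illustration under assumptions, not the walker's actual interface: Mode, Surface, Gate, effective_mode, and gate are hypothetical names, and the hint text is copied from this section.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Mode {
    Simple,
    Advanced,
}

/// Tag carried by a grammar branch (hypothetical representation).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Surface {
    Dsl,
    Sql,
}

#[derive(Debug, PartialEq, Eq)]
enum Gate {
    Visible,
    /// The branch is SQL but the current view hides it.
    GatedSql(&'static str),
}

const SQL_HINT: &str = "this is SQL; switch with mode advanced, or prefix the line with :";

/// A leading ':' unlocks the SQL view for this one line only.
fn effective_mode(mode: Mode, line: &str) -> (Mode, &str) {
    match line.trim_start().strip_prefix(':') {
        Some(rest) => (Mode::Advanced, rest.trim_start()),
        None => (mode, line),
    }
}

/// DSL branches are always visible (the superset rule); SQL branches
/// are visible only in the advanced view.
fn gate(branch: Surface, mode: Mode) -> Gate {
    match (branch, mode) {
        (Surface::Sql, Mode::Simple) => Gate::GatedSql(SQL_HINT),
        _ => Gate::Visible,
    }
}
```

Keeping the gated branch in the tree, rather than removing it in simple mode, is what lets a match against it produce the specific hint instead of a generic parse error.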
3. The supported SQL surface (Q4)
The target is the teaching-relevant standard-SQL surface, authored into the tree with no pre-emptive cuts:
- SELECT: the full query surface, i.e. projection, WHERE, inner/outer JOINs, GROUP BY / HAVING, aggregate functions, ORDER BY, LIMIT / OFFSET, scalar and correlated subqueries, UNION / INTERSECT / EXCEPT, and common table expressions (WITH).
- INSERT (single- and multi-row), UPDATE, DELETE.
- CREATE / DROP / ALTER TABLE, CREATE / DROP INDEX.
- A SQL expression grammar (arithmetic, function calls, CASE, the comparison / LIKE / IN / BETWEEN / IS NULL predicate set, subquery expressions): the superset of ADR-0026's WHERE grammar, shared by WHERE, HAVING, CHECK, SELECT projections, and DEFAULT.
Out of the surface: views, triggers, transaction control
(BEGIN/COMMIT/…), PRAGMA, ATTACH/DETACH, VACUUM,
virtual tables, multi-statement batches. One statement per
submission; a trailing ; is tolerated.
The SQL expression grammar and the full SELECT
grammar are each large enough to warrant their own focused
ADR when implemented — the precedent is ADR-0026 for the
WHERE grammar. ADR-0030 fixes the architecture; those
ADRs fix the detailed grammar.
4. Execution — DDL through Command, DML and SELECT as validated SQL
The walker parsing a SQL statement yields a matched parse. From it:
- DDL → a Command (CreateTable, DropTable, AddColumn, AddConstraint, AddIndex, …). DDL must run the typed executor, because that is what keeps the metadata tables, the playground type vocabulary, and STRICT intact. The CommandNode's ast_builder is the SQL → Command translator.
- DML and SELECT → executed as the validated SQL itself (re-rendered canonically from the matched parse, or the validated original text). They change no schema, so modelling them as a typed Command buys nothing. For DML the worker, knowing the statement kind and target table from the parse, runs the statement and re-persists that table's CSV; SELECT is read-only, run and rendered (§6).
This split is also what makes advanced mode genuinely full.
Because DML / SELECT / CHECK expressions are not
lowered into the DSL's deliberately-limited Expr (ADR-0026),
advanced mode delivers the full SQL expression surface —
arithmetic, functions, subqueries, nested boolean operands —
that docs/simple-mode-limitations.md records as the inverse
of the simple-mode subset. The DSL Expr is the DSL's
representation; the SQL surface does not round-trip through it.
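The execution split can be summarised as a routing function over the statement kind. The sketch below is illustrative only: StatementKind, Route, and route are hypothetical names, not the project's types.

```rust
/// Statement kinds the matched parse can identify (hypothetical enum).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum StatementKind {
    CreateTable,
    DropTable,
    AlterTable,
    CreateIndex,
    DropIndex,
    Insert,
    Update,
    Delete,
    Select,
}

#[derive(Debug, PartialEq, Eq)]
enum Route {
    /// Translated to a typed Command so metadata and types stay in sync.
    TypedCommand,
    /// Run as the validated SQL; DML also re-persists the table's CSV.
    ValidatedSql { repersist_csv: bool },
}

fn route(kind: StatementKind) -> Route {
    use StatementKind::*;
    match kind {
        CreateTable | DropTable | AlterTable | CreateIndex | DropIndex => Route::TypedCommand,
        Insert | Update | Delete => Route::ValidatedSql { repersist_csv: true },
        Select => Route::ValidatedSql { repersist_csv: false },
    }
}
```

Because the match is exhaustive, adding a statement kind later forces an explicit decision about which path it takes.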
5. Type vocabulary — the playground's, not the engine's
Advanced-mode DDL uses the playground's own ten-type
vocabulary (ADR-0005). There is no fallback to engine
storage types: a column created in advanced mode is a
first-class serial / decimal / date / … exactly as a
DSL-created one, with the same metadata row.
The type-name slot accepts the playground keywords directly
(text, int, real, decimal, bool, date,
datetime, blob, serial, shortid) and standard-SQL
aliases that map onto them — integer/smallint/bigint →
int; varchar/char → text; boolean → bool;
timestamp → datetime; numeric → decimal;
float/double precision → real; binary/varbinary →
blob. A length / precision argument (varchar(255)) is
accepted and ignored — the playground's types are
unparameterised. The engine's own type names are an internal
mapping and are neither accepted as input nor shown.
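The alias table above translates directly into a lookup. This is a minimal sketch, assuming type names are matched case-insensitively and that a parenthesised argument is simply cut off without validating its contents; PlaygroundType and resolve_type_name are hypothetical names.

```rust
/// The ten playground types (ADR-0005); the enum itself is illustrative.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PlaygroundType {
    Text,
    Int,
    Real,
    Decimal,
    Bool,
    Date,
    DateTime,
    Blob,
    Serial,
    ShortId,
}

/// Map a type-name slot to a playground type, or None if it is not accepted.
fn resolve_type_name(raw: &str) -> Option<PlaygroundType> {
    use PlaygroundType::*;
    // A length / precision argument such as "(255)" is accepted and ignored.
    let base = raw.split('(').next().unwrap_or("");
    // Normalise case and inner whitespace so "DOUBLE  PRECISION" matches.
    let name = base
        .split_whitespace()
        .collect::<Vec<_>>()
        .join(" ")
        .to_ascii_lowercase();
    Some(match name.as_str() {
        "text" | "varchar" | "char" => Text,
        "int" | "integer" | "smallint" | "bigint" => Int,
        "real" | "float" | "double precision" => Real,
        "decimal" | "numeric" => Decimal,
        "bool" | "boolean" => Bool,
        "date" => Date,
        "datetime" | "timestamp" => DateTime,
        "blob" | "binary" | "varbinary" => Blob,
        "serial" => Serial,
        "shortid" => ShortId,
        // Anything else, including engine storage type names, is refused.
        _ => return None,
    })
}
```

For example, resolve_type_name("varchar(255)") and resolve_type_name("text") both yield the same playground type, so the metadata row is identical whichever spelling the user chose.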
6. SELECT — the read-only query path
SELECT touches no metadata, no persistence, no types. It is
carried as Command::Select holding the validated SQL; the
worker (Request::RunSelect) prepares and runs it, producing
the existing DataResult, which renders through the existing
data-table renderer (the one show data uses, ADR-0016).
Columns that carry no playground type — computed expressions,
joined columns — render with neutral alignment; the result is
capped like show data, with LIMIT suggested for large
outputs. A reference to an internal __rdbms_* table is
rejected by the grammar (those tables are not in scope).
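Two of the rules in this section, the internal-table exclusion and the output cap, are small enough to sketch. Everything here is an assumption made for illustration: the function names, the case-insensitive prefix check, the hint wording, and the choice to truncate after fetching rather than inside the query.

```rust
/// Prefix of the internal metadata tables named in this section.
const INTERNAL_PREFIX: &str = "__rdbms_";

/// A table-name slot only admits names outside the internal namespace.
fn table_in_scope(name: &str) -> bool {
    !name.to_ascii_lowercase().starts_with(INTERNAL_PREFIX)
}

/// Cap a result set and, when rows were dropped, suggest LIMIT.
fn cap_rows<T>(mut rows: Vec<T>, cap: usize) -> (Vec<T>, Option<String>) {
    if rows.len() <= cap {
        return (rows, None);
    }
    let total = rows.len();
    rows.truncate(cap);
    let hint = format!("showing {cap} of {total} rows; add LIMIT to narrow the result");
    (rows, Some(hint))
}
```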
7. Engine neutrality
- No engine type names in or out (§5).
- No STRICT, no storage options. STRICT is applied internally by do_create_table; the user neither writes nor sees it. It is simply not part of the authored grammar, so typing it is an ordinary parse error, not a SQLite feature surfaced to the learner.
- Engine-neutral errors. SQL parse errors, out-of-subset refusals, and execution failures all route through the friendly-error layer (ADR-0019); the engine's raw message and product name never appear.
- Honest limitation. The grammar enforces the structural subset exactly. Expression-level neutrality is best-effort: an exotic engine-specific function the grammar admits and the engine then rejects surfaces an engine-neutral error rather than being caught up front. A function allowlist is a possible future hardening (§13).
8. Ambient assistance comes for free
Because SQL is grammar in the unified tree (§1), the walker gives SQL the same assistance the DSL has, with no SQL-specific assistance code:
- Syntax highlighting of SQL keywords, identifiers, literals.
- Tab completion of SQL keywords, and of schema names (tables, columns) drawn from the same SchemaCache the DSL completion already uses.
- Hint-panel prose at each grammar slot.
- The [ERR]/[WRN] validity indicator (ADR-0027).
- Per-command parse-error usage (ADR-0021).
This is the reason for §1: assistance and a batch parser are incompatible; assistance and the unified grammar tree are the same thing.
9. Parse errors and the unsupported surface (Q2)
A construct not in the authored grammar is an ordinary walker parse error; the ADR-0021 per-command usage machinery and the ADR-0027 indicator apply, with engine-neutral wording. There is no separate "valid SQL but unsupported" classifier — that would require the batch parser §1 dropped; the walker's expected-set drives the message instead.
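As a rough illustration of a message driven by the walker's expected set, the sketch below formats one from the offending token and the set of acceptable continuations. The function name and the wording are hypothetical; the actual text would come from the friendly-error layer (ADR-0019).

```rust
/// Build an engine-neutral parse-error message from the expected set.
/// `found` is None when the input ended early.
fn parse_error_message(found: Option<&str>, expected: &[&str]) -> String {
    let got = match found {
        Some(tok) => format!("'{tok}'"),
        None => "end of input".to_string(),
    };
    let want = match expected {
        [] => "nothing further".to_string(),
        [one] => one.to_string(),
        [init @ .., last] => format!("{} or {}", init.join(", "), last),
    };
    format!("unexpected {got}; expected {want}")
}
```

Nothing in the message depends on knowing that the rejected text is valid SQL elsewhere, which is why no separate classifier is needed.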
10. The DSL → SQL teaching bridge
When a DSL command runs in advanced mode, its output includes the equivalent SQL — so a learner who knows the simple-mode form reads off how to express it in SQL.
- It is a Command → SQL renderer: the inverse of §4's DDL translator.
- It fires only for commands entered via the DSL form, and only in advanced mode (a command the user already typed as SQL is not echoed back; simple mode is left uncluttered).
- It renders as a distinct, de-emphasised output line beneath the [ok] summary, using the OutputLine styled-runs mechanism (ADR-0028).
- App-level commands have no SQL form and are not echoed.
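The firing rule for the echo can be sketched as a guard in front of a renderer. The Command variants and their fields below are hypothetical placeholders (the real Command enum is much larger), Help stands in for any app-level command, and identifier quoting is ignored.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Mode {
    Simple,
    Advanced,
}

/// Which grammar form the user typed (hypothetical tag).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum EnteredAs {
    Dsl,
    Sql,
}

/// Two illustrative variants only.
enum Command {
    DropTable { name: String },
    Help,
}

/// Render the SQL equivalent, or None for commands with no SQL form.
fn render_sql(cmd: &Command) -> Option<String> {
    match cmd {
        Command::DropTable { name } => Some(format!("DROP TABLE {name}")),
        Command::Help => None,
    }
}

/// The echo appears only for DSL-entered commands in advanced mode.
fn teaching_echo(cmd: &Command, entered_as: EnteredAs, mode: Mode) -> Option<String> {
    if mode == Mode::Advanced && entered_as == EnteredAs::Dsl {
        render_sql(cmd)
    } else {
        None
    }
}
```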
11. Persistence, metadata, history, replay
- DDL → Command → the typed executor, so project.yaml, the metadata tables, and history.log stay correct with no new code (§4).
- DML → the worker re-persists the affected table's CSV after running the statement.
- history.log records the literal submitted line: a statement typed as SQL is logged as that SQL. The replay format is therefore app-enterable syntax, with no divergence.
- Replay re-runs each log line through the one walker with the advanced view active, so a project whose history mixes DSL and SQL replays faithfully.
- project.yaml stays a structured schema snapshot; its embedded expressions (a column CHECK) are stored as SQL the user could re-enter in advanced mode: one syntax, not a third.
12. Safety in advanced mode
Advanced mode carries fewer rails by design. The DSL's guard on
update/delete, which demands either a WHERE or --all-rows
(ADR-0014), is a simple-mode teaching aid; a SQL DELETE FROM t
with no WHERE executes as written. The safety net is the
auto-snapshot before destructive operations (ADR-0006), which
fires regardless of which surface produced the statement; the
mode's visual distinction (ADR-0003) is the user's signal
until then.
13. Out of scope
- OOS-1. CREATE VIEW / TRIGGER. Views are anticipated by the items panel's design (S2) but need their own model.
- OOS-2. EXPLAIN of advanced-mode SQL queries. The DSL explain (ADR-0028) still works for what it already wraps.
- OOS-3. A function/expression allowlist for full expression-level engine neutrality (§7); best-effort for now.
- OOS-4. Multi-statement batches and transaction control.
- OOS-5. A SQL → DSL echo (the reverse of §10).
Consequences
- The unified grammar tree gains a large body of SQL grammar. The Node taxonomy and the walker may need extension to carry it (e.g. deeper recursion for subqueries / CTEs): a known risk, addressed per phase.
- sqlparser-rs is not adopted as the parser; ADR-0001's reservation is superseded.
- Q1's wording ("SQL parsed via sqlparser-rs") is superseded: SQL is parsed by the unified walker.
- Command gains a Select variant; every exhaustive match Command gains an arm (the recurring ADR-0028/0029 gotcha).
- The Database worker gains a RunSelect request and a "run validated DML, re-persist the table" request; DDL reuses the existing typed requests unchanged.
- Mode-gating is added to the grammar / walker.
- The metadata, persistence, and type machinery is reused unchanged for DDL: the payoff of routing DDL through Command.
- This is the project's largest single feature so far. The phased plan keeps each step independently shippable; scope-cutting, if a slice proves disproportionate, is an explicit escalation, never a silent trim.
- Q4 is satisfied by this ADR; Q1/Q2 are unblocked and reframed around the unified walker; M1 gains its "recognised as SQL" hint.
Implementation notes
Phased; each phase independently shippable and test-guarded. The two large grammar slices each warrant their own focused ADR when taken up (ADR-0026-style).
- Foundations + first SELECT. Mode-gate the grammar (advanced unlocks the SQL nodes). Author the core SQL expression grammar (the ADR-0026 superset) as its own ADR. A single-table SELECT (projection, WHERE, ORDER BY, LIMIT) as a SQL CommandNode → Command::Select → worker RunSelect → the existing renderer. Replace the placeholder echo; add the simple-mode "this is SQL" hint. This proves the path end-to-end with full walker assistance.
- Full SELECT. JOINs, GROUP BY / HAVING, aggregates, subqueries, UNION, CTEs. The big grammar phase, with its own ADR.
- DML. INSERT / UPDATE / DELETE grammar; the execute-as-validated-SQL path; the worker re-persist step; settle multi-row INSERT and shortid auto-fill on a SQL INSERT.
- DDL. CREATE / DROP / ALTER TABLE, CREATE / DROP INDEX grammar → Command; the §5 type-name map; FK clauses → AddRelationship; may land table-rename (C1).
- The DSL → SQL teaching echo (§10).
- Polish. help sql; an engine-neutral error sweep; typing-surface / matrix coverage; the DOC1 SQL-surface reference page.
See also
- ADR-0001: reserved sqlparser-rs; that reservation is superseded here (§1).
- ADR-0002: the engine is an implementation detail; "no engine name in user-facing strings". §7 extends it.
- ADR-0003: the simple / advanced mode model this builds on.
- ADR-0005: the ten-type vocabulary advanced DDL uses (§5).
- ADR-0009: the DSL conventions; the DSL stays usable in advanced mode.
- ADR-0012 / ADR-0013: the metadata tables the Command core keeps in sync, inherited for free (§4, §11).
- ADR-0014: the data-operation model and the --all-rows guard advanced mode deliberately relaxes (§12).
- ADR-0015: persistence write-through and replay, reused and made surface-agnostic (§11).
- ADR-0016: the data-table renderer SELECT results reuse (§6).
- ADR-0019: the friendly-error layer all SQL errors route through (§7, §9).
- ADR-0021: per-command parse-error usage, free for SQL (§9).
- ADR-0022: ambient typing assistance; §8 is its extension to SQL.
- ADR-0023 / ADR-0024: the unified grammar tree SQL becomes part of (§1, §2).
- ADR-0026: the WHERE expression grammar the SQL expression grammar is the superset of (§3).
- ADR-0027: the validity indicator, free for SQL (§8).
- ADR-0028: the OutputLine styled-runs the teaching echo uses (§10).