grammar: migrate Phase-1 SELECT to the ADR-0032 fragment (sub-phase 2c)

The Phase-1 SQL `SELECT` grammar nodes that used to live in
`src/dsl/grammar/data.rs` are retired: 22 statics / consts and the
`reject_internal_table` validator copy are removed, about 150 lines
of grammar machinery in total. `data::SELECT.shape` now references
the post-`SELECT` portion of the ADR-0032 fragment via a thin
`Node::Subgrammar(&sql_select::SQL_SELECT_TAIL)`.

`SQL_SELECT_TAIL` is a new export from `sql_select.rs`,
parallel to `SQL_SELECT_STATEMENT`. It represents what a
top-level `SELECT` statement looks like AFTER the registry's
entry-word dispatch has already consumed the leading `SELECT`
keyword: the DISTINCT/ALL prefix, projection list, optional
FROM / WHERE / GROUP BY / HAVING, the compound set-op chain
(each subsequent leg's `SELECT` is part of `SET_OP_TAIL`),
outer ORDER BY / LIMIT, and a tolerated trailing `;`.
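
A minimal sketch of the entry-consumed idea, as a toy model rather than
the project's real types. The variant names `Seq`, `Optional`,
`Repeated` and `Subgrammar` come from the diff; `Word`, `Ident`, `walk`
and `dispatch` are hypothetical, and the clauses are reduced to single
tokens. The dispatcher strips the entry word, then the shape walks only
the tail:

```rust
// Toy model, not the project's real types. `Word`, `Ident`, `walk` and
// `dispatch` are hypothetical and exist only for this sketch.
pub enum Node {
    Word(&'static str), // fixed keyword, matched case-insensitively
    Ident,              // stands in for any single token
    Seq(&'static [Node]),
    Optional(&'static Node),
    Repeated { inner: &'static Node, min: usize },
    Subgrammar(&'static Node),
}

static DISTINCT: Node = Node::Word("distinct");
static SEMI: Node = Node::Word(";");
static FROM_NODES: [Node; 2] = [Node::Word("from"), Node::Ident];
static FROM_CLAUSE: Node = Node::Seq(&FROM_NODES);
static LIMIT_NODES: [Node; 2] = [Node::Word("limit"), Node::Ident];
static LIMIT_CLAUSE: Node = Node::Seq(&LIMIT_NODES);

// Each later leg's SELECT keyword lives inside the set-op tail.
static SET_OP_NODES: [Node; 4] = [
    Node::Word("union"),
    Node::Word("select"),
    Node::Ident,
    Node::Optional(&FROM_CLAUSE),
];
static SET_OP_TAIL: Node = Node::Seq(&SET_OP_NODES);

// No leading SELECT keyword here: the dispatcher has already consumed it.
static SELECT_TAIL_NODES: [Node; 6] = [
    Node::Optional(&DISTINCT),
    Node::Ident, // projection, reduced to one token in this sketch
    Node::Optional(&FROM_CLAUSE),
    Node::Repeated { inner: &SET_OP_TAIL, min: 0 },
    Node::Optional(&LIMIT_CLAUSE),
    Node::Optional(&SEMI),
];
static SELECT_TAIL: Node = Node::Seq(&SELECT_TAIL_NODES);
static SELECT_SHAPE: Node = Node::Subgrammar(&SELECT_TAIL);

/// Returns the position after a successful match, or `None` on failure.
fn walk(node: &Node, toks: &[&str], pos: usize) -> Option<usize> {
    match node {
        Node::Word(w) => match toks.get(pos) {
            Some(t) if t.eq_ignore_ascii_case(w) => Some(pos + 1),
            _ => None,
        },
        Node::Ident => toks.get(pos).map(|_| pos + 1),
        Node::Seq(items) => {
            let mut p = pos;
            for item in items.iter() {
                p = walk(item, toks, p)?;
            }
            Some(p)
        }
        Node::Optional(inner) => Some(walk(inner, toks, pos).unwrap_or(pos)),
        Node::Repeated { inner, min } => {
            let (mut p, mut count) = (pos, 0usize);
            while let Some(next) = walk(inner, toks, p) {
                if next == p {
                    break; // guard against zero-width matches looping forever
                }
                p = next;
                count += 1;
            }
            if count >= *min { Some(p) } else { None }
        }
        Node::Subgrammar(inner) => walk(inner, toks, pos),
    }
}

/// Consumes the entry word, then requires the shape to match every
/// remaining token. Only `select` is registered in this toy registry.
pub fn dispatch(input: &str) -> bool {
    let toks: Vec<&str> = input.split_whitespace().collect();
    match toks.split_first() {
        Some((entry, rest)) if entry.eq_ignore_ascii_case("select") => {
            walk(&SELECT_SHAPE, rest, 0) == Some(rest.len())
        }
        _ => false, // e.g. `WITH` has no entry here and would fall through
    }
}
```

In this sketch `dispatch("SELECT a FROM t UNION SELECT b FROM u")`
succeeds because the second `SELECT` is matched by the set-op tail, not
by the entry-word dispatch.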

WITH-prefixed statements (`WITH x AS (…) SELECT * FROM x`)
are NOT in 2c's scope — they need a separate `data::WITH`
`CommandNode` so the entry-word dispatch routes correctly.
For now, top-level WITH continues to fall through to the
chumsky parser route (the same as in Phase 1). The
`SQL_SELECT_STATEMENT` static (which includes the optional
WITH prefix) stays available for use by that future
CommandNode or by any other consumer that needs the full
statement shape.

All seven Phase-1 SQL `SELECT` integration tests
(`tests/sql_select.rs`) pass without modification, satisfying
the 2c exit gate's "behaviour preserved" requirement. The
70 fragment unit tests and the 26 driver-level scope tests
also pass. Because the migration is a refactor, no new tests
are required.

Behaviour change explicitly sanctioned by ADR-0032 §8:
Phase-1's `LIMIT_VALIDATOR` (positive-int-only, parse-time)
is superseded by the full `sql_expr` admission. `LIMIT max(10,
x)` and similar now parse; the engine constrains the value at
execution time per the ADR's "grammar admits, engine
rejects" posture.
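
A rough sketch of the "grammar admits, engine rejects" split for
`LIMIT`, under stated assumptions: `Expr`, `eval` and `resolve_limit`
are hypothetical names, and the positive-integer rule simply mirrors
the retired Phase-1 validator; the real engine's constraint may differ.

```rust
use std::collections::HashMap;

// Hypothetical expression tree: the grammar admits any of these after
// LIMIT, including calls such as max(10, x).
pub enum Expr {
    Int(i64),
    Var(&'static str),
    Max(Box<Expr>, Box<Expr>),
}

// Evaluates the expression against the bound variables.
fn eval(expr: &Expr, vars: &HashMap<&str, i64>) -> Result<i64, String> {
    match expr {
        Expr::Int(n) => Ok(*n),
        Expr::Var(name) => vars
            .get(*name)
            .copied()
            .ok_or_else(|| format!("unbound variable `{name}`")),
        Expr::Max(a, b) => Ok(eval(a, vars)?.max(eval(b, vars)?)),
    }
}

// Execution-time check (assumed rule): the evaluated value must be a
// positive integer. Nothing is rejected at parse time.
pub fn resolve_limit(expr: &Expr, vars: &HashMap<&str, i64>) -> Result<u64, String> {
    let n = eval(expr, vars)?;
    if n > 0 {
        Ok(n as u64)
    } else {
        Err(format!("LIMIT must be a positive integer, got {n}"))
    }
}
```

With `x` bound to 3, `max(10, x)` resolves to a limit of 10, while a
literal `-1` parses but is refused when the limit is resolved.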

Plan §2b status note: the user-approved 2026-05-20 deferral of
§10.3 stage 2 (CTE output-column harvest derivation) is recorded
in `docs/plans/20260520-adr-0032-phase-2.md`.

Test totals: 1366 passing (unchanged), 0 failed, 1 ignored.
Clippy clean. data.rs loses ~150 lines of dead grammar; the
single source of truth for the SQL `SELECT` shape is now
`sql_select.rs`.
claude@clouddev1
2026-05-20 15:42:44 +00:00
parent 4ff054ca75
commit a491df32a0
3 changed files with 93 additions and 144 deletions
+46 -4
@@ -660,12 +660,54 @@ static SELECT_STATEMENT_NODES: &[Node] = &[
     Node::Subgrammar(&SQL_SELECT_COMPOUND),
     Node::Optional(&SEMI),
 ];
-/// The full statement, including the optional `WITH` prefix and
-/// a tolerated trailing `;`. This is what `data::SELECT`'s
-/// `CommandNode` will reference once sub-phase 2c migrates the
-/// Phase-1 grammar.
+/// The full statement, including the optional `WITH` prefix
+/// and a tolerated trailing `;`.
+///
+/// Used by the fragment's own tests and by any future
+/// `data::WITH` `CommandNode` that dispatches `WITH …`
+/// statements. Top-level `SELECT` statements (entry word
+/// `select`) reference `SQL_SELECT_TAIL` instead, which omits
+/// the leading `SELECT` keyword that the registry's
+/// entry-word dispatch already consumed.
 pub static SQL_SELECT_STATEMENT: Node = Node::Seq(SELECT_STATEMENT_NODES);
+// =================================================================
+// select_statement — entry-consumed form (ADR-0030 §6, 2c)
+// =================================================================
+/// The post-`SELECT` portion of a top-level statement.
+/// `data::SELECT`'s `CommandNode` has `entry: Word::keyword
+/// ("select")`, so the registry's dispatch consumes the leading
+/// `SELECT` keyword before the shape walks. The shape is then
+/// the rest of `select_core` (`DISTINCT/ALL`, projection,
+/// FROM, WHERE, GROUP BY, HAVING), followed by the compound
+/// set-op chain (each subsequent leg's `SELECT` keyword is
+/// part of `SET_OP_TAIL`), the outer `ORDER BY` / `LIMIT`, and
+/// a tolerated trailing `;`.
+///
+/// WITH-prefixed statements (`WITH x AS (…) SELECT …`) are
+/// dispatched separately by entry word `with`. Adding a
+/// `data::WITH` `CommandNode` is a future sub-phase; for now
+/// top-level WITH falls back to the chumsky parser route, the
+/// same as in Phase 1.
+static SQL_SELECT_TAIL_NODES: &[Node] = &[
+    Node::Subgrammar(&DISTINCT_OR_ALL_OPTIONAL),
+    Node::Subgrammar(&PROJECTION_LIST),
+    Node::Optional(&FROM_CLAUSE),
+    Node::Optional(&WHERE_CLAUSE),
+    Node::Optional(&GROUP_BY_CLAUSE),
+    Node::Optional(&HAVING_CLAUSE),
+    Node::Repeated {
+        inner: &SET_OP_TAIL,
+        separator: None,
+        min: 0,
+    },
+    Node::Optional(&ORDER_BY_CLAUSE),
+    Node::Optional(&LIMIT_CLAUSE),
+    Node::Optional(&SEMI),
+];
+pub static SQL_SELECT_TAIL: Node = Node::Seq(SQL_SELECT_TAIL_NODES);
 // =================================================================
 // Tests
 // =================================================================