grammar: SQL SELECT full statement fragment (ADR-0032 Phase 2a)

Author the standalone walkable shape for the full standard-SQL
SELECT per ADR-0032 §1: compound queries with the four set ops
(UNION / UNION ALL / INTERSECT / EXCEPT), the five JOIN flavours
(INNER / LEFT [OUTER] / RIGHT [OUTER] / FULL [OUTER] / CROSS),
GROUP BY / HAVING, WITH and WITH RECURSIVE common table
expressions, LIMIT … OFFSET, DISTINCT / ALL, qualified-wildcard
`t.*` projection, and bare-alias projection (lifting ADR-0030
Phase-1 §4.2).

Recursion into SQL_SELECT_COMPOUND uses Node::Subgrammar for
2a; sub-phase 2b will rewire those references to the new
Node::ScopedSubgrammar variant for completion-scope discipline
(ADR-0032 §10.2). The Phase-1 data::SELECT CommandNode is not
touched here — the new fragment is reachable only from its own
tests until sub-phase 2c performs the migration.

Two implementation mechanisms realize ADR semantics without
changing them:

- Node::Lookahead disambiguates the projection_item Choice
  (bare `*` vs `ident . *` qualified wildcard vs `sql_expr [
  alias ]`) and gates bare-alias slots against continuation
  keywords. The walker's walk_ident accepts any
  identifier-shape token, including keyword-shape ones, and
  Choice / Optional are first-match-wins; without lookahead a
  bare-alias slot would greedily swallow FROM / WHERE / JOIN /
  etc. Per-position follow-sets list which keywords legitimately
  follow each alias slot. Same pattern as data.rs's
  insert_first_paren precedent.

- INNER JOIN and bare JOIN are split into two distinct Choice
  branches (each with a concrete leading keyword) rather than
  sharing one Optional(Word("inner"))-leading branch. This
  avoids a walker hazard: a Seq whose leading child is Optional
  commits to idx > 0 even when that child matches nothing, and
  then converts the next child's EOF NoMatch into Incomplete,
  blocking the outer Choice from falling through to later
  branches. Same semantic surface, distinct mechanism.
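
The Optional-leading hazard described above can be reproduced in a
toy model. The sketch below is not the project's walker: the `Node`
and `Outcome` types, the `walk` function, and the two `join_*`
builders are simplified stand-ins written for illustration, and the
"NoMatch at EOF after idx > 0 becomes Incomplete" rule is modelled
directly from the description in this message.

```rust
// Toy model only: hypothetical types, not the real dsl::walker API.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Outcome {
    Match(usize), // index of the next unconsumed token
    NoMatch,
    Incomplete,
}

enum Node {
    Word(&'static str),
    Optional(Box<Node>),
    Seq(Vec<Node>),
    Choice(Vec<Node>),
}

fn walk(node: &Node, toks: &[&str], pos: usize) -> Outcome {
    match node {
        Node::Word(w) => match toks.get(pos) {
            Some(t) if t.eq_ignore_ascii_case(w) => Outcome::Match(pos + 1),
            _ => Outcome::NoMatch,
        },
        // Optional succeeds without consuming when its child does not match.
        Node::Optional(inner) => match walk(inner, toks, pos) {
            Outcome::NoMatch => Outcome::Match(pos),
            other => other,
        },
        Node::Seq(children) => {
            let mut cur = pos;
            for (idx, child) in children.iter().enumerate() {
                match walk(child, toks, cur) {
                    Outcome::Match(next) => cur = next,
                    // Modelled hazard: past the first child, a NoMatch at
                    // end of input is reported as Incomplete.
                    Outcome::NoMatch if idx > 0 && cur >= toks.len() => {
                        return Outcome::Incomplete
                    }
                    Outcome::NoMatch => return Outcome::NoMatch,
                    Outcome::Incomplete => return Outcome::Incomplete,
                }
            }
            Outcome::Match(cur)
        }
        // First-match-wins: only NoMatch falls through to the next branch.
        Node::Choice(branches) => {
            for branch in branches {
                match walk(branch, toks, pos) {
                    Outcome::NoMatch => continue,
                    other => return other,
                }
            }
            Outcome::NoMatch
        }
    }
}

// One branch led by Optional(Word("inner")): the shape that was avoided.
fn join_shared() -> Node {
    Node::Choice(vec![
        Node::Seq(vec![
            Node::Optional(Box::new(Node::Word("inner"))),
            Node::Word("join"),
        ]),
        Node::Seq(vec![Node::Word("left"), Node::Word("join")]),
    ])
}

// Two branches, each with a concrete leading keyword: the shape used here.
fn join_split() -> Node {
    Node::Choice(vec![
        Node::Seq(vec![Node::Word("inner"), Node::Word("join")]),
        Node::Seq(vec![Node::Word("join")]),
        Node::Seq(vec![Node::Word("left"), Node::Word("join")]),
    ])
}
```

In this model, walking `join_shared()` at end of input yields
`Incomplete`, because the Optional matches empty and moves the Seq
past its first child. `join_split()` yields `NoMatch` at the same
position, so an enclosing optional join clause can match empty.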

The §13 OOS shapes all have explicit reject tests (NATURAL,
USING, comma-FROM, LIMIT m,n, window OVER, VALUES, derived
tables). LATERAL has a noted partial limitation: the comma form
rejects via OOS-3, but the single-keyword form `FROM a LATERAL
JOIN b ON …` is admitted structurally because `lateral` parses
as a bare table-source alias for `a`. This matches ADR-0030's
"grammar admits identifier-shape tokens; engine resolves"
posture.
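
The LATERAL limitation falls out of how a follow-set gate decides
whether a token is a bare alias. The sketch below is a minimal
illustration under stated assumptions: `TABLE_ALIAS_FOLLOW`,
`is_ident_shape`, and `bare_alias` are hypothetical names, and the
keyword list is an example, not the follow-set defined by the ADR or
used in sql_select.rs.

```rust
// Hypothetical follow-set for the table-source alias slot.
// Illustrative only; `lateral` is deliberately absent, as in the
// limitation described above.
const TABLE_ALIAS_FOLLOW: &[&str] = &[
    "where", "join", "inner", "left", "right", "full", "cross", "on",
    "group", "having", "limit", "union", "intersect", "except",
];

// Identifier-shape check: a letter or underscore, then letters,
// digits, or underscores. Keywords pass this check too.
fn is_ident_shape(tok: &str) -> bool {
    let mut chars = tok.chars();
    matches!(chars.next(), Some(c) if c.is_ascii_alphabetic() || c == '_')
        && chars.all(|c| c.is_ascii_alphanumeric() || c == '_')
}

// Returns the token at `pos` if it would be taken as a bare alias:
// identifier-shaped and not in the follow-set for this slot.
fn bare_alias<'a>(toks: &[&'a str], pos: usize) -> Option<&'a str> {
    let tok = *toks.get(pos)?;
    let is_follow = TABLE_ALIAS_FOLLOW
        .iter()
        .any(|kw| kw.eq_ignore_ascii_case(tok));
    if is_ident_shape(tok) && !is_follow {
        Some(tok)
    } else {
        None
    }
}
```

With the tokens `from a lateral join b`, `bare_alias` at index 2
returns `Some("lateral")`, so the remaining `join b` reads as an
ordinary bare JOIN. With `from a join b`, the same slot returns
`None` because `join` is in the follow-set.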

`__rdbms_*` rejection extends to every Phase-2 table-source
slot — the FROM table, each JOIN's table, each CTE name, and
the FROM inside any CTE body — via the reusable
reject_internal_table validator.
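
As a rough sketch of what such a validator might look like: the
signature, the error type, and the case-insensitive comparison below
are all assumptions made for illustration, and `check_table_slots` is
a hypothetical helper. Only the name reject_internal_table and the
`__rdbms_` prefix come from this message.

```rust
// Prefix reserved for internal tables, per the `__rdbms_*` rule.
const INTERNAL_PREFIX: &str = "__rdbms_";

// Assumed shape: Ok(()) for user-visible names, Err with a message
// for internal ones. Case-insensitive matching is an assumption.
fn reject_internal_table(name: &str) -> Result<(), String> {
    if name.to_ascii_lowercase().starts_with(INTERNAL_PREFIX) {
        Err(format!("table name `{}` is reserved for internal use", name))
    } else {
        Ok(())
    }
}

// Hypothetical helper: apply the same validator to every
// table-source slot (FROM table, JOIN tables, CTE names, ...).
fn check_table_slots(slots: &[&str]) -> Result<(), String> {
    for name in slots {
        reject_internal_table(name)?;
    }
    Ok(())
}
```

Reusing one validator across slots keeps the rejection rule in a
single place, so a new table-source position only needs one extra
call.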

70 new unit tests in sql_select.rs walk every §1 production and
every OOS reject case. Test totals: 1260 baseline + 70 = 1330
passing, 0 failing, 1 ignored (unchanged from baseline). Clippy
clean.

Per the Phase-2 plan sub-phase 2a exit gate. DA gate written
review: PASS.
claude@clouddev1
2026-05-20 11:29:48 +00:00
parent e032f01b2d
commit 8d293358a0
2 changed files with 1172 additions and 0 deletions
@@ -28,6 +28,7 @@ pub mod ddl;
 pub mod expr;
 pub mod shared;
 pub mod sql_expr;
+pub mod sql_select;
 use crate::dsl::command::Command;
 use crate::dsl::walker::context::WalkContext;
File diff suppressed because it is too large