grammar: SQL SELECT full statement fragment (ADR-0032 Phase 2a)

Author the standalone walkable shape for the full standard-SQL
SELECT per ADR-0032 §1: compound queries with the four set ops
(UNION / UNION ALL / INTERSECT / EXCEPT), the five JOIN flavours
(INNER / LEFT [OUTER] / RIGHT [OUTER] / FULL [OUTER] / CROSS),
GROUP BY / HAVING, WITH and WITH RECURSIVE common table
expressions, LIMIT … OFFSET, DISTINCT / ALL, qualified-wildcard
`t.*` projection, and bare-alias projection (lifting ADR-0030
Phase-1 §4.2).

Recursion into SQL_SELECT_COMPOUND uses Node::Subgrammar for
2a; sub-phase 2b will rewire those references to the new
Node::ScopedSubgrammar variant for completion-scope discipline
(ADR-0032 §10.2). The Phase-1 data::SELECT CommandNode is not
touched here — the new fragment is reachable only from its own
tests until sub-phase 2c performs the migration.

Two implementation mechanisms realize ADR semantics without
changing them:

- Node::Lookahead disambiguates the projection_item Choice
  (bare `*` vs `ident . *` qualified wildcard vs
  `sql_expr [ alias ]`) and gates bare-alias slots against
  continuation keywords. The walker's walk_ident accepts any
  identifier-shape token, including keyword-shape ones, and
  Choice / Optional are first-match-wins; without lookahead a
  bare-alias slot would greedily swallow FROM / WHERE / JOIN /
  etc. Per-position follow-sets list which keywords legitimately
  follow each alias slot. This follows the insert_first_paren
  precedent in data.rs.

- INNER JOIN and bare JOIN are split into two distinct Choice
  branches (each with a concrete leading keyword) rather than
  sharing one Optional(Word("inner"))-leading branch. This avoids
  a walker hazard where an Optional-leading-child Seq commits to
  idx > 0 and then converts the next child's EOF NoMatch into
  Incomplete, blocking the outer Choice from falling through to
  later branches. Same semantic surface, distinct mechanism.
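
The alias-gating idea in the first bullet can be sketched in isolation. The following is a toy model, not the walker's actual code: the token slice, the function names `alias_greedy` and `alias_gated`, and the contents of `PROJECTION_ALIAS_FOLLOW` are all hypothetical, and the gate is modelled as "refuse the alias if the candidate token is itself a continuation keyword", which is an assumption about how the lookahead is applied.

```rust
/// Hypothetical follow-set for the alias slot of a projection item.
/// The real per-position sets live in the grammar and are not reproduced here.
const PROJECTION_ALIAS_FOLLOW: &[&str] = &[
    "from", "where", "group", "having", "union", "intersect", "except", "limit",
];

/// Mimics an identifier check that accepts any identifier-shaped token,
/// keyword-shaped ones included.
fn is_ident_shape(tok: &str) -> bool {
    let mut chars = tok.chars();
    match chars.next() {
        Some(c) if c.is_ascii_alphabetic() || c == '_' => {
            chars.all(|c| c.is_ascii_alphanumeric() || c == '_')
        }
        _ => false,
    }
}

/// Optional bare alias with no lookahead: first match wins, so a keyword
/// such as `from` is consumed as if it were an alias.
fn alias_greedy<'a>(tokens: &[&'a str], pos: usize) -> (Option<&'a str>, usize) {
    match tokens.get(pos) {
        Some(&tok) if is_ident_shape(tok) => (Some(tok), pos + 1),
        _ => (None, pos),
    }
}

/// Optional bare alias gated by a follow-set: a token that belongs to the
/// follow-set is left in place for the next grammar node to consume.
fn alias_gated<'a>(
    tokens: &[&'a str],
    pos: usize,
    follow: &[&str],
) -> (Option<&'a str>, usize) {
    match tokens.get(pos) {
        Some(&tok)
            if is_ident_shape(tok)
                && !follow.iter().any(|kw| kw.eq_ignore_ascii_case(tok)) =>
        {
            (Some(tok), pos + 1)
        }
        _ => (None, pos),
    }
}
```

For the token stream `a from t` with the alias slot at position 1, `alias_greedy` consumes `from` as the alias, while `alias_gated` leaves the slot empty so the FROM clause can still match. For `a total from t`, the gated version still accepts `total`.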

The §13 out-of-scope (OOS) shapes all have explicit reject tests (NATURAL,
USING, comma-FROM, LIMIT m,n, window OVER, VALUES, derived
tables). LATERAL has a noted partial limitation: the comma form
rejects via OOS-3, but the single-keyword form `FROM a LATERAL
JOIN b ON …` is admitted structurally because `lateral` parses
as a bare table-source alias for `a`. This matches ADR-0030's
"grammar admits identifier-shape tokens; engine resolves"
posture.
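
Why the single-keyword LATERAL form slips through can be shown with a small sketch. This is an illustrative model only: `TableSource`, `table_source`, and the contents of `TABLE_ALIAS_FOLLOW` are hypothetical names and values, and the input is assumed to be already split into identifier-shaped tokens.

```rust
/// Hypothetical follow-set for the alias slot after a FROM table name.
/// `lateral` is deliberately absent, mirroring the limitation described above.
const TABLE_ALIAS_FOLLOW: &[&str] = &[
    "join", "inner", "left", "right", "full", "cross", "on",
    "where", "group", "having", "union", "intersect", "except", "limit",
];

#[derive(Debug, PartialEq)]
struct TableSource<'a> {
    name: &'a str,
    alias: Option<&'a str>,
}

/// Parses `name [alias]` starting at `pos` and returns the table source
/// together with the position of the next unconsumed token.
fn table_source<'a>(tokens: &[&'a str], pos: usize) -> Option<(TableSource<'a>, usize)> {
    let name = *tokens.get(pos)?;
    // The next token becomes the alias unless it is a continuation keyword.
    let alias = tokens
        .get(pos + 1)
        .copied()
        .filter(|tok| !TABLE_ALIAS_FOLLOW.iter().any(|kw| kw.eq_ignore_ascii_case(tok)));
    let next = pos + 1 + usize::from(alias.is_some());
    Some((TableSource { name, alias }, next))
}
```

Under these assumptions, `a lateral join b` yields table `a` with alias `lateral`, after which `join b` parses as an ordinary join, whereas `a left join b` leaves the alias slot empty because `left` is in the follow-set.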

`__rdbms_*` rejection extends to every Phase-2 table-source
slot (the FROM table, each JOIN's table, each CTE name, and
the FROM inside any CTE body) via the reusable
reject_internal_table validator.
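
A minimal sketch of what a prefix-based validator of this kind could look like follows. The real reject_internal_table signature and error type are not shown in this commit message, so `reject_internal_table_sketch`, `validate_slots`, the `String` error, and the ASCII case-insensitive comparison are all assumptions made for illustration.

```rust
/// Prefix reserved for internal tables, taken from the `__rdbms_*` pattern above.
const INTERNAL_PREFIX: &str = "__rdbms_";

/// Hypothetical stand-in for the validator: rejects any table name that
/// starts with the internal prefix.
/// Assumption: the comparison is ASCII case-insensitive.
fn reject_internal_table_sketch(name: &str) -> Result<(), String> {
    let is_internal = name
        .get(..INTERNAL_PREFIX.len())
        .map_or(false, |head| head.eq_ignore_ascii_case(INTERNAL_PREFIX));
    if is_internal {
        Err(format!("table name `{name}` uses the reserved internal prefix"))
    } else {
        Ok(())
    }
}

/// Applies the same validator to every table-source slot collected from a
/// statement (FROM table, JOIN tables, CTE names, FROM inside CTE bodies).
fn validate_slots(slots: &[&str]) -> Result<(), String> {
    slots
        .iter()
        .try_for_each(|name| reject_internal_table_sketch(name))
}
```

Because one function guards every slot, adding a new table-source position to the grammar only requires routing its name through the same check.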

70 new unit tests in sql_select.rs walk every §1 production and
every OOS reject case. Test totals: 1260 baseline + 70 = 1330
passing, 0 failing, 1 ignored (unchanged from baseline). Clippy
clean.

Per the Phase-2 plan sub-phase 2a exit gate. DA gate written
review: PASS.

@@ -28,6 +28,7 @@ pub mod ddl;
 pub mod expr;
 pub mod shared;
 pub mod sql_expr;
+pub mod sql_select;
 
 use crate::dsl::command::Command;
 use crate::dsl::walker::context::WalkContext;
File diff suppressed because it is too large