feat: curated SQL function list — Tab completion (#15) + typing-time typo hint (#16)

Add src/dsl/sql_functions.rs (KNOWN_SQL_FUNCTIONS) as the shared source
of truth at sql_expr_ident slots:

- #15: offer the functions as Tab candidates under a new
  CandidateKind::Function + ninth Theme colour tok_function (blue,
  distinct from keyword/identifier/type).
- #16: restore the column-typo flag the #6 fix had dropped wholesale —
  invalid_ident_at_cursor now bails only when the partial prefix-matches
  a known function, else falls through to the schema-column check.

A column named like a function (e.g. `count`) is deduped (column wins).
`cast` is excluded — CAST(x AS type) is not a plain-call shape.
The no-validation-allowlist posture stands: the list drives completion +
the typo hint only, never parse-time acceptance.

Docs: ADR-0022 Amendment 6, ADR-0031 status note, README index,
requirements I3/I4 + refreshed test baseline.
claude@clouddev1
2026-05-31 11:49:10 +00:00
parent 01ec926ec8
commit 6d8c9eea36
10 changed files with 570 additions and 25 deletions
@@ -686,6 +686,92 @@ can be revisited if hints routinely need more.
full-screen snapshots (the empty-state placeholder and the `insert`
usage hint now wrap to their full text instead of being clipped).
## Amendment 6 — Curated SQL function names: completion + typing-time typo hint (2026-05-30)
The advanced-mode SQL expression grammar (ADR-0031) accepts a
function-call *shape* (`name(args)`) at its `sql_expr_ident` slot
but, by design, does **not** know which names are real functions
(ADR-0031 §1: the slot is `IdentSource::Columns`, optimised for the
common case of a column reference; the walker is a structural matcher,
not an evaluator). That left two gaps at this slot, raised as issues
#15 and #16:
- **#16 (regression).** The earlier issue-#6 function-call validator
fix dropped `invalid_ident_at_cursor`'s "no such column" flag at
every `sql_expr_ident` position — necessary to stop a false positive
on a function name like `sum`, but it also silenced the typing-time
signal for a *genuine* column typo in an incomplete expression
(`select Agx` before a `FROM` brings the schema-existence diagnostic
into scope). Typing-time became *less* eager than submit-time.
- **#15 (discovery).** Tab at a `sql_expr_ident` slot offered schema
columns + a few expression keywords (`null`, `distinct`, `case`, …)
but no function names, so a learner had to already know `sum` / `avg`
/ `upper` to type them.
Both want the same thing: a single source of truth for *"what SQL
function names does this playground recognise"*.
**Change:**
1. **Curated list.** New `src/dsl/sql_functions.rs` with
`KNOWN_SQL_FUNCTIONS` (sorted, lowercase — a pinned invariant) and
an `is_known_function_prefix()` helper. A deliberately *curated
pedagogical set*, not "every SQLite built-in": the aggregates a
learner meets first (`count`/`sum`/`avg`/`min`/`max`), the common
scalars (`length`/`upper`/`lower`/`trim`/`substr`/`coalesce`/`abs`/
`round`), and a broader scalar tier (`date`/`datetime`/`strftime`/
`hex`/`ifnull`/`nullif`/`replace`/`instr`/`typeof`/`random`).
**`cast` is deliberately excluded** — SQLite's `CAST(expr AS type)`
is not a plain-call shape the expression grammar parses, so
offering it would surface a candidate that does not complete; it
stays out until the grammar grows a dedicated `CAST` form.
2. **#16 — restore the typo flag, narrowly.** `invalid_ident_at_cursor`
no longer bails wholesale at a `sql_expr_ident` slot. It bails only
when the partial prefix-matches a known function name; otherwise it
falls through to the existing schema-column check, which flags "no
such column" unless the partial prefix-matches a real column. So
`select Agx` warns again at typing time while `select sum` does not.
The submit-time `unknown_column` diagnostic path is untouched; the
issue-#6 lockdown tests (`genuine_column_typo_in_complete_select_…`,
`advanced_select_partial_function_name_not_flagged_…`) still pass.
3. **#15 — offer functions as candidates.** A new completion source
(Source 1.8) contributes `KNOWN_SQL_FUNCTIONS` (prefix-filtered like
every other source) whenever the expected set contains a
`sql_expr_ident` slot, ordered after keywords/types (a learner
reads clause keywords first, then discovers callables).
4. **New `CandidateKind::Function` + `tok_function` colour.** Just as
Amendment 4 gave types their own class, function candidates get a
dedicated kind and a ninth `Theme` colour field (`tok_function`,
a blue distinct from keyword purple, identifier teal, and type
pink/magenta in both `dark()` and `light()`) so a callable reads
apart from a clause keyword, a column reference, and a column type.
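
A rough sketch of items 1 and 2, not the actual source. It assumes the list holds exactly the names enumerated above and that column prefix-matching is case-insensitive; both are assumptions. `flags_no_such_column` is a hypothetical stand-in for the decision inside `invalid_ident_at_cursor`, whose real signature is not shown in this amendment.

```rust
/// Curated pedagogical list: sorted and lowercase (the pinned invariant).
/// `cast` is intentionally absent. Contents are assumed from the names
/// enumerated in item 1; the real file may differ.
pub const KNOWN_SQL_FUNCTIONS: &[&str] = &[
    "abs", "avg", "coalesce", "count", "date", "datetime", "hex", "ifnull",
    "instr", "length", "lower", "max", "min", "nullif", "random", "replace",
    "round", "strftime", "substr", "sum", "trim", "typeof", "upper",
];

/// True when `partial` is a case-insensitive prefix of some known function.
/// An empty partial matches every entry.
pub fn is_known_function_prefix(partial: &str) -> bool {
    let p = partial.to_ascii_lowercase();
    KNOWN_SQL_FUNCTIONS
        .iter()
        .any(|name| name.starts_with(p.as_str()))
}

/// Hypothetical helper: should the typing-time hint flag `partial` as
/// "no such column" at a `sql_expr_ident` slot?
pub fn flags_no_such_column(partial: &str, schema_columns: &[&str]) -> bool {
    // Bail only when the partial could still grow into a known function.
    if is_known_function_prefix(partial) {
        return false;
    }
    // Otherwise fall through to the schema-column check
    // (case-insensitive matching is an assumption of this sketch).
    let p = partial.to_ascii_lowercase();
    !schema_columns
        .iter()
        .any(|col| col.to_ascii_lowercase().starts_with(p.as_str()))
}
```

Under these assumptions, with a schema column `Age`, the partial `Agx` is flagged while `sum` and `Ag` are not, which matches the behaviour described in item 2.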
**No-validation-allowlist posture stands (ADR-0031 §6/§7).** The list
drives *completion* and the *typo hint* only — never parse-time
acceptance. An unknown or engine-specific function still parses (the
grammar admits the call shape generically) and surfaces an
engine-neutral *execution* error, exactly as before.
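
To illustrate that the list only feeds the candidate set, here is a minimal sketch of how a completion source like the one in item 3 might contribute function candidates. The `Candidate` struct, the reduced `CandidateKind` enum, and `function_candidates` are hypothetical shapes for illustration; only the `CandidateKind::Function` variant name comes from this amendment. The "column wins" dedupe follows the commit message.

```rust
/// Reduced, hypothetical stand-in for the real candidate kinds.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CandidateKind {
    Keyword,
    Column,
    Function,
}

/// Hypothetical candidate shape: display text plus its kind (which the
/// renderer would map to a theme colour such as `tok_function`).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Candidate {
    pub text: String,
    pub kind: CandidateKind,
}

/// Function candidates to append after the existing keyword/type/column
/// candidates: prefix-filtered, and skipped when a column of the same
/// name is already offered (column wins).
pub fn function_candidates(
    partial: &str,
    functions: &[&str],
    existing: &[Candidate],
) -> Vec<Candidate> {
    let p = partial.to_ascii_lowercase();
    functions
        .iter()
        .copied()
        .filter(|name| name.starts_with(p.as_str()))
        .filter(|name| {
            !existing.iter().any(|c| {
                c.kind == CandidateKind::Column && c.text.eq_ignore_ascii_case(name)
            })
        })
        .map(|name| Candidate {
            text: name.to_string(),
            kind: CandidateKind::Function,
        })
        .collect()
}
```

Nothing in this path consults the parser: a name missing from `functions` is simply not offered, and whether it parses or executes is decided elsewhere.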
**Pedagogy:** the same dedicated-colour rationale as Amendment 4 — a
learner can tell *"this is a function"* at a glance, and Tab now
*teaches* the function vocabulary instead of assuming it.
**Coverage:** `sql_functions::{list_is_sorted_and_lowercase,
list_has_no_duplicates, cast_is_excluded, prefix_match_is_case_insensitive,
empty_prefix_matches_all, unknown_prefix_does_not_match}`;
`completion::{sql_expr_slot_offers_known_function_candidates,
projection_slot_offers_known_function_candidates,
sql_function_candidates_filter_by_prefix,
sql_function_candidates_carry_function_kind,
function_candidates_absent_at_non_expression_slots,
cast_is_not_offered_as_a_function_candidate,
invalid_ident_fires_for_genuine_typo_at_sql_expr_slot,
invalid_ident_does_not_fire_for_function_prefix_at_sql_expr_slot,
invalid_ident_does_not_fire_for_column_prefix_at_sql_expr_slot}`;
`input_render::advanced_select_genuine_column_typo_before_from_warns_at_typing_time`;
`theme::function_colour_is_distinct_from_keyword_identifier_and_type`.
See ADR-0031's status note for the grammar-side anchor.
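
The "sorted, lowercase, no duplicates" invariant that the first `sql_functions` tests pin could be checked along these lines. This is a sketch, not the repository's test code, and `is_sorted_lowercase_unique` is a hypothetical helper name.

```rust
/// True when `names` is strictly ascending (so sorted and duplicate-free)
/// and contains no ASCII uppercase letters.
pub fn is_sorted_lowercase_unique(names: &[&str]) -> bool {
    let all_lower = names
        .iter()
        .all(|n| !n.chars().any(|c| c.is_ascii_uppercase()));
    // Strict `<` between neighbours rules out both disorder and repeats.
    let strictly_ascending = names.windows(2).all(|w| w[0] < w[1]);
    all_lower && strictly_ascending
}
```

Using a strict comparison lets one pass cover both the ordering and the duplicate check.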
## Out of scope
Deliberately deferred to keep this ADR shippable as a single
@@ -409,3 +409,24 @@ Later phases extend the same fragment:
set the engine-neutrality posture and the no-allowlist rule.
- `docs/simple-mode-limitations.md` — the DSL limits this grammar
lifts for advanced mode (§1, §4).
## Status note — known-function list layered on the slot (2026-05-30)
The `sql_expr_ident` slot is `IdentSource::Columns` and, per §1 / §5,
does **not** itself know which identifiers are function names — it
optimises for the common case (a column reference) and admits the
function-call shape structurally; §5 explicitly noted "function names
are not completed … a typed function name simply is not a candidate".
**ADR-0022 Amendment 6** layers a curated known-function list
(`src/dsl/sql_functions.rs`) on top of this slot, consumed two ways:
as Tab-completion candidates so a learner can discover `sum` / `upper`
/ … (issue #15 — softening §5's "not completed" line to "completed
from a curated pedagogical list, not an allowlist for validation"),
and as the exemption list that lets the typing-time column-typo hint stay
strict at this slot — flag a partial as "no such column" only when it
matches neither a schema column nor a known function name (issue #16).
The grammar here is unchanged, and §6/§7's no-validation-allowlist
posture stands: the list drives completion + the typo hint, **not**
parse-time acceptance (an unknown function still parses and surfaces an
engine-neutral execution error). The list sits in the completion /
hint layer above the grammar.