feat: curated SQL function list — Tab completion (#15) + typing-time typo hint (#16)

Add src/dsl/sql_functions.rs (KNOWN_SQL_FUNCTIONS) as the shared source of truth at sql_expr_ident slots:

- #15: offer the functions as Tab candidates under a new CandidateKind::Function + ninth Theme colour tok_function (blue, distinct from keyword/identifier/type).
- #16: restore the column-typo flag the #6 fix had dropped wholesale. invalid_ident_at_cursor now bails only when the partial prefix-matches a known function, else falls through to the schema-column check.

A column named like a function (e.g. `count`) is deduped (column wins). `cast` is excluded because CAST(x AS type) is not a plain-call shape. The no-validation-allowlist posture stands: the list drives completion + the typo hint only, never parse-time acceptance.

Docs: ADR-0022 Amendment 6, ADR-0031 status note, README index, requirements I3/I4 + refreshed test baseline.
2026-05-31 11:49:10 +00:00
parent 01ec926ec8
commit 6d8c9eea36
10 changed files with 570 additions and 25 deletions
@@ -686,6 +686,92 @@ can be revisited if hints routinely need more.
 full-screen snapshots (the empty-state placeholder and the `insert`
 usage hint now wrap to their full text instead of being clipped).

+## Amendment 6 — Curated SQL function names: completion + typing-time typo hint (2026-05-30)
+
+The advanced-mode SQL expression grammar (ADR-0031) accepts a
+function-call *shape* — `name(args)` — at its `sql_expr_ident` slot
+but, by design, does **not** know which names are real functions
+(ADR-0031 §1: the slot is `IdentSource::Columns`, optimised for the
+common case of a column reference; the walker is a structural matcher,
+not an evaluator). That left two gaps at this slot, raised as issues
+#15 and #16:
+
+- **#16 (regression).** The earlier issue-#6 function-call validator
+  fix dropped `invalid_ident_at_cursor`'s "no such column" flag at
+  every `sql_expr_ident` position — necessary to stop a false positive
+  on a function name like `sum`, but it also silenced the typing-time
+  signal for a *genuine* column typo in an incomplete expression
+  (`select Agx` before a `FROM` brings the schema-existence diagnostic
+  into scope). Typing-time became *less* eager than submit-time.
+- **#15 (discovery).** Tab at a `sql_expr_ident` slot offered schema
+  columns + a few expression keywords (`null`, `distinct`, `case`, …)
+  but no function names, so a learner had to already know `sum` / `avg`
+  / `upper` to type them.
+
+Both want the same thing: a single source of truth for *"what SQL
+function names does this playground recognise"*.
+
+**Change:**
+
+1. **Curated list.** New `src/dsl/sql_functions.rs` with
+   `KNOWN_SQL_FUNCTIONS` (sorted, lowercase — a pinned invariant) and
+   an `is_known_function_prefix()` helper. A deliberately *curated
+   pedagogical set*, not "every SQLite built-in": the aggregates a
+   learner meets first (`count`/`sum`/`avg`/`min`/`max`), the common
+   scalars (`length`/`upper`/`lower`/`trim`/`substr`/`coalesce`/`abs`/
+   `round`), and a broader scalar tier (`date`/`datetime`/`strftime`/
+   `hex`/`ifnull`/`nullif`/`replace`/`instr`/`typeof`/`random`).
+   **`cast` is deliberately excluded** — SQLite's `CAST(expr AS type)`
+   is not a plain-call shape the expression grammar parses, so
+   offering it would surface a candidate that does not complete; it
+   stays out until the grammar grows a dedicated `CAST` form.
+2. **#16 — restore the typo flag, narrowly.** `invalid_ident_at_cursor`
+   no longer bails wholesale at a `sql_expr_ident` slot. It bails only
+   when the partial prefix-matches a known function name; otherwise it
+   falls through to the existing schema-column check, which flags "no
+   such column" unless the partial prefix-matches a real column. So
+   `select Agx` warns again at typing time while `select sum` does not.
+   The submit-time `unknown_column` diagnostic path is untouched; the
+   issue-#6 lockdown tests (`genuine_column_typo_in_complete_select_…`,
+   `advanced_select_partial_function_name_not_flagged_…`) still pass.
+3. **#15 — offer functions as candidates.** A new completion source
+   (Source 1.8) contributes `KNOWN_SQL_FUNCTIONS` (prefix-filtered like
+   every other source) whenever the expected set contains a
+   `sql_expr_ident` slot, ordered after keywords/types (a learner
+   reads clause keywords first, then discovers callables).
+4. **New `CandidateKind::Function` + `tok_function` colour.** Just as
+   Amendment 4 gave types their own class, function candidates get a
+   dedicated kind and a ninth `Theme` colour field (`tok_function`,
+   a blue distinct from keyword purple, identifier teal, and type
+   pink/magenta in both `dark()` and `light()`) so a callable reads
+   apart from a clause keyword, a column reference, and a column type.
+
+**No-validation-allowlist posture stands (ADR-0031 §6/§7).** The list
+drives *completion* and the *typo hint* only — never parse-time
+acceptance. An unknown or engine-specific function still parses (the
+grammar admits the call shape generically) and surfaces an
+engine-neutral *execution* error, exactly as before.
+
+**Pedagogy:** the same dedicated-colour rationale as Amendment 4 — a
+learner can tell *"this is a function"* at a glance, and Tab now
+*teaches* the function vocabulary instead of assuming it.
+
+**Coverage:** `sql_functions::{list_is_sorted_and_lowercase,
+list_has_no_duplicates, cast_is_excluded, prefix_match_is_case_insensitive,
+empty_prefix_matches_all, unknown_prefix_does_not_match}`;
+`completion::{sql_expr_slot_offers_known_function_candidates,
+projection_slot_offers_known_function_candidates,
+sql_function_candidates_filter_by_prefix,
+sql_function_candidates_carry_function_kind,
+function_candidates_absent_at_non_expression_slots,
+cast_is_not_offered_as_a_function_candidate,
+invalid_ident_fires_for_genuine_typo_at_sql_expr_slot,
+invalid_ident_does_not_fire_for_function_prefix_at_sql_expr_slot,
+invalid_ident_does_not_fire_for_column_prefix_at_sql_expr_slot}`;
+`input_render::advanced_select_genuine_column_typo_before_from_warns_at_typing_time`;
+`theme::function_colour_is_distinct_from_keyword_identifier_and_type`.
+See ADR-0031's status note for the grammar-side anchor.
+
 ## Out of scope

 Deliberately deferred to keep this ADR shippable as a single
@@ -409,3 +409,24 @@ Later phases extend the same fragment:
  set the engine-neutrality posture and the no-allowlist rule.
 - `docs/simple-mode-limitations.md` — the DSL limits this grammar
  lifts for advanced mode (§1, §4).
+
+## Status note — known-function list layered on the slot (2026-05-30)
+
+The `sql_expr_ident` slot is `IdentSource::Columns` and, per §1 / §5,
+does **not** itself know which identifiers are function names — it
+optimises for the common case (a column reference) and admits the
+function-call shape structurally; §5 explicitly noted "function names
+are not completed … a typed function name simply is not a candidate".
+**ADR-0022 Amendment 6** layers a curated known-function list
+(`src/dsl/sql_functions.rs`) on top of this slot, consumed two ways:
+as Tab-completion candidates so a learner can discover `sum` / `upper`
+/ … (issue #15 — softening §5's "not completed" line to "completed
+from a curated pedagogical list, not an allowlist for validation"),
+and as the exemption list that lets the typing-time column-typo hint stay
+strict at this slot: a partial is flagged as "no such column" only when it
+prefix-matches neither a schema column nor a known function name (issue #16).
+The grammar here is unchanged, and §6/§7's no-validation-allowlist
+posture stands: the list drives completion + the typo hint, **not**
+parse-time acceptance (an unknown function still parses and surfaces an
+engine-neutral execution error). The list sits in the completion /
+hint layer above the grammar.
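
The behaviour the two ADR hunks describe can be sketched roughly as follows. This is a minimal illustration, not the contents of `src/dsl/sql_functions.rs`: the list is reconstructed from the names Amendment 6 enumerates, and `should_flag_no_such_column` and `function_candidates` are hypothetical helpers standing in for the narrowed `invalid_ident_at_cursor` check and the new completion source. The `is_known_function_prefix` signature and the case-insensitive column match are assumptions.

```rust
// Reconstructed from the names listed in Amendment 6 (sorted, lowercase,
// `cast` deliberately absent). The real list may differ.
const KNOWN_SQL_FUNCTIONS: &[&str] = &[
    "abs", "avg", "coalesce", "count", "date", "datetime", "hex", "ifnull",
    "instr", "length", "lower", "max", "min", "nullif", "random", "replace",
    "round", "strftime", "substr", "sum", "trim", "typeof", "upper",
];

/// True when `partial` is a case-insensitive prefix of any known function.
/// An empty partial matches every entry. (Signature is an assumption.)
fn is_known_function_prefix(partial: &str) -> bool {
    let lowered = partial.to_ascii_lowercase();
    KNOWN_SQL_FUNCTIONS
        .iter()
        .any(|name| name.starts_with(lowered.as_str()))
}

/// Hypothetical stand-in for the narrowed typo check (#16): bail when the
/// partial could still become a function name, otherwise flag it unless it
/// prefix-matches a schema column (case-insensitive here by assumption).
fn should_flag_no_such_column(partial: &str, columns: &[&str]) -> bool {
    if is_known_function_prefix(partial) {
        return false;
    }
    let lowered = partial.to_ascii_lowercase();
    !columns
        .iter()
        .any(|col| col.to_ascii_lowercase().starts_with(lowered.as_str()))
}

/// Hypothetical stand-in for the completion source (#15): prefix-filter the
/// list and drop any name that collides with a schema column (column wins).
fn function_candidates(partial: &str, columns: &[&str]) -> Vec<&'static str> {
    let lowered = partial.to_ascii_lowercase();
    KNOWN_SQL_FUNCTIONS
        .iter()
        .copied()
        .filter(|name| name.starts_with(lowered.as_str()))
        .filter(|name| !columns.iter().any(|col| col.eq_ignore_ascii_case(name)))
        .collect()
}
```

Under these assumptions, `select Agx` against a table with an `Age` column would be flagged at typing time, while `select sum` would not. Neither helper takes part in parsing, which matches the no-validation-allowlist posture.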