From 2a8618c783a1bb895e75c3bc529e12b9e3d84255 Mon Sep 17 00:00:00 2001 From: "claude@clouddev1" Date: Sat, 9 May 2026 08:49:53 +0000 Subject: [PATCH] ADR-0019: friendly error layer (H1) and i18n message catalog MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Settles the design we discussed across this session's follow-up to the engine-vocabulary audit: - A central `friendly` module owns translation; the existing ad-hoc helpers (`friendly_change_column_engine_error`, `enrich_fk_message`) absorb into it. - Initial catalog covers UNIQUE / FK / NOT NULL / CHECK / type-mismatch errors with operation-tailored, pedagogically-voiced wording in verbose and short variants. - New `messages (short|verbose)` app-level command lets advanced learners shrink the output. In-session state for now; persisted later when settings persistence lands. - Row pinpointing via post-failure re-query, rendered through ADR-0017 §7's bordered diagnostic-table renderer. `FriendlyError` is a structured payload (headline + hint + optional table); `output_render` composes it. - i18n foundation: hierarchical YAML catalog, embedded via `include_str!`, fixed locale (en-US) for now, no external files. `{name}` plain substitution; format specifiers explicitly rejected so a translator cannot reformat values. Value formats stay invariant across all locales (ISO 8601 dates, `.` decimals, `true`/`false`, `NULL`) — explicitly not a translatable concern. - Migration sweep is required follow-on but separable: a `t!()` macro marks call sites and lets per-category PRs land incrementally. Anchor-phrase list (§10) limits test churn for the most common substring assertions. Out of scope and explicitly deferred: advanced-mode SQL error sanitisation (waits on Q1), settings persistence for the messages command, plural-form rules per locale, runtime locale selection, locale-aware value formatting (rejected, not deferred), constraint-management surface (C3 territory). README index updated. --- .../adr/0019-friendly-error-layer-and-i18n.md | 623 ++++++++++++++++++ docs/adr/README.md | 1 + 2 files changed, 624 insertions(+) create mode 100644 docs/adr/0019-friendly-error-layer-and-i18n.md diff --git a/docs/adr/0019-friendly-error-layer-and-i18n.md b/docs/adr/0019-friendly-error-layer-and-i18n.md new file mode 100644 index 0000000..e76abd7 --- /dev/null +++ b/docs/adr/0019-friendly-error-layer-and-i18n.md @@ -0,0 +1,623 @@ +# ADR-0019: Friendly error layer (H1) and i18n message catalog + +## Status + +Accepted. + +Amends ADR-0002 (database engine — user-facing posture). Pulls +forward H1 from `requirements.md`. Introduces the i18n message +catalog as foundation infrastructure that subsequent work will +gradually take over for every user-visible string in the +codebase. + +## Context + +ADR-0002 commits to never exposing the underlying engine in +user-visible strings. Today the implementation has three gaps +between commitment and reality: + +1. **`DbError::friendly_message()` is a passthrough.** It calls + `Display`, which for the `Sqlite { message }` variant + surfaces the raw text from `rusqlite::Error`. In practice + today the visible result is mostly engine-neutral SQL + wording ("no such table: T", "FOREIGN KEY constraint + failed", "UNIQUE constraint failed: Customers.id") — the + engine-vocabulary audit (this session) confirmed there are + currently zero literal `SQLite` / `STRICT` / `PRAGMA` / + `rusqlite` matches in user-reachable strings. But the bar is + one passthrough away from slipping, the wording is not + pedagogically helpful, and it certainly isn't translatable. + +2. **Two ad-hoc helpers already do partial translation.** + `friendly_change_column_engine_error` (db.rs) recognises + constraint / type errors that surface during the + `--dont-convert` change-column path and rewrites them with + abstract "the database refused …" wording. + `enrich_fk_message` appends the relevant FK list to + "FOREIGN KEY constraint failed". Both are call-site-local + and grew up where the immediate need was; they do not + compose. + +3. **No i18n foundation.** Every user-visible string in the + codebase is a literal — typically inside `format!` calls, + sometimes inside helper-returned strings. Adding a second + locale today would mean editing every literal. The project + audience is not US-only, and the cost of laying the + foundation now is much smaller than the cost of doing it + after the codebase has tripled in size. + +ADR-0017 §7 introduced the bordered diagnostic-table renderer +for refusal payloads (lossy rows, incompatible rows, uniqueness +collisions). That renderer is the right vehicle for any +H1 wording that pinpoints offending rows — symmetric with how +`change column` already explains itself. + +The pedagogical posture across `[client-side]` notes +(ADR-0017 §6, ADR-0018 §9) is "the tool did this for you; +here's what raw SQL would have needed". H1 errors should match +that voice: tell the user what happened, why, and what to do +next, rather than echoing a constraint name and stopping there. + +## Decision + +### 1. The unifying principle + +> Every user-visible message goes through a translation +> catalog. The catalog supplies template strings; the +> application renders values into them. Engine error text is +> never surfaced verbatim — it is structurally classified, and +> the catalog produces operation-tailored wording. + +The H1 layer is the single chokepoint. Once the migration sweep +(§9) is complete, no user-reachable string in the codebase is a +plain literal. Engine vocabulary cannot leak through a path the +catalog doesn't own. + +### 2. Architecture + +A new module — `src/friendly/` — owns the translation logic. +The shape: + +```rust +pub struct TranslateContext<'a> { + pub operation: Operation, + pub table: Option<&'a str>, + pub column: Option<&'a str>, + pub source_type: Option, + pub target_type: Option, + pub verbosity: Verbosity, + pub database: Option<&'a Database>, +} + +pub enum Operation { + Insert, + Update, + Delete, + CreateTable, + DropTable, + AddColumn, + DropColumn, + RenameColumn, + ChangeColumnType, + AddRelationship, + DropRelationship, + Query, + Replay, +} + +pub fn translate(error: &DbError, ctx: &TranslateContext) -> FriendlyError; +``` + +`FriendlyError` is a structured payload (§7), not a flat +string. The `output_render` module — which already owns the +diagnostic-table renderer — assembles the payload into final +text. + +The translator may use `ctx.database` to re-query for offending +rows (§6). Re-query is best-effort; failure falls back to the +non-pinpointed wording. + +### 3. Translator catalog — initial scope + +H1's first iteration covers five engine-error categories that +the simple-mode user can plausibly trigger: + +- **UNIQUE constraint violations** — including PK collisions on + INSERT and UPDATE, and the auto-fill collisions surfaced by + ADR-0018 §6's pre-flight check. +- **FOREIGN KEY constraint violations** — distinguishing + parent-side (DELETE / UPDATE that orphans children) from + child-side (INSERT / UPDATE with no matching parent). +- **NOT NULL constraint violations** — INSERT and UPDATE. +- **CHECK constraint violations** — placeholder coverage; we + don't emit CHECK constraints today, but the catalog covers + the case so it's ready when constraint-management lands + (track C3). +- **Type mismatch** — values rejected by STRICT typing because + they aren't the expected storage class (mostly the + `--dont-convert` change-column path today). + +The existing helpers absorb into this catalog: + +- `friendly_change_column_engine_error` → keys under + `error.change_column.*` (one per recognised category). +- `enrich_fk_message` → its FK-listing logic moves into the + catalog renderer for `error.foreign_key.*`. + +### 4. Operation-tailored wording — no generic "constraint failed" + +A NOT NULL violation on INSERT and on UPDATE share an underlying +cause but read very differently to the learner: on INSERT they +forgot a value, on UPDATE they tried to clear one. The catalog +gives each operation its own template. Operation × kind × +verbosity is the typical depth. + +Where two operations genuinely produce identical wording (rare), +the catalog uses a single key referenced from both call sites +rather than duplicating. This is a maintenance optimisation, +not a default; first-cut keys are operation-specific. + +### 5. Pedagogical voice + the `messages (short|verbose)` command + +The default verbosity is **verbose**: headline + cause + hint. +Verbose example for a UNIQUE violation on INSERT: + +``` +A row with this value already exists. +The `id` column on `Customers` is a primary key, which must be +unique. Pick a different value, or update the existing row +with `update Customers ... where id=...`. +``` + +Short variant: + +``` +A row with `id=5` already exists in `Customers`. +``` + +A new app-level command (parallel to `mode`): + +``` +messages — show current verbosity +messages short — switch to short messages +messages verbose — switch to verbose messages (default) +``` + +Verbosity is in-session state on `App`; persisted later when +settings persistence lands. The current verbosity is threaded +through `TranslateContext::verbosity`; the catalog has separate +keys for each (`error.unique.insert.verbose`, +`error.unique.insert.short`). + +Both verbosities respect the screen-real-estate constraint of +the TUI. Verbose is "two or three short paragraphs of +explanation"; not "a wall of text". Short is "one line, no +hint". + +### 6. Row pinpointing via re-query and the diagnostic-table renderer + +For UNIQUE / NOT NULL / FK violations the engine doesn't tell us +*which* row. Where it matters, the translator re-queries the +database post-failure to find the offending rows and renders +them through ADR-0017 §7's bordered diagnostic-table renderer. + +This is symmetric with how `change column` already handles +lossy / incompatible / collision refusals. + +Re-query is best-effort: + +- **UNIQUE on INSERT/UPDATE**: re-query the table for rows where + the offending column matches the value the user tried to + insert. Render the resulting row(s) under a "this row already + exists with the value you tried to use" header. +- **FK child-side on INSERT/UPDATE**: list the parent table and + point out that no matching row exists there. (Pinpointing the + *attempted* child value is straightforward; pinpointing + parent rows that *would* satisfy is not — and not what the + user needs.) +- **FK parent-side on DELETE/UPDATE**: re-query child tables for + rows that reference the row being touched. Render under a + "these rows reference the row you tried to delete" header. + Capped at ADR-0017's `DIAGNOSTIC_ROW_CAP` (100). + +If the re-query fails or returns no rows (engine state changed +under us, locking, the engine's idea of "which row" differs +from ours), fall back to the constraint-only verbose wording — +no diagnostic table. + +The translator owns the re-query. Call sites do **not** do +their own pinpointing — that scatters logic and re-introduces +the inconsistency that the existing two helpers exhibit. + +### 7. The `FriendlyError` payload + +```rust +pub struct FriendlyError { + /// One-line headline. Always present. Both verbosities + /// share this line; verbose appends to it. + pub headline: String, + + /// Pedagogical "what to do next" hint. Populated only in + /// verbose mode; `None` in short mode. + pub hint: Option, + + /// Bordered row pinpoint (ADR-0017 §7 renderer). Populated + /// when re-query succeeded for kinds that pinpoint + /// (§6); `None` otherwise. Verbosity-independent: short + /// mode also benefits from row context when available, but + /// the catalog wording is terse. + pub diagnostic_table: Option, +} +``` + +`DiagnosticTable` is the existing structured form of ADR-0017's +bordered renderer, lifted into a `pub` type the friendly +module can construct. + +The renderer (in `output_render`) composes the three fields: +headline, then (verbose only) a blank line + hint, then +(if present) a blank line + the bordered table. + +### 8. The i18n message catalog + +#### 8.1 Scope + +The catalog covers user-visible **message strings only**. It +does **not** cover value formats — those stay invariant across +all locales (§8.7). + +This ADR introduces the catalog and uses it for the +friendly-error layer. Migration of every other user-visible +string in the codebase (help text, `[ok]` summaries, +`[client-side]` notes, parse errors, modal labels, mode +banners, replay messages, the rebuild-confirmation prompt, the +load-picker entries, etc.) is a required follow-on, but +separable — see §9. + +#### 8.2 Storage and locale + +- The catalog lives in YAML files under `src/friendly/strings/`, + embedded at compile time via `include_str!` and parsed once at + startup with `serde_yaml` (or equivalent). +- One file per locale: `en-US.yaml` is the only file for now. +- **No external file loading at runtime.** Translations land in + the codebase via PR, not by dropping files into a directory. + This protects against accidental corruption and against + bored students discovering they can override strings on a + shared machine. +- Locale selection is fixed for now; runtime selection is + deferred to a follow-on ADR alongside settings persistence. + +#### 8.3 Key structure — flexible, per-category schema + +YAML hierarchical groups. Each top-level category defines the +dimensions its keys carry; not every category has every +dimension. + +The error category typically uses +`{kind} × {operation} × {verbosity}`: + +```yaml +error: + unique: + insert: + verbose: "..." + short: "..." + update: + verbose: "..." + short: "..." + foreign_key: + child_side: + insert: { verbose: "...", short: "..." } + update: { verbose: "...", short: "..." } + parent_side: + delete: { verbose: "...", short: "..." } + update: { verbose: "...", short: "..." } + not_null: + insert: { verbose: "...", short: "..." } + update: { verbose: "...", short: "..." } +``` + +Other categories will use other dimensions: + +```yaml +help: + cli_banner: "..." + in_app_command_list: "..." + +ok: + summary: "[ok] {verb} {subject}" + +client_side: + transformed: { verbose: "...", short: "..." } + auto_fill: { serial: "...", short_id: "..." } + +replay: + completed: "[ok] replay {path} — {count} command(s) run" + failed_with_line: + "replay {path} failed at line {line}: {error}" + failed_no_line: "replay {path} failed: {error}" +``` + +The category schema is documented as a Rust enum / struct in +`friendly::keys` and validated at test time (§8.6). Adding a +new dimension is a code change, not a freeform YAML edit. + +#### 8.4 Format strings — `{name}` plain substitution, no specifiers + +Strings use named placeholders. Values are interpolated at +substitution time: + +```yaml +error.unique.insert.verbose: | + A row with this value already exists. + The `{column}` column on `{table}` is unique-constrained. + Pick a different value or update the existing row. +``` + +**Format specifiers are explicitly rejected.** A translator +cannot write `{value:08.2}`, `{value:>10}`, etc. The substitution +implementation refuses any `{name:...}` form at parse time with +a clear error. Rationale: the catalog positions values inside +prose; it does not reformat them. Value rendering is the +application's job (§8.7) and must be uniform across all +locales. + +Implementation: a hand-rolled substitution helper, ~30 lines, +that walks the template, recognises `{name}` segments, looks +up the value in the supplied keyword arguments, and inserts it. +Unknown placeholders → translation error (caught at test time +when the catalog is exercised). + +#### 8.5 Pluralisation — `(s)` suffix for now + +The existing codebase uses the +`"{count} row(s) inserted"` style. The catalog continues this. +Real plural-form rules per locale (Fluent / ICU MessageFormat +have zero / one / few / many / two / other tagged forms, +language-specific) are explicitly **deferred**. Adding them is +a meaningful design lift, not a drop-in upgrade, and the H1 +scope here is already wide enough. + +A future "advanced i18n" ADR would address pluralisation, +gendered forms, ordinals, and date/number formatting hooks. +This ADR documents that the limitation is known and intentional. + +#### 8.6 Validation + +Two layers of catalog validation: + +- **At test time (build-time).** A unit test loads every catalog + file and verifies: + - The YAML parses. + - Every key referenced from the friendly module exists. + - Every placeholder in every string is in the expected set + for that key (declared by the friendly module, per §8.3). + - No catalog key contains a format specifier. + - No catalog string contains forbidden engine vocabulary + (the same check the engine-vocabulary audit applies). +- **At startup.** A small sanity check parses the embedded YAML + and panics loudly if it doesn't (catches a corrupted build + artefact). Cheap; runs once per process. + +#### 8.7 Value formats stay invariant + +Locale-aware value formatting is **explicitly rejected**. The +app uses a single set of formats across all locales, both for +input and output: + +- **Dates**: ISO 8601 `YYYY-MM-DD`. +- **Datetimes**: ISO 8601 `YYYY-MM-DDTHH:MM:SS`. +- **Decimal point**: `.`. +- **Thousands separator**: none. +- **Booleans**: `true` / `false` — treated as part of the + SQL/relational vocabulary the user is learning, not as a + translatable string. (Like `NULL`, which also stays as + `NULL`.) +- **NULL**: `NULL`. +- **Blob**: `` placeholder. + +Rationale: + +- ISO 8601 is the international standard, not a US-centric + format. The most common worry — "I don't want US-style + MM/DD/YYYY imposed on the user" — does not apply. +- Locale-aware value parsing on input is fragile: an English + installation can't read a German `1,5` without ambiguity, + CSVs cross locale borders, projects shared across teams break + unpredictably. Several real RDBMS allow this; we explicitly + do not, because the playground's pedagogical contract is + "you are learning what databases want", and what they want is + invariant. +- Booleans stay English-style because they're SQL keywords + (`TRUE` / `FALSE`) the learner needs to recognise. + +The validator (§8.6) does not police value formats inside +catalog strings — it can't, since they're free-form prose. But +the format-specifier-rejection rule (§8.4) means a catalog +author cannot accidentally pad, round, or reformat a value +they reference; they can only quote it as-is. + +### 9. Migration sweep — `t!()` macro marker + +A small macro provides the call-site shape: + +```rust +t!(error.unique.insert.verbose, table = "Customers", column = "id") +``` + +- Resolves the catalog key at compile time (or at first call, + with the lookup result memoised). +- Substitutes named arguments per §8.4. +- Returns the rendered `String`. +- Panics at test time if the key is missing or placeholders + don't match (§8.6). + +The migration sweep proceeds **incrementally**. Each PR +migrates a coherent group of literal strings — for instance, +"all replay-related messages", "all `[ok]` summaries", "all +parse error wordings". Reviewable in isolation. + +The remaining work is visible: + +``` +grep -E '"[^"]' src/ | grep -v 't!(' +``` + +approximates the unmigrated callsites. (Imperfect — strings +inside DDL templates, internal log messages, and error-kind +classification helpers are not user-visible — but a useful +proxy.) + +H1 itself ships with all friendly-error-layer call sites +already migrated. The friendly-error catalog populating +`error.*` keys is part of the H1 PR. Other categories +(`help.*`, `ok.*`, `client_side.*`, `replay.*`, `parse.*`, ...) +are populated as their migration PRs land, each adding the +catalog entries it needs and removing the corresponding +literals from code. + +### 10. Backwards compatibility — anchor phrases + +Many existing tests assert substring matches on error wording +(`message.contains("no such table")`, +`message.contains("already exists")`, …). H1 will rewrite the +surrounding wording. Rather than break every such test, the +catalog commits to a small set of **anchor phrases** that stay +stable and that tests can keep asserting against: + +| Anchor phrase | Where it appears | +|---------------|------------------| +| `no such table` | object-not-found errors targeting tables | +| `no such column` | object-not-found errors targeting columns | +| `no such relationship` | object-not-found errors targeting relationships | +| `already exists` | name-collision errors (table, column, relationship) | +| `already has the value` | UNIQUE-violation headlines (replaces the engine's "UNIQUE constraint failed") | +| `cannot be converted` | type-incompatible refusal headers (ADR-0017 §5) | +| `discard information` | lossy refusal headers (ADR-0017 §5) | +| `referenced by` | FK-cascade refusals (parent-side) | +| `[client-side]` | the existing pedagogical-note prefix | + +The list is small on purpose: anchor phrases are a maintenance +liability (a rewording of any of them needs an ADR amendment +plus a test sweep). Tests that need to assert on wording +should prefer these. Tests that don't need to assert on +wording should not. + +## Out of scope + +These are deliberately deferred to keep H1 shippable: + +1. **Translation of every other user-visible string** + (`help.*`, `ok.*`, `client_side.*`, `replay.*`, `parse.*`, + modal labels, mode banners, load-picker entries, …). Required + follow-on; tracked as separable PRs per §9. +2. **Advanced-mode SQL error sanitization.** Q1 isn't wired yet. + When it is, a future ADR will define how to surface "real" SQL + errors with engine artefacts removed, while still respecting + ADR-0002. Today's H1 does not handle this path. +3. **Settings persistence for the `messages` command.** Lives in + a future settings ADR. +4. **Plural-form rules per locale.** §8.5. +5. **Runtime locale selection.** §8.2. +6. **Locale-aware value formatting.** §8.7 — explicitly rejected, + not just deferred. +7. **Constraint-management surface (`add unique`, `add check`).** + The catalog covers CHECK violations as a placeholder so the + wiring is ready, but C3-track work owns the DDL surface. + +## Consequences + +### Positive + +- **Single chokepoint for engine vocabulary.** Once the + migration sweep is complete, the engine-vocabulary audit + becomes structurally enforceable — there is no path for an + engine name to leak that doesn't go through the catalog, and + the catalog's validator policies it. +- **Operation-tailored, pinpointed errors materially improve + pedagogy.** A learner who tries to `INSERT` a duplicate `id` + sees the row that already exists, the column the constraint + applies to, and a hint pointing to UPDATE. They learn what + the constraint *is*, not just that one fired. +- **i18n foundation in place.** Future contributors can add + translations via PR. The codebase doesn't need to be + re-architected for it. +- **The `messages` command** lets advanced learners shrink the + output once they recognise the patterns. Reduces frustration + with users who outgrow the verbose voice. + +### Costs + +- **Catalog file becomes a maintenance dimension.** Every new + error path means a new key. The schema validator catches + forgotten keys but doesn't write them. +- **Migration sweep is non-trivial.** Probably 50–80 user-visible + literals across `src/`. Each is mechanical but the total is + hours. +- **Test churn during the sweep.** Substring assertions on + rewritten messages need updating. The anchor-phrase list + (§10) limits the blast radius for the most common cases but + doesn't eliminate it. +- **A small dependency.** `serde_yaml` (or equivalent) for the + catalog parse. We already use `serde_yaml` for `project.yaml`, + so this is "use what we have", not "add a dependency". + +## Implementation notes + +These are sketch-level — H1 implementation will produce more +detailed plans, but they're enough that a session picking this +up has direction: + +### Order of operations + +1. **Skeleton.** Create `src/friendly/` with the module + structure: `mod.rs`, `keys.rs` (per-category schema), `format.rs` + (the substitution helper), `strings/en-US.yaml` (empty-ish + for now). Wire `t!()` macro. +2. **Catalog populate — `error.*` first.** Five categories per + §3, each with verbose + short variants per relevant + operation. Anchor-phrase compliance per §10. +3. **`FriendlyError` struct + renderer.** Lift `DiagnosticTable` + to a public type from `output_render`. Compose the renderer. +4. **Translator.** `translate(error, ctx)` that classifies the + `DbError` payload and produces a `FriendlyError`. Re-query + logic per §6. +5. **Wire `friendly_message` to the translator.** The two + existing helpers (`friendly_change_column_engine_error`, + `enrich_fk_message`) are removed; their behaviour is now in + the catalog. +6. **`messages (short|verbose)` command.** App-level, parallel + to `mode`. Stored on `App::messages_verbosity`; threaded + into `TranslateContext`. +7. **Validator unit test.** Per §8.6 — exercises the catalog + end-to-end at test time. +8. **Update tests that assert on rewritten wording.** Use the + anchor phrases where applicable. Document any test that + could not use an anchor phrase as a candidate for + future stability work. + +### Things that interact subtly + +- **The engine-vocabulary audit** (this session, + `tests/engine_vocabulary_audit.rs`) keeps applying. With H1 + the audit becomes a regression net for the catalog: any + catalog string containing forbidden vocabulary fails the + audit at test time. +- **ADR-0017's diagnostic-table renderer** is currently + produced from inside `db.rs` (`render_lossy_diagnostic`, + `render_incompatible_diagnostic`, …). Those renderers do not + go away — they have a specific shape (per-cell "old → new" + with reason) that the friendly module's row-pinpointer does + not. They migrate to the catalog for their *prose* + (headers, footer hints) but keep their structure-rendering + logic. +- **The `[client-side]` notes from ADR-0017/0018** are not + errors — they're success notes. They go through the catalog + too (under `client_side.*`) but as part of the migration + sweep, not the H1 PR. Their voice is already pedagogical; + the catalog migration is mechanical. +- **`ParseError`** is its own error type (DSL parse errors + surfaced through `humanise()`), not a `DbError`. It gets the + same translation treatment under `parse.*`, but as part of + the migration sweep. diff --git a/docs/adr/README.md b/docs/adr/README.md index 3f20739..48cdcca 100644 --- a/docs/adr/README.md +++ b/docs/adr/README.md @@ -24,3 +24,4 @@ This directory contains the project's ADRs, recorded per - [ADR-0016 — Pretty table rendering for data and structure views](0016-pretty-table-rendering.md) - [ADR-0017 — Column type-change compatibility](0017-column-type-change-compatibility.md) - [ADR-0018 — Auto-fill contracts for `serial` and `shortid` columns](0018-auto-fill-contracts-for-serial-and-shortid.md) +- [ADR-0019 — Friendly error layer (H1) and i18n message catalog](0019-friendly-error-layer-and-i18n.md)