# ADR-0019: Friendly error layer (H1) and i18n message catalog ## Status Accepted. Amends ADR-0002 (database engine — user-facing posture). Pulls forward H1 from `requirements.md`. Introduces the i18n message catalog as foundation infrastructure that subsequent work will gradually take over for every user-visible string in the codebase. ## Context ADR-0002 commits to never exposing the underlying engine in user-visible strings. Today the implementation has three gaps between commitment and reality: 1. **`DbError::friendly_message()` is a passthrough.** It calls `Display`, which for the `Sqlite { message }` variant surfaces the raw text from `rusqlite::Error`. In practice today the visible result is mostly engine-neutral SQL wording ("no such table: T", "FOREIGN KEY constraint failed", "UNIQUE constraint failed: Customers.id") — the engine-vocabulary audit (this session) confirmed there are currently zero literal `SQLite` / `STRICT` / `PRAGMA` / `rusqlite` matches in user-reachable strings. But the bar is one passthrough away from slipping, the wording is not pedagogically helpful, and it certainly isn't translatable. 2. **Two ad-hoc helpers already do partial translation.** `friendly_change_column_engine_error` (db.rs) recognises constraint / type errors that surface during the `--dont-convert` change-column path and rewrites them with abstract "the database refused …" wording. `enrich_fk_message` appends the relevant FK list to "FOREIGN KEY constraint failed". Both are call-site-local and grew up where the immediate need was; they do not compose. 3. **No i18n foundation.** Every user-visible string in the codebase is a literal — typically inside `format!` calls, sometimes inside helper-returned strings. Adding a second locale today would mean editing every literal. The project audience is not US-only, and the cost of laying the foundation now is much smaller than the cost of doing it after the codebase has tripled in size. ADR-0017 §7 introduced the bordered diagnostic-table renderer for refusal payloads (lossy rows, incompatible rows, uniqueness collisions). That renderer is the right vehicle for any H1 wording that pinpoints offending rows — symmetric with how `change column` already explains itself. The pedagogical posture across `[client-side]` notes (ADR-0017 §6, ADR-0018 §9) is "the tool did this for you; here's what raw SQL would have needed". H1 errors should match that voice: tell the user what happened, why, and what to do next, rather than echoing a constraint name and stopping there. ## Decision ### 1. The unifying principle > Every user-visible message goes through a translation > catalog. The catalog supplies template strings; the > application renders values into them. Engine error text is > never surfaced verbatim — it is structurally classified, and > the catalog produces operation-tailored wording. The H1 layer is the single chokepoint. Once the migration sweep (§9) is complete, no user-reachable string in the codebase is a plain literal. Engine vocabulary cannot leak through a path the catalog doesn't own. ### 2. Architecture A new module — `src/friendly/` — owns the translation logic. The shape: ```rust pub struct TranslateContext<'a> { pub operation: Operation, pub table: Option<&'a str>, pub column: Option<&'a str>, pub source_type: Option, pub target_type: Option, pub verbosity: Verbosity, pub database: Option<&'a Database>, } pub enum Operation { Insert, Update, Delete, CreateTable, DropTable, AddColumn, DropColumn, RenameColumn, ChangeColumnType, AddRelationship, DropRelationship, Query, Replay, } pub fn translate(error: &DbError, ctx: &TranslateContext) -> FriendlyError; ``` `FriendlyError` is a structured payload (§7), not a flat string. The `output_render` module — which already owns the diagnostic-table renderer — assembles the payload into final text. The translator may use `ctx.database` to re-query for offending rows (§6). Re-query is best-effort; failure falls back to the non-pinpointed wording. ### 3. Translator catalog — initial scope H1's first iteration covers five engine-error categories that the simple-mode user can plausibly trigger: - **UNIQUE constraint violations** — including PK collisions on INSERT and UPDATE, and the auto-fill collisions surfaced by ADR-0018 §6's pre-flight check. - **FOREIGN KEY constraint violations** — distinguishing parent-side (DELETE / UPDATE that orphans children) from child-side (INSERT / UPDATE with no matching parent). - **NOT NULL constraint violations** — INSERT and UPDATE. - **CHECK constraint violations** — placeholder coverage; we don't emit CHECK constraints today, but the catalog covers the case so it's ready when constraint-management lands (track C3). - **Type mismatch** — values rejected by STRICT typing because they aren't the expected storage class (mostly the `--dont-convert` change-column path today). The existing helpers absorb into this catalog: - `friendly_change_column_engine_error` → keys under `error.change_column.*` (one per recognised category). - `enrich_fk_message` → its FK-listing logic moves into the catalog renderer for `error.foreign_key.*`. ### 4. Operation-tailored wording — no generic "constraint failed" A NOT NULL violation on INSERT and on UPDATE share an underlying cause but read very differently to the learner: on INSERT they forgot a value, on UPDATE they tried to clear one. The catalog gives each operation its own template. Operation × kind × verbosity is the typical depth. Where two operations genuinely produce identical wording (rare), the catalog uses a single key referenced from both call sites rather than duplicating. This is a maintenance optimisation, not a default; first-cut keys are operation-specific. ### 5. Pedagogical voice + the `messages (short|verbose)` command The default verbosity is **verbose**: headline + cause + hint. Verbose example for a UNIQUE violation on INSERT: ``` A row with this value already exists. The `id` column on `Customers` is a primary key, which must be unique. Pick a different value, or update the existing row with `update Customers ... where id=...`. ``` Short variant: ``` A row with `id=5` already exists in `Customers`. ``` A new app-level command (parallel to `mode`): ``` messages — show current verbosity messages short — switch to short messages messages verbose — switch to verbose messages (default) ``` Verbosity is in-session state on `App`; persisted later when settings persistence lands. The current verbosity is threaded through `TranslateContext::verbosity`; the catalog has separate keys for each (`error.unique.insert.verbose`, `error.unique.insert.short`). Both verbosities respect the screen-real-estate constraint of the TUI. Verbose is "two or three short paragraphs of explanation"; not "a wall of text". Short is "one line, no hint". ### 6. Row pinpointing via re-query and the diagnostic-table renderer For UNIQUE / NOT NULL / FK violations the engine doesn't tell us *which* row. Where it matters, the translator re-queries the database post-failure to find the offending rows and renders them through ADR-0017 §7's bordered diagnostic-table renderer. This is symmetric with how `change column` already handles lossy / incompatible / collision refusals. Re-query is best-effort: - **UNIQUE on INSERT/UPDATE**: re-query the table for rows where the offending column matches the value the user tried to insert. Render the resulting row(s) under a "this row already exists with the value you tried to use" header. - **FK child-side on INSERT/UPDATE**: list the parent table and point out that no matching row exists there. (Pinpointing the *attempted* child value is straightforward; pinpointing parent rows that *would* satisfy is not — and not what the user needs.) - **FK parent-side on DELETE/UPDATE**: re-query child tables for rows that reference the row being touched. Render under a "these rows reference the row you tried to delete" header. Capped at ADR-0017's `DIAGNOSTIC_ROW_CAP` (100). If the re-query fails or returns no rows (engine state changed under us, locking, the engine's idea of "which row" differs from ours), fall back to the constraint-only verbose wording — no diagnostic table. The translator owns the re-query. Call sites do **not** do their own pinpointing — that scatters logic and re-introduces the inconsistency that the existing two helpers exhibit. ### 7. The `FriendlyError` payload ```rust pub struct FriendlyError { /// One-line headline. Always present. Both verbosities /// share this line; verbose appends to it. pub headline: String, /// Pedagogical "what to do next" hint. Populated only in /// verbose mode; `None` in short mode. pub hint: Option, /// Bordered row pinpoint (ADR-0017 §7 renderer). Populated /// when re-query succeeded for kinds that pinpoint /// (§6); `None` otherwise. Verbosity-independent: short /// mode also benefits from row context when available, but /// the catalog wording is terse. pub diagnostic_table: Option, } ``` `DiagnosticTable` is the existing structured form of ADR-0017's bordered renderer, lifted into a `pub` type the friendly module can construct. The renderer (in `output_render`) composes the three fields: headline, then (verbose only) a blank line + hint, then (if present) a blank line + the bordered table. ### 8. The i18n message catalog #### 8.1 Scope The catalog covers user-visible **message strings only**. It does **not** cover value formats — those stay invariant across all locales (§8.7). This ADR introduces the catalog and uses it for the friendly-error layer. Migration of every other user-visible string in the codebase (help text, `[ok]` summaries, `[client-side]` notes, parse errors, modal labels, mode banners, replay messages, the rebuild-confirmation prompt, the load-picker entries, etc.) is a required follow-on, but separable — see §9. #### 8.2 Storage and locale - The catalog lives in YAML files under `src/friendly/strings/`, embedded at compile time via `include_str!` and parsed once at startup with `serde_yaml` (or equivalent). - One file per locale: `en-US.yaml` is the only file for now. - **No external file loading at runtime.** Translations land in the codebase via PR, not by dropping files into a directory. This protects against accidental corruption and against bored students discovering they can override strings on a shared machine. - Locale selection is fixed for now; runtime selection is deferred to a follow-on ADR alongside settings persistence. #### 8.3 Key structure — flexible, per-category schema YAML hierarchical groups. Each top-level category defines the dimensions its keys carry; not every category has every dimension. The error category typically uses `{kind} × {operation} × {verbosity}`: ```yaml error: unique: insert: verbose: "..." short: "..." update: verbose: "..." short: "..." foreign_key: child_side: insert: { verbose: "...", short: "..." } update: { verbose: "...", short: "..." } parent_side: delete: { verbose: "...", short: "..." } update: { verbose: "...", short: "..." } not_null: insert: { verbose: "...", short: "..." } update: { verbose: "...", short: "..." } ``` Other categories will use other dimensions: ```yaml help: cli_banner: "..." in_app_command_list: "..." ok: summary: "[ok] {verb} {subject}" client_side: transformed: { verbose: "...", short: "..." } auto_fill: { serial: "...", short_id: "..." } replay: completed: "[ok] replay {path} — {count} command(s) run" failed_with_line: "replay {path} failed at line {line}: {error}" failed_no_line: "replay {path} failed: {error}" ``` The category schema is documented as a Rust enum / struct in `friendly::keys` and validated at test time (§8.6). Adding a new dimension is a code change, not a freeform YAML edit. #### 8.4 Format strings — `{name}` plain substitution, no specifiers Strings use named placeholders. Values are interpolated at substitution time: ```yaml error.unique.insert.verbose: | A row with this value already exists. The `{column}` column on `{table}` is unique-constrained. Pick a different value or update the existing row. ``` **Format specifiers are explicitly rejected.** A translator cannot write `{value:08.2}`, `{value:>10}`, etc. The substitution implementation refuses any `{name:...}` form at parse time with a clear error. Rationale: the catalog positions values inside prose; it does not reformat them. Value rendering is the application's job (§8.7) and must be uniform across all locales. Implementation: a hand-rolled substitution helper, ~30 lines, that walks the template, recognises `{name}` segments, looks up the value in the supplied keyword arguments, and inserts it. Unknown placeholders → translation error (caught at test time when the catalog is exercised). #### 8.5 Pluralisation — `(s)` suffix for now The existing codebase uses the `"{count} row(s) inserted"` style. The catalog continues this. Real plural-form rules per locale (Fluent / ICU MessageFormat have zero / one / few / many / two / other tagged forms, language-specific) are explicitly **deferred**. Adding them is a meaningful design lift, not a drop-in upgrade, and the H1 scope here is already wide enough. A future "advanced i18n" ADR would address pluralisation, gendered forms, ordinals, and date/number formatting hooks. This ADR documents that the limitation is known and intentional. #### 8.6 Validation Two layers of catalog validation: - **At test time (build-time).** A unit test loads every catalog file and verifies: - The YAML parses. - Every key referenced from the friendly module exists. - Every placeholder in every string is in the expected set for that key (declared by the friendly module, per §8.3). - No catalog key contains a format specifier. - No catalog string contains forbidden engine vocabulary (the same check the engine-vocabulary audit applies). - **At startup.** A small sanity check parses the embedded YAML and panics loudly if it doesn't (catches a corrupted build artefact). Cheap; runs once per process. #### 8.7 Value formats stay invariant Locale-aware value formatting is **explicitly rejected**. The app uses a single set of formats across all locales, both for input and output: - **Dates**: ISO 8601 `YYYY-MM-DD`. - **Datetimes**: ISO 8601 `YYYY-MM-DDTHH:MM:SS`. - **Decimal point**: `.`. - **Thousands separator**: none. - **Booleans**: `true` / `false` — treated as part of the SQL/relational vocabulary the user is learning, not as a translatable string. (Like `NULL`, which also stays as `NULL`.) - **NULL**: `NULL`. - **Blob**: `` placeholder. Rationale: - ISO 8601 is the international standard, not a US-centric format. The most common worry — "I don't want US-style MM/DD/YYYY imposed on the user" — does not apply. - Locale-aware value parsing on input is fragile: an English installation can't read a German `1,5` without ambiguity, CSVs cross locale borders, projects shared across teams break unpredictably. Several real RDBMS allow this; we explicitly do not, because the playground's pedagogical contract is "you are learning what databases want", and what they want is invariant. - Booleans stay English-style because they're SQL keywords (`TRUE` / `FALSE`) the learner needs to recognise. The validator (§8.6) does not police value formats inside catalog strings — it can't, since they're free-form prose. But the format-specifier-rejection rule (§8.4) means a catalog author cannot accidentally pad, round, or reformat a value they reference; they can only quote it as-is. ### 9. Migration sweep — `t!()` macro marker A small macro provides the call-site shape: ```rust t!(error.unique.insert.verbose, table = "Customers", column = "id") ``` - Resolves the catalog key at compile time (or at first call, with the lookup result memoised). - Substitutes named arguments per §8.4. - Returns the rendered `String`. - Panics at test time if the key is missing or placeholders don't match (§8.6). The migration sweep proceeds **incrementally**. Each PR migrates a coherent group of literal strings — for instance, "all replay-related messages", "all `[ok]` summaries", "all parse error wordings". Reviewable in isolation. The remaining work is visible: ``` grep -E '"[^"]' src/ | grep -v 't!(' ``` approximates the unmigrated callsites. (Imperfect — strings inside DDL templates, internal log messages, and error-kind classification helpers are not user-visible — but a useful proxy.) H1 itself ships with all friendly-error-layer call sites already migrated. The friendly-error catalog populating `error.*` keys is part of the H1 PR. Other categories (`help.*`, `ok.*`, `client_side.*`, `replay.*`, `parse.*`, ...) are populated as their migration PRs land, each adding the catalog entries it needs and removing the corresponding literals from code. ### 10. Backwards compatibility — anchor phrases Many existing tests assert substring matches on error wording (`message.contains("no such table")`, `message.contains("already exists")`, …). H1 will rewrite the surrounding wording. Rather than break every such test, the catalog commits to a small set of **anchor phrases** that stay stable and that tests can keep asserting against: | Anchor phrase | Where it appears | |---------------|------------------| | `no such table` | object-not-found errors targeting tables | | `no such column` | object-not-found errors targeting columns | | `no such relationship` | object-not-found errors targeting relationships | | `already exists` | name-collision errors (table, column, relationship) | | `already has the value` | UNIQUE-violation headlines (replaces the engine's "UNIQUE constraint failed") | | `cannot be converted` | type-incompatible refusal headers (ADR-0017 §5) | | `discard information` | lossy refusal headers (ADR-0017 §5) | | `referenced by` | FK-cascade refusals (parent-side) | | `[client-side]` | the existing pedagogical-note prefix | The list is small on purpose: anchor phrases are a maintenance liability (a rewording of any of them needs an ADR amendment plus a test sweep). Tests that need to assert on wording should prefer these. Tests that don't need to assert on wording should not. ## Out of scope These are deliberately deferred to keep H1 shippable: 1. **Translation of every other user-visible string** (`help.*`, `ok.*`, `client_side.*`, `replay.*`, `parse.*`, modal labels, mode banners, load-picker entries, …). Required follow-on; tracked as separable PRs per §9. 2. **Advanced-mode SQL error sanitization.** Q1 isn't wired yet. When it is, a future ADR will define how to surface "real" SQL errors with engine artefacts removed, while still respecting ADR-0002. Today's H1 does not handle this path. 3. **Settings persistence for the `messages` command.** Lives in a future settings ADR. 4. **Plural-form rules per locale.** §8.5. 5. **Runtime locale selection.** §8.2. 6. **Locale-aware value formatting.** §8.7 — explicitly rejected, not just deferred. 7. **Constraint-management surface (`add unique`, `add check`).** The catalog covers CHECK violations as a placeholder so the wiring is ready, but C3-track work owns the DDL surface. ## Consequences ### Positive - **Single chokepoint for engine vocabulary.** Once the migration sweep is complete, the engine-vocabulary audit becomes structurally enforceable — there is no path for an engine name to leak that doesn't go through the catalog, and the catalog's validator policies it. - **Operation-tailored, pinpointed errors materially improve pedagogy.** A learner who tries to `INSERT` a duplicate `id` sees the row that already exists, the column the constraint applies to, and a hint pointing to UPDATE. They learn what the constraint *is*, not just that one fired. - **i18n foundation in place.** Future contributors can add translations via PR. The codebase doesn't need to be re-architected for it. - **The `messages` command** lets advanced learners shrink the output once they recognise the patterns. Reduces frustration with users who outgrow the verbose voice. ### Costs - **Catalog file becomes a maintenance dimension.** Every new error path means a new key. The schema validator catches forgotten keys but doesn't write them. - **Migration sweep is non-trivial.** Probably 50–80 user-visible literals across `src/`. Each is mechanical but the total is hours. - **Test churn during the sweep.** Substring assertions on rewritten messages need updating. The anchor-phrase list (§10) limits the blast radius for the most common cases but doesn't eliminate it. - **A small dependency.** `serde_yaml` (or equivalent) for the catalog parse. We already use `serde_yaml` for `project.yaml`, so this is "use what we have", not "add a dependency". ## Implementation notes These are sketch-level — H1 implementation will produce more detailed plans, but they're enough that a session picking this up has direction: ### Order of operations 1. **Skeleton.** Create `src/friendly/` with the module structure: `mod.rs`, `keys.rs` (per-category schema), `format.rs` (the substitution helper), `strings/en-US.yaml` (empty-ish for now). Wire `t!()` macro. 2. **Catalog populate — `error.*` first.** Five categories per §3, each with verbose + short variants per relevant operation. Anchor-phrase compliance per §10. 3. **`FriendlyError` struct + renderer.** Lift `DiagnosticTable` to a public type from `output_render`. Compose the renderer. 4. **Translator.** `translate(error, ctx)` that classifies the `DbError` payload and produces a `FriendlyError`. Re-query logic per §6. 5. **Wire `friendly_message` to the translator.** The two existing helpers (`friendly_change_column_engine_error`, `enrich_fk_message`) are removed; their behaviour is now in the catalog. 6. **`messages (short|verbose)` command.** App-level, parallel to `mode`. Stored on `App::messages_verbosity`; threaded into `TranslateContext`. 7. **Validator unit test.** Per §8.6 — exercises the catalog end-to-end at test time. 8. **Update tests that assert on rewritten wording.** Use the anchor phrases where applicable. Document any test that could not use an anchor phrase as a candidate for future stability work. ### Things that interact subtly - **The engine-vocabulary audit** (this session, `tests/engine_vocabulary_audit.rs`) keeps applying. With H1 the audit becomes a regression net for the catalog: any catalog string containing forbidden vocabulary fails the audit at test time. - **ADR-0017's diagnostic-table renderer** is currently produced from inside `db.rs` (`render_lossy_diagnostic`, `render_incompatible_diagnostic`, …). Those renderers do not go away — they have a specific shape (per-cell "old → new" with reason) that the friendly module's row-pinpointer does not. They migrate to the catalog for their *prose* (headers, footer hints) but keep their structure-rendering logic. - **The `[client-side]` notes from ADR-0017/0018** are not errors — they're success notes. They go through the catalog too (under `client_side.*`) but as part of the migration sweep, not the H1 PR. Their voice is already pedagogical; the catalog migration is mechanical. - **`ParseError`** is its own error type (DSL parse errors surfaced through `humanise()`), not a `DbError`. It gets the same translation treatment under `parse.*`, but as part of the migration sweep.