# ADR-0019: Friendly error layer (H1) and i18n message catalog ## Status Accepted. Amends ADR-0002 (database engine — user-facing posture). Pulls forward H1 from `requirements.md`. Introduces the i18n message catalog as foundation infrastructure that subsequent work will gradually take over for every user-visible string in the codebase. ## Context ADR-0002 commits to never exposing the underlying engine in user-visible strings. Today the implementation has three gaps between commitment and reality: 1. **`DbError::friendly_message()` is a passthrough.** It calls `Display`, which for the `Sqlite { message }` variant surfaces the raw text from `rusqlite::Error`. In practice today the visible result is mostly engine-neutral SQL wording ("no such table: T", "FOREIGN KEY constraint failed", "UNIQUE constraint failed: Customers.id") — the engine-vocabulary audit (this session) confirmed there are currently zero literal `SQLite` / `STRICT` / `PRAGMA` / `rusqlite` matches in user-reachable strings. But the bar is one passthrough away from slipping, the wording is not pedagogically helpful, and it certainly isn't translatable. 2. **Two ad-hoc helpers already do partial translation.** `friendly_change_column_engine_error` (db.rs) recognises constraint / type errors that surface during the `--dont-convert` change-column path and rewrites them with abstract "the database refused …" wording. `enrich_fk_message` appends the relevant FK list to "FOREIGN KEY constraint failed". Both are call-site-local and grew up where the immediate need was; they do not compose. 3. **No i18n foundation.** Every user-visible string in the codebase is a literal — typically inside `format!` calls, sometimes inside helper-returned strings. Adding a second locale today would mean editing every literal. The project audience is not US-only, and the cost of laying the foundation now is much smaller than the cost of doing it after the codebase has tripled in size. ADR-0017 §7 introduced the bordered diagnostic-table renderer for refusal payloads (lossy rows, incompatible rows, uniqueness collisions). That renderer is the right vehicle for any H1 wording that pinpoints offending rows — symmetric with how `change column` already explains itself. The pedagogical posture across `[client-side]` notes (ADR-0017 §6, ADR-0018 §9) is "the tool did this for you; here's what raw SQL would have needed". H1 errors should match that voice: tell the user what happened, why, and what to do next, rather than echoing a constraint name and stopping there. ## Decision ### 1. The unifying principle > Every user-visible message goes through a translation > catalog. The catalog supplies template strings; the > application renders values into them. Engine error text is > never surfaced verbatim — it is structurally classified, and > the catalog produces operation-tailored wording. The H1 layer is the single chokepoint. Once the migration sweep (§9) is complete, no user-reachable string in the codebase is a plain literal. Engine vocabulary cannot leak through a path the catalog doesn't own. ### 2. Architecture A new module — `src/friendly/` — owns the translation logic. The shape: ```rust pub struct TranslateContext<'a> { pub operation: Operation, pub table: Option<&'a str>, pub column: Option<&'a str>, pub source_type: Option, pub target_type: Option, pub verbosity: Verbosity, pub database: Option<&'a Database>, } pub enum Operation { Insert, Update, Delete, CreateTable, DropTable, AddColumn, DropColumn, RenameColumn, ChangeColumnType, AddRelationship, DropRelationship, Query, Replay, } pub fn translate(error: &DbError, ctx: &TranslateContext) -> FriendlyError; ``` `FriendlyError` is a structured payload (§7), not a flat string. The `output_render` module — which already owns the diagnostic-table renderer — assembles the payload into final text. The translator may use `ctx.database` to re-query for offending rows (§6). Re-query is best-effort; failure falls back to the non-pinpointed wording. ### 3. Translator catalog — initial scope H1's first iteration covers five engine-error categories that the simple-mode user can plausibly trigger: - **UNIQUE constraint violations** — including PK collisions on INSERT and UPDATE, and the auto-fill collisions surfaced by ADR-0018 §6's pre-flight check. - **FOREIGN KEY constraint violations** — distinguishing parent-side (DELETE / UPDATE that orphans children) from child-side (INSERT / UPDATE with no matching parent). - **NOT NULL constraint violations** — INSERT and UPDATE. - **CHECK constraint violations** — placeholder coverage; we don't emit CHECK constraints today, but the catalog covers the case so it's ready when constraint-management lands (track C3). - **Type mismatch** — values rejected by STRICT typing because they aren't the expected storage class (mostly the `--dont-convert` change-column path today). The existing helpers absorb into this catalog: - `friendly_change_column_engine_error` → keys under `error.change_column.*` (one per recognised category). - `enrich_fk_message` → its FK-listing logic moves into the catalog renderer for `error.foreign_key.*`. ### 4. Operation-tailored wording — no generic "constraint failed" A NOT NULL violation on INSERT and on UPDATE share an underlying cause but read very differently to the learner: on INSERT they forgot a value, on UPDATE they tried to clear one. The catalog gives each operation its own template. Operation × kind × verbosity is the typical depth. Where two operations genuinely produce identical wording (rare), the catalog uses a single key referenced from both call sites rather than duplicating. This is a maintenance optimisation, not a default; first-cut keys are operation-specific. ### 5. Pedagogical voice + the `messages (short|verbose)` command The default verbosity is **verbose**: headline + cause + hint. Verbose example for a UNIQUE violation on INSERT: ``` A row with this value already exists. The `id` column on `Customers` is a primary key, which must be unique. Pick a different value, or update the existing row with `update Customers ... where id=...`. ``` Short variant: ``` A row with `id=5` already exists in `Customers`. ``` A new app-level command (parallel to `mode`): ``` messages — show current verbosity messages short — switch to short messages messages verbose — switch to verbose messages (default) ``` Verbosity is in-session state on `App`; persisted later when settings persistence lands. The current verbosity is threaded through `TranslateContext::verbosity`; the catalog has separate keys for each (`error.unique.insert.verbose`, `error.unique.insert.short`). Both verbosities respect the screen-real-estate constraint of the TUI. Verbose is "two or three short paragraphs of explanation"; not "a wall of text". Short is "one line, no hint". ### 6. Row pinpointing via re-query and the diagnostic-table renderer For UNIQUE / NOT NULL / FK violations the engine doesn't tell us *which* row. Where it matters, the translator re-queries the database post-failure to find the offending rows and renders them through ADR-0017 §7's bordered diagnostic-table renderer. This is symmetric with how `change column` already handles lossy / incompatible / collision refusals. Re-query is best-effort: - **UNIQUE on INSERT/UPDATE**: re-query the table for rows where the offending column matches the value the user tried to insert. Render the resulting row(s) under a "this row already exists with the value you tried to use" header. - **FK child-side on INSERT/UPDATE**: list the parent table and point out that no matching row exists there. (Pinpointing the *attempted* child value is straightforward; pinpointing parent rows that *would* satisfy is not — and not what the user needs.) - **FK parent-side on DELETE/UPDATE**: re-query child tables for rows that reference the row being touched. Render under a "these rows reference the row you tried to delete" header. Capped at ADR-0017's `DIAGNOSTIC_ROW_CAP` (100). If the re-query fails or returns no rows (engine state changed under us, locking, the engine's idea of "which row" differs from ours), fall back to the constraint-only verbose wording — no diagnostic table. The translator owns the re-query. Call sites do **not** do their own pinpointing — that scatters logic and re-introduces the inconsistency that the existing two helpers exhibit. ### 7. The `FriendlyError` payload ```rust pub struct FriendlyError { /// One-line headline. Always present. Both verbosities /// share this line; verbose appends to it. pub headline: String, /// Pedagogical "what to do next" hint. Populated only in /// verbose mode; `None` in short mode. pub hint: Option, /// Bordered row pinpoint (ADR-0017 §7 renderer). Populated /// when re-query succeeded for kinds that pinpoint /// (§6); `None` otherwise. Verbosity-independent: short /// mode also benefits from row context when available, but /// the catalog wording is terse. pub diagnostic_table: Option, } ``` `DiagnosticTable` is the existing structured form of ADR-0017's bordered renderer, lifted into a `pub` type the friendly module can construct. The renderer (in `output_render`) composes the three fields: headline, then (verbose only) a blank line + hint, then (if present) a blank line + the bordered table. ### 8. The i18n message catalog #### 8.1 Scope The catalog covers user-visible **message strings only**. It does **not** cover value formats — those stay invariant across all locales (§8.7). This ADR introduces the catalog and uses it for the friendly-error layer. Migration of every other user-visible string in the codebase (help text, `[ok]` summaries, `[client-side]` notes, parse errors, modal labels, mode banners, replay messages, the rebuild-confirmation prompt, the load-picker entries, etc.) is a required follow-on, but separable — see §9. #### 8.2 Storage and locale - The catalog lives in YAML files under `src/friendly/strings/`, embedded at compile time via `include_str!` and parsed once at startup with `serde_yaml` (or equivalent). - One file per locale: `en-US.yaml` is the only file for now. - **No external file loading at runtime.** Translations land in the codebase via PR, not by dropping files into a directory. This protects against accidental corruption and against bored students discovering they can override strings on a shared machine. - Locale selection is fixed for now; runtime selection is deferred to a follow-on ADR alongside settings persistence. #### 8.3 Key structure — flexible, per-category schema YAML hierarchical groups. Each top-level category defines the dimensions its keys carry; not every category has every dimension. The error category typically uses `{kind} × {operation} × {verbosity}`: ```yaml error: unique: insert: verbose: "..." short: "..." update: verbose: "..." short: "..." foreign_key: child_side: insert: { verbose: "...", short: "..." } update: { verbose: "...", short: "..." } parent_side: delete: { verbose: "...", short: "..." } update: { verbose: "...", short: "..." } not_null: insert: { verbose: "...", short: "..." } update: { verbose: "...", short: "..." } ``` Other categories will use other dimensions: ```yaml help: cli_banner: "..." in_app_command_list: "..." ok: summary: "[ok] {verb} {subject}" client_side: transformed: { verbose: "...", short: "..." } auto_fill: { serial: "...", short_id: "..." } replay: completed: "[ok] replay {path} — {count} command(s) run" failed_with_line: "replay {path} failed at line {line}: {error}" failed_no_line: "replay {path} failed: {error}" ``` The category schema is documented as a Rust enum / struct in `friendly::keys` and validated at test time (§8.6). Adding a new dimension is a code change, not a freeform YAML edit. #### 8.4 Format strings — `{name}` plain substitution, no specifiers Strings use named placeholders. Values are interpolated at substitution time: ```yaml error.unique.insert.verbose: | A row with this value already exists. The `{column}` column on `{table}` is unique-constrained. Pick a different value or update the existing row. ``` **Format specifiers are explicitly rejected.** A translator cannot write `{value:08.2}`, `{value:>10}`, etc. The substitution implementation refuses any `{name:...}` form at parse time with a clear error. Rationale: the catalog positions values inside prose; it does not reformat them. Value rendering is the application's job (§8.7) and must be uniform across all locales. Implementation: a hand-rolled substitution helper, ~30 lines, that walks the template, recognises `{name}` segments, looks up the value in the supplied keyword arguments, and inserts it. Unknown placeholders → translation error (caught at test time when the catalog is exercised). #### 8.5 Pluralisation — translator's choice, not a system feature The existing codebase uses the `"{count} row(s) inserted"` style. The catalog continues this. Pluralisation is **a translator concern, not a system concern**. Placeholders only carry values (§8.4); the wording around them is fully under the translator's control. Languages that have an equivalent of the `(s)` shorthand can use it; languages that don't can rephrase to avoid the question entirely (`"applied to {count} row(s)"` → `"applied to {count} rows in total"` works for every count). For a teaching tool with a narrow output surface this is sufficient. Real plural-form rules per locale (Fluent / ICU MessageFormat, with zero / one / few / many / two / other tagged forms) are **explicitly not a goal of this project**. They would let a template supply separate singular and plural strings indexed by the same count, which is technically nicer but not materially better for the kind of output this app produces. We are not going to add them. #### 8.6 Validation Two layers of catalog validation: - **At test time (build-time).** A unit test loads every catalog file and verifies: - The YAML parses. - Every key referenced from the friendly module exists. - Every placeholder in every string is in the expected set for that key (declared by the friendly module, per §8.3). - No catalog key contains a format specifier. - No catalog string contains forbidden engine vocabulary (the same check the engine-vocabulary audit applies). - **At startup.** A small sanity check parses the embedded YAML and panics loudly if it doesn't (catches a corrupted build artefact). Cheap; runs once per process. #### 8.7 Value formats stay invariant Locale-aware value formatting is **explicitly rejected**. The app uses a single set of formats across all locales, both for input and output: - **Dates**: ISO 8601 `YYYY-MM-DD`. - **Datetimes**: ISO 8601 `YYYY-MM-DDTHH:MM:SS`. - **Decimal point**: `.`. - **Thousands separator**: none. - **Booleans**: `true` / `false` — treated as part of the SQL/relational vocabulary the user is learning, not as a translatable string. (Like `NULL`, which also stays as `NULL`.) - **NULL**: `NULL`. - **Blob**: `` placeholder. Rationale: - ISO 8601 is the international standard, not a US-centric format. The most common worry — "I don't want US-style MM/DD/YYYY imposed on the user" — does not apply. - Locale-aware value parsing on input is fragile: an English installation can't read a German `1,5` without ambiguity, CSVs cross locale borders, projects shared across teams break unpredictably. Several real RDBMS allow this; we explicitly do not, because the playground's pedagogical contract is "you are learning what databases want", and what they want is invariant. - Booleans stay English-style because they're SQL keywords (`TRUE` / `FALSE`) the learner needs to recognise. The validator (§8.6) does not police value formats inside catalog strings — it can't, since they're free-form prose. But the format-specifier-rejection rule (§8.4) means a catalog author cannot accidentally pad, round, or reformat a value they reference; they can only quote it as-is. ### 9. Migration sweep — `t!()` macro marker A small macro provides the call-site shape: ```rust t!(error.unique.insert.verbose, table = "Customers", column = "id") ``` - Resolves the catalog key at compile time (or at first call, with the lookup result memoised). - Substitutes named arguments per §8.4. - Returns the rendered `String`. - Panics at test time if the key is missing or placeholders don't match (§8.6). The migration sweep proceeds **incrementally**. Each PR migrates a coherent group of literal strings — for instance, "all replay-related messages", "all `[ok]` summaries", "all parse error wordings". Reviewable in isolation. The remaining work is visible: ``` grep -E '"[^"]' src/ | grep -v 't!(' ``` approximates the unmigrated callsites. (Imperfect — strings inside DDL templates, internal log messages, and error-kind classification helpers are not user-visible — but a useful proxy.) H1 itself ships with all friendly-error-layer call sites already migrated. The friendly-error catalog populating `error.*` keys is part of the H1 PR. Other categories (`help.*`, `ok.*`, `client_side.*`, `replay.*`, `parse.*`, ...) are populated as their migration PRs land, each adding the catalog entries it needs and removing the corresponding literals from code. ### 10. Backwards compatibility — anchor phrases Many existing tests assert substring matches on error wording (`message.contains("no such table")`, `message.contains("already exists")`, …). H1 will rewrite the surrounding wording. Rather than break every such test, the catalog commits to a small set of **anchor phrases** that stay stable and that tests can keep asserting against: | Anchor phrase | Where it appears | |---------------|------------------| | `no such table` | object-not-found errors targeting tables | | `no such column` | object-not-found errors targeting columns | | `no such relationship` | object-not-found errors targeting relationships | | `already exists` | name-collision errors (table, column, relationship) | | `already has the value` | UNIQUE-violation headlines (replaces the engine's "UNIQUE constraint failed") | | `cannot be converted` | type-incompatible refusal headers (ADR-0017 §5) | | `discard information` | lossy refusal headers (ADR-0017 §5) | | `referenced by` | FK-cascade refusals (parent-side) | | `[client-side]` | the existing pedagogical-note prefix | The list is small on purpose: anchor phrases are a maintenance liability (a rewording of any of them needs an ADR amendment plus a test sweep). Tests that need to assert on wording should prefer these. Tests that don't need to assert on wording should not. ## Out of scope These are deliberately deferred to keep H1 shippable: 1. **Translation of every other user-visible string** (`help.*`, `ok.*`, `client_side.*`, `replay.*`, `parse.*`, modal labels, mode banners, load-picker entries, …). Required follow-on; tracked as separable PRs per §9. 2. **Advanced-mode SQL error sanitization.** Q1 isn't wired yet. When it is, a future ADR will define how to surface "real" SQL errors with engine artefacts removed, while still respecting ADR-0002. Today's H1 does not handle this path. 3. **Settings persistence for the `messages` command.** Lives in a future settings ADR. 4. **Plural-form rules per locale.** §8.5 — explicitly not a goal, not just deferred. Translations handle pluralisation in their wording (`(s)` shorthand or rephrased). 5. **Runtime locale selection.** §8.2. 6. **Locale-aware value formatting.** §8.7 — explicitly rejected, not just deferred. 7. **Constraint-management surface (`add unique`, `add check`).** The catalog covers CHECK violations as a placeholder so the wiring is ready, but C3-track work owns the DDL surface. ## Consequences ### Positive - **Single chokepoint for engine vocabulary.** Once the migration sweep is complete, the engine-vocabulary audit becomes structurally enforceable — there is no path for an engine name to leak that doesn't go through the catalog, and the catalog's validator policies it. - **Operation-tailored, pinpointed errors materially improve pedagogy.** A learner who tries to `INSERT` a duplicate `id` sees the row that already exists, the column the constraint applies to, and a hint pointing to UPDATE. They learn what the constraint *is*, not just that one fired. - **i18n foundation in place.** Future contributors can add translations via PR. The codebase doesn't need to be re-architected for it. - **The `messages` command** lets advanced learners shrink the output once they recognise the patterns. Reduces frustration with users who outgrow the verbose voice. ### Costs - **Catalog file becomes a maintenance dimension.** Every new error path means a new key. The schema validator catches forgotten keys but doesn't write them. - **Migration sweep is non-trivial.** Probably 50–80 user-visible literals across `src/`. Each is mechanical but the total is hours. - **Test churn during the sweep.** Substring assertions on rewritten messages need updating. The anchor-phrase list (§10) limits the blast radius for the most common cases but doesn't eliminate it. - **A small dependency.** `serde_yaml` (or equivalent) for the catalog parse. We already use `serde_yaml` for `project.yaml`, so this is "use what we have", not "add a dependency". ## Implementation notes These are sketch-level — H1 implementation will produce more detailed plans, but they're enough that a session picking this up has direction: ### Order of operations 1. **Skeleton.** Create `src/friendly/` with the module structure: `mod.rs`, `keys.rs` (per-category schema), `format.rs` (the substitution helper), `strings/en-US.yaml` (empty-ish for now). Wire `t!()` macro. 2. **Catalog populate — `error.*` first.** Five categories per §3, each with verbose + short variants per relevant operation. Anchor-phrase compliance per §10. 3. **`FriendlyError` struct + renderer.** Lift `DiagnosticTable` to a public type from `output_render`. Compose the renderer. 4. **Translator.** `translate(error, ctx)` that classifies the `DbError` payload and produces a `FriendlyError`. Re-query logic per §6. 5. **Wire `friendly_message` to the translator.** The two existing helpers (`friendly_change_column_engine_error`, `enrich_fk_message`) are removed; their behaviour is now in the catalog. 6. **`messages (short|verbose)` command.** App-level, parallel to `mode`. Stored on `App::messages_verbosity`; threaded into `TranslateContext`. 7. **Validator unit test.** Per §8.6 — exercises the catalog end-to-end at test time. 8. **Update tests that assert on rewritten wording.** Use the anchor phrases where applicable. Document any test that could not use an anchor phrase as a candidate for future stability work. ### Things that interact subtly - **The engine-vocabulary audit** (this session, `tests/engine_vocabulary_audit.rs`) keeps applying. With H1 the audit becomes a regression net for the catalog: any catalog string containing forbidden vocabulary fails the audit at test time. - **ADR-0017's diagnostic-table renderer** is currently produced from inside `db.rs` (`render_lossy_diagnostic`, `render_incompatible_diagnostic`, …). Those renderers do not go away — they have a specific shape (per-cell "old → new" with reason) that the friendly module's row-pinpointer does not. They migrate to the catalog for their *prose* (headers, footer hints) but keep their structure-rendering logic. - **The `[client-side]` notes from ADR-0017/0018** are not errors — they're success notes. They go through the catalog too (under `client_side.*`) but as part of the migration sweep, not the H1 PR. Their voice is already pedagogical; the catalog migration is mechanical. - **`ParseError`** is its own error type (DSL parse errors surfaced through `humanise()`), not a `DbError`. It gets the same translation treatment under `parse.*`, but as part of the migration sweep.