ADR-0019: friendly error layer (H1) and i18n message catalog

Settles the design we discussed across this session's
follow-up to the engine-vocabulary audit:

- A central `friendly` module owns translation; the existing
  ad-hoc helpers (`friendly_change_column_engine_error`,
  `enrich_fk_message`) absorb into it.
- Initial catalog covers UNIQUE / FK / NOT NULL / CHECK /
  type-mismatch errors with operation-tailored,
  pedagogically-voiced wording in verbose and short variants.
- New `messages (short|verbose)` app-level command lets
  advanced learners shrink the output. In-session state for
  now; persisted later when settings persistence lands.
- Row pinpointing via post-failure re-query, rendered through
  ADR-0017 §7's bordered diagnostic-table renderer.
  `FriendlyError` is a structured payload (headline + hint +
  optional table); `output_render` composes it.
- i18n foundation: hierarchical YAML catalog, embedded via
  `include_str!`, fixed locale (en-US) for now, no external
  files. `{name}` plain substitution; format specifiers
  explicitly rejected so a translator cannot reformat values.
  Value formats stay invariant across all locales (ISO 8601
  dates, `.` decimals, `true`/`false`, `NULL`) — explicitly
  not a translatable concern.
- Migration sweep is required follow-on but separable: a
  `t!()` macro marks call sites and lets per-category PRs
  land incrementally. Anchor-phrase list (§10) limits test
  churn for the most common substring assertions.

Out of scope and explicitly deferred: advanced-mode SQL
error sanitisation (waits on Q1), settings persistence for
the messages command, plural-form rules per locale, runtime
locale selection, locale-aware value formatting (rejected,
not deferred), constraint-management surface (C3 territory).

README index updated.
This commit is contained in:
claude@clouddev1
2026-05-09 08:49:53 +00:00
parent c4ee264636
commit 2a8618c783
2 changed files with 624 additions and 0 deletions
@@ -0,0 +1,623 @@
# ADR-0019: Friendly error layer (H1) and i18n message catalog
## Status
Accepted.
Amends ADR-0002 (database engine — user-facing posture). Pulls
forward H1 from `requirements.md`. Introduces the i18n message
catalog as foundation infrastructure that subsequent work will
gradually take over for every user-visible string in the
codebase.
## Context
ADR-0002 commits to never exposing the underlying engine in
user-visible strings. Today the implementation has three gaps
between commitment and reality:
1. **`DbError::friendly_message()` is a passthrough.** It calls
`Display`, which for the `Sqlite { message }` variant
surfaces the raw text from `rusqlite::Error`. In practice
today the visible result is mostly engine-neutral SQL
wording ("no such table: T", "FOREIGN KEY constraint
failed", "UNIQUE constraint failed: Customers.id") — the
engine-vocabulary audit (this session) confirmed there are
currently zero literal `SQLite` / `STRICT` / `PRAGMA` /
`rusqlite` matches in user-reachable strings. But the bar is
one passthrough away from slipping, the wording is not
pedagogically helpful, and it certainly isn't translatable.
2. **Two ad-hoc helpers already do partial translation.**
`friendly_change_column_engine_error` (db.rs) recognises
constraint / type errors that surface during the
`--dont-convert` change-column path and rewrites them with
abstract "the database refused …" wording.
`enrich_fk_message` appends the relevant FK list to
"FOREIGN KEY constraint failed". Both are call-site-local
and grew up where the immediate need was; they do not
compose.
3. **No i18n foundation.** Every user-visible string in the
codebase is a literal — typically inside `format!` calls,
sometimes inside helper-returned strings. Adding a second
locale today would mean editing every literal. The project
audience is not US-only, and the cost of laying the
foundation now is much smaller than the cost of doing it
after the codebase has tripled in size.
ADR-0017 §7 introduced the bordered diagnostic-table renderer
for refusal payloads (lossy rows, incompatible rows, uniqueness
collisions). That renderer is the right vehicle for any
H1 wording that pinpoints offending rows — symmetric with how
`change column` already explains itself.
The pedagogical posture across `[client-side]` notes
(ADR-0017 §6, ADR-0018 §9) is "the tool did this for you;
here's what raw SQL would have needed". H1 errors should match
that voice: tell the user what happened, why, and what to do
next, rather than echoing a constraint name and stopping there.
## Decision
### 1. The unifying principle
> Every user-visible message goes through a translation
> catalog. The catalog supplies template strings; the
> application renders values into them. Engine error text is
> never surfaced verbatim — it is structurally classified, and
> the catalog produces operation-tailored wording.
The H1 layer is the single chokepoint. Once the migration sweep
(§9) is complete, no user-reachable string in the codebase is a
plain literal. Engine vocabulary cannot leak through a path the
catalog doesn't own.
### 2. Architecture
A new module — `src/friendly/` — owns the translation logic.
The shape:
```rust
pub struct TranslateContext<'a> {
pub operation: Operation,
pub table: Option<&'a str>,
pub column: Option<&'a str>,
pub source_type: Option<Type>,
pub target_type: Option<Type>,
pub verbosity: Verbosity,
pub database: Option<&'a Database>,
}
pub enum Operation {
Insert,
Update,
Delete,
CreateTable,
DropTable,
AddColumn,
DropColumn,
RenameColumn,
ChangeColumnType,
AddRelationship,
DropRelationship,
Query,
Replay,
}
pub fn translate(error: &DbError, ctx: &TranslateContext) -> FriendlyError;
```
`FriendlyError` is a structured payload (§7), not a flat
string. The `output_render` module — which already owns the
diagnostic-table renderer — assembles the payload into final
text.
The translator may use `ctx.database` to re-query for offending
rows (§6). Re-query is best-effort; failure falls back to the
non-pinpointed wording.
### 3. Translator catalog — initial scope
H1's first iteration covers five engine-error categories that
the simple-mode user can plausibly trigger:
- **UNIQUE constraint violations** — including PK collisions on
INSERT and UPDATE, and the auto-fill collisions surfaced by
ADR-0018 §6's pre-flight check.
- **FOREIGN KEY constraint violations** — distinguishing
parent-side (DELETE / UPDATE that orphans children) from
child-side (INSERT / UPDATE with no matching parent).
- **NOT NULL constraint violations** — INSERT and UPDATE.
- **CHECK constraint violations** — placeholder coverage; we
don't emit CHECK constraints today, but the catalog covers
the case so it's ready when constraint-management lands
(track C3).
- **Type mismatch** — values rejected by STRICT typing because
they aren't the expected storage class (mostly the
`--dont-convert` change-column path today).
The existing helpers absorb into this catalog:
- `friendly_change_column_engine_error` → keys under
`error.change_column.*` (one per recognised category).
- `enrich_fk_message` → its FK-listing logic moves into the
catalog renderer for `error.foreign_key.*`.
### 4. Operation-tailored wording — no generic "constraint failed"
A NOT NULL violation on INSERT and on UPDATE share an underlying
cause but read very differently to the learner: on INSERT they
forgot a value, on UPDATE they tried to clear one. The catalog
gives each operation its own template. Operation × kind ×
verbosity is the typical depth.
Where two operations genuinely produce identical wording (rare),
the catalog uses a single key referenced from both call sites
rather than duplicating. This is a maintenance optimisation,
not a default; first-cut keys are operation-specific.
### 5. Pedagogical voice + the `messages (short|verbose)` command
The default verbosity is **verbose**: headline + cause + hint.
Verbose example for a UNIQUE violation on INSERT:
```
A row with this value already exists.
The `id` column on `Customers` is a primary key, which must be
unique. Pick a different value, or update the existing row
with `update Customers ... where id=...`.
```
Short variant:
```
A row with `id=5` already exists in `Customers`.
```
A new app-level command (parallel to `mode`):
```
messages — show current verbosity
messages short — switch to short messages
messages verbose — switch to verbose messages (default)
```
Verbosity is in-session state on `App`; persisted later when
settings persistence lands. The current verbosity is threaded
through `TranslateContext::verbosity`; the catalog has separate
keys for each (`error.unique.insert.verbose`,
`error.unique.insert.short`).
Both verbosities respect the screen-real-estate constraint of
the TUI. Verbose is "two or three short paragraphs of
explanation"; not "a wall of text". Short is "one line, no
hint".
### 6. Row pinpointing via re-query and the diagnostic-table renderer
For UNIQUE / NOT NULL / FK violations the engine doesn't tell us
*which* row. Where it matters, the translator re-queries the
database post-failure to find the offending rows and renders
them through ADR-0017 §7's bordered diagnostic-table renderer.
This is symmetric with how `change column` already handles
lossy / incompatible / collision refusals.
Re-query is best-effort:
- **UNIQUE on INSERT/UPDATE**: re-query the table for rows where
the offending column matches the value the user tried to
insert. Render the resulting row(s) under a "this row already
exists with the value you tried to use" header.
- **FK child-side on INSERT/UPDATE**: list the parent table and
point out that no matching row exists there. (Pinpointing the
*attempted* child value is straightforward; pinpointing
parent rows that *would* satisfy is not — and not what the
user needs.)
- **FK parent-side on DELETE/UPDATE**: re-query child tables for
rows that reference the row being touched. Render under a
"these rows reference the row you tried to delete" header.
Capped at ADR-0017's `DIAGNOSTIC_ROW_CAP` (100).
If the re-query fails or returns no rows (engine state changed
under us, locking, the engine's idea of "which row" differs
from ours), fall back to the constraint-only verbose wording —
no diagnostic table.
The translator owns the re-query. Call sites do **not** do
their own pinpointing — that scatters logic and re-introduces
the inconsistency that the existing two helpers exhibit.
### 7. The `FriendlyError` payload
```rust
pub struct FriendlyError {
/// One-line headline. Always present. Both verbosities
/// share this line; verbose appends to it.
pub headline: String,
/// Pedagogical "what to do next" hint. Populated only in
/// verbose mode; `None` in short mode.
pub hint: Option<String>,
/// Bordered row pinpoint (ADR-0017 §7 renderer). Populated
/// when re-query succeeded for kinds that pinpoint
/// (§6); `None` otherwise. Verbosity-independent: short
/// mode also benefits from row context when available, but
/// the catalog wording is terse.
pub diagnostic_table: Option<DiagnosticTable>,
}
```
`DiagnosticTable` is the existing structured form of ADR-0017's
bordered renderer, lifted into a `pub` type the friendly
module can construct.
The renderer (in `output_render`) composes the three fields:
headline, then (verbose only) a blank line + hint, then
(if present) a blank line + the bordered table.
### 8. The i18n message catalog
#### 8.1 Scope
The catalog covers user-visible **message strings only**. It
does **not** cover value formats — those stay invariant across
all locales (§8.7).
This ADR introduces the catalog and uses it for the
friendly-error layer. Migration of every other user-visible
string in the codebase (help text, `[ok]` summaries,
`[client-side]` notes, parse errors, modal labels, mode
banners, replay messages, the rebuild-confirmation prompt, the
load-picker entries, etc.) is a required follow-on, but
separable — see §9.
#### 8.2 Storage and locale
- The catalog lives in YAML files under `src/friendly/strings/`,
embedded at compile time via `include_str!` and parsed once at
startup with `serde_yaml` (or equivalent).
- One file per locale: `en-US.yaml` is the only file for now.
- **No external file loading at runtime.** Translations land in
the codebase via PR, not by dropping files into a directory.
This protects against accidental corruption and against
bored students discovering they can override strings on a
shared machine.
- Locale selection is fixed for now; runtime selection is
deferred to a follow-on ADR alongside settings persistence.
#### 8.3 Key structure — flexible, per-category schema
YAML hierarchical groups. Each top-level category defines the
dimensions its keys carry; not every category has every
dimension.
The error category typically uses
`{kind} × {operation} × {verbosity}`:
```yaml
error:
unique:
insert:
verbose: "..."
short: "..."
update:
verbose: "..."
short: "..."
foreign_key:
child_side:
insert: { verbose: "...", short: "..." }
update: { verbose: "...", short: "..." }
parent_side:
delete: { verbose: "...", short: "..." }
update: { verbose: "...", short: "..." }
not_null:
insert: { verbose: "...", short: "..." }
update: { verbose: "...", short: "..." }
```
Other categories will use other dimensions:
```yaml
help:
cli_banner: "..."
in_app_command_list: "..."
ok:
summary: "[ok] {verb} {subject}"
client_side:
transformed: { verbose: "...", short: "..." }
auto_fill: { serial: "...", short_id: "..." }
replay:
completed: "[ok] replay {path} — {count} command(s) run"
failed_with_line:
"replay {path} failed at line {line}: {error}"
failed_no_line: "replay {path} failed: {error}"
```
The category schema is documented as a Rust enum / struct in
`friendly::keys` and validated at test time (§8.6). Adding a
new dimension is a code change, not a freeform YAML edit.
#### 8.4 Format strings — `{name}` plain substitution, no specifiers
Strings use named placeholders. Values are interpolated at
substitution time:
```yaml
error.unique.insert.verbose: |
A row with this value already exists.
The `{column}` column on `{table}` is unique-constrained.
Pick a different value or update the existing row.
```
**Format specifiers are explicitly rejected.** A translator
cannot write `{value:08.2}`, `{value:>10}`, etc. The substitution
implementation refuses any `{name:...}` form at parse time with
a clear error. Rationale: the catalog positions values inside
prose; it does not reformat them. Value rendering is the
application's job (§8.7) and must be uniform across all
locales.
Implementation: a hand-rolled substitution helper, ~30 lines,
that walks the template, recognises `{name}` segments, looks
up the value in the supplied keyword arguments, and inserts it.
Unknown placeholders → translation error (caught at test time
when the catalog is exercised).
#### 8.5 Pluralisation — `(s)` suffix for now
The existing codebase uses the
`"{count} row(s) inserted"` style. The catalog continues this.
Real plural-form rules per locale (Fluent / ICU MessageFormat
have zero / one / few / many / two / other tagged forms,
language-specific) are explicitly **deferred**. Adding them is
a meaningful design lift, not a drop-in upgrade, and the H1
scope here is already wide enough.
A future "advanced i18n" ADR would address pluralisation,
gendered forms, ordinals, and date/number formatting hooks.
This ADR documents that the limitation is known and intentional.
#### 8.6 Validation
Two layers of catalog validation:
- **At test time (build-time).** A unit test loads every catalog
file and verifies:
- The YAML parses.
- Every key referenced from the friendly module exists.
- Every placeholder in every string is in the expected set
for that key (declared by the friendly module, per §8.3).
- No catalog key contains a format specifier.
- No catalog string contains forbidden engine vocabulary
(the same check the engine-vocabulary audit applies).
- **At startup.** A small sanity check parses the embedded YAML
and panics loudly if it doesn't (catches a corrupted build
artefact). Cheap; runs once per process.
#### 8.7 Value formats stay invariant
Locale-aware value formatting is **explicitly rejected**. The
app uses a single set of formats across all locales, both for
input and output:
- **Dates**: ISO 8601 `YYYY-MM-DD`.
- **Datetimes**: ISO 8601 `YYYY-MM-DDTHH:MM:SS`.
- **Decimal point**: `.`.
- **Thousands separator**: none.
- **Booleans**: `true` / `false` — treated as part of the
SQL/relational vocabulary the user is learning, not as a
translatable string. (Like `NULL`, which also stays as
`NULL`.)
- **NULL**: `NULL`.
- **Blob**: `<blob N bytes>` placeholder.
Rationale:
- ISO 8601 is the international standard, not a US-centric
format. The most common worry — "I don't want US-style
MM/DD/YYYY imposed on the user" — does not apply.
- Locale-aware value parsing on input is fragile: an English
installation can't read a German `1,5` without ambiguity,
CSVs cross locale borders, projects shared across teams break
unpredictably. Several real RDBMS allow this; we explicitly
do not, because the playground's pedagogical contract is
"you are learning what databases want", and what they want is
invariant.
- Booleans stay English-style because they're SQL keywords
(`TRUE` / `FALSE`) the learner needs to recognise.
The validator (§8.6) does not police value formats inside
catalog strings — it can't, since they're free-form prose. But
the format-specifier-rejection rule (§8.4) means a catalog
author cannot accidentally pad, round, or reformat a value
they reference; they can only quote it as-is.
### 9. Migration sweep — `t!()` macro marker
A small macro provides the call-site shape:
```rust
t!(error.unique.insert.verbose, table = "Customers", column = "id")
```
- Resolves the catalog key at compile time (or at first call,
with the lookup result memoised).
- Substitutes named arguments per §8.4.
- Returns the rendered `String`.
- Panics at test time if the key is missing or placeholders
don't match (§8.6).
The migration sweep proceeds **incrementally**. Each PR
migrates a coherent group of literal strings — for instance,
"all replay-related messages", "all `[ok]` summaries", "all
parse error wordings". Reviewable in isolation.
The remaining work is visible:
```
grep -E '"[^"]' src/ | grep -v 't!('
```
approximates the unmigrated callsites. (Imperfect — strings
inside DDL templates, internal log messages, and error-kind
classification helpers are not user-visible — but a useful
proxy.)
H1 itself ships with all friendly-error-layer call sites
already migrated. The friendly-error catalog populating
`error.*` keys is part of the H1 PR. Other categories
(`help.*`, `ok.*`, `client_side.*`, `replay.*`, `parse.*`, ...)
are populated as their migration PRs land, each adding the
catalog entries it needs and removing the corresponding
literals from code.
### 10. Backwards compatibility — anchor phrases
Many existing tests assert substring matches on error wording
(`message.contains("no such table")`,
`message.contains("already exists")`, …). H1 will rewrite the
surrounding wording. Rather than break every such test, the
catalog commits to a small set of **anchor phrases** that stay
stable and that tests can keep asserting against:
| Anchor phrase | Where it appears |
|---------------|------------------|
| `no such table` | object-not-found errors targeting tables |
| `no such column` | object-not-found errors targeting columns |
| `no such relationship` | object-not-found errors targeting relationships |
| `already exists` | name-collision errors (table, column, relationship) |
| `already has the value` | UNIQUE-violation headlines (replaces the engine's "UNIQUE constraint failed") |
| `cannot be converted` | type-incompatible refusal headers (ADR-0017 §5) |
| `discard information` | lossy refusal headers (ADR-0017 §5) |
| `referenced by` | FK-cascade refusals (parent-side) |
| `[client-side]` | the existing pedagogical-note prefix |
The list is small on purpose: anchor phrases are a maintenance
liability (a rewording of any of them needs an ADR amendment
plus a test sweep). Tests that need to assert on wording
should prefer these. Tests that don't need to assert on
wording should not.
## Out of scope
These are deliberately deferred to keep H1 shippable:
1. **Translation of every other user-visible string**
(`help.*`, `ok.*`, `client_side.*`, `replay.*`, `parse.*`,
modal labels, mode banners, load-picker entries, …). Required
follow-on; tracked as separable PRs per §9.
2. **Advanced-mode SQL error sanitization.** Q1 isn't wired yet.
When it is, a future ADR will define how to surface "real" SQL
errors with engine artefacts removed, while still respecting
ADR-0002. Today's H1 does not handle this path.
3. **Settings persistence for the `messages` command.** Lives in
a future settings ADR.
4. **Plural-form rules per locale.** §8.5.
5. **Runtime locale selection.** §8.2.
6. **Locale-aware value formatting.** §8.7 — explicitly rejected,
not just deferred.
7. **Constraint-management surface (`add unique`, `add check`).**
The catalog covers CHECK violations as a placeholder so the
wiring is ready, but C3-track work owns the DDL surface.
## Consequences
### Positive
- **Single chokepoint for engine vocabulary.** Once the
migration sweep is complete, the engine-vocabulary audit
becomes structurally enforceable — there is no path for an
engine name to leak that doesn't go through the catalog, and
the catalog's validator policies it.
- **Operation-tailored, pinpointed errors materially improve
pedagogy.** A learner who tries to `INSERT` a duplicate `id`
sees the row that already exists, the column the constraint
applies to, and a hint pointing to UPDATE. They learn what
the constraint *is*, not just that one fired.
- **i18n foundation in place.** Future contributors can add
translations via PR. The codebase doesn't need to be
re-architected for it.
- **The `messages` command** lets advanced learners shrink the
output once they recognise the patterns. Reduces frustration
with users who outgrow the verbose voice.
### Costs
- **Catalog file becomes a maintenance dimension.** Every new
error path means a new key. The schema validator catches
forgotten keys but doesn't write them.
- **Migration sweep is non-trivial.** Probably 5080 user-visible
literals across `src/`. Each is mechanical but the total is
hours.
- **Test churn during the sweep.** Substring assertions on
rewritten messages need updating. The anchor-phrase list
(§10) limits the blast radius for the most common cases but
doesn't eliminate it.
- **A small dependency.** `serde_yaml` (or equivalent) for the
catalog parse. We already use `serde_yaml` for `project.yaml`,
so this is "use what we have", not "add a dependency".
## Implementation notes
These are sketch-level — H1 implementation will produce more
detailed plans, but they're enough that a session picking this
up has direction:
### Order of operations
1. **Skeleton.** Create `src/friendly/` with the module
structure: `mod.rs`, `keys.rs` (per-category schema), `format.rs`
(the substitution helper), `strings/en-US.yaml` (empty-ish
for now). Wire `t!()` macro.
2. **Catalog populate — `error.*` first.** Five categories per
§3, each with verbose + short variants per relevant
operation. Anchor-phrase compliance per §10.
3. **`FriendlyError` struct + renderer.** Lift `DiagnosticTable`
to a public type from `output_render`. Compose the renderer.
4. **Translator.** `translate(error, ctx)` that classifies the
`DbError` payload and produces a `FriendlyError`. Re-query
logic per §6.
5. **Wire `friendly_message` to the translator.** The two
existing helpers (`friendly_change_column_engine_error`,
`enrich_fk_message`) are removed; their behaviour is now in
the catalog.
6. **`messages (short|verbose)` command.** App-level, parallel
to `mode`. Stored on `App::messages_verbosity`; threaded
into `TranslateContext`.
7. **Validator unit test.** Per §8.6 — exercises the catalog
end-to-end at test time.
8. **Update tests that assert on rewritten wording.** Use the
anchor phrases where applicable. Document any test that
could not use an anchor phrase as a candidate for
future stability work.
### Things that interact subtly
- **The engine-vocabulary audit** (this session,
`tests/engine_vocabulary_audit.rs`) keeps applying. With H1
the audit becomes a regression net for the catalog: any
catalog string containing forbidden vocabulary fails the
audit at test time.
- **ADR-0017's diagnostic-table renderer** is currently
produced from inside `db.rs` (`render_lossy_diagnostic`,
`render_incompatible_diagnostic`, …). Those renderers do not
go away — they have a specific shape (per-cell "old → new"
with reason) that the friendly module's row-pinpointer does
not. They migrate to the catalog for their *prose*
(headers, footer hints) but keep their structure-rendering
logic.
- **The `[client-side]` notes from ADR-0017/0018** are not
errors — they're success notes. They go through the catalog
too (under `client_side.*`) but as part of the migration
sweep, not the H1 PR. Their voice is already pedagogical;
the catalog migration is mechanical.
- **`ParseError`** is its own error type (DSL parse errors
surfaced through `humanise()`), not a `DbError`. It gets the
same translation treatment under `parse.*`, but as part of
the migration sweep.
+1
View File
@@ -24,3 +24,4 @@ This directory contains the project's ADRs, recorded per
- [ADR-0016 — Pretty table rendering for data and structure views](0016-pretty-table-rendering.md)
- [ADR-0017 — Column type-change compatibility](0017-column-type-change-compatibility.md)
- [ADR-0018 — Auto-fill contracts for `serial` and `shortid` columns](0018-auto-fill-contracts-for-serial-and-shortid.md)
- [ADR-0019 — Friendly error layer (H1) and i18n message catalog](0019-friendly-error-layer-and-i18n.md)