ADR-0017 + ADR-0002 amendment: type-change compatibility + engine-agnostic posture

Specifies the curated per-cell classification (clean /
lossy / incompatible) for column type changes, the static
transformer matrix (numeric chains, text↔structured types,
always-clean stringifications), and the PK / shortid /
uniqueness-bearing handling. Replaces the B2/C2
placeholder of "rely on engine STRICT and surface its
errors" with a learner-friendly model that:

* refuses incompatibles up-front,
* refuses lossy conversions by default with a re-run-with-
  --force-conversion hint,
* refines the PK refusal: an inbound-FK PK is only refused
  when the new type would change the FK target type
  (so `serial → int` and `shortid → text` on FK-referenced
  PKs are allowed; `int → text` etc. still refuse),
* adds a post-transformation uniqueness check for PK and
  shortid columns,
* uses the pretty-table renderer (ADR-0016) for all
  diagnostic row lists,
* emits a `[client-side] …` note in the success summary
  whenever the transformer rewrote any cell.

`--force-conversion` accepts loss; `--dont-convert` skips
the client-side layer entirely; mutually exclusive.

Forward-look: a future iteration may add resolution flags
(`--default 0`, `--on-incompatible '<value>'`).

Also amends ADR-0002 with a new "User-facing posture"
section cementing that the database engine choice is an
implementation detail and is never named in user-visible
strings. Adds a corresponding bullet to CLAUDE.md's
working-style rules so every session picks it up.

Implementation lands as a follow-up.
This commit is contained in:
claude@clouddev1
2026-05-08 10:53:20 +00:00
parent 7b97786ab7
commit c3e5f9014f
4 changed files with 510 additions and 0 deletions
+8
View File
@@ -156,6 +156,14 @@ Key invariants in the code:
- **Pedagogy wins ties.** When a design choice trades clarity
for raw capability, prefer clarity. Real RDBMS power-user
features exist; this app is not the place to teach them.
- **No engine name in user-facing strings.** The choice of
database engine is an implementation detail per ADR-0002
(User-facing posture). Error messages, success notes,
help text, and any other user-visible string refer to
"the database" or "the engine" in the abstract — never
the specific product (SQLite, STRICT, rusqlite, PRAGMA).
ADR-internal prose and code comments may name it where
technically necessary for precision.
- **Confirm commits.** Per the user's global rules, every
`git commit` is preceded by an explicit message proposal
and user approval. No AI attribution in commit messages.
+24
View File
@@ -39,6 +39,30 @@ Use **SQLite** via the `rusqlite` crate. All tables are created as
Simplified user-facing column types (see ADR-0005) are mapped to
the underlying SQLite STRICT types at parse time.
## User-facing posture
The choice of SQLite is an implementation detail. The
playground's user-facing surface — error messages, success
notes, help text, and any other string the user sees —
never names the underlying engine. "The database" or "the
engine" in the abstract is the canonical phrasing; the
specific product (SQLite, STRICT, rusqlite, PRAGMA) is
ADR-internal and code-comment vocabulary only.
Rationale: students should leave with knowledge that
generalises. Naming the engine in user-visible strings
ties their mental model to a specific product when the
lessons are about relational concepts. The friendly-error
layer (H1) wraps engine error text before surfacing;
advanced mode's SQL surface doesn't expose the engine name
either — the user writes SQL because SQL is the language,
not because SQLite is the engine.
This commitment is referenced from later ADRs (notably
ADR-0017 §6 on the client-side conversion note); when in
doubt about a user-facing string's wording, this section
governs.
## Consequences
- Tight, low-friction packaging — `rusqlite` bundles SQLite.
@@ -0,0 +1,477 @@
# ADR-0017: Column type-change compatibility
## Status
Accepted
## Context
ADR-0013 introduced the rebuild-table primitive that lets us
change a column's type by reading-out, recreating-with-new-shape,
and writing-back. ADR-0014 specified the value model (per-type
literal validators in `dsl/value.rs`). The B2/C2 work that
landed `change column [in] [table] <T>: <col> (<newtype>)`
plumbed those together but left a real spec gap: what
conversions are allowed, what happens to data that doesn't
cleanly fit the new type, and how the user is told.
The current behaviour ("rely on SQLite STRICT to reject
incompatible cells; surface whatever error it produces") is
a placeholder — pedagogically poor (raw STRICT errors are
not learner-friendly) and silently permissive in cases
where SQLite would coerce values in ways the user might not
intend.
This ADR specifies a curated compatibility model that
combines a static "is this conversion attempted at all"
matrix with a per-cell runtime classification, plus
two opt-in flags that give the user control over the trade
between safety and force.
## Decision
### 1. Per-cell outcome classification
Every cell of the column being changed is classified into
exactly one of three outcomes during a **dry-run pass that
runs before any SQL writes**:
- **Clean** — a transformer produces a value the new type
accepts without information loss. The rebuild proceeds
and stores the transformed value.
- **Lossy** — a transformer produces a valid value, but
some property of the original cell (precision, fractional
part, time component, …) is discarded. By default these
rows refuse the operation; the user opts in to the loss
with `--force-conversion`.
- **Incompatible** — no transformer for this pair can
produce a valid value for this cell. The operation is
refused; no flag overrides this. The user must
pre-process the data (or wait for a future syntax
extension; see §5 forward-look).
The classifications are properties of *(source type, target
type, cell value)*, not just of the type pair. The same
type pair (`real``int`) yields **clean** for cells
storing whole numbers and **lossy** for cells with
fractional parts.
### 2. Default decision tree
A `change column …` invocation without flags:
1. **Static refusal check.** If the type pair has no
transformer at all (see §3 — e.g. `bool → date`,
anything ↔ `blob`, anything → `serial`), refuse
immediately with a friendly explanation of the
incompatibility class.
2. **Per-cell dry run.** Apply the transformer to each
cell; classify into clean / lossy / incompatible.
3. **Refuse on incompatibles.** If any cell classifies
incompatible, refuse and present a list (capped at
100 rows; tail rendered as "… and N more"). Each
listed row identifies the row index and the offending
value plus a short reason ("not a valid int", "not
'0' or '1'", etc.). The error does NOT mention
`--force-conversion`: that flag does not help with
incompatibles.
4. **Refuse on lossy (default).** If all cells are clean
or lossy and at least one is lossy, refuse with a
capped list of the lossy rows (showing the would-be
transformation: `row 5: 3.14 → 3 (truncated)`),
followed by the message *"if you want to execute this
conversion in spite of the problems, re-run with
`--force-conversion`."*
5. **All clean.** Proceed with the rebuild. If the
transformer was non-identity (i.e. any cell required
actual transformation rather than passing through
unchanged), emit the **client-side note** (§6)
alongside the success summary.
### 3. Static transformer matrix
Pair-wise: a `Some(transformer)` means "attempt with a
per-cell dry-run." `None` means "static refusal" (§2 step
1).
The transformers below cover the **numeric and text
universe**. Cells of `serial` columns appear only on the
source side (target `serial` is statically refused);
relationship-involved columns and PK columns are also
statically refused (carried over from B2/C2). Anything ↔
`blob` is deferred for v1.
#### Always-clean transformers (no per-cell loss possible)
| Source | Target | Notes |
|------------------|-----------|-------------------------------------------------|
| `int` / `serial` | `real` | widening; precision caveat for ¦v¦ > 2⁵³ noted in docs but not policed |
| `int` / `serial` | `decimal` | exact decimal representation |
| `int` / `serial` | `text` | stringify |
| `bool` | `int` | 0/1 |
| `bool` | `real` | 0.0/1.0 |
| `bool` | `decimal` | "0"/"1" |
| `bool` | `text` | "true"/"false" — matches the DSL boolean grammar (§5 of ADR-0014), not SQLite's native integer stringification |
| `decimal` | `text` | already text-backed under STRICT |
| `date` | `text` | same |
| `datetime` | `text` | same |
| `shortid` | `text` | same |
| `real` | `text` | shortest-round-trip decimal form |
#### Per-cell-classified transformers (clean OR lossy OR incompatible per cell)
| Source | Target | Per-cell classification |
|------------|-----------|---------------------------------------------------------------------------------------------------------------|
| `real` | `int` | clean when the value is exactly representable as an integer (e.g. `3.0`); lossy when there's a fractional part (e.g. `3.14 → 3`); never incompatible |
| `real` | `decimal` | clean when the f64 round-trips through the decimal grammar; lossy otherwise (precision artifacts) |
| `real` | `bool` | clean when value is exactly `0.0` or `1.0`; incompatible otherwise |
| `decimal` | `int` | clean when integer-valued; lossy if fractional |
| `decimal` | `real` | clean when it fits f64 exactly; lossy on precision loss |
| `decimal` | `bool` | clean for exact `0` / `1`; incompatible otherwise |
| `int` | `bool` | clean for `0` / `1`; incompatible otherwise |
| `text` | `int` | narrowest-first chain: try `int` parse (clean); fall back to `real` parse and truncate (lossy); else incompatible |
| `text` | `real` | try `real` parse (clean); else `decimal` parse if it fits f64 (clean) or doesn't (lossy); else incompatible |
| `text` | `decimal` | try `decimal` grammar (clean); else `real` parse (lossy precision); else incompatible |
| `text` | `bool` | "true" / "false" (case-insensitive) → clean; everything else incompatible (no implicit `0` / `1` parse — matches the DSL boolean grammar) |
| `text` | `date` | match `YYYY-MM-DD` → clean; everything else incompatible |
| `text` | `datetime`| ISO-8601 datetime → clean; bare date → lossy with implicit `T00:00:00Z`; else incompatible |
| `text` | `shortid` | base58 alphabet, 1012 chars (per ADR-0014) → clean; else incompatible |
#### Statically refused (no entry in the matrix)
- Anything → `serial` (carried over from B2/C2)
- Anything → or from `blob` (v1 deferral; encoding ambiguity)
- Same-type identity (no-op; carried over from B2/C2)
- `date``datetime` direct (deferred for v1; users route via `text` if needed)
- All cross-domain pairs not listed above (e.g. `bool`
`date`, `real``datetime`, `int``shortid`)
The relationship-involvement preconditions (§4) apply
*before* this matrix is consulted.
### 4. Primary-key and uniqueness-bearing columns
The B2/C2 implementation refused all type changes to PK
columns and to any column involved in a declared
relationship. That was too coarse: it conflated two
concerns that should be split.
#### 4.1 Inbound foreign keys: when does the cascade actually bite?
The cascade only matters when the new type would change
the FK target type that referencing columns must have. Per
ADR-0011's `fk_target_type()` rule:
- `serial.fk_target_type() == Int`
- `shortid.fk_target_type() == Text`
- All other types: identity
So `serial → int` on a PK preserves `fk_target_type` (both
yield `Int`); FK columns referencing the PK stay `int`,
the underlying storage is unchanged, no cascade is needed.
Same for `shortid → text` and `text → shortid` (both yield
`Text`).
The precondition is therefore:
> If the column has any inbound FK *and*
> `old_type.fk_target_type() != new_type.fk_target_type()`,
> refuse with a friendly cascade message
> ("`<T>.<col>` is referenced by N relationship(s); changing
> its type to `<dst>` would change the type that referencing
> columns require — drop those relationships first or pick a
> target type whose FK shape matches the current one").
> Otherwise allow.
This unblocks the most-natural PK conversion (`serial →
int`, removing auto-increment while preserving stored
values) and `shortid ↔ text` round-trips on PKs that have
real-world relationships.
#### 4.2 Outbound foreign keys: refuse for v1
If the column is itself an FK (the child side of a
relationship), changing its type would either require its
new type to match the parent's `fk_target_type` (which
typically reduces to a no-op) or break the constraint. v1
refuses outbound-FK type changes outright; the user drops
the relationship first.
#### 4.3 Uniqueness-bearing columns: post-transformation collision check
Some types and constraints carry a uniqueness contract
that the per-cell classification can't see — multiple
distinct source values can collapse to the same target
value under a lossy transformation, violating the
contract even though every individual cell transformed
"successfully."
In v1 the uniqueness check applies to:
- **Primary-key columns** (the SQL-level UNIQUE+NOT NULL
guarantee).
- **shortid columns** (the design-level contract that
shortids are unique short identifiers, even when
not the PK).
- *(Future)* **`UNIQUE` constraint columns** when C3's
full constraint set lands.
After the per-cell pass produces transformed values, the
transformed values are checked for duplicates. Any
collision is **incompatible** (cross-row, structural —
no `--force-conversion` override; the user must clean
the source data). The error reports the colliding rows:
> Cannot change `T.col` from real to int: 2 row(s)
> would collapse to the same value.
>
> row 5 ('3.14') and row 12 ('3.7') would both
> become '3'.
In practice the only realistic incoming conversion that
exercises both the per-cell shortid-grammar check *and*
the uniqueness check is `text → shortid`. Other source
types fail per-cell (`int 42` doesn't match the base58 +
length grammar) before uniqueness becomes relevant.
#### 4.4 Combined preconditions
Putting §4.1, §4.2, §4.3 together, a `change column …`
invocation is refused at the precondition stage when any
of:
- The column is the *child* side of a relationship
(outbound FK on this column).
- The column is the *parent* side of a relationship and
`old_type.fk_target_type() != new_type.fk_target_type()`.
…and is refused after the per-cell dry run (and so still
classifies as incompatible) when:
- The column is uniqueness-bearing (PK, shortid, or future
UNIQUE) and the transformed values contain duplicates.
Otherwise the matrix's per-cell classification governs.
### 5. Override flags
Both flags are opt-in per ADR-0009's `--<name>` convention.
#### `--force-conversion`
Skips the lossy-refusal in step 4 of §2. **Does not**
change the static refusal (step 1) or the incompatible
refusal (step 3): no flag makes `text "abc" → int` work.
When invoked, the dry run still classifies cells; lossy
cells transform per the matrix; the client-side note (§6)
includes both the count of cells that needed transformation
*and* the count that were lossy (split by classification).
**Forward-look (not in this ADR's implementation scope).** A
later iteration may extend the grammar to let the user
specify resolutions for incompatibles, e.g.:
```
change column T: c (int) --default 0
change column T: c (int) --on-incompatible '0'
```
…which would land cells that fail the parse with the given
default value. This generalises `--force-conversion` from a
binary "accept loss" toggle into a continuum and gives the
learner practical experience with the kinds of resolutions
that real-world data work needs. The current ADR
deliberately doesn't commit to syntax for that — the
binary `--force-conversion` is enough for v1, and the
forward-look exists to preserve the design space rather
than constrain a future ADR.
#### `--dont-convert`
Skips the entire client-side layer: no transformer, no
dry-run, no per-cell classification. Hands the source
column's raw cells to the rebuild's `INSERT INTO new
SELECT FROM old` step and lets the database's STRICT
typing decide. Engine error text is never surfaced
verbatim; failures are reported via the same
friendly-error layer the rest of the app uses.
This is the escape hatch for users who explicitly want to
see what the database itself will do — a pedagogical lever
for "what does raw SQL behaviour look like here?" without
dropping into advanced mode.
#### Mutual exclusion
`--force-conversion` and `--dont-convert` are mutually
exclusive. Specifying both is a parse error: forcing
client-side conversion while disabling client-side
conversion is contradictory. The error message names both
flags and says "pick one."
### 6. Reporting: the "client-side conversion was applied" note
When a successful change involves any non-identity
transformation (i.e. cells were rewritten before reaching
the database), the success summary includes a line of the
form:
> [client-side] N row(s) were transformed before being
> stored. In raw SQL this would need an explicit `CAST` or
> application-level code.
When `--force-conversion` succeeded with lossy rows, the
note adds the lossy count specifically:
> [client-side] N row(s) transformed; M of those discarded
> information (lossy). In raw SQL this would need an
> explicit `CAST` or application-level code.
User-facing strings throughout this ADR — and throughout
the application generally — never name the underlying
database engine. The engine is an implementation detail;
the playground's pedagogical surface is "the database" in
the abstract. (ADR-internal prose still references SQLite
where technically necessary for the spec writer; that's
not user-visible.)
The note's purpose is pedagogical, not diagnostic — it
points at the moment where the tool went beyond what bare
SQL allows. Without it, a learner would have no way to
know that "this just worked" was actually the playground
doing them a favour.
### 7. Error presentation
Tabular detail in both error and success output is
rendered through the pretty-table renderer (ADR-0016) —
no ad-hoc indented-line layouts. The rule is: anywhere
the output describes more than a handful of rows of
structured per-row detail, it goes through
`render_data_table` (or an equivalent helper). This keeps
the visual identity consistent across DDL, query results,
and these conversion diagnostics.
#### Lossy refusals
```
Cannot change `T.col` from real to int: 50 row(s) would
discard information.
┌─────┬───────┬─────┬───────────────────────────────────┐
│ Row │ From │ To │ Reason │
├─────┼───────┼─────┼───────────────────────────────────┤
│ 5 │ 3.14 │ 3 │ truncated; would discard 0.14 │
│ 12 │ 2.71 │ 2 │ truncated; would discard 0.71 │
│ 18 │ 1.5 │ 1 │ truncated; would discard 0.5 │
│ … │ … │ … │ … and 47 more │
└─────┴───────┴─────┴───────────────────────────────────┘
if you want to execute this conversion in spite of the
problems, re-run with `--force-conversion`.
```
#### Incompatible refusals
```
Cannot change `T.col` from text to int: 3 row(s) cannot
be converted.
┌─────┬───────┬───────────────────────┐
│ Row │ Value │ Reason │
├─────┼───────┼───────────────────────┤
│ 3 │ abc │ not a valid int │
│ 7 │ x42 │ not a valid int │
│ 12 │ │ not a valid int │
└─────┴───────┴───────────────────────┘
```
The trailing `--force-conversion` hint is omitted for
incompatibles (no flag helps; future syntax — §5
forward-look — would re-introduce one).
#### Uniqueness collisions
```
Cannot change `T.col` from real to int: 1 collision(s)
would violate uniqueness.
┌─────────┬─────────────────┬──────────────────┐
│ Becomes │ Source rows │ Source values │
├─────────┼─────────────────┼──────────────────┤
│ 3 │ row 5, row 12 │ 3.14, 3.7 │
└─────────┴─────────────────┴──────────────────┘
```
#### Common rules
- Each detail table is **capped at 100 rows**. Beyond that,
a single trailing row with `…` placeholders and the
literal text "and N more" inside the row is rendered
inside the table — not as a footer line. Keeps the
bordered shape intact.
- Row indices are 1-based to match how learners count
rows; the tool counts internally from 0.
- Numeric "Row" / "Becomes" columns inherit numeric
right-alignment from ADR-0016 §2.
- Cells that would render multi-line content (for `text →`
conversions where source values contain newlines) honour
ADR-0016 §3's `↵` substitution, so the table stays one
display row per logical row.
### 8. Out of scope
- **OOS-1.** Anything ↔ `blob` conversion. Encoding
ambiguity (base64? raw bytes? UTF-8 attempt?) deserves
its own discussion.
- **OOS-2.** `date``datetime` direct conversion. Format
rewriting is small but warrants a per-conversion test
matrix; defer until a real user need surfaces.
- **OOS-3.** Resolution-specifying flags (`--default`,
`--on-incompatible '<value>'`, etc.) per the §5
forward-look.
- **OOS-4.** Bulk conversions across multiple columns in
one command. Each `change column` runs independently.
- **OOS-5.** Cross-row contextual transformations (e.g.
"rank by value to fit a smaller numeric range"). The
transformer is per-cell, deliberately stateless.
## Consequences
- The placeholder behaviour ("rely on SQLite STRICT") is
replaced with a documented per-cell model that produces
learner-friendly errors and pedagogical client-side
notes.
- The transformer matrix is an additional surface to keep
in step with `dsl/value.rs`'s validators. Each new
user-facing type added to the type vocabulary needs its
matrix entries reviewed.
- The `[client-side] …` note is the load-bearing
pedagogical artefact: it's how a learner discovers that
the tool did them a favour. Future visualisation /
styling work (V4) should preserve its prominence.
- `--dont-convert` keeps the door open to a "raw SQL
behaviour" learning mode without forcing the user into
advanced mode.
- The forward-look in §5 means `--force-conversion`'s
semantics may broaden later. Implementations should
treat the flag's effect as "accept loss" rather than as
the canonical resolution mechanism.
## Relationship to earlier ADRs
- **ADR-0013** — the rebuild-table primitive remains the
mechanism. This ADR adds a per-row transformation step
*between* read-out and write-back; the existing primitive
can be parametrised by a row-by-row transformer or paired
with a sibling helper.
- **ADR-0014** — the per-type validators in `dsl/value.rs`
power the dry-run classification. No changes to those
validators; they're consumed read-only here.
- **ADR-0011** — FK target-type compatibility is
unaffected. Relationship-involved columns are statically
refused before this ADR's matrix is consulted.
- **ADR-0009** — the `--force-conversion` and
`--dont-convert` flags follow the established
long-flag opt-in convention.
+1
View File
@@ -22,3 +22,4 @@ This directory contains the project's ADRs, recorded per
- [ADR-0014 — Data operations, value literals, and the auto-show pattern](0014-data-operations-and-value-model.md)
- [ADR-0015 — Project storage runtime](0015-project-storage-runtime.md)
- [ADR-0016 — Pretty table rendering for data and structure views](0016-pretty-table-rendering.md)
- [ADR-0017 — Column type-change compatibility](0017-column-type-change-compatibility.md)