Settles the undo/snapshot half (U1/U2) before implementation: - every-mutation single-step undo (supersedes destructive-only model) - hybrid whole-project snapshot (db backup API + yaml/csv copy), reconciling ADR-0006 with ADR-0015's derived-db model - persisted N=50 ring; redo discarded on new work - batch ops (replay + future) record one undo step; import excluded - --no-undo disable switch Adds the implementation plan and updates README index, requirements U1/U2, and CLAUDE.md in lockstep.
24 KiB
Plan: ADR-0006 undo / snapshot half (U1/U2)
Date: 2026-05-24 · Author: session 36 · ADR: 0006 (undo snapshots and replay log) — replay half (U3/U4) shipped in ADR-0034; this plan covers the undo / snapshot half (U1/U2).
Status of this doc: draft for review. The load-bearing design calls below were confirmed with the user before writing (see §2); the second-order calls in §10 are recommendations awaiting sign-off. No code is written until this plan is approved.
1. Baseline
cargo test: green (exit 0). Handoff 35 records 1659 passing / 0 failing / 0 skipped / 1 ignored (thefriendly/mod.rsdoctest). This is the Phase-1 baseline Phase-5 is measured against.cargo clippy --all-targets -- -D warnings: clean (per handoff).
2. Decisions locked with the user (do not re-litigate)
All confirmed this session via direct Q&A.
- Snapshot mechanism — hybrid whole-project snapshot. A
snapshot stores the database (copied via SQLite's online backup
API, the only safe way to copy a live db) plus the
authoritative text (
project.yaml+data/*.csv, copied as inert files). Undo restores all three directly — no rebuild, no re-derivation. This satisfies both ADR-0006 (backup API is used) and ADR-0015 (text remains the authoritative copy), so the mechanism needs no ADR amendment. Rationale for storing the db too (rather than text-only + rebuild): the user prioritised a direct, robust restore for a safety feature over the smaller disk footprint of a rebuild-on-undo approach. - Lifetime — persisted on disk, ring capped at N = 50. Snapshots
live under the project directory and survive quit (undo works
after reopening). The ring keeps the most recent 50 snapshots;
the oldest is evicted on overflow.
50is a single tunable constant. - Redo — discarded on new work. Standard linear semantics: any new mutation after an undo clears the redo stack.
- Scope — every mutation (single-step, Ctrl-Z-style undo). A snapshot is taken before every data/schema mutation (insert, update, delete, drop, all DDL, all SQL DML), not only before destructive ops. The user judged this the right model for the playground's teaching philosophy. This supersedes two pillars of ADR-0006 and requires an amendment (see §4).
undorequires confirmation that names the command being undone. (Confirmation is kept per the user's explicit instruction, even though single-step undo is more frequent than the ADR's original "rare and consequential" framing.)- Disable switch — in the picture from day one. A way to turn
undo off entirely, so per-command snapshotting can be avoided if it
proves too heavy on small hardware. Shape: a
--no-undoCLI flag (CLI-only for v1, no in-app toggle — confirmed). - Batch operations record ONE undo step. A single user command
that executes many sub-operations —
replaytoday, and any future in-project batch command — takes one boundary snapshot for the whole batch, not one per sub-command. Undo rolls the batch back wholesale. This is both a perf win (replaying a longhistory.logis one db backup, not N) and the consistent reading of "one undo step per user command."importis not a batch op — it creates a new project and switches to it (ADR-0015 §11), leaving the current project untouched on disk, so it sits outside the per-project undo ring entirely (the new project just starts with an empty ring). Project-switch-level "undo back to the previous project" is a separate, out-of-scope mechanism.
3. Phase 1 — Requirements checklist
From ADR-0006 (as amended), the user decisions in §2, and the project's working standards.
Functional
- R1 Auto-snapshot before every mutation (DSL + SQL), gated by the enable flag.
- R1b A batch command (
replay, future batch ops) records exactly one boundary snapshot for the whole batch (not one per sub-command), via the Begin/EndBatch primitive; finalized only if ≥1 mutation actually ran.importtakes no undo step (project switch, not a current-db mutation). - R2 A snapshot captures db (backup API) +
project.yaml+data/*.csv, atomically, representing the pre-mutation committed state. - R3 Snapshots are committed into the ring only after the triggering mutation commits successfully; a rolled-back op leaves no snapshot. Preserves ADR-0015 §6 commit-db-last ordering.
- R4 Persisted ring buffer under the project dir, cap 50, oldest evicted on overflow.
- R5
undoapp command (both modes, no sigil): confirmation modal naming the command to be undone + its relative timestamp → on confirm, restore the top snapshot (db + text) → pop the ring. - R6 Undo pushes the pre-undo (current) state onto a redo
stack so
redois possible (ADR-0006). Redo restores it. - R7
redoapp command restores the top redo entry. New mutation clears the redo stack (R-decision 3). - R8 Undo/redo into an empty stack reports a friendly note ("nothing to undo" / "nothing to redo"); never errors.
- R9 Disable switch (
--no-undo): when set, no snapshots are taken (zero per-command overhead) andundo/redoreport that undo is disabled. - R10 Restore correctly re-establishes a consistent (db, yaml,
csv) triple, including the NULL-vs-empty CSV distinction and
internal metadata tables (
__rdbms_playground_*). - R11 After undo/redo, the TUI refreshes its view (re-query) so the user sees the restored state.
Cross-cutting / integration
- R12
undo/redoadded toruntime::is_app_lifecycle_entry_wordso replay skips them (ADR-0034 Amendment 1) — and to the completion entry-keyword set (src/completion.rs), which the function's doc-comment says must stay in lockstep. - R13 Snapshots dir added to the
.gitignoretemplate (src/project/mod.rs), the export exclusion (src/archive.rs— likeplayground.dbandhistory.log), and the temp-project cleanup allowlist (safely_delete_temp_project) so an otherwise- empty temp carrying a snapshots dir is still safely deletable. - R14
Cargo.toml: add"backup"to therusqlitefeature list. - R15 Engine-neutral user-facing strings (ADR-0002): no "SQLite" / "backup API" / "PRAGMA" in notes or modal text. Use "snapshot" / "the database".
Documentation
- R16 Write an ADR-0006 amendment (see §4); update
docs/adr/README.mdin the same edit (ADR-0000 index rule). - R17 Update
docs/requirements.mdU1/U2 from designed → implemented; updateCLAUDE.md's ADR-0006 "Designed; not yet implemented" line.
Testing (ADR-0008 four tiers)
- R18 Tier 1 (unit/pure): ring eviction, redo-clear-on-new- work, mutation-vs-readonly classification, modal key handling, command parsing, disable-flag gating, CLI parse.
- R19 Tier 2 (insta snapshots): undo/redo confirmation modal rendering; disabled-undo note.
- R20 Tier 3 (integration, real db + temp project): snapshot
taken on mutation; undo restores db and text; redo; ring
eviction at the cap; redo cleared by new work;
--no-undotakes no snapshots; persistence-ordering invariant preserved. - R21 Full-stack flow (per global standards): bootstrap a real project, run a sequence of DSL and SQL mutations, undo/redo across both, verify read-model + on-disk yaml/csv/db all consistent, reopen the project and confirm the ring persisted.
- R21b Batch: a
replayof a multi-command file produces exactly one ring entry; undo rolls the whole replay back; an all-skips replay leaves no undo step;importproduces no undo step in the current project. - R22 Phase-3 N/A matrix row closed ("auto-snapshot fires for SQL DML the same as DSL") — now non-vacuous.
- R23 End state: all green, zero skips, no regression vs the §1 baseline.
4. ADR-0006 amendment scope
The every-mutation decision (§2.4) contradicts ADR-0006 as written, so an amendment is required (the ADR is Accepted — we amend in place with a dated note, per the project's "no silent drift" rule). The amendment records:
- Snapshot scope: "before every destructive operation" → "before every data/schema mutation" (single-step undo). State the pedagogical rationale (intuitive Ctrl-Z; the playground favours clarity over micro-optimisation).
- Confirmation rationale: the original "undo is rare and consequential, so always confirm" is replaced by "undo is single-step and confirmation names the exact command being undone." Confirmation is still always shown; there is still no suppress flag. The summary simplifies — with per-command snapshots, undo rolls back exactly one command, so the "discarded changes" summary is just that command's description (no expensive db-diff needed). This resolves ADR-0006's "counts of rows added/modified/deleted per table since the snapshot" clause: under single-step undo there is no intervening un-snapshotted work, so the clause collapses to "the one command."
- Snapshot mechanism: clarify the hybrid (db via backup API + text via copy) and how it reconciles with ADR-0015 (db is derived; text authoritative) — restore re-establishes the whole triple.
- Storage & lifetime: persisted ring under the project, cap 50, git-ignored, export-excluded, temp-cleanup-aware.
- Redo: discarded on new work.
- Disable switch:
--no-undoas a hardware escape hatch. - Batch granularity: a single user command that runs many
sub-operations (
replay, future batch ops) is one undo step, taken at the batch boundary. Consistent with "one undo step per user command."importis a project switch and takes no undo step.
5. Architecture & design
5.1 On-disk layout
<project>/
project.yaml
data/<table>.csv
playground.db
.snapshots/
index.json # ordered ring + redo stack: ids, timestamps,
# command text, status; the source of truth
# for ordering/eviction
0001/
playground.db # backup-API copy of the pre-op db
project.yaml # pre-op copy
data/<table>.csv # pre-op copies
0002/ ...
.staging/ # temp area for an in-flight snapshot
.snapshots/is git-ignored, export-excluded, and on the temp-cleanup allowlist (R13).- The ring (undo) and the redo stack are both recorded in
index.json; snapshot payload dirs are shared storage referenced by id. Eviction beyond 50 deletes the oldest payload dir. - Ids are monotonic; never reused, to avoid stale references.
5.2 Snapshot mechanism
- db copy:
rusqlite::backup—Backup::new(&live, &snap)copies live → snapshot file; for restore,Backup::new(&snap, &live)copies snapshot → live (works on the open worker connection; requires thebackupcargo feature, R14). No transaction may be open on the live connection during a backup step. - text copy: plain file copy of
project.yaml+ everydata/*.csvinto the snapshot dir. (CSVs are produced by the project's hand-rolled writer that preserves NULL-vs-empty; we copy the already-written files verbatim, so the distinction is preserved for free — R10.)
5.3 Timing vs ADR-0015 §6 (the load-bearing invariant)
A snapshot brackets the existing 4-step persistence sequence:
0. STAGE snapshot (NEW) — backup live db + copy current text into
.snapshots/.staging/. The current on-disk
state is the last committed state (db, yaml,
csv are committed together per ADR-0015), so
staging captures the true pre-op state.
1. open txn; validate; mutate db + metadata (ADR-0015 step 1)
2. write text temp files; fsync (step 2)
3. atomic-rename text into place (step 3)
4. commit db (step 4)
5. FINALIZE snapshot (NEW) — atomic-rename .staging/ → .snapshots/<id>/,
append to index.json ring, evict oldest if
>50, clear redo stack.
On any failure in 1–4 → txn rolls back; DISCARD .staging/.
This preserves commit-db-last: the snapshot is finalised after
the db commit, and a rolled-back op leaves no ring entry (R3). A crash
mid-sequence leaves a .staging/ orphan, cleaned on next open exactly
like ADR-0015's orphan text temps.
5.4 Worker chokepoint (single point, not per-handler)
Intercept centrally in db.rs::handle_request (the dispatcher), not
in each do_* handler:
- Classify each
Requestas mutating or read-only (a singleis_mutating(&Request) -> bool, exhaustively matched so a new request variant forces a classification decision at compile time). - For mutating requests, wrap the handler call:
stage_snapshot()→ run handler → onOkfinalize_snapshot()+ clear redo; onErrdiscard_staging(). - The command text for the snapshot meta comes from the request's
existing
source: Stringfield (added during the ADR-0034 journal work). - Refactor note: mutating handlers currently send their reply via
the
oneshotinside the match arm. To let the wrapper observe success/failure, mutating handlers return theirResultto the dispatcher, which sends the reply and drives snapshot finalize/discard. This is the main structural change indb.rs; read-only handlers are untouched. (RebuildFromTextis mutating and snapshots like the rest; it reconstructs from text, but the pre-op db+text are still the correct undo target.) - Gate: when undo is disabled (R9),
is_mutatingstill holds but the stage/finalize calls are no-ops — the worker is told at construction whether snapshotting is enabled.
5.5 Undo / redo flow (commands → modal → worker)
Mirrors the existing two-phase rebuild modal flow, with the
prepare step reading snapshot metadata rather than the project:
undoparses toCommand::App(AppCommand::Undo)→dispatch_app_commandreturnsAction::PrepareUndo.- Runtime handles
PrepareUndo: reads.snapshots/index.jsontop entry (command text + timestamp) — a cheap file read, likesummarize_projectdoes for rebuild — and postsAppEvent::UndoPrepared { command, when }(orUndoUnavailableif the ring is empty → friendly note, R8). App::updateopensModal::UndoConfirmshowing "Undo<command>(run )? [Y] yes [N] no".Y→Action::Undo;N/Esc→ close + "undo cancelled".- Runtime handles
Action::Undo→ workerRequest::Undo:- push a snapshot of current state onto the redo stack,
- restore the top ring snapshot (backup-restore db + rename text back), pop the ring,
- reply with the undone command text.
- Runtime posts
AppEvent::UndoSucceeded { command }/UndoFailed { error };Appcloses the modal, notes the outcome, and triggers a re-query so the view refreshes (R11).
redo is symmetric (PrepareRedo / Redo, Request::Redo, restores
the top redo entry, pushes current onto the undo ring). Whether redo
also confirms is an open sub-decision (§10).
5.6 Batch operations — one undo step (Begin/EndBatch)
run_replay (runtime.rs:1622) dispatches one worker request per
replayed command via execute_command_typed. Left to §5.4, a long
replay would stage one snapshot per line. Instead, a reusable batch
primitive collapses a batch to a single ring entry:
Request::BeginBatch { source }— stages one snapshot (meta = the batch command, e.g.replay history.log) and sets the worker'sin_batchflag. While set, per-request staging in §5.4 is skipped (the boundary snapshot already captured pre-batch state).Request::EndBatch— finalizes the staged snapshot into the ring only if ≥1 mutation occurred during the batch (else discards it, so an all-skips replay leaves no no-op undo step); clearsin_batchand the redo stack.run_replaybrackets its loop with begin/end. A mid-replayPersistenceFataltears the runtime down beforeEndBatch, leaving a.staging/orphan that is cleaned on next open (ADR-0015 parity); undo state is moot at that point anyway.- The same primitive serves any future in-project batch command.
importdoes not use this — it creates a new project and switches (ADR-0015 §11); the current project is untouched, so there is nothing to snapshot and no undo step.rebuildis already a single request (RebuildFromText) → naturally one snapshot, no batch handling needed.
5.7 New / changed types (from the Explore survey)
| Layer | Change |
|---|---|
Cargo.toml |
add "backup" to rusqlite features |
src/cli.rs |
Args.no_undo: bool; parse --no-undo |
src/dsl/command.rs |
AppCommand::Undo, AppCommand::Redo |
src/dsl/grammar/app.rs |
UNDO / REDO CommandNode (EMPTY_SEQ, like REBUILD) |
src/action.rs |
PrepareUndo, Undo, PrepareRedo, Redo |
src/event.rs |
UndoPrepared/UndoUnavailable/UndoSucceeded/UndoFailed (+ redo) |
src/app.rs |
Modal::UndoConfirm (+ redo) struct; event arms; handle_undo_confirm_key; dispatch arms |
src/ui.rs |
render_undo_confirm (mirror render_rebuild_confirm) |
src/db.rs |
is_mutating; stage/finalize/discard_snapshot; Request::Undo/Redo/BeginBatch/EndBatch; dispatcher wrap; undo_enabled + in_batch worker fields |
src/persistence/ (or new src/snapshots/) |
ring + redo store, index.json, eviction, restore |
src/runtime.rs |
is_app_lifecycle_entry_word += undo,redo; PrepareUndo/Undo (+redo) handling; thread no_undo to worker; bracket run_replay with Begin/EndBatch |
src/project/mod.rs |
.snapshots/ in .gitignore template |
src/archive.rs |
exclude .snapshots/ from export |
safely_delete_temp_project |
.snapshots/ on the contents allowlist |
src/completion.rs |
undo/redo in app-command entry keywords |
src/friendly/strings/en-US.yaml |
modal title/prompt, success/cancel/disabled notes, CLI help |
6. Phase 2/3 — candidates considered (condensed)
The mechanism choice was the crux; three candidates were evaluated and the user selected the hybrid (§2.1). Recorded here so the rejected options aren't silently dropped:
| Candidate | Stores | Undo restore | Verdict |
|---|---|---|---|
| C1 db-only via backup API | db | restore db + re-derive text via finalize_persistence |
Honors ADR-0006 letter; needs a new "db→all-text" re-derive path. Rejected: less direct restore. |
| C2 (chosen) hybrid | db + yaml + csv | restore all three directly | Most robust/direct restore; satisfies ADR-0006 + ADR-0015 with no mechanism amendment. Selected. |
| C3 text-only | yaml + csv | restore text + rebuild_from_text |
Smallest disk; reuses tested rebuild. Rejected: rebuild dependency at undo time is less direct for a safety feature. |
Scope candidates (destructive-only / +UPDATE / every-mutation) and redo semantics were likewise put to the user; selections in §2.
7. Devil's Advocate review of this plan
- Per-mutation snapshot cost. Backing up the db + copying all CSVs
before every mutation is the real cost; a bulk paste of N inserts
makes N snapshots. Mitigations: teaching-scale dbs are small;
ring cap 50 bounds total storage;
--no-undois the escape hatch; the ADR-0015 "batch command" remains the future remedy. A hardlink/CoW optimisation for unchanged files between consecutive snapshots is noted as a future improvement, not v1. - Disk growth. 50 × (db + text) could reach tens of MB for larger projects. Honest consequence; bounded by the cap and the disable flag. Flagged in the amendment's Consequences.
- Backup API + open transaction. A backup step requires no open txn on the live connection. The stage step runs before the mutation txn opens, and restore runs outside any txn — safe by construction, but tests must assert it (R20).
- Restore must include internal metadata tables. The db backup is
byte-faithful so
__rdbms_playground_*tables come along automatically; the text copy carries the yaml/csv. R10 test guards this explicitly. - Temp-cleanup interaction. A mutation in a fresh temp creates
.snapshots/; if the user then undoes back to empty, an empty temp may carry a.snapshots/dir. The allowlist entry (R13) keepssafely_delete_temp_projectwilling to delete it; a test covers it. - Replay safety. Without R12, a
history.logcontainingundowould abort/panic replay (ADR-0034). Adding to both the runtime filter and the completion set is mandatory, not optional. is_mutatingexhaustiveness. Must be a non-_match so any futureRequestvariant fails to compile until classified — prevents a new mutation silently escaping the snapshot hook.
No DA finding is deferred; all are addressed above or tracked as explicit future items (hardlink optimisation) for user awareness.
8. Implementation sequence (test-first throughout)
- Cargo + CLI: add
backupfeature;--no-undoflag + parse tests. (R14, R9-partial) - Snapshot store module:
index.jsonmodel, ring + redo, stage/ finalize/discard/restore, eviction — Tier-1 tests first against a temp dir + in-memory/temp db. (R2, R3, R4, R6, R7) - Worker integration:
is_mutating(exhaustive), dispatcher wrap,undo_enabledgate,Request::Undo/Redo. Tier-3 tests: mutation→snapshot, undo restores db+text, redo, eviction, redo-clear, ordering preserved,--no-undono-ops. (R1, R3, R10) - Commands + grammar:
AppCommand::Undo/Redo, grammar nodes, parse tests; replay-filter + completion entries + their lockstep test. (R5, R7, R12) - Action/event/modal/UI: prepare→confirm→execute flow; modal key handling (Tier 1); modal + disabled-note rendering (Tier 2 insta); re-query on success. (R5, R8, R11, R19)
- Cross-cutting files:
.gitignoretemplate, export exclusion, temp-cleanup allowlist — each with a test. (R13) - Full-stack flow mixing DSL + SQL mutations, undo/redo across both, reopen-and-verify-persistence. (R21, R22)
- Docs: ADR-0006 amendment + README index; requirements U1/U2; CLAUDE.md line. (R16, R17)
- Phase 5 verification: full suite green, zero skips, no regression vs §1; DA final pass. (R23)
9. Engine-neutral string notes (ADR-0002)
User-facing text uses "snapshot", "undo", "the database". Never
"SQLite", "backup API", "PRAGMA", "VACUUM". Modal: "Undo <command>
(run )?". Disabled: "Undo is turned off for this
session." Empty: "Nothing to undo." These wordings are placeholders
pending the catalog edit.
10. Sub-decisions — all confirmed (2026-05-24)
- Disable mechanism shape:
--no-undoCLI flag only for v1 (no in-app toggle). - Redo also confirms, naming the command to re-apply (symmetry with undo).
- Redo is in scope for this pass (discard-on-new-work semantics).
- Confirm key:
Yconfirms /N/Esccancel, matching the rebuild modal.
Still open (one question to the user): whether project-switch-level
"undo back to the previous project" should be tracked as a separate
future feature. Default if unanswered: leave import/project-switch
wholly outside undo (the prior project is intact on disk and reachable
via load / --resume).
11. Next action after approval
On sign-off: write the ADR-0006 amendment first (it's the contract), update the README index, then implement per §8 — test-first, escalate any further ADR-vs-implementation mismatch rather than deciding.