# Plan: ADR-0006 undo / snapshot half (U1/U2) **Date:** 2026-05-24 · **Author:** session 36 · **ADR:** 0006 (undo snapshots and replay log) — replay half (U3/U4) shipped in ADR-0034; this plan covers the **undo / snapshot half (U1/U2)**. **Status of this doc:** draft for review. The load-bearing design calls below were confirmed with the user before writing (see §2); the second-order calls in §10 are recommendations awaiting sign-off. No code is written until this plan is approved. --- ## 1. Baseline - `cargo test`: green (exit 0). Handoff 35 records **1659 passing / 0 failing / 0 skipped / 1 ignored** (the `friendly/mod.rs` doctest). This is the Phase-1 baseline Phase-5 is measured against. - `cargo clippy --all-targets -- -D warnings`: clean (per handoff). --- ## 2. Decisions locked with the user (do not re-litigate) All confirmed this session via direct Q&A. 1. **Snapshot mechanism — hybrid whole-project snapshot.** A snapshot stores the database (copied via SQLite's **online backup API**, the only safe way to copy a live db) **plus** the authoritative text (`project.yaml` + `data/*.csv`, copied as inert files). Undo **restores all three directly** — no rebuild, no re-derivation. This satisfies **both** ADR-0006 (backup API *is* used) **and** ADR-0015 (text remains the authoritative copy), so the *mechanism* needs no ADR amendment. Rationale for storing the db too (rather than text-only + rebuild): the user prioritised a direct, robust restore for a safety feature over the smaller disk footprint of a rebuild-on-undo approach. 2. **Lifetime — persisted on disk, ring capped at N = 50.** Snapshots live under the project directory and survive quit (undo works after reopening). The ring keeps the most recent **50** snapshots; the oldest is evicted on overflow. `50` is a single tunable constant. 3. **Redo — discarded on new work.** Standard linear semantics: any new mutation after an undo clears the redo stack. 4. **Scope — every mutation (single-step, Ctrl-Z-style undo).** A snapshot is taken before **every** data/schema mutation (insert, update, delete, drop, all DDL, all SQL DML), not only before destructive ops. The user judged this the right model for the playground's teaching philosophy. **This supersedes two pillars of ADR-0006** and requires an amendment (see §4). 5. **`undo` requires confirmation** that names the command being undone. (Confirmation is kept per the user's explicit instruction, even though single-step undo is more frequent than the ADR's original "rare and consequential" framing.) 6. **Disable switch — in the picture from day one.** A way to turn undo off entirely, so per-command snapshotting can be avoided if it proves too heavy on small hardware. Shape: a `--no-undo` CLI flag (CLI-only for v1, no in-app toggle — confirmed). 7. **Batch operations record ONE undo step.** A single user command that executes many sub-operations — `replay` today, and any future in-project batch command — takes **one boundary snapshot** for the whole batch, not one per sub-command. Undo rolls the batch back wholesale. This is both a perf win (replaying a long `history.log` is one db backup, not N) and the consistent reading of "one undo step per user command." **`import` is *not* a batch op** — it creates a new project and switches to it (ADR-0015 §11), leaving the current project untouched on disk, so it sits **outside** the per-project undo ring entirely (the new project just starts with an empty ring). Project-switch-level "undo back to the previous project" is a separate, out-of-scope mechanism. --- ## 3. Phase 1 — Requirements checklist From ADR-0006 (as amended), the user decisions in §2, and the project's working standards. ### Functional - [ ] **R1** Auto-snapshot before every mutation (DSL + SQL), gated by the enable flag. - [ ] **R1b** A **batch** command (`replay`, future batch ops) records exactly one boundary snapshot for the whole batch (not one per sub-command), via the Begin/EndBatch primitive; finalized only if ≥1 mutation actually ran. `import` takes **no** undo step (project switch, not a current-db mutation). - [ ] **R2** A snapshot captures db (backup API) + `project.yaml` + `data/*.csv`, atomically, representing the **pre-mutation committed** state. - [ ] **R3** Snapshots are committed into the ring **only after** the triggering mutation commits successfully; a rolled-back op leaves no snapshot. Preserves ADR-0015 §6 commit-db-last ordering. - [ ] **R4** Persisted ring buffer under the project dir, cap 50, oldest evicted on overflow. - [ ] **R5** `undo` app command (both modes, no sigil): confirmation modal naming the command to be undone + its relative timestamp → on confirm, restore the top snapshot (db + text) → pop the ring. - [ ] **R6** Undo pushes the pre-undo (current) state onto a redo stack so `redo` is possible (ADR-0006). Redo restores it. - [ ] **R7** `redo` app command restores the top redo entry. New mutation clears the redo stack (R-decision 3). - [ ] **R8** Undo/redo into an empty stack reports a friendly note ("nothing to undo" / "nothing to redo"); never errors. - [ ] **R9** Disable switch (`--no-undo`): when set, no snapshots are taken (zero per-command overhead) and `undo`/`redo` report that undo is disabled. - [ ] **R10** Restore correctly re-establishes a consistent (db, yaml, csv) triple, including the NULL-vs-empty CSV distinction and internal metadata tables (`__rdbms_playground_*`). - [ ] **R11** After undo/redo, the TUI refreshes its view (re-query) so the user sees the restored state. ### Cross-cutting / integration - [ ] **R12** `undo`/`redo` added to `runtime::is_app_lifecycle_entry_word` so **replay skips them** (ADR-0034 Amendment 1) — and to the completion entry-keyword set (`src/completion.rs`), which the function's doc-comment says must stay in lockstep. - [ ] **R13** Snapshots dir added to the `.gitignore` template (`src/project/mod.rs`), the **export exclusion** (`src/archive.rs` — like `playground.db` and `history.log`), and the **temp-project cleanup allowlist** (`safely_delete_temp_project`) so an otherwise- empty temp carrying a snapshots dir is still safely deletable. - [ ] **R14** `Cargo.toml`: add `"backup"` to the `rusqlite` feature list. - [ ] **R15** Engine-neutral user-facing strings (ADR-0002): no "SQLite" / "backup API" / "PRAGMA" in notes or modal text. Use "snapshot" / "the database". ### Documentation - [ ] **R16** Write an **ADR-0006 amendment** (see §4); update `docs/adr/README.md` in the same edit (ADR-0000 index rule). - [ ] **R17** Update `docs/requirements.md` U1/U2 from designed → implemented; update `CLAUDE.md`'s ADR-0006 "Designed; not yet implemented" line. ### Testing (ADR-0008 four tiers) - [ ] **R18** Tier 1 (unit/pure): ring eviction, redo-clear-on-new- work, mutation-vs-readonly classification, modal key handling, command parsing, disable-flag gating, CLI parse. - [ ] **R19** Tier 2 (insta snapshots): undo/redo confirmation modal rendering; disabled-undo note. - [ ] **R20** Tier 3 (integration, real db + temp project): snapshot taken on mutation; undo restores db **and** text; redo; ring eviction at the cap; redo cleared by new work; `--no-undo` takes no snapshots; persistence-ordering invariant preserved. - [ ] **R21** Full-stack flow (per global standards): bootstrap a real project, run a sequence of DSL **and SQL** mutations, undo/redo across both, verify read-model + on-disk yaml/csv/db all consistent, reopen the project and confirm the ring persisted. - [ ] **R21b** Batch: a `replay` of a multi-command file produces exactly **one** ring entry; undo rolls the whole replay back; an all-skips replay leaves no undo step; `import` produces no undo step in the current project. - [ ] **R22** Phase-3 N/A matrix row closed ("auto-snapshot fires for SQL DML the same as DSL") — now non-vacuous. - [ ] **R23** End state: all green, **zero skips**, no regression vs the §1 baseline. --- ## 4. ADR-0006 amendment scope The every-mutation decision (§2.4) contradicts ADR-0006 as written, so an amendment is required (the ADR is *Accepted* — we amend in place with a dated note, per the project's "no silent drift" rule). The amendment records: - **Snapshot scope:** "before every destructive operation" → "before every data/schema mutation" (single-step undo). State the pedagogical rationale (intuitive Ctrl-Z; the playground favours clarity over micro-optimisation). - **Confirmation rationale:** the original "undo is rare and consequential, so always confirm" is replaced by "undo is single-step and confirmation names the exact command being undone." Confirmation is still always shown; there is still no suppress flag. The summary simplifies — with per-command snapshots, undo rolls back exactly one command, so the "discarded changes" summary is just that command's description (no expensive db-diff needed). This resolves ADR-0006's "counts of rows added/modified/deleted per table since the snapshot" clause: under single-step undo there is no intervening un-snapshotted work, so the clause collapses to "the one command." - **Snapshot mechanism:** clarify the hybrid (db via backup API + text via copy) and how it reconciles with ADR-0015 (db is derived; text authoritative) — restore re-establishes the whole triple. - **Storage & lifetime:** persisted ring under the project, cap 50, git-ignored, export-excluded, temp-cleanup-aware. - **Redo:** discarded on new work. - **Disable switch:** `--no-undo` as a hardware escape hatch. - **Batch granularity:** a single user command that runs many sub-operations (`replay`, future batch ops) is one undo step, taken at the batch boundary. Consistent with "one undo step per user command." `import` is a project switch and takes no undo step. --- ## 5. Architecture & design ### 5.1 On-disk layout ``` / project.yaml data/.csv playground.db .snapshots/ index.yaml # ordered ring + redo stack: ids, timestamps, # command text, status; the source of truth # for ordering/eviction 0001/ playground.db # backup-API copy of the pre-op db project.yaml # pre-op copy data/
.csv # pre-op copies 0002/ ... .staging/ # temp area for an in-flight snapshot ``` - **`.snapshots/` is git-ignored, export-excluded, and on the temp-cleanup allowlist** (R13). - The **ring** (undo) and the **redo stack** are both recorded in `index.yaml`; snapshot payload dirs are shared storage referenced by id. Eviction beyond 50 deletes the oldest payload dir. - Ids are monotonic; never reused, to avoid stale references. ### 5.2 Snapshot mechanism - **db copy:** `rusqlite::backup` — `Backup::new(&live, &snap)` copies live → snapshot file; for restore, `Backup::new(&snap, &live)` copies snapshot → live (works on the open worker connection; requires the `backup` cargo feature, R14). No transaction may be open on the live connection during a backup step. - **text copy:** plain file copy of `project.yaml` + every `data/*.csv` into the snapshot dir. (CSVs are produced by the project's hand-rolled writer that preserves NULL-vs-empty; we copy the already-written files verbatim, so the distinction is preserved for free — R10.) ### 5.3 Timing vs ADR-0015 §6 (the load-bearing invariant) A snapshot brackets the existing 4-step persistence sequence: ``` 0. STAGE snapshot (NEW) — backup live db + copy current text into .snapshots/.staging/. The current on-disk state is the last committed state (db, yaml, csv are committed together per ADR-0015), so staging captures the true pre-op state. 1. open txn; validate; mutate db + metadata (ADR-0015 step 1) 2. write text temp files; fsync (step 2) 3. atomic-rename text into place (step 3) 4. commit db (step 4) 5. FINALIZE snapshot (NEW) — atomic-rename .staging/ → .snapshots//, append to index.yaml ring, evict oldest if >50, clear redo stack. On any failure in 1–4 → txn rolls back; DISCARD .staging/. ``` This **preserves commit-db-last**: the snapshot is finalised *after* the db commit, and a rolled-back op leaves no ring entry (R3). A crash mid-sequence leaves a `.staging/` orphan, cleaned on next open exactly like ADR-0015's orphan text temps. ### 5.4 Worker chokepoint (single point, not per-handler) Intercept centrally in `db.rs::handle_request` (the dispatcher), not in each `do_*` handler: - **Classify** each `Request` as *mutating* or *read-only* (a single `is_mutating(&Request) -> bool`, exhaustively matched so a new request variant forces a classification decision at compile time). - For mutating requests, **wrap** the handler call: `stage_snapshot()` → run handler → on `Ok` `finalize_snapshot()` + clear redo; on `Err` `discard_staging()`. - The command text for the snapshot meta comes from the request's existing `source: String` field (added during the ADR-0034 journal work). - **Refactor note:** mutating handlers currently send their reply via the `oneshot` inside the match arm. To let the wrapper observe success/failure, mutating handlers return their `Result` to the dispatcher, which sends the reply **and** drives snapshot finalize/discard. This is the main structural change in `db.rs`; read-only handlers are untouched. (`RebuildFromText` is mutating and snapshots like the rest; it reconstructs from text, but the pre-op db+text are still the correct undo target.) - **Gate:** when undo is disabled (R9), `is_mutating` still holds but the stage/finalize calls are no-ops — the worker is told at construction whether snapshotting is enabled. ### 5.5 Undo / redo flow (commands → modal → worker) Mirrors the existing two-phase `rebuild` modal flow, with the **prepare** step reading snapshot metadata rather than the project: 1. `undo` parses to `Command::App(AppCommand::Undo)` → `dispatch_app_command` returns `Action::PrepareUndo`. 2. Runtime handles `PrepareUndo`: reads `.snapshots/index.yaml` top entry (command text + timestamp) — a cheap file read, like `summarize_project` does for rebuild — and posts `AppEvent::UndoPrepared { command, when }` (or `UndoUnavailable` if the ring is empty → friendly note, R8). 3. `App::update` opens `Modal::UndoConfirm` showing "Undo `` (run )? [Y] yes [N] no". `Y` → `Action::Undo`; `N`/`Esc` → close + "undo cancelled". 4. Runtime handles `Action::Undo` → worker `Request::Undo`: - push a snapshot of **current** state onto the redo stack, - restore the top ring snapshot (backup-restore db + rename text back), pop the ring, - reply with the undone command text. 5. Runtime posts `AppEvent::UndoSucceeded { command }` / `UndoFailed { error }`; `App` closes the modal, notes the outcome, and triggers a re-query so the view refreshes (R11). `redo` is symmetric (`PrepareRedo` / `Redo`, `Request::Redo`, restores the top redo entry, pushes current onto the undo ring). Whether redo also confirms is an open sub-decision (§10). ### 5.6 Batch operations — one undo step (Begin/EndBatch) `run_replay` (`runtime.rs:1622`) dispatches **one worker request per replayed command** via `execute_command_typed`. Left to §5.4, a long replay would stage one snapshot per line. Instead, a reusable batch primitive collapses a batch to a single ring entry: - `Request::BeginBatch { source }` — stages **one** snapshot (meta = the batch command, e.g. `replay history.log`) and sets the worker's `in_batch` flag. While set, per-request staging in §5.4 is **skipped** (the boundary snapshot already captured pre-batch state). - `Request::EndBatch` — finalizes the staged snapshot into the ring **only if ≥1 mutation occurred** during the batch (else discards it, so an all-skips replay leaves no no-op undo step); clears `in_batch` and the redo stack. - `run_replay` brackets its loop with begin/end. A mid-replay `PersistenceFatal` tears the runtime down before `EndBatch`, leaving a `.staging/` orphan that is cleaned on next open (ADR-0015 parity); undo state is moot at that point anyway. - The same primitive serves any future in-project batch command. - `import` does **not** use this — it creates a new project and switches (ADR-0015 §11); the current project is untouched, so there is nothing to snapshot and no undo step. `rebuild` is already a single request (`RebuildFromText`) → naturally one snapshot, no batch handling needed. ### 5.7 New / changed types (from the Explore survey) | Layer | Change | |---|---| | `Cargo.toml` | add `"backup"` to `rusqlite` features | | `src/cli.rs` | `Args.no_undo: bool`; parse `--no-undo` | | `src/dsl/command.rs` | `AppCommand::Undo`, `AppCommand::Redo` | | `src/dsl/grammar/app.rs` | `UNDO` / `REDO` `CommandNode` (EMPTY_SEQ, like `REBUILD`) | | `src/action.rs` | `PrepareUndo`, `Undo`, `PrepareRedo`, `Redo` | | `src/event.rs` | `UndoPrepared`/`UndoUnavailable`/`UndoSucceeded`/`UndoFailed` (+ redo) | | `src/app.rs` | `Modal::UndoConfirm` (+ redo) struct; event arms; `handle_undo_confirm_key`; dispatch arms | | `src/ui.rs` | `render_undo_confirm` (mirror `render_rebuild_confirm`) | | `src/db.rs` | `is_mutating`; `stage/finalize/discard_snapshot`; `Request::Undo`/`Redo`/`BeginBatch`/`EndBatch`; dispatcher wrap; `undo_enabled` + `in_batch` worker fields | | `src/undo.rs` (new) | snapshot ring + redo store, `index.yaml`, stage/finalize/discard/undo/redo/cleanup, eviction, restore (named `undo` — `src/snapshots/` is the insta dir) | | `src/runtime.rs` | `is_app_lifecycle_entry_word += undo,redo`; `PrepareUndo/Undo` (+redo) handling; thread `no_undo` to worker; bracket `run_replay` with Begin/EndBatch | | `src/project/mod.rs` | `.snapshots/` in `.gitignore` template | | `src/archive.rs` | exclude `.snapshots/` from export | | `safely_delete_temp_project` | `.snapshots/` on the contents allowlist | | `src/completion.rs` | `undo`/`redo` in app-command entry keywords | | `src/friendly/strings/en-US.yaml` | modal title/prompt, success/cancel/disabled notes, CLI help | --- ## 6. Phase 2/3 — candidates considered (condensed) The mechanism choice was the crux; three candidates were evaluated and the user selected the hybrid (§2.1). Recorded here so the rejected options aren't silently dropped: | Candidate | Stores | Undo restore | Verdict | |---|---|---|---| | **C1** db-only via backup API | db | restore db + re-derive text via `finalize_persistence` | Honors ADR-0006 letter; needs a new "db→all-text" re-derive path. Rejected: less direct restore. | | **C2 (chosen)** hybrid | db + yaml + csv | restore all three directly | Most robust/direct restore; satisfies ADR-0006 + ADR-0015 with no mechanism amendment. **Selected.** | | **C3** text-only | yaml + csv | restore text + `rebuild_from_text` | Smallest disk; reuses tested rebuild. Rejected: rebuild dependency at undo time is less direct for a safety feature. | Scope candidates (destructive-only / +UPDATE / every-mutation) and redo semantics were likewise put to the user; selections in §2. --- ## 7. Devil's Advocate review of this plan - **Per-mutation snapshot cost.** Backing up the db + copying all CSVs before *every* mutation is the real cost; a bulk paste of N inserts makes N snapshots. *Mitigations:* teaching-scale dbs are small; ring cap 50 bounds total storage; `--no-undo` is the escape hatch; the ADR-0015 "batch command" remains the future remedy. A hardlink/CoW optimisation for unchanged files between consecutive snapshots is noted as a **future** improvement, not v1. - **Disk growth.** 50 × (db + text) could reach tens of MB for larger projects. Honest consequence; bounded by the cap and the disable flag. Flagged in the amendment's Consequences. - **Backup API + open transaction.** A backup step requires no open txn on the live connection. The stage step runs *before* the mutation txn opens, and restore runs outside any txn — safe by construction, but tests must assert it (R20). - **Restore must include internal metadata tables.** The db backup is byte-faithful so `__rdbms_playground_*` tables come along automatically; the text copy carries the yaml/csv. R10 test guards this explicitly. - **Temp-cleanup interaction.** A mutation in a fresh temp creates `.snapshots/`; if the user then undoes back to empty, an empty temp may carry a `.snapshots/` dir. The allowlist entry (R13) keeps `safely_delete_temp_project` willing to delete it; a test covers it. - **Replay safety.** Without R12, a `history.log` containing `undo` would abort/panic replay (ADR-0034). Adding to both the runtime filter and the completion set is mandatory, not optional. - **`is_mutating` exhaustiveness.** Must be a non-`_` match so any future `Request` variant fails to compile until classified — prevents a new mutation silently escaping the snapshot hook. No DA finding is deferred; all are addressed above or tracked as explicit future items (hardlink optimisation) for user awareness. --- ## 8. Implementation sequence (test-first throughout) 1. **Cargo + CLI:** add `backup` feature; `--no-undo` flag + parse tests. (R14, R9-partial) 2. **Snapshot store module:** `index.yaml` model, ring + redo, stage/ finalize/discard/restore, eviction — Tier-1 tests first against a temp dir + in-memory/temp db. (R2, R3, R4, R6, R7) 3. **Worker integration:** `is_mutating` (exhaustive), dispatcher wrap, `undo_enabled` gate, `Request::Undo`/`Redo`. Tier-3 tests: mutation→snapshot, undo restores db+text, redo, eviction, redo-clear, ordering preserved, `--no-undo` no-ops. (R1, R3, R10) 4. **Commands + grammar:** `AppCommand::Undo/Redo`, grammar nodes, parse tests; replay-filter + completion entries + their lockstep test. (R5, R7, R12) 5. **Action/event/modal/UI:** prepare→confirm→execute flow; modal key handling (Tier 1); modal + disabled-note rendering (Tier 2 insta); re-query on success. (R5, R8, R11, R19) 6. **Cross-cutting files:** `.gitignore` template, export exclusion, temp-cleanup allowlist — each with a test. (R13) 7. **Full-stack flow** mixing DSL + SQL mutations, undo/redo across both, reopen-and-verify-persistence. (R21, R22) 8. **Docs:** ADR-0006 amendment + README index; requirements U1/U2; CLAUDE.md line. (R16, R17) 9. **Phase 5 verification:** full suite green, zero skips, no regression vs §1; DA final pass. (R23) --- ## 9. Engine-neutral string notes (ADR-0002) User-facing text uses "snapshot", "undo", "the database". Never "SQLite", "backup API", "PRAGMA", "VACUUM". Modal: "Undo `` (run )?". Disabled: "Undo is turned off for this session." Empty: "Nothing to undo." These wordings are placeholders pending the catalog edit. --- ## 10. Sub-decisions — all confirmed (2026-05-24) 1. **Disable mechanism shape:** `--no-undo` CLI flag only for v1 (no in-app toggle). 2. **Redo also confirms,** naming the command to re-apply (symmetry with undo). 3. **Redo is in scope** for this pass (discard-on-new-work semantics). 4. **Confirm key:** `Y` confirms / `N` / `Esc` cancel, matching the rebuild modal. **Still open (one question to the user):** whether project-switch-level "undo back to the previous project" should be tracked as a separate future feature. Default if unanswered: leave `import`/project-switch wholly outside undo (the prior project is intact on disk and reachable via `load` / `--resume`). --- ## 11. Next action after approval On sign-off: write the ADR-0006 amendment first (it's the contract), update the README index, then implement per §8 — test-first, escalate any further ADR-vs-implementation mismatch rather than deciding.