From 504c24c9961271a97bc921376ddcdbdf9c15381a Mon Sep 17 00:00:00 2001 From: "claude@clouddev1" Date: Sun, 24 May 2026 09:29:52 +0000 Subject: [PATCH] docs: ADR-0034 implementation plan (history journal + replay filter) Two-sub-task test-first plan mapping 1:1 to ADR-0034's named sub-tasks: (1) journal failures + per-consumer filtering (status-tagged append, best-effort err writes, hydration reads all), (2) replay parses the journal format (ok-only filter, dual-shape input). Opens with a headline failing test that reproduces the live replay history.log bug. --- .../20260524-adr-0034-history-journal.md | 255 ++++++++++++++++++ 1 file changed, 255 insertions(+) create mode 100644 docs/plans/20260524-adr-0034-history-journal.md diff --git a/docs/plans/20260524-adr-0034-history-journal.md b/docs/plans/20260524-adr-0034-history-journal.md new file mode 100644 index 0000000..38f8e63 --- /dev/null +++ b/docs/plans/20260524-adr-0034-history-journal.md @@ -0,0 +1,255 @@ +# Implementation plan — ADR-0034 (history journal + replay filter) + +**Plan author date:** 2026-05-24 +**Driving ADR:** [ADR-0034 — `history.log` as a complete command journal; replay reads success-only](../adr/0034-history-journal-and-replay-filter.md) (**Accepted**) +**Amends:** ADR-0006 (replay log), ADR-0015 §5 (history format) / §12 (hydration) + +## Purpose + +ADR-0034 is the *decision*: `history.log` becomes a complete, +status-tagged journal; each consumer filters (hydration reads +all, replay reads `ok` only); replay learns the +`||` format while still accepting bare +`.commands` scripts. This plan is the *mechanics* — the build +order, the per-sub-task exit gates, and the headline failing test +that proves the live bug before any fix lands. + +It is deliberately lighter than the Phase-2/3 plans: ADR-0034 is +a focused two-sub-task change to a small, well-isolated surface +(`src/persistence/history.rs`, its `persistence/mod.rs` wrappers, +`run_replay` in `src/runtime.rs`, and the runtime/app error +paths). The ADR already names the two sub-tasks (§"Implementation +note"); this plan refines them with explicit gates. + +## The live bug this fixes (verified 2026-05-23, handoff-34 §4) + +`replay history.log` fails on line 1: `run_replay` +(`src/runtime.rs:1630`) feeds each whole raw line to +`parse_command_with_schema` and only skips blank/`#` lines — it +has no understanding of the `||` record shape +that `history::format_record` already writes. The replay tests +all use hand-written bare-command scripts, so the gap is +invisible to the suite (ADR-0034 Problem 3). + +## Working-mode reminder + +Solo with the DA hat, per `CLAUDE.md`. The DA critique is written +at each sub-task gate before its verdict. Escalate ADR-vs-code +mismatches rather than drift — this session's `/runda` round and +the two Phase-3 escalations show the value. + +**Test discipline (non-negotiable, global rule).** Test-first: +the headline failing test (Step 0) is written and confirmed RED +before any production change. Each bug/behaviour is reproduced +with a failing test first. "All green, no skips" is the only +acceptable end state. + +**Commits.** Not pre-authorized for this plan (unlike Phases +2–3). Each commit is proposed for the user's approval at the +sub-task gate, per the global git-safety rule. No AI attribution. +Push is the user's step. + +## Code surface (grounded — real names) + +- `src/persistence/history.rs`: + - `format_record(command_text, timestamp_iso) -> String` — + **hardcodes `|ok|`** (line ~28). The status-parameterisation + point. + - `append(path, line)`, `escape_command` / `unescape_command`, + `parse_record_source(line)` (splits on `|`, returns the + source), `read_recent_sources(path, max_n)` (hydration), + `utc_iso8601_now()`. +- `src/persistence/mod.rs`: + - the `ok` journal wrapper (~line 278: `format_record` + + `append`) — the worker's success path + (`finalize_persistence`, transactionally coupled). + - the hydration wrapper (~line 288: `read_recent_sources`). +- `src/runtime.rs`: + - `run_replay` (line 1605) — the dual-shape parsing target. + - `read_history_seed` (~598) → `read_recent_history` → + `app.seed_history(...)` — the recall-ring hydration path. + - the dispatch/error paths that surface parse failures and + worker/engine errors — the **best-effort `err` append** sites. + +## Step 0 — baseline + headline failing test + +1. Confirm the baseline: `cargo test` → **1645 / 0 / 0 / 1**; + `cargo clippy --all-targets -- -D warnings` clean. +2. Author the **headline reproduction test** (Tier-3): build a + project, write an actual pipe-format `history.log` with mixed + `ok`/`err` lines (real `|ok|…` shape), `run_replay` + it, and assert the `ok` commands applied. **Confirm it FAILS + today** (replay chokes on the timestamp prefix). This test + stays RED until Sub-task 2. +3. Record the exact failure in the sub-task notes (proves the bug + is real, not misdiagnosed). + +## Sub-task 1 — Journal failures + per-consumer filtering (ADR §1, §2, §4) + +### Scope (in) +- Add a `status` parameter to the append path: `format_record` + stops hardcoding `ok`; the `persistence/mod.rs` wrapper takes + the status (worker success path passes `ok`). +- Best-effort **`err` append** from the runtime/app error paths + (parse failures and dispatch/engine failures). A failure to + record a failure is logged and ignored — never escalates a user + error into a fatal (ADR-0034 §4). +- Hydration (`read_recent_sources` / `read_history_seed`) reads + **all** records regardless of status — make the intent explicit + and test-pinned (it likely already ignores the status field; + confirm and lock it). + +### Scope (out — explicit) +- Replay's journal-format parsing + `ok`-only filter — that is + Sub-task 2. +- Finer-grained status values (parse-err vs exec-err) — ADR-0034 + §1 leaves these a non-breaking future extension; not now. + +### Build steps +1. Reproduce: a test asserting a *failed* command appears in + `history.log` tagged `err` — RED first. +2. Thread the status through `format_record` + the + `persistence/mod.rs` wrapper; success path passes `ok` + (behaviour unchanged for the happy path). +3. Wire the best-effort `err` append at the runtime/app error + path(s). Identify the exact sites (parse-failure surface; + dispatch/worker-error surface) and keep the append + non-fatal. +4. Pin hydration-reads-all with a test. + +### Exit gate (required green tests) +- A command that **fails to parse** appears in `history.log` as + `…|err|`. +- A command that **dispatches but the worker rejects** (e.g. a + UNIQUE violation) appears as `…|err|`. +- A successful command still appears as `…|ok|` + (regression lock on the happy path). +- **Hydration round-trip:** an `err` line in `history.log` is + recalled into the input-history ring on project open + (`read_history_seed` / `seed_history`), matching the + in-session "record everything" behaviour. +- All prior tests green (baseline 1645); clippy clean. + +### DA gate (written) +- Is the `err` append genuinely best-effort? A disk/IO failure + while recording a *failure* must not turn a recoverable user + error into a crash. Verify the error is swallowed + logged. +- Does the success path still couple the `ok` line to the + committed state transactionally (`finalize_persistence` before + `tx.commit()`)? No regression to ADR-0015's crash-recoverable + ordering. +- Are both failure *kinds* (parse-time, execute-time) covered, or + only one? The ADR names both. +- Did anything beyond ADR-0034 §1/§2/§4 land (new status values, + format changes)? If so, surface it. + +## Sub-task 2 — Replay parses the journal format (ADR §3) + +### Scope (in) +- `run_replay` accepts **both** input shapes: + - **journal record** — a line matching the + `||` prefix → extract `` + (unescape per the `history.rs` helper), **skip unless + `status == ok`**. + - **bare command** — any other non-blank, non-`#` line → + replayed as-is (current behaviour). +- Detection by the leading timestamp+status prefix only, so a + bare command containing `|` (e.g. `select 'a|b' from t`) is + **not** misread. + +### Scope (out — explicit) +- Changing the replay execution/dispatch model (mode, schema + re-snapshot per line) — unchanged; ADR-0033 Amendment 3 already + fixed replay's Advanced-mode parse. This sub-task is purely + about line-shape recognition. + +### Build steps +1. The Step-0 headline test is the RED anchor. +2. Add journal-record detection + source extraction in + `run_replay` (reuse `history.rs`'s parse/unescape helpers — + expose them at the right visibility rather than duplicating). +3. Apply the `ok`-only filter (skip `err` records silently, like + blank/`#` lines — they are not failures of the replay). +4. Confirm bare-command scripts are untouched. + +### Exit gate (required green tests) +- **Headline (now GREEN):** replaying an actual `history.log` + (pipe format, mixed `ok`/`err`) runs the `ok` commands and + skips the `err` ones; final state reflects exactly the `ok` + sequence. +- A plain `.commands` script (bare commands) still replays + unchanged (regression lock — re-run the existing + `replay_command.rs` suite + the 3k + `e2e_replay_phase3_dml_forms_from_a_script`). +- A bare command containing `|` (`select 'a|b' from t` in a + script) is **not** misparsed as a journal record. +- An `err` record is skipped, not treated as a parse failure (no + `ReplayFailed` for a well-formed `err` line). +- All prior tests green; clippy clean. + +### DA gate (written) +- Is the prefix detection robust? A bare command starting with + something that *looks* timestamp-ish but isn't a valid + `||` prefix must fall through to + bare-command handling, not be silently dropped. Test the + boundary. +- Does the `ok`-only filter use the same status tokens the writer + emits (no drift between writer and reader)? Single source of + truth for the token set. +- Is source unescaping the *same* routine hydration uses + (`unescape_command`)? No second, divergent unescape. +- Round-trip integrity: a command written by the worker and then + replayed produces the identical command — escape→unescape is + lossless for the awkward cases (embedded `|`, newlines, + backslashes; `history.rs` already tests these for hydration — + mirror for replay). + +## Requirement / ADR mapping + +| ADR-0034 | Decision | Proven by (target) | +|----------|----------|--------------------| +| §1 | journal records every command, `ok`/`err` | Sub-task 1 writer tests | +| §2 hydration | recall reads all records | Sub-task 1 hydration round-trip | +| §2 replay | replay reads `ok` only | Sub-task 2 headline + skip-err | +| §3 | dual-shape input; prefix detection | Sub-task 2 bare-`|` + boundary tests | +| §4 | ok=worker/transactional, err=best-effort | Sub-task 1 happy-path + best-effort tests | +| §5 | backward compat (all-`ok` files) | existing `read_recent_sources` tests stay green; replay-of-all-`ok`-log test | + +## Verification (phase-exit) + +- `cargo test` — zero failures, zero skips, no regression from + the 1645 baseline; report exact counts. +- `cargo clippy --all-targets -- -D warnings` clean. +- Every row of the mapping above has a green test; the headline + `replay history.log` test is green. +- Final written DA review (specific critiques first, verdict + last; rubber-stamp PASS is a process failure). +- Update ADR-0034 status note / README index if the ADR's status + presentation needs it (it is already Accepted — likely just a + pointer to the implementing commits), per the index-upkeep + rule. + +## Open questions to escalate before/at code time + +1. **`err` append sites.** The ADR says "runtime/app error path" + but does not pin the exact functions. If the natural site is + ambiguous (e.g. parse failures surface in more than one place), + surface the choice rather than guessing — especially if + threading the source text to the error path is non-trivial. +2. **Status token set ownership.** Where the `ok`/`err` tokens + live (a shared const/enum vs string literals) so writer and + reader cannot drift — propose the smallest shared definition; + escalate only if it forces a wider refactor. +3. **Does an `err` line need the un-parsed source verbatim?** A + parse failure has no validated command — the journal should + record what the user *typed*. Confirm the raw input is + available at the err-append site (it is for recall today; + verify for the journal). + +## What this plan does NOT contain + +- Time estimates (milestones, not hours). +- Re-statements of ADR-0034's decisions — it refers by section; + the implementer reads both. +- The deferred finer-grained status values, M4, ADR-0006 + auto-snapshot, or Phase 4 — all tracked in handoff-34 §5.