Files
rdbms-playground/docs/plans/20260524-adr-0034-history-journal.md
T
claude@clouddev1 e4f2f5fa15 feat: ADR-0034 — history journal records err + replay parses/filters the journal
Replay (§3): run_replay parses <ts>|<status>|<source> journal records — runs ok, skips non-ok — while still accepting bare .commands scripts (prefix-detected so a | inside a bare command isn't misread). Fixes replay history.log, which died on line 1.

Journal failures (§1/§2): failed commands are recorded err via a new Action::JournalFailure, emitted by the pure-sync App for both parse failures and worker-execution failures (runtime appends best-effort, never fatal). Hydration reads all records so typo'd/rejected commands are recallable across sessions.

Amendment 1 — replay filters app-lifecycle commands: a working replay history.log exposed that the journal also records save as/load/new/export/import/rebuild/mode (which would panic the worker dispatch or abort replay). Replay now re-applies only schema/data writes and skips every app-lifecycle command + nested replay, classified by entry word so modal/incomplete forms (save as, bare mode) and quit skip uniformly rather than aborting. All skips continue (reversing the nested-replay refusal); import and nested replay warn. replay.error_nested removed; replay.skipped_import/_replay added; ReplayCompleted carries warnings. requirements.md U3/U4 updated; app-command runtime-failure journalling tracked as a follow-up.

1659 passing / 0 failing / 0 skipped / 1 ignored. Clippy clean.
2026-05-24 18:59:06 +00:00

272 lines
12 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Implementation plan — ADR-0034 (history journal + replay filter)
**Plan author date:** 2026-05-24
**Driving ADR:** [ADR-0034 — `history.log` as a complete command journal; replay reads success-only](../adr/0034-history-journal-and-replay-filter.md) (**Accepted**)
**Amends:** ADR-0006 (replay log), ADR-0015 §5 (history format) / §12 (hydration)
## Purpose
ADR-0034 is the *decision*: `history.log` becomes a complete,
status-tagged journal; each consumer filters (hydration reads
all, replay reads `ok` only); replay learns the
`<ts>|<status>|<source>` format while still accepting bare
`.commands` scripts. This plan is the *mechanics* — the build
order, the per-sub-task exit gates, and the headline failing test
that proves the live bug before any fix lands.
It is deliberately lighter than the Phase-2/3 plans: ADR-0034 is
a focused two-sub-task change to a small, well-isolated surface
(`src/persistence/history.rs`, its `persistence/mod.rs` wrappers,
`run_replay` in `src/runtime.rs`, and the runtime/app error
paths). The ADR already names the two sub-tasks (§"Implementation
note"); this plan refines them with explicit gates.
## The live bug this fixes (verified 2026-05-23, handoff-34 §4)
`replay history.log` fails on line 1: `run_replay`
(`src/runtime.rs:1630`) feeds each whole raw line to
`parse_command_with_schema` and only skips blank/`#` lines — it
has no understanding of the `<ts>|<status>|<source>` record shape
that `history::format_record` already writes. The replay tests
all use hand-written bare-command scripts, so the gap is
invisible to the suite (ADR-0034 Problem 3).
## Working-mode reminder
Solo with the DA hat, per `CLAUDE.md`. The DA critique is written
at each sub-task gate before its verdict. Escalate ADR-vs-code
mismatches rather than drift — this session's `/runda` round and
the two Phase-3 escalations show the value.
**Test discipline (non-negotiable, global rule).** Test-first:
the headline failing test (Step 0) is written and confirmed RED
before any production change. Each bug/behaviour is reproduced
with a failing test first. "All green, no skips" is the only
acceptable end state.
**Commits.** Not pre-authorized for this plan (unlike Phases
23). Each commit is proposed for the user's approval at the
sub-task gate, per the global git-safety rule. No AI attribution.
Push is the user's step.
## Code surface (grounded — real names)
- `src/persistence/history.rs`:
- `format_record(command_text, timestamp_iso) -> String`
**hardcodes `|ok|`** (line ~28). The status-parameterisation
point.
- `append(path, line)`, `escape_command` / `unescape_command`,
`parse_record_source(line)` (splits on `|`, returns the
source), `read_recent_sources(path, max_n)` (hydration),
`utc_iso8601_now()`.
- `src/persistence/mod.rs`:
- the `ok` journal wrapper (~line 278: `format_record` +
`append`) — the worker's success path
(`finalize_persistence`, transactionally coupled).
- the hydration wrapper (~line 288: `read_recent_sources`).
- `src/runtime.rs`:
- `run_replay` (line 1605) — the dual-shape parsing target.
- `read_history_seed` (~598) → `read_recent_history`
`app.seed_history(...)` — the recall-ring hydration path.
- the dispatch/error paths that surface parse failures and
worker/engine errors — the **best-effort `err` append** sites.
## Step 0 — baseline + headline failing test
1. Confirm the baseline: `cargo test`**1645 / 0 / 0 / 1**;
`cargo clippy --all-targets -- -D warnings` clean.
2. Author the **headline reproduction test** (Tier-3): build a
project, write an actual pipe-format `history.log` with mixed
`ok`/`err` lines (real `<iso8601>|ok|…` shape), `run_replay`
it, and assert the `ok` commands applied. **Confirm it FAILS
today** (replay chokes on the timestamp prefix). This test
stays RED until Sub-task 2.
3. Record the exact failure in the sub-task notes (proves the bug
is real, not misdiagnosed).
## Sub-task 1 — Journal failures + per-consumer filtering (ADR §1, §2, §4)
### Scope (in)
- Add a `status` parameter to the append path: `format_record`
stops hardcoding `ok`; the `persistence/mod.rs` wrapper takes
the status (worker success path passes `ok`).
- Best-effort **`err` append** from the runtime/app error paths
(parse failures and dispatch/engine failures). A failure to
record a failure is logged and ignored — never escalates a user
error into a fatal (ADR-0034 §4).
- Hydration (`read_recent_sources` / `read_history_seed`) reads
**all** records regardless of status — make the intent explicit
and test-pinned (it likely already ignores the status field;
confirm and lock it).
### Scope (out — explicit)
- Replay's journal-format parsing + `ok`-only filter — that is
Sub-task 2.
- Finer-grained status values (parse-err vs exec-err) — ADR-0034
§1 leaves these a non-breaking future extension; not now.
### Build steps
1. Reproduce: a test asserting a *failed* command appears in
`history.log` tagged `err` — RED first.
2. Thread the status through `format_record` + the
`persistence/mod.rs` wrapper; success path passes `ok`
(behaviour unchanged for the happy path).
3. Wire the best-effort `err` append at the runtime/app error
path(s). Identify the exact sites (parse-failure surface;
dispatch/worker-error surface) and keep the append
non-fatal.
4. Pin hydration-reads-all with a test.
### Exit gate (required green tests)
- A command that **fails to parse** appears in `history.log` as
`…|err|<source>`.
- A command that **dispatches but the worker rejects** (e.g. a
UNIQUE violation) appears as `…|err|<source>`.
- A successful command still appears as `…|ok|<source>`
(regression lock on the happy path).
- **Hydration round-trip:** an `err` line in `history.log` is
recalled into the input-history ring on project open
(`read_history_seed` / `seed_history`), matching the
in-session "record everything" behaviour.
- All prior tests green (baseline 1645); clippy clean.
### DA gate (written)
- Is the `err` append genuinely best-effort? A disk/IO failure
while recording a *failure* must not turn a recoverable user
error into a crash. Verify the error is swallowed + logged.
- Does the success path still couple the `ok` line to the
committed state transactionally (`finalize_persistence` before
`tx.commit()`)? No regression to ADR-0015's crash-recoverable
ordering.
- Are both failure *kinds* (parse-time, execute-time) covered, or
only one? The ADR names both.
- Did anything beyond ADR-0034 §1/§2/§4 land (new status values,
format changes)? If so, surface it.
## Sub-task 2 — Replay parses the journal format (ADR §3)
### Scope (in)
- `run_replay` accepts **both** input shapes:
- **journal record** — a line matching the
`<iso8601-timestamp>|<status>|` prefix → extract `<source>`
(unescape per the `history.rs` helper), **skip unless
`status == ok`**.
- **bare command** — any other non-blank, non-`#` line →
replayed as-is (current behaviour).
- Detection by the leading timestamp+status prefix only, so a
bare command containing `|` (e.g. `select 'a|b' from t`) is
**not** misread.
### Scope (out — explicit)
- Changing the replay execution/dispatch model (mode, schema
re-snapshot per line) — unchanged; ADR-0033 Amendment 3 already
fixed replay's Advanced-mode parse. This sub-task is purely
about line-shape recognition.
### Build steps
1. The Step-0 headline test is the RED anchor.
2. Add journal-record detection + source extraction in
`run_replay` (reuse `history.rs`'s parse/unescape helpers —
expose them at the right visibility rather than duplicating).
3. Apply the `ok`-only filter (skip `err` records silently, like
blank/`#` lines — they are not failures of the replay).
4. Confirm bare-command scripts are untouched.
### Exit gate (required green tests)
- **Headline (now GREEN):** replaying an actual `history.log`
(pipe format, mixed `ok`/`err`) runs the `ok` commands and
skips the `err` ones; final state reflects exactly the `ok`
sequence.
- A plain `.commands` script (bare commands) still replays
unchanged (regression lock — re-run the existing
`replay_command.rs` suite + the 3k
`e2e_replay_phase3_dml_forms_from_a_script`).
- A bare command containing `|` (`select 'a|b' from t` in a
script) is **not** misparsed as a journal record.
- An `err` record is skipped, not treated as a parse failure (no
`ReplayFailed` for a well-formed `err` line).
- All prior tests green; clippy clean.
### DA gate (written)
- Is the prefix detection robust? A bare command starting with
something that *looks* timestamp-ish but isn't a valid
`<iso8601>|<status>|` prefix must fall through to
bare-command handling, not be silently dropped. Test the
boundary.
- Does the `ok`-only filter use the same status tokens the writer
emits (no drift between writer and reader)? Single source of
truth for the token set.
- Is source unescaping the *same* routine hydration uses
(`unescape_command`)? No second, divergent unescape.
- Round-trip integrity: a command written by the worker and then
replayed produces the identical command — escape→unescape is
lossless for the awkward cases (embedded `|`, newlines,
backslashes; `history.rs` already tests these for hydration —
mirror for replay).
## Requirement / ADR mapping
| ADR-0034 | Decision | Proven by (target) |
|----------|----------|--------------------|
| §1 | journal records every command, `ok`/`err` | Sub-task 1 writer tests |
| §2 hydration | recall reads all records | Sub-task 1 hydration round-trip |
| §2 replay | replay reads `ok` only | Sub-task 2 headline + skip-err |
| §3 | dual-shape input; prefix detection | Sub-task 2 bare-`|` + boundary tests |
| §4 | ok=worker/transactional, err=best-effort | Sub-task 1 happy-path + best-effort tests |
| §5 | backward compat (all-`ok` files) | existing `read_recent_sources` tests stay green; replay-of-all-`ok`-log test |
## Verification (phase-exit)
- `cargo test` — zero failures, zero skips, no regression from
the 1645 baseline; report exact counts.
- `cargo clippy --all-targets -- -D warnings` clean.
- Every row of the mapping above has a green test; the headline
`replay history.log` test is green.
- Final written DA review (specific critiques first, verdict
last; rubber-stamp PASS is a process failure).
- Update ADR-0034 status note / README index if the ADR's status
presentation needs it (it is already Accepted — likely just a
pointer to the implementing commits), per the index-upkeep
rule.
## Open questions to escalate before/at code time
1. **`err` append sites.** The ADR says "runtime/app error path"
but does not pin the exact functions. If the natural site is
ambiguous (e.g. parse failures surface in more than one place),
surface the choice rather than guessing — especially if
threading the source text to the error path is non-trivial.
2. **Status token set ownership.** Where the `ok`/`err` tokens
live (a shared const/enum vs string literals) so writer and
reader cannot drift — propose the smallest shared definition;
escalate only if it forces a wider refactor.
3. **Does an `err` line need the un-parsed source verbatim?** A
parse failure has no validated command — the journal should
record what the user *typed*. Confirm the raw input is
available at the err-append site (it is for recall today;
verify for the journal).
## Implementation outcome (2026-05-24)
Both sub-tasks landed test-first as planned. A **third concern surfaced
during implementation** and was resolved with the user, becoming
**ADR-0034 Amendment 1**: making `replay history.log` work (Sub-task 2)
exposed that the journal also records app-lifecycle commands
(`save as` / `load` / `new` / `export` / `import` / `rebuild` / `mode`),
which would panic the worker dispatch or abort replay. Replay now
**skips** every app-lifecycle command + nested `replay` (re-applying
only schema/data writes); all skips continue (the prior nested-`replay`
refusal is reversed), with a `[skip]` warning on `import` / nested
`replay`. Classification is by entry word so the modal / incomplete
forms skip uniformly rather than aborting. See the amendment in
`docs/adr/0034-…` and `tests/replay_command.rs`. `requirements.md`
U3 / U4 updated to match.
## What this plan does NOT contain
- Time estimates (milestones, not hours).
- Re-statements of ADR-0034's decisions — it refers by section;
the implementer reads both.
- The deferred finer-grained status values, M4, ADR-0006
auto-snapshot, or Phase 4 — all tracked in handoff-34 §5.