rdbms-playground/docs/adr/0004-project-file-format.md
claude@clouddev1 4fca862c6c Project storage runtime: ADR-0015 + ADR-0004/0007 amendments
Designs track-2 lifecycle and persistence end-to-end: per-command
write-through to db+yaml+csv+history.log gated by the combined db
persistence logic with commit-db-last ordering; existence-only load
with explicit rebuild command; --resume CLI flag backed by
<data-root>/last_project; in-TUI list-with-browse picker; lock file
for single-instance enforcement; fatal-banner-then-quit failure
model (with --resume making restart cheap); fatal CSV row-load
errors with full diagnosis; YYYYMMDD-word-word-word temp naming
with display-name prettifier; collision-checked names for both
temp and user-supplied projects. Project name lives only on the
filesystem (not duplicated in YAML). ADR-0004 and ADR-0007 amended
in place. requirements.md and CLAUDE.md updated; OOS-6 (global
rolling history) tracked as deferred.
2026-05-07 19:53:47 +00:00

ADR-0004: Project file format

Status

Accepted. Amended by ADR-0015 — see the "Amendments" section at the end of this file for the specifics; the rest of this ADR remains the canonical reference for the project file format.

Context

Projects must be:

  • Shareable — students and instructors should be able to send projects to each other and reconstruct the full database state.
  • Diffable — version control should produce meaningful diffs as a schema or data set evolves.
  • Versioned — the format will change as the app evolves, and old projects must continue to load.
  • Efficient enough for moderate amounts of practice data without forcing users into pathological YAML files of tens of thousands of rows.

The on-disk SQLite file (.db) is convenient but binary and not suited to sharing or diffing.

Decision

A project is a directory containing:

<project-name>/
  project.yaml         # schema, relationships, metadata, version
  data/
    <table>.csv        # one CSV file per table, with header row
  playground.db        # derived; rebuildable from project.yaml + data/
  history.log          # append-only command/replay log (see ADR-0006)

  • project.yaml carries a top-level version: 1 field from the outset, plus all schema, relationship, and project metadata.
  • Table data lives in data/<table>.csv (UTF-8, header row, RFC 4180 quoting). One file per table keeps diffs scoped and avoids monolithic YAML.
  • playground.db is a derived artifact. The authoritative state is project.yaml + data/. The .db file is kept when present (we never silently drop it) but can be rebuilt from the text sources at any time.
    • Rebuilding when no .db exists: silent, automatic.
    • Rebuilding when a .db exists: requires user confirmation with a summary diff (e.g. "3 tables, 47 rows will be recreated; existing .db will be replaced").
  • A .gitignore template is created in each project; by default the .db file is ignored so version control captures only the authoritative sources.
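
The rebuild path can be sketched roughly as follows. This is a minimal illustration, not the application's implementation: the function name rebuild_db is hypothetical, and because the layout of project.yaml is not specified in this ADR, the sketch takes column names from each CSV header and creates untyped columns instead of reading the real schema.

```python
import csv
import sqlite3
from pathlib import Path


def rebuild_db(project_dir: Path) -> dict[str, int]:
    """Recreate playground.db from data/*.csv; return rows loaded per table.

    Assumption: columns are untyped and named after the CSV header. The real
    application would take the schema from project.yaml instead.
    """
    db_path = project_dir / "playground.db"
    db_path.unlink(missing_ok=True)  # any confirmation happens before this call
    counts: dict[str, int] = {}
    conn = sqlite3.connect(db_path)
    try:
        for csv_path in sorted((project_dir / "data").glob("*.csv")):
            table = csv_path.stem
            # newline="" lets the csv module handle RFC 4180 quoted newlines.
            with csv_path.open(newline="", encoding="utf-8") as f:
                reader = csv.reader(f)
                header = next(reader)
                # Naive identifier quoting: names containing '"' are not handled.
                cols = ", ".join(f'"{c}"' for c in header)
                marks = ", ".join("?" for _ in header)
                conn.execute(f'CREATE TABLE "{table}" ({cols})')
                rows = list(reader)
                conn.executemany(f'INSERT INTO "{table}" VALUES ({marks})', rows)
                counts[table] = len(rows)
        conn.commit()
    finally:
        conn.close()
    return counts
```

The returned per-table row counts are the kind of information a confirmation summary such as "3 tables, 47 rows will be recreated" could be built from.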

Consequences

  • Projects round-trip cleanly through git, email, and zip.
  • Large practice data sets remain efficient, because row data lives in CSV rather than in YAML.
  • Schema review remains pleasant, because the schema lives in a single readable YAML file.
  • The app must be able to (re)build a database from the text sources at any time — this is a first-class code path, not an edge case.
  • The version field allows format migrations as the app evolves; old projects load by running registered migrators in sequence.
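
One way to picture "registered migrators in sequence" is a registry keyed by source version. The sketch below is illustrative only: the versions beyond 1, the key names, and the helper names (migrator, migrate, MIGRATORS) are all hypothetical, and project.yaml is assumed to have been parsed into a plain dict already.

```python
from typing import Callable

# project.yaml parsed into a plain dict; the real structure is not defined here.
Doc = dict
CURRENT_VERSION = 3  # hypothetical: this ADR itself only defines version 1
MIGRATORS: dict[int, Callable[[Doc], Doc]] = {}


def migrator(from_version: int):
    """Register a function that upgrades from_version to from_version + 1."""
    def register(fn):
        MIGRATORS[from_version] = fn
        return fn
    return register


@migrator(1)
def v1_to_v2(doc):
    # Hypothetical change: rename a top-level key.
    doc["relationships"] = doc.pop("relations", [])
    return doc


@migrator(2)
def v2_to_v3(doc):
    # Hypothetical change: add a metadata block with a default value.
    doc.setdefault("metadata", {})
    return doc


def migrate(doc):
    """Upgrade a parsed project document one version step at a time."""
    version = doc.get("version")
    if not isinstance(version, int) or not 1 <= version <= CURRENT_VERSION:
        raise ValueError(f"unsupported project version: {version!r}")
    doc = dict(doc)  # shallow copy so the caller's top-level dict is untouched
    while version < CURRENT_VERSION:
        doc = MIGRATORS[version](doc)
        version += 1
        doc["version"] = version
    return doc
```

Because each migrator only handles a single step, a new format version needs exactly one new function, and a version 1 project reaches the current format by passing through every intermediate step.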

Amendments

Amendment 1 — runtime data flow (ADR-0015)

The phrase "playground.db is a derived artifact" describes a recovery property: the database can always be reconstructed from project.yaml + data/. It does not describe runtime data flow.

At write time, all persistence targets (the SQLite database, project.yaml, the relevant data/<table>.csv files, and history.log) share a single source — the user's command — and are written alongside one another in a defined order (see ADR-0015 §6). None of the text files is "downstream" of the database at write time.
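
A rough sketch of such a write-through step is shown below. The authoritative order is defined in ADR-0015 §6 and is not reproduced here; this sketch only assumes, following the commit summary's "commit-db-last ordering", that the database transaction is committed after the text files are written. The function name apply_command and its parameters are hypothetical, and the project.yaml write is omitted.

```python
import csv
import sqlite3
from pathlib import Path


def apply_command(conn: sqlite3.Connection, project_dir: Path, table: str,
                  sql: str, params: tuple, command_text: str) -> None:
    """Apply one data-changing command to every persistence target.

    Assumed order: execute in an open transaction, rewrite the table's CSV,
    append to history.log, then commit the database last.
    """
    try:
        conn.execute(sql, params)  # sqlite3 opens a transaction implicitly for DML
        # Read the table back inside the same transaction so the CSV reflects
        # the uncommitted change.
        cur = conn.execute(f'SELECT * FROM "{table}"')
        header = [col[0] for col in cur.description]
        csv_path = project_dir / "data" / f"{table}.csv"
        with csv_path.open("w", newline="", encoding="utf-8") as f:
            writer = csv.writer(f)
            writer.writerow(header)
            writer.writerows(cur)
        with (project_dir / "history.log").open("a", encoding="utf-8") as f:
            f.write(command_text + "\n")
        conn.commit()  # database commit comes last
    except Exception:
        # Text files may already have been touched at this point; this sketch
        # does not restore them.
        conn.rollback()
        raise
```

Both the database and the text files are driven directly by the command here; neither is produced by exporting the committed state of the other.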

Amendment 2 — .db rebuild trigger (ADR-0015)

The "rebuild with confirmation when .db exists" semantics in the original Decision section are replaced by a simpler model:

  • On load, if playground.db exists, it is opened as-is.
  • On load, if playground.db is missing, it is rebuilt silently from project.yaml + data/.
  • A new app-level command, rebuild, explicitly discards the current playground.db and reconstructs it from the text sources, with a confirmation prompt and a summary of what will be reconstructed.

The application does not attempt to detect drift between the database and the text sources automatically. rebuild is the explicit user-driven path for cases where drift exists (git pull over an existing .db, hand edits to YAML/CSV, recovery after a rare failure described in ADR-0015 §6).
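
The amended load and rebuild behaviour can be sketched as two small functions. This is a minimal illustration under stated assumptions: the names open_project and rebuild_command are hypothetical, the actual reconstruction is passed in as a rebuild callable so the sketch stays self-contained, and the summary of what will be reconstructed is left out of the prompt.

```python
import sqlite3
from pathlib import Path
from typing import Callable


def open_project(project_dir: Path,
                 rebuild: Callable[[Path], None]) -> sqlite3.Connection:
    """Existence-only load: open playground.db as-is, or rebuild it if missing."""
    db_path = project_dir / "playground.db"
    if not db_path.exists():
        rebuild(project_dir)  # silent: no prompt, no drift detection
    return sqlite3.connect(db_path)


def rebuild_command(project_dir: Path,
                    rebuild: Callable[[Path], None],
                    confirm: Callable[[str], bool]) -> bool:
    """Explicit rebuild: discard the current .db only after confirmation."""
    db_path = project_dir / "playground.db"
    prompt = f"Discard {db_path.name} and rebuild it from project.yaml + data/?"
    if not confirm(prompt):
        return False
    db_path.unlink(missing_ok=True)
    rebuild(project_dir)
    return True
```

Note that open_project never compares the database with the text sources; an existing file is trusted, which is why the explicit command exists.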