# ADR-0004: Project file format ## Status Accepted ## Context Projects must be: - Shareable — students and instructors should be able to send projects to each other and reconstruct the full database state. - Diffable — version control should produce meaningful diffs as a schema or data set evolves. - Versioned — the format will change as the app evolves, and old projects must continue to load. - Efficient enough for moderate amounts of practice data without forcing users into pathological YAML files of tens of thousands of rows. The on-disk SQLite file (`.db`) is convenient but binary and not suited to sharing or diffing. ## Decision A project is a directory containing: ``` / project.yaml # schema, relationships, metadata, version data/ .csv # one CSV file per table, with header row playground.db # derived; rebuildable from project.yaml + data/ history.log # append-only command/replay log (see ADR-0006) ``` - `project.yaml` carries a top-level `version: 1` field from the outset, plus all schema, relationship, and project metadata. - Table data lives in `data/
.csv` (UTF-8, header row, RFC 4180 quoting). One file per table keeps diffs scoped and avoids monolithic YAML. - `playground.db` is a **derived artifact**. The authoritative state is `project.yaml` + `data/`. The `.db` file is kept when present (we never silently drop it) but can be rebuilt from the text sources at any time. - Rebuilding when no `.db` exists: silent, automatic. - Rebuilding when a `.db` exists: requires user confirmation with a summary diff (e.g. "3 tables, 47 rows will be recreated; existing `.db` will be replaced"). - A `.gitignore` template is created in each project; by default the `.db` file is ignored so version control captures only the authoritative sources. ## Consequences - Projects round-trip cleanly through git, email, and zip. - Large practice data sets remain efficient (CSV is appropriate). - Schema review remains pleasant (YAML is appropriate). - The app must be able to (re)build a database from the text sources at any time — this is a first-class code path, not an edge case. - The `version` field opens the door to format migrations as the app evolves; old projects load by running registered migrators in sequence.