# ADR-0015: Project storage runtime

## Status

Accepted. Amends ADR-0004 (project file format) and ADR-0007 (sharing and export); see the "Relationship to earlier ADRs" section at the end for the exact deltas.

## Context

ADR-0004 defined the on-disk shape of a project (`project.yaml` + `data/.csv` + `history.log`, with `playground.db` as a derived artifact). It deliberately did not specify runtime semantics: when a project comes into existence, where it lives, how the on-disk files are kept consistent with the running SQLite database, what happens on load, on failure, on concurrent open, and how the canonical app-level commands (`save`, `load`, `new`, `export`, `import`) are scoped.

Track 1 of the application built everything against an in-memory SQLite database. Every quit lost all work. This is the largest single UX gap left in the project, and the next useful feature (replay/undo, ADR-0006) depends on the `history.log` written here.

This ADR fills the runtime gap. It commits to a single persistence model (*every successful command writes through to all targets immediately, and validation gates everything*) and works the resulting design through to file naming, the load picker, the failure model, and concurrent-open behaviour.

## Decision

### 1. Lifecycle and locations

There is no in-memory database in normal operation. Every session is backed by a project on disk.

- **Startup with no CLI argument:** the application creates a new temporary project under the OS data directory (see below), opens it, and runs against it.
- **Startup with a CLI argument** (`rdbms-playground `, requirement L1): the application opens the project at that path. If the path does not exist or does not look like a project (no `project.yaml`), it refuses with a friendly error.
- **`save` / `save as`** elevate or copy a project to a chosen location.
- **`load`** opens a different project (see section 7).
- **`new`** creates a fresh temp project from inside the running application, after closing the current one.

The OS data root is platform-standard:

- Linux: `$XDG_DATA_HOME/rdbms-playground` (defaulting to `~/.local/share/rdbms-playground` when `XDG_DATA_HOME` is unset).
- macOS: `~/Library/Application Support/rdbms-playground`.
- Windows: `%APPDATA%\rdbms-playground`.

Inside the data root: `projects/` holds projects, both auto-generated temp ones and ones the user has saved with a name of their choosing. There is no requirement that named projects move *out* of the data root, and no encouragement to do so: keeping a saved project right alongside the temp ones is the easiest workflow and is fully supported. Users who prefer a different home (a course directory, a shared drive, a git working tree) save there instead. The application prescribes nothing.

The data root also carries a small state file `last_project` (a single line containing the absolute path of the most recently opened project). It exists to support `--resume` (section 7).

A `--data-dir` CLI flag fully replaces the OS-standard data root for the duration of that run; both project creation and the load picker's listing use the supplied directory and only that directory. The `last_project` state file is read and written under the active data root, so a user with multiple data roots gets independent resume histories per root, which is the intuitive behaviour.

### 2. Project naming and display name

Temp project directory names follow the pattern `---`, where the words are drawn from a small built-in wordlist compiled into the binary (no external file or network call). Example: `20260507-water-buffalo-skating`. The leading date keeps the file listing chronologically sortable; the words give learners something nameable to refer to.

Named projects use whatever directory name the user chose at `save` time.

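
As a rough illustration of the temp-name shape, the sketch below composes a name that matches the example `20260507-water-buffalo-skating`: a date, then three words. The `WORDS` constant, the function name, and the idea of passing word indices in from the caller are all assumptions made for this sketch; the real wordlist is larger and this ADR does not specify how words are picked.

```rust
/// Hypothetical stand-in for the built-in wordlist (the real one is larger
/// and organised into categories).
const WORDS: &[&str] = &["water", "buffalo", "skating", "amber", "falcon", "drifting"];

/// Compose a temp project directory name from a `YYYYMMDD` date string and
/// three word indices. Indices wrap around the wordlist, so any value is valid.
/// The caller decides how indices are chosen (for example, from a random source).
fn temp_project_name(date_yyyymmdd: &str, picks: [usize; 3]) -> String {
    let word = |i: usize| WORDS[i % WORDS.len()];
    format!(
        "{}-{}-{}-{}",
        date_yyyymmdd,
        word(picks[0]),
        word(picks[1]),
        word(picks[2])
    )
}
```

Keeping the word choice outside the function makes the name deterministic for a given input, which is convenient when the caller has to regenerate the slug after a collision.
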

**Collision handling.**

- For auto-generated temp names: before creating the directory, the application checks for an existing entry of the same name in the data root and regenerates the three-word slug if one is found. The wordlist is large enough (multiple categories, dozens of words each) that collisions are essentially never observed in practice; the check is cheap and removes the failure mode entirely.
- For user-supplied names at `save` / `save as` / `import`: if the target directory already exists (whether it contains a project or anything else), the operation is refused with a friendly error. The user picks a different name or moves/removes the existing directory first. We deliberately do not auto-suffix or merge: silently changing the name the user typed, or writing into someone else's directory, is worse than asking them to pick again.

The application carries a *display name* derived from the project directory name by a small prettifier:

- Strip a leading `YYYYMMDD-` if present (temp projects).
- Split on `-` (kebab-case), `_` (snake_case), or case boundaries (camelCase / PascalCase).
- Title-case each word.

So `20260507-water-buffalo-skating` displays as "Water Buffalo Skating"; `MyOrders` displays as "My Orders"; `customer_demo` displays as "Customer Demo".

The display name is shown in the bottom status bar at all times, prefixed with `Project:` so it's unambiguous. This is how the user knows which project they are editing.

### 3. `project.yaml` shape

Flat ordered lists. Tables and columns preserve declaration order; relationships preserve creation order.

```yaml
version: 1
project:
  created_at: 2026-05-07T14:30:12Z
tables:
  - name: Customers
    primary_key: [id]
    columns:
      - { name: id, type: serial }
      - { name: Name, type: text }
relationships:
  - name: Customers_id_to_Orders_CustId
    parent: { table: Customers, column: id }
    child: { table: Orders, column: CustId }
    on_delete: cascade
    on_update: no_action
```

The `version: 1` field is required.
Migrators (section 9) upgrade older versions on load.

The project's name is **not** stored in `project.yaml`; the directory name on disk is the canonical name. Recording it twice would create an opportunity for the two to drift if the user renamed the directory by hand; with one source of truth, that question doesn't arise.

### 4. CSV encoding

One file per table, `data/.csv`, UTF-8, RFC 4180 quoting, header row carrying column names in declaration order. Per-type encoding:

| Type       | CSV form                                    |
|------------|---------------------------------------------|
| `text`     | RFC 4180 string                             |
| `int`      | decimal integer                             |
| `real`     | shortest-round-trip decimal                 |
| `decimal`  | string form already validated by `value.rs` |
| `bool`     | `true` / `false`                            |
| `date`     | `YYYY-MM-DD`                                |
| `datetime` | ISO 8601 with `T` and a `Z` or offset       |
| `blob`     | base64 (standard alphabet, padded)          |
| `serial`   | integer                                     |
| `shortid`  | base58 string                               |

NULL is the empty unquoted field; the empty quoted field (`""`) is an empty string. The distinction is preserved because SQL preserves it and the playground is meant to teach SQL.

### 5. `history.log` format

Append-only, one record per line, three pipe-separated fields:

```
2026-05-07T14:30:12Z|ok|create table Customers with pk id:serial
2026-05-07T14:30:30Z|ok|insert into Customers ('Alice')
```

- **Timestamp** in ISO 8601 with `Z`.
- **Status** is always `ok` in v1, because failed commands are not recorded; this matches ADR-0006's "successfully executed command" wording and keeps the log directly replayable. The status field is kept in the line format anyway so future use cases (audit logs that record attempts, validation diagnostics, distinguishing user-issued from imported commands) can carry additional values without a format break.
- **Command** is the user's input as typed. Newlines (when multi-line input arrives, requirement I1) are escaped as literal `\n`.

`history.log` is **not** included in `export` (see section 11 and the ADR-0007 amendment).
It is private to the user's working copy.

### 6. Persistence ordering

A successful user command produces effects in four targets: the SQLite database, `project.yaml`, the relevant `data/.csv` file(s), and `history.log`. INV-2 from the Phase-1 record requires that the **combined db persistence logic** (validation, metadata-table handling, the SQLite mutations) gate everything else. The implementation order inside a command is:

1. **Validate and stage in the database.** Open a SQLite transaction. Perform validation, schema/metadata mutations, data mutations. Do not commit yet.
2. **Stage text targets.** Write `project.yaml` (if schema or relationships changed) and affected `data/.csv` files (if rows changed) to temp files inside the project directory. Append the new line for `history.log` to a temp copy. `fsync` each.
3. **Rename text targets.** Atomically rename each temp file to its final path (POSIX `rename(2)`; on Windows, `MoveFileEx` with `MOVEFILE_REPLACE_EXISTING`).
4. **Commit the SQLite transaction.**

Failure handling:

- Failure in step 1 or 2 → roll back the SQLite transaction; no rename happens; on-disk state is unchanged. Surface the failure (see section 8) and quit.
- Failure in step 3 (rename fails after `fsync`) → roll back the SQLite transaction; orphan temp files remain in the project directory and are cleaned up on next open. On-disk semantic state is unchanged. Surface and quit.
- Failure in step 4 (commit fails after rename succeeded) → rare; on next launch the on-disk text is ahead of the `playground.db`. The user sees stale data and runs `rebuild` (section 7) to recover. Documented edge case; acceptable for v1.

This ordering is "commit db last so a fatal failure leaves disk state recoverable via `rebuild`."

### 7. Load and rebuild

**Load on startup or via the `load` command.** If `playground.db` exists in the project directory, it is opened as-is. If it does not exist, it is rebuilt silently from `project.yaml` + `data/.csv`. There is no automatic detection of drift between the database and the text sources on load; that's what `rebuild` is for.

**`--resume` CLI option.** Equivalent to passing the path recorded in the `/last_project` state file as the positional CLI argument. If `last_project` is missing or points at a path that no longer exists, `--resume` exits with an error pointing the user at the absent project; it does **not** silently fall back to creating a new temp project, because the user's intent ("resume what I had") is clear and silent fallback would mask the problem. `--resume` and an explicit positional path are mutually exclusive; the combination errors out.

The `last_project` file is rewritten on every successful project open (startup, `load`, `new`, `save as`, `import`). A clean exit doesn't clear it; that's the whole point of `--resume` after a quit.

**CSV row-load failure during rebuild.** When rebuilding `playground.db` from `project.yaml` + `data/.csv`, each row insert can fail (malformed CSV, type-validation failure, FK violation, NOT NULL violation, etc.). The behaviour mirrors the persistence failure model (section 8): the rebuild stops at the first failing row and surfaces a fatal error of the form

> Unable to load row *N* from `data/.csv` into table ``: *<diagnosis from the value/FK/constraint validator>*

The application then quits. There is no realistic case where a CSV produced by a previous well-behaved session contains an unloadable row; if one does, something has gone wrong (hand edit, partial git merge, file corruption) and the user should fix the file or restore an earlier copy. Continuing past the bad row would either lose data silently (skip it) or load partial state (stop but keep what loaded), both of which leave the user in a worse position than a clear error message.

**`rebuild` app-level command.** Discards the current `playground.db` and reconstructs it from `project.yaml` + `data/`. Always shows a confirmation prompt with a summary ("12 tables, 47 rows will be reconstructed; existing `playground.db` will be replaced") before doing the work. Useful when:

- The user pulled new YAML/CSV from git over an old `.db`.
- A prior persistence failure left the `.db` behind the text (section 6, step-4 failure mode).
- The user hand-edited the YAML or CSV outside the app.

**Load picker UX.** The `load` command opens an in-TUI modal listing temp projects from the data dir, sorted newest first, with the prettified display name and creation timestamp. Arrow keys select; Enter loads; Esc cancels; pressing `b` (for "browse") switches the modal to a path-entry prompt for projects outside the data dir. This covers both common (pick a recent temp) and uncommon (open a named project at a custom path) cases without forcing the user into a fully manual path entry up front.

### 8. Failure model

Persistence failures are fatal. The application surfaces a banner with the operation, the path, and the OS error message, then quits cleanly so the banner remains visible above the shell prompt. The user investigates (disk full, permission denied, network filesystem hiccup) and restarts.

This is the right model because the realistic failure modes for a local data directory do not heal transiently.
Showing a warning and continuing risks silent loss when the user later quits the app while the failure window is still open. The persistence ordering in section 6 ensures that "fatal failure → quit" never leaves the disk in a state that cannot be recovered: it is either unchanged (the common case) or recoverable via `rebuild` (the rare step-4 failure).

The "quit on failure" mode is also not anticipated to be particularly disruptive in practice. Even if a transient issue (a network drive timing out, an antivirus scanner holding a file briefly) does cause a fatal failure, the user's path back into the session is just `rdbms-playground --resume`. With section 6's ordering guaranteeing recoverable disk state and `--resume` guaranteeing one-command return, the cost of erring on the side of "stop and let the user investigate" is small enough that the safety benefit dominates.

### 9. Migration framework (F3)

`project.yaml` carries `version: 1` from the outset. Future format changes bump the version and add a registered migrator function:

```rust
fn migrate_v1_to_v2(raw: &mut RawProject) -> Result<(), MigrateError> { ... }
```

Migrators are stored in an ordered list keyed by source version. On load, the application:

1. Reads the file's `version`.
2. If `version < latest_known`, copies the original file to a versioned backup (`project.yaml.v*.bak`, where `*` stands for the original version).
3. Runs each migrator in sequence, starting with the one whose source version is `version` and ending with the one that produces `latest_known`.
4. Writes the upgraded YAML back at the new version.
5. If any migrator fails, restores the `.bak` and surfaces the failure as a fatal load error.

The framework is built in v1 even though no migrator exists yet. The first real migrator (when v2 lands) exercises it.

### 10. Concurrency

A lock file `/.rdbms-playground.lock` is written when a project is opened, containing the PID and hostname of the owning process. On open:

- If no lock file exists: take the lock and proceed.
- If a lock file exists with a live PID on this host: refuse with a friendly error pointing the user at the running instance.
- If a lock file exists but the PID is dead (or it lists a different hostname): take the lock (clean handover from a crashed prior instance).

The lock is removed on clean exit. Crashes leave it behind; the next open reclaims it.

The lock blocks only other rdbms-playground TUI instances. External read-only tooling (`sqlite3 playground.db -readonly`, text editors looking at `project.yaml`, etc.) is not prevented. The user is on their own if they fiddle with the project files concurrently with the running app; that's a power-user workflow we don't get in the way of.

### 11. App-level commands

The Track 2 command set, all available in both modes per ADR-0003:

- **`save`**: for a temp project, prompts for a target directory and elevates to a named project (effectively identical to `save as`). For a named project, reports "auto-saved; use `save as` to copy to a new location."
- **`save as`**: prompts for a target directory; copies the entire project there and switches to operating on the copy.
- **`load`**: opens the load picker (section 7).
- **`new`**: creates a fresh temp project; closes the current one cleanly first (auto-save guarantees the current state is on disk).
- **`rebuild`**: section 7.
- **`export`**: produces a zip per ADR-0007, *excluding* both `playground.db` and `history.log` (see ADR-0007 amendment below). Default filename pattern unchanged.
- **`import`**: accepts an exported zip, unpacks it into a named project at a chosen location, runs `rebuild` on open. The exported zip has no `playground.db` and no `history.log`, so a fresh `playground.db` is created from YAML+CSV, and `history.log` starts empty. The chosen target directory must not already exist (per the §2 collision rule); the user picks a different name or removes the existing directory first.

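
The `export` rule in the list above can be sketched as an allowlist, assuming the amended ADR-0007 contents (`project.yaml` plus everything under `data/`). The function names are hypothetical, and paths are assumed to be project-relative with forward slashes; the zip writing itself is left out.

```rust
/// Decide whether a project-relative path belongs in an export zip.
/// Only `project.yaml` and files under `data/` are allowed, so
/// `playground.db`, `history.log`, and the lock file fall out automatically.
fn include_in_export(rel_path: &str) -> bool {
    rel_path == "project.yaml"
        || (rel_path.starts_with("data/") && rel_path.len() > "data/".len())
}

/// Filter a directory listing down to the entries an export would carry.
fn export_entries<'a>(all: &[&'a str]) -> Vec<&'a str> {
    all.iter().copied().filter(|p| include_in_export(p)).collect()
}
```

An allowlist is used here rather than a denylist so that any file the application adds later stays out of exports until someone decides otherwise; that choice belongs to this sketch and is not a requirement stated in this ADR.
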

The `.gitignore` template (F2) is created in every new project directory and excludes:

```
/playground.db
/.rdbms-playground.lock
/project.yaml.v*.bak
```

`playground.db` is rebuildable; the lock file is per-process; migration backup files are local recovery aids that don't belong in shared history. The `data/` directory and `project.yaml` itself are *not* ignored; they are the shared source of truth.

`history.log` is **not** ignored by default. Whether to commit one's working log is a per-user, per-project taste question: some learners will treat the log as part of the audit trail and want it in git; others will prefer to keep it private. The export zip handles the "share with strangers" case (ADR-0007 amendment 1); committing to git is a different decision and we leave it to the user.

### 12. Persistent input history (I2-persist)

The in-memory navigable input history (Up/Down arrows, draft preservation, consecutive-duplicate dedup) gains a loader: on project open, the history navigation seed is populated from the project's `history.log` (latest N entries, where N is the same in-memory cap as today). New successful commands append to `history.log` and are pushed onto the in-memory stack as they are now.

Project-scoped only. A separate global rolling history is deferred to a future ADR (OOS-6).

### 13. Out of scope

The following are tracked but not part of this ADR:

- **OOS-1.** Snapshot ring buffer and `undo` (U1, U2, ADR-0006).
- **OOS-2.** `replay` command (U4). The `history.log` format is replay-compatible; the command itself ships later.
- **OOS-3.** Multi-tab output / V4 session log work.
- **OOS-4.** Tab completion or syntax highlighting for the new commands' arguments.
- **OOS-5.** L2 (submitting a command alongside project load).
- **OOS-6.** Global rolling input history.
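
Returning to the history loader of section 12: a minimal sketch of seeding the navigation history from `history.log` might look like the following. It assumes the three-field line format of section 5 and treats a literal `\n` as the only escape; how a backslash typed by the user is represented is not specified here, so that case is left unhandled. Both function names are hypothetical.

```rust
/// Extract the command field from one `history.log` line, or `None` if the
/// line does not have three pipe-separated fields. Only the first two pipes
/// are separators, so a command may itself contain `|`.
fn command_of(line: &str) -> Option<String> {
    let mut fields = line.splitn(3, '|');
    let _timestamp = fields.next()?;
    let _status = fields.next()?;
    let command = fields.next()?;
    // Assumption: a literal backslash-n is the only escape sequence.
    Some(command.replace("\\n", "\n"))
}

/// Build the navigation seed: the latest `cap` commands, oldest first.
fn history_seed(log: &str, cap: usize) -> Vec<String> {
    let commands: Vec<String> = log.lines().filter_map(command_of).collect();
    let start = commands.len().saturating_sub(cap);
    commands[start..].to_vec()
}
```

Malformed lines are skipped rather than treated as fatal in this sketch; whether the real loader should be that lenient is an open choice.
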

## Relationship to earlier ADRs

This ADR amends two earlier ADRs in place rather than superseding them outright; the earlier ADRs remain the canonical reference for everything outside the amended clauses.

- **ADR-0004: Project file format.** The "playground.db is a derived artifact" framing remains correct for *recovery* (the database can be reconstructed from text sources at any time). It does not describe runtime data flow: at write time, all four targets (db, yaml, csv, history.log) share a single source (the user's command) and are written alongside one another per section 6 here. The "rebuild with confirmation when `.db` exists" semantics are reframed: there is no automatic drift detection on load; the rebuild path is the explicit `rebuild` command, which prompts for confirmation when invoked.
- **ADR-0007: Sharing and export.** The export contents are now `project.yaml` + `data/`, *excluding* both `playground.db` (as before) and `history.log` (new). Rationale: the history is the user's working log and may contain commands they don't want to share. Export remains zip-based; default filename pattern is unchanged.

The amendments are made in place in those ADR files, with a note pointing to this ADR.

## Consequences

- The biggest UX gap closes: quitting no longer loses work.
- A failed command leaves the disk unchanged. A succeeded command is durable on disk before the application acknowledges it, with one documented edge case that the `rebuild` command exists to fix.
- The persistence path runs four file writes per command in the common case. At teaching scale this is invisible; at bulk-insert scale (thousands of rows in tight loops) it could matter, and a future "batch" command will be the remedy. Premature debouncing is rejected (it would create a real inconsistency window for negligible gain at this scale).
- The "commit db last" ordering is the load-bearing invariant for failure recovery.
  Future contributors changing the persistence flow must preserve it.
- The display-name prettifier is small and lives close to the project loader; future filename conventions (instructor-supplied lesson kits, perhaps) plug into it.
- The lock file is a small piece of state that survives crashes; the "live PID on this host" check is the load-bearing piece of its correctness. Cross-host network filesystems will give us false positives there; we accept that and document it if real users hit it.
- `history.log` becomes the persistent history surface. Once `replay` (OOS-2) and `undo` (OOS-1) land, they read from the same file with no schema changes.
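
To make the display-name prettifier from section 2 concrete, here is a minimal sketch of its three rules. The function names are hypothetical. A case boundary is taken to mean a lowercase letter followed by an uppercase one, and a name consisting only of a date prefix is left unstripped; both are assumptions of this sketch. Acronym handling (for example `HTTPServer`) is not covered by the rules and is not attempted.

```rust
/// Upper-case the first character of a word and lower-case the rest.
fn title_case(word: &str) -> String {
    let mut chars = word.chars();
    match chars.next() {
        Some(first) => first
            .to_uppercase()
            .chain(chars.flat_map(char::to_lowercase))
            .collect(),
        None => String::new(),
    }
}

/// Derive a display name from a project directory name.
fn display_name(dir_name: &str) -> String {
    // Rule 1: strip a leading `YYYYMMDD-` (eight ASCII digits, then a hyphen).
    let bytes = dir_name.as_bytes();
    let has_date_prefix =
        bytes.len() > 9 && bytes[..8].iter().all(u8::is_ascii_digit) && bytes[8] == b'-';
    let rest = if has_date_prefix { &dir_name[9..] } else { dir_name };

    // Rule 2: split on `-`, `_`, or a lowercase-to-uppercase boundary.
    let mut words: Vec<String> = Vec::new();
    let mut current = String::new();
    let mut prev_lower = false;
    for ch in rest.chars() {
        if ch == '-' || ch == '_' {
            if !current.is_empty() {
                words.push(std::mem::take(&mut current));
            }
            prev_lower = false;
            continue;
        }
        if ch.is_uppercase() && prev_lower && !current.is_empty() {
            words.push(std::mem::take(&mut current));
        }
        prev_lower = ch.is_lowercase();
        current.push(ch);
    }
    if !current.is_empty() {
        words.push(current);
    }

    // Rule 3: title-case each word and join with spaces.
    words.iter().map(|w| title_case(w)).collect::<Vec<_>>().join(" ")
}
```

The three directory names used as examples in section 2 make natural test inputs for a function of this shape.
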