# RDBMS Playground — project notes for Claude

## What this project is

A cross-platform TUI application that gives learners a sandbox for
exploring relational database concepts: tables, columns, primary
and foreign keys, relationships, indexes, queries, and query
plans. The audience ranges from beginners to students ready for
raw SQL, and the design accommodates both ends of that spectrum.

The application is a teaching tool, not a database administration
tool. Decisions about the type system, command surface, and
backend choices are skewed toward pedagogy over breadth.

## Authoritative decisions

All significant design decisions live in `docs/adr/`. Read
`docs/adr/README.md` for the index. **Before proposing changes
that touch a decided area, read the relevant ADR.** Decisions are
not re-litigated casually — if a decision needs to change, write a
new ADR that supersedes the old one.

Current decisions at a glance (each backed by an ADR):

- **Stack:** Rust + Ratatui + Crossterm, with `rusqlite` for the
  database (ADR-0001). Both the DSL and advanced-mode SQL are
  parsed by a single hand-rolled grammar/walker (ADR-0024's
  unified grammar tree; SQL added by ADRs 0030–0036). This
  supersedes ADR-0001's original plan of `chumsky` for the DSL
  plus a reserved `sqlparser-rs` for SQL; neither is a
  dependency now.
- **Backend:** SQLite with `STRICT` tables and FK enforcement on
  (ADR-0002). Database access through a dedicated worker thread
  with mpsc/oneshot request channels (ADR-0010).
- **Input:** simple mode (DSL only) by default; advanced mode
  (SQL + app-level commands) on toggle; `:` one-shot escape from
  simple to advanced (ADR-0003). No other sigils.
- **Project format:** `project.yaml` + `data/<table>.csv` +
  `history.log`; `playground.db` is a derived artifact (ADR-0004,
  amended by ADR-0015). Implemented through Iteration 4 +
  cleanup; export/import (Iter 5) and migration framework /
  `--resume` / persistent input history (Iter 6) pending.
- **Project storage runtime:** every command persists through to
  db + yaml + csv + history.log in one execution context, gated
  by the combined db persistence logic; commit-db-last ordering
  for crash-recoverable state; existence-only load + explicit
  `rebuild` command; in-TUI list-with-browse load picker; lock
  file for single-instance enforcement; persistence failures
  are fatal (banner + quit) (ADR-0015). Empty tables produce no
  CSV. CSV reader hand-rolled to preserve NULL-vs-empty
  distinction. Temp projects are marked by a literal `[temp]`
  segment in their directory name (`validate_user_name` rejects
  brackets, so user-named projects can never collide).
- **Temp project cleanup:** unmodified empty temps
  (kind=Temp, empty schema, empty data dir) are auto-deleted
  on switch and on quit by `safely_delete_temp_project`,
  which stacks containment / symlink-rejection /
  `[temp]`-marker / contents-allowlist guards. Anything
  unexpected → refuse, never delete the wrong thing.
- **Types:** `text`, `int`, `real`, `decimal`, `bool`, `date`,
  `datetime`, `blob`, `serial`, `shortid`. Compound primary keys
  supported. No real UUIDs (ADR-0005). FK column type
  compatibility via `Type::fk_target_type()` — `serial → int`,
  `shortid → text`, others identity (ADR-0011).
- **Safety:** append-only `history.log` for replay and scripting
  (ADR-0006 U3/U4) — *implemented* (ADR-0034). Undo/snapshot half
  (U1/U2): `undo` / `redo` app commands (no sigil) with auto-snapshot
  before **every** mutation into a persisted N=50 ring; hybrid
  whole-project snapshot (db backup API + yaml/csv copy); `--no-undo`
  to disable (ADR-0006 **Amendment 1**). *(Implemented 2026-05-24 —
  `src/undo.rs` ring + worker hook in `src/db.rs`; one undo step per
  user command, batch ops collapse to one, `import` excluded.)*
- **Sharing:** `export` command produces a zip without the `.db`;
  no hosted publishing (ADR-0007).
- **Testing:** four-tier strategy from `cargo test` units up to
  PTY-based end-to-end (ADR-0008). Tiers 1–3 are active; **Tier 4
  is not yet wired** — ADR-0008 specifies the PTY harness and the
  four critical flows, but no PTY deps or tests exist yet
  (verified 2026-06-07; corrects an earlier "wired only for the
  listed critical flows" claim). Tracked as `requirements.md` TT4.
- **DSL syntax conventions:** required clauses use keyword
  grammar (`with pk`, `to table` optional, `from..to`, `set`,
  `where`); `--` flags are reserved for opt-in choices; one
  sigil only (`:`); keywords case-insensitive, identifiers
  case-preserving (ADR-0009).
- **Internal metadata tables** (ADR-0012, ADR-0013): the database
  carries `__rdbms_playground_columns` for user-facing column
  types and `__rdbms_playground_relationships` for named FKs.
  These are the source of truth for round-tripping schema info.
  Internal tables follow the `__rdbms_*` naming convention and
  are filtered out of `list_tables`.
- **FK relationships:** declared via `add 1:n relationship [as
  <name>] from <P>.<col> to <C>.<col> [on delete <action>] [on
  update <action>] [--create-fk]`. Implemented through the
  rebuild-table primitive — the same machinery backs B2's
  column drop/rename/type-change operations (ADR-0013), which
  are implemented in both simple mode (`drop column` /
  `rename column` / `change column`) and advanced mode
  (`ALTER TABLE`, ADR-0035 §4e/§4f).
- **Data operations:** `insert / update / delete / show data`
  with required WHERE plus `--all-rows` opt-in for unfiltered
  ops; auto-show after writes shows just the affected rows;
  DELETE reports per-relationship cascade summaries (ADR-0014).
- **Indexes & query plans:** `add index` / `drop index`
  (ADR-0025); `explain show data|update|delete` runs
  `EXPLAIN QUERY PLAN` and renders an annotated, span-styled
  plan tree (ADR-0028). In advanced mode `explain` also wraps
  SQL `select` / `with` / `insert` / `update` / `delete`
  (ADR-0039). `EXPLAIN QUERY PLAN` never executes, so
  explaining a destructive command is safe.
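
The FK type-compatibility rule from the Types bullet above (`serial → int`, `shortid → text`, others identity) can be sketched as a small mapping. This is a minimal illustration, not the project's actual code: the variant names, the by-value `self` signature, and the `fk_compatible` helper are assumptions. Only the method name `Type::fk_target_type()` and the mapping itself come from these notes.

```rust
/// User-facing column types, as listed in the Types bullet.
/// Variant names are assumed for this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Type {
    Text,
    Int,
    Real,
    Decimal,
    Bool,
    Date,
    DateTime,
    Blob,
    Serial,
    ShortId,
}

impl Type {
    /// The type a foreign-key column needs in order to reference a
    /// column of type `self`. Generated key types map to the plain type
    /// they hold; every other type maps to itself.
    pub fn fk_target_type(self) -> Type {
        match self {
            Type::Serial => Type::Int,
            Type::ShortId => Type::Text,
            other => other,
        }
    }
}

/// Hypothetical helper (not from the project): may a child column of
/// type `child` reference a parent column of type `parent`?
pub fn fk_compatible(parent: Type, child: Type) -> bool {
    parent.fk_target_type() == child
}
```

Under this sketch, a child column referencing a `serial` primary key must be declared `int`, and one referencing a `shortid` key must be declared `text`.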

## Repository layout

```
.
├── Cargo.toml                 # dependencies, lints (nursery)
├── CLAUDE.md                  # this file
├── docs/
│   ├── adr/                   # all decision records (read 0000 first)
│   ├── handoff/               # session-handover notes
│   └── requirements.md        # the Phase-1 checklist with progress
├── src/
│   ├── action.rs              # Action enum (Quit / ExecuteDsl)
│   ├── app.rs                 # App state + pure update() + Tier-1 tests
│   ├── cli.rs                 # CLI args (--theme, --log-file)
│   ├── db.rs                  # rusqlite worker, all DDL/DML, metadata tables
│   ├── dsl/
│   │   ├── action.rs          # ReferentialAction enum + parsing
│   │   ├── command.rs         # Command AST + RelationshipSelector + RowFilter
│   │   ├── mod.rs             # re-exports
│   │   ├── parser.rs          # parse entry point → unified-grammar walker
│   │   ├── shortid.rs         # base58 generator + validator
│   │   ├── types.rs           # user-facing Type enum + fk_target_type
│   │   └── value.rs           # Value/Bound + per-type validation
│   ├── event.rs               # AppEvent (input + DSL outcomes)
│   ├── lib.rs                 # module re-exports for tests
│   ├── logging.rs             # tracing setup, file-backed
│   ├── main.rs                # binary entry; thin
│   ├── mode.rs                # Simple/Advanced mode enum
│   ├── runtime.rs             # Tokio loop, terminal setup, dispatch
│   ├── snapshots/             # insta snapshots for Tier-2 tests
│   ├── theme.rs               # light/dark themes
│   └── ui.rs                  # ratatui rendering
└── tests/
    └── walking_skeleton.rs    # Tier-3 integration tests
```

Key invariants in the code:

- **`update()` is pure-sync.** It returns `Vec<Action>` for the
  runtime to enact. Side effects belong in the runtime, not the
  update function — that's what makes Tier 1/3 tests tractable.
- **Database access goes through the worker thread.** Always.
  No direct `rusqlite::Connection` use outside `db.rs`.
- **Schema mutations update metadata in the same transaction.**
  See ADR-0012 / ADR-0013. Adding a new DDL operation must keep
  the column- and relationship-metadata tables in sync.
- **Renderer is pure render of `App` state.** It reports
  viewport metrics back via `note_output_viewport` so subsequent
  scroll input is wrap-aware.
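
The first invariant above can be pictured with a stripped-down sketch. The `Action` variants follow the layout comment for `src/action.rs`; everything else here (the `ExecuteDsl` payload, the `AppEvent` variants, the `input` field, and `update` being a method) is an assumption made for illustration, not the real `app.rs`.

```rust
/// Side effects requested from the runtime, expressed as data.
/// The `String` payload is an assumption.
#[derive(Debug, PartialEq, Eq)]
pub enum Action {
    Quit,
    ExecuteDsl(String),
}

/// Hypothetical input events; the real `AppEvent` is richer.
#[derive(Debug)]
pub enum AppEvent {
    Char(char),
    Enter,
    CtrlC,
}

/// Hypothetical, minimal application state.
#[derive(Debug, Default)]
pub struct App {
    pub input: String,
}

impl App {
    /// Synchronous and free of I/O: it only touches in-memory state and
    /// returns the actions it wants the runtime to enact.
    pub fn update(&mut self, event: AppEvent) -> Vec<Action> {
        match event {
            AppEvent::Char(c) => {
                self.input.push(c);
                Vec::new()
            }
            AppEvent::Enter => {
                // Take the buffer, leaving an empty input line behind.
                let line = std::mem::take(&mut self.input);
                if line.trim().is_empty() {
                    Vec::new()
                } else {
                    vec![Action::ExecuteDsl(line)]
                }
            }
            AppEvent::CtrlC => vec![Action::Quit],
        }
    }
}
```

Because the result is plain data, a test can feed events and compare the returned `Vec<Action>` directly, with no terminal, runtime, or database worker involved.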

## Working style for this project

- **Documentation discipline.** Significant decisions get an
  ADR. In-flight discussion stays in conversation or issues
  until it settles. The ADR-0000 index-upkeep rule applies:
  every ADR change updates `docs/adr/README.md` in the same
  edit.
- **Testing.** Per the user's global standards, tests are
  established before changes, bugs are reproduced with failing
  tests before being fixed, and "all green, no skips" is the
  only acceptable end state. Integration tests exercise full
  flows.
- **No silent feature loss.** Anything in an ADR is decided. If
  implementation reveals that a decision is wrong or
  impractical, raise it explicitly and update the ADR — do not
  quietly drift.
- **Pedagogy wins ties.** When a design choice trades clarity
  for raw capability, prefer clarity. Real RDBMS power-user
  features exist; this app is not the place to teach them.
- **No engine name in user-facing strings.** The choice of
  database engine is an implementation detail per ADR-0002
  (User-facing posture). Error messages, success notes,
  help text, and any other user-visible string refer to
  "the database" or "the engine" in the abstract — never
  the specific product (SQLite, STRICT, rusqlite, PRAGMA).
  ADR-internal prose and code comments may name it where
  technically necessary for precision.
- **Confirm commits.** Per the user's global rules, every
  `git commit` is preceded by an explicit message proposal
  and user approval. No AI attribution in commit messages.
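
One way the "no engine name" rule above could be guarded in tests is a small checker over user-visible strings. This is a hypothetical helper, not something these notes say exists in the project; the term list is taken from the rule, and whole-word matching is a design choice of this sketch so that a legitimate word such as `restrict` is not flagged for containing `strict`.

```rust
/// Product-specific terms that must not reach user-visible strings.
/// The list mirrors the rule; the helper itself is hypothetical.
const ENGINE_TERMS: [&str; 4] = ["sqlite", "strict", "rusqlite", "pragma"];

/// Returns the first forbidden term that appears in `message` as a
/// whole word (ASCII case-insensitive), or `None` if the message is clean.
pub fn engine_term_in(message: &str) -> Option<&'static str> {
    message
        // Split into words on anything that is not a letter or digit.
        .split(|c: char| !c.is_alphanumeric())
        .find_map(|word| {
            ENGINE_TERMS
                .iter()
                .copied()
                .find(|term| word.eq_ignore_ascii_case(*term))
        })
}
```

A test could then assert `engine_term_in(msg).is_none()` for each error message, success note, and help string. Whole-word matching still flags the ordinary English word "strict", so any such use would need rewording or an explicit exemption.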

## Build hygiene

`target/` is git-ignored and 100% regenerable, but it grows
without bound — cargo never garbage-collects old hash-suffixed
artifacts, so stale test binaries (each ~100 MB, statically
linking the bundled engine + debug info), incremental-compile
caches, and orphaned example binaries pile up across sessions
(it reached **~38 GB** before the first sweep).

Two prevention levers are configured in `Cargo.toml` `[profile.dev]`
(the `test` profile inherits both):

- **`incremental = false`** — the incremental cache alone reached
  **16 GB** here (≈28 compilation units × every historical config,
  never evicted), for little benefit in a full-`cargo test` workflow.
  Off, it never regrows; the cost is whole-crate recompiles instead of
  partial — seconds for a crate this size.
- **`debug = "line-tables-only"`** — the default `debug = 2` is
  responsible for ~85–90 % of each test binary's size (full debug
  info); line tables keep file:line in panics and backtraces (we
  debug via `tracing` logs) at a fraction of the size.

Even with those, stale artifacts still accumulate (cargo has no
`target/` eviction). **Run `cargo sweep` every now and then** to reclaim
the space. `cargo-sweep` (installed) prunes everything *except* the
current build's artifacts:

- **Keep only the current build (the usual sweep):** stamp,
  build, then delete everything the build didn't touch —
  ```
  cargo sweep --stamp
  cargo build --all-targets        # touch the artifacts to keep
  cargo sweep --file               # remove everything older than the stamp
  ```
  Add `--dry-run` to `--file` first to preview what goes. Caveat:
  `build --all-targets` only updates the mtime of what it
  actually (re)builds, so already-fresh *dependency* artifacts
  fall before the stamp and get swept too — they recompile once
  on the next build (a one-time cost; everything is regenerable).
  The first sweep freed **19 GiB** this way, roughly halving the
  ~38 GB `target/`.
- **Lighter routine options:** `cargo sweep --time 30` drops
  artifacts untouched for 30+ days; `cargo sweep --maxsize 10GB`
  trims oldest until under a size cap.
- **`--installed` is *not* the tool for same-toolchain cruft.**
  It keeps only artifacts from currently-installed rustup
  toolchains, so it frees space *only after you uninstall/replace
  a toolchain*. For the usual "many builds, one toolchain"
  accumulation it cleans **nothing** (verified: it would free 0
  here) — use the stamp/file workflow instead.

A good cadence is a sweep between major milestones (e.g. at
session handoff). `cargo clean` remains the nuclear option (wipes
all of `target/`, forcing a full from-scratch rebuild).

## Things deliberately deferred

These are explicitly tracked (mostly in `requirements.md`) but
not yet fully implemented:

- **Project storage** (track 2): largely implemented through
  Iteration 4 + cleanup pass + safety hardening (Iterations
  1–4 of ADR-0015). Pending pieces: `export` / `import` (Iter
  5), `--resume` + persistent input history hydration +
  migration framework scaffold (Iter 6).
- **Modify relationship** (C3a): drop+add covers the use case
  today.
- **m:n convenience** (C4): would auto-generate a junction table
  with appropriate FKs; depends on relationships being solid
  (they are).
- **Strong syntax-help in parse errors** (H1a): point users at
  missing keywords/clauses rather than the unexpected
  character. *(H1 — the friendly **database**-error layer — is
  done, ADR-0019; H1a is its separate parse-error sibling,
  ADR-0021, still partial.)*
- **Tutorial/lesson system**: acknowledged as in scope for
  design; needs its own ADR.
- **Session log + Markdown export** (V4): the bigger UX
  project — scrollable session journal, smart structure
  rendering, save-as-markdown.
- **Readline shortcuts** (I1b): Ctrl-A/Ctrl-E, Ctrl-W/Ctrl-K/
  Ctrl-U.
- **Multi-line input** (I1): Enter inserts newline,
  Ctrl-Enter submits.
- **Tab completion** (I3), **syntax highlighting** (I4).
- **ER diagram export** (V3).
- **CI** (TT5): test infrastructure exists; CI workflow not
  yet configured.

## Handoff notes

When taking over a session, read in order:

1. `docs/handoff/` — most recent file gives session context.
2. `CLAUDE.md` (this file).
3. `docs/requirements.md` — current progress on each item.
4. `docs/adr/README.md` and any ADR you'll touch.