rdbms-playground/CLAUDE.md

# RDBMS Playground — project notes for Claude

## What this project is

A cross-platform TUI application that gives learners a sandbox for
exploring relational database concepts: tables, columns, primary
and foreign keys, relationships, indexes, queries, and query
plans. The audience is students from beginners to those ready for
raw SQL, and the design accommodates both ends of that spectrum.

The application is a teaching tool, not a database administration
tool. The type system, command surface, and backend are all chosen
for pedagogy over breadth.

## Authoritative decisions

All significant design decisions live in `docs/adr/`. Read
`docs/adr/README.md` for the index. **Before proposing changes
that touch a decided area, read the relevant ADR.** Decisions are
not re-litigated casually — if a decision needs to change, write a
new ADR that supersedes the old one.

Current decisions at a glance (each backed by an ADR):

- **Stack:** Rust + Ratatui + Crossterm, with `rusqlite` for the
  database (ADR-0001). Both the DSL and advanced-mode SQL are parsed
  by a single hand-rolled grammar/walker (ADR-0024's unified grammar
  tree; SQL added by ADRs 0030–0036). This supersedes ADR-0001's
  original plan of `chumsky` for the DSL plus a reserved
  `sqlparser-rs` for SQL; neither is a dependency now.
- **Backend:** SQLite with `STRICT` tables and FK enforcement on
  (ADR-0002). Database access through a dedicated worker thread
  with mpsc/oneshot request channels (ADR-0010).
- **Input:** simple mode (DSL only) by default; advanced mode
  (SQL + app-level commands) on toggle; `:` one-shot escape from
  simple to advanced (ADR-0003). No other sigils.
- **Project format:** `project.yaml` + `data/<table>.csv` +
  `history.log`; `playground.db` is a derived artifact (ADR-0004,
  amended by ADR-0015). Fully implemented (ADR-0015 Iterations
  1–6): export/import, `--resume`, persistent input history, and
  the migration framework scaffold are all done.
- **Project storage runtime:** every command persists through to
  db + yaml + csv + history.log in one execution context, gated
  by the combined db persistence logic; commit-db-last ordering
  for crash-recoverable state; existence-only load + explicit
  `rebuild` command; in-TUI list-with-browse load picker; lock
  file for single-instance enforcement; persistence failures
  are fatal (banner + quit) (ADR-0015). Empty tables produce no
  CSV. CSV reader hand-rolled to preserve NULL-vs-empty
  distinction. Temp projects are marked by a literal `[temp]`
  segment in their directory name (validate_user_name rejects
  brackets, so user-named projects can never collide).
- **Temp project cleanup:** unmodified empty temps
  (kind=Temp, empty schema, empty data dir) are auto-deleted
  on switch and on quit by `safely_delete_temp_project`,
  which stacks containment / symlink-rejection /
  `[temp]`-marker / contents-allowlist guards. Anything
  unexpected → refuse, never delete the wrong thing.
- **Types:** `text`, `int`, `real`, `decimal`, `bool`, `date`,
  `datetime`, `blob`, `serial`, `shortid`. Compound primary keys
  supported. No real UUIDs (ADR-0005). FK column type
  compatibility via `Type::fk_target_type()` — `serial → int`,
  `shortid → text`, others identity (ADR-0011).
- **Safety:** append-only `history.log` for replay and scripting
  (ADR-0006 U3/U4) — *implemented* (ADR-0034). Undo/snapshot half
  (U1/U2): `undo` / `redo` app commands (no sigil) with auto-snapshot
  before **every** mutation into a persisted N=50 ring; hybrid
  whole-project snapshot (db backup API + yaml/csv copy); `--no-undo`
  to disable (ADR-0006 **Amendment 1**). *(Implemented 2026-05-24 —
  `src/undo.rs` ring + worker hook in `src/db.rs`; one undo step per
  user command, batch ops collapse to one, `import` excluded.)*
- **Sharing:** `export` command produces a zip without the `.db`;
  no hosted publishing (ADR-0007).
- **Testing:** four-tier strategy from `cargo test` units up to
  PTY-based end-to-end (ADR-0008). Tiers 1–3 are active; **Tier 4
  is not yet wired** — ADR-0008 specifies the PTY harness and the
  four critical flows, but no PTY deps or tests exist yet
  (verified 2026-06-07; corrects an earlier "wired only for the
  listed critical flows" claim). Tracked as `requirements.md` TT4.
- **DSL syntax conventions:** required clauses use keyword
  grammar (`with pk`, `to table` optional, `from..to`, `set`,
  `where`); `--` flags are reserved for opt-in choices; one
  sigil only (`:`); keywords case-insensitive, identifiers
  case-preserving (ADR-0009).
- **Internal metadata tables** (ADR-0012, ADR-0013): the database
  carries `__rdbms_playground_columns` for user-facing column
  types and `__rdbms_playground_relationships` for named FKs.
  These are the source of truth for round-tripping schema info.
  Internal tables follow the `__rdbms_*` naming convention and
  are filtered out of `list_tables`.
- **FK relationships:** declared via `add 1:n relationship [as
  <name>] from <P>.<col> to <C>.<col> [on delete <action>] [on
  update <action>] [--create-fk]`. Implemented through the
  rebuild-table primitive — the same machinery backs B2's
  column drop/rename/type-change operations (ADR-0013), which
  are implemented in both simple mode (`drop column` /
  `rename column` / `change column`) and advanced mode
  (`ALTER TABLE`, ADR-0035 §4e/§4f).
- **Data operations:** `insert / update / delete / show data`
  with required WHERE plus `--all-rows` opt-in for unfiltered
  ops; auto-show after writes shows just the affected rows;
  DELETE reports per-relationship cascade summaries (ADR-0014).
- **Indexes & query plans:** `add index` / `drop index`
  (ADR-0025); `explain show data|update|delete` runs
  `EXPLAIN QUERY PLAN` and renders an annotated, span-styled
  plan tree (ADR-0028). In advanced mode `explain` also wraps
  SQL `select` / `with` / `insert` / `update` / `delete`
  (ADR-0039). `EXPLAIN QUERY PLAN` never executes, so
  explaining a destructive command is safe.
- **Continuous integration & release** (built on the `ci` branch,
  2026-06-15; decisions in `docs/ci/adr/` — **ADR-ci-001/002/003**,
  a namespace kept separate from the main ADR sequence to avoid
  cross-branch number collisions, like the website's): a self-hosted
  **Gitea Actions** pipeline built on a **nix flake** (pinned Rust
  `1.95.0` — one source of toolchain for dev *and* CI) plus a
  prebuilt CI image. **Gate** (`ci.yaml`): `clippy -D warnings` +
  `cargo test` on every branch push / PR. **Release** on a `v*` tag
  (`release.yaml`): the four non-macOS **D1** targets cross-built
  with `cargo-zigbuild` (Linux musl static + standalone Windows
  `.exe`); the two macOS targets via the **dispatched**
  `release-macos.yaml` on a Tart Apple-Silicon runner (de-nix the
  `libiconv` load path + ad-hoc re-sign). All published to a Gitea
  release with `.sha256`s. **`fmt` is intentionally not gated yet**
  (the tree isn't stock-`rustfmt`-clean). `workflow_dispatch` is
  Gitea-default-branch-only, so `release-macos` is dispatchable once
  this lands on `main`.
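
The worker-thread access pattern named in the **Backend** decision can be illustrated with a minimal sketch. It uses only the standard library: a per-request `std::sync::mpsc` channel stands in for a oneshot reply channel, and a counter stands in for the database. `DbRequest`, `spawn_worker`, and `execute` are hypothetical names for illustration, not the actual API in `src/db.rs`.

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical request type. Each request carries its own reply sender,
// which is used for exactly one answer (the "oneshot" role).
enum DbRequest {
    Execute {
        sql: String,
        reply: mpsc::Sender<Result<usize, String>>,
    },
    Shutdown,
}

// Spawns the worker and returns the sender callers use to reach it.
// The "database" is a stand-in counter owned solely by the worker thread.
fn spawn_worker() -> (mpsc::Sender<DbRequest>, thread::JoinHandle<()>) {
    let (tx, rx) = mpsc::channel::<DbRequest>();
    let handle = thread::spawn(move || {
        let mut statements_run = 0usize;
        for request in rx {
            match request {
                DbRequest::Execute { sql, reply } => {
                    let result = if sql.trim().is_empty() {
                        Err("empty statement".to_string())
                    } else {
                        statements_run += 1;
                        Ok(statements_run)
                    };
                    // The caller may have gone away; ignore a failed send.
                    let _ = reply.send(result);
                }
                DbRequest::Shutdown => break,
            }
        }
    });
    (tx, handle)
}

// Sends one request and blocks until the worker answers.
fn execute(tx: &mpsc::Sender<DbRequest>, sql: &str) -> Result<usize, String> {
    let (reply_tx, reply_rx) = mpsc::channel();
    tx.send(DbRequest::Execute { sql: sql.to_string(), reply: reply_tx })
        .map_err(|_| "worker is gone".to_string())?;
    reply_rx
        .recv()
        .map_err(|_| "worker dropped the reply".to_string())?
}
```

Because the state lives only inside the spawned thread, callers cannot touch it except by sending a request, which is the property the decision relies on.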

## Repository layout

```
.
├── Cargo.toml                 # dependencies, lints (nursery)
├── CLAUDE.md                  # this file
├── docs/
│   ├── adr/                   # all decision records (read 0000 first)
│   ├── handoff/               # session-handover notes
│   └── requirements.md        # the Phase-1 checklist with progress
├── src/
│   ├── action.rs              # Action enum (Quit / ExecuteDsl)
│   ├── app.rs                 # App state + pure update() + Tier-1 tests
│   ├── cli.rs                 # CLI args (--theme, --log-file)
│   ├── db.rs                  # rusqlite worker, all DDL/DML, metadata tables
│   ├── dsl/
│   │   ├── action.rs          # ReferentialAction enum + parsing
│   │   ├── command.rs         # Command AST + RelationshipSelector + RowFilter
│   │   ├── mod.rs             # re-exports
│   │   ├── parser.rs          # parse entry point → unified-grammar walker
│   │   ├── shortid.rs         # base58 generator + validator
│   │   ├── types.rs           # user-facing Type enum + fk_target_type
│   │   └── value.rs           # Value/Bound + per-type validation
│   ├── event.rs               # AppEvent (input + DSL outcomes)
│   ├── lib.rs                 # module re-exports for tests
│   ├── logging.rs             # tracing setup, file-backed
│   ├── main.rs                # binary entry; thin
│   ├── mode.rs                # Simple/Advanced mode enum
│   ├── runtime.rs             # Tokio loop, terminal setup, dispatch
│   ├── snapshots/             # insta snapshots for Tier-2 tests
│   ├── theme.rs               # light/dark themes
│   └── ui.rs                  # ratatui rendering
└── tests/
    └── walking_skeleton.rs    # Tier-3 integration tests
```
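
As an illustration of what `dsl/types.rs` is described as holding, the FK type-compatibility rule from the **Types** decision (`serial → int`, `shortid → text`, others identity) might look roughly like this. The enum variants mirror the documented type list, but their names and the `fk_compatible` helper are assumptions, not the real code.

```rust
// Hypothetical mirror of the user-facing type list; the real enum in
// src/dsl/types.rs may differ in naming and shape.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Type {
    Text,
    Int,
    Real,
    Decimal,
    Bool,
    Date,
    DateTime,
    Blob,
    Serial,
    ShortId,
}

impl Type {
    // The type an FK column needs in order to reference a column of this
    // type: generated key types map to their storage type, all others
    // map to themselves.
    fn fk_target_type(self) -> Type {
        match self {
            Type::Serial => Type::Int,
            Type::ShortId => Type::Text,
            other => other,
        }
    }
}

// Assumed helper (not from the source): a child column is compatible when
// its type equals the parent column's fk_target_type.
fn fk_compatible(parent: Type, child: Type) -> bool {
    parent.fk_target_type() == child
}
```

Under this sketch a `serial` primary key is referenced by an `int` column, never by another `serial`.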

Key invariants in the code:

- **`update()` is pure-sync.** It returns `Vec<Action>` for the
  runtime to enact. Side effects belong in the runtime, not the
  update function — that's what makes Tier 1/3 tests tractable.
- **Database access goes through the worker thread.** Always.
  No direct `rusqlite::Connection` use outside `db.rs`.
- **Schema mutations update metadata in the same transaction.**
  See ADR-0012 / ADR-0013. Adding a new DDL operation must keep
  the column- and relationship-metadata tables in sync.
- **Renderer is pure render of `App` state.** It reports
  viewport metrics back via `note_output_viewport` so subsequent
  scroll input is wrap-aware.
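
The first invariant can be made concrete with a trimmed-down sketch. `Action::Quit` and `Action::ExecuteDsl` come from the layout above; the `AppEvent` variants, the `App` fields, and the exact `update` signature are simplified assumptions rather than the real definitions.

```rust
// Hypothetical, trimmed-down stand-ins for the real types.
#[derive(Debug, PartialEq)]
enum Action {
    Quit,
    ExecuteDsl(String),
}

enum AppEvent {
    Char(char),
    Enter,
    CtrlC,
}

#[derive(Default)]
struct App {
    input: String,
}

impl App {
    // Synchronous and free of I/O: it changes only in-memory state and
    // returns the side effects it wants as data for the runtime to enact.
    fn update(&mut self, event: AppEvent) -> Vec<Action> {
        match event {
            AppEvent::Char(c) => {
                self.input.push(c);
                Vec::new()
            }
            AppEvent::Enter => {
                // Take the line out of the input buffer, leaving it empty.
                let line = std::mem::take(&mut self.input);
                if line.trim().is_empty() {
                    Vec::new()
                } else {
                    vec![Action::ExecuteDsl(line)]
                }
            }
            AppEvent::CtrlC => vec![Action::Quit],
        }
    }
}
```

A test can then feed events and assert on the returned actions without a terminal, a runtime, or a database.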

## Working style for this project

- **Documentation discipline.** Significant decisions get an
  ADR. In-flight discussion stays in conversation or issues
  until it settles. The ADR-0000 index-upkeep rule applies:
  every ADR change updates `docs/adr/README.md` in the same
  edit.
- **Issue tracking.** Bugs and enhancements are filed as Gitea
  issues (see *Issue tracking — Gitea via `tea`* below).
  `docs/requirements.md` and the ADRs remain the source of truth
  for **scope and decisions**; issues are the lightweight tracker
  for **discrete work items**, cross-referenced from commits and
  handoffs (e.g. `fix: … (#12)`). The project is near completion
  of its initial requirements, so no heavyweight planning workflow
  is run — the document-based requirements are augmented with
  issue references as work proceeds. A change that touches a
  *decided* area still earns an ADR; the issue references the ADR,
  it does not replace it.
- **Testing.** Per the user's global standards, tests are
  established before changes, bugs are reproduced with failing
  tests before being fixed, and "all green, no skips" is the
  only acceptable end state. Integration tests exercise full
  flows.
- **No silent feature loss.** Anything in an ADR is decided. If
  implementation reveals that a decision is wrong or
  impractical, raise it explicitly and update the ADR — do not
  quietly drift.
- **Pedagogy wins ties.** When a design choice trades clarity
  for raw capability, prefer clarity. Real RDBMS power-user
  features exist; this app is not the place to teach them.
- **No engine name in user-facing strings.** The choice of
  database engine is an implementation detail per ADR-0002
  (User-facing posture). Error messages, success notes,
  help text, and any other user-visible string refer to
  "the database" or "the engine" in the abstract — never
  the specific product (SQLite, STRICT, rusqlite, PRAGMA).
  ADR-internal prose and code comments may name it where
  technically necessary for precision.
- **Confirm commits.** Per the user's global rules, every
  `git commit` is preceded by an explicit message proposal
  and user approval. No AI attribution in commit messages.

## Issue tracking — Gitea via `tea`

Extends (does not replace) the generic Gitea/`tea` safety rules in
the global `CLAUDE.md`. Use `tea` to manage Gitea issues; `tea
--help`, `tea issues --help`, etc. for command reference.

**Repo coordinates.** This repo lives on the self-hosted Gitea at
`git.lazyeval.net` as `oli/rdbms-playground`. `tea` **auto-detects
it correctly off the git remote** — verified — so plain `tea issues`
works here even though the machine's *default* `tea` login is a
different host (`git.oliversturm.com`). Pass `--login
git.lazyeval.net --repo oli/rdbms-playground` only as a fallback if
auto-detection ever slips. **Never** fall back to raw API calls
(`curl`/`fetch`) when `tea` misbehaves — tokens leak into shell
history; fix `tea` instead (usually `--login`/`--repo`).

**Labels.** Preconfigured (`bug`, `enhancement` are in use).
**Ask the user before creating new labels.** Create with `tea
labels create --name <n> --color <hex> --description <d>`.

### Critical gotchas

- **`tea` blocks on stdin in a non-TTY → hangs.** `tea comment`,
  `tea issue … --comments`, and similar **wait on stdin** when not
  attached to a terminal, so they hang silently. **Always append
  `< /dev/null`**, and wrap in `timeout 30` as a safety net:
  `timeout 30 tea comment <idx> "$body" < /dev/null`. Verify the
  write landed afterwards (re-fetch); don't trust a clean exit alone.
- **Multi-line comment / description bodies**: heredocs do **not**
  work with `--description` / the comment-body arg. Write the
  markdown to a temp file and pass it via shell substitution: `tea
  comment <idx> "$(cat /tmp/body.md)" < /dev/null` (same for `tea
  issues edit --description "$(cat …)"`).
- **Read an issue's RAW body** (for editing): the default/`--output
  yaml` view is a lossy rendered box. Use JSON: `tea issue <idx>
  --fields body --output json < /dev/null | jq -r '.body'`. **`tea
  issues edit --description` replaces the WHOLE body**, so splice your
  change into the raw text and keep a backup of the original before
  applying.
- **Reopen**: use `tea issues reopen <index>`, NOT `tea issues edit
  --state open`.
- **Milestones** (not currently used here, but if introduced): set
  with `tea issues edit --milestone "<name>" <idx>` (empty string
  clears it). **Options MUST precede the `<idx>`** — flag-after-index
  silently no-ops. The `tea issues create --milestone …` flag is
  **unreliable** — set the milestone with a follow-up `edit` and
  verify.
- **Display blind-spot — don't loop on this.** `tea issue <n>
  --fields milestone` and `--fields comments` render `None`/`0`
  **even when set** — they are NOT a source of truth. Confirm a
  **milestone** via the filtered list (`tea issues list --milestones
  "<name>" --limit 100 | grep <idx>`; presence = set); confirm a
  **posted comment** via `tea issue <n> --comments` (NOT the
  `comments` count field). Labels/state/title DO render correctly on
  the single-issue fetch; only milestone + comments don't.
- **Pagination**: default ~30 issues. Use `--limit 100` (or more)
  for full lists; `--state all` to include closed; `--output
  tsv`/`json` for parseable output.

## Build hygiene

`target/` is git-ignored and 100% regenerable, but it grows
without bound — cargo never garbage-collects old hash-suffixed
artifacts, so stale test binaries (each ~100 MB, statically
linking the bundled engine + debug info), incremental-compile
caches, and orphaned example binaries pile up across sessions
(it reached **~38 GB** before the first sweep).

Two prevention levers are configured in `Cargo.toml` `[profile.dev]`
(the `test` profile inherits both):

- **`incremental = false`** — the incremental cache alone reached
  **16 GB** here (≈28 compilation units × every historical config,
  never evicted), for little benefit in a full-`cargo test` workflow.
  Off, it never regrows; the cost is whole-crate recompiles instead of
  partial — seconds for a crate this size.
- **`debug = "line-tables-only"`** — the default `debug = 2` is
  ~85–90 % of each test binary; line tables keep file:line in panics
  and backtraces (we debug via `tracing` logs) at a fraction of the
  size.
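
Taken together, the two settings described above would look roughly like this fragment. Only the two keys and the `[profile.dev]` table come from the text; anything else in the real `Cargo.toml` is not shown.

```toml
[profile.dev]
# No incremental cache; whole-crate recompiles instead of partial ones.
incremental = false
# Keep file:line info for panics and backtraces, drop full debug info.
debug = "line-tables-only"
```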

Even with those, stale artifacts still accumulate (cargo has no
`target/` eviction). **Run `cargo sweep` every now and then** to reclaim
the space. `cargo-sweep` (installed) can prune everything *except* the
current build's artifacts:

- **Keep only the current build (the usual sweep):** stamp,
  build, then delete everything the build didn't touch —
  ```
  cargo sweep --stamp
  cargo build --all-targets        # touch the artifacts to keep
  cargo sweep --file               # remove everything older than the stamp
  ```
  Add `--dry-run` to `--file` first to preview what goes. Caveat:
  `build --all-targets` only updates the mtime of what it
  actually (re)builds, so already-fresh *dependency* artifacts
  fall before the stamp and get swept too — they recompile once
  on the next build (a one-time cost; everything is regenerable).
  The first sweep took `target/` from ~38 GB to ~19 GB this way,
  freeing **19 GiB**.
- **Lighter routine options:** `cargo sweep --time 30` drops
  artifacts untouched for 30+ days; `cargo sweep --maxsize 10GB`
  trims oldest until under a size cap.
- **`--installed` is *not* the tool for same-toolchain cruft.**
  It keeps only artifacts from currently-installed rustup
  toolchains, so it frees space *only after you uninstall/replace
  a toolchain*. For the usual "many builds, one toolchain"
  accumulation it cleans **nothing** (verified: it would free 0
  here) — use the stamp/file workflow instead.

A good cadence is a sweep between major milestones (e.g. at
session handoff). `cargo clean` remains the nuclear option (wipes
all of `target/`, forcing a full from-scratch rebuild).

## Things deliberately deferred

These are explicitly tracked (mostly in `requirements.md`) but
not yet implemented:

- **Modify relationship** (C3a): drop+add covers the use case
  today.
- **Strong syntax-help in parse errors** (H1a): point users at
  missing keywords/clauses rather than the unexpected
  character. *(H1 — the friendly **database**-error layer — is
  done, ADR-0019; H1a is its separate parse-error sibling,
  ADR-0021, still partial.)*
- **Tutorial/lesson system**: acknowledged as in scope for
  design; needs its own ADR.
- **Session log + Markdown export** (V4): the bigger UX
  project — scrollable session journal, smart structure
  rendering, save-as-markdown.
- **Readline shortcuts** (I1b): Ctrl-A/Ctrl-E, Ctrl-W/Ctrl-K/
  Ctrl-U.
- **Multi-line input** (I1): Enter inserts newline,
  Ctrl-Enter submits.
- **Tab completion** (I3), **syntax highlighting** (I4).
- **ER diagram export** (V3).
- **Full TT5** (CI): the pipeline is live (see the CI decision
  above / `docs/ci/adr/`), but "all tiers on all OSes" isn't
  complete — **Windows is build-only** (cross-compiled, not
  executed: no Windows runner) and **Tier 4** (PTY, TT4) isn't
  wired in CI.
- **D3 packaging**: prebuilt binaries + checksums ship to Gitea
  releases, but the Homebrew / Scoop / winget / `cargo binstall`
  manifests are not done.

## Handoff notes

When taking over a session, read in order:

1. `docs/handoff/` — most recent file gives session context.
2. `CLAUDE.md` (this file).
3. `docs/requirements.md` — current progress on each item.
4. `docs/adr/README.md` and any ADR you'll touch.