docs(ci): establish docs/ci/adr namespace (ci-001 pipeline, ci-002 flake)
ci / gate (push) Successful in 2m33s

Records the CI/release pipeline as ADR-ci-001 and relocates the nix-flake
ADR from main's ADR-0049 to ADR-ci-002 (content unchanged, history note
added). Both live in docs/ci/adr/ with a README index — a dated,
ci-segmented namespace disjoint from main's integer ADR sequence, the
same split the website subproject uses to avoid cross-branch number
collisions. Drops the ADR-0049 entry from docs/adr/README.

ci-001 covers the runner model, the baked nix CI image, the clippy+test
gate, the static-musl release on tag, trigger hygiene, auth, and scope.
This commit is contained in:
claude@clouddev1
2026-06-12 22:38:34 +00:00
parent 89b9392c25
commit da8bfebc36
4 changed files with 221 additions and 7 deletions
+185
View File
@@ -0,0 +1,185 @@
# ADR-ci-001: CI + release pipeline on Gitea Actions
## Status
**Accepted (2026-06-12); implemented the same day on the `ci` branch.** Every
fork below was settled with the user as the pipeline was built, and each stage
was verified live before acceptance:
- a throwaway probe workflow established how the runner executes jobs;
- the CI image was built and checked locally (runner contract, warm devShell);
- the gate ran green (**clippy clean; 2424 tests pass / 0 fail / 1 intentional
ignored doctest**);
- the release was exercised end-to-end — tag `v0.0.0-citest2` published a Gitea
release carrying the static binary (~10 MB) and its `.sha256`.
This ADR records the **CI/release pipeline**. The **dev/build environment it
runs on** — the nix flake (devShell + reproducible build, pinned Rust 1.95.0)
— is **ADR-ci-002** (relocated here from main's ADR-0049); this ADR builds on
it rather than restating it.
> **Namespacing.** Kept in `docs/ci/adr/` (id `ADR-ci-001`), disjoint from
> `main`'s integer ADR sequence, mirroring the website subproject's
> `docs/website/adr/`. This avoids the cross-branch number collisions that
> previously forced website ADRs to be renumbered (see that namespace's
> history note and ADR-0000 "Numbering discipline").
## Context
The project is near feature-complete and needs CI (`requirements.md` **TT5**;
the **CI** item in the deferred list) and a release path for its distributed
binaries (**D1**/**D2**/**D3**). The self-hosted Gitea instance
(`git.lazyeval.net`) has its Actions runner freshly set up — a first-time
in-anger use — with a DinD-capable setup and a reusable `docker-build`
template, exercised by a handful of sample workflows.
The starting constraints, and what the probe found:
- The runner label is **`ci-public`**. A throwaway probe
(`ci-probe.yaml`, since removed) established that **jobs run *inside* a
container** — `ghcr.io/catthehacker/ubuntu:act-22.04` by default, as **root**
— and therefore the runner *host's* nix is **not** on the steps' PATH
(`nix NOT on PATH`, `no /nix`). A custom job `container:` *can* be pulled
(it pulled `nixos/nix:latest`), but the runner keeps job containers alive
with `entrypoint: /bin/sleep` and runs JS actions (e.g. `actions/checkout`)
with `node`, so the container must provide **`sleep` + `bash` + `node`** —
a bare `nixos/nix` image has none and fails to start.
- The reusable template only does `docker build`; it neither runs a Rust gate
nor pushes images nor uploads release assets — so a Rust pipeline can't just
call it.
- The whole motivation (per the user) is for CI to use the project's **nix
flake** for its tools rather than relying on whatever the build machine has
— i.e. **one toolchain definition shared by dev and CI**.
## Decision
### 1. Toolchain delivery — a baked nix CI image
CI gets its toolchain from a **purpose-built job-container image**, not from
host nix and not by installing nix per-job:
- **Base `node:22-bookworm-slim`.** Debian slim already provides `bash` +
coreutils (`sleep`); the `node` tag adds the actions runtime. This satisfies
the act_runner job-container contract at a fraction of the size of the
catthehacker runner images (chosen on the user's prompt to avoid those
multi-GB images), and far more reliably than a bare `nixos/nix` (which can't
start). `.gitea/ci-image/Dockerfile`.
- **Single-user nix on top**, flakes enabled, with the **flake's devShell
pre-warmed** (`nix develop` realizes nixpkgs + the pinned Rust toolchain +
`cargo-sweep` + the musl cc into the store). CI then runs `nix develop -c …`
against a warm store — the *same* pinned toolchain as dev (ADR-ci-002),
reaching a ready toolchain in ~1.4 s.
- **Built + pushed by `build-ci-image.yaml`** via the DinD service to the
Gitea container registry as `git.lazyeval.net/<owner>/rdbms-playground-ci`,
a **public** package (anonymous pull, no gate-side credentials). It runs only
when an image input changes (Dockerfile / `flake.nix` / `flake.lock` /
`rust-toolchain.toml`) or on manual dispatch.
### 2. Gate — `ci.yaml`
On branch pushes and PRs, a single job runs **inside the CI image**:
`nix develop -c cargo clippy --all-targets -- -D warnings` then
`nix develop -c cargo test --no-fail-fast`.
**`fmt` is deliberately not gated.** The tree isn't clean under stock
`rustfmt` (~100 files would change; no `rustfmt.toml` is committed) and
reformatting would churn blame across the in-flight website branch and ongoing
`main` work — so, by user decision, the gate is **clippy + test** and fmt is
revisited on `main` (also recorded in ADR-ci-002).
### 3. Release — `release.yaml`
On a `v*` tag, one job in the CI image:
1. **tests** (`cargo test`) — so a tag can never publish untested code, even
one pointing at a never-gated commit (user choice over relying solely on the
branch gate);
2. **builds the static binary** for **`x86_64-unknown-linux-musl`** (D2:
single static binary, no runtime deps). The glibc/nix-store build is
non-portable; the musl target with `crt-static` is fully static. rusqlite's
`bundled` SQLite C is compiled by a **musl `cc`** (`pkgsCross.musl64`) wired
into the flake devShell via `CC_<target>` + `CARGO_TARGET_<TARGET>_LINKER`;
`[profile.release] strip = "symbols"` trims it (~13 MB → ~10 MB);
3. **publishes** the binary + a `.sha256` to a Gitea release via the API and
the auto-provided **`GITEA_TOKEN`** — no third-party action (just `curl` +
`node`, both in the image).
### 4. Triggers — branch vs tag hygiene
- Gate and image-build are scoped to **branch** pushes (`branches: ['**']`).
Tag pushes ignore `paths:` filters and would otherwise spuriously rebuild the
unchanged image and re-gate an already-gated commit; the branch filter
excludes tags. **`release.yaml` owns tags** (`tags: ['v*']`).
- Pushing commits + a tag together still gates the commits (via the branch
ref) and releases (via the tag ref) — no lost coverage, no duplicate runs.
### 5. Auth
- **Image push:** a dedicated PAT with `write:package`, supplied as the
`REGISTRY_USERNAME` / `REGISTRY_TOKEN` Actions secrets (the package owner
must match the token's user — an `oli`-namespace push with a different user
is refused with `reqPackageAccess`).
- **Release publish:** the auto `GITEA_TOKEN` (repo/release scope).
### 6. Scope this iteration — Linux x86_64, step by step
The user's target is the full **D1** matrix, approached incrementally. This
iteration ships **Linux x86_64 only**; the rest is deferred (below).
## Consequences
- **One toolchain, dev and CI.** They build through the same flake and cannot
drift. New image rebuilds only when the flake/toolchain/Dockerfile change.
- **D2 is met on Linux.** The release artifact is a genuinely static,
stripped musl binary that runs with no runtime dependencies.
- **DinD is per-job (no layer cache across runs),** so every `build-ci-image`
run rebuilds from scratch (~6 min). Acceptable at its trigger frequency;
base-pull caching via the `dind-cached` proxy variant is a possible later
optimisation.
- **The CI image is ~5.5 GB+** (the Rust toolchain closure, now also musl).
Pulled once per runner and cached; slimming (multi-stage, prune) is optional.
- **Every gate run recompiles the full dependency graph** (warm *toolchain*,
cold *deps*; clippy and test don't share artifacts), ~2 min total. Fine for
now; dependency/`target` caching is a deferred speed item.
- **`GITEA_TOKEN` must retain release scope;** if an instance policy narrows
it, the release publish falls back to a repo-scoped PAT secret.
## Alternatives considered
- **Run on the runner host's nix.** Rejected — the probe showed steps run in a
container where host nix is unreachable.
- **Install nix per-job in the default image.** Works but cold every run
(slow) and throwaway once the image exists; rejected in favour of the baked
image.
- **`catthehacker` or bare `nixos/nix` as the base.** catthehacker is a
multi-GB runner emulation we don't need; bare `nixos/nix` lacks
`sleep`/`bash`/`node` and won't start. `node:22-bookworm-slim` is the small,
contract-satisfying middle (user's suggestion).
- **A standard `rust:1.95` CI image instead of the flake.** Simpler in CI but a
*second* toolchain definition (drift) — counter to the unify-with-dev goal.
- **A third-party Gitea release action.** Avoided; the API + auto token keep
the release self-contained and debuggable.
## Deferred / out of scope (tracked, step by step)
- **D1 matrix:** aarch64, macOS, Windows builds (cross toolchains; macOS is the
hard part on a Linux runner).
- **D3 packaging:** Homebrew / Scoop / winget / `cargo-binstall` manifests
(and binstall-friendly asset naming/archives).
- **Tier 4 (PTY E2E):** still unwired (`requirements.md` **TT4**); the gate runs
tiers 13 only, so **TT5** ("CI runs all tiers on Linux/macOS/Windows") is
partially met — Linux, tiers 13.
- **CI speed:** dependency/`target` caching (cargo-chef into the image, or
`actions/cache`), and image slimming / `dind-cached` base-pull caching.
- **Website deploy:** the static site → Cloudflare via Gitea Actions (a
separate, simpler workflow on the website branch).
- **fmt gate:** revisit on `main` once a `rustfmt` style is chosen.
## Relationship to other decisions
- **Builds on ADR-ci-002** (nix flake dev + build env). This ADR adds the
musl-target/cc to that flake and consumes it from CI.
- **Advances `requirements.md`:** **TT5** (CI runs the tiers — Linux, 13),
**D2** (static binary — Linux, done), **D1**/**D3** (partial/deferred).
- **Mirrors the website subproject's** separate ADR namespace and its
static→Cloudflare-via-Gitea-Actions deployment posture (ADR-website-001).