rdbms-playground/docs/ci/adr/20260612-adr-ci-001.md
claude@clouddev1 da8bfebc36
docs(ci): establish docs/ci/adr namespace (ci-001 pipeline, ci-002 flake)
Records the CI/release pipeline as ADR-ci-001 and relocates the nix-flake
ADR from main's ADR-0049 to ADR-ci-002 (content unchanged, history note
added). Both live in docs/ci/adr/ with a README index — a dated,
ci-segmented namespace disjoint from main's integer ADR sequence, the
same split the website subproject uses to avoid cross-branch number
collisions. Drops the ADR-0049 entry from docs/adr/README.

ci-001 covers the runner model, the baked nix CI image, the clippy+test
gate, the static-musl release on tag, trigger hygiene, auth, and scope.
2026-06-12 22:38:34 +00:00

# ADR-ci-001: CI + release pipeline on Gitea Actions
## Status
**Accepted (2026-06-12); implemented the same day on the `ci` branch.** Every
fork below was settled with the user as the pipeline was built, and each stage
was verified live before acceptance:
- a throwaway probe workflow established how the runner executes jobs;
- the CI image was built and checked locally (runner contract, warm devShell);
- the gate ran green (**clippy clean; 2424 tests pass / 0 fail / 1 intentionally
ignored doctest**);
- the release was exercised end-to-end — tag `v0.0.0-citest2` published a Gitea
release carrying the static binary (~10 MB) and its `.sha256`.

This ADR records the **CI/release pipeline**. The **dev/build environment it
runs on** — the nix flake (devShell + reproducible build, pinned Rust 1.95.0)
— is **ADR-ci-002** (relocated here from main's ADR-0049); this ADR builds on
it rather than restating it.
> **Namespacing.** Kept in `docs/ci/adr/` (id `ADR-ci-001`), disjoint from
> `main`'s integer ADR sequence, mirroring the website subproject's
> `docs/website/adr/`. This avoids the cross-branch number collisions that
> previously forced website ADRs to be renumbered (see that namespace's
> history note and ADR-0000 "Numbering discipline").
## Context
The project is near feature-complete and needs CI (`requirements.md` **TT5**;
the **CI** item in the deferred list) and a release path for its distributed
binaries (**D1**/**D2**/**D3**). The self-hosted Gitea instance
(`git.lazyeval.net`) has its Actions runner freshly set up — a first-time
in-anger use — with a DinD-capable setup and a reusable `docker-build`
template, exercised by a handful of sample workflows.

The starting constraints, and what the probe found:
- The runner label is **`ci-public`**. A throwaway probe
(`ci-probe.yaml`, since removed) established that **jobs run *inside* a
container** — `ghcr.io/catthehacker/ubuntu:act-22.04` by default, as **root**
— and therefore the runner *host's* nix is **not** on the steps' PATH
(`nix NOT on PATH`, `no /nix`). A custom job `container:` *can* be pulled
(it pulled `nixos/nix:latest`), but the runner keeps job containers alive
with `entrypoint: /bin/sleep` and runs JS actions (e.g. `actions/checkout`)
with `node`, so the container must provide **`sleep` + `bash` + `node`** —
a bare `nixos/nix` image has none and fails to start.
- The reusable template only does `docker build`; it neither runs a Rust gate
nor pushes images nor uploads release assets — so a Rust pipeline can't just
call it.
- The whole motivation (per the user) is for CI to use the project's **nix
flake** for its tools rather than relying on whatever the build machine has
— i.e. **one toolchain definition shared by dev and CI**.
## Decision
### 1. Toolchain delivery — a baked nix CI image
CI gets its toolchain from a **purpose-built job-container image**, not from
host nix and not by installing nix per-job:
- **Base `node:22-bookworm-slim`.** Debian slim already provides `bash` +
coreutils (`sleep`); the `node` tag adds the actions runtime. This satisfies
the act_runner job-container contract at a fraction of the size of the
catthehacker runner images (chosen at the user's prompting to avoid those
multi-GB images), and far more reliably than a bare `nixos/nix` (which can't
start). The image is defined in `.gitea/ci-image/Dockerfile`.
- **Single-user nix on top**, flakes enabled, with the **flake's devShell
pre-warmed** (`nix develop` realizes nixpkgs + the pinned Rust toolchain +
`cargo-sweep` + the musl cc into the store). CI then runs `nix develop -c …`
against a warm store — the *same* pinned toolchain as dev (ADR-ci-002),
reaching a ready toolchain in ~1.4 s.
- **Built + pushed by `build-ci-image.yaml`** via the DinD service to the
Gitea container registry as `git.lazyeval.net/<owner>/rdbms-playground-ci`,
a **public** package (anonymous pull, no gate-side credentials). It runs only
when an image input changes (Dockerfile / `flake.nix` / `flake.lock` /
`rust-toolchain.toml`) or on manual dispatch.
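
The "runs only when an image input changes" rule could be expressed roughly as below. This is a minimal sketch of the `on:` block only, assuming Gitea Actions' GitHub-compatible workflow syntax. The path entries are assumptions derived from the inputs listed above, and the committed `build-ci-image.yaml` may differ.

```yaml
# Sketch of the build-ci-image.yaml triggers (not the committed file).
on:
  push:
    branches: ['**']        # branch pushes only, so tag pushes do not rebuild
    paths:                  # assumed globs: the image inputs named in the ADR
      - '.gitea/ci-image/Dockerfile'
      - 'flake.nix'
      - 'flake.lock'
      - 'rust-toolchain.toml'
  workflow_dispatch: {}     # manual rebuild
```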
### 2. Gate — `ci.yaml`
On branch pushes and PRs, a single job runs **inside the CI image**:
`nix develop -c cargo clippy --all-targets -- -D warnings` then
`nix develop -c cargo test --no-fail-fast`.

**`fmt` is deliberately not gated.** The tree isn't clean under stock
`rustfmt` (~100 files would change; no `rustfmt.toml` is committed) and
reformatting would churn blame across the in-flight website branch and ongoing
`main` work — so, by user decision, the gate is **clippy + test** and fmt is
revisited on `main` (also recorded in ADR-ci-002).
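
For orientation, a gate of this shape might look like the following. It is a sketch under stated assumptions, not the committed `ci.yaml`: the workflow and job names and the `actions/checkout` version are illustrative, and `<owner>` is left elided as in this ADR. The two `run:` commands are the ones quoted above.

```yaml
# Sketch of ci.yaml (names and action version are assumptions).
name: ci
on:
  push:
    branches: ['**']        # branch pushes only; tags belong to release.yaml
  pull_request:

jobs:
  gate:
    runs-on: ci-public      # the runner label recorded in Context
    container:
      # Public package, so no pull credentials are configured here.
      image: git.lazyeval.net/<owner>/rdbms-playground-ci
    steps:
      - uses: actions/checkout@v4   # version is an assumption
      - name: clippy
        run: nix develop -c cargo clippy --all-targets -- -D warnings
      - name: test
        run: nix develop -c cargo test --no-fail-fast
```

Keeping clippy and test as separate steps makes it obvious in the run log which half of the gate failed.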
### 3. Release — `release.yaml`
On a `v*` tag, one job in the CI image:
1. **tests** (`cargo test`) — so a tag can never publish untested code, even
one pointing at a never-gated commit (user choice over relying solely on the
branch gate);
2. **builds the static binary** for **`x86_64-unknown-linux-musl`** (D2:
single static binary, no runtime deps). The glibc/nix-store build is
non-portable; the musl target with `crt-static` is fully static. rusqlite's
`bundled` SQLite C is compiled by a **musl `cc`** (`pkgsCross.musl64`) wired
into the flake devShell via `CC_<target>` + `CARGO_TARGET_<TARGET>_LINKER`;
`[profile.release] strip = "symbols"` trims it (~13 MB → ~10 MB);
3. **publishes** the binary + a `.sha256` to a Gitea release via the API and
the auto-provided **`GITEA_TOKEN`** — no third-party action (just `curl` +
`node`, both in the image).
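
The three steps above could be wired together roughly as follows. This is a hedged sketch, not the committed `release.yaml`: the binary name `rdbms-playground`, the job name, the `actions/checkout` version, and the use of the `gitea.*` context values are assumptions. The API calls are Gitea's release-create and release-asset endpoints.

```yaml
# Sketch of release.yaml (binary name and context names are assumptions).
on:
  push:
    tags: ['v*']

jobs:
  release:
    runs-on: ci-public
    container:
      image: git.lazyeval.net/<owner>/rdbms-playground-ci
    steps:
      - uses: actions/checkout@v4   # version is an assumption
      - name: test
        run: nix develop -c cargo test
      - name: build static musl binary
        run: nix develop -c cargo build --release --target x86_64-unknown-linux-musl
      - name: checksum
        run: |
          cd target/x86_64-unknown-linux-musl/release
          sha256sum rdbms-playground > rdbms-playground.sha256
      - name: publish
        env:
          TOKEN: ${{ secrets.GITEA_TOKEN }}
          API: ${{ gitea.server_url }}/api/v1/repos/${{ gitea.repository }}
          TAG: ${{ gitea.ref_name }}
        run: |
          dir=target/x86_64-unknown-linux-musl/release
          # Create the release and read its numeric id from the JSON reply.
          id=$(curl -fsS -X POST "$API/releases" \
                 -H "Authorization: token $TOKEN" \
                 -H "Content-Type: application/json" \
                 -d "{\"tag_name\": \"$TAG\", \"name\": \"$TAG\"}" \
               | node -e 'console.log(JSON.parse(require("fs").readFileSync(0, "utf8")).id)')
          # Attach the binary and its checksum as release assets.
          for f in rdbms-playground rdbms-playground.sha256; do
            curl -fsS -X POST "$API/releases/$id/assets?name=$f" \
              -H "Authorization: token $TOKEN" \
              -F "attachment=@$dir/$f"
          done
```

Because the test step runs first in the same job, a failing test stops the job before anything is built or published.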
### 4. Triggers — branch vs tag hygiene
- Gate and image-build are scoped to **branch** pushes (`branches: ['**']`).
Tag pushes ignore `paths:` filters and would otherwise spuriously rebuild the
unchanged image and re-gate an already-gated commit; the branch filter
excludes tags. **`release.yaml` owns tags** (`tags: ['v*']`).
- Pushing commits + a tag together still gates the commits (via the branch
ref) and releases (via the tag ref) — no lost coverage, no duplicate runs.
### 5. Auth
- **Image push:** a dedicated PAT with `write:package`, supplied as the
`REGISTRY_USERNAME` / `REGISTRY_TOKEN` Actions secrets (the package owner
must match the token's user — an `oli`-namespace push with a different user
is refused with `reqPackageAccess`).
- **Release publish:** the auto `GITEA_TOKEN` (repo/release scope).
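
As an illustration of the image-push credentials, the login and push steps might look like this. It is a fragment under stated assumptions: the DinD service wiring is omitted, the repository root as build context is a guess, and `<owner>` stays elided as in this ADR. Only the secret names and the Dockerfile path come from the text above.

```yaml
# Sketch of the push steps in build-ci-image.yaml (DinD service wiring omitted).
steps:
  - uses: actions/checkout@v4   # version is an assumption
  - name: log in to the Gitea registry
    env:
      REGISTRY_USERNAME: ${{ secrets.REGISTRY_USERNAME }}
      REGISTRY_TOKEN: ${{ secrets.REGISTRY_TOKEN }}
    run: |
      # The PAT's user must own the target package namespace.
      echo "$REGISTRY_TOKEN" \
        | docker login git.lazyeval.net -u "$REGISTRY_USERNAME" --password-stdin
  - name: build and push
    env:
      IMAGE: git.lazyeval.net/<owner>/rdbms-playground-ci
    run: |
      # Build context "." is an assumption (the Dockerfile needs the flake files).
      docker build -f .gitea/ci-image/Dockerfile -t "$IMAGE" .
      docker push "$IMAGE"
```

Passing the secrets through `env:` rather than interpolating them into the script keeps them out of the rendered command line.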
### 6. Scope this iteration — Linux x86_64, step by step
The user's target is the full **D1** matrix, approached incrementally. This
iteration ships **Linux x86_64 only**; the rest is deferred (below).
## Consequences
- **One toolchain, dev and CI.** They build through the same flake and cannot
drift. The image is rebuilt only when the flake, toolchain, or Dockerfile changes.
- **D2 is met on Linux.** The release artifact is a genuinely static,
stripped musl binary that runs with no runtime dependencies.
- **DinD is per-job (no layer cache across runs),** so every `build-ci-image`
run rebuilds from scratch (~6 min). Acceptable at its trigger frequency;
base-pull caching via the `dind-cached` proxy variant is a possible later
optimisation.
- **The CI image is ~5.5 GB+** (the Rust toolchain closure, now also musl).
Pulled once per runner and cached; slimming (multi-stage, prune) is optional.
- **Every gate run recompiles the full dependency graph** (warm *toolchain*,
cold *deps*; clippy and test don't share artifacts), ~2 min total. Fine for
now; dependency/`target` caching is a deferred speed item.
- **`GITEA_TOKEN` must retain release scope;** if an instance policy narrows
it, the release publish would have to fall back to a repo-scoped PAT secret.
## Alternatives considered
- **Run on the runner host's nix.** Rejected — the probe showed steps run in a
container where host nix is unreachable.
- **Install nix per-job in the default image.** Works but cold every run
(slow) and throwaway once the image exists; rejected in favour of the baked
image.
- **`catthehacker` or bare `nixos/nix` as the base.** catthehacker is a
multi-GB runner emulation we don't need; bare `nixos/nix` lacks
`sleep`/`bash`/`node` and won't start. `node:22-bookworm-slim` is the small,
contract-satisfying middle (user's suggestion).
- **A standard `rust:1.95` CI image instead of the flake.** Simpler in CI but a
*second* toolchain definition (drift) — counter to the unify-with-dev goal.
- **A third-party Gitea release action.** Avoided; the API + auto token keep
the release self-contained and debuggable.
## Deferred / out of scope (tracked, step by step)
- **D1 matrix:** aarch64, macOS, Windows builds (cross toolchains; macOS is the
hard part on a Linux runner).
- **D3 packaging:** Homebrew / Scoop / winget / `cargo-binstall` manifests
(and binstall-friendly asset naming/archives).
- **Tier 4 (PTY E2E):** still unwired (`requirements.md` **TT4**); the gate runs
tiers 1-3 only, so **TT5** ("CI runs all tiers on Linux/macOS/Windows") is
only partially met (Linux, tiers 1-3).
- **CI speed:** dependency/`target` caching (cargo-chef into the image, or
`actions/cache`), and image slimming / `dind-cached` base-pull caching.
- **Website deploy:** the static site → Cloudflare via Gitea Actions (a
separate, simpler workflow on the website branch).
- **fmt gate:** revisit on `main` once a `rustfmt` style is chosen.
## Relationship to other decisions
- **Builds on ADR-ci-002** (nix flake dev + build env). This ADR adds the
musl-target/cc to that flake and consumes it from CI.
- **Advances `requirements.md`:** **TT5** (CI runs the tiers: Linux, 1-3),
**D2** (static binary — Linux, done), **D1**/**D3** (partial/deferred).
- **Mirrors the website subproject's** separate ADR namespace and its
static→Cloudflare-via-Gitea-Actions deployment posture (ADR-website-001).