rdbms-playground/docs/ci/adr/20260612-adr-ci-001.md
claude@clouddev1 5869eec4f4 docs(ci): ADR-ci-003 — cross-platform release builds (D1 matrix)
Record the multi-platform build strategy as its own decision: cargo-zigbuild
for the four non-macOS targets, the static/standalone posture per platform,
the Windows synchronization stub, the test->build matrix workflow, and the
macOS deferral with its licensing rationale (the public CI image can't carry
the SDK). Shrinks the ci-001 amendment to a pointer; updates the index.

Runtime-verified by the user: Linux x86_64 + Windows aarch64 run correctly.
2026-06-13 19:11:29 +00:00

ADR-ci-001: CI + release pipeline on Gitea Actions

Status

Accepted (2026-06-12); implemented the same day on the ci branch. Every fork below was settled with the user as the pipeline was built, and each stage was verified live before acceptance:

  • a throwaway probe workflow established how the runner executes jobs;
  • the CI image was built and checked locally (runner contract, warm devShell);
  • the gate ran green (clippy clean; 2424 tests pass / 0 fail / 1 intentional ignored doctest);
  • the release was exercised end-to-end — tag v0.0.0-citest2 published a Gitea release carrying the static binary (~10 MB) and its .sha256.

This ADR records the CI/release pipeline. The dev/build environment it runs on — the nix flake (devShell + reproducible build, pinned Rust 1.95.0) — is ADR-ci-002 (relocated here from main's ADR-0049); this ADR builds on it rather than restating it.

Namespacing. Kept in docs/ci/adr/ (id ADR-ci-001), disjoint from main's integer ADR sequence, mirroring the website subproject's docs/website/adr/. This avoids the cross-branch number collisions that previously forced website ADRs to be renumbered (see that namespace's history note and ADR-0000 "Numbering discipline").

Amendment — 2026-06-13: D1 matrix (non-macOS)

§3 (Release) below describes the original single-target (x86_64 Linux) job. The release is now a test→build matrix over the four non-macOS D1 targets (Linux + Windows × x86_64/aarch64), cross-built with cargo-zigbuild. The full decision (tooling, targets, the Windows synchronization stub, the matrix shape, and the macOS deferral with its licensing rationale) is recorded separately in ADR-ci-003.

Context

The project is near feature-complete and needs CI (requirements.md TT5; the CI item in the deferred list) and a release path for its distributed binaries (D1/D2/D3). The self-hosted Gitea instance (git.lazyeval.net) has its Actions runner freshly set up — a first-time in-anger use — with a DinD-capable setup and a reusable docker-build template, exercised by a handful of sample workflows.

The starting constraints, and what the probe found:

  • The runner label is ci-public. A throwaway probe (ci-probe.yaml, since removed) established that jobs run inside a container (ghcr.io/catthehacker/ubuntu:act-22.04 by default, as root), so the runner host's nix is not on the steps' PATH (the probe found nix NOT on PATH and no /nix). A custom job container: can be pulled (it pulled nixos/nix:latest), but the runner keeps job containers alive with entrypoint: /bin/sleep and runs JS actions (e.g. actions/checkout) with node, so the container must provide sleep + bash + node. A bare nixos/nix image provides none of these and fails to start.
  • The reusable template only does docker build; it neither runs a Rust gate nor pushes images nor uploads release assets — so a Rust pipeline can't just call it.
  • The whole motivation (per the user) is for CI to use the project's nix flake for its tools rather than relying on whatever the build machine has — i.e. one toolchain definition shared by dev and CI.
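
The job-container contract found by the probe (sleep + bash + node must be present) can be checked mechanically. The following is a minimal sketch in Python, not part of the pipeline itself; the function name `missing_tools` and the idea of running such a check inside a candidate image are assumptions made for illustration.

```python
import shutil
from typing import Callable, Iterable, List, Optional

# Tools the runner expects inside a job container, per the probe findings:
# /bin/sleep keeps the container alive, bash runs steps, node runs JS actions.
REQUIRED_TOOLS = ("sleep", "bash", "node")


def missing_tools(
    required: Iterable[str] = REQUIRED_TOOLS,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the required tools that cannot be resolved on PATH.

    `which` is injectable so the check can be exercised without a real image.
    """
    return [tool for tool in required if which(tool) is None]
```

Run inside a candidate image, an empty result would suggest the image can at least start as a job container; a bare nixos/nix image would be expected to report all three tools as missing.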

Decision

1. Toolchain delivery — a baked nix CI image

CI gets its toolchain from a purpose-built job-container image, not from host nix and not by installing nix per-job:

  • Base node:22-bookworm-slim. Debian slim already provides bash + coreutils (sleep); the node tag adds the actions runtime. This satisfies the act_runner job-container contract at a fraction of the size of the catthehacker runner images (chosen on the user's prompt to avoid those multi-GB images), and far more reliably than a bare nixos/nix (which can't start). .gitea/ci-image/Dockerfile.
  • Single-user nix on top, flakes enabled, with the flake's devShell pre-warmed (nix develop realizes nixpkgs + the pinned Rust toolchain + cargo-sweep + the musl cc into the store). CI then runs nix develop -c … against a warm store — the same pinned toolchain as dev (ADR-ci-002), reaching a ready toolchain in ~1.4 s.
  • Built + pushed by build-ci-image.yaml via the DinD service to the Gitea container registry as git.lazyeval.net/<owner>/rdbms-playground-ci, a public package (anonymous pull, no gate-side credentials). It runs only when an image input changes (Dockerfile / flake.nix / flake.lock / rust-toolchain.toml) or on manual dispatch.
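
The rebuild condition for the image (an image input changed, or a manual dispatch) amounts to a small predicate. The sketch below models it in Python purely for illustration; the real condition lives in the workflow's trigger configuration. Only the Dockerfile path is given in full in this ADR, so the repository-root location of the other three files is an assumption, as is the name `image_needs_rebuild`.

```python
from typing import Iterable

# Image inputs named in this ADR. Only the Dockerfile path is stated in full;
# placing the other three files at the repository root is an assumption.
IMAGE_INPUTS = frozenset({
    ".gitea/ci-image/Dockerfile",
    "flake.nix",
    "flake.lock",
    "rust-toolchain.toml",
})


def image_needs_rebuild(changed_paths: Iterable[str], manual_dispatch: bool = False) -> bool:
    """True when the push touches an image input or the run was dispatched manually."""
    return manual_dispatch or any(path in IMAGE_INPUTS for path in changed_paths)
```

A push that only touches source files would leave the published image untouched, while a flake.lock bump would trigger a rebuild.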

2. Gate — ci.yaml

On branch pushes and PRs, a single job runs inside the CI image: nix develop -c cargo clippy --all-targets -- -D warnings then nix develop -c cargo test --no-fail-fast.
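
As a rough model of the gate, the two commands above run in order and the job fails at the first non-zero exit. The Python sketch below reproduces that sequencing with the exact command lines from this section; the stop-at-first-failure behaviour is an assumption based on how Actions steps behave by default, and `run_gate` is a hypothetical helper, not something in the repository.

```python
import subprocess
from typing import Callable, List, Sequence

# The two gate commands, exactly as listed in this section.
GATE_STEPS: List[List[str]] = [
    ["nix", "develop", "-c", "cargo", "clippy", "--all-targets", "--", "-D", "warnings"],
    ["nix", "develop", "-c", "cargo", "test", "--no-fail-fast"],
]


def _run(cmd: Sequence[str]) -> int:
    """Default runner: execute the command and return its exit code."""
    return subprocess.run(list(cmd)).returncode


def run_gate(run: Callable[[Sequence[str]], int] = _run) -> int:
    """Run the gate steps in order; return the first non-zero exit code, else 0."""
    for cmd in GATE_STEPS:
        code = run(cmd)
        if code != 0:
            return code  # later steps are skipped, as with default step semantics
    return 0
```

Injecting `run` keeps the sequencing testable on a machine without nix or cargo.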

fmt is deliberately not gated. The tree isn't clean under stock rustfmt (~100 files would change; no rustfmt.toml is committed) and reformatting would churn blame across the in-flight website branch and ongoing main work — so, by user decision, the gate is clippy + test and fmt is revisited on main (also recorded in ADR-ci-002).

3. Release — release.yaml

On a v* tag, one job in the CI image:

  1. tests (cargo test) — so a tag can never publish untested code, even one pointing at a never-gated commit (user choice over relying solely on the branch gate);
  2. builds the static binary for x86_64-unknown-linux-musl (D2: single static binary, no runtime deps). The glibc/nix-store build is non-portable; the musl target with crt-static is fully static. rusqlite's bundled SQLite C is compiled by a musl cc (pkgsCross.musl64) wired into the flake devShell via CC_<target> + CARGO_TARGET_<TARGET>_LINKER; [profile.release] strip = "symbols" trims it (~13 MB → ~10 MB);
  3. publishes the binary + a .sha256 to a Gitea release via the API and the auto-provided GITEA_TOKEN — no third-party action (just curl + node, both in the image).
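
The publish step can be sketched in two parts: producing the .sha256 sidecar and describing the release-creation request. This Python version is illustrative only, since the workflow itself uses curl + node. The sha256sum-style line format, the helper names, and the `token` Authorization scheme are assumptions; the endpoint shape follows Gitea's documented `POST /repos/{owner}/{repo}/releases` API, and owner, repo, and API base are left as parameters.

```python
import hashlib
import json
from pathlib import Path
from typing import Dict


def sha256_sidecar_line(binary: Path) -> str:
    """Return a sha256sum-compatible line ("<hex>  <name>") for the binary.

    The sidecar format is an assumption; the ADR only says a .sha256 is published.
    """
    digest = hashlib.sha256(binary.read_bytes()).hexdigest()
    return f"{digest}  {binary.name}\n"


def create_release_request(api_base: str, owner: str, repo: str, tag: str, token: str) -> Dict:
    """Describe (without sending) the request that creates a release for a tag.

    `token` stands in for the auto-provided GITEA_TOKEN.
    """
    return {
        "method": "POST",
        "url": f"{api_base}/repos/{owner}/{repo}/releases",
        "headers": {
            "Authorization": f"token {token}",
            "Content-Type": "application/json",
        },
        "body": json.dumps({"tag_name": tag, "name": tag}),
    }
```

The release id returned by that request would then be used to upload the binary and its sidecar as assets; that second call is omitted here to keep the sketch short.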

4. Triggers — branch vs tag hygiene

  • Gate and image-build are scoped to branch pushes (branches: ['**']). Tag pushes ignore paths: filters and would otherwise spuriously rebuild the unchanged image and re-gate an already-gated commit; the branch filter excludes tags. release.yaml owns tags (tags: ['v*']).
  • Pushing commits + a tag together still gates the commits (via the branch ref) and releases (via the tag ref) — no lost coverage, no duplicate runs.
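
The branch-versus-tag split can be summarised as a routing function from the pushed ref to the workflows that are candidates to run. The Python sketch below is a model of the filters described above, not the runner's actual matching logic; `workflows_for_ref` is a hypothetical name, and the image build remains additionally subject to its paths: filter.

```python
from typing import Iterable, List


def workflows_for_ref(ref: str) -> List[str]:
    """Map a pushed ref to the workflows whose ref filters match it."""
    if ref.startswith("refs/heads/"):
        # branches: ['**'] matches any branch; build-ci-image.yaml also
        # requires an image input to have changed (paths: filter).
        return ["ci.yaml", "build-ci-image.yaml"]
    if ref.startswith("refs/tags/v"):
        # tags: ['v*'] is owned by the release workflow alone.
        return ["release.yaml"]
    return []


def workflows_for_push(refs: Iterable[str]) -> List[str]:
    """Union of candidate workflows for a push that updates several refs."""
    seen: List[str] = []
    for ref in refs:
        for workflow in workflows_for_ref(ref):
            if workflow not in seen:
                seen.append(workflow)
    return seen
```

Pushing a branch and a v* tag together yields the gate through the branch ref and the release through the tag ref, each exactly once, which matches the "no lost coverage, no duplicate runs" claim.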

5. Auth

  • Image push: a dedicated PAT with write:package, supplied as the REGISTRY_USERNAME / REGISTRY_TOKEN Actions secrets (the package owner must match the token's user — an oli-namespace push with a different user is refused with reqPackageAccess).
  • Release publish: the auto GITEA_TOKEN (repo/release scope).

6. Scope this iteration — Linux x86_64, step by step

The user's target is the full D1 matrix, approached incrementally. This iteration ships Linux x86_64 only; the rest is deferred (below).

Consequences

  • One toolchain, dev and CI. They build through the same flake and cannot drift. The image is rebuilt only when the flake, toolchain, or Dockerfile changes.
  • D2 is met on Linux. The release artifact is a genuinely static, stripped musl binary that runs with no runtime dependencies.
  • DinD is per-job (no layer cache across runs), so every build-ci-image run rebuilds from scratch (~6 min). Acceptable at its trigger frequency; base-pull caching via the dind-cached proxy variant is a possible later optimisation.
  • The CI image is ~5.5 GB+ (the Rust toolchain closure, now also musl). Pulled once per runner and cached; slimming (multi-stage, prune) is optional.
  • Every gate run recompiles the full dependency graph (warm toolchain, cold deps; clippy and test don't share artifacts), ~2 min total. Fine for now; dependency/target caching is a deferred speed item.
  • GITEA_TOKEN must retain release scope; if an instance policy narrows it, the release publish falls back to a repo-scoped PAT secret.

Alternatives considered

  • Run on the runner host's nix. Rejected — the probe showed steps run in a container where host nix is unreachable.
  • Install nix per-job in the default image. Works but cold every run (slow) and throwaway once the image exists; rejected in favour of the baked image.
  • catthehacker or bare nixos/nix as the base. catthehacker is a multi-GB runner emulation we don't need; bare nixos/nix lacks sleep/bash/node and won't start. node:22-bookworm-slim is the small, contract-satisfying middle (user's suggestion).
  • A standard rust:1.95 CI image instead of the flake. Simpler in CI but a second toolchain definition (drift) — counter to the unify-with-dev goal.
  • A third-party Gitea release action. Avoided; the API + auto token keep the release self-contained and debuggable.

Deferred / out of scope (tracked, step by step)

  • D1 matrix: only macOS remains (x86_64 + aarch64). The four non-macOS targets now ship via cargo-zigbuild (see the 2026-06-13 amendment); macOS needs Apple's SDK (osxcross + private SDK, or a Mac runner).
  • D3 packaging: Homebrew / Scoop / winget / cargo-binstall manifests (and binstall-friendly asset naming/archives).
  • Tier 4 (PTY E2E): still unwired (requirements.md TT4); the gate runs tiers 1–3 only, so TT5 ("CI runs all tiers on Linux/macOS/Windows") is only partially met (Linux, tiers 1–3).
  • CI speed: dependency/target caching (cargo-chef into the image, or actions/cache), and image slimming / dind-cached base-pull caching.
  • Website deploy: the static site → Cloudflare via Gitea Actions (a separate, simpler workflow on the website branch).
  • fmt gate: revisit on main once a rustfmt style is chosen.

Relationship to other decisions

  • Builds on ADR-ci-002 (nix flake dev + build env). This ADR adds the musl-target/cc to that flake and consumes it from CI.
  • Advances requirements.md: TT5 (CI runs the tiers: Linux, tiers 1–3), D2 (static binary: Linux, done), D1/D3 (partial/deferred).
  • Mirrors the website subproject's separate ADR namespace and its static→Cloudflare-via-Gitea-Actions deployment posture (ADR-website-001).
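
The musl cc wiring this ADR adds to the flake relies on two environment-variable naming patterns, CC_<target> and CARGO_TARGET_<TARGET>_LINKER. The Python sketch below shows how those names are derived from a target triple. It uses the underscore spelling of the CC variable; the ADR does not say which spelling the flake sets, so that choice and the helper name `musl_env_names` are assumptions.

```python
from typing import Dict


def musl_env_names(target: str) -> Dict[str, str]:
    """Derive the C-compiler and linker environment variable names for a target.

    Cargo's linker variable uses the triple upper-cased with '-' replaced by '_'.
    The CC variable here uses the lower-case underscore form of the triple.
    """
    underscored = target.replace("-", "_")
    return {
        "cc": f"CC_{underscored}",
        "linker": f"CARGO_TARGET_{underscored.upper()}_LINKER",
    }
```

For x86_64-unknown-linux-musl this gives CC_x86_64_unknown_linux_musl and CARGO_TARGET_X86_64_UNKNOWN_LINUX_MUSL_LINKER, both of which would point at the musl cc provided by the devShell.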