rdbms-playground/docs/ci/adr/20260612-adr-ci-001.md
claude@clouddev1 5869eec4f4 docs(ci): ADR-ci-003 — cross-platform release builds (D1 matrix)
Record the multi-platform build strategy as its own decision: cargo-zigbuild
for the four non-macOS targets, the static/standalone posture per platform,
the Windows synchronization stub, the test->build matrix workflow, and the
macOS deferral with its licensing rationale (the public CI image can't carry
the SDK). Shrinks the ci-001 amendment to a pointer; updates the index.

Runtime-verified by the user: Linux x86_64 + Windows aarch64 run correctly.
2026-06-13 19:11:29 +00:00

# ADR-ci-001: CI + release pipeline on Gitea Actions
## Status
**Accepted (2026-06-12); implemented the same day on the `ci` branch.** Every
fork below was settled with the user as the pipeline was built, and each stage
was verified live before acceptance:
- a throwaway probe workflow established how the runner executes jobs;
- the CI image was built and checked locally (runner contract, warm devShell);
- the gate ran green (**clippy clean; 2424 tests pass / 0 fail / 1 intentional
ignored doctest**);
- the release was exercised end-to-end — tag `v0.0.0-citest2` published a Gitea
release carrying the static binary (~10 MB) and its `.sha256`.
This ADR records the **CI/release pipeline**. The **dev/build environment it
runs on** — the nix flake (devShell + reproducible build, pinned Rust 1.95.0)
— is **ADR-ci-002** (relocated here from main's ADR-0049); this ADR builds on
it rather than restating it.
> **Namespacing.** Kept in `docs/ci/adr/` (id `ADR-ci-001`), disjoint from
> `main`'s integer ADR sequence, mirroring the website subproject's
> `docs/website/adr/`. This avoids the cross-branch number collisions that
> previously forced website ADRs to be renumbered (see that namespace's
> history note and ADR-0000 "Numbering discipline").
## Amendment — 2026-06-13: D1 matrix (non-macOS)
§3 (Release) below describes the original **single-target** (x86_64 Linux) job.
The release is now a **`test` → `build` matrix** over the four non-macOS D1
targets (Linux + Windows × x86_64/aarch64), cross-built with `cargo-zigbuild`.
The full decision — tooling, targets, the Windows `synchronization` stub, the
matrix shape, and the macOS deferral with its licensing rationale — is recorded
in its own record: **[ADR-ci-003](20260613-adr-ci-003.md)**.
## Context
The project is near feature-complete and needs CI (`requirements.md` **TT5**;
the **CI** item in the deferred list) and a release path for its distributed
binaries (**D1**/**D2**/**D3**). The self-hosted Gitea instance
(`git.lazyeval.net`) has its Actions runner freshly set up — a first-time
in-anger use — with a DinD-capable setup and a reusable `docker-build`
template, exercised by a handful of sample workflows.
The starting constraints, and what the probe found:
- The runner label is **`ci-public`**. A throwaway probe
(`ci-probe.yaml`, since removed) established that **jobs run *inside* a
container** — `ghcr.io/catthehacker/ubuntu:act-22.04` by default, as **root**
— and therefore the runner *host's* nix is **not** on the steps' PATH
(`nix NOT on PATH`, `no /nix`). A custom job `container:` *can* be pulled
(it pulled `nixos/nix:latest`), but the runner keeps job containers alive
with `entrypoint: /bin/sleep` and runs JS actions (e.g. `actions/checkout`)
with `node`, so the container must provide **`sleep` + `bash` + `node`** —
a bare `nixos/nix` image has none and fails to start.
- The reusable template only does `docker build`; it neither runs a Rust gate
nor pushes images nor uploads release assets — so a Rust pipeline can't just
call it.
- The whole motivation (per the user) is for CI to use the project's **nix
flake** for its tools rather than relying on whatever the build machine has
— i.e. **one toolchain definition shared by dev and CI**.
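The job-container contract the probe uncovered (`sleep` + `bash` + `node` present in the image) can be checked with a few lines of shell. This is a minimal sketch, not a reproduction of the removed `ci-probe.yaml`; the helper name `missing_tools` is hypothetical.

```shell
# Print each named tool that is NOT on PATH, one per line.
# Hypothetical helper; the original probe workflow is not reproduced here.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# The runner needs these three inside the job container:
#   missing_tools sleep bash node
# The probe's nix finding corresponds to:
#   missing_tools nix
```

Empty output means every listed tool was found; any printed name is a tool the candidate image would have to add.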
## Decision
### 1. Toolchain delivery — a baked nix CI image
CI gets its toolchain from a **purpose-built job-container image**, not from
host nix and not by installing nix per-job:
- **Base `node:22-bookworm-slim`.** Debian slim already provides `bash` +
coreutils (`sleep`); the `node` tag adds the actions runtime. This satisfies
the act_runner job-container contract at a fraction of the size of the
catthehacker runner images (chosen on the user's prompt to avoid those
multi-GB images), and far more reliably than a bare `nixos/nix` (which can't
start). `.gitea/ci-image/Dockerfile`.
- **Single-user nix on top**, flakes enabled, with the **flake's devShell
pre-warmed** (`nix develop` realizes nixpkgs + the pinned Rust toolchain +
`cargo-sweep` + the musl cc into the store). CI then runs `nix develop -c …`
against a warm store — the *same* pinned toolchain as dev (ADR-ci-002),
reaching a ready toolchain in ~1.4 s.
- **Built + pushed by `build-ci-image.yaml`** via the DinD service to the
Gitea container registry as `git.lazyeval.net/<owner>/rdbms-playground-ci`,
a **public** package (anonymous pull, no gate-side credentials). It runs only
when an image input changes (Dockerfile / `flake.nix` / `flake.lock` /
`rust-toolchain.toml`) or on manual dispatch.
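The rebuild rule in the last bullet amounts to a membership test over the changed paths. The sketch below illustrates that logic only. The real workflow presumably expresses it as a `paths:` filter; the helper name `needs_image_rebuild` is hypothetical, and the three flake/toolchain files are assumed to live at the repository root.

```shell
# Return 0 if any changed path is an image input, 1 otherwise.
# Hypothetical helper mirroring the trigger list above; root-level
# locations for the flake and toolchain files are an assumption.
needs_image_rebuild() {
  for path in "$@"; do
    case "$path" in
      .gitea/ci-image/Dockerfile|flake.nix|flake.lock|rust-toolchain.toml)
        return 0 ;;
    esac
  done
  return 1
}

# Example: needs_image_rebuild src/main.rs flake.lock   (status 0: rebuild)
```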
### 2. Gate — `ci.yaml`
On branch pushes and PRs, a single job runs **inside the CI image**:
`nix develop -c cargo clippy --all-targets -- -D warnings` then
`nix develop -c cargo test --no-fail-fast`.
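The two gate commands can be sketched as one shell function. This is an illustration under assumptions, not the contents of `ci.yaml`: the `RUN` variable and the `gate` function name are hypothetical, added so the sequence can be dry-run without nix installed.

```shell
# Command prefix that enters the flake devShell.
# Set RUN=echo to print the commands instead of running them.
RUN="${RUN:-nix develop -c}"

gate() {
  # $RUN is intentionally unquoted so "nix develop -c" splits into words.
  # The test step only runs if clippy passes.
  $RUN cargo clippy --all-targets -- -D warnings &&
    $RUN cargo test --no-fail-fast
}
```

With `RUN=echo`, calling `gate` prints the two cargo invocations in order, which is a cheap way to review the gate before pushing.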
**`fmt` is deliberately not gated.** The tree isn't clean under stock
`rustfmt` (~100 files would change; no `rustfmt.toml` is committed) and
reformatting would churn blame across the in-flight website branch and ongoing
`main` work — so, by user decision, the gate is **clippy + test** and fmt is
revisited on `main` (also recorded in ADR-ci-002).
### 3. Release — `release.yaml`
On a `v*` tag, one job in the CI image:
1. **tests** (`cargo test`) — so a tag can never publish untested code, even
one pointing at a never-gated commit (user choice over relying solely on the
branch gate);
2. **builds the static binary** for **`x86_64-unknown-linux-musl`** (D2:
single static binary, no runtime deps). The glibc/nix-store build is
non-portable; the musl target with `crt-static` is fully static. rusqlite's
`bundled` SQLite C is compiled by a **musl `cc`** (`pkgsCross.musl64`) wired
into the flake devShell via `CC_<target>` + `CARGO_TARGET_<TARGET>_LINKER`;
`[profile.release] strip = "symbols"` trims it (~13 MB → ~10 MB);
3. **publishes** the binary + a `.sha256` to a Gitea release via the API and
the auto-provided **`GITEA_TOKEN`** — no third-party action (just `curl` +
`node`, both in the image).
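Steps 2 and 3 can be sketched in shell. This is a hedged outline rather than the actual `release.yaml`: the function names and the `API_REPO` variable (the repository's API URL, ending in `/api/v1/repos/<owner>/<repo>`) are hypothetical. The endpoints follow Gitea's documented release API as I understand it.

```shell
# cc-style variable name for a target: CC_<target>, '-' replaced by '_'.
cc_env_name() {
  printf 'CC_%s\n' "$(printf '%s' "$1" | tr '-' '_')"
}

# Cargo linker variable: CARGO_TARGET_<TARGET>_LINKER, upper-cased.
linker_env_name() {
  printf 'CARGO_TARGET_%s_LINKER\n' "$(printf '%s' "$1" | tr 'a-z-' 'A-Z_')"
}

# Write <file>.sha256 next to the artifact, in `sha256sum -c` format.
write_checksum() {
  ( cd "$(dirname "$1")" &&
    sha256sum "$(basename "$1")" > "$(basename "$1").sha256" )
}

# Create a release for a tag and attach one file (not called here).
# Assumes API_REPO and GITEA_TOKEN are set and the tag has no quotes in it.
publish_asset() {
  local tag=$1 file=$2 id
  id=$(curl -fsS -X POST "$API_REPO/releases" \
        -H "Authorization: token $GITEA_TOKEN" \
        -H 'Content-Type: application/json' \
        -d "{\"tag_name\":\"$tag\",\"name\":\"$tag\"}" |
       node -pe 'JSON.parse(require("fs").readFileSync(0, "utf8")).id')
  curl -fsS -X POST "$API_REPO/releases/$id/assets?name=$(basename "$file")" \
    -H "Authorization: token $GITEA_TOKEN" \
    -F "attachment=@$file"
}
```

For `x86_64-unknown-linux-musl` the two name helpers yield `CC_x86_64_unknown_linux_musl` and `CARGO_TARGET_X86_64_UNKNOWN_LINUX_MUSL_LINKER`. Writing the checksum with a bare file name keeps `sha256sum -c` usable after download, whatever directory the build ran in.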
### 4. Triggers — branch vs tag hygiene
- Gate and image-build are scoped to **branch** pushes (`branches: ['**']`).
Tag pushes ignore `paths:` filters and would otherwise spuriously rebuild the
unchanged image and re-gate an already-gated commit; the branch filter
excludes tags. **`release.yaml` owns tags** (`tags: ['v*']`).
- Pushing commits + a tag together still gates the commits (via the branch
ref) and releases (via the tag ref) — no lost coverage, no duplicate runs.
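The branch/tag split can be pictured as a routing function over the pushed ref. The real selection is made by each workflow's `branches:` and `tags:` filters; the sketch below only illustrates the outcome, and `workflow_for_ref` is a hypothetical name.

```shell
# Map a pushed ref to the workflow that should handle it (illustrative only).
workflow_for_ref() {
  case "$1" in
    refs/tags/v*)  echo release ;;   # release.yaml owns v* tags
    refs/heads/*)  echo gate ;;      # gate and image-build see branches only
    *)             echo none ;;      # other tags trigger nothing
  esac
}
```

A push carrying both a branch update and a `v*` tag delivers two refs, so it reaches the gate once and the release once.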
### 5. Auth
- **Image push:** a dedicated PAT with `write:package`, supplied as the
`REGISTRY_USERNAME` / `REGISTRY_TOKEN` Actions secrets (the package owner
must match the token's user — an `oli`-namespace push with a different user
is refused with `reqPackageAccess`).
- **Release publish:** the auto `GITEA_TOKEN` (repo/release scope).
### 6. Scope this iteration — Linux x86_64, step by step
The user's target is the full **D1** matrix, approached incrementally. This
iteration ships **Linux x86_64 only**; the rest is deferred (below).
## Consequences
- **One toolchain, dev and CI.** They build through the same flake and cannot
drift. The image rebuilds only when the flake, toolchain, or Dockerfile changes.
- **D2 is met on Linux.** The release artifact is a genuinely static,
stripped musl binary that runs with no runtime dependencies.
- **DinD is per-job (no layer cache across runs),** so every `build-ci-image`
run rebuilds from scratch (~6 min). Acceptable at its trigger frequency;
base-pull caching via the `dind-cached` proxy variant is a possible later
optimisation.
- **The CI image is ~5.5 GB+** (the Rust toolchain closure, now also musl).
Pulled once per runner and cached; slimming (multi-stage, prune) is optional.
- **Every gate run recompiles the full dependency graph** (warm *toolchain*,
cold *deps*; clippy and test don't share artifacts), ~2 min total. Fine for
now; dependency/`target` caching is a deferred speed item.
- **`GITEA_TOKEN` must retain release scope;** if an instance policy narrows
it, the release publish falls back to a repo-scoped PAT secret.
## Alternatives considered
- **Run on the runner host's nix.** Rejected — the probe showed steps run in a
container where host nix is unreachable.
- **Install nix per-job in the default image.** Works but cold every run
(slow) and throwaway once the image exists; rejected in favour of the baked
image.
- **`catthehacker` or bare `nixos/nix` as the base.** catthehacker is a
multi-GB runner emulation we don't need; bare `nixos/nix` lacks
`sleep`/`bash`/`node` and won't start. `node:22-bookworm-slim` is the small,
contract-satisfying middle (user's suggestion).
- **A standard `rust:1.95` CI image instead of the flake.** Simpler in CI but a
*second* toolchain definition (drift) — counter to the unify-with-dev goal.
- **A third-party Gitea release action.** Avoided; the API + auto token keep
the release self-contained and debuggable.
## Deferred / out of scope (tracked, step by step)
- **D1 matrix:** only **macOS** remains (x86_64 + aarch64). The four non-macOS
targets shipped via cargo-zigbuild (see the 2026-06-13 amendment); macOS needs
Apple's SDK (osxcross + private SDK, or a Mac runner).
- **D3 packaging:** Homebrew / Scoop / winget / `cargo-binstall` manifests
(and binstall-friendly asset naming/archives).
- **Tier 4 (PTY E2E):** still unwired (`requirements.md` **TT4**); the gate runs
tiers 1–3 only, so **TT5** ("CI runs all tiers on Linux/macOS/Windows") is
partially met: Linux, tiers 1–3.
- **CI speed:** dependency/`target` caching (cargo-chef into the image, or
`actions/cache`), and image slimming / `dind-cached` base-pull caching.
- **Website deploy:** the static site → Cloudflare via Gitea Actions (a
separate, simpler workflow on the website branch).
- **fmt gate:** revisit on `main` once a `rustfmt` style is chosen.
## Relationship to other decisions
- **Builds on ADR-ci-002** (nix flake dev + build env). This ADR adds the
musl-target/cc to that flake and consumes it from CI.
- **Advances `requirements.md`:** **TT5** (CI runs the tiers: Linux, 1–3),
**D2** (static binary — Linux, done), **D1**/**D3** (partial/deferred).
- **Mirrors the website subproject's** separate ADR namespace and its
static→Cloudflare-via-Gitea-Actions deployment posture (ADR-website-001).