Files
rdbms-playground/docs/ci/adr/20260612-adr-ci-001.md
T
claude@clouddev1 298475b326
build-ci-image / build (push) Successful in 9m56s
ci / gate (push) Successful in 2m47s
release / test (push) Successful in 2m18s
release / build (aarch64-pc-windows-gnullvm) (push) Successful in 3m31s
release / build (aarch64-unknown-linux-musl) (push) Successful in 3m52s
release / build (x86_64-pc-windows-gnu) (push) Successful in 4m14s
release / build (x86_64-unknown-linux-musl) (push) Successful in 3m25s
ci: D1 release matrix over the four non-macOS targets
release.yaml becomes test (once, host) -> build (matrix) over the four
cargo-zigbuild targets; each matrix job uploads its binary + .sha256 to
the shared release (idempotent create-or-get). Records the expansion in
ADR-ci-001 (2026-06-13 amendment); macOS stays deferred.
2026-06-13 12:14:49 +00:00

216 lines
11 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# ADR-ci-001: CI + release pipeline on Gitea Actions
## Status
**Accepted (2026-06-12); implemented the same day on the `ci` branch.** Every
fork below was settled with the user as the pipeline was built, and each stage
was verified live before acceptance:
- a throwaway probe workflow established how the runner executes jobs;
- the CI image was built and checked locally (runner contract, warm devShell);
- the gate ran green (**clippy clean; 2424 tests pass / 0 fail / 1 intentional
ignored doctest**);
- the release was exercised end-to-end — tag `v0.0.0-citest2` published a Gitea
release carrying the static binary (~10 MB) and its `.sha256`.
This ADR records the **CI/release pipeline**. The **dev/build environment it
runs on** — the nix flake (devShell + reproducible build, pinned Rust 1.95.0)
— is **ADR-ci-002** (relocated here from main's ADR-0049); this ADR builds on
it rather than restating it.
> **Namespacing.** Kept in `docs/ci/adr/` (id `ADR-ci-001`), disjoint from
> `main`'s integer ADR sequence, mirroring the website subproject's
> `docs/website/adr/`. This avoids the cross-branch number collisions that
> previously forced website ADRs to be renumbered (see that namespace's
> history note and ADR-0000 "Numbering discipline").
## Amendment — 2026-06-13: D1 matrix expanded (non-macOS targets)
The release now builds the **four non-macOS D1 targets**, all cross-compiled
from the Linux x86_64 runner with **`cargo-zigbuild`** (Zig's bundled clang +
libc as one universal cross cc/linker — including the `cc`-crate compile of
rusqlite's bundled SQLite C — added to the flake devShell, replacing the
single-target musl cc):
- `x86_64-unknown-linux-musl`, `aarch64-unknown-linux-musl` — static (D2);
- `x86_64-pc-windows-gnu`, `aarch64-pc-windows-gnullvm` — standalone `.exe`.
`release.yaml` became a **`test` (once, host) → `build` (matrix over the four
targets)** workflow; each matrix job uploads its artifact + `.sha256` to the
shared release (idempotent create-or-get).
**Windows link fix:** Rust's std links `-lsynchronization` (WaitOnAddress
thread-parking), an import lib that rust-overlay's toolchain doesn't ship and
Zig's mingw lacks. Those symbols are forwarded by `kernel32` (already linked),
so an **empty stub** `libsynchronization.a` (committed at `ci/winstub/`, wired
via `.cargo/config.toml` for the Windows targets only) satisfies the linker.
Verified locally: all four build; the Linux binaries are statically linked; the
Windows artifacts are valid PE32+ (x86-64 / Aarch64) — not yet runtime
smoke-tested on Windows.
**macOS stays deferred** (see Deferred): `arboard`→AppKit needs Apple's SDK,
which a Linux runner can't supply cleanly — and the CI image is *public*, so the
SDK can't be baked in even if the licensing grey area were accepted. macOS is
its own step (osxcross + a private SDK, or a real Mac runner).
## Context
The project is near feature-complete and needs CI (`requirements.md` **TT5**;
the **CI** item in the deferred list) and a release path for its distributed
binaries (**D1**/**D2**/**D3**). The self-hosted Gitea instance
(`git.lazyeval.net`) has its Actions runner freshly set up — a first-time
in-anger use — with a DinD-capable setup and a reusable `docker-build`
template, exercised by a handful of sample workflows.
The starting constraints, and what the probe found:
- The runner label is **`ci-public`**. A throwaway probe
(`ci-probe.yaml`, since removed) established that **jobs run *inside* a
container** — `ghcr.io/catthehacker/ubuntu:act-22.04` by default, as **root**
— and therefore the runner *host's* nix is **not** on the steps' PATH
(`nix NOT on PATH`, `no /nix`). A custom job `container:` *can* be pulled
(it pulled `nixos/nix:latest`), but the runner keeps job containers alive
with `entrypoint: /bin/sleep` and runs JS actions (e.g. `actions/checkout`)
with `node`, so the container must provide **`sleep` + `bash` + `node`** —
a bare `nixos/nix` image has none and fails to start.
- The reusable template only does `docker build`; it neither runs a Rust gate
nor pushes images nor uploads release assets — so a Rust pipeline can't just
call it.
- The whole motivation (per the user) is for CI to use the project's **nix
flake** for its tools rather than relying on whatever the build machine has
— i.e. **one toolchain definition shared by dev and CI**.
## Decision
### 1. Toolchain delivery — a baked nix CI image
CI gets its toolchain from a **purpose-built job-container image**, not from
host nix and not by installing nix per-job:
- **Base `node:22-bookworm-slim`.** Debian slim already provides `bash` +
coreutils (`sleep`); the `node` tag adds the actions runtime. This satisfies
the act_runner job-container contract at a fraction of the size of the
catthehacker runner images (chosen on the user's prompt to avoid those
multi-GB images), and far more reliably than a bare `nixos/nix` (which can't
start). `.gitea/ci-image/Dockerfile`.
- **Single-user nix on top**, flakes enabled, with the **flake's devShell
pre-warmed** (`nix develop` realizes nixpkgs + the pinned Rust toolchain +
`cargo-sweep` + the musl cc into the store). CI then runs `nix develop -c …`
against a warm store — the *same* pinned toolchain as dev (ADR-ci-002),
reaching a ready toolchain in ~1.4 s.
- **Built + pushed by `build-ci-image.yaml`** via the DinD service to the
Gitea container registry as `git.lazyeval.net/<owner>/rdbms-playground-ci`,
a **public** package (anonymous pull, no gate-side credentials). It runs only
when an image input changes (Dockerfile / `flake.nix` / `flake.lock` /
`rust-toolchain.toml`) or on manual dispatch.
### 2. Gate — `ci.yaml`
On branch pushes and PRs, a single job runs **inside the CI image**:
`nix develop -c cargo clippy --all-targets -- -D warnings` then
`nix develop -c cargo test --no-fail-fast`.
**`fmt` is deliberately not gated.** The tree isn't clean under stock
`rustfmt` (~100 files would change; no `rustfmt.toml` is committed) and
reformatting would churn blame across the in-flight website branch and ongoing
`main` work — so, by user decision, the gate is **clippy + test** and fmt is
revisited on `main` (also recorded in ADR-ci-002).
### 3. Release — `release.yaml`
On a `v*` tag, one job in the CI image:
1. **tests** (`cargo test`) — so a tag can never publish untested code, even
one pointing at a never-gated commit (user choice over relying solely on the
branch gate);
2. **builds the static binary** for **`x86_64-unknown-linux-musl`** (D2:
single static binary, no runtime deps). The glibc/nix-store build is
non-portable; the musl target with `crt-static` is fully static. rusqlite's
`bundled` SQLite C is compiled by a **musl `cc`** (`pkgsCross.musl64`) wired
into the flake devShell via `CC_<target>` + `CARGO_TARGET_<TARGET>_LINKER`;
`[profile.release] strip = "symbols"` trims it (~13 MB → ~10 MB);
3. **publishes** the binary + a `.sha256` to a Gitea release via the API and
the auto-provided **`GITEA_TOKEN`** — no third-party action (just `curl` +
`node`, both in the image).
### 4. Triggers — branch vs tag hygiene
- Gate and image-build are scoped to **branch** pushes (`branches: ['**']`).
Tag pushes ignore `paths:` filters and would otherwise spuriously rebuild the
unchanged image and re-gate an already-gated commit; the branch filter
excludes tags. **`release.yaml` owns tags** (`tags: ['v*']`).
- Pushing commits + a tag together still gates the commits (via the branch
ref) and releases (via the tag ref) — no lost coverage, no duplicate runs.
### 5. Auth
- **Image push:** a dedicated PAT with `write:package`, supplied as the
`REGISTRY_USERNAME` / `REGISTRY_TOKEN` Actions secrets (the package owner
must match the token's user — an `oli`-namespace push with a different user
is refused with `reqPackageAccess`).
- **Release publish:** the auto `GITEA_TOKEN` (repo/release scope).
### 6. Scope this iteration — Linux x86_64, step by step
The user's target is the full **D1** matrix, approached incrementally. This
iteration ships **Linux x86_64 only**; the rest is deferred (below).
## Consequences
- **One toolchain, dev and CI.** They build through the same flake and cannot
drift. New image rebuilds only when the flake/toolchain/Dockerfile change.
- **D2 is met on Linux.** The release artifact is a genuinely static,
stripped musl binary that runs with no runtime dependencies.
- **DinD is per-job (no layer cache across runs),** so every `build-ci-image`
run rebuilds from scratch (~6 min). Acceptable at its trigger frequency;
base-pull caching via the `dind-cached` proxy variant is a possible later
optimisation.
- **The CI image is ~5.5 GB+** (the Rust toolchain closure, now also musl).
Pulled once per runner and cached; slimming (multi-stage, prune) is optional.
- **Every gate run recompiles the full dependency graph** (warm *toolchain*,
cold *deps*; clippy and test don't share artifacts), ~2 min total. Fine for
now; dependency/`target` caching is a deferred speed item.
- **`GITEA_TOKEN` must retain release scope;** if an instance policy narrows
it, the release publish falls back to a repo-scoped PAT secret.
## Alternatives considered
- **Run on the runner host's nix.** Rejected — the probe showed steps run in a
container where host nix is unreachable.
- **Install nix per-job in the default image.** Works but cold every run
(slow) and throwaway once the image exists; rejected in favour of the baked
image.
- **`catthehacker` or bare `nixos/nix` as the base.** catthehacker is a
multi-GB runner emulation we don't need; bare `nixos/nix` lacks
`sleep`/`bash`/`node` and won't start. `node:22-bookworm-slim` is the small,
contract-satisfying middle (user's suggestion).
- **A standard `rust:1.95` CI image instead of the flake.** Simpler in CI but a
*second* toolchain definition (drift) — counter to the unify-with-dev goal.
- **A third-party Gitea release action.** Avoided; the API + auto token keep
the release self-contained and debuggable.
## Deferred / out of scope (tracked, step by step)
- **D1 matrix:** **macOS only** now (x86_64 + aarch64). The four non-macOS
targets shipped via cargo-zigbuild (see the 2026-06-13 amendment); macOS needs
Apple's SDK (osxcross + private SDK, or a Mac runner).
- **D3 packaging:** Homebrew / Scoop / winget / `cargo-binstall` manifests
(and binstall-friendly asset naming/archives).
- **Tier 4 (PTY E2E):** still unwired (`requirements.md` **TT4**); the gate runs
tiers 13 only, so **TT5** ("CI runs all tiers on Linux/macOS/Windows") is
partially met — Linux, tiers 13.
- **CI speed:** dependency/`target` caching (cargo-chef into the image, or
`actions/cache`), and image slimming / `dind-cached` base-pull caching.
- **Website deploy:** the static site → Cloudflare via Gitea Actions (a
separate, simpler workflow on the website branch).
- **fmt gate:** revisit on `main` once a `rustfmt` style is chosen.
## Relationship to other decisions
- **Builds on ADR-ci-002** (nix flake dev + build env). This ADR adds the
musl-target/cc to that flake and consumes it from CI.
- **Advances `requirements.md`:** **TT5** (CI runs the tiers — Linux, 13),
**D2** (static binary — Linux, done), **D1**/**D3** (partial/deferred).
- **Mirrors the website subproject's** separate ADR namespace and its
static→Cloudflare-via-Gitea-Actions deployment posture (ADR-website-001).