singularity-forge/docs/specs/sf-self-deploy.md
Mikael Hugo 6d8fc62243
fix: use shared sf webserver project config
2026-05-17 22:09:28 +02:00

SF Self-Deploy Contract

SF deploys as a long-running server owned by the deployment platform, not by an interactive TUI session. Forgejo is the build authority: it verifies a source revision, publishes an immutable OCI image, then rolls a test server before prod.

Purpose

The server must be reloadable without humans killing old processes by hand, and the CLI/web surfaces must be able to prove which build they are controlling. The artifact boundary is therefore:

  1. source revision in git
  2. Forgejo build/test result
  3. OCI image tag or digest
  4. dist/sf-release-manifest.json
  5. /api/healthz, /api/ready, and /api/version probes
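
One way a CLI or web surface could prove which build it is controlling is to compare the release manifest with what the running server reports. The sketch below assumes both expose a git revision and an image digest. The `BuildIdentity` shape and its field names are hypothetical, since the actual manifest and /api/version schemas are not specified here.

```typescript
// Hypothetical shape: the real manifest and /api/version fields may differ.
interface BuildIdentity {
  revision: string;    // git commit the image was built from
  imageDigest: string; // digest of the published OCI image
}

// Returns the names of the fields on which the two identities disagree.
// An empty result means the server is running the build the manifest describes.
function buildMismatches(manifest: BuildIdentity, served: BuildIdentity): string[] {
  const fields: (keyof BuildIdentity)[] = ["revision", "imageDigest"];
  return fields.filter((field) => manifest[field] !== served[field]);
}
```

A caller would treat any non-empty result as "unknown build" and refuse to issue control commands until the mismatch is resolved.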

Build Authority

Forgejo runs .forgejo/workflows/self-deploy.yml on pushes to main and on manual dispatch. The required gates are:

  • npm ci
  • npm --prefix web ci
  • npm run build:core
  • npm run build:web-host
  • npm run typecheck:extensions
  • npm run test:unit
  • build docker/Dockerfile.sf-server
  • generate dist/sf-release-manifest.json
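
The gate list above can be read as a fail-fast sequence. The following is a minimal sketch, assuming the gates run in the listed order and the first failure stops the pipeline. The `run` executor is a hypothetical injected function, and the image build and manifest generation steps are left out because their exact commands are not given in this document.

```typescript
// npm gates as listed above, in order.
const GATES: string[] = [
  "npm ci",
  "npm --prefix web ci",
  "npm run build:core",
  "npm run build:web-host",
  "npm run typecheck:extensions",
  "npm run test:unit",
];

// Hypothetical executor: runs one command and returns its exit code.
type Run = (command: string) => number;

// Runs the gates in order and stops at the first non-zero exit code.
// Returns the failing command, or null when every gate passed.
function firstFailingGate(gates: string[], run: Run): string | null {
  for (const gate of gates) {
    if (run(gate) !== 0) return gate;
  }
  return null;
}
```

Injecting `run` keeps the ordering logic testable without spawning processes; a real workflow would rely on the CI runner's own step semantics instead.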

The image builder can be Docker, BuildKit, Kaniko, or nix2container. SF does not depend on the builder implementation. The deployment contract starts at the OCI image plus release manifest.

Server Runtime

The server image starts:

node /opt/sf/dist/loader.js server /workspace --host 0.0.0.0 --port 4000

The web host receives SF_RELEASE_MANIFEST, SF_WEB_PROJECT_CWD, SF_WEB_HOST, and SF_WEB_PORT in its environment. Probes are unauthenticated so Kubernetes, Traefik, and Forgejo can verify rollouts without a browser token.
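
As an illustration of how a web host might consume these four variables, the sketch below reads them from an environment map and fails early when one is missing. The `WebHostConfig` shape and the function name are hypothetical, and treating SF_WEB_PORT as a decimal TCP port is an assumption.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical config shape for this sketch.
interface WebHostConfig {
  releaseManifest: string;
  projectCwd: string;
  host: string;
  port: number;
}

// Reads the four variables named above; throws if any is unset or empty.
function readWebHostConfig(env: Env): WebHostConfig {
  const need = (name: string): string => {
    const value = env[name];
    if (value === undefined || value === "") throw new Error(name + " is not set");
    return value;
  };
  // Assumption: the port is a whole number in the valid TCP range.
  const port = Number(need("SF_WEB_PORT"));
  if (!isFinite(port) || Math.floor(port) !== port || port < 1 || port > 65535) {
    throw new Error("SF_WEB_PORT must be an integer between 1 and 65535");
  }
  return {
    releaseManifest: need("SF_RELEASE_MANIFEST"),
    projectCwd: need("SF_WEB_PROJECT_CWD"),
    host: need("SF_WEB_HOST"),
    port,
  };
}
```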

On vega, the local production server may run from the live checkout while still being containerised:

npm run docker:vega:up

That profile runs one shared SF webserver. It:

  • mounts this SF checkout at /opt/sf
  • mounts the initial controlled repo at /workspace
  • mounts the repo parent at /workspaces and also at its real host path (/home/mhugo/code on vega)
  • persists ~/.sf
  • binds port 4000 to ${SF_VEGA_BIND:-127.0.0.1}

SF_WORKSPACE_DIR selects the initial repo; it defaults to this checkout for dogfooding. SF_WORKSPACES_DIR selects the parent directory available for repo switching and defaults to the parent of this SF checkout:

SF_WORKSPACE_DIR=/home/mhugo/code/other-repo SF_WORKSPACES_DIR=/home/mhugo/code npm run docker:vega:up

Set SF_VEGA_BIND to the vega Tailscale address when the server should be reachable over Tailscale. Do not bind to 0.0.0.0 unless a proxy or firewall owns access control.
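
The bind rule can be sketched as two small helpers: one that mirrors the shell default `${SF_VEGA_BIND:-127.0.0.1}`, and one that flags wildcard addresses so a launcher can demand an explicit opt-in. Both function names are hypothetical and are not taken from the SF scripts.

```typescript
// Mirrors ${SF_VEGA_BIND:-127.0.0.1}: unset or empty falls back to loopback.
function resolveVegaBind(env: Record<string, string | undefined>): string {
  const value = env["SF_VEGA_BIND"];
  return value === undefined || value === "" ? "127.0.0.1" : value;
}

// True when the address listens on every interface (IPv4 or IPv6 wildcard).
// A hypothetical launcher could refuse such a bind unless a proxy or
// firewall is known to own access control.
function isWildcardBind(address: string): boolean {
  return address === "0.0.0.0" || address === "::";
}
```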

On hosts without the Docker Compose plugin, npm run docker:vega:up uses scripts/run-vega-source-server.mjs to build docker/Dockerfile.source-server and run the equivalent docker run command directly. Either way, there is one SF server implementation and one shared webserver process, with repo-scoped worker/session state underneath it. Restarting the runner replaces the shared vega webserver; there is no separate container per repo.

Promotion

Test must roll before prod:

  1. set test deployment image to the new digest
  2. wait for rollout
  3. call /api/healthz
  4. call /api/ready
  5. call /api/version
  6. promote the same image digest to prod
  7. repeat the same probes
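
The sequence above can be expressed as data, which makes the "same digest, same probes" rule easy to check before anything is deployed. This is a minimal sketch: the `Step` type and function names are hypothetical, and how each step is executed (kubectl, an API call, a Forgejo job) is deliberately left out.

```typescript
type Target = "test" | "prod";

// Hypothetical step model for this sketch.
type Step =
  | { target: Target; action: "set-image"; digest: string }
  | { target: Target; action: "wait-for-rollout" }
  | { target: Target; action: "probe"; path: string };

// Probe endpoints named in this document.
const PROBE_PATHS: string[] = ["/api/healthz", "/api/ready", "/api/version"];

// Steps for one environment: set the image, wait, then run every probe.
function stepsFor(target: Target, digest: string): Step[] {
  const head: Step[] = [
    { target, action: "set-image", digest },
    { target, action: "wait-for-rollout" },
  ];
  const probes: Step[] = PROBE_PATHS.map((path): Step => ({ target, action: "probe", path }));
  return head.concat(probes);
}

// Test always rolls first; prod reuses the identical digest and probes.
function promotionPlan(digest: string): Step[] {
  return stepsFor("test", digest).concat(stepsFor("prod", digest));
}
```

An executor would walk the plan in order and abort on the first failed step, so prod is never touched when a test probe fails.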

Prod must not install latest from npm during rollout. Runtime auto-update means the deployment controller rolls a verified image; it does not mean the running process mutates its own package tree.
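
One guard that supports this rule is rejecting any image reference that is not pinned by digest, since a mutable tag such as latest can be re-pointed after verification. The check below is a sketch that assumes sha256 digests only; the function name is hypothetical.

```typescript
// An image reference is pinned when it ends in "@sha256:" followed by
// 64 lowercase hex characters. A bare tag is not pinned.
function isDigestPinned(imageRef: string): boolean {
  return /@sha256:[0-9a-f]{64}$/.test(imageRef);
}
```

A deployment step could call this before setting the prod image and fail the rollout when it returns false.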

Reload Model

For a source-mounted vega container, the foreground process is the staged Next standalone server at dist/web/standalone/server.js. Rebuild or restart the container after changing server/web code. In Kubernetes or k3s, rollout replacement is the reload mechanism. Long term, CLI commands should call the server RPC surface by default when a healthy server owns the project, while local sf server remains the bootstrap and recovery path.
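
The long-term CLI behaviour described above amounts to a small routing decision. The sketch below is one possible reading, not the implemented logic: the `ServerStatus` fields, the `forceLocal` flag, and the transport names are all hypothetical.

```typescript
type Transport = "server-rpc" | "local";

// Hypothetical summary of what a CLI might learn from probing a server.
interface ServerStatus {
  healthy: boolean;     // /api/healthz answered successfully
  ready: boolean;       // /api/ready answered successfully
  ownsProject: boolean; // the server reports that it controls this project
}

// Prefer the server RPC surface only when a healthy, ready server owns the
// project; otherwise fall back to the local bootstrap and recovery path.
function chooseTransport(status: ServerStatus | null, forceLocal: boolean): Transport {
  if (forceLocal || status === null) return "local";
  return status.healthy && status.ready && status.ownsProject ? "server-rpc" : "local";
}
```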

Open Work

  • Wire /api/version into the web footer/admin panel.
  • Add an RPC smoke probe once the stable server RPC endpoint is finalized.
  • Move the Forgejo workflow's deployment target names into /srv/infra GitOps values when the cluster manifests exist.