singularity-forge/docs/specs/sf-self-deploy.md
Mikael Hugo e50f2c0af1
chore: align workflow + docs with k3s-only deploy path
Followup to the dead-docker delete: remove `docker:vega:*` package.json
scripts, the projects-view upgrade button, and the docker-compose-vega
sections of sf-self-deploy.md. Self-deploy workflow stays k3s-only
(build → push → deploy-test → deploy-prod via kubectl set image).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 01:04:05 +02:00

SF Self-Deploy Contract

SF deploys as a long-running server owned by the deployment platform, not by an interactive TUI session. Forgejo is the build authority: it verifies a source revision, publishes an immutable OCI image, then rolls the test deployment before promoting the same image to prod.

Purpose

The server must be reloadable without humans killing old processes by hand, and the CLI/web surfaces must be able to prove which build they are controlling. The artifact boundary is therefore:

  1. source revision in git
  2. Forgejo build/test result
  3. OCI image tag or digest
  4. dist/sf-release-manifest.json
  5. /api/healthz, /api/ready, and /api/version probes
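
As an illustration of how a CLI or web surface might prove which build it controls, the sketch below compares the identity recorded in the release manifest with the identity a running server reports. The field names revision and imageDigest and the helper identityMismatches are hypothetical; this spec names the manifest file and the /api/version probe but does not define their schemas.

```typescript
// Hypothetical shape: the spec does not define the manifest or /api/version fields.
interface ReleaseIdentity {
  revision: string; // git commit the image was built from (assumed field)
  imageDigest: string; // OCI digest such as "sha256:..." (assumed field)
}

// Returns the names of the fields on which the running server disagrees
// with the manifest. An empty array means both describe the same build.
function identityMismatches(
  manifest: ReleaseIdentity,
  reported: ReleaseIdentity,
): string[] {
  const mismatches: string[] = [];
  if (manifest.revision !== reported.revision) mismatches.push("revision");
  if (manifest.imageDigest !== reported.imageDigest) mismatches.push("imageDigest");
  return mismatches;
}
```

A caller would load dist/sf-release-manifest.json, fetch /api/version from the server it is about to control, and refuse to proceed if the returned list is non-empty.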

Build Authority

Forgejo runs .forgejo/workflows/self-deploy.yml on pushes to main and on manual dispatch. The required gates are:

  • npm ci
  • npm --prefix web ci
  • npm run build:core
  • npm run build:web-host
  • npm run typecheck:extensions
  • npm run test:unit
  • build docker/Dockerfile.sf-server
  • generate dist/sf-release-manifest.json
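
The npm gates above can be reproduced locally with a small fail-fast runner. This is a sketch, not the contents of self-deploy.yml: it assumes the gates run in the listed order and stop at the first failure, and it leaves out the image build and manifest generation because their exact commands are not given here. The names NPM_GATES, Runner, and runGates are illustrative.

```typescript
import { spawnSync } from "node:child_process";

// The npm gates from the list above, in the order they are listed.
const NPM_GATES: string[][] = [
  ["npm", "ci"],
  ["npm", "--prefix", "web", "ci"],
  ["npm", "run", "build:core"],
  ["npm", "run", "build:web-host"],
  ["npm", "run", "typecheck:extensions"],
  ["npm", "run", "test:unit"],
];

// A runner executes one command and returns its exit code.
type Runner = (argv: string[]) => number;

// Default runner: inherits stdio so gate output reaches the terminal or CI log.
const spawnRunner: Runner = (argv) =>
  spawnSync(argv[0], argv.slice(1), { stdio: "inherit" }).status ?? 1;

// Runs the gates in order and returns the first failing command,
// or null when every gate exits with code 0.
function runGates(gates: string[][], run: Runner = spawnRunner): string[] | null {
  for (const argv of gates) {
    if (run(argv) !== 0) return argv;
  }
  return null;
}
```

Calling runGates(NPM_GATES) from the repository root runs the real commands; passing a stub runner makes the ordering testable without touching npm.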

The image builder is Docker/BuildKit. The default Forgejo image repository is registry.infra.centralcloud.com/singularity/sf-server, matching the in-cluster registry host already used by GitOps workloads. The deployment contract starts at the OCI image plus release manifest.
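
Because promotion works on an immutable image, a deploy script would typically address the image by digest rather than by a mutable tag. The sketch below builds such a reference against the default repository named above. The helper digestRef is hypothetical, and restricting digests to sha256 is an assumption rather than something this spec requires.

```typescript
const DEFAULT_REPOSITORY = "registry.infra.centralcloud.com/singularity/sf-server";

// Builds an immutable reference of the form <repository>@sha256:<64 hex chars>.
// Throws when the digest is not a well-formed sha256 digest, so a tag such as
// "latest" cannot slip through by accident.
function digestRef(digest: string, repository: string = DEFAULT_REPOSITORY): string {
  if (!/^sha256:[0-9a-f]{64}$/.test(digest)) {
    throw new Error(`not a sha256 digest: ${digest}`);
  }
  return `${repository}@${digest}`;
}
```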

Server Runtime

The server image starts:

node /opt/sf/dist/loader.js server /workspace --host 0.0.0.0 --port 4000

The web host receives SF_RELEASE_MANIFEST, SF_WEB_PROJECT_CWD, SF_WEB_HOST, and SF_WEB_PORT in its environment. Probes are unauthenticated so Kubernetes, Traefik, and Forgejo can verify rollouts without a browser token.
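
A minimal sketch of how the web host could validate these four variables at startup. The WebHostEnv type and readWebHostEnv function are hypothetical. The spec does not say whether SF_RELEASE_MANIFEST holds a path or inline JSON, so the sketch treats it as an opaque string, and failing on a missing variable is an assumption.

```typescript
interface WebHostEnv {
  releaseManifest: string; // SF_RELEASE_MANIFEST, treated as opaque here
  projectCwd: string; // SF_WEB_PROJECT_CWD
  host: string; // SF_WEB_HOST
  port: number; // SF_WEB_PORT, parsed as a TCP port
}

// Reads and validates the web host environment. Throws on a missing
// variable or a port outside 1..65535.
function readWebHostEnv(env: Record<string, string | undefined>): WebHostEnv {
  const need = (name: string): string => {
    const value = env[name];
    if (!value) throw new Error(`missing ${name}`);
    return value;
  };
  const port = Number(need("SF_WEB_PORT"));
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error("SF_WEB_PORT must be a TCP port number");
  }
  return {
    releaseManifest: need("SF_RELEASE_MANIFEST"),
    projectCwd: need("SF_WEB_PROJECT_CWD"),
    host: need("SF_WEB_HOST"),
    port,
  };
}
```

In a real process this would be called once as readWebHostEnv(process.env) before the listener is bound.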

Vega runs this image under k3s. The GitOps manifests live in /srv/infra/clusters/default/tenants/hugo/apps/sf-server/ and define one shared SF webserver deployment plus a test deployment. The shared webserver owns project switching and repo-scoped worker/session state; it is not one webserver per repo.

The pod mounts persistent SF state and the host repo workspace paths required by the project picker. Runtime source mutation is not the deploy mechanism. A new git revision becomes live only after Forgejo builds an image, rolls the test deployment, probes it, then promotes the same image to prod.

Promotion

Test must roll before prod:

  1. set test deployment image to the new digest
  2. wait for rollout
  3. call /api/healthz
  4. call /api/ready
  5. call /api/version
  6. promote the same image digest to prod
  7. repeat the same probes
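
The sequence above can be written down as data, which makes the ordering and the same-image requirement easy to check. The sketch below is illustrative only: the Step type, rollSteps, and promotionPlan are hypothetical names, and waiting for the prod rollout before probing it is an assumption that the list implies but does not state.

```typescript
type Target = "test" | "prod";

type Step =
  | { kind: "set-image"; target: Target; imageRef: string }
  | { kind: "wait-rollout"; target: Target }
  | { kind: "probe"; target: Target; path: string };

// Probe order taken from the list above.
const PROBE_PATHS = ["/api/healthz", "/api/ready", "/api/version"];

// Steps for one deployment: set the image, wait for rollout, then probe.
function rollSteps(target: Target, imageRef: string): Step[] {
  return [
    { kind: "set-image", target, imageRef },
    { kind: "wait-rollout", target },
    ...PROBE_PATHS.map((path): Step => ({ kind: "probe", target, path })),
  ];
}

// Full promotion: test first, then prod with the identical image reference.
function promotionPlan(imageRef: string): Step[] {
  return [...rollSteps("test", imageRef), ...rollSteps("prod", imageRef)];
}
```

An executor would run the steps in order and abort on the first failure, so prod is never touched unless every test step has passed.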

Prod must not install latest from npm during rollout. Runtime auto-update means the deployment controller rolls a verified image; it does not mean the running process mutates its own package tree.

Reload Model

In k3s, rollout replacement is the reload mechanism. /api/healthz and /api/ready return 503 during shutdown so the service can drain before the old pod exits. Long term, CLI commands should call the server RPC surface by default when a healthy server owns the project, while local sf server remains the bootstrap and recovery path.
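
A minimal sketch of the drain behaviour described above, using only node:http. It is not the SF server's actual implementation: the SIGTERM handling, the drainMs delay, the 200 status for a healthy probe, and the names probeStatus, createProbeServer, and installDrain are all assumptions made for illustration.

```typescript
import { createServer, Server } from "node:http";

// Shared flag flipped once shutdown begins.
const state = { shuttingDown: false };

// Status for /api/healthz and /api/ready: 503 once shutdown has begun.
function probeStatus(isShuttingDown: boolean): number {
  return isShuttingDown ? 503 : 200;
}

// Serves only the two probe paths; everything else is 404 in this sketch.
function createProbeServer(): Server {
  return createServer((req, res) => {
    if (req.url === "/api/healthz" || req.url === "/api/ready") {
      res.writeHead(probeStatus(state.shuttingDown)).end();
      return;
    }
    res.writeHead(404).end();
  });
}

// On SIGTERM, keep listening for drainMs so probes observe 503 and traffic
// is routed away, then stop accepting connections and exit once idle.
function installDrain(server: Server, drainMs: number): void {
  process.once("SIGTERM", () => {
    state.shuttingDown = true;
    setTimeout(() => server.close(() => process.exit(0)), drainMs);
  });
}
```

Keeping the listener open during the drain window matters: if the server closed immediately, probes would see refused connections instead of the 503 the contract specifies.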

Open Work

  • Wire /api/version into the web footer/admin panel.
  • Add an RPC smoke probe once the stable server RPC endpoint is finalized.
  • Move any remaining Forgejo deployment target defaults into /srv/infra GitOps values once the app is fully managed there.