Operator-direction 2026-05-17 "we will never use mac": no compat preservation. Single-cutover replacement.
- new packages/daemon/src/systemd.ts: install/uninstall/status using systemctl --user + ~/.config/systemd/user/sf-server.service
- new packages/daemon/src/systemd.test.ts: ports launchd tests, same shape, mocked systemctl via RunCommandFn injection + SF_SYSTEMD_USER_DIR env override for real filesystem tests
- cli-main.ts: switch import + update help text + status messages
- index.ts: re-export systemd module (installSystemdUnit, uninstallSystemdUnit, systemdUnitStatus, generateUnit, getServicePath, SystemdStatus, SystemdUnitOptions)
- DELETED: launchd.ts (253 LOC), launchd.test.ts (379 LOC)
- docs/dev/drafts/M053-per-repo-supervisor.md: remove "launchd" mention
- CHANGELOG.md: document systemd-only install path
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
M053 Supervisor And Web Server Shape
M053 uses one real server: the SF web/Next.js process. It is the operator surface and read-model aggregator.
Enrolled repositories do not run their own web server and do not host a shared
workflow daemon. Each repo has its own supervised worker boundary for
sf autonomous (implemented by the existing non-TUI machine surface). On Linux
that boundary is a user-level systemd unit;
other platforms can add equivalent supervisor adapters. The worker writes a
small repo-local status projection, and the single web server reads those
projections.
Process Model
The operator server is singular. It serves the browser UI, exposes /api/swarms,
and reads ~/.sf/swarms.json to discover enrolled repositories.
Each repository worker is separate. It runs from that repo's working directory,
uses that repo's .sf/sf.db, and writes only that repo's .sf/ runtime files.
The worker may write .sf/status.projection.json using a temp-file write,
fsync, and rename, so readers never observe a partially written file. It must
not mutate another repo's DB or aggregate another repo's
doctor/self-feedback/ledger data.
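The temp-file, fsync, and rename sequence can be sketched roughly as follows. This is a minimal illustration, not the worker's actual code: the helper name writeProjectionAtomic and the temp-file naming scheme are assumptions made for the example.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical helper (not from the SF codebase): writes a repo's status
// projection so that readers see either the old file or the new one,
// never a partial write.
export function writeProjectionAtomic(repoDir: string, projection: unknown): string {
  const sfDir = path.join(repoDir, ".sf");
  fs.mkdirSync(sfDir, { recursive: true });

  const finalPath = path.join(sfDir, "status.projection.json");
  // Assumed naming scheme: the temp file lives in the same directory so the
  // rename stays on one filesystem.
  const tmpPath = `${finalPath}.${process.pid}.tmp`;

  const fd = fs.openSync(tmpPath, "w");
  try {
    fs.writeFileSync(fd, JSON.stringify(projection, null, 2) + "\n");
    fs.fsyncSync(fd); // flush file contents to disk before publishing
  } finally {
    fs.closeSync(fd);
  }

  // rename() replaces the target atomically on POSIX filesystems.
  fs.renameSync(tmpPath, finalPath);
  return finalPath;
}
```

Keeping the temp file next to the final file matters: a rename across filesystems is not atomic, so a temp file under a system temp directory would weaken the guarantee.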
The supervisor is an OS/process boundary, not the product brain. A systemd user unit (or equivalent adapter on other platforms) may restart a worker and expose process health, but the planning state remains repo-local and the web server remains the operator surface.
Status Projection
Each repo publishes .sf/status.projection.json with projectionVersion: 1.
The projection contains only the read-only fields the web dashboard needs:
active milestone, active slice, current unit, next unit, queue depth,
last-cycle outcome, writer timestamp, and coarse health.
The web reader treats missing, corrupt, or unknown-version projections as a per-repo degraded row. One broken projection must not break the dashboard.
The projection excludes self-feedback aggregation, full doctor reports, last-green-ledger details, cross-repo learning, and cross-repo dispatch. Those belong behind a future federation requirement, not M053.
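The per-repo degraded-row rule described in this section might look like the sketch below. The RepoRow type, the reason strings, and the function names are hypothetical; only the file path and projectionVersion: 1 come from this document.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical row shape for the dashboard; not the real web read model.
type RepoRow =
  | { repo: string; state: "ok"; projection: Record<string, unknown> }
  | { repo: string; state: "degraded"; reason: "missing" | "corrupt" | "unknown-version" };

export function readProjection(repoDir: string): RepoRow {
  const file = path.join(repoDir, ".sf", "status.projection.json");

  let raw: string;
  try {
    raw = fs.readFileSync(file, "utf8");
  } catch {
    // Missing or unreadable file: degrade this row only.
    return { repo: repoDir, state: "degraded", reason: "missing" };
  }

  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return { repo: repoDir, state: "degraded", reason: "corrupt" };
  }
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    return { repo: repoDir, state: "degraded", reason: "corrupt" };
  }

  const projection = parsed as Record<string, unknown>;
  if (projection.projectionVersion !== 1) {
    return { repo: repoDir, state: "degraded", reason: "unknown-version" };
  }
  return { repo: repoDir, state: "ok", projection };
}

// One broken repo yields one degraded row; the remaining rows are unaffected.
export function readAllProjections(repoDirs: string[]): RepoRow[] {
  return repoDirs.map(readProjection);
}
```

Because every failure path returns a value instead of throwing, the aggregate call cannot fail on a single bad projection.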
Registry Sync
sf-server owns swarm registry refresh. It scans configured
projects.scan_roots from ~/.sf/daemon.yaml and atomically rewrites
~/.sf/swarms.json on startup and every swarms.refresh_ms while the server is
running. Operators can run a one-shot refresh with sf-server --sync-swarms.
This replaces the old script/watchdog shape. There is no repo-local enrollment script and no nested meta-supervisor. The registry is a server-owned read model: repo add/remove/rename is picked up by scan roots, and the web server reads the single registry.
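A one-shot registry refresh could be sketched as below. Several details are assumptions made for illustration: that a repo counts as enrolled when it is a direct child of a scan root and contains a .sf directory, that the registry is a JSON object with a repos array, and the function names themselves. The real swarms.json schema and discovery rules are not specified here.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// List immediate subdirectories of a scan root; unreadable roots yield nothing.
function listDirs(root: string): string[] {
  try {
    return fs
      .readdirSync(root, { withFileTypes: true })
      .filter((entry) => entry.isDirectory())
      .map((entry) => path.join(root, entry.name));
  } catch {
    return [];
  }
}

// Assumed discovery rule: a directory is a repo if it contains ".sf".
export function discoverRepos(roots: string[]): string[] {
  const found = new Set<string>();
  for (const root of roots) {
    for (const dir of listDirs(root)) {
      if (fs.existsSync(path.join(dir, ".sf"))) found.add(dir);
    }
  }
  return Array.from(found).sort();
}

// Rewrite the registry via temp file plus rename so readers never see a
// half-written file. The { repos } shape is hypothetical.
export function syncRegistry(registryPath: string, roots: string[]): string[] {
  const repos = discoverRepos(roots);
  const tmpPath = `${registryPath}.${process.pid}.tmp`;
  fs.writeFileSync(tmpPath, JSON.stringify({ repos }, null, 2) + "\n");
  fs.renameSync(tmpPath, registryPath);
  return repos;
}
```

A periodic refresh would call the same function on a timer driven by swarms.refresh_ms, which keeps the one-shot and background paths identical.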
Per-repo execution supervision remains a platform adapter. On Linux the target
adapter is a user systemd unit that starts sf autonomous from the repo
directory and restarts it with backoff. That adapter is owned by server/package
code, not by scripts.
The unit is named from a stable hash of the repo path. It uses Restart=always,
RestartSec=30s, and systemd start-limit settings for crash-loop backoff.
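As a rough illustration of the adapter's output, the sketch below derives a unit name from a hash of the repo path and renders a unit file. Restart=always and RestartSec=30s come from this document; the sf-worker- prefix, the SHA-256 truncation, the start-limit values, the default binary path, and the function names are assumptions, not the package's actual API.

```typescript
import { createHash } from "node:crypto";
import * as path from "node:path";

// Hypothetical naming scheme: the same repo path always maps to the same unit.
export function workerUnitName(repoPath: string): string {
  const digest = createHash("sha256")
    .update(path.resolve(repoPath))
    .digest("hex")
    .slice(0, 12);
  return `sf-worker-${digest}.service`;
}

// Renders a user-level unit. sfBin is an assumed absolute path to the sf
// binary. This sketch does not escape special characters such as "%" in paths.
export function renderWorkerUnit(repoPath: string, sfBin = "/usr/local/bin/sf"): string {
  const repo = path.resolve(repoPath);
  return [
    "[Unit]",
    `Description=sf autonomous worker for ${repo}`,
    // Assumed values: stop restarting after 5 starts within 10 minutes.
    "StartLimitIntervalSec=600",
    "StartLimitBurst=5",
    "",
    "[Service]",
    `WorkingDirectory=${repo}`,
    `ExecStart=${sfBin} autonomous`,
    "Restart=always",
    "RestartSec=30s",
    "",
    "[Install]",
    "WantedBy=default.target",
    "",
  ].join("\n");
}
```

Hashing the resolved path keeps unit names valid regardless of characters in the repo path, at the cost of names that are not human-readable; the Description line carries the path for operators.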
RPC Client Boundary
@singularity-forge/rpc-client is the reusable stdio JSON-RPC adapter. Root
headless clients and packages/daemon should import it directly. The coding
agent remains the RPC server implementation and still owns interactive/session
internals; it should not be the source of reusable client code for web, daemon,
or headless orchestration.
Definition Of Done
M053 is done when:
- The non-TUI status query writes the versioned atomic status projection.
- Web has a Swarms view backed by /api/swarms.
- The web reader survives missing/corrupt projections per repo.
- sf-server auto-syncs ~/.sf/swarms.json from configured scan roots and can run the same refresh once with sf-server --sync-swarms.
- Legacy script/watchdog entrypoints are removed from the normal lifecycle.
- Linux server/package code can create a user-level systemd worker for sf autonomous.
- Root headless/client utilities use @singularity-forge/rpc-client.
M053 is not done by creating a per-repo server. That is explicitly out of scope.