# ADR-012: Multi-instance federation — when sf instances interlink
**Date**: 2026-04-29
**Status**: proposed (deferred — capture for future implementation)
## Context
sf today is **per-project**: each project has its own `.sf/sf.db`, and a single daemon (`packages/daemon`) on a host serves all projects under its scan roots. As deployment grows beyond one host (laptop, `mikki-bunker`, `aidev`), the question arises: should sf instances on different hosts (or different projects on the same host) interlink? And if so, on which surfaces?
Without deliberate federation, instances repeatedly re-learn the same lessons — anti-patterns, model outages, provider quirks — wasting tokens and repeating each other's mistakes. With over-eager federation, sf inherits cross-host trust, schema-version, and latency problems it doesn't need yet.
This ADR maps the federation surfaces, takes a position on each, and sequences the work.
## Decision
**Defer most federation. Wire Singularity Memory first as the single load-bearing federation primitive; defer federated benchmarks, cross-repo orchestration, and federated agents until the pain is concrete.**
### Surface 1 — Singularity Memory (cross-instance knowledge)
**Status:** designed in `SPEC.md` §16 — Singularity Memory (`sm`) is the explicit cross-instance knowledge layer: an HTTP API holding memories, learnings, and anti-patterns. It runs embedded (single-user sf) or remote (a shared service on the tailnet, reachable from sf, Hermes, OpenClaw, Claude Code, and Cursor through their own client integrations).
**Code reality:** not yet wired. `src/resources/extensions/sf/memory-store.ts` and `memory-extractor.ts` write to a local SQLite `memories` table. The spec's "remote-mode" isn't connected.
**Decision:** **wire it.** Singularity Memory is the load-bearing federation primitive. If Mikki learns "Provider X drops requests at 03:00 UTC", that anti-pattern should be reachable from any sf instance on the tailnet without re-learning. Once wired, ~80% of the "should they interlink?" question answers itself.
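A minimal sketch of the fallback chain this wiring implies (remote → embedded → local-only). `MemoryProvider`, `Memory`, and `recallWithFallback` are hypothetical names for illustration, not the real `memory-store.ts` API:

```typescript
// Hypothetical shapes — the real interfaces live in memory-store.ts / SPEC.md §16.
interface Memory {
  id: string;
  category: "anti-pattern" | "learning" | "gotcha";
  content: string;
  source: string; // provenance: which instance/host wrote it
}

interface MemoryProvider {
  name: string;
  recall(query: string): Promise<Memory[]>;
}

// Try providers in order: remote tailnet service, then embedded, then local-only.
// A failure at any tier degrades to the next rather than blocking dispatch.
async function recallWithFallback(
  providers: MemoryProvider[],
  query: string,
): Promise<{ memories: Memory[]; provider: string }> {
  for (const p of providers) {
    try {
      return { memories: await p.recall(query), provider: p.name };
    } catch {
      // degraded mode: log and fall through to the next tier
      console.warn(`memory provider ${p.name} unavailable, falling back`);
    }
  }
  return { memories: [], provider: "local-only" }; // empty recall, never throw
}
```

The key property is that recall never throws: the scheduler always gets an answer, at worst an empty one.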
### Surface 2 — Benchmarks and circuit breakers
**Status:** per-DB today. `benchmark_results` and `circuit_breakers` tables live in each project's `.sf/sf.db`. One instance trips a breaker on `kimi-coding/k2p5`; another instance has to independently rediscover the outage.
**Decision:** **defer; revisit after Singularity Memory lands.** Two clean options when we revisit:
- **Ride Singularity Memory** — store benchmark observations as a memory category, recall as needed. Cheap; semantically clean (benchmarks ARE learning).
- **Separate thin HTTP service** — purpose-built benchmark aggregator with statistical smoothing and a publish/subscribe channel for circuit-breaker events.
The pain ceiling is bounded today (per-instance discovery is at worst a few wasted dispatches). Only build when concrete cost emerges.
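If we do ride Singularity Memory, each benchmark result could be serialized into a memory's `content` field under a dedicated category. A sketch under that assumption; the `BenchmarkObservation` fields are illustrative, not from `SPEC.md`:

```typescript
// Illustrative record shape for storing a benchmark result as a memory.
// Field names are assumptions; recordedAt and host support recency weighting
// and provenance filtering on the consumer side.
interface BenchmarkObservation {
  model: string; // e.g. "kimi-coding/k2p5"
  outcome: "pass" | "fail" | "breaker_tripped";
  latencyMs: number;
  recordedAt: string; // ISO timestamp
  host: string; // provenance
}

function toMemoryContent(obs: BenchmarkObservation): string {
  return JSON.stringify(obs);
}

function fromMemoryContent(content: string): BenchmarkObservation {
  return JSON.parse(content) as BenchmarkObservation;
}
```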
### Surface 3 — Cross-project unit dependencies
**Status:** not designed. sf has no concept of "milestone in repo A produces an artefact repo B depends on". The unit hierarchy (milestone → slice → task) is project-local.
**Decision:** **out of scope for sf.** Cross-repo orchestration is a different abstraction layer — it belongs in a meta-coordinator that consumes sf's daemon/RPC or headless interfaces, not in sf itself. Building it inside sf would conflate "agent that ships one project" with "fleet manager that ships an org's roadmap." Different products.
### Surface 4 — Persistent agents
**Status:** not designed. `SPEC.md` §17 (NEW) introduces persistent agents, but scopes them to a single project's DB.
**Decision:** **defer.** Per-instance for v3. If Mikki has a "code-reviewer" persistent agent, it lives in Mikki's DB. Federation requires:
- Cross-host auth (who can wake whose agents).
- Agent-state schema versioning (instances may run different sf versions).
- Leader-election story for shared-agent updates.
- A migration path from per-instance → federated.
None of this earns its keep until we have a concrete use case where one agent should genuinely serve multiple projects/hosts. Premature now.
### Surface 5 — Distributed execution (clarifying note, not federation)
**Status:** spec'd in `SPEC.md` §22 (NEW); not built. SSH workers — one daemon dispatches units to remote worker hosts.
**Decision:** **clarify that this is NOT federation.** Distributed execution = one daemon owns many workers (parallel scaling). Federation = many daemons share state across hosts (knowledge sharing). Different problems. The spec already separates them; this ADR just affirms the line.
## Consequences
**Positive (after Singularity Memory lands)**
- **Knowledge sharing without re-learning** — anti-patterns, gotchas, contract findings reachable across hosts and other agent products on the tailnet.
- **Reusable for non-sf agents** — Hermes, Claude Code, Cursor can also read/write Singularity Memory, so the network effect grows beyond sf.
**Negative**
- **Tailnet dependency** — when remote-mode Singularity Memory is configured, tailnet outage degrades sf to local-only. Mitigation: spec already allows embedded (in-process) mode; remote is opt-in.
- **Cross-instance prompt-injection surface** — a malicious memory written by one instance could leak into another's recall. Mitigation: Singularity Memory MUST track provenance per memory and let consumers filter by trusted source. Capture as a sub-ADR if/when implemented.
- **Schema versioning across instances** — different sf versions accessing the same memory store. Mitigation: memory schema must be append-only and additive; new fields are optional reads.
**Risks and mitigations**
- *Risk:* Singularity Memory becomes a bottleneck — sf can't dispatch when memory is down.
- *Mitigation:* sf MUST treat memory as best-effort. A memory-fetch failure logs degraded-mode and proceeds with empty recall. Local SQLite stays as the authoritative scheduler state (per `SPEC.md` §3).
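The best-effort contract can be made concrete as a latency budget around any memory fetch, so a hung remote never stalls dispatch. A sketch; the 500 ms default is an assumption, not from the spec:

```typescript
// Best-effort recall: cap memory latency so the scheduler never blocks on it.
// The 500 ms default budget is an assumption for illustration.
async function bestEffortRecall<T>(
  fetchMemories: () => Promise<T[]>,
  budgetMs = 500,
): Promise<T[]> {
  const timeout = new Promise<T[]>((resolve) =>
    setTimeout(() => resolve([]), budgetMs),
  );
  try {
    // Whichever settles first wins; a slow fetch degrades to empty recall.
    return await Promise.race([fetchMemories(), timeout]);
  } catch {
    return []; // fetch failed: degraded mode, proceed with empty recall
  }
}
```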
- *Risk:* federated benchmarks make sf overconfident in stale data.
- *Mitigation:* every benchmark observation carries `recorded_at` and `host`. Consumers weight by recency and reject observations older than `circuit_breaker_resets_at + N`.
- *Risk:* a malicious or compromised writer poisons shared benchmark data.
- *Mitigation:* same as the prompt-injection mitigation above — provenance + trusted-source filtering, plus per-writer rate-limiting.
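Consumer-side staleness handling could look like the following sketch: a hard age cutoff plus exponential recency decay. The field names, cutoff, and half-life are assumptions for illustration:

```typescript
// Assumed observation shape; recordedAt is epoch ms, host is provenance.
interface Observation {
  recordedAt: number;
  host: string;
  success: boolean;
}

// Drop observations past a hard cutoff, then weight the rest by recency
// (exponential half-life decay) to compute a success score. Returns null
// when no trustworthy data remains, so the caller rediscovers locally.
function weightedSuccessRate(
  obs: Observation[],
  now: number,
  maxAgeMs: number,
  halfLifeMs: number,
): number | null {
  const fresh = obs.filter((o) => now - o.recordedAt <= maxAgeMs);
  if (fresh.length === 0) return null;
  let num = 0;
  let denom = 0;
  for (const o of fresh) {
    const w = Math.pow(0.5, (now - o.recordedAt) / halfLifeMs);
    if (o.success) num += w;
    denom += w;
  }
  return num / denom;
}
```

Returning `null` rather than `0` keeps "no data" distinct from "observed failure", which matters for circuit-breaker decisions.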
## Out of Scope
- **Cross-repo unit graph** — meta-coordinator territory.
- **Federated persistent-agent fleets** — defer until concrete pain.
- **Multi-tenant Singularity Memory** — current design assumes a single-user-or-team trust domain. Multi-tenant is a separate product.
- **Auto-sharding sf instances** — sf is one daemon per host; we don't horizontally split a single host's daemon.
## Sequencing
| When | Action |
|---|---|
| Tier 1+ (next 1–3 months) | Wire Singularity Memory remote-mode in `memory-store.ts`. Provider chain fallback: remote → embedded → local-only. Update `SPEC.md` §16 status from PARTIAL to EXISTS once landed. |
| After Singularity Memory in production for 1+ month | Decide whether to ride it for benchmarks (Surface 2) or build a separate service. Decision driven by observed cost of duplicated benchmark discovery. |