singularity-forge/docs/dev/ADR-014-singularity-knowledge-and-agent-platform.md

ADR-014: Singularity Knowledge + Agent Platform stack

Date: 2026-04-29 Status: proposed (deferred — captured for staged execution)

Context

SPEC.md §16 defines a cross-instance knowledge layer (Singularity Memory). SPEC.md §17–18 define persistent agents and inter-agent messaging (status NEW). sf instances today carry their own local memory store (memory-store.ts); persistent agents are not implemented at all.

Two trajectories converge:

  • Knowledge federates — anti-patterns, learnings, contracts should be reachable across sf instances and across other agent products on the tailnet (Hermes, OpenClaw, Claude Code, Cursor).
  • Persistent agents centralise — long-lived cross-project agents (code-reviewer with cross-project memory, memory-curator, security-auditor, build-watch) are too heavy and too cross-cutting to live per-project.

These two needs collapse into one service: the Singularity Knowledge + Agent Platform — a single Go server hosting the federated memory store and the central persistent-agent runtime.

This ADR fixes the stack.

The implementation arm of this ADR lives in singularity-memory/MIGRATION.md.

Decision

  • Language: Go.
  • Storage backbone: Postgres + vchord (existing) — accessed from Go via pgx. No data migration; same schema, same vchord index.
  • Identity / auth / sync layer: charmbracelet/charm-server patterns — SSH-key identity, JWT issuance, encrypted KV for user-level prefs and config. Adopted as ported library code; not run as a sidecar.
  • Agent runtime: charmbracelet/fantasy — multi-provider LLM access (Anthropic, OpenAI, Google, Bedrock, OpenRouter, etc. via catwalk). Used for embeddings/summarisation today; for full central persistent agents tomorrow.
  • HTTP API: Go net/http + chi or echo router, serving the exact current OpenAPI contract.
  • MCP server: same wire protocol as today's Python implementation. Clients (sf, Hermes, OpenClaw, Claude Code, Cursor) keep working unchanged.
  • CLI scaffolding: charmbracelet/fang.
  • Observability: promwish-style Prometheus metrics, scraped from a shared metrics endpoint.
  • Admin UI (Phase 3): pony + ultraviolet for the view layer (reversed from earlier deferral; now adopted as a deliberate foundation bet — admin UI tolerates churn better than user-facing surfaces). Served over SSH via wish.

Alternatives Considered

Stack

  • Stay Python + FastAPI + Postgres. Status quo. Works today.
    • Rejected: misses the foundation bet for central persistent agents (sf SPEC §17). Building those on Python + raw OpenAI/Anthropic SDK calls means retrofitting fantasy-style agent semantics later — real refactor cost. The trigger to migrate isn't pain in the current server; it's foundation laying for what comes next.
  • Rust + axum + Postgres. Uniformly fast, but Charm's agentic ecosystem (fantasy, catwalk, wish, charm-server, the entire Bubble Tea family) is Go-native. Rust on the server side would mean reimplementing those abstractions or shelling out. Rejected — wrong ecosystem.
  • TypeScript + Node + Postgres. Keeps language alignment with sf core. But sf is moving toward parallel-build (ADR-016): TS in sf core, Go in new services. The Node ecosystem doesn't have an equivalent to fantasy + charm-server + wish. Rejected.

Storage backbone

  • Replace Postgres + vchord with charm-server's native KV. charm-server is a personal/team encrypted KV; it's not a vector DB or BM25 index. We'd lose retrieval sophistication. Rejected.
  • Replace Postgres with sqlite-vec. Embeddable single-binary deployment is appealing, but BM25 quality on tsvector is hard to match without a full re-tune, and we'd be redoing data migration on top. Rejected for v1; revisit in a v2 retrieval ADR if the Go server needs to ship without Postgres.
  • Keep Postgres + vchord, connect via Go pgx. ← chosen. Battle-tested retrieval, zero data migration, focus the migration on language/runtime/agent-platform changes only.

Agent runtime

  • Direct SDK calls (anthropic-sdk-go, openai-go, go-genai). Simplest for today's narrow LLM use (embeddings + summarisation). But future central persistent agents need agent-loop semantics (multi-turn, tool calls); building those on raw SDKs reinvents fantasy's abstractions. Rejected — foundation bet.
  • Build our own agent runtime in Go. Pure NIH. Rejected.
  • charmbracelet/fantasy. ← chosen. 730 stars, actively developed, clean API, multi-provider via catwalk.
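For readers without fantasy context, the "agent-loop semantics" the raw-SDK route would force us to hand-roll look roughly like this. This is a generic illustration of the loop shape only — NOT fantasy's actual API:

```go
package main

import "fmt"

// Tool is a named capability the agent may invoke (illustrative shape).
type Tool struct {
	Name string
	Run  func(args string) string
}

// Model abstracts a provider: given the conversation so far, it returns
// either final text or the name and args of a tool to call.
type Model interface {
	Complete(history []string) (text string, toolCall string, args string)
}

// runAgent is the multi-turn loop raw SDK calls would make us reinvent:
// feed history to the model, execute requested tools, append results, repeat.
func runAgent(m Model, tools map[string]Tool, prompt string, maxTurns int) string {
	history := []string{prompt}
	for i := 0; i < maxTurns; i++ {
		text, call, args := m.Complete(history)
		if call == "" {
			return text // model produced a final answer
		}
		history = append(history, call+" -> "+tools[call].Run(args))
	}
	return "max turns reached"
}

// scripted is a stand-in model for the demo: one tool call, then an answer.
type scripted struct{ step int }

func (s *scripted) Complete(h []string) (string, string, string) {
	if s.step == 0 {
		s.step++
		return "", "echo", "hi"
	}
	return "done after " + fmt.Sprint(len(h)) + " messages", "", ""
}

func main() {
	tools := map[string]Tool{"echo": {Name: "echo", Run: func(a string) string { return a }}}
	fmt.Println(runAgent(&scripted{}, tools, "review this PR", 4))
}
```

fantasy supplies this loop, the tool-dispatch plumbing, and provider fan-out via catwalk, which is precisely the layer the "direct SDK calls" alternative leaves us to build and maintain ourselves.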

Consequences

Positive

  • Foundation is right for central persistent agents (sf SPEC §17). Adding new agents means defining their tools and system prompt, not rebuilding the runtime.
  • Single static Go binary is operationally simpler than Python uv/venv + Alembic + worker on each deployment host.
  • Charm ecosystem alignment with sf-worker (ADR-013), flight recorder (ADR-015), Charm TUI client (ADR-017). One language for the new-services tier.
  • Wire contract preserved — clients are zero-touch.

Negative

  • Migration is a real undertaking — ~12 weeks total, with the recall endpoint as the critical parity gate. See MIGRATION.md.
  • Polyglot deployment grows — Python (during transition) + Go (new) + TS (sf core) + Rust (sf native). This is bounded; once Python retires, three languages remain, each with clear boundaries.
  • fantasy and pony are pre-1.0 — API churn is real.

Risks and mitigations

  • Risk: recall quality regression between Python and Go.
    • Mitigation: held-out evaluation set; ±2% recall@k threshold enforced in CI before flipping traffic.
  • Risk: pgx + vchord custom-type decoder edge cases.
    • Mitigation: prove out in Phase 1 against a small endpoint; engage vchord author if blocked.
  • Risk: fantasy API churn during the migration.
    • Mitigation: pin a version; one planned upgrade midway through the migration.
  • Risk: central agents prove unworkable as a model and we've over-built the foundation.
    • Mitigation: the foundation cost is incremental (fantasy ≈ raw SDK + a thin abstraction). Worst case we use fantasy for embeddings only and never grow it. No wasted bet.
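The recall-parity mitigation above is mechanical to enforce in CI. A sketch of the gate, assuming recall@k as the metric and the stated ±2% absolute threshold (function names are hypothetical):

```go
package main

import "fmt"

// recallAtK returns the fraction of relevant ids present in the top-k
// retrieved ids — computed per query over the held-out evaluation set.
func recallAtK(retrieved, relevant []string, k int) float64 {
	if k > len(retrieved) {
		k = len(retrieved)
	}
	top := map[string]bool{}
	for _, id := range retrieved[:k] {
		top[id] = true
	}
	hits := 0
	for _, id := range relevant {
		if top[id] {
			hits++
		}
	}
	if len(relevant) == 0 {
		return 0
	}
	return float64(hits) / float64(len(relevant))
}

// withinGate enforces the ±2% absolute threshold between the Go port's
// mean recall@k and the Python baseline before traffic is flipped.
func withinGate(goScore, pyScore float64) bool {
	diff := goScore - pyScore
	if diff < 0 {
		diff = -diff
	}
	return diff <= 0.02
}

func main() {
	goScore := recallAtK([]string{"a", "b", "c", "d"}, []string{"a", "c", "x"}, 3)
	fmt.Printf("%.3f gate=%v\n", goScore, withinGate(goScore, 0.66))
}
```

In CI this runs over every query in the held-out set against both servers; the build fails if the aggregated scores diverge past the gate.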

Out of Scope

  • Cross-tenant Singularity Memory — single trust domain per deployment.
  • Retrieval-pipeline redesign — BM25 + vector + RRF + reranker semantics are preserved exactly.
  • DB migration — Postgres + vchord stay.
  • Public-internet endpoint — tailnet only per ADR-013.

Sequencing

| Phase | What | Cost |
|-------|------|------|
| 0 | Prep: commit OpenAPI spec, build test suite, set up CI (per existing TODO.md) | 1–2 weeks |
| 1 | Greenfield Go scaffold parallel to Python; first endpoint (GET /v1/banks) | 2–3 weeks |
| 2 | Endpoint parity (recall is the critical gate) | 4–8 weeks |
| 3 | Worker + admin UI (pony + ultraviolet on wish) | 2–3 weeks |
| 4 | Central persistent-agent host (depends on sf SPEC §17 scoping) | variable |
| 5 | Python deprecation | 1 week |

Total: ~12 weeks for Phases 0–3 + Phase 5; Phase 4 lands when the sf-side agent layer is scoped.

References

  • MIGRATION.md (singularity-memory repo) — implementation arm.
  • SPEC.md §16 — Knowledge Layer.
  • SPEC.md §17–18 — Persistent Agents and Inter-Agent Messaging.
  • ADR-012 — Multi-instance federation (this is one of its surfaces).
  • ADR-013 — Network and remote-execution (deployment substrate).
  • ADR-016 — Charm AI stack adoption (frames the polyglot decision).
  • charmbracelet/charm — KV with sync (auth/identity patterns ported here).
  • charmbracelet/fantasy — agent runtime.
  • charmbracelet/catwalk — provider/model registry.