ADR-014: Singularity Knowledge + Agent Platform stack
Date: 2026-04-29
Status: proposed (deferred; capture for staged execution)
Context
SPEC.md §16 defines a cross-instance knowledge layer (Singularity Memory). SPEC.md §17–18 defines persistent agents and inter-agent messaging (status NEW). sf instances today carry their own local memory store (memory-store.ts); persistent agents are not implemented at all.
Two trajectories converge:
- Knowledge federates — anti-patterns, learnings, contracts should be reachable across sf instances and across other agent products on the tailnet (Hermes, OpenClaw, Claude Code, Cursor).
- Persistent agents centralise — long-lived cross-project agents (code-reviewer with cross-project memory, memory-curator, security-auditor, build-watch) are too heavy and too cross-cutting to live per-project.
These two needs collapse into one service: the Singularity Knowledge + Agent Platform — a single Go server hosting the federated memory store and the central persistent-agent runtime.
This ADR fixes the stack.
The implementation arm of this ADR lives in singularity-memory/MIGRATION.md.
Decision
- Language: Go.
- Storage backbone: Postgres + vchord (existing), accessed from Go via pgx. No data migration; same schema, same vchord index.
- Identity / auth / sync layer: charmbracelet/charm-server patterns (SSH-key identity, JWT issuance, encrypted KV for user-level prefs and config). Adopted as ported library code; not run as a sidecar.
- Agent runtime: charmbracelet/fantasy for multi-provider LLM access (Anthropic, OpenAI, Google, Bedrock, OpenRouter, etc. via catwalk). Used for embeddings/summarisation today; for full central persistent agents tomorrow.
- HTTP API: Go net/http + chi or echo router, serving the exact current OpenAPI contract.
- MCP server: same wire protocol as today's Python implementation. Clients (sf, Hermes, OpenClaw, Claude Code, Cursor) keep working unchanged.
- CLI scaffolding: charmbracelet/fang.
- Observability: promwish-style Prometheus metrics, scraped from a shared metrics endpoint.
- Admin UI (Phase 3): pony + ultraviolet for the view layer (reversed from earlier deferral; now adopted as a deliberate foundation bet, since admin UI tolerates churn better than user-facing surfaces). Served over SSH via wish.
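As a rough illustration of the HTTP API item, the sketch below serves GET /v1/banks (the first endpoint named under Sequencing) with the standard library only, since the ADR leaves chi versus echo open. The Bank fields, the BankLister interface, and the in-memory store are hypothetical stand-ins; the real response shape is whatever the committed OpenAPI contract specifies, and the real store would sit on pgx.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// Bank is a hypothetical response item; the real fields come from the
// OpenAPI contract, which this ADR does not reproduce.
type Bank struct {
	ID   string `json:"id"`
	Name string `json:"name"`
}

// BankLister abstracts the storage layer (Postgres + vchord via pgx in the
// ADR). It is an assumption made to keep the handler testable.
type BankLister interface {
	ListBanks() ([]Bank, error)
}

// staticBanks is an in-memory stand-in for the database.
type staticBanks []Bank

func (s staticBanks) ListBanks() ([]Bank, error) { return s, nil }

// NewMux registers the /v1/banks route and rejects non-GET methods.
func NewMux(store BankLister) *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/v1/banks", func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodGet {
			w.Header().Set("Allow", http.MethodGet)
			http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
			return
		}
		banks, err := store.ListBanks()
		if err != nil {
			http.Error(w, "internal error", http.StatusInternalServerError)
			return
		}
		// Marshal before writing so an encoding failure can still return 500.
		body, err := json.Marshal(banks)
		if err != nil {
			http.Error(w, "internal error", http.StatusInternalServerError)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		_, _ = w.Write(body)
	})
	return mux
}

// fetchBanks exercises the handler in-process and returns status and body.
func fetchBanks(h http.Handler, method string) (int, string) {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(method, "/v1/banks", nil)
	h.ServeHTTP(rec, req)
	return rec.Code, rec.Body.String()
}

func main() {
	mux := NewMux(staticBanks{{ID: "b1", Name: "default"}})
	code, body := fetchBanks(mux, http.MethodGet)
	fmt.Println(code, body)
}
```

Keeping the store behind an interface means the same handler can be checked against the Python server's responses before any database code exists.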
Alternatives Considered
Stack
- Stay Python + FastAPI + Postgres. Status quo. Works today.
- Rejected: misses the foundation bet for central persistent agents (sf SPEC §17). Building those on Python + raw OpenAI/Anthropic SDK calls means retrofitting fantasy-style agent semantics later — real refactor cost. The trigger to migrate isn't pain in the current server; it's foundation laying for what comes next.
- Rust + axum + Postgres. Uniformly fast, but Charm's agentic ecosystem (fantasy, catwalk, wish, charm-server, the entire Bubble Tea family) is Go-native. Rust on the server side would mean reimplementing those abstractions or shelling out. Rejected — wrong ecosystem.
- TypeScript + Node + Postgres. Keeps language alignment with sf core. But sf is moving toward parallel-build (ADR-016): TS in sf core, Go in new services. The Node ecosystem doesn't have an equivalent to fantasy + charm-server + Wish. Rejected.
Storage backbone
- Replace Postgres + vchord with charm-server's native KV. charm-server is a personal/team encrypted KV; it's not a vector DB or BM25 index. We'd lose retrieval sophistication. Rejected.
- Replace Postgres with sqlite-vec. Embeddable single-binary deployment is appealing, but BM25 quality on tsvector is hard to match without a full re-tune, and we'd be redoing data migration on top. Rejected for v1; revisit in a v2 retrieval ADR if the Go server needs to ship without Postgres.
- Keep Postgres + vchord, connect via Go pgx. ← chosen. Battle-tested retrieval, zero data migration, focus the migration on language/runtime/agent-platform changes only.
Agent runtime
- Direct SDK calls (anthropic-sdk-go, openai-go, go-genai). Simplest for today's narrow LLM use (embeddings + summarisation). But future central persistent agents need agent-loop semantics (multi-turn, tool calls); building those on raw SDKs reinvents fantasy's abstractions. Rejected: foundation bet.
- Build our own agent runtime in Go. Pure NIH. Rejected.
- charmbracelet/fantasy. ← chosen. 730 stars, actively developed, clean API, multi-provider via catwalk.
Consequences
Positive
- Foundation is right for central persistent agents (sf SPEC §17). Adding new agents means defining their tools and system prompt, not rebuilding the runtime.
- Single static Go binary is operationally simpler than Python uv/venv + Alembic + worker on each deployment host.
- Charm ecosystem alignment with sf-worker (ADR-013), flight recorder (ADR-015), Charm TUI client (ADR-017). One language for the new-services tier.
- Wire contract preserved — clients are zero-touch.
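The claim that a new agent is "tools plus a system prompt" can be pictured with a runtime-agnostic sketch. This is not fantasy's API: Agent, Tool, Step, Model, and the scripted model below are all hypothetical names chosen for illustration, and the scripted model stands in for a real LLM provider.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Tool is a named capability the agent may invoke (hypothetical type).
type Tool struct {
	Name string
	Run  func(input string) (string, error)
}

// Step is one model decision: call a tool, or finish with Final.
type Step struct {
	ToolName string // empty means the model is done
	Input    string
	Final    string
}

// Model decides the next step from the transcript so far.
type Model interface {
	Next(system string, transcript []string) (Step, error)
}

// Agent is pure configuration: a system prompt and a tool set.
type Agent struct {
	SystemPrompt string
	Tools        []Tool
}

// Run drives a multi-turn loop: ask the model, execute the requested tool,
// feed the result back, and stop on a final answer or after maxTurns.
func Run(m Model, a Agent, task string, maxTurns int) (string, error) {
	tools := make(map[string]Tool, len(a.Tools))
	for _, t := range a.Tools {
		tools[t.Name] = t
	}
	transcript := []string{"user: " + task}
	for turn := 0; turn < maxTurns; turn++ {
		step, err := m.Next(a.SystemPrompt, transcript)
		if err != nil {
			return "", err
		}
		if step.ToolName == "" {
			return step.Final, nil
		}
		tool, ok := tools[step.ToolName]
		if !ok {
			return "", fmt.Errorf("unknown tool %q", step.ToolName)
		}
		out, err := tool.Run(step.Input)
		if err != nil {
			out = "error: " + err.Error() // surface tool failures to the model
		}
		transcript = append(transcript, "tool "+tool.Name+": "+out)
	}
	return "", errors.New("agent exceeded max turns")
}

// scriptedModel calls count_lines once, then finishes with the tool output.
type scriptedModel struct{}

func (scriptedModel) Next(_ string, transcript []string) (Step, error) {
	if len(transcript) == 1 {
		return Step{ToolName: "count_lines", Input: "a\nb\nc"}, nil
	}
	return Step{Final: "done after " + transcript[len(transcript)-1]}, nil
}

// reviewer shows that defining an agent needs no runtime code.
func reviewer() Agent {
	return Agent{
		SystemPrompt: "You review code across projects.",
		Tools: []Tool{{
			Name: "count_lines",
			Run: func(in string) (string, error) {
				return fmt.Sprint(strings.Count(in, "\n") + 1), nil
			},
		}},
	}
}

func main() {
	out, err := Run(scriptedModel{}, reviewer(), "review this", 4)
	fmt.Println(out, err)
}
```

In this shape, a second agent such as a memory-curator would be another function like reviewer, reusing Run unchanged.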
Negative
- Migration is a real undertaking: ~12 weeks total, with the recall endpoint as the critical parity gate. See MIGRATION.md.
- Polyglot deployment grows: Python (during transition) + Go (new) + TS (sf core) + Rust (sf native). Bounded; once Python retires, three languages with clear boundaries.
- fantasy and pony are pre-1.0; API churn is real.
Risks and mitigations
- Risk: recall quality regression between Python and Go.
- Mitigation: held-out evaluation set; ±2% recall@k threshold enforced in CI before flipping traffic.
- Risk: pgx + vchord custom-type decoder edge cases.
- Mitigation: prove out in Phase 1 against a small endpoint; engage vchord author if blocked.
- Risk: fantasy API churn during the migration.
- Mitigation: pin a version; one planned upgrade midway through the migration.
- Risk: central agents prove unworkable as a model and we've over-built the foundation.
- Mitigation: the foundation cost is incremental (fantasy ≈ raw SDK + a thin abstraction). Worst case we use fantasy for embeddings only and never grow it. No wasted bet.
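The recall-regression mitigation above could be enforced with a check along these lines. This is a sketch under assumptions: the ADR does not say whether "±2%" is absolute or relative, so the code treats it as an absolute difference of 0.02 in mean recall@k, and the Query type with its field names is hypothetical.

```go
package main

import (
	"fmt"
	"math"
)

// RecallAtK returns the fraction of relevant IDs found in the top k results.
func RecallAtK(retrieved, relevant []string, k int) float64 {
	if k < 0 {
		k = 0
	}
	if k > len(retrieved) {
		k = len(retrieved)
	}
	rel := make(map[string]bool, len(relevant))
	for _, id := range relevant {
		rel[id] = true
	}
	if len(rel) == 0 {
		return 0
	}
	seen := make(map[string]bool)
	hits := 0
	for _, id := range retrieved[:k] {
		if rel[id] && !seen[id] {
			seen[id] = true
			hits++
		}
	}
	return float64(hits) / float64(len(rel))
}

// Query is one held-out evaluation item with both servers' ranked results
// (hypothetical shape).
type Query struct {
	Relevant  []string
	PythonIDs []string
	GoIDs     []string
}

// ParityOK compares mean recall@k of the two servers and reports whether
// the absolute difference is within tol (assumed reading of "±2%").
func ParityOK(qs []Query, k int, tol float64) (pyMean, goMean float64, ok bool) {
	if len(qs) == 0 {
		return 0, 0, false
	}
	for _, q := range qs {
		pyMean += RecallAtK(q.PythonIDs, q.Relevant, k)
		goMean += RecallAtK(q.GoIDs, q.Relevant, k)
	}
	n := float64(len(qs))
	pyMean, goMean = pyMean/n, goMean/n
	return pyMean, goMean, math.Abs(goMean-pyMean) <= tol
}

func main() {
	qs := []Query{
		{Relevant: []string{"a", "b"}, PythonIDs: []string{"a", "b", "c"}, GoIDs: []string{"a", "c", "b"}},
		{Relevant: []string{"x"}, PythonIDs: []string{"x"}, GoIDs: []string{"x"}},
	}
	py, goMean, ok := ParityOK(qs, 2, 0.02)
	fmt.Println(py, goMean, ok)
}
```

A CI job would fail the build when ok is false, which blocks the traffic flip until the Go recall path matches the Python one within tolerance.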
Out of Scope
- Cross-tenant Singularity Memory — single trust domain per deployment.
- Retrieval-pipeline redesign — BM25 + vector + RRF + reranker semantics are preserved exactly.
- DB migration — Postgres + vchord stay.
- Public-internet endpoint — tailnet only per ADR-013.
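For readers unfamiliar with the RRF step named in the retrieval-pipeline item, a minimal sketch of Reciprocal Rank Fusion follows. It shows the general technique only: the constant k = 60 is a commonly used default and an assumption here, not a value taken from the existing pipeline, and the function name is hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// FuseRRF merges ranked ID lists with Reciprocal Rank Fusion:
// score(d) = sum over lists of 1 / (k + rank(d)), with ranks starting at 1.
// Each input list is assumed to contain unique IDs.
func FuseRRF(k float64, lists ...[]string) []string {
	scores := make(map[string]float64)
	for _, list := range lists {
		for i, id := range list {
			scores[id] += 1.0 / (k + float64(i+1))
		}
	}
	ids := make([]string, 0, len(scores))
	for id := range scores {
		ids = append(ids, id)
	}
	sort.Slice(ids, func(a, b int) bool {
		if scores[ids[a]] != scores[ids[b]] {
			return scores[ids[a]] > scores[ids[b]]
		}
		return ids[a] < ids[b] // deterministic tie-break by ID
	})
	return ids
}

func main() {
	bm25 := []string{"d1", "d2", "d3"}   // lexical ranking
	vector := []string{"d3", "d1", "d4"} // embedding ranking
	fmt.Println(FuseRRF(60, bm25, vector))
}
```

Documents ranked well by both retrievers (d1, d3) rise above those seen by only one, which is the behaviour the Go port has to reproduce before any reranker runs.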
Sequencing
| Phase | What | Cost |
|---|---|---|
| 0 | Prep: commit OpenAPI spec, build test suite, set up CI (per existing TODO.md) | 1–2 weeks |
| 1 | Greenfield Go scaffold parallel to Python; first endpoint (GET /v1/banks) | 2–3 weeks |
| 2 | Endpoint parity (recall is the critical gate) | 4–8 weeks |
| 3 | Worker + admin UI (pony + ultraviolet on wish) | 2–3 weeks |
| 4 | Central persistent-agent host (depends on sf SPEC §17 scoping) | variable |
| 5 | Python deprecation | 1 week |
Total: ~12 weeks for Phases 0–3 + Phase 5 (the per-phase ranges sum to 10–17 weeks); Phase 4 lands once the sf-side agent layer is scoped.
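Endpoint parity in Phase 2 implies comparing responses from the Python and Go servers for the same request. One possible building block, sketched here as an assumption rather than the planned test harness, is a semantic JSON comparison that ignores key order and whitespace but keeps array order significant (which matters for ranked results).

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
)

// SameJSON reports whether two JSON documents are semantically equal.
// Object key order and whitespace are ignored; array order is significant.
// Numbers decode as float64, so very large integers may lose precision.
func SameJSON(a, b []byte) (bool, error) {
	var va, vb any
	if err := json.Unmarshal(a, &va); err != nil {
		return false, fmt.Errorf("first document: %w", err)
	}
	if err := json.Unmarshal(b, &vb); err != nil {
		return false, fmt.Errorf("second document: %w", err)
	}
	return reflect.DeepEqual(va, vb), nil
}

func main() {
	python := []byte(`{"banks":[{"id":"b1"}],"total":1}`)
	golang := []byte(`{ "total": 1, "banks": [ { "id": "b1" } ] }`)
	same, err := SameJSON(python, golang)
	fmt.Println(same, err)
}
```

A parity suite could replay recorded requests against both servers and assert SameJSON on each pair of bodies, alongside status codes and headers.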
References
- MIGRATION.md (singularity-memory repo): implementation arm.
- SPEC.md §16: Knowledge Layer.
- SPEC.md §17–18: Persistent Agents and Inter-Agent Messaging.
- ADR-012: Multi-instance federation (this is one of its surfaces).
- ADR-013: Network and remote-execution (deployment substrate).
- ADR-016: Charm AI stack adoption (frames the polyglot decision).
- charmbracelet/charm: KV with sync (auth/identity patterns ported here).
- charmbracelet/fantasy: agent runtime.
- charmbracelet/catwalk: provider/model registry.