# ADR-014: Singularity Knowledge + Agent Platform stack

**Date**: 2026-04-29
**Status**: proposed (deferred; capture for staged execution)

## Context

`SPEC.md` §16 defines a cross-instance knowledge layer (Singularity Memory). `SPEC.md` §17–18 define persistent agents and inter-agent messaging (status NEW). Today each sf instance carries its own local memory store (`memory-store.ts`), and persistent agents are not implemented at all.

Two trajectories converge:

- **Knowledge federates**: anti-patterns, learnings, and contracts should be reachable across sf instances and across other agent products on the tailnet (Hermes, OpenClaw, Claude Code, Cursor).
- **Persistent agents centralise**: long-lived cross-project agents (code-reviewer with cross-project memory, memory-curator, security-auditor, build-watch) are too heavy and too cross-cutting to live per-project.

These two needs collapse into one service: the **Singularity Knowledge + Agent Platform**, a single Go server hosting the federated memory store *and* the central persistent-agent runtime.

This ADR pins down the stack for that service.

The implementation arm of this ADR lives in [`singularity-memory/MIGRATION.md`](https://github.com/singularity-ng/singularity-memory/blob/main/MIGRATION.md).

## Decision

- **Language: Go.**
- **Storage backbone: Postgres + vchord** (existing), accessed from Go via `pgx`. No data migration; same schema, same vchord index.
- **Identity / auth / sync layer: `charmbracelet/charm`-server patterns** (SSH-key identity, JWT issuance, encrypted KV for user-level prefs and config). Adopted as ported library code; not run as a sidecar.
- **Agent runtime: `charmbracelet/fantasy`**, for multi-provider LLM access (Anthropic, OpenAI, Google, Bedrock, OpenRouter, etc. via `catwalk`). Used for embeddings/summarisation today; for full central persistent agents tomorrow.
- **HTTP API: Go `net/http` + chi or echo router**, serving the *exact* current OpenAPI contract.
- **MCP server: same wire protocol** as today's Python implementation. Clients (sf, Hermes, OpenClaw, Claude Code, Cursor) keep working unchanged.
- **CLI scaffolding: `charmbracelet/fang`.**
- **Observability: `promwish`-style Prometheus metrics**, scraped from a shared metrics endpoint.
- **Admin UI (Phase 3): `pony` + `ultraviolet`** for the view layer (reversed from earlier deferral; now adopted as a deliberate foundation bet, since admin UI tolerates churn better than user-facing surfaces). Served over SSH via `wish`.
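As a rough illustration of how the storage and HTTP decisions above fit together, the sketch below serves `GET /v1/banks` (the first endpoint named in the sequencing table) on plain `net/http`, behind a storage interface. Everything other than `net/http` and the route path is an assumption: `Bank`, `BankStore`, `memStore`, `encodeBanks`, and the JSON envelope are hypothetical stand-ins. The real response shape is whatever the existing OpenAPI contract specifies, and the real store would be backed by `pgx` rather than a slice.

```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// Bank is a hypothetical resource shape; the real one comes from the OpenAPI contract.
type Bank struct {
	ID   string `json:"id"`
	Name string `json:"name"`
}

// BankStore hides the storage backbone from the HTTP layer.
// In the real server this would be implemented on top of pgx.
type BankStore interface {
	ListBanks(ctx context.Context) ([]Bank, error)
}

// memStore is an in-memory stand-in used only for this sketch.
type memStore struct{ banks []Bank }

func (m memStore) ListBanks(ctx context.Context) ([]Bank, error) { return m.banks, nil }

// encodeBanks renders the (assumed) response envelope as JSON.
func encodeBanks(banks []Bank) ([]byte, error) {
	return json.Marshal(struct {
		Banks []Bank `json:"banks"`
	}{Banks: banks})
}

// banksHandler serves GET requests and rejects every other method.
func banksHandler(s BankStore) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodGet {
			http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
			return
		}
		banks, err := s.ListBanks(r.Context())
		if err != nil {
			http.Error(w, "internal error", http.StatusInternalServerError)
			return
		}
		body, err := encodeBanks(banks)
		if err != nil {
			http.Error(w, "internal error", http.StatusInternalServerError)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		w.Write(body)
	}
}

func main() {
	mux := http.NewServeMux()
	mux.Handle("/v1/banks", banksHandler(memStore{banks: []Bank{{ID: "b1", Name: "default"}}}))

	// Exercise the route in-process instead of binding a port.
	rec := httptest.NewRecorder()
	mux.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/v1/banks", nil))
	fmt.Println(rec.Code, rec.Body.String())
}
```

Keeping the handler a standard `http.Handler` means a chi router could replace the `ServeMux` without touching the handler. Swapping `memStore` for a `pgx`-backed implementation would likewise leave the HTTP layer unchanged.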

## Alternatives Considered

### Stack

- **Stay Python + FastAPI + Postgres.** Status quo. Works today.
  - *Rejected:* misses the foundation bet for central persistent agents (sf SPEC §17). Building those on Python + raw OpenAI/Anthropic SDK calls means retrofitting fantasy-style agent semantics later, which is a real refactor cost. The trigger to migrate isn't pain in the current server; it's laying the foundation for what comes next.
- **Rust + axum + Postgres.** Uniformly fast, but Charm's agentic ecosystem (fantasy, catwalk, wish, charm-server, the entire Bubble Tea family) is Go-native. Rust on the server side would mean reimplementing those abstractions or shelling out. Rejected: wrong ecosystem.
- **TypeScript + Node + Postgres.** Keeps language alignment with sf core. But sf is moving toward parallel-build (ADR-016): TS in sf core, Go in new services. The Node ecosystem has no equivalent to fantasy + charm-server + Wish. Rejected.

### Storage backbone

- **Replace Postgres + vchord with `charm-server`'s native KV.** `charm-server` is a personal/team encrypted KV; it's not a vector DB or BM25 index. We'd lose retrieval sophistication. Rejected.
- **Replace Postgres with `sqlite-vec`.** Embeddable single-binary deployment is appealing, but BM25 quality on `tsvector` is hard to match without a full re-tune, and we'd be redoing data migration on top. Rejected for v1; revisit in a v2 retrieval ADR if the Go server needs to ship without Postgres.
- **Keep Postgres + vchord, connect via Go `pgx`.** ← chosen. Battle-tested retrieval, zero data migration, focus the migration on language/runtime/agent-platform changes only.

### Agent runtime

- **Direct SDK calls (`anthropic-sdk-go`, `openai-go`, `go-genai`).** Simplest for today's narrow LLM use (embeddings + summarisation). But future central persistent agents need agent-loop semantics (multi-turn, tool calls); building those on raw SDKs reinvents fantasy's abstractions. Rejected: foundation bet.
- **Build our own agent runtime in Go.** Pure NIH. Rejected.
- **`charmbracelet/fantasy`.** ← chosen. 730 stars at the time of writing, actively developed, clean API, multi-provider via `catwalk`.
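To make "agent-loop semantics" concrete, here is a minimal sketch of the multi-turn, tool-calling loop that any agent runtime has to provide, and that direct SDK calls would leave us to write ourselves. It is deliberately not `fantasy`'s API: `Message`, `ToolCall`, `Reply`, `Model`, `Tool`, `runAgent`, and `scriptedModel` are hypothetical names invented for illustration, and the scripted model stands in for a real provider.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Message is one turn in the conversation (hypothetical shape).
type Message struct {
	Role    string // "user", "assistant", or "tool"
	Content string
}

// ToolCall is a model's request to run a named tool with one argument.
type ToolCall struct {
	Name string
	Arg  string
}

// Reply is either a final answer (Call == nil) or a tool request.
type Reply struct {
	Text string
	Call *ToolCall
}

// Model abstracts a provider; a real runtime would put an LLM behind this.
type Model interface {
	Next(history []Message) (Reply, error)
}

// Tool is a function the agent may invoke on the model's behalf.
type Tool func(arg string) (string, error)

// runAgent loops: ask the model, run any requested tool, feed the result back,
// and stop at the first plain-text reply or after maxTurns model calls.
func runAgent(m Model, tools map[string]Tool, prompt string, maxTurns int) (string, error) {
	history := []Message{{Role: "user", Content: prompt}}
	for turn := 0; turn < maxTurns; turn++ {
		reply, err := m.Next(history)
		if err != nil {
			return "", err
		}
		if reply.Call == nil {
			return reply.Text, nil
		}
		tool, ok := tools[reply.Call.Name]
		if !ok {
			return "", fmt.Errorf("unknown tool %q", reply.Call.Name)
		}
		out, err := tool(reply.Call.Arg)
		if err != nil {
			return "", err
		}
		history = append(history,
			Message{Role: "assistant", Content: "call " + reply.Call.Name},
			Message{Role: "tool", Content: out})
	}
	return "", errors.New("turn limit reached without a final answer")
}

// scriptedModel is a fake provider: it requests the "upper" tool once,
// then answers with whatever the tool returned.
type scriptedModel struct{}

func (scriptedModel) Next(history []Message) (Reply, error) {
	last := history[len(history)-1]
	if last.Role == "tool" {
		return Reply{Text: "result: " + last.Content}, nil
	}
	return Reply{Call: &ToolCall{Name: "upper", Arg: last.Content}}, nil
}

func main() {
	tools := map[string]Tool{
		"upper": func(arg string) (string, error) { return strings.ToUpper(arg), nil },
	}
	answer, err := runAgent(scriptedModel{}, tools, "review this", 4)
	fmt.Println(answer, err)
}
```

Even this toy version needs history management, tool dispatch, error handling, and a turn limit. A production loop adds streaming, retries, and provider differences on top, which is the work the "foundation bet" argument aims to avoid redoing.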

## Consequences

**Positive**

- **Foundation is right** for central persistent agents (sf SPEC §17). Adding new agents means defining their tools and system prompt, not rebuilding the runtime.
- **Single static Go binary** is operationally simpler than Python uv/venv + Alembic + worker on each deployment host.
- **Charm ecosystem alignment** with sf-worker (ADR-013), flight recorder (ADR-015), Charm TUI client (ADR-017). One language for the new-services tier.
- **Wire contract preserved**: clients are zero-touch.

**Negative**

- **Migration is a real undertaking**: ~12 weeks total, with the recall endpoint as the critical parity gate. See `MIGRATION.md`.
- **Polyglot deployment grows**: Python (during transition) + Go (new) + TS (sf core) + Rust (sf native). Bounded; once Python retires, three languages with clear boundaries.
- **`fantasy` and `pony` are pre-1.0**: API churn is real.

**Risks and mitigations**

- *Risk:* recall quality regression between Python and Go.
  - *Mitigation:* held-out evaluation set; ±2% recall@k threshold enforced in CI before flipping traffic.
- *Risk:* `pgx` + vchord custom-type decoder edge cases.
  - *Mitigation:* prove out in Phase 1 against a small endpoint; engage vchord author if blocked.
- *Risk:* `fantasy` API churn during the migration.
  - *Mitigation:* pin a version; one planned upgrade midway through the migration.
- *Risk:* central agents prove unworkable as a model and we've over-built the foundation.
  - *Mitigation:* the foundation cost is incremental (fantasy ≈ raw SDK + a thin abstraction). Worst case we use fantasy for embeddings only and never grow it. No wasted bet.
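The recall-parity mitigation above can be sketched as a small CI check. This is one reading of the "±2% recall@k" gate, assuming the threshold is an absolute difference of 0.02 in mean recall@k between the two servers; the ADR does not say whether the tolerance is absolute or relative. `Query`, `recallAtK`, `meanRecall`, and `withinParity` are hypothetical names, and result IDs are assumed to be unique within each ranked list.

```go
package main

import (
	"fmt"
	"math"
)

// Query pairs one held-out query's ranked results with its relevant set.
type Query struct {
	Results  []string
	Relevant map[string]bool
}

// recallAtK returns the fraction of relevant IDs found in the top k results.
func recallAtK(results []string, relevant map[string]bool, k int) float64 {
	if len(relevant) == 0 {
		return 0
	}
	if k > len(results) {
		k = len(results)
	}
	hits := 0
	for _, id := range results[:k] {
		if relevant[id] {
			hits++
		}
	}
	return float64(hits) / float64(len(relevant))
}

// meanRecall averages recall@k over an evaluation set.
func meanRecall(qs []Query, k int) float64 {
	if len(qs) == 0 {
		return 0
	}
	sum := 0.0
	for _, q := range qs {
		sum += recallAtK(q.Results, q.Relevant, k)
	}
	return sum / float64(len(qs))
}

// withinParity reports whether the candidate score is within tol of the baseline.
func withinParity(baseline, candidate, tol float64) bool {
	return math.Abs(baseline-candidate) <= tol
}

func main() {
	relevant := map[string]bool{"a": true, "b": true}
	python := []Query{{Results: []string{"a", "b", "c"}, Relevant: relevant}}
	golang := []Query{{Results: []string{"a", "c", "b"}, Relevant: relevant}}

	const k, tol = 2, 0.02
	base, cand := meanRecall(python, k), meanRecall(golang, k)
	fmt.Printf("python=%.2f go=%.2f pass=%v\n", base, cand, withinParity(base, cand, tol))
}
```

In CI, the two result sets would come from replaying the held-out queries against each server, and a failing `withinParity` would block the traffic flip. A two-sided check follows the "±" wording literally; a one-sided check that only fails on regression would be an equally plausible reading.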

## Out of Scope

- **Cross-tenant Singularity Memory**: single trust domain per deployment.
- **Retrieval-pipeline redesign**: BM25 + vector + RRF + reranker semantics are preserved exactly.
- **DB migration**: Postgres + vchord stay.
- **Public-internet endpoint**: tailnet only per ADR-013.
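For readers unfamiliar with the fusion step named in the retrieval-pipeline item above, reciprocal rank fusion (RRF) scores each document as the sum of 1/(k + rank) over the ranked lists it appears in. The sketch below fuses a BM25 list and a vector list. The constant k = 60 is the value commonly used in the RRF literature, not a value taken from the current server, and `fuseRRF` is a hypothetical name. The reranker stage is omitted.

```go
package main

import (
	"fmt"
	"sort"
)

// fuseRRF merges ranked ID lists with reciprocal rank fusion:
// score(d) = sum over lists of 1 / (k + rank), with rank starting at 1.
func fuseRRF(k float64, lists ...[]string) []string {
	scores := map[string]float64{}
	for _, list := range lists {
		for i, id := range list {
			scores[id] += 1.0 / (k + float64(i+1))
		}
	}
	ids := make([]string, 0, len(scores))
	for id := range scores {
		ids = append(ids, id)
	}
	// Sort by descending score; break ties by ID so the output is deterministic.
	sort.Slice(ids, func(a, b int) bool {
		if scores[ids[a]] != scores[ids[b]] {
			return scores[ids[a]] > scores[ids[b]]
		}
		return ids[a] < ids[b]
	})
	return ids
}

func main() {
	bm25 := []string{"doc1", "doc2", "doc3"}
	vector := []string{"doc2", "doc4", "doc1"}
	fmt.Println(fuseRRF(60, bm25, vector))
}
```

Because RRF depends only on ranks, not on raw BM25 or cosine scores, a Go port that reproduces each list's ordering should reproduce the fused ordering too. Tie-breaking is one place where two implementations can silently diverge, so the explicit tie-break here is a design choice worth matching against the Python behaviour.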

## Sequencing

| Phase | What | Cost |
|---|---|---|
| 0 | Prep: commit OpenAPI spec, build test suite, set up CI (per existing `TODO.md`) | 1–2 weeks |
| 1 | Greenfield Go scaffold parallel to Python; first endpoint (`GET /v1/banks`) | 2–3 weeks |
| 2 | Endpoint parity (recall is the critical gate) | 4–8 weeks |
| 3 | Worker + admin UI (`pony` + `ultraviolet` on `wish`) | 2–3 weeks |
| 4 | Central persistent-agent host (depends on sf SPEC §17 scoping) | variable |
| 5 | Python deprecation | 1 week |

Total: ~12 weeks for Phases 0–3 plus Phase 5 (the per-phase estimates sum to 10–17 weeks); Phase 4 lands when the sf-side agent layer is scoped.

## References

- `MIGRATION.md` (singularity-memory repo): implementation arm.
- `SPEC.md` §16: Knowledge Layer.
- `SPEC.md` §17–18: Persistent Agents and Inter-Agent Messaging.
- `ADR-012`: Multi-instance federation (this is one of its surfaces).
- `ADR-013`: Network and remote-execution (deployment substrate).
- `ADR-016`: Charm AI stack adoption (frames the polyglot decision).
- `charmbracelet/charm`: KV with sync (auth/identity patterns ported here).
- `charmbracelet/fantasy`: agent runtime.
- `charmbracelet/catwalk`: provider/model registry.