# ADR-014: Singularity Knowledge + Agent Platform stack
**Date**: 2026-04-29
**Status**: proposed (deferred — capture for staged execution)
## Context
`SPEC.md` §16 defines a cross-instance knowledge layer (Singularity Memory). `SPEC.md` §17–18 defines persistent agents and inter-agent messaging (status NEW). Today each sf instance carries its own local memory store (`memory-store.ts`), and persistent agents are not implemented at all.
Two trajectories converge:
- **Knowledge federates** — anti-patterns, learnings, contracts should be reachable across sf instances and across other agent products on the tailnet (Hermes, OpenClaw, Claude Code, Cursor).
- **Persistent agents centralise** — long-lived cross-project agents (code-reviewer with cross-project memory, memory-curator, security-auditor, build-watch) are too heavy and too cross-cutting to live per-project.
These two needs collapse into one service: the **Singularity Knowledge + Agent Platform** — a single Go server hosting the federated memory store *and* the central persistent-agent runtime.
This ADR fixes the stack.
The implementation arm of this ADR lives in [`singularity-memory/MIGRATION.md`](https://github.com/singularity-ng/singularity-memory/blob/main/MIGRATION.md).
## Decision
- **Language: Go.**
- **Storage backbone: Postgres + vchord** (existing) — accessed from Go via `pgx`. No data migration; same schema, same vchord index.
- **Identity / auth / sync layer: `charmbracelet/charm`-server patterns** — SSH-key identity, JWT issuance, encrypted KV for user-level prefs and config. Adopted as ported library code; not run as a sidecar.
- **Agent runtime: `charmbracelet/fantasy`** — multi-provider LLM access (Anthropic, OpenAI, Google, Bedrock, OpenRouter, etc. via `catwalk`). Used for embeddings/summarisation today; for full central persistent agents tomorrow.
- **HTTP API: Go `net/http` + chi or echo router**, serving the *exact* current OpenAPI contract.
- **MCP server: same wire protocol** as today's Python implementation. Clients (sf, Hermes, OpenClaw, Claude Code, Cursor) keep working unchanged.
- **CLI scaffolding: `charmbracelet/fang`.**
- **Observability: `promwish`-style Prometheus metrics**, scraped from a shared metrics endpoint.
- **Admin UI (Phase 3): `pony` + `ultraviolet`** for the view layer (reversed from earlier deferral; now adopted as a deliberate foundation bet — admin UI tolerates churn better than user-facing surfaces). Served over SSH via `wish`.
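
As a rough illustration of the HTTP API item above, here is a minimal sketch of one read-only endpoint built on the standard library's `net/http` alone. The path `/v1/banks` is the Phase 1 endpoint named in the Sequencing table. Everything else (the `Bank` fields, the `BankLister` interface, the `staticBanks` stub, the `statusFor` helper) is hypothetical, and the real response shape is whatever the committed OpenAPI spec defines.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// Bank is a hypothetical response item. The real fields come from the
// existing OpenAPI contract, which this sketch does not reproduce.
type Bank struct {
	ID   string `json:"id"`
	Name string `json:"name"`
}

// BankLister is a hypothetical seam over the storage layer, so the handler
// can be exercised without Postgres.
type BankLister interface {
	ListBanks() ([]Bank, error)
}

// staticBanks is an in-memory stand-in for a pgx-backed implementation.
type staticBanks []Bank

func (s staticBanks) ListBanks() ([]Bank, error) { return s, nil }

// newMux wires the single endpoint and rejects every method except GET.
func newMux(store BankLister) *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/v1/banks", func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodGet {
			w.Header().Set("Allow", http.MethodGet)
			http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
			return
		}
		banks, err := store.ListBanks()
		if err != nil {
			http.Error(w, "internal error", http.StatusInternalServerError)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		_ = json.NewEncoder(w).Encode(banks) // headers already sent; nothing useful to do on error
	})
	return mux
}

// statusFor sends one in-process request and returns the status code.
func statusFor(h http.Handler, method, path string) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(method, path, nil))
	return rec.Code
}

func main() {
	mux := newMux(staticBanks{{ID: "b1", Name: "default"}})
	fmt.Println(statusFor(mux, "GET", "/v1/banks"))
	fmt.Println(statusFor(mux, "POST", "/v1/banks"))
}
```

With chi, which accepts standard `http.HandlerFunc` values, the handler body would carry over largely unchanged; only the route registration would differ.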
## Alternatives Considered
### Stack
- **Stay Python + FastAPI + Postgres.** Status quo. Works today.
- *Rejected:* misses the foundation bet for central persistent agents (sf SPEC §17). Building those on Python + raw OpenAI/Anthropic SDK calls means retrofitting fantasy-style agent semantics later — real refactor cost. The trigger to migrate isn't pain in the current server; it's foundation laying for what comes next.
- **Rust + axum + Postgres.** Uniformly fast, but Charm's agentic ecosystem (fantasy, catwalk, wish, charm-server, the entire Bubble Tea family) is Go-native. Rust on the server side would mean reimplementing those abstractions or shelling out. Rejected — wrong ecosystem.
- **TypeScript + Node + Postgres.** Keeps language alignment with sf core. But sf is moving toward parallel-build (ADR-016): TS in sf core, Go in new services. The Node ecosystem doesn't have an equivalent to fantasy + charm-server + Wish. Rejected.
### Storage backbone
- **Replace Postgres + vchord with `charm-server`'s native KV.** `charm-server` is a personal/team encrypted KV; it's not a vector DB or BM25 index. We'd lose retrieval sophistication. Rejected.
- **Replace Postgres with `sqlite-vec`.** Embeddable single-binary deployment is appealing, but BM25 quality on `tsvector` is hard to match without a full re-tune, and we'd be taking on a data migration on top of the language migration. Rejected for v1; revisit in a v2 retrieval ADR if the Go server needs to ship without Postgres.
- **Keep Postgres + vchord, connect via Go `pgx`.** ← chosen. Battle-tested retrieval, zero data migration, focus the migration on language/runtime/agent-platform changes only.
### Agent runtime
- **Direct SDK calls (`anthropic-sdk-go`, `openai-go`, `go-genai`).** Simplest for today's narrow LLM use (embeddings + summarisation). But future central persistent agents need agent-loop semantics (multi-turn, tool calls); building those on raw SDKs reinvents fantasy's abstractions. Rejected — foundation bet.
- **Build our own agent runtime in Go.** Pure NIH. Rejected.
- **`charmbracelet/fantasy`.** ← chosen. 730 stars, actively developed, clean API, multi-provider via `catwalk`.
## Consequences
**Positive**
- **Foundation is right** for central persistent agents (sf SPEC §17). Adding new agents means defining their tools and system prompt, not rebuilding the runtime.
- **Single static Go binary** is operationally simpler than Python uv/venv + Alembic + worker on each deployment host.
- **Charm ecosystem alignment** with sf-worker (ADR-013), flight recorder (ADR-015), Charm TUI client (ADR-017). One language for the new-services tier.
- **Wire contract preserved** — clients are zero-touch.
**Negative**
- **Migration is a real undertaking** — ~12 weeks total, with the recall endpoint as the critical parity gate. See `MIGRATION.md`.
- **Polyglot deployment grows** — Python (during transition) + Go (new) + TS (sf core) + Rust (sf native). Bounded; once Python retires, three languages with clear boundaries.
- **`fantasy` and `pony` are pre-1.0** — API churn is real.
**Risks and mitigations**
- *Risk:* recall quality regression between Python and Go.
- *Mitigation:* held-out evaluation set; ±2% recall@k threshold enforced in CI before flipping traffic.
- *Risk:* `pgx` + vchord custom-type decoder edge cases.
- *Mitigation:* prove out in Phase 1 against a small endpoint; engage vchord author if blocked.
- *Risk:* `fantasy` API churn during the migration.
- *Mitigation:* pin a version; one planned upgrade midway through the migration.
- *Risk:* central agents prove unworkable as a model and we've over-built the foundation.
- *Mitigation:* the foundation cost is incremental (fantasy ≈ raw SDK + a thin abstraction). Worst case we use fantasy for embeddings only and never grow it. No wasted bet.
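
The recall-parity mitigation above could be enforced with a check along these lines. This is a sketch under stated assumptions: the ±2% threshold is read as an absolute difference of 0.02 in mean recall@k (the ADR does not say whether it is absolute or relative), and the `EvalQuery` type, the function names, and the sample data are all hypothetical.

```go
package main

import "fmt"

// set builds a string set from its arguments.
func set(ids ...string) map[string]struct{} {
	s := make(map[string]struct{}, len(ids))
	for _, id := range ids {
		s[id] = struct{}{}
	}
	return s
}

// RecallAtK returns the fraction of relevant IDs found in the first k
// retrieved IDs. Duplicate retrieved IDs are counted once.
func RecallAtK(retrieved []string, relevant map[string]struct{}, k int) float64 {
	if len(relevant) == 0 || k <= 0 {
		return 0
	}
	if k > len(retrieved) {
		k = len(retrieved)
	}
	seen := make(map[string]struct{}, k)
	hits := 0
	for _, id := range retrieved[:k] {
		if _, dup := seen[id]; dup {
			continue
		}
		seen[id] = struct{}{}
		if _, ok := relevant[id]; ok {
			hits++
		}
	}
	return float64(hits) / float64(len(relevant))
}

// EvalQuery is one held-out query with results from both servers.
// Baseline stands for the Python server, Candidate for the Go server.
type EvalQuery struct {
	Relevant  map[string]struct{}
	Baseline  []string
	Candidate []string
}

// MeanRecallAtK averages recall@k over the evaluation set for one server.
func MeanRecallAtK(queries []EvalQuery, k int, useCandidate bool) float64 {
	if len(queries) == 0 {
		return 0
	}
	sum := 0.0
	for _, q := range queries {
		results := q.Baseline
		if useCandidate {
			results = q.Candidate
		}
		sum += RecallAtK(results, q.Relevant, k)
	}
	return sum / float64(len(queries))
}

// WithinTolerance reports whether candidate is within ±tol of baseline.
func WithinTolerance(baseline, candidate, tol float64) bool {
	d := candidate - baseline
	if d < 0 {
		d = -d
	}
	return d <= tol
}

func main() {
	queries := []EvalQuery{
		{Relevant: set("a", "b"), Baseline: []string{"a", "x", "b"}, Candidate: []string{"b", "a", "y"}},
		{Relevant: set("c"), Baseline: []string{"c", "z"}, Candidate: []string{"z", "c"}},
	}
	const k = 2
	base := MeanRecallAtK(queries, k, false)
	cand := MeanRecallAtK(queries, k, true)
	fmt.Printf("baseline=%.3f candidate=%.3f pass=%v\n", base, cand, WithinTolerance(base, cand, 0.02))
}
```

A symmetric gate also flags a candidate that scores noticeably higher than the baseline, which is useful here because the stated goal is parity rather than improvement: a large shift in either direction suggests the two pipelines are not equivalent.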
## Out of Scope
- **Cross-tenant Singularity Memory** — single trust domain per deployment.
- **Retrieval-pipeline redesign** — BM25 + vector + RRF + reranker semantics are preserved exactly.
- **DB migration** — Postgres + vchord stay.
- **Public-internet endpoint** — tailnet only per ADR-013.
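
For readers unfamiliar with the RRF step named in the retrieval-pipeline item above, the following sketch shows textbook reciprocal rank fusion over two ranked lists, where each document scores the sum of 1/(k + rank) across the lists it appears in. It is not taken from the existing Python implementation: the constant k = 60 is the value commonly used in the literature, and the function name, tie-break rule, and document IDs are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// FuseRRF merges ranked ID lists with reciprocal rank fusion.
// Each list is assumed to contain unique IDs ordered best-first.
// Ties are broken by ID so the output is deterministic.
func FuseRRF(k float64, rankings ...[]string) []string {
	scores := make(map[string]float64)
	for _, ranking := range rankings {
		for i, id := range ranking {
			// Ranks are 1-based, so the top item contributes 1/(k+1).
			scores[id] += 1.0 / (k + float64(i+1))
		}
	}
	ids := make([]string, 0, len(scores))
	for id := range scores {
		ids = append(ids, id)
	}
	sort.Slice(ids, func(a, b int) bool {
		if scores[ids[a]] != scores[ids[b]] {
			return scores[ids[a]] > scores[ids[b]]
		}
		return ids[a] < ids[b]
	})
	return ids
}

func main() {
	bm25 := []string{"d1", "d2", "d3"}   // hypothetical lexical ranking
	vector := []string{"d3", "d1", "d4"} // hypothetical vector ranking
	fmt.Println(FuseRRF(60, bm25, vector)) // prints [d1 d3 d2 d4]
}
```

Because the fused order depends on k and on tie-breaking, preserving the semantics exactly means the Go port has to reuse whatever constant and ordering rules the current pipeline applies, not the illustrative ones shown here.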
## Sequencing
| Phase | What | Estimated duration |
|---|---|---|
| 0 | Prep: commit OpenAPI spec, build test suite, set up CI (per existing `TODO.md`) | 1–2 weeks |
| 1 | Greenfield Go scaffold parallel to Python; first endpoint (`GET /v1/banks`) | 2–3 weeks |
| 2 | Endpoint parity (recall is the critical gate) | 4–8 weeks |
| 3 | Worker + admin UI (`pony` + `ultraviolet` on `wish`) | 2–3 weeks |
| 4 | Central persistent-agent host (depends on sf SPEC §17 scoping) | variable |
| 5 | Python deprecation | 1 week |
Total: ~12 weeks for Phases 0–3 plus Phase 5; Phase 4 lands when the sf-side agent layer is scoped.
## References
- `MIGRATION.md` (singularity-memory repo) — implementation arm.
- `SPEC.md` §16 — Knowledge Layer.
- `SPEC.md` §17–18 — Persistent Agents and Inter-Agent Messaging.
- `ADR-012` — Multi-instance federation (this is one of its surfaces).
- `ADR-013` — Network and remote-execution (deployment substrate).
- `ADR-016` — Charm AI stack adoption (frames the polyglot decision).
- `charmbracelet/charm` — KV with sync (auth/identity patterns ported here).
- `charmbracelet/fantasy` — agent runtime.
- `charmbracelet/catwalk` — provider/model registry.