singularity-forge/docs/dev/ADR-016-charm-ai-stack-adoption.md

8.5 KiB
Raw Permalink Blame History

ADR-016: Charm AI stack adoption strategy

Date: 2026-04-29 Status: accepted (strategic frame; concrete decisions in ADR-013/014/015/017)

Context

Older SPEC notes retargeted sf v3 from Go-on-Crush to TypeScript-on-pi-mono. That decision still stands for sf core. But over the past year the Charm ecosystem has matured to a point that adjacent services would be better served by it:

  • fantasy (730 stars, pushed today) — multi-provider AI agent SDK in Go. Equivalent of pi-ai. Wasn't this complete at retarget time.
  • catwalk (688 stars) — provider/model registry used by crush.
  • crush (23,641 stars) — Charm's agentic coding CLI.
  • charm (2,491 stars) — encrypted KV with sync, self-hostable as charm-server. Foundation for cross-instance state.
  • wish (5,158 stars) + wishlist + promwish — full SSH-served service stack with built-in metrics.
  • bubbletea (41,946 stars) + bubbles + lipgloss + glamour + huh + harmonica — production TUI stack.
  • x/vt, x/ansi, x/cellbuf, x/mosaic, x/vcr, x/xpty, x/conpty, x/editor, x/sshkey, x/term — bleeding-edge primitives, all actively maintained.
  • pony + ultraviolet — next-gen declarative TUI markup. Pre-1.0 / experimental.
  • anthropic-sdk-go, openai-go, go-genai — Charm-maintained Go LLM SDKs.

The question the historical retarget didn't have to answer: now that this much is here, do we migrate?

Decision

Option A — Parallel build, no core migration.

  • sf core (TypeScript on pi-mono): unchanged. The historical retarget rationale stands. Pi-mono SDK alignment, MCP client boundary, ~200+ TS files, real production users — none of it justifies a 36 month rewrite.
  • New services: Go on Charm, comprehensively. sf-worker (ADR-013), Singularity Knowledge + Agent Platform (ADR-014), flight recorder (ADR-015), Charm TUI client (ADR-017) — all in Go using the Charm ecosystem.
  • Native engine (Rust): permanent. ~11k LOC in rust-engine/ (git, text, forge_parser, grep, highlight, ast, diff, etc.) is best-of-breed and not re-implementable in Go without losing performance. Bindings (napi-rs from TS today; cgo from Go for new services if needed) flex per consumer.
  • Pony adoption: now, not deferred. Reversed from initial conservative stance. Adopting pony from day one in Phase-3 admin surfaces (Singularity Memory admin UI, future audit dashboards) — admin tolerates churn better than user-facing surfaces, and the foundation bet pays back if pony stabilises.
  • Other charmbracelet/x/* packages: adopted comprehensively. When a new Go service needs a primitive (image rendering, session recording, pty, editor, input handling), use the x/* package. Don't reinvent.
  • Re-evaluation trigger: 12 months from first Go service in production. If >50% of new sf code lands in Go services, the question of consolidating sf core becomes worth re-asking. Until then, polyglot is the right cost shape.

Alternatives Considered

  • Option B — Soft migration, gradual rewrite. Use Charm for new code AND opportunistically rewrite TS modules in Go when they need substantial work anyway. Eventually sf core drifts to Go.
    • Rejected: rolling polyglot across the same logical layer is harder to reason about than per-service polyglot at clean boundaries. Some PRs would bridge languages mid-feature; CI complexity grows.
  • Option C — Big-bang migration. Re-fork from crush, port sf's auto-loop, gates, planner, harness, skills doctrine into Go.
    • Rejected: 36 months of no feature shipping; production users disrupted; loss of pi-mono SDK upstream alignment. The retarget rationale isn't fully invalidated — the only argument it relied on that's weakened is "70% of Crush is duplicated in pi-mono", and even that's still true and the cost of rewriting outweighs the duplication tax.
  • Status quo — keep everything in TS, including new services.
    • Rejected: Node ecosystem doesn't have equivalents for wish, fantasy, charm-server, the comprehensive TUI/SSH/AI stack Charm provides. Building these in TS would be reinventing maturer Go libraries. New services in Charm are just easier.

Architectural Picture

[Charm TUI client]    Go      ← ADR-017: pony + ultraviolet + bubbles + lipgloss +
                                glamour + huh + harmonica + x/mosaic
[Singularity Knowledge + Agent Platform]   Go   ← ADR-014: charm-server + fantasy +
                                                  Postgres+vchord + pony admin
[sf-worker SSH host]  Go      ← ADR-013: wish + xpty/conpty + promwish
[Flight recorder]     Go      ← ADR-015: x/vcr
        │
        │   RPC / MCP / SSH / HTTP
        ▼
[sf daemon + core]    TS      ← unchanged, pi-mono SDK aligned
        │
        │   napi-rs
        ▼
[native engine]       Rust    ← permanent, ~11k LOC

Three languages, three clean boundaries. Each layer using the stack that fits.

Consequences

Positive

  • No 36 month feature freeze — sf core ships normally during the new-service build-out.
  • Right tool for each layer — Go's ecosystem advantages (Wish, fantasy, charm-server) accrue without disrupting what already works in TS.
  • Strategic optionality — pony and ultraviolet bets are localised to admin surfaces; if they fail, only those views need swapping.
  • Comprehensive adoption beats piecemeal — using Charm's stack across multiple new services means we develop deep ecosystem familiarity, can share patterns across services, and contribute back upstream where useful.

Negative

  • Polyglot deployment — TS + Go + Rust + (transitional Python during Singularity Memory migration). Three or four runtimes on a single host. Operationally manageable; not free.
  • Pi-mono SDK alignment is one-way — Charm's stack improvements don't flow to sf core. We get pi-mono updates upstream; we don't get fantasy updates upstream-of-sf.
  • Cross-language refactors are harder — when an interface between TS and Go needs to change, both sides need a coordinated PR. Mitigated by stable RPC/MCP/SSH-stdio contracts.

Risks and mitigations

  • Risk: fantasy or pony API churn breaks builds repeatedly.
    • Mitigation: pin versions; planned upgrade windows; pony swappable via clean view-layer separation.
  • Risk: Charm pivots away from one of these libraries.
    • Mitigation: Charm's stack is large and self-reinforcing; abandonment of a single piece (e.g., pony, which is experimental) is recoverable. Foundation libs (bubbletea, wish, lipgloss, glamour) are mature with strong commit cadence and unlikely to be abandoned.
  • Risk: Re-evaluation in 12 months says "actually we should consolidate to Go" and we've now got 12 months of TS-only work to throw out.
    • Mitigation: sf core code from now until then stays useful even if a future migration happens — it documents requirements and behaviour. Worst case it becomes the spec the Go rewrite implements.

Out of Scope (explicit non-decisions to keep them from re-emerging)

  • Migrating pi-mono SDK to Go. No.
  • Replacing pi-tui in sf core with a Charm TUI in-process. No — Charm TUI is a separate client (ADR-017), pi-tui stays in core until that client reaches parity, then deprecates.
  • Adopting Crush as the agent loop. No — pi-coding-agent stays.
  • Migrating native Rust to Go. No — Rust is best-of-breed for what it does.
  • Self-hosting a Charm Cloud account / charm-server as a separate sidecar. No — port charm-server patterns (auth/identity) as library code into our Go services.

Sequencing

When Action
Now This ADR captures the strategic frame. Concrete service builds tracked in 013/014/015/017.
12 months from first Go service in production Re-evaluate. Audit polyglot deployment costs vs. consolidation benefit. If >50% of new sf code is Go AND ops cost of polyglot is non-trivial AND TS sf core has shrunk substantially (post-pi-tui-deprecation), open a successor ADR proposing Option C (big-bang).

References

  • Older SPEC notes §1 — original retarget rationale (TS-on-pi-mono over Go-on-Crush).
  • ADR-013 — Network + remote execution (concrete: sf-worker).
  • ADR-014 — Singularity Knowledge + Agent Platform (concrete: SM rewrite).
  • ADR-015 — Flight recorder (concrete: x/vcr-based).
  • ADR-017 — Charm TUI client (concrete: pi-tui replacement).
  • BUILD_PLAN.md — tier-based execution tracking.
  • Charm org: https://github.com/orgs/charmbracelet — full ecosystem inventory.