feat: SF strengthening + ADR-020 wire architecture (Phases 1-2)

Phase 1 — close SF-side polish gaps:

- codebase-generator: distinguish uv/poetry/pdm in Python stack-signals;
  surface configured tooling (ruff/mypy/pyright) when config files exist
- doctor-environment: new checkPythonEnvironment — detects uv/poetry/pdm
  via lockfile, verifies binary on PATH, warns with install hint when missing
- doctor-environment: new checkSiftAvailable — recommends sift install for
  repos > 5000 source files when not on PATH
- tech-debt-tracker: documented future memory-as-sub-extension extraction
  (defer until real backend-swap requirement)

Phase 2 — internal wire architecture:

- ADR-020: singularity-grpc as shared schema repo; gRPC + typed clients
  for first-party services; MCP façade only at external-tool boundary
- ADR-019: trimmed MCP scope section to a 3-line summary linking to ADR-020
  to avoid the wire-format table living in two places
- design-docs/index.md: ADR-020 added to ADR table

These changes make SF stronger for autonomous work on Python repos
(particularly ace-coder) and capture the internal wire architecture
decision as a durable ADR before any singularity-grpc code lands.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Mikael Hugo 2026-05-02 00:03:34 +02:00
parent 3d8e8c5d57
commit 064dff2f0f
6 changed files with 325 additions and 23 deletions


@@ -24,6 +24,7 @@ in `docs/dev/`. Lighter design docs (problem framing, event model decisions) liv
| [ADR-017](../dev/ADR-017-charm-tui-client.md) | Charm TUI Client | Proposed |
| [ADR-018](../dev/ADR-018-repo-native-harness-evolution.md) | Repo-Native Harness Evolution | Proposed — staged impl |
| [ADR-019](../dev/ADR-019-workspace-vm-convergence.md) | Workspace VM Convergence — SF↔ACE incremental convergence via microVM execution layer | Proposed |
| [ADR-020](../dev/ADR-020-internal-wire-architecture.md) | Internal Wire Architecture — `singularity-grpc` shared schema repo, gRPC for first-party services, MCP at external-tool boundary only | Proposed |
## Design Docs (this directory)


@@ -175,27 +175,9 @@ workspace VM primitive is stable.
## MCP scope
MCP is **not** the production wire for this system. The rule:
| Caller | Callee | Wire | Why |
|--------|--------|------|-----|
| ACE host → ACE tools | in-process Python imports | function call | type-safe, zero overhead |
| ACE host → singularity-memory | typed Python client (gen from Go API) | HTTP/gRPC | typed, fast, refactorable |
| SF → singularity-memory | typed TS client (gen from Go API) | HTTP/gRPC | same, in TS |
| SF → ACE worker | existing JSON-RPC stdio (`rpc-client`) | stdio JSON-RPC | already in production, language-agnostic |
| ACE worker VM → host | direct gRPC over tailnet | gRPC | typed, low-latency |
| Claude Code / Cursor → singularity-memory | MCP façade | MCP | external tool, no shared types |
| Claude Code → ACE | MCP façade (temporary) | MCP | external coder helping build, until self-hosting |
MCP exists only at the **boundary to external LLM-driven coding tools** that don't
share our type system. It is a scaffold for the period when external coders
(Claude Code, Cursor, third-party agents) help build the system. As the system
becomes self-hosting, the MCP surface shrinks to whatever third parties still
need to integrate against.
Internally everything is real agentic tools — Python functions, generated typed
clients, direct calls. No JSON-RPC framing where the caller and callee share a
build system.
Internal services use typed direct clients (gRPC for first-party). MCP is reserved
for external coding tools (Claude Code, Cursor) that don't share our build system.
See [ADR-020](./ADR-020-internal-wire-architecture.md) for the full wire-format table and rationale.
---


@@ -0,0 +1,166 @@
# ADR-020: Internal Wire Architecture — singularity-grpc and MCP Scope
**Status:** Proposed
**Date:** 2026-05-02
**Deciders:** Mikael Hugo
**Context repos:** `singularity-grpc` (new), `singularity-forge` (SF), `ace-coder` (ACE), `singularity-memory`
---
## Context
The first-party services that make up the singularity stack — SF
(`singularity-forge`), ACE (`ace-coder`), `singularity-memory`, and the
future `singularity-*` services — need a wire format for talking to each
other. Until now this has been an open question, with MCP (Model Context
Protocol) sometimes treated as the default because it was already in use
for external coding tools.
**Using MCP as the internal wire causes real problems.** Concretely:
- **String-typed tools.** Every cross-service call is funnelled through
a string tool name and a JSON blob. Refactoring a callsite means
grepping for strings; type checkers cannot follow the call graph.
- **JSON-RPC framing tax.** Every request pays the cost of JSON
marshalling, schema-less envelope handling, and per-message protocol
overhead, even for hot internal paths.
- **Namespace pollution in agent context.** Internal service tools
exposed via MCP show up alongside user-facing tools in the agent's
tool list, crowding the LLM's context with infrastructure plumbing
that should never have been a "tool" in the first place.
- **Lost typing across language boundaries.** SF (TypeScript), ACE
(Python), and `singularity-memory` (Go) all share a build system and
can generate native clients from a single schema source. MCP throws
that away on every call.
Internal services share our build system. External LLM-driven coding
tools (Claude Code, Cursor, third-party agents) do not. Conflating those
two audiences into one wire format is the root cause of the confusion.
---
## Decision
1. **Create `singularity-grpc`** as the shared schema-and-runtime repo
for first-party services. It owns common proto types, the codegen
toolchain, and per-language runtime helpers (auth, tracing, retry).
2. **First-party services talk to each other via gRPC** with typed
clients generated from each service's protos. SF, ACE, and
`singularity-memory` all consume `singularity-grpc` for shared types
and runtime.
3. **MCP is reserved for the external-tool boundary only.** It exists
as a façade for LLM-driven coding tools that don't share our build
system. It is a temporary scaffold for the period when external
coders help build the platform; the surface shrinks as the system
becomes self-hosting.
### `singularity-grpc` repo structure
```
singularity-grpc/
├── proto/
│   └── common/           # shared types: notification, error, identity,
│                         #   workspace base, pagination, timestamps
├── tools/
│   └── codegen/          # proto → Python / TypeScript / Go / Rust
├── runtime/
│   ├── python/           # auth interceptors, tracing, retry helpers
│   ├── typescript/       # same, in TS
│   └── go/               # same, in Go
└── docs/
    ├── conventions.md    # naming, versioning, error model
    ├── auth.md           # how services authenticate to each other
    └── why-not-mcp.md    # the rationale captured here, expanded
```
### Per-service ownership
Each service owns its own protos and server implementation. SF owns
nothing service-specific because SF *consumes* clients published by
other services — it does not expose a gRPC API of its own.
| Repo | Owns | Imports from `singularity-grpc` |
|------|------|---------------------------------|
| `ace-coder` | `api/proto/workspace.proto`, `api/proto/htdag.proto`, server impl, generated clients, the SF extension that registers the ACE engine | common types, runtime helpers, codegen |
| `singularity-memory` | `api/proto/memory.proto`, server impl, generated clients in 3 languages (Python, TS, Go) | common types, runtime helpers, codegen |
| `singularity-forge` | nothing service-specific — SF is a *consumer* of generated clients published by `ace-coder` and `singularity-memory` | nothing direct |
### The wire-format table (canonical)
This is the canonical reference for which wire goes where; a typed-client sketch of the SF → singularity-memory row follows the table.
| Caller | Callee | Wire | Why |
|--------|--------|------|-----|
| ACE host → ACE tools | in-process Python imports | function call | type-safe, zero overhead |
| ACE host → singularity-memory | typed Python client (gen from Go API) | HTTP/gRPC | typed, fast, refactorable |
| SF → singularity-memory | typed TS client (gen from Go API) | HTTP/gRPC | same, in TS |
| SF → ACE worker | existing JSON-RPC stdio (`rpc-client`) | stdio JSON-RPC | already in production, language-agnostic |
| ACE worker VM → host | direct gRPC over tailnet | gRPC | typed, low-latency |
| Claude Code / Cursor → singularity-memory | MCP façade | MCP | external tool, no shared types |
| Claude Code → ACE | MCP façade (temporary) | MCP | external coder helping build, until self-hosting |
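To make the contrast concrete, here is a minimal TypeScript sketch of the SF → singularity-memory row versus the string-typed MCP-style call it replaces. The interfaces and method names are illustrative placeholders, not the generated API; the real shapes come from `memory.proto` codegen once `singularity-grpc` exists.
```ts
// Hypothetical shapes: the real ones come from `memory.proto` codegen.
// Shown only to contrast a typed client call with the string-typed
// MCP-style equivalent.
interface SearchMemoryRequest {
  query: string;
  limit: number;
}
interface SearchMemoryResponse {
  hits: { id: string; score: number; snippet: string }[];
}
interface MemoryClient {
  searchMemory(req: SearchMemoryRequest): Promise<SearchMemoryResponse>;
}

// Typed path (first-party): a schema change breaks this at compile time,
// and refactors propagate through every callsite via the type checker.
async function findContext(memory: MemoryClient, query: string) {
  const res = await memory.searchMemory({ query, limit: 10 });
  return res.hits.map((h) => h.snippet);
}

// String-typed path (MCP-style): the same call as a tool name plus a JSON
// blob. Renaming the tool or a field is invisible to the compiler; it fails
// at runtime, if at all. This is the problem the Context section describes.
type ToolCall = (name: string, args: unknown) => Promise<unknown>;
async function findContextViaMcp(callTool: ToolCall, query: string) {
  return callTool("memory_search", { query, limit: 10 });
}
```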
---
## Consequences
- **Removes string-typed tool naming from cross-service calls.**
Refactoring a service API now changes generated client code, which
the type checker enforces at every callsite.
- **Removes JSON-RPC framing tax for typed services.** Hot paths
between SF/ACE/memory pay only the gRPC binary protocol cost.
- **Existing `packages/rpc-client/` stdio JSON-RPC stays.** It is the
Codex-compatible RPC mode for driving SF as a child process. That is
a different concern from service-to-service typed wires and is not
affected by this ADR.
- **MCP is scoped narrowly.** Only external LLM-driven tools that don't
share our build system get an MCP façade. As the platform self-hosts,
this surface shrinks.
- **`singularity-grpc` becomes a versioning chokepoint** for shared
types. This is intentional — it forces a single source of truth for
the cross-service vocabulary (notifications, errors, identity).
---
## Naming
The repo is named **`singularity-grpc`** as a forcing function. The
name commits the project to gRPC as the implementation, which prevents
the kind of "let's just use HTTP/JSON for now" drift that produced the
MCP-everywhere situation in the first place.
Alternative names considered:
- **`singularity-wire`** — transport-agnostic. Rejected because the
ambiguity is exactly what we want to avoid; the name has to commit.
- **`singularity-rpc`** — more general. Rejected for the same reason;
too easy to backslide into JSON-RPC.
If gRPC ever needs to be replaced (e.g. with Cap'n Proto or a future
standard), the rename will be a deliberate, visible decision rather
than a silent transport swap inside a generic-named repo.
---
## Alternatives considered
- **Stay on MCP for everything.** Rejected. The string-typing,
framing tax, and namespace pollution problems compound as more
services are added. Refactor pain grows non-linearly.
- **HTTP/JSON without typed clients.** Rejected. Same string-typing
problem at smaller scale; no schema enforcement across services;
every service reinvents auth, retry, and error handling.
- **Cap'n Proto / FlatBuffers.** Rejected for now. gRPC has better
tooling, broader language support, and a mature ecosystem. The
schema layer (proto3) is portable enough that we can swap the
runtime later if a specific workload demands it.
---
## References
- [ADR-019](./ADR-019-workspace-vm-convergence.md) — Workspace VM
Convergence (this ADR replaces ADR-019's "MCP scope" section)
- [ADR-014](./ADR-014-singularity-knowledge-and-agent-platform.md) —
singularity-memory Go migration
- [ADR-013](./ADR-013-network-and-remote-execution.md) — tailnet +
remote execution substrate


@@ -67,3 +67,17 @@
**Verification:** After any planning milestone, `.sf/sf.db` contains a `repo_profiles` row for the current session.
**Tracked in:** [active/index.md — ADR-018 Phase 1](./active/index.md)
---
## Memory subsystem extraction into swappable sub-extension
**Location:** `src/resources/extensions/sf/memory-store.ts`, `src/resources/extensions/sf/memory-extractor.ts`, `src/resources/extensions/sf/memory-relations.ts`, `src/resources/extensions/sf/memory-sleeper.ts`
**Impact:** SF's memory subsystem (separate from singularity-memory, which serves hermes/ACE) is coupled directly into SF core. Plugging in a different backend — e.g. a vchord-in-docker container for local vector search — is harder than it needs to be.
**Proposed fix:** Extract the four files above into a sub-extension at `src/resources/extensions/sf-memory/` behind a small backend interface (sketched below), so SF can swap implementations without touching core.
**When to do it:** Defer until a real backend-swap requirement appears (e.g. SF actually needs vector search). Not blocking.
**Estimated effort:** Small — these files have clean module boundaries already.
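For reference, a minimal sketch of the backend seam this extraction would introduce; the interface and method names are illustrative, not the actual module API.
```ts
// Illustrative sketch only: the real interface would be derived from what
// memory-store.ts / memory-extractor.ts / memory-relations.ts actually need.
export interface MemoryRecord {
  id: string;
  content: string;
  createdAt: Date;
}

export interface MemoryBackend {
  store(record: MemoryRecord): Promise<void>;
  search(query: string, limit: number): Promise<MemoryRecord[]>;
  relate(fromId: string, toId: string, kind: string): Promise<void>;
}

// SF core would depend only on MemoryBackend; the current in-core
// implementation and a hypothetical vchord-backed one would both live
// under src/resources/extensions/sf-memory/ and be swappable.
```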


@@ -398,8 +398,40 @@ function inferProjectKnowledge(files: string[]): ProjectKnowledge {
    pushUnique(skillNeeds, "Rust implementation and ownership review");
  }
  if (hasFile(files, "pyproject.toml") || hasFile(files, "requirements.txt")) {
    pushUnique(stackSignals, "Python project manifest present");
    pushUnique(verificationCommands, "pytest or the project quality command");
    // Distinguish package manager so the agent gets accurate context for
    // what `pytest` and friends should be prefixed with (uv run / poetry run).
    const pyManager = hasFile(files, "uv.lock")
      ? "uv-managed"
      : hasFile(files, "poetry.lock")
        ? "poetry-managed"
        : hasFile(files, "pdm.lock")
          ? "pdm-managed"
          : hasFile(files, "pyproject.toml")
            ? "pip/pyproject-managed"
            : "pip/requirements-managed";
    pushUnique(stackSignals, `Python project (${pyManager})`);
    // Surface configured Python tools so the agent knows what verification
    // stack actually exists. Config-file presence is the cheap signal;
    // for [tool.X] sections in pyproject.toml see detection.pyprojectHasTool.
    const pyTools: string[] = [];
    if (hasFile(files, "ruff.toml") || hasFile(files, ".ruff.toml")) {
      pyTools.push("ruff");
    }
    if (hasFile(files, "mypy.ini") || hasFile(files, ".mypy.ini")) {
      pyTools.push("mypy");
    }
    if (hasFile(files, "pyrightconfig.json")) {
      pyTools.push("pyright");
    }
    if (pyTools.length > 0) {
      pushUnique(stackSignals, `Python tooling configured: ${pyTools.join(", ")}`);
    }
    pushUnique(
      verificationCommands,
      "pytest or the project quality command (lint + type + test stack from .sf/PREFERENCES.md)",
    );
    pushUnique(skillNeeds, "Python packaging, typing, and tests");
  }
  if (


@@ -81,6 +81,107 @@ function commandExists(name: string, cwd: string): boolean {
// ── Individual Checks ──────────────────────────────────────────────────────
/**
 * Check that the Python package manager declared by lockfile is installed.
 *
 * Detects uv / poetry / pdm by lockfile presence and verifies the binary is
 * on PATH. Surfaces a missing tool early so SF doesn't hand a Python milestone
 * to an agent that will hit "uv: command not found" mid-task.
 *
 * Returns null when the project has no Python signals (not a Python repo).
 */
function checkPythonEnvironment(
  basePath: string,
): EnvironmentCheckResult | null {
  const hasPyproject = existsSync(join(basePath, "pyproject.toml"));
  const hasRequirements = existsSync(join(basePath, "requirements.txt"));
  if (!hasPyproject && !hasRequirements) return null;
  const hasUvLock = existsSync(join(basePath, "uv.lock"));
  const hasPoetryLock = existsSync(join(basePath, "poetry.lock"));
  const hasPdmLock = existsSync(join(basePath, "pdm.lock"));
  let manager: string | null = null;
  let installHint = "";
  if (hasUvLock) {
    manager = "uv";
    installHint = "Install: curl -LsSf https://astral.sh/uv/install.sh | sh";
  } else if (hasPoetryLock) {
    manager = "poetry";
    installHint = "Install: curl -sSL https://install.python-poetry.org | python3 -";
  } else if (hasPdmLock) {
    manager = "pdm";
    installHint = "Install: curl -sSL https://pdm-project.org/install-pdm.py | python3 -";
  }
  if (!manager) {
    return {
      name: "python_env",
      status: "ok",
      message: "Python project (no lockfile detected)",
    };
  }
  const version = tryExec(`${manager} --version`, basePath);
  if (!version) {
    return {
      name: "python_env",
      status: "warning",
      message: `${manager} not found in PATH (project uses ${manager}.lock)`,
      detail: installHint,
    };
  }
  return {
    name: "python_env",
    status: "ok",
    message: `Python project (${manager}: ${version})`,
  };
}
/**
 * Recommend installing sift on large repos where code-intelligence quality
 * matters most. Non-fatal: sift is optional but significantly improves
 * codebase_search and the code-intelligence context block.
 *
 * Returns null when the repo is small (< 5000 source files) or sift is
 * already on PATH.
 */
function checkSiftAvailable(
  basePath: string,
): EnvironmentCheckResult | null {
  let fileCount = 0;
  try {
    // Lazy import — scanProjectFiles walks the filesystem, only do this
    // when called by the doctor pipeline.
    // eslint-disable-next-line @typescript-eslint/no-require-imports
    const { scanProjectFiles } = require("./detection.js") as {
      scanProjectFiles(p: string): string[];
    };
    fileCount = scanProjectFiles(basePath).length;
  } catch {
    return null;
  }
  const SIFT_RECOMMENDED_THRESHOLD = 5000;
  if (fileCount < SIFT_RECOMMENDED_THRESHOLD) return null;
  if (commandExists("sift", basePath)) {
    return {
      name: "sift_available",
      status: "ok",
      message: `sift on PATH (recommended for ${fileCount}-file repo)`,
    };
  }
  return {
    name: "sift_available",
    status: "warning",
    message: `sift not installed (recommended for repos > ${SIFT_RECOMMENDED_THRESHOLD} files; this repo has ${fileCount})`,
    detail: "Install: cargo install --git https://github.com/rupurt/sift",
  };
}
/**
 * Check that Node.js version meets the project's engines requirement.
 */
@@ -576,6 +677,12 @@ export function runEnvironmentChecks(
  const nodeCheck = checkNodeVersion(basePath);
  if (nodeCheck) results.push(nodeCheck);
  const pythonCheck = checkPythonEnvironment(basePath);
  if (pythonCheck) results.push(pythonCheck);
  const siftCheck = checkSiftAvailable(basePath);
  if (siftCheck) results.push(siftCheck);
  const depsCheck = checkDependenciesInstalled(basePath);
  if (depsCheck) results.push(depsCheck);