SF + ACE Full-Stack Reference Survey — 2026-05-07
This record compares local coding-agent, orchestration, retrieval, model, and
platform-engineering references under /home/mhugo/code/ plus selected indexed
public references against the intended SF+ACE full-stack flow. It is planning
evidence, not an instruction to copy another product's architecture.
Product Boundary
Forge remains the local product/runtime surface, ACE remains the higher-level workflow/control-plane layer, and UOK remains the internal execution safety kernel. External systems are reference implementations used to sharpen the unified SF+ACE flow, not destination architectures.
Hard boundary: Forge must stay an MCP client only. Do not add, restore, or plan an SF MCP server. External control belongs in daemon, RPC, and headless interfaces.
Local Checkouts Inspected
Primary references:
- singularity-forge
- codex
- claude-code
- ace-coder
Additional coder references:
- aider
- Agentless
- RA.Aid
- plandex
- goose
- gemini-cli
- qwen-code
- opencode
- crush
- amazon-q-developer-cli
- open-codex
- letta-code
- neovate-code
- symphony
- singularity/machine (codemachine)
Spec-system references:
- spec-kit
- OpenSpec
- spec-kitty
- cc-sdd
Indexed-only references to include in future passes:
- kimi-cli / Kimi Code
- upstream CodeMachine CLI (moazbuilds/CodeMachine-CLI)
Claude Code references in this survey are limited to public documentation and observed product behavior. Private or unverified mirrors are out of scope.
SF + ACE Full-Stack Reference Map
The long-term target is a unified SF+ACE autonomous software flow, not a collection of unrelated coding assistants. Compare each repo at the layer where it is strongest.
| Repo / Tool | Full-Stack Layer | Pattern To Study | Evidence Mode | Safe sift Scope |
|---|---|---|---|---|
| `singularity-forge` | Local product/runtime | UOK, DB-first state, CLI/TUI/headless, extension tools, MCP-client-only guardrails | local source + sift | docs/, src/resources/extensions/sf/, packages/*/src/, tests |
| `ace-coder` | Workflow/control plane | HTDAG/YAML workflow DAGs, reviewers, quality gates, deployment governance, multi-repo memory | local source + sift only | AGENTS.md, CLAUDE.md, docs/, .agents/skills/, python/ai_dev/ first-party modules |
| `symphony` | Work orchestration | Linear polling, isolated per-issue workspaces, WORKFLOW.md, Codex app-server, retries, PR review/landing | local source + Context7 /openai/symphony | README.md, SPEC.md, elixir/WORKFLOW.md, elixir/AGENTS.md, .codex/skills/ |
| `codemachine` | Multi-agent workflow engine | Engine matrix, SmartRouter, spec-to-code workflow templates, feature flags, tool health | local fork/source + web upstream | README.md, docs/architecture/, templates/workflows/, prompts/agents/, prompts/moderator/ |
| Amplication | Platform/golden paths | Live templates, service catalog, plugin codegen, generated service lifecycle, compliance/drift | web/GitHub; clone before local planning | docs/, packages/*/src/, plugin/codegen packages if cloned |
| `spec-kit` | Spec-driven artifacts | Constitution, scenarios, FR/SC IDs, spec -> plan -> tasks -> analyze -> implement, generated checklists | local source + Context7 /github/spec-kit | templates/commands/, templates/*-template.md, scripts/, src/specify_cli/ |
| `OpenSpec` | Change/spec artifact graph | Current specs vs proposed changes, artifact dependency chain, workspace links, JSON/no-interactive command behavior | local source | docs/, openspec/specs/, openspec/changes/, CLI source |
| `spec-kitty` | Spec runtime governance | machine/project boundary, manifest-last generated artifacts, package-bundled template source, command-owned action context | local source | architecture/adrs/, src/, .kittify/, kitty-specs/, docs/ |
| `cc-sdd` | Kiro-style phase-gated SDD | steering vs specs, discovery routing, approvals, boundary-first tasks, per-task implement/review/debug roles | local source | AGENTS.md, CLAUDE.md, docs/guides/, tools/cc-sdd/src/ |
| `plandex` | Large-task implementation | Cumulative diff sandbox, plan versioning, context loading, apply/debug loop | local source + Context7 | README.md, app/cli/lib/, app/server/db/, first-party docs |
| `aider` | Edit loop/context map | Repo-map ranking, edit formats, lint/test repair, benchmark metadata | local source + Context7 | aider/, benchmark/, tests/, docs; avoid generated website data unless needed |
| `Agentless` | Bug repair/evals | Localization -> repair -> patch validation, reproduction tests, reranking | local source | agentless/fl/, agentless/repair/, agentless/test/, benchmark docs |
| SWE-agent/OpenHands | Bug repair/runtime research | issue-to-patch loops, sandbox/runtime harnesses, SWE-bench evaluation | Context7/web or local clone if added | source/docs/evals only when cloned |
| `codex` | Execution substrate | Sandbox profiles, approval policy, app-server protocol, typed events, AGENTS scope | local source + Context7 /openai/codex | docs/, codex-rs/protocol/src/, codex-rs/exec/src/, codex-rs/linux-sandbox/; avoid vendor/ |
| Claude Code | UX reference | Permissions, commands, plugins, MCP client UX, subagent UX | public docs + observed behavior | public docs, command UX, transcript/output behavior |
| `qwen-code` | Terminal workflow | trusted folders, subagent fork design, terminal-capture tests, provider config | local source + Context7 | docs/, packages/*/src/, integration-tests/terminal-capture/ |
| Kimi Code | Model-specific coding agent | long-context coding, Kimi CLI/IDE flow, model-plan comparison | Context7 /moonshotai/kimi-cli | docs/source if cloned |
| CodeGeeX2 | Model capability | multilingual code model, HumanEval-X/DS1000, local deployment/quantization | web/GitHub | benchmark/evaluation/docs if cloned |
| `gemini-cli` | Provider CLI/testing | release channels, generated schemas/docs, eval promotion, perf/memory tests | local source + Context7 if needed | docs/, evals/, perf-tests/, memory-tests/, packages/*/src/ |
| `opencode` | Mode/schema boundary | plan/build modes, client/server, project-local commands/tools, canonical schema | local source + Context7 | README.md, .opencode/, specs/, packages/opencode/specs/, packages/opencode/src/ |
| `crush` | Local runtime/TUI | SQLite/sqlc, hooks, permissions, LSP, MCP client status, Bubble Tea UI | local source | internal/db/, internal/hooks/, internal/permission/, internal/agent/tools/, internal/ui/ |
| `goose` | Desktop/CLI/API agent | diagnostics, API embedding, provider/extension breadth, MCP client lifecycle | local source | crates/, documentation/, ui/desktop/; do not copy server posture |
| `letta-code` | Long-lived memory | persistent agent memory, approval recovery, skills, channel/remote UX | local source | src/agent/, src/permissions/, src/cli/, src/tests/ |
| `OpenAgents` | Full-stack multi-agent platform | backend/frontend/agent split, one-agent-one-folder, plugin/data/web agents, adapters | web/GitHub; clone before local planning | backend/, frontend/, real_agents/ if cloned |
| Claude Context / Context+ | Code context retrieval | vector-backed semantic code search, MCP-client integration, context cost reduction | Context7/web | code search/indexing packages if cloned |
| `amazon-q-developer-cli` | Rust auth/security | auth, security, workspace patterns, Rust CLI lessons | local source; lower priority | crates/chat-cli/, crates/agent/, docs |
Comparison Matrix
| Reference | Strongest Fit For Forge | Borrow | Avoid |
|---|---|---|---|
| `plandex` | Large task planning and review workflow | Cumulative diff sandbox, plan versioning, explicit chat-to-plan-to-apply flow, context indexing for big repos | Server/product coupling and any cloud-hosting assumptions |
| `codex` | Execution hardening and protocol boundaries | Typed non-interactive event stream, sandbox permission profiles, app-server protocol shape, Rust crate boundaries, config schema rigor, plugin/skill manager discipline | Treating MCP server code as a Forge product direction |
| Claude Code | Interactive ergonomics | Permission UX, command discoverability, plugin surfaces, subagent UX, memory/context commands, MCP client config flows | Copying private implementation details or making it an upstream dependency |
| `ace-coder` | Owned multi-repo governance | Reviewer roles, hard quality gates, skill/subagent routing policy, explicit MCP-client contract style | Collapsing ACE and Forge into one product surface |
| `aider` | Tight edit loop, repo maps, and benchmark culture | Token-budgeted repo-map ranking, reproducible benchmark reports with model, edit format, commit hash, dirty state, pass rates, malformed output counts | Early auto-commit posture before validation and commit gates |
| `Agentless` | Bug-fix eval pipeline | Localization, candidate repair, regression-test selection, reproduction-test generation, validation-based patch reranking | SWE-bench-specific harness assumptions |
| `RA.Aid` | Stage boundaries and trajectory records | Explicit research/planning/implementation phases, research-only mode, durable session/tool trajectory records | Broad autonomous shell posture and external Aider outsourcing |
| `goose` | Desktop/CLI/API distribution and diagnostics | Provider breadth, diagnostics/reporting, API embedding, extension packaging, MCP client lifecycle patterns | Built-in/re-exported MCP servers or broad general-agent scope |
| `gemini-cli` | Release/test/docs automation | Release channels, generated settings schema/docs, behavioral eval incubation, sandbox integration tests, perf/memory baselines, GitHub Action workflows | Provider-specific product assumptions or unstable evals as hard CI gates |
| `qwen-code` | Claude-like terminal workflow and machine I/O | Skills/subagents, forked subagent design, trust-gated workspace config, bidirectional stream-json, JSON fd/file side channels, terminal-capture regressions, flexible provider config | OAuth/provider policy coupling, ungated project-local config, or mixing channel names with surface/protocol names |
| `opencode` | Mode split and schema boundary | Read-only plan mode vs full-access build mode, client/server framing, LSP opt-in, project-local commands/tools, schema-first domain boundary | Bun-specific implementation style for Forge |
| `crush` | SQLite state, hooks, permissions, TUI | TUI as client over backend/session services, SQLite migrations/query discipline, hook engine, permission layering, session DB, tool markdown descriptions, LSP, pub/sub, MCP client status UX | Replacing Forge's TypeScript extension architecture with Go or hiding machine protocol behind env vars |
| `letta-code` | Long-lived memory-agent UX | Memory lifecycle, skill learning, approval recovery tests, channel/remote control ideas, MCP OAuth/connect UX | Treating memory as unstructured product magic instead of DB-backed state |
| `neovate-code` | Design-doc and terminal UX iteration | Small design records, queued-message designs, JSONL session replay, high-risk command classification, command/terminal UX records | Quiet/headless auto-approval, global-only session memory, provider-specific branding, or immature UX churn |
| `amazon-q-developer-cli` | Declarative agent and UI event reference | Agent manifest schema, hooks/resources/tools, AG-UI-like lifecycle/text/tool/state event taxonomy, auth/security/workspace patterns | Product direction, recursive delegate subprocesses, legacy raw passthrough as protocol, trust-all delegate defaults |
| `open-codex` | Older/forked approval-mode comparison | Approval-mode vocabulary and provider abstraction history | Fork-specific Chat Completions direction as a primary architecture |
| `symphony` | Work orchestration above individual agents | Issue-tracker polling, per-issue isolated workspaces, repo-owned WORKFLOW.md, Codex app-server lifecycle, retries, operator state, CI/PR review and landing loops | High-trust unattended defaults without Forge's UOK gates and DB-first runtime evidence |
| `codemachine` | Multi-agent spec-to-code orchestration | Engine matrix, SmartRouter routing, heterogeneous agents, spec-to-code templates, feature flags, tool health, local workflow examples, upstream repeatable long-running workflow model | Optional MCP-server/tooling posture and Bun-specific implementation assumptions |
| Kimi Code | Long-context model-specific coding agent | Kimi CLI/IDE workflow, long-context coding, subagent-oriented terminal automation, model-plan comparison | Treating provider-specific subscription/API behavior as a Forge architecture |
| `spec-kit` | Spec-driven development workflow | Constitution, prioritized user scenarios, acceptance criteria, functional requirements, measurable success criteria, spec -> plan -> tasks -> implement -> analyze loop | Replacing Forge PDD/UOK with a generic spec template instead of mapping useful pieces into PDD fields |
| `OpenSpec` | Brownfield change planning | Clear split between current behavior specs and proposed changes, dependency-aware artifact continuation, workspace link/local mapping | Treating docs/specs as Forge's operational source of truth instead of generated/reviewed exports from `.sf/` SQLite |
| `spec-kitty` | Runtime and generated-artifact governance | Global runtime vs project overlay, package-bundled template source, manifest-last promotion, command-owned context resolver | Per-project full runtime copies, hidden alternate template sources, or prompt-level context discovery |
| `cc-sdd` | Agentic SDD operating loop | Steering/spec split, discovery router, approval gates, boundary annotations, per-task implementer/reviewer/debugger, implementation-note propagation | Markdown-only operational state for SF, or requiring heavy gates for trivial direct fixes |
Forge Already Has
- DB-backed workflow state and project-local planning artifacts.
- Headless/RPC surfaces for automation.
- UOK safety and recovery concepts.
- Extension loading and bundled tool surfaces.
- Purpose-first TDD and PDD field contracts.
- Provider abstraction through `pi-ai`.
Those are the center of gravity. Borrowed patterns should strengthen these surfaces instead of adding parallel state systems.
Gaps Worth Pulling Into The Roadmap
- Execution and permission hardening
  - Use Codex and Crush as the references.
  - Target Forge surfaces: `exec-sandbox`, production mutation approval, command permissions, headless/RPC mutation gates, DB-recorded tool-call evidence, and permission profiles that specify filesystem, network, `.git`, metadata, writable-root, and denied-path behavior.
- Plan/build mode separation
  - Use OpenCode, Plandex, and Qwen Code as the references.
  - Target Forge surfaces: explicit read-only planning mode, full-access build mode, and clearer mode transitions in auto/headless.
- Typed headless event stream
  - Use Codex, Gemini CLI, Qwen Code, Amazon Q, and OpenCode as the references.
  - Target Forge surfaces: stable machine-readable events such as `thread.started`, `turn.started`, `turn.completed`, `turn.failed`, `item.started`, `item.updated`, and `item.completed`, with typed payloads for commands, patches, MCP calls, web/context lookups, todos, and UOK evidence.
  - Qwen-style bidirectional machine contracts are especially relevant: stream-json input, stream-json output, JSON fd/file side channels, and long-lived session control.
- Reviewable cumulative diffs
  - Use Plandex and Aider as the references.
  - Target Forge surfaces: cumulative patch review, apply/reject/revise workflow, conflict analysis before apply/rewind, and commit metadata tied to model, prompt, dirty state, and evidence.
- Eval and bug-fix pipeline
  - Use Aider and Agentless as the references.
  - Target Forge surfaces: reproducible eval reports, localization -> repair -> validation cases, candidate patch sampling, reproduction-test generation, and validation-based failure reranking.
- Memory lifecycle and recovery
  - Use Letta Code and ACE as references, while keeping Forge DB-first.
  - Target Forge surfaces: durable memory extraction, turn recovery policy, approval recovery, stale-state reconciliation, typed memory records, and per-tool trajectory records for auto-mode postmortems.
- Terminal UX and command discoverability
  - Use Claude Code, Crush, OpenCode, and Neovate as references.
  - Target Forge surfaces: command catalog, permission prompts, status line, queued-message behavior, and compact TUI/headless diagnostics.
- Config and schema generation
  - Use Gemini CLI, Codex, Qwen Code, and Crush as references.
  - Target Forge surfaces: typed settings, generated docs, environment schema, DB migrations, and strict versioned JSON projections when JSON is only a compatibility/export format.
- MCP client lifecycle
  - Use Crush, Amazon Q, Claude Code, Letta Code, and Neovate as references.
  - Target Forge surfaces: explicit client states (`disabled`, `starting`, `connected`, `error`), reconnect behavior, scoped project/global/managed config, atomic config writes, tool namespacing such as `mcp__server__tool`, schema cleanup, resource list/read commands, OAuth connect UX, status counts, and evidence logging.
  - Stop rule: do not implement any SF MCP server, MCP worker backend, or bundled/re-exported MCP server.
- Work orchestration above single agent sessions
  - Use OpenAI Symphony and CodeMachine as references.
  - Target Forge surfaces: durable queue/roadmap dispatch, isolated working directories, issue/task lifecycle state, retry/backoff, per-run observability, proof-of-work handoff, and CI/PR review/landing loops.
  - Stop rule: orchestration must feed UOK and DB-backed state instead of bypassing Forge's safety gates.
- Spec-driven artifact pipeline
  - Use Spec Kit, OpenSpec, spec-kitty, cc-sdd, and CodeMachine as references.
  - Target Forge surfaces: convert intent into PDD fields, prioritized slices, acceptance criteria, functional requirements, measurable success criteria, task generation, and consistency analysis before implementation.
- Generated human exports and drift checks
  - Use spec-kitty and Spec Kit as references, but keep Forge database-first.
  - Target Forge surfaces: generated `docs/specs/` exports, `check` commands that fail on stale projections, manifest-backed generated artifacts where promotion needs auditability, and command-owned context resolution rather than prompt heuristics.
  - Stop rule: generated docs may be reviewed and tracked by Git, but SF-owned operational history and future-use knowledge stay in `.sf/` SQLite.
- Batch input and project-state relocation
  - Use Aider and RA.Aid as references.
  - Target Forge surfaces: prompt-file batch input, dry-run previews, explicit `--yes`/confirmation gates, and an overrideable project-state directory for CI/sandboxes or migrated workspaces.
  - Stop rule: history/prompt buffers are convenience, not cross-repo memory or operational authority.
- Decomposed autonomy and high-risk command classification
  - Use Plandex, Goose, Gemini CLI, Qwen Code, and Neovate Code as references.
  - Target Forge surfaces: separate choices for context loading, apply, execution, commit behavior, run control, and permission profile; keep static high-risk command classification as a first-pass guard; fail closed when a non-interactive run reaches approval-required tool use without an explicit permission profile that allows it.
  - Stop rule: no broad always-yes mode and no destructive Git cleanup as a default recovery path.
- Declarative agent/run manifests
  - Use Amazon Q and Goose recipes as references.
  - Target Forge surfaces: reviewed agent/run manifests with prompt context, file resources, hooks, visible/allowed tools, model policy, extension requirements, parameters, and expected response/evidence contracts.
  - Stop rule: manifests feed UOK and `.sf/sf.db`; they do not bypass SF permission or evidence gates.
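The drift-check item above (a `check` command that fails on stale projections) can be sketched as a comparison between the revision a generated export was rendered from and the revision its source records are at now. This is a minimal sketch under stated assumptions, not Forge's implementation: `ProjectionRecord`, `findStaleProjections`, `checkExitCode`, and both revision fields are hypothetical names, and the sample paths are illustrative.

```typescript
// Hypothetical record: one row per generated export, tracked next to the
// DB-backed source it was rendered from. Field names are assumptions.
interface ProjectionRecord {
  path: string;             // generated file, e.g. under docs/specs/
  sourceRevision: number;   // revision the source records are at now
  renderedRevision: number; // revision the export was last rendered from
}

// A projection is stale when its source has moved past the rendered revision.
function findStaleProjections(records: ProjectionRecord[]): string[] {
  return records
    .filter((r) => r.renderedRevision < r.sourceRevision)
    .map((r) => r.path)
    .sort();
}

// Exit-code style result so a CI job can fail on drift.
function checkExitCode(records: ProjectionRecord[]): number {
  return findStaleProjections(records).length === 0 ? 0 : 1;
}

// Illustrative data only.
const sampleProjections: ProjectionRecord[] = [
  { path: "docs/specs/auth.md", sourceRevision: 7, renderedRevision: 7 },
  { path: "docs/specs/queue.md", sourceRevision: 9, renderedRevision: 6 },
];
```

With the sample data, only `docs/specs/queue.md` is reported as stale, so the check would fail until that export is regenerated. The database stays the authority; the export is only ever compared against it.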
Priority Order
P0:
- Keep Forge MCP-client-only; reject any MCP-server plan.
- Harden command/tool execution policy and mutation gates.
- Add typed headless event DTOs for auto/headless consumers.
- Make DB-backed state the structured source of truth for planner/runtime records, with JSON/Markdown only as projections, imports, exports, or promoted human docs.
- Add trust gating for project-local config, hooks, tools, `.env`, and automatic memory loading before expanding those surfaces.
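The typed headless event DTOs in the P0 list could be modeled as a discriminated union. The sketch below is illustrative rather than Forge's actual schema: the event names come from this survey, while every payload field (`threadId`, `turnId`, `error`, `item`) and the `ItemKind` values are assumptions made for the example.

```typescript
// Item kinds loosely follow the payload categories named in this survey;
// the exact identifiers are assumptions.
type ItemKind = "command" | "patch" | "mcp_call" | "lookup" | "todo" | "uok_evidence";

interface Item {
  id: string;
  kind: ItemKind;
  summary: string;
}

// One variant per event name; the `type` field is the discriminant.
type HeadlessEvent =
  | { type: "thread.started"; threadId: string }
  | { type: "turn.started"; turnId: string }
  | { type: "turn.completed"; turnId: string }
  | { type: "turn.failed"; turnId: string; error: string }
  | { type: "item.started"; item: Item }
  | { type: "item.updated"; item: Item }
  | { type: "item.completed"; item: Item };

// Exhaustive handling: adding a new variant without a case is a compile error.
function describeEvent(event: HeadlessEvent): string {
  switch (event.type) {
    case "thread.started":
      return `thread ${event.threadId} started`;
    case "turn.started":
      return `turn ${event.turnId} started`;
    case "turn.completed":
      return `turn ${event.turnId} completed`;
    case "turn.failed":
      return `turn ${event.turnId} failed: ${event.error}`;
    case "item.started":
    case "item.updated":
    case "item.completed":
      return `${event.type} ${event.item.kind} ${event.item.id}`;
    default: {
      const unreachable: never = event;
      return unreachable;
    }
  }
}

// One event per line, suitable for a JSONL stream.
function toJsonLine(event: HeadlessEvent): string {
  return JSON.stringify(event);
}
```

A consumer that switches on `type` gets compile-time coverage of every event, which is the main reason to prefer a closed union over free-form log lines for auto/headless consumers.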
P1:
- Add explicit plan/build mode semantics.
- Add cumulative diff review and evidence metadata.
- Expand UOK evals with Agentless-style localization/repair/validation cases.
- Add MCP client state/status/config hardening without adding any MCP server.
- Add durable orchestration contracts for issue/task queues, isolated workspaces, retry policy, proof-of-work, and review/landing loops.
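For the MCP client state/status hardening item, a small sketch of explicit client states, status counts, and tool namespacing follows. The four state names and the `mcp__server__tool` naming shape come from this survey; the transition table and all function names are assumptions for illustration, since the survey names the states but not the allowed edges.

```typescript
type McpClientState = "disabled" | "starting" | "connected" | "error";

// Assumed transition table (hypothetical): which state changes are legal.
const allowedTransitions: Record<McpClientState, McpClientState[]> = {
  disabled: ["starting"],
  starting: ["connected", "error", "disabled"],
  connected: ["error", "disabled"],
  error: ["starting", "disabled"],
};

function canTransition(from: McpClientState, to: McpClientState): boolean {
  return allowedTransitions[from].indexOf(to) !== -1;
}

// Namespaced tool name in the mcp__server__tool shape.
function namespacedToolName(server: string, tool: string): string {
  return `mcp__${server}__${tool}`;
}

// Status counts for a compact status line or diagnostics view.
function statusCounts(states: McpClientState[]): Record<McpClientState, number> {
  const counts: Record<McpClientState, number> = {
    disabled: 0,
    starting: 0,
    connected: 0,
    error: 0,
  };
  for (const state of states) {
    counts[state] += 1;
  }
  return counts;
}
```

This only models client-side connection state. Nothing here serves MCP, which keeps it inside the MCP-client-only boundary.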
P2:
- Improve terminal command discovery and permission UX.
- Generate settings/environment docs from typed schemas.
- Compare memory lifecycle/recovery against Letta and ACE.
- Map Spec Kit scenario/requirement/success-criteria templates into Forge PDD fields without replacing PDD.
Evidence Pointers
The follow-up subagent pass inspected these concrete local paths:
- `aider/aider/repomap.py`, `aider/aider/coders/base_coder.py`, `aider/aider/linter.py`, `aider/aider/args.py`, `aider/aider/io.py`, and `aider/benchmark/README.md`.
- `Agentless/agentless/fl/localize.py`, `Agentless/agentless/repair/rerank.py`, `Agentless/agentless/repair/repair.py`, `Agentless/agentless/test/generate_reproduction_tests.py`, and `Agentless/agentless/test/run_tests.py`.
- `RA.Aid/ra_aid/agents/`, `RA.Aid/ra_aid/tools/programmer.py`, `RA.Aid/ra_aid/database/models.py`, `RA.Aid/ra_aid/config.py`, `RA.Aid/ra_aid/database/connection.py`, and `RA.Aid/ra_aid/tools/shell.py`.
- `plandex/app/cli/lib/apply.go`, `plandex/app/cli/lib/rewind.go`, `plandex/app/cli/lib/git.go`, `plandex/app/cli/lib/repl.go`, `plandex/app/cli/cmd/plan_exec_helpers.go`, `plandex/app/cli/cmd/plan_start_helpers.go`, `plandex/app/server/db/diff_helpers.go`, and `plandex/app/server/db/plan_config_helpers.go`.
- `codex/codex-rs/exec/src/exec_events.rs`, `codex/codex-rs/linux-sandbox/README.md`, `codex/codex-rs/linux-sandbox/src/linux_run_main.rs`.
- `gemini-cli/evals/README.md`, `gemini-cli/perf-tests/README.md`, `gemini-cli/memory-tests/`, `gemini-cli/packages/cli/src/config/config.ts`, `gemini-cli/packages/cli/src/nonInteractiveCli.ts`, `gemini-cli/packages/core/src/output/types.ts`, `gemini-cli/packages/core/src/policy/types.ts`, and `gemini-cli/packages/core/src/config/storage.ts`.
- `qwen-code/docs/users/configuration/trusted-folders.md`, `qwen-code/docs/design/fork-subagent/fork-subagent-design.md`, `qwen-code/integration-tests/terminal-capture/`, `qwen-code/packages/cli/src/config/config.ts`, `qwen-code/packages/cli/src/nonInteractiveCli.ts`, `qwen-code/packages/cli/src/nonInteractive/session.ts`, `qwen-code/packages/channels/base/README.md`, and `qwen-code/packages/core/src/permissions/types.ts`.
- `opencode/.opencode/`, `opencode/specs/v2/session.md`, `opencode/packages/opencode/specs/effect/schema.md`, `opencode/packages/opencode/src/session/schema.ts`.
- `crush/internal/db/`, `crush/internal/hooks/`, `crush/internal/permission/`, `crush/internal/agent/tools/mcp/`.
- Claude Code public documentation and observed command/transcript behavior.
- `letta-code/src/cli/components/McpConnectFlow.tsx`, `letta-code/src/cli/helpers/mcpOauth.ts`, `letta-code/src/agent/approval-recovery.ts`.
- `neovate-code/src/mcp.ts`, `neovate-code/src/commands/mcp.ts`, `neovate-code/src/slash-commands/builtin/mcp.tsx`, `neovate-code/src/session.ts`, `neovate-code/src/config.ts`, `neovate-code/src/tools/bash.ts`, and `neovate-code/src/ui/ApprovalModal.tsx`.
- `amazon-q-developer-cli/crates/agent/src/agent/mcp/`, `amazon-q-developer-cli/crates/chat-cli/src/cli/mcp.rs`, `amazon-q-developer-cli/schemas/agent-v1.json`, `amazon-q-developer-cli/crates/chat-cli-ui/src/protocol.rs`, and `amazon-q-developer-cli/crates/chat-cli/src/cli/chat/tools/delegate.rs`.
- `goose/crates/goose/src/config/goose_mode.rs`, `goose/crates/goose/src/config/permission.rs`, `goose/crates/goose-cli/src/session/mod.rs`, `goose/crates/goose-cli/src/cli.rs`, `goose/crates/goose/src/session/session_manager.rs`, and `goose/crates/goose/src/recipe/mod.rs`.
- `crush/internal/cmd/root.go`, `crush/internal/cmd/run.go`, `crush/internal/proto/proto.go`, `crush/internal/backend/permission.go`, `crush/internal/db/migrations/20250424200609_initial.sql`, `crush/internal/cmd/session.go`, and `crush/internal/config/config.go`.
- `ace-coder/docs/MCP_SERVER.md`, `ace-coder/docs/plans/2026-04-05-mcp-daemon-refactor.md`, `ace-coder/python/ai_dev/mcp/`.
- `symphony/README.md`, `symphony/SPEC.md`, `symphony/elixir/WORKFLOW.md`, `symphony/elixir/AGENTS.md`, and `.codex/skills/land/SKILL.md`.
- `singularity/machine/README.md`, `package.json`, `templates/workflows/`, `docs/architecture/engine-matrix.md`, and `docs/OPENAI_SPECS_DOWNLOAD.md`.
- `spec-kit/README.md`, `templates/commands/specify.md`, `templates/commands/plan.md`, `templates/commands/tasks.md`, and `scripts/bash/common.sh`.
- `OpenSpec/docs/concepts.md`, `OpenSpec/docs/commands.md`, `OpenSpec/openspec/specs/`, and `OpenSpec/openspec/changes/`.
- `spec-kitty/architecture/adrs/2026-04-08-1-global-kittify-machine-level-runtime.md`, `spec-kitty/architecture/adrs/2026-04-08-2-package-bundled-templates-sole-source.md`, and `spec-kitty/architecture/adrs/2026-03-09-1-prompts-do-not-discover-context-commands-do.md`.
- `cc-sdd/AGENTS.md`, `cc-sdd/README.md`, `cc-sdd/docs/guides/spec-driven.md`, and `cc-sdd/docs/guides/skill-reference.md`.
Context7 Cross-Check
Context7 was used after the local-source pass as a secondary check for indexed public references. Local source remains the evidence of record because it is the snapshot available on this machine.
- `/openai/codex` confirmed the relevant Codex public patterns: interactive and non-interactive CLI modes, `app-server`, `AGENTS.md` project guidance, approval policy, and sandbox modes (`read-only`, `workspace-write`, `danger-full-access`) with writable roots and network controls.
- `/plandex-ai/plandex` confirmed the relevant Plandex public patterns: semi/full run-control levels, smart context loading, cumulative diff review sandbox, review/apply/debug workflow, and large multi-file task focus.
- `/ai-christianson/ra.aid` confirmed the relevant RA.Aid public patterns: research-only and research-and-plan-only modes, the research -> planning -> implementation workflow, logging/cost visibility, and the risky `--cowboy-mode` shell approval bypass that Forge should not copy.
- Context7 also resolves these remaining comparison targets for later deeper checks:
  - Aider: `/websites/aider_chat` and `/aider-ai/aider`.
  - Qwen Code: `/qwenlm/qwen-code`, `/websites/qwenlm_github_io_qwen-code-docs`, and `/websites/qwenlm_github_io_qwen-code-docs_en`.
  - OpenCode: `/anomalyco/opencode`.
  - OpenAI Symphony: `/openai/symphony`.
  - Kimi Code: `/moonshotai/kimi-cli`, `/websites/moonshotai_github_io_kimi-cli_en`, and `/websites/kimi_code`.
  - Spec Kit: `/github/spec-kit` and `/websites/github_github_io_spec-kit`.
- Upstream CodeMachine CLI did not resolve by name in Context7 during this pass, but GitHub confirms `https://github.com/moazbuilds/CodeMachine-CLI` as the public upstream-style repo for CodeMachine CLI. The local checkout inspected is `https://github.com/singularity-ng/machine.git`, so treat it as local fork/mirror evidence rather than exact upstream state.
Local Sift Cross-Check
ACE is private/local and should not be treated as Context7-indexed. Use sift
for ACE and Forge when checking private or machine-local architecture.
For dependency hygiene, do not run broad sift search over repo roots that may
contain vendored dependencies, package caches, build output, or generated blobs.
This sift install does not expose an exclude flag, so scope searches to
first-party paths such as docs/, src/, packages/*/src/, specs/,
AGENTS.md, CLAUDE.md, and known design files. Avoid node_modules/,
vendor/, dist/, build/, target/, .venv/, caches, fixture dumps, and
generated lock/schema/output directories unless the dependency surface itself is
the subject of the question.
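Because scoping has to happen on the caller's side, one option is to filter candidate paths before handing them to any search. The sketch below is a hypothetical helper, not part of sift or Forge: `isFirstPartyPath` and the three lists are assumptions that encode the allow and avoid rules from the paragraph above, and the code does not invoke sift at all.

```typescript
// Directory names to avoid, taken from the paragraph above.
const excludedSegments: string[] = ["node_modules", "vendor", "dist", "build", "target", ".venv"];

// First-party prefixes and top-level files that are safe to search.
const firstPartyPrefixes: string[] = ["docs/", "src/", "specs/"];
const firstPartyFiles: string[] = ["AGENTS.md", "CLAUDE.md"];

// Matches packages/<name>/src/... for any single package directory.
const packageSrcPattern = /^packages\/[^/]+\/src\//;

// Takes a repo-relative POSIX path and returns true only for first-party scope.
function isFirstPartyPath(relativePath: string): boolean {
  const segments = relativePath.split("/");
  // Any excluded directory anywhere in the path wins over the allow rules.
  if (segments.some((segment) => excludedSegments.indexOf(segment) !== -1)) {
    return false;
  }
  if (firstPartyFiles.indexOf(relativePath) !== -1) {
    return true;
  }
  if (packageSrcPattern.test(relativePath)) {
    return true;
  }
  return firstPartyPrefixes.some((prefix) => relativePath.indexOf(prefix) === 0);
}
```

Exclusion is checked first on purpose: a path such as `packages/cli/node_modules/x/src/a.ts` matches the package pattern textually but is still dependency code.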
The targeted sift pass found:
- Codex `codex-rs/protocol/src/config_types.rs` and `protocol.rs`: confirms first-party typed approval policy and sandbox mode surfaces without searching `codex-rs/vendor/`.
- OpenCode `packages/opencode/specs/effect/schema.md`: confirms the schema-first rule to prefer one canonical schema definition and derive compatibility schemas instead of maintaining parallel sources of truth.
- Aider first-party docs/tests: confirms local repo-map/edit-format/lint/test and commit behavior surfaces.
- Plandex `README.md`, changelog, and first-party app model files: confirms the cumulative diff sandbox, controlled command execution, rollback/debug loop, and planning phases.
- Qwen Code `docs/`: confirms terminal-capture integration tests, trusted folders documentation, and provider configuration docs.
- RA.Aid first-party docs/source: confirms shell command approval bypass via `--cowboy-mode`, research/planning agents, and session/logging surfaces.
- Symphony first-party spec/workflow files: confirm issue-tracker polling, per-issue workspace isolation, repo-owned `WORKFLOW.md`, Codex app-server lifecycle, max turns/concurrency, retry/backoff, state snapshots, token/rate observability, PR feedback sweeps, and land-loop skills.
- CodeMachine first-party docs/templates: confirm local multi-agent orchestration, heterogeneous engine routing, spec-to-code workflow templates, feature-flag governance, health/status commands, and optional MCP tooling. GitHub upstream `moazbuilds/CodeMachine-CLI` confirms the public product framing: repeatable long-running workflows, multi-agent orchestration, parallel execution, context engineering, and headless scripting of coding engines such as Claude Code, Codex, Cursor, and others.
- ACE `AGENTS.md`: confirms the repo-local Claude MCP client contract, hard stops, skills, reviewer workflow, quality gate, and the warning that ACE's autonomous system uses its own code/YAML workflow DAGs rather than `AGENTS.md`.
- ACE `docs/designs/SPEC_TO_BUILD_PIPELINE_DESIGN.md`: confirms the spec -> feature graph -> ADR detection -> implementation planning/review pipeline.
- Forge `docs/dev/ADR-008-sf-tools-over-mcp-for-provider-parity.md`: confirms the durable SF boundary: if external control is needed, use daemon/RPC/headless contracts; MCP remains SF-as-client only.
- Forge `src/resources/extensions/sf/tests/no-sf-mcp-server.test.mjs`: confirms there is executable guard coverage preventing recreation of SF MCP server paths.
Local Surface/State Cross-Check
The detailed coder-agent pass supports Forge's five-axis operating model.
- Codex has the cleanest split: TUI flags, non-interactive `exec`, app-server protocol, sandbox mode, approval policy, persistent state, and input history are separate code paths and types. Forge should copy the typed separation, not the exact crate structure.
- OpenCode treats the TUI as one client of a server, exposes HTTP/OpenAPI, and bridges ACP as a protocol adapter over its server. Forge should adapt the client/server clarity and generated SDK idea without making HTTP the only mental model for machine access.
- Claude Code is strong on structured headless output, rich JSONL transcripts, entrypoint metadata, permission modes, and sandbox schemas. Forge should adapt the stream discipline and transcript fields while avoiding hidden feature-flag sprawl.
- Old `open-codex` is the cautionary example: autonomy, approval policy, output, and sandbox behavior collapse into a small set of flags. Forge should keep run control, output format, protocol, and permission profile independent.
The second coder-agent pass adds state/history guardrails:
- Aider's prompt-file, dry-run, and explicit yes primitives are useful batch affordances, but its flat history files should remain convenience only.
- RA.Aid's overrideable project state directory is useful for CI/sandboxes, but broad shell-approval bypasses are not.
- Plandex's decomposed context/apply/execution/commit modes map well to Forge's need to keep run control separate from permission profile and Git policy.
- Agentless's stage JSONL artifacts are good eval/evidence inputs; import them into DB-backed contracts instead of making JSONL the live state model.
- Neovate's JSONL session replay and high-risk bash classifier are useful; its quiet-mode auto-approval is not.
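A static high-risk command classifier of the kind mentioned above can be sketched as a small pattern table applied before any approval logic runs. This is a hypothetical illustration, not Neovate's classifier or Forge's policy: the patterns, reasons, and the `classifyCommand` name are all assumptions, and a production rule set would need to be far more careful about quoting and shell syntax.

```typescript
type RiskLevel = "high" | "normal";

interface RiskRule {
  pattern: RegExp;
  reason: string;
}

// Hypothetical first-pass rules; deliberately small and conservative.
const highRiskRules: RiskRule[] = [
  { pattern: /\brm\s+(-[a-zA-Z]*[rf][a-zA-Z]*\s+)+/, reason: "recursive or forced delete" },
  { pattern: /\bgit\s+(reset\s+--hard|clean\s+-[a-zA-Z]*f)/, reason: "destructive Git cleanup" },
  { pattern: /\bgit\s+push\s+.*--force/, reason: "force push" },
  { pattern: /\bsudo\b/, reason: "privilege escalation" },
  { pattern: /\|\s*(sh|bash)\b/, reason: "piping into a shell" },
];

interface Classification {
  level: RiskLevel;
  reasons: string[];
}

// Collect every matching reason so the record explains why a command was flagged.
function classifyCommand(command: string): Classification {
  const reasons = highRiskRules
    .filter((rule) => rule.pattern.test(command))
    .map((rule) => rule.reason);
  return { level: reasons.length > 0 ? "high" : "normal", reasons };
}
```

Pattern matching like this is only a first-pass guard. It can flag obvious cases early and cheaply, but it is not a substitute for sandboxing or permission profiles, which is why the survey treats it as one layer among several.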
The terminal-agent pass adds concrete machine/API patterns:
- Gemini and Qwen both expose
text,json, andstream-jsonoutput formats. Qwen goes further with stream-json input plus JSON fd/file side channels. Forge should adopt the bidirectional contract shape for parent processes. - Goose cleanly names run-control and permission behavior, and its non-interactive path refuses approval-required tool calls unless the run was explicitly allowed to proceed. Forge should copy that fail-closed posture.
- Crush shows the right architecture shape for TUI-over-backend/session-store while keeping session/message/file history in SQLite. Forge already wants this DB-first boundary; the lesson is to avoid making the machine protocol hidden or text-only.
- Amazon Q has the richest declarative agent schema and a useful lifecycle/text/tool/state event taxonomy. Forge should adapt manifests and event taxonomy, not recursive delegate subprocesses or raw passthrough protocol events.
- GitHub Copilot CLI's autopilot documentation is a useful naming cross-check: autopilot is the continuation behavior, `--allow-all`/`--yolo` are permission expansion, and `--no-ask-user` is question suppression. Forge should keep the same separation but use SF's own terms: run control is `manual | assisted | autonomous`, permission profile is `restricted | normal | trusted | unrestricted`, and headless/machine output is a surface/format concern.
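The fail-closed posture noted for Goose can be sketched as a small decision function. The profile names are SF's terms from this survey; which profiles count as an explicit allowance, and the `decideToolCall` name and return shape, are assumptions for illustration only.

```javascript
// Hypothetical sketch of a fail-closed approval decision.
// Assumption: only these profiles count as an explicit allowance to proceed.
const AUTO_APPROVING_PROFILES = new Set(["trusted", "unrestricted"]);

function decideToolCall({ requiresApproval, interactive, permissionProfile }) {
  if (!requiresApproval) {
    return { action: "allow", reason: "no approval needed" };
  }
  if (AUTO_APPROVING_PROFILES.has(permissionProfile)) {
    return { action: "allow", reason: `allowed by ${permissionProfile} profile` };
  }
  if (interactive) {
    return { action: "ask", reason: "a user is available to answer" };
  }
  // Non-interactive and not explicitly allowed: refuse rather than guess.
  return { action: "deny", reason: "approval required but nobody to ask" };
}

const headlessCall = decideToolCall({
  requiresApproval: true,
  interactive: false,
  permissionProfile: "normal",
});
console.log(headlessCall.action); // prints: deny
```

Run control is deliberately not an input here: in this sketch an autonomous run gains no permissions unless the permission profile grants them.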
Local Spec-System Cross-Check
The spec-system pass reinforces the current Forge direction: specs and docs are valuable contracts for humans, but command/runtime state must own execution.
- Spec Kit creates `specs/<feature>/spec.md`, stores the active feature pointer in `.specify/feature.json`, runs plan/task setup scripts that return JSON, and generates quality checklists before planning. The useful pattern is explicit generated artifacts plus machine-readable path discovery, not a second SF planning database. Its `analyze` command is a useful read-only consistency shape for comparing spec, plan, tasks, and constitution; in Forge that should compare `.sf/` DB state, generated docs, evidence, and code diffs.
- OpenSpec separates `openspec/specs/` as current behavior from `openspec/changes/` as proposed modifications, then uses commands like `propose`, `continue`, `ff`, `verify`, `sync`, and `archive` to move artifacts through a schema-defined dependency graph. Its best pattern is deterministic ready/blocked artifact queries, apply requirements, and delta validation. Forge should keep the change/spec distinction as a human review model while storing operational order/gates in `.sf/sf.db`.
- spec-kitty is strongest on runtime boundaries: one global machine runtime, thin project overlay, package-bundled templates as the sole end-user source, manifest-last generated artifact promotion, and command-owned action context. Its newer runtime pattern is also important: append-only status events, materialized status, expected-artifact manifests with blocking semantics, step contracts, explicit write scopes, review evidence, and commit hooks that keep planning artifacts out of lane branches. Forge should copy the boundary discipline and avoid prompt-level discovery for any context a command can resolve.
- cc-sdd is strongest on phase gates and role separation: steering vs feature specs, discovery routing, requirements/design/tasks approvals, boundary-first task contracts, and per-task implementer/reviewer/debugger contexts. Forge should adapt the contract discipline into PDD fields and UOK gates without making markdown checkboxes the operational state store. The most reusable pieces are boundary/dependency annotations, observable completion, approval booleans as explicit state, implementation-note propagation, and a manifest planner for generated agent/startup artifacts with dry-run/conflict policy.
- Forge `docs/records/2026-05-07-cli-agent-code-survey.md`: now records the MCP-client-only product boundary and roadmap pull-through.
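The deterministic ready/blocked query credited to OpenSpec above can be illustrated with a tiny dependency check. The artifact names and the `requires` field are invented for this sketch; they do not reproduce OpenSpec's schema or Forge's `.sf/sf.db` layout.

```javascript
// Hypothetical sketch: classify artifacts as ready or blocked from a
// dependency graph and the set of artifacts already completed.
function classifyArtifacts(artifacts, done) {
  const ready = [];
  const blocked = [];
  // Sort the names so the answer does not depend on insertion order.
  for (const name of Object.keys(artifacts).sort()) {
    if (done.has(name)) continue;
    const missing = artifacts[name].requires.filter((dep) => !done.has(dep));
    if (missing.length === 0) {
      ready.push(name);
    } else {
      blocked.push({ name, missing });
    }
  }
  return { ready, blocked };
}

// Made-up graph for illustration.
const graph = {
  proposal: { requires: [] },
  design: { requires: ["proposal"] },
  tasks: { requires: ["proposal", "design"] },
};
const status = classifyArtifacts(graph, new Set(["proposal"]));
console.log(status.ready); // prints: [ 'design' ]
console.log(status.blocked[0].missing); // prints: [ 'design' ]
```

Because the result is a pure function of stored state, a command can answer "what is ready?" without asking the model to rediscover it from markdown.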
Implementation Follow-Up
The first DB-backed retrieval slice landed with schema v41:
- `retrieval_evidence` records backend, source kind, query, strategy, scope, project root, git head/branch, worktree dirty flag, freshness, status, hit count, elapsed time, cache path, error, result metadata, and timestamp.
- `sift_search` and `codebase_search` write retrieval evidence for successful and failed searches.
- Native Context7 `resolve_library` and `get_library_docs` write docs retrieval evidence with `freshness=external-index`.
- `search-the-web` writes web retrieval evidence with `freshness=external-live` for success, cache hits, missing-provider errors, duplicate-loop stops, budget exhaustion, aborts, and provider failures.
- `sf_retrieval_evidence` exposes the rows through the SF read-only DB tool surface so agents do not query `.sf/sf.db` directly.
- Sift telemetry now uses the no-op debug logger; telemetry failures no longer turn successful searches into failed tool calls.
Next slices should wrap search_and_read and fetch_page results in the same
evidence contract before using them for planning.
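A sketch of the evidence contract's shape, showing that failed retrievals still produce a row. The camelCase field names, the `ok`/`error` status values, and the `buildRetrievalEvidence` helper are assumptions; only the list of recorded attributes and the two `freshness` values come from the text above, and this subset omits several of them.

```javascript
// Hypothetical sketch: build one retrieval evidence row for a success or a
// failure. Field names and status values are illustrative assumptions.
function buildRetrievalEvidence({
  backend, sourceKind, query, freshness, startedAt, finishedAt, hits = [], error = null,
}) {
  const failed = error !== null;
  return {
    backend,
    sourceKind,
    query,
    freshness,
    status: failed ? "error" : "ok",
    hitCount: failed ? 0 : hits.length,
    elapsedMs: finishedAt - startedAt,
    error: failed ? String(error.message ?? error) : null,
    timestamp: new Date(finishedAt).toISOString(),
  };
}

const success = buildRetrievalEvidence({
  backend: "context7", sourceKind: "docs", query: "router hooks",
  freshness: "external-index", startedAt: 1000, finishedAt: 1250,
  hits: [{ title: "example hit" }],
});
const failure = buildRetrievalEvidence({
  backend: "web", sourceKind: "web", query: "router hooks",
  freshness: "external-live", startedAt: 2000, finishedAt: 2040,
  error: new Error("provider missing"),
});
console.log(success.status, success.hitCount); // prints: ok 1
console.log(failure.status, failure.error); // prints: error provider missing
```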
The first execution-policy vocabulary slice also landed:
- `execution-policy.js` defines named `plan`, `build`, `trusted`, and `unrestricted` profiles with filesystem, network, git, and mutation posture.
- The `plan` profile reuses the existing queue-mode write gate, so read-only commands and `.sf/` planning artifacts are allowed while source mutations are denied.
- The `build` profile records destructive bash risk labels from the existing destructive-command classifier without changing runtime enforcement yet.
- Auto-mode now writes `execution-policy-decision` journal events for tool calls, recording the profile, allow/deny result, risk, destructive labels, tool name, call id, and policy-relevant command/path only.
Next slices should project these profile decisions into UOK evidence and the machine-surface JSON/JSONL projections before broad enforcement.
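To make the decision event concrete, here is a minimal sketch under stated assumptions. It is not the contents of `execution-policy.js`: the profile table is reduced to a single `sourceWrites` flag, the regex classifier is a toy stand-in for the existing destructive-command classifier, the `plan` profile's read-only command check is omitted, and the event field names are guesses based on the attributes listed above.

```javascript
// Hypothetical sketch of a policy decision event; all shapes are assumptions.
const PROFILES = {
  plan: { sourceWrites: false },
  build: { sourceWrites: true },
  trusted: { sourceWrites: true },
  unrestricted: { sourceWrites: true },
};

// Toy stand-in for the real destructive-command classifier.
function destructiveLabels(command) {
  const labels = [];
  if (/\brm\s+-[a-z]*r/i.test(command)) labels.push("recursive-delete");
  if (/\bgit\s+push\s+.*--force\b/.test(command)) labels.push("force-push");
  return labels;
}

function decide(profileName, call) {
  const profile = PROFILES[profileName];
  if (!profile) throw new Error(`unknown profile: ${profileName}`);
  const isWrite = call.kind === "write";
  // Planning artifacts under .sf/ stay writable even when source writes are denied.
  const isPlanningPath = isWrite && call.path.startsWith(".sf/");
  const allowed = !isWrite || isPlanningPath || profile.sourceWrites;
  // Labels are recorded only; in this sketch they never change the result.
  const labels = call.kind === "bash" ? destructiveLabels(call.command) : [];
  return {
    type: "execution-policy-decision",
    profile: profileName,
    result: allowed ? "allow" : "deny",
    risk: labels.length > 0 ? "destructive" : "normal",
    destructiveLabels: labels,
    toolName: call.toolName,
    callId: call.callId,
  };
}

const denied = decide("plan", {
  kind: "write", path: "src/index.js", toolName: "write_file", callId: "c1",
});
const labeled = decide("build", {
  kind: "bash", command: "rm -rf build", toolName: "bash", callId: "c2",
});
console.log(denied.result); // prints: deny
console.log(labeled.result, labeled.destructiveLabels); // prints: allow [ 'recursive-delete' ]
```

The second call shows the record-without-enforcing stage: the risk label is journaled, but the `build` profile still allows the command.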
Resulting Direction
Forge should absorb proven patterns into UOK and the existing DB-first runtime: structured state, explicit modes, stronger permissions, reproducible evidence, and better review UX. The goal is not feature parity with every coder. The goal is a purpose-to-software compiler whose run control and permission profile are inspectable, recoverable, and safe enough to run repeatedly.