refactor: make bundled agents internal

Mikael Hugo 2026-05-14 19:54:56 +02:00
parent 18aa257ede
commit 5ce9df2e37
29 changed files with 161 additions and 89 deletions

@ -655,14 +655,34 @@
### src/resources/agents/ ### src/resources/agents/
Bundled autonomous worker-pool subagents. SF is primarily built for autonomous
operation, not as a general CLI coder with a persona menu, so these are routing
primitives for SF workflows rather than user-facing agent products. Markdown
frontmatter is the default format for one-body worker prompts. Use `.agent.yaml`
when the agent needs structured runtime metadata such as `promptParts`,
tool-policy contracts, or workflow-gate semantics.
| File | System Label(s) | Description | | File | System Label(s) | Description |
|------|-----------------|-------------| |------|-----------------|-------------|
| critic.md | Internal Subagent | Adversarial pre-implementation review agent used by execution flows |
| reviewer.md | Internal Subagent | Structured code/design review agent |
| javascript-pro.md | Subagent | JavaScript specialist agent definition | | javascript-pro.md | Subagent | JavaScript specialist agent definition |
| typescript-pro.md | Subagent | TypeScript specialist agent definition | | typescript-pro.md | Subagent | TypeScript specialist agent definition |
| worker.md | Subagent | Generic worker agent definition | | worker.md | Subagent | Generic worker agent definition |
| researcher.md | Subagent | Research and exploration agent definition | | researcher.md | Subagent | Research and exploration agent definition |
| scout.md | Subagent | Scout/pathfinding agent definition | | scout.md | Subagent | Scout/pathfinding agent definition |
### src/resources/extensions/sf/agents/
SF-owned workflow agents. YAML is the canonical format here because these
agents are loaded as part of SF control flows and may declare named
`promptParts`, tool allowlists, and machine-checked output contracts.
| File | System Label(s) | Description |
|------|-----------------|-------------|
| review-code.agent.yaml | SF Workflow Agent | Triage apply review gate; emits `review-code: agree` only when the plan can mutate state |
| triage-decider.agent.yaml | SF Workflow Agent | Plan-only self-feedback triage decider |
### src/resources/skills/ ### src/resources/skills/
| Skill Directory | System Label(s) | Description | | Skill Directory | System Label(s) | Description |

@ -28,6 +28,25 @@ The names below are separate axes. Do not use one as a synonym for another.
The flow is the product behavior: how SF captures intent, plans work, applies policy, executes tasks, records evidence, and reports status. Flow behavior must not fork by UI surface. If a TUI run and a non-interactive run receive the same state, run control, and permission profile, they should follow the same control model. The flow is the product behavior: how SF captures intent, plans work, applies policy, executes tasks, records evidence, and reports status. Flow behavior must not fork by UI surface. If a TUI run and a non-interactive run receive the same state, run control, and permission profile, they should follow the same control model.
## Product Modes
SF's user-facing product modes are about how much intent has been captured and
how much control the operator wants to retain. They are not agent personas.
- **Assisted build** — SF works step by step with the operator. It asks
clarifying questions when needed, proposes bounded next actions, and pauses at
important gates before continuing.
- **Plan/discuss** — the operator explores what they want, constraints, taste,
risks, and scope. On exit, SF converts the discussion into structured
database-backed planning state: milestones, slices, tasks, requirements,
decisions, or follow-up questions.
- **Autonomous** — SF attempts the full loop: research missing context, plan the
work, execute bounded units, verify, record evidence, update state, and keep
going until completion, policy, budget, confidence, or a gate stops it.
Bundled agents are implementation machinery inside these modes. They should
not be presented as the product model.
## Surface ## Surface
A surface is where a person or program drives or observes the flow. A surface is where a person or program drives or observes the flow.
@ -149,11 +168,18 @@ Markdown under `docs/specs/` is a human export for review, navigation, and git h
SF source placement follows the same axis model. New code should extend the owning axis instead of creating parallel trees. SF source placement follows the same axis model. New code should extend the owning axis instead of creating parallel trees.
SF is not trying to be a general CLI coder with a menu of personas. SF is a
purpose-to-software runtime: it captures intent, plans work, dispatches bounded
autonomous operations, records evidence, and exposes operator control surfaces.
Agents, skills, prompt parts, and workflow templates exist to support that
runtime.
### Core Flow ### Core Flow
- `src/resources/extensions/sf/` owns the SF workflow extension: planning tools, UOK/runtime state, `/next` commands, prompts, templates, doctors, schedule, and DB-backed state. - `src/resources/extensions/sf/` owns the SF workflow extension: planning tools, UOK/runtime state, `/next` commands, prompts, templates, doctors, schedule, and DB-backed state.
- `src/resources/extensions/` owns bundled extension packages loaded into the runtime. - `src/resources/extensions/` owns bundled extension packages loaded into the runtime.
- `src/resources/agents/`, `src/resources/skills/`, and `src/resources/workflows/` own bundled runtime resources, not independent product flows. - `src/resources/agents/`, `src/resources/skills/`, and `src/resources/workflows/` own bundled runtime resources, not independent product flows.
- SF is primarily designed for autonomous operation. Bundled agents are an internal worker pool for orchestration by default, not a user-facing marketplace or a CLI-coder persona menu. `src/resources/agents/*.md` is the simple one-body worker-agent format; `src/resources/extensions/sf/agents/*.agent.yaml` is the structured format for SF-owned workflow agents that need `promptParts`, tool-policy contracts, or gate output contracts.
### Surfaces ### Surfaces

@ -1,5 +1,11 @@
# Commands Reference # Commands Reference
SF is organized around product modes, not a menu of coding-agent personas:
assisted build (`/next`) advances one bounded step at a time with operator
control, plan/discuss (`/discuss`) turns conversation into structured
milestones/slices/tasks on exit, and autonomous (`/autonomous`) runs the full
research-plan-execute-verify loop until a gate stops it.
## Session Commands ## Session Commands
| Command | Description | | Command | Description |

@ -39,6 +39,7 @@ const BASE_RUNTIME_COMMAND_NAMES = new Set([
]); ]);
const HIDDEN_OR_ALIAS_SUBCOMMANDS = new Set([ const HIDDEN_OR_ALIAS_SUBCOMMANDS = new Set([
"?", "?",
"agent", // internal persistent-agent diagnostics, not part of the product command catalog
"auto", "auto",
"footer-config", // alias for /statusline "footer-config", // alias for /statusline
"h", "h",

@ -1,8 +1,9 @@
--- ---
name: review-code name: critic
description: Constructive pre-implementation critic — catches design flaws, missing edge cases, and gaps before code is written description: Constructive pre-implementation critic — catches design flaws, missing edge cases, and gaps before code is written
model: sonnet model: sonnet
tools: read, grep, find, ls, bash tools: read, grep, find, ls, bash
visibility: internal
--- ---
You are a constructive critic. Your job is to identify real problems in a plan, design, or code change **before** implementation is committed to — when course corrections are still cheap. You are a constructive critic. Your job is to identify real problems in a plan, design, or code change **before** implementation is committed to — when course corrections are still cheap.

@ -2,6 +2,7 @@
name: debugger name: debugger
description: Hypothesis-driven bug investigation with root cause analysis description: Hypothesis-driven bug investigation with root cause analysis
model: sonnet model: sonnet
visibility: internal
--- ---
You are a debugger. Investigate bugs using a systematic, hypothesis-driven approach. Your goal is to find the root cause, not just suppress symptoms. You are a debugger. Investigate bugs using a systematic, hypothesis-driven approach. Your goal is to find the root cause, not just suppress symptoms.

@ -2,6 +2,7 @@
name: doc-writer name: doc-writer
description: Documentation generation from code — API docs, inline comments, READMEs description: Documentation generation from code — API docs, inline comments, READMEs
model: sonnet model: sonnet
visibility: internal
--- ---
You are a documentation specialist. You read code and produce clear, accurate documentation. You write for the reader, not the author — explain what they need to know to use or maintain the code. You are a documentation specialist. You read code and produce clear, accurate documentation. You write for the reader, not the author — explain what they need to know to use or maintain the code.

@ -2,6 +2,7 @@
name: git-ops name: git-ops
description: Conflict resolution, rebase strategy, PR preparation, and changelog generation description: Conflict resolution, rebase strategy, PR preparation, and changelog generation
model: sonnet model: sonnet
visibility: internal
--- ---
You are a git operations specialist. You handle merge conflicts, plan rebase strategies, prepare pull requests, and generate changelogs. You understand git internals well enough to choose the right strategy for each situation. You are a git operations specialist. You handle merge conflicts, plan rebase strategies, prepare pull requests, and generate changelogs. You understand git internals well enough to choose the right strategy for each situation.

@ -2,6 +2,7 @@
name: javascript-pro name: javascript-pro
description: "Modern JavaScript specialist for browser, Node.js, and full-stack applications requiring ES2023+ features, async patterns, or performance-critical implementations. Use when building WebSocket servers, refactoring callback-heavy code to async/await, investigating memory leaks in Node.js, scaffolding ES module libraries with Jest and ESLint, optimizing DOM-heavy rendering, or reviewing JavaScript implementations for modern patterns and test coverage." description: "Modern JavaScript specialist for browser, Node.js, and full-stack applications requiring ES2023+ features, async patterns, or performance-critical implementations. Use when building WebSocket servers, refactoring callback-heavy code to async/await, investigating memory leaks in Node.js, scaffolding ES module libraries with Jest and ESLint, optimizing DOM-heavy rendering, or reviewing JavaScript implementations for modern patterns and test coverage."
model: sonnet model: sonnet
visibility: internal
--- ---
You are a senior JavaScript developer with mastery of modern JavaScript ES2023+ and Node.js 20+. You write production-grade code that prioritizes correctness, readability, performance, and maintainability — in that order. You are a senior JavaScript developer with mastery of modern JavaScript ES2023+ and Node.js 20+. You write production-grade code that prioritizes correctness, readability, performance, and maintainability — in that order.

@ -2,6 +2,7 @@
name: planner name: planner
description: Architecture and implementation planning — outputs plans, not code description: Architecture and implementation planning — outputs plans, not code
model: sonnet model: sonnet
visibility: internal
conflicts_with: plan-milestone, plan-slice, plan-task, research-milestone, research-slice conflicts_with: plan-milestone, plan-slice, plan-task, research-milestone, research-slice
--- ---

@ -2,6 +2,7 @@
name: refactorer name: refactorer
description: Safe code transformations — extract, inline, rename, simplify description: Safe code transformations — extract, inline, rename, simplify
model: sonnet model: sonnet
visibility: internal
--- ---
You are a refactoring specialist. You perform safe, behavior-preserving code transformations. Every refactoring must maintain identical external behavior — no feature changes, no bug fixes mixed in. You are a refactoring specialist. You perform safe, behavior-preserving code transformations. Every refactoring must maintain identical external behavior — no feature changes, no bug fixes mixed in.

@ -2,6 +2,7 @@
name: researcher name: researcher
description: Web researcher that finds and synthesizes current information using Brave Search description: Web researcher that finds and synthesizes current information using Brave Search
tools: search-the-web, bash tools: search-the-web, bash
visibility: internal
--- ---
You are a web researcher. You find current, accurate information using web search and synthesize it into a clear, well-structured report. You are a web researcher. You find current, accurate information using web search and synthesize it into a clear, well-structured report.

@ -2,6 +2,7 @@
name: reviewer name: reviewer
description: Structured code review with severity ratings and actionable fixes description: Structured code review with severity ratings and actionable fixes
model: sonnet model: sonnet
visibility: internal
--- ---
You are a code reviewer. Analyze code changes for bugs, security issues, performance problems, and maintainability concerns. Produce structured findings with severity ratings and concrete fixes. You are a code reviewer. Analyze code changes for bugs, security issues, performance problems, and maintainability concerns. Produce structured findings with severity ratings and concrete fixes.

@ -2,6 +2,7 @@
name: scout name: scout
description: Fast codebase recon that returns compressed context for handoff to other agents description: Fast codebase recon that returns compressed context for handoff to other agents
tools: read, grep, find, ls, bash, codebase_search tools: read, grep, find, ls, bash, codebase_search
visibility: internal
--- ---
You are a scout. Quickly investigate a codebase and return structured findings that another agent can use without re-reading everything. You are a scout. Quickly investigate a codebase and return structured findings that another agent can use without re-reading everything.

@ -2,6 +2,7 @@
name: security name: security
description: OWASP security audit, dependency risks, and secrets detection description: OWASP security audit, dependency risks, and secrets detection
model: sonnet model: sonnet
visibility: internal
--- ---
You are a security auditor. Analyze code for vulnerabilities, insecure patterns, exposed secrets, and dependency risks. Focus on findings that are exploitable, not theoretical. You are a security auditor. Analyze code for vulnerabilities, insecure patterns, exposed secrets, and dependency risks. Focus on findings that are exploitable, not theoretical.

@ -2,6 +2,7 @@
name: tester name: tester
description: Test writing, fixing, and coverage gap identification description: Test writing, fixing, and coverage gap identification
model: sonnet model: sonnet
visibility: internal
--- ---
You are a testing specialist. Write tests, fix broken tests, and identify coverage gaps. You prioritize tests that catch real bugs over tests that merely increase coverage numbers. You are a testing specialist. Write tests, fix broken tests, and identify coverage gaps. You prioritize tests that catch real bugs over tests that merely increase coverage numbers.

@ -2,6 +2,7 @@
name: typescript-pro name: typescript-pro
description: "TypeScript specialist for advanced type system patterns, complex generics, type-level programming, and end-to-end type safety across full-stack applications. Use when designing type-first APIs, creating branded types for domain modeling, building generic utilities, implementing discriminated unions for state machines, configuring tsconfig and build tooling, authoring type-safe libraries, setting up monorepo project references, migrating JavaScript to TypeScript, or optimizing TypeScript compilation and bundle performance." description: "TypeScript specialist for advanced type system patterns, complex generics, type-level programming, and end-to-end type safety across full-stack applications. Use when designing type-first APIs, creating branded types for domain modeling, building generic utilities, implementing discriminated unions for state machines, configuring tsconfig and build tooling, authoring type-safe libraries, setting up monorepo project references, migrating JavaScript to TypeScript, or optimizing TypeScript compilation and bundle performance."
model: sonnet model: sonnet
visibility: internal
--- ---
You are a senior TypeScript developer with mastery of TypeScript 5.0+ and its ecosystem. You specialize in advanced type system features, full-stack type safety, and modern build tooling. Types are the specification — start there. You are a senior TypeScript developer with mastery of TypeScript 5.0+ and its ecosystem. You specialize in advanced type system features, full-stack type safety, and modern build tooling. Types are the specification — start there.

@ -1,6 +1,7 @@
--- ---
name: worker name: worker
description: General-purpose subagent with full capabilities, isolated context description: General-purpose subagent with full capabilities, isolated context
visibility: internal
--- ---
You are a worker agent with full capabilities. You operate in an isolated context window to handle delegated tasks without polluting the main conversation. You are a worker agent with full capabilities. You operate in an isolated context window to handle delegated tasks without polluting the main conversation.

@ -1,5 +1,6 @@
name: review-code name: review-code
displayName: Review Code Agent displayName: Review Code Agent
visibility: internal
description: > description: >
A constructive critic and second opinion. Reviews proposals, designs, A constructive critic and second opinion. Reviews proposals, designs,
decision matrices, or implementations and surfaces ONLY weak points that decision matrices, or implementations and surfaces ONLY weak points that

@ -1,5 +1,6 @@
name: triage-decider name: triage-decider
displayName: Self-Feedback Triage Decider displayName: Self-Feedback Triage Decider
visibility: internal
description: > description: >
Reads the open self-feedback queue and proposes a decision plan Reads the open self-feedback queue and proposes a decision plan
(Fix, Promote, or Close per entry). PLAN-ONLY: this agent does NOT (Fix, Promote, or Close per entry). PLAN-ONLY: this agent does NOT

@ -1,9 +1,11 @@
/** /**
* commands-agent.js /agent command handler for persistent agent management. * commands-agent.js /agent command handler for persistent agent management.
* *
* Purpose: expose persistent agent state (identity, memory blocks, archival, inbox) * Purpose: expose persistent agent state (identity, memory blocks, archival,
* as a first-class SF command surface so operators can inspect, reset, and delete * inbox) as an internal operator/debug command so maintainers can inspect,
* named agents without touching the SQLite DB directly. * reset, and delete named agents without touching the SQLite DB directly.
* This is not a normal user workflow; SF's product modes are assisted build,
* plan/discuss, and autonomous operation.
* *
* Consumer: ops.js dispatcher for the /agent slash command. * Consumer: ops.js dispatcher for the /agent slash command.
*/ */

@ -1,6 +1,6 @@
import { existsSync, readdirSync, readFileSync } from "node:fs"; import { existsSync, readdirSync, readFileSync } from "node:fs";
import { sfHome } from '../sf-home.js';
import { join } from "node:path"; import { join } from "node:path";
import { sfHome } from "../sf-home.js";
import { import {
loadRegistry, loadRegistry,
workflowTemplateCommandDefinitions, workflowTemplateCommandDefinitions,
@ -157,10 +157,6 @@ export const TOP_LEVEL_SUBCOMMANDS = [
desc: "Switch to repair work mode and run diagnostics [--autonomous]", desc: "Switch to repair work mode and run diagnostics [--autonomous]",
}, },
{ cmd: "tasks", desc: "Background work surface — units, workers, budget" }, { cmd: "tasks", desc: "Background work surface — units, workers, budget" },
{
cmd: "agent",
desc: "Persistent agent management — list|inspect|reset|delete named agents",
},
{ {
cmd: "skills", cmd: "skills",
desc: "List discovered skills from .agents/skills/ [reload|--eval|--auto-create]", desc: "List discovered skills from .agents/skills/ [reload|--eval|--auto-create]",
@ -299,10 +295,6 @@ export const TOP_LEVEL_SUBCOMMANDS = [
cmd: "keep-alive", cmd: "keep-alive",
desc: "Prevent system sleep during long runs (caffeinate / systemd-inhibit)", desc: "Prevent system sleep during long runs (caffeinate / systemd-inhibit)",
}, },
{
cmd: "review-code",
desc: "Dispatch a review-code subagent for constructive pre-implementation review",
},
{ {
cmd: "delegate", cmd: "delegate",
desc: "Create a GitHub PR from the current branch via gh pr create", desc: "Create a GitHub PR from the current branch via gh pr create",

View file

@ -437,7 +437,10 @@ Examples:
); );
return true; return true;
} }
if (trimmed === "scaffold migrate" || trimmed.startsWith("scaffold migrate ")) { if (
trimmed === "scaffold migrate" ||
trimmed.startsWith("scaffold migrate ")
) {
const { handleScaffoldMigrate } = await import( const { handleScaffoldMigrate } = await import(
"../../commands-scaffold-migrate.js" "../../commands-scaffold-migrate.js"
); );
@ -510,14 +513,6 @@ Examples:
await handleKeepAlive(trimmed.replace(/^keep-alive\s*/, "").trim(), ctx); await handleKeepAlive(trimmed.replace(/^keep-alive\s*/, "").trim(), ctx);
return true; return true;
} }
if (
trimmed === "review-code" ||
trimmed.startsWith("review-code ")
) {
const input = trimmed.replace(/^review-code\s*/, "").trim();
await handleReviewCodeCommand(input, ctx, pi);
return true;
}
if (trimmed === "delegate" || trimmed.startsWith("delegate ")) { if (trimmed === "delegate" || trimmed.startsWith("delegate ")) {
await handleDelegateCommand( await handleDelegateCommand(
trimmed.replace(/^delegate\s*/, "").trim(), trimmed.replace(/^delegate\s*/, "").trim(),
@ -608,55 +603,6 @@ async function handleKeepAlive(args, ctx) {
} }
} }
// ─── /review-code ────────────────────────────────────────────────────────────
async function handleReviewCodeCommand(topic, ctx, _pi) {
const { execSync } = await import("node:child_process");
const root = projectRoot();
// Gather git diff for context (staged + unstaged, capped to avoid token bloat)
let diff = "";
try {
const staged = execSync("git diff --cached --stat 2>/dev/null || true", {
cwd: root,
encoding: "utf-8",
}).trim();
const unstaged = execSync("git diff --stat 2>/dev/null || true", {
cwd: root,
encoding: "utf-8",
}).trim();
if (staged || unstaged) {
const fullDiff = execSync(
"git diff --cached 2>/dev/null; git diff 2>/dev/null",
{ cwd: root, encoding: "utf-8" },
).slice(0, 8000);
diff = `\n\n## Current diff (truncated to 8 kB)\n\n\`\`\`diff\n${fullDiff}\n\`\`\``;
}
} catch {
// diff unavailable — not a hard failure
}
const focus = topic ? `Focus on: ${topic}\n\n` : "";
const reviewPrompt =
`Dispatch a \`review-code\` subagent to review the current plan or changes before proceeding. ` +
`Use the \`subagent\` tool with \`agent: "review-code"\`.\n\n` +
`${focus}` +
`Ask the review-code agent to identify blocking issues, non-blocking issues, and suggestions. ` +
`After the subagent returns, summarise the verdict and any blocking findings in one short paragraph. ` +
`Do not proceed with implementation until the user acknowledges blocking findings.` +
diff;
ctx.ui.notify("Dispatching review-code review…", "info");
try {
await ctx.sendMessage?.(reviewPrompt);
} catch {
ctx.ui.notify(
"Could not dispatch review-code. Try: subagent agent=review-code task='review current changes'",
"warning",
);
}
}
// ─── /delegate ─────────────────────────────────────────────────────────────── // ─── /delegate ───────────────────────────────────────────────────────────────
async function handleDelegateCommand(args, ctx) { async function handleDelegateCommand(args, ctx) {

@ -57,7 +57,6 @@
], ],
"commands": [ "commands": [
"add-tests", "add-tests",
"agent",
"ask", "ask",
"audit", "audit",
"autonomous", "autonomous",
@ -145,7 +144,6 @@
"reset-slice", "reset-slice",
"rethink", "rethink",
"rewind", "rewind",
"review-code",
"run-hook", "run-hook",
"scaffold", "scaffold",
"scan", "scan",

@ -36,7 +36,7 @@ Then:
0. Narrate step transitions, key implementation decisions, and verification outcomes as you work. Keep it terse — one line between tool-call clusters, not between every call — but write complete sentences in user-facing prose, not shorthand notes or scratchpad fragments. 0. Narrate step transitions, key implementation decisions, and verification outcomes as you work. Keep it terse — one line between tool-call clusters, not between every call — but write complete sentences in user-facing prose, not shorthand notes or scratchpad fragments.
0a. **Batch independent tool calls in parallel.** When the next step needs to read or grep multiple files/paths that don't depend on each other's results, issue them in a single tool-call message (multiple tool uses in one assistant turn) rather than one-at-a-time. Examples: reading the handler + the test file + the schema file to triangulate a bug; greping for two unrelated symbols. Sequential tool calls are only correct when each call's input genuinely depends on the previous call's output. Talking-then-doing is also dead weight — if the next action is unambiguous, just take it; describe what you found in the result, not what you plan to look at. 0a. **Batch independent tool calls in parallel.** When the next step needs to read or grep multiple files/paths that don't depend on each other's results, issue them in a single tool-call message (multiple tool uses in one assistant turn) rather than one-at-a-time. Examples: reading the handler + the test file + the schema file to triangulate a bug; greping for two unrelated symbols. Sequential tool calls are only correct when each call's input genuinely depends on the previous call's output. Talking-then-doing is also dead weight — if the next action is unambiguous, just take it; describe what you found in the result, not what you plan to look at.
0b. **Swarm opportunity check.** Before implementation, decide whether this task can be split into a 2-3 worker same-model swarm. Swarm only if the shards have disjoint file/directory ownership, no shared-interface or lockfile edits, shard-local verification, and clear wall-clock savings. If it passes, dispatch `subagent({ tasks: [...] })` with explicit write scopes, expected output files, and verification per worker; then inspect `git status --short`, synthesize results, resolve conflicts, and run final task verification yourself. If it does not pass, continue single-agent execution without ceremony. 0b. **Swarm opportunity check.** Before implementation, decide whether this task can be split into a 2-3 worker same-model swarm. Swarm only if the shards have disjoint file/directory ownership, no shared-interface or lockfile edits, shard-local verification, and clear wall-clock savings. If it passes, dispatch `subagent({ tasks: [...] })` with explicit write scopes, expected output files, and verification per worker; then inspect `git status --short`, synthesize results, resolve conflicts, and run final task verification yourself. If it does not pass, continue single-agent execution without ceremony.
0c. **Review-code check (non-trivial tasks only).** If the task touches more than two files, introduces a new abstraction, changes an API boundary, or has a non-obvious failure mode — dispatch a `review-code` subagent with the task plan and any relevant existing code as context. Summarise its verdict in one line. If it returns a **Blocking** finding, address it before writing code. Skip this step for simple edits, test fixes, or renaming tasks. 0c. **Critic check (non-trivial tasks only).** If the task touches more than two files, introduces a new abstraction, changes an API boundary, or has a non-obvious failure mode — dispatch a `critic` subagent with the task plan and any relevant existing code as context. Summarise its verdict in one line. If it returns a **Blocking** finding, address it before writing code. Skip this step for simple edits, test fixes, or renaming tasks.
1. {{skillActivation}} Follow any activated skills before writing code. If no skills match this task, skip this step. 1. {{skillActivation}} Follow any activated skills before writing code. If no skills match this task, skip this step.
2. **Verify file existence before editing.** The task plan references specific files. Before reading or editing any file mentioned in the plan, confirm it exists with `ls`, `find`, or `existsSync`. If a referenced file does NOT exist, stop immediately — do not attempt to create it based on the plan's description of what "should" be there. The file may have been deleted, renamed, or moved. Escalate as `blocker_discovered: true` with a clear description of which file is missing and what the plan expected to find. This prevents phantom work on stale file paths. 2. **Verify file existence before editing.** The task plan references specific files. Before reading or editing any file mentioned in the plan, confirm it exists with `ls`, `find`, or `existsSync`. If a referenced file does NOT exist, stop immediately — do not attempt to create it based on the plan's description of what "should" be there. The file may have been deleted, renamed, or moved. Escalate as `blocker_discovered: true` with a clear description of which file is missing and what the plan expected to find. This prevents phantom work on stale file paths.
3. Execute the steps in the inlined task plan, adapting minor local mismatches when the surrounding code differs from the planner's snapshot 3. Execute the steps in the inlined task plan, adapting minor local mismatches when the surrounding code differs from the planner's snapshot

@ -153,11 +153,14 @@ set.
### Agent selection and model overrides ### Agent selection and model overrides
sf routes subagents through agent definitions in `src/resources/agents/`, sf routes subagents through agent definitions in `src/resources/agents/`,
`~/.sf/agent/agents/`, or project `.sf/agents/`. The actual tool schema uses `~/.sf/agent/agents/`, or project `.sf/agents/`. SF is not a general CLI coder
`agent`, `task`, optional per-task `model`, optional `cwd`, plus batch-level with a menu of personas; bundled agents are primarily SF's internal autonomous
`mode`/`rounds` for debates. worker pool. Direct human invocation is an operator/debug path. The actual tool
schema uses `agent`, `task`, optional per-task `model`, optional `cwd`, plus
batch-level `mode`/`rounds` for debates.
- `planner` — architecture and implementation planning; conflicts with active sf planning phases. - `planner` — architecture and implementation planning; conflicts with active sf planning phases.
- `critic` — adversarial pre-implementation review; surfaces blocking flaws before code is written.
- `scout` — fast codebase recon. - `scout` — fast codebase recon.
- `researcher` — web/current-info research. - `researcher` — web/current-info research.
- `reviewer` — independent code/design review. - `reviewer` — independent code/design review.
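
As a rough illustration of the schema fields named above, a batch request might be assembled as in the sketch below. The field names (`tasks`, `agent`, `task`, `model`, `cwd`, `mode`, `rounds`) come from the text; the helper `buildSubagentBatch`, every value, and in particular the `"debate"` mode string are assumptions made for illustration, not the actual tool contract.

```javascript
// Hypothetical helper: assembles a batch payload using the field names the
// documentation lists. It does not call any real SF API.
function buildSubagentBatch(tasks, { mode, rounds } = {}) {
  const batch = {
    tasks: tasks.map(({ agent, task, model, cwd }) => {
      if (!agent || !task) {
        throw new Error("each task needs both `agent` and `task`");
      }
      const entry = { agent, task };
      // `model` and `cwd` are optional per-task overrides.
      if (model !== undefined) entry.model = model;
      if (cwd !== undefined) entry.cwd = cwd;
      return entry;
    }),
  };
  // `mode` and `rounds` are batch-level and only matter for debates.
  if (mode !== undefined) batch.mode = mode;
  if (rounds !== undefined) batch.rounds = rounds;
  return batch;
}

const batch = buildSubagentBatch(
  [
    { agent: "critic", task: "Review the task plan for blocking flaws" },
    { agent: "scout", task: "Map the command dispatcher", cwd: "src/resources" },
  ],
  { mode: "debate", rounds: 2 }, // assumed values, for illustration only
);
```

Optional fields are left off the payload entirely rather than set to `undefined`, so a receiver can tell "not provided" apart from an explicit override.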

@ -60,6 +60,9 @@ function parseAgentTools(value) {
 function parseAgentModel(value) {
   return typeof value === "string" && value.trim() ? value.trim() : undefined;
 }
+function parseAgentVisibility(value) {
+  return value === "internal" ? "internal" : "public";
+}
 function isAgentFileName(name) {
   return (
     name.endsWith(".md") ||
@ -75,6 +78,7 @@ function parseMarkdownAgent(content) {
     description: frontmatter.description,
     tools: frontmatter.tools,
     model: frontmatter.model,
+    visibility: frontmatter.visibility,
     conflictsWith: frontmatter.conflicts_with,
     systemPrompt: body,
   };
@ -87,6 +91,7 @@ function parseYamlAgent(content) {
     description: doc.description,
     tools: doc.tools,
     model: doc.model,
+    visibility: doc.visibility,
     conflictsWith: doc.conflicts_with ?? doc.conflictsWith,
     systemPrompt: doc.prompt,
     promptParts: doc.promptParts,
@ -139,6 +144,7 @@ function loadAgentsFromDir(dir, source) {
       description: definition.description,
       tools: tools && tools.length > 0 ? tools : undefined,
       model: parseAgentModel(definition.model),
+      visibility: parseAgentVisibility(definition.visibility),
       conflictsWith,
       promptParts: definition.promptParts,
       sidekick: definition.sidekick,
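
A small sketch of how the new normalizer behaves may help when reading the loader change. `parseAgentVisibility` is copied from the hunk above; the sample inputs are hypothetical values chosen for illustration, not values taken from real agent files.

```javascript
// Copied from the diff: only the exact string "internal" is preserved.
function parseAgentVisibility(value) {
  return value === "internal" ? "internal" : "public";
}

// Hypothetical raw values as a frontmatter or YAML parser might hand them over.
const samples = ["internal", "public", "Internal", " internal ", undefined, true];

for (const sample of samples) {
  console.log(JSON.stringify(sample), "->", parseAgentVisibility(sample));
}
```

In this sketch every input other than the exact string `"internal"` falls back to `"public"`, including a differently cased or padded value, since the function does not trim the way `parseAgentModel` does. That is consistent with the tests in this commit requiring bundled agents to declare `visibility: internal` explicitly.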


@ -1637,22 +1637,34 @@ export default function (pi) {
   });
   // /subagent command - list available agents
   pi.registerCommand("subagent", {
-    description: "List available subagents",
-    handler: async (_args, ctx) => {
+    description:
+      "List public subagents; /subagent all shows SF's internal autonomous worker pool",
+    handler: async (args, ctx) => {
       const discovery = discoverAgents(ctx.cwd, "both");
-      if (discovery.agents.length === 0) {
+      const showAll = ["all", "--all", "-a"].includes(args.trim());
+      const visibleAgents = showAll
+        ? discovery.agents
+        : discovery.agents.filter((agent) => agent.visibility !== "internal");
+      if (visibleAgents.length === 0) {
         ctx.ui.notify(
-          "No agents found. Add .md files to ~/.sf/agent/agents/ or .sf/agents/",
-          "warning",
+          discovery.agents.length === 0
+            ? "No agents found. Add .md/.agent.yaml files to ~/.sf/agent/agents/ or .sf/agents/"
+            : "SF ships its bundled agents as an internal autonomous worker pool. Run /subagent all to inspect them.",
+          discovery.agents.length === 0 ? "warning" : "info",
         );
         return;
       }
-      const lines = discovery.agents.map(
+      const hiddenCount = discovery.agents.length - visibleAgents.length;
+      const lines = visibleAgents.map(
         (a) =>
           ` ${a.name} [${a.source}]${a.model ? ` (${a.model})` : ""}: ${a.description}`,
       );
+      const suffix =
+        hiddenCount > 0
+          ? `\n\n${hiddenCount} internal autonomous agent${hiddenCount === 1 ? "" : "s"} hidden. Run /subagent all to inspect them.`
+          : "";
       ctx.ui.notify(
-        `Available agents (${discovery.agents.length}):\n${lines.join("\n")}`,
+        `Available agents (${visibleAgents.length}${showAll ? " total" : " public"}):\n${lines.join("\n")}${suffix}`,
         "info",
       );
     },
@ -1664,8 +1676,8 @@ export default function (pi) {
       "Delegate tasks to specialized subagents with isolated context windows.",
       "Each subagent is a separate pi process with its own tools, model, and system prompt.",
       "Modes: single ({ agent, task }), parallel ({ tasks: [{agent, task},...] }), debate ({ mode: 'debate', rounds, tasks: [...] }), chain ({ chain: [{agent, task},...] } with {previous} placeholder).",
-      "Agents are defined as .md files in ~/.sf/agent/agents/ (user) or .sf/agents/ (project).",
-      "Use the /subagent command to list available agents and their descriptions.",
+      "Agents are defined as .md or .agent.yaml files in ~/.sf/agent/agents/ (user) or .sf/agents/ (project).",
+      "SF's bundled agents are primarily an internal autonomous worker pool; use /subagent all to inspect them when debugging orchestration.",
       "Use chain mode to pipeline: scout finds context, planner designs, worker implements.",
     ].join(" "),
     promptGuidelines: [
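
The listing logic added to the `/subagent` handler can be isolated as a pure function to make the default and `all` behaviors easier to compare. This is a minimal sketch, not SF code: `selectVisibleAgents` and the sample agent list are hypothetical, and the `"project"` source label is an assumption (only `"builtin"` appears in this commit's tests).

```javascript
// Illustrative helper: mirrors the filter and hidden-count arithmetic of the
// handler, operating on an in-memory list instead of discoverAgents().
function selectVisibleAgents(agents, args) {
  const showAll = ["all", "--all", "-a"].includes(args.trim());
  const visibleAgents = showAll
    ? agents
    : agents.filter((agent) => agent.visibility !== "internal");
  return { visibleAgents, hiddenCount: agents.length - visibleAgents.length };
}

// Hypothetical discovery result: one bundled agent, one project-defined agent.
const agents = [
  { name: "scout", source: "builtin", visibility: "internal" },
  { name: "my-helper", source: "project", visibility: "public" },
];

const byDefault = selectVisibleAgents(agents, "");
const withAll = selectVisibleAgents(agents, " all ");

console.log(byDefault.visibleAgents.map((a) => a.name), byDefault.hiddenCount);
console.log(withAll.visibleAgents.map((a) => a.name), withAll.hiddenCount);
```

With these inputs the default call lists only `my-helper` and reports one hidden agent, while `all` lists both and reports none hidden.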


@ -1,7 +1,13 @@
 import assert from "node:assert/strict";
-import { mkdirSync, mkdtempSync, writeFileSync } from "node:fs";
+import {
+  mkdirSync,
+  mkdtempSync,
+  readdirSync,
+  readFileSync,
+  writeFileSync,
+} from "node:fs";
 import { tmpdir } from "node:os";
-import { join } from "node:path";
+import { join, resolve } from "node:path";
 import { test } from "vitest";
 import { discoverAgents, validateAgentDefinition } from "../subagent/agents.js";
@ -118,7 +124,7 @@ test("discoverAgents_when_scope_both_includes_builtin_review_code_and_triage_dec
   // must be discoverable without operator setup. They're the foundation for
   // SF's self-driven triage pipeline (sf-mp5lnlbc-ty5fec). Isolate from the
   // real ~/.sf/agent/agents/ so the test doesn't conflict with the
-  // operator's personal review-code.md if present.
+  // operator's personal agents if present.
   const isolatedAgentDir = mkdtempSync(join(tmpdir(), "sf-agent-dir-"));
   const originalEnv = process.env.SF_CODING_AGENT_DIR;
   process.env.SF_CODING_AGENT_DIR = isolatedAgentDir;
@ -286,3 +292,41 @@ test("discoverAgents_when_scope_both_validates_builtin_promptParts_contract", ()
     else process.env.SF_CODING_AGENT_DIR = originalEnv;
   }
 });
+test("discoverAgents_when_loading_bundled_agents_marks_them_internal_by_default_policy", () => {
+  const isolatedAgentDir = mkdtempSync(join(tmpdir(), "sf-agent-dir-"));
+  const originalEnv = process.env.SF_CODING_AGENT_DIR;
+  process.env.SF_CODING_AGENT_DIR = isolatedAgentDir;
+  try {
+    const project = makeProject();
+    const { agents } = discoverAgents(project, "both");
+    const bundled = agents.filter((entry) => entry.source === "builtin");
+    assert.ok(bundled.length > 0, "expected bundled SF agents");
+    for (const agent of bundled) {
+      assert.equal(
+        agent.visibility,
+        "internal",
+        `${agent.name} should be internal; SF uses bundled agents as an autonomous worker pool, not a user-facing CLI-coder palette`,
+      );
+    }
+  } finally {
+    if (originalEnv === undefined) delete process.env.SF_CODING_AGENT_DIR;
+    else process.env.SF_CODING_AGENT_DIR = originalEnv;
+  }
+});
+test("bundled_markdown_agents_are_internal_autonomous_workers_not_public_palette", () => {
+  const agentsDir = resolve(import.meta.dirname, "..", "..", "..", "agents");
+  const files = readdirSync(agentsDir).filter((file) => file.endsWith(".md"));
+  assert.ok(files.length > 0, "expected bundled markdown agents");
+  for (const file of files) {
+    const content = readFileSync(join(agentsDir, file), "utf8");
+    assert.match(
+      content,
+      /^visibility:\s*internal$/m,
+      `${file} should be internal; SF is autonomous-first, not a CLI-coder agent marketplace`,
+    );
+  }
+});
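
To see what the frontmatter check in the last test accepts, the same regular expression can be run against in-memory strings. The two agent files below are hypothetical (including the `name` key and the prompt text); only the pattern is taken from the test.

```javascript
// Same pattern as the test: a whole line reading "visibility: internal".
const pattern = /^visibility:\s*internal$/m;

// Hypothetical bundled agent that declares its visibility.
const internalAgent = [
  "---",
  "name: example-agent",
  "visibility: internal",
  "---",
  "Prompt body.",
].join("\n");

// Hypothetical agent with no visibility key at all.
const undeclaredAgent = ["---", "name: example-agent", "---", "Prompt body."].join("\n");

console.log(pattern.test(internalAgent)); // true
console.log(pattern.test(undeclaredAgent)); // false
```

Because the pattern is applied to the full file contents rather than a parsed frontmatter block, it checks for the presence of the line, not where in the file it appears.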