Autonomous mode is SF's product-development execution engine for the purpose-to-software compiler. It advances only from structured state: bounded intent, PDD fields, research assumptions, tests or executable evidence, implementation, verification, and recorded outcomes. Run `/sf autonomous`, walk away, come back to built software with clean git history.
Autonomous mode is a **state machine driven by structured state on disk**. It reads `.sf` state, determines the next unit of work, creates a fresh agent session, injects a focused prompt with all relevant context pre-inlined, and lets the LLM execute. When the LLM finishes, autonomous mode reads disk state again and dispatches the next unit. Markdown files are projections for humans when structured state exists.
Plan (with integrated research) → Execute (per task) → Complete → Reassess Roadmap → Next Slice
↓ (all slices done)
Validate Milestone → Complete Milestone
```
- **Plan** — scouts the codebase, researches relevant docs, and decomposes the slice into tasks with must-haves
- **Execute** — runs each task in a fresh context window
- **Complete** — writes summary, UAT script, marks roadmap, commits
- **Reassess** — checks if the roadmap still makes sense
- **Validate Milestone** — reconciliation gate after all slices complete; compares roadmap success criteria against actual results, catches gaps before sealing the milestone
## Key Properties
### Fresh Session Per Unit
Every task, research phase, and planning step gets a clean context window. No accumulated garbage. No degraded quality from context bloat. The dispatch prompt includes everything needed — task plans, prior summaries, dependency context, decisions register — so the LLM starts oriented instead of spending tool calls reading files.
### Context Pre-Loading
The dispatch prompt is carefully constructed with:
| Inlined Artifact | Purpose |
|------------------|---------|
| Task plan | What to build |
| Slice plan | Where this task fits |
| Prior task summaries | What's already done |
| Dependency summaries | Cross-slice context |
| Roadmap excerpt | Overall direction |
| Decisions register | Architectural context |
The amount of context inlined is controlled by your [token profile](./token-optimization.md). Budget mode inlines minimal context; quality mode inlines everything.
- **`worktree`** (default): Each milestone runs in its own git worktree at `.sf/worktrees/<MID>/` on a `milestone/<MID>` branch. All slice work commits sequentially — no branch switching, no merge conflicts mid-milestone. When the milestone completes, it's squash-merged to main as one clean commit.
- **`branch`**: Work happens in the project root on a `milestone/<MID>` branch. Useful for submodule-heavy repos where worktrees don't work well.
- **`none`**: Work happens directly on your current branch. No worktree, no milestone branch. Ideal for hot-reload workflows where file isolation breaks dev tooling.
See [Git Strategy](./git-strategy.md) for details.
### Parallel Execution
When your project has independent milestones, you can run them simultaneously. Each milestone gets its own worker process and worktree. See [Parallel Orchestration](./parallel-orchestration.md) for setup and usage.
A lock file tracks the current unit. If the session dies, the next `/sf autonomous` reads the surviving session file, synthesizes a recovery briefing from every tool call that made it to disk, and resumes with full context.
SF maintains a `KNOWLEDGE.md` file — an append-only register of project-specific rules, patterns, and lessons learned. The agent reads it at the start of every unit and appends to it when discovering recurring issues, non-obvious patterns, or rules that future sessions should follow. This gives autonomous mode cross-session memory that survives context window boundaries.
When context usage reaches 70%, SF sends a wrap-up signal to the agent, nudging it to finish durable output (commit, write summaries) before the context window fills. This prevents sessions from hitting the hard context limit mid-task with no artifacts written.
Commits are generated from task summaries — not generic "complete task" messages. Each commit message reflects what was actually built, giving clean `git log` output that reads like a changelog.
SF uses a sliding-window analysis to detect stuck loops. Instead of a simple "same unit dispatched twice" counter, the detector examines recent dispatch history for repeated patterns — catching cycles like A→B→A→B as well as single-unit repeats. On detection, SF retries once with a deep diagnostic prompt. If it fails again, autonomous mode stops with the exact file it expected, so you can intervene.
The sliding-window approach reduces false positives on legitimate retries (e.g., verification failures that self-correct) while catching genuine stuck loops faster.
Every unit's token usage and cost is captured, broken down by phase, slice, and model. The dashboard shows running totals and projections. Budget ceilings can pause autonomous mode before overspending.
After each slice completes, the roadmap is reassessed. If the work revealed new information that changes the plan, slices are reordered, added, or removed before continuing. This can be skipped with the `balanced` or `budget` token profiles.
### Verification Enforcement (v2.26)
Configure shell commands that run automatically after every task execution:
```yaml
verification_commands:
- npm run lint
- npm run test
verification_auto_fix: true # auto-retry on failure (default)
verification_max_retries: 2 # max retry attempts (default: 2)
```
Failures trigger auto-fix retries — the agent sees the verification output and attempts to fix the issues before advancing. This ensures code quality gates are enforced mechanically, not by LLM compliance.
### Slice Discussion Gate (v2.26)
For projects where you want human review before each slice begins:
Autonomous mode pauses before each slice, presenting the slice context for discussion. After you confirm, execution continues. Useful for high-stakes projects where you want to review the plan before the agent builds.
After a milestone completes, SF auto-generates a self-contained HTML report in `.sf/reports/`. Reports include project summary, progress tree, slice dependency graph (SVG DAG), cost/token metrics with bar charts, execution timeline, changelog, and knowledge base. No external dependencies — all CSS and JS are inlined.
v2.28 hardens autonomous mode reliability with multiple safeguards: atomic file writes prevent corruption on crash, OAuth fetch timeouts (30s) prevent indefinite hangs, RPC subprocess exit is detected and reported, and blob garbage collection prevents unbounded disk growth. Combined with the existing crash recovery and headless autonomous-restart, autonomous mode is designed for true "fire and forget" overnight execution.
This linear flow is easier to debug, uses less memory (no recursive call stack), and provides cleaner error recovery since each phase has well-defined entry and exit conditions.
Issues are classified by severity: `error` (blocks autonomous mode), `warning` (non-blocking), and `info` (advisory). Autonomous mode checks health at dispatch time and can pause on critical issues.
Configured skills are automatically resolved and injected into dispatch prompts. The agent receives an "Available Skills" block listing skills that match the current context, based on:
-`always_use_skills` — always included
-`prefer_skills` — included with preference indicator
-`skill_rules` — conditional activation based on `when` clauses
See [Configuration](./configuration.md) for skill routing preferences.
When enabled, autonomous mode automatically selects cheaper models for simple units (slice completion, UAT) and reserves expensive models for complex work (replanning, architectural tasks). See [Dynamic Model Routing](./dynamic-model-routing.md).
When `reactive_execution: true` is set in preferences, SF derives a dependency graph from IO annotations in task plans. Tasks that don't conflict (no shared file reads/writes) are dispatched in parallel via subagents, while dependent tasks wait for their predecessors to complete.
The graph derivation is pure and deterministic — it resolves a ready-set of tasks, detects conflicts, and guards against deadlocks. Verification results carry forward across parallel batches, so tasks that pass verification don't need to be re-verified when subsequent tasks in the same slice complete.
The implementation lives in `reactive-graph.ts` (graph derivation, ready-set resolution, conflict/deadlock detection) with integration into `auto-dispatch.ts` and `auto-prompts.ts`.