chore: init sf

This commit is contained in:
ace-pm 2026-04-21 01:38:02 +02:00
parent e63184f91d
commit 485e8f608e
No known key found for this signature in database
17 changed files with 25 additions and 1263 deletions

1
.gitignore vendored
View file

@ -90,3 +90,4 @@ bun.lock
.direnv/
.envrc
.serena/
repowise.db

View file

@ -1,215 +0,0 @@
---
name: sf-orchestrator
description: >
Build software products autonomously via SF headless mode. Handles the full
lifecycle: write a spec, launch a build, poll for completion, handle blockers,
track costs, and verify the result. Use when asked to "build something",
"create a project", "run sf", "check build status", or any task that
requires autonomous software development via subprocess.
metadata:
openclaw:
requires:
bins: [sf]
install:
kind: node
package: sf-run
bins: [sf]
---
<objective>
You are an autonomous agent that builds software by orchestrating SF as a subprocess.
SF is a headless CLI that plans, codes, tests, and ships software from a spec.
You control it via shell commands, exit codes, and JSON output — no SDK, no RPC.
</objective>
<mental_model>
SF headless is a subprocess you launch and monitor. Think of it like a junior developer
you hand a spec to:
1. You write the spec (what to build)
2. You launch the build (`sf headless ... new-milestone --context spec.md --auto`)
3. You wait for it to finish (exit code tells you the outcome)
4. You check the result (query state, inspect files, verify deliverables)
5. If blocked, you intervene (steer, supply answers, or escalate)
The subprocess handles all planning, coding, testing, and git commits internally.
You never write application code yourself — SF does that.
</mental_model>
<critical_rules>
- **Flags before command.** `sf headless [--flags] [command] [args]`. Flags after the command are ignored.
- **Redirect stderr.** JSON output goes to stdout. Progress goes to stderr. Always `2>/dev/null` when parsing JSON.
- **Check exit codes.** 0=success, 1=error, 10=blocked (needs you), 11=cancelled.
- **Use `query` to poll.** Instant (~50ms), no LLM cost. Use it between steps, not `auto` for status.
- **Budget awareness.** Track `cost.total` from query results. Set limits before launching long runs.
- **One project directory per build.** Each SF project needs its own directory with a `.sf/` folder.
</critical_rules>
<routing>
Route based on what you need to do:
**Build something from scratch:**
Read `workflows/build-from-spec.md` — write spec, init directory, launch, monitor, verify.
**Check on a running or completed build:**
Read `workflows/monitor-and-poll.md` — query state, interpret phases, handle blockers.
**Execute with fine-grained control:**
Read `workflows/step-by-step.md` — run one unit at a time with decision points.
**Understand the JSON output:**
Read `references/json-result.md` — field reference for HeadlessJsonResult.
**Pre-supply answers or secrets:**
Read `references/answer-injection.md` — answer file schema and injection mechanism.
**Look up a specific command:**
Read `references/commands.md` — full command reference with flags and examples.
</routing>
<quick_reference>
**Launch a full build (spec to working code):**
```bash
mkdir -p /tmp/my-project && cd /tmp/my-project && git init
cat > spec.md << 'EOF'
# Your Product Spec Here
Build a ...
EOF
sf headless --output-format json --context spec.md new-milestone --auto 2>/dev/null
```
**Check project state (instant, free):**
```bash
cd /path/to/project
sf headless query | jq '{phase: .state.phase, progress: .state.progress, cost: .cost.total}'
```
**Resume work on an existing project:**
```bash
cd /path/to/project
sf headless --output-format json auto 2>/dev/null
```
**Run one step at a time:**
```bash
RESULT=$(sf headless --output-format json next 2>/dev/null)
echo "$RESULT" | jq '{status: .status, phase: .phase, cost: .cost.total}'
```
</quick_reference>
<exit_codes>
| Code | Meaning | Your action |
|------|---------|-------------|
| `0` | Success | Check deliverables, verify output, report completion |
| `1` | Error or timeout | Inspect stderr, check `.sf/STATE.md`, retry or escalate |
| `10` | Blocked | Query state for blocker details, steer around it or escalate to human |
| `11` | Cancelled | Process was interrupted — resume with `--resume <sessionId>` or restart |
</exit_codes>
<project_structure>
SF creates and manages all state in `.sf/`:
```
.sf/
PROJECT.md # What this project is
REQUIREMENTS.md # Capability contract
DECISIONS.md # Architectural decisions (append-only)
KNOWLEDGE.md # Persistent project knowledge (patterns, rules, lessons)
STATE.md # Current phase and next action
milestones/
M001-xxxxx/
M001-xxxxx-CONTEXT.md # Scope, constraints, assumptions
M001-xxxxx-ROADMAP.md # Slices with checkboxes
M001-xxxxx-SUMMARY.md # Completion summary
slices/S01/
S01-PLAN.md # Tasks
S01-SUMMARY.md # Slice summary
tasks/
T01-PLAN.md # Individual task spec
T01-SUMMARY.md # Task completion summary
```
State is derived from files on disk — checkboxes in ROADMAP.md and PLAN.md are the source of truth for completion. You never need to edit these files. SF manages them. But you can read them to understand progress.
</project_structure>
<flags>
| Flag | Description |
|------|-------------|
| `--output-format <fmt>` | `text` (default), `json` (structured result at exit), `stream-json` (JSONL events) |
| `--json` | Alias for `--output-format stream-json` — JSONL event stream to stdout |
| `--bare` | Skip CLAUDE.md, AGENTS.md, user settings, user skills. Use for CI/ecosystem runs. |
| `--resume <id>` | Resume a prior headless session by its session ID |
| `--timeout N` | Overall timeout in ms (default: 300000, use 0 to disable) |
| `--model ID` | Override LLM model |
| `--supervised` | Forward interactive UI requests to orchestrator via stdout/stdin |
| `--response-timeout N` | Timeout (ms) for orchestrator response in supervised mode (default: 30000) |
| `--answers <path>` | Pre-supply answers and secrets from JSON file |
| `--events <types>` | Filter JSONL to specific event types (comma-separated, implies `--json`) |
| `--verbose` | Show tool calls in progress output |
| `--context <path>` | Spec file path for `new-milestone` (use `-` for stdin) |
| `--context-text <text>` | Inline spec text for `new-milestone` |
| `--auto` | Chain into auto-mode after `new-milestone` |
</flags>
<answer_injection>
Pre-supply answers and secrets for fully autonomous runs:
```bash
sf headless --answers answers.json --output-format json auto 2>/dev/null
```
```json
{
"questions": { "question_id": "selected_option" },
"secrets": { "API_KEY": "sk-..." },
"defaults": { "strategy": "first_option" }
}
```
- **questions** — question ID to answer (string for single-select, string[] for multi-select)
- **secrets** — env var to value, injected into child process environment
- **defaults.strategy**`"first_option"` (default) or `"cancel"` for unmatched questions
See `references/answer-injection.md` for the full mechanism.
</answer_injection>
<event_streaming>
For real-time monitoring, use JSONL event streaming:
```bash
sf headless --json auto 2>/dev/null | while read -r line; do
TYPE=$(echo "$line" | jq -r '.type')
case "$TYPE" in
tool_execution_start) echo "Tool: $(echo "$line" | jq -r '.toolName')" ;;
extension_ui_request) echo "SF: $(echo "$line" | jq -r '.message // .title // empty')" ;;
agent_end) echo "Session ended" ;;
esac
done
```
Filter to specific events: `--events agent_end,execution_complete,extension_ui_request`
Available types: `agent_start`, `agent_end`, `tool_execution_start`, `tool_execution_end`,
`tool_execution_update`, `extension_ui_request`, `message_start`, `message_end`,
`message_update`, `turn_start`, `turn_end`, `cost_update`, `execution_complete`.
</event_streaming>
<all_commands>
| Command | Purpose |
|---------|---------|
| `auto` | Run all queued units until milestone complete or blocked (default) |
| `next` | Run exactly one unit, then exit |
| `query` | Instant JSON snapshot — state, next dispatch, costs (no LLM, ~50ms) |
| `new-milestone` | Create milestone from spec file |
| `dispatch <phase>` | Force specific phase (research, plan, execute, complete, reassess, uat, replan) |
| `stop` / `pause` | Control auto-mode |
| `steer <desc>` | Hard-steer plan mid-execution |
| `skip` / `undo` | Unit control |
| `queue` | Queue/reorder milestones |
| `history` | View execution history |
| `doctor` | Health check + auto-fix |
| `knowledge <rule>` | Add persistent project knowledge |
See `references/commands.md` for the complete reference.
</all_commands>

View file

@ -1,119 +0,0 @@
# Answer Injection
Pre-supply answers and secrets to eliminate interactive prompts during headless execution.
## Usage
```bash
sf headless --answers answers.json auto
sf headless --answers answers.json new-milestone --context spec.md --auto
```
The `--answers` flag takes a path to a JSON file containing pre-supplied answers and secrets.
## Answer File Schema
```json
{
"questions": {
"question_id": "selected_option_label",
"multi_select_question": ["option_a", "option_b"]
},
"secrets": {
"API_KEY": "sk-...",
"DATABASE_URL": "postgres://..."
},
"defaults": {
"strategy": "first_option"
}
}
```
### Fields
| Field | Type | Description |
|-------|------|-------------|
| `questions` | `Record<string, string \| string[]>` | Map question ID → answer. String for single-select, string array for multi-select. |
| `secrets` | `Record<string, string>` | Map env var name → value. Injected into child process environment variables. |
| `defaults.strategy` | `"first_option" \| "cancel"` | Fallback for unmatched questions. Default: `"first_option"`. |
## How Secrets Work
Secrets are injected as environment variables into the SF child process:
1. The orchestrator passes the answer file via `--answers`
2. SF reads the file and sets secret values as env vars in the child process
3. When `secure_env_collect` runs inside the agent, it finds the keys already in `process.env`
4. The tool skips the interactive prompt and reports the keys as "already configured"
Secrets are never logged or included in event streams.
## How Question Matching Works
Two-phase correlation:
1. **Observe** — SF monitors `tool_execution_start` events for `ask_user_questions` to extract question metadata (ID, options, allowMultiple)
2. **Match** — Subsequent `extension_ui_request` events are correlated to the metadata and responded to with the pre-supplied answer
Handles out-of-order events (extension_ui_request can arrive before tool_execution_start) via a deferred processing queue with 500ms timeout.
## Coexistence with `--supervised`
Both `--answers` and `--supervised` can be active simultaneously. Priority order:
1. Answer injector tries first
2. If no answer found, supervised mode forwards to the orchestrator
3. If no orchestrator response within `--response-timeout`, the auto-responder kicks in
## Without Answer Injection
Headless mode has built-in auto-responders for all prompt types:
| Prompt Type | Default Behavior |
|-------------|-----------------|
| Select | Picks first option |
| Confirm | Auto-confirms |
| Input | Empty string |
| Editor | Returns prefill or empty |
Answer injection overrides these defaults with specific answers when precision matters.
## Diagnostics
The injector tracks statistics printed in the session summary:
| Stat | Description |
|------|-------------|
| `questionsAnswered` | Questions resolved from the answer file |
| `questionsDefaulted` | Questions handled by the default strategy |
| `secretsProvided` | Number of secrets injected |
Unused question IDs and secret keys are warned about at exit.
## Example: Orchestrator with Answers
```bash
# Create answer file
cat > answers.json << 'EOF'
{
"questions": {
"test_framework": "vitest",
"package_manager": "pnpm"
},
"secrets": {
"OPENAI_API_KEY": "sk-...",
"DATABASE_URL": "postgres://localhost:5432/mydb"
},
"defaults": {
"strategy": "first_option"
}
}
EOF
# Run with pre-supplied answers
sf headless --answers answers.json --output-format json auto 2>/dev/null
# Parse result
RESULT=$(sf headless --answers answers.json --output-format json next 2>/dev/null)
echo "$RESULT" | jq '{status: .status, cost: .cost.total}'
```

View file

@ -1,210 +0,0 @@
# SF Commands Reference
All commands run as subprocesses via `sf headless [flags] [command] [args...]`.
## Global Flags
These flags apply to any `sf headless` invocation:
| Flag | Description |
|------|-------------|
| `--output-format <fmt>` | `text` (default), `json` (structured result), `stream-json` (JSONL) |
| `--json` | Alias for `--output-format stream-json` |
| `--bare` | Minimal context: skip CLAUDE.md, AGENTS.md, user settings, user skills |
| `--resume <id>` | Resume a prior headless session by ID |
| `--timeout N` | Overall timeout in ms (default: 300000) |
| `--model ID` | Override LLM model |
| `--supervised` | Forward interactive UI requests to orchestrator via stdout/stdin |
| `--response-timeout N` | Timeout for orchestrator response in supervised mode (default: 30000ms) |
| `--answers <path>` | Pre-supply answers and secrets from JSON file |
| `--events <types>` | Filter JSONL output to specific event types (comma-separated, implies `--json`) |
| `--verbose` | Show tool calls in progress output |
## Exit Codes
| Code | Meaning | When |
|------|---------|------|
| `0` | Success | Unit/milestone completed normally |
| `1` | Error or timeout | Runtime error, LLM failure, or `--timeout` exceeded |
| `10` | Blocked | Execution hit a blocker requiring human intervention |
| `11` | Cancelled | User or orchestrator cancelled the operation |
## Workflow Commands
### `auto` (default)
Autonomous mode — loop through all pending units until milestone complete or blocked.
```bash
sf headless --output-format json auto
```
### `next`
Step mode — execute exactly one unit (task/slice/milestone step), then exit. Recommended for orchestrators that need decision points between steps.
```bash
sf headless --output-format json next
```
### `new-milestone`
Create a milestone from a specification document.
```bash
sf headless new-milestone --context spec.md
sf headless new-milestone --context spec.md --auto
sf headless new-milestone --context-text "Build a REST API" --auto
cat spec.md | sf headless new-milestone --context - --auto
```
Extra flags:
- `--context <path>` — path to spec/PRD file (use `-` for stdin)
- `--context-text <text>` — inline specification text
- `--auto` — start auto-mode after milestone creation
### `dispatch <phase>`
Force-route to a specific phase, bypassing normal state-machine routing.
```bash
sf headless dispatch research
sf headless dispatch plan
sf headless dispatch execute
sf headless dispatch complete
sf headless dispatch reassess
sf headless dispatch uat
sf headless dispatch replan
```
### `discuss`
Start guided milestone/slice discussion.
```bash
sf headless discuss
```
### `stop`
Stop auto-mode gracefully.
```bash
sf headless stop
```
### `pause`
Pause auto-mode (preserves state, resumable).
```bash
sf headless pause
```
## State Inspection
### `query`
**Instant JSON snapshot** — state, next dispatch, parallel costs. No LLM, ~50ms. The recommended way for orchestrators to inspect state.
```bash
sf headless query
sf headless query | jq '.state.phase'
sf headless query | jq '.next'
sf headless query | jq '.cost.total'
```
### `status`
Progress dashboard (TUI overlay — useful interactively, not for parsing).
```bash
sf headless status
```
### `history`
Execution history. Supports `--cost`, `--phase`, `--model`, and `limit` arguments.
```bash
sf headless history
```
## Unit Control
### `skip`
Prevent a unit from auto-mode dispatch.
```bash
sf headless skip
```
### `undo`
Revert last completed unit. Use `--force` to bypass confirmation.
```bash
sf headless undo
sf headless undo --force
```
### `steer <description>`
Hard-steer plan documents during execution. Useful for mid-course corrections.
```bash
sf headless steer "Skip the blocked dependency, use mock instead"
```
### `queue`
Queue and reorder future milestones.
```bash
sf headless queue
```
## Configuration & Health
### `doctor`
Runtime health checks with auto-fix.
```bash
sf headless doctor
```
### `prefs`
Manage preferences (global/project/status/wizard/setup).
```bash
sf headless prefs
```
### `knowledge <rule|pattern|lesson>`
Add persistent project knowledge.
```bash
sf headless knowledge "Always use UTC timestamps in API responses"
```
## Phases
SF workflows progress through these phases:
```
pre-planning → needs-discussion → discussing → researching → planning →
executing → verifying → summarizing → advancing → validating-milestone →
completing-milestone → complete
```
Special phases: `paused`, `blocked`, `replanning-slice`
## Hierarchy
- **Milestone**: Shippable version (410 slices, 14 weeks)
- **Slice**: One demoable vertical capability (17 tasks, 13 days)
- **Task**: One context-window-sized unit of work (one session)

View file

@ -1,162 +0,0 @@
# HeadlessJsonResult Reference
When using `--output-format json`, SF collects events silently and emits a single `HeadlessJsonResult` JSON object to stdout at process exit. This is the structured result for orchestrator decision-making.
## Obtaining the Result
```bash
# Capture the JSON result
RESULT=$(sf headless --output-format json next 2>/dev/null)
EXIT=$?
# Parse fields with jq
echo "$RESULT" | jq '.status'
echo "$RESULT" | jq '.cost.total'
echo "$RESULT" | jq '.nextAction'
```
**Important:** Progress text goes to stderr. The JSON result goes to stdout. Redirect stderr to `/dev/null` when parsing stdout.
## Field Reference
### Top-Level Fields
| Field | Type | Description |
|-------|------|-------------|
| `status` | `"success" \| "error" \| "blocked" \| "cancelled" \| "timeout"` | Final session status. Maps directly to exit codes. |
| `exitCode` | `number` | Process exit code: `0` (success), `1` (error/timeout), `10` (blocked), `11` (cancelled). |
| `sessionId` | `string \| undefined` | Session identifier. Pass to `--resume <id>` to continue this session. |
| `duration` | `number` | Session wall-clock duration in milliseconds. |
| `cost` | `CostObject` | Token usage and cost breakdown. See below. |
| `toolCalls` | `number` | Total number of tool calls made during the session. |
| `events` | `number` | Total number of events processed during the session. |
| `milestone` | `string \| undefined` | Active milestone ID (e.g. `"M001"`). |
| `phase` | `string \| undefined` | Current SF phase at session end (e.g. `"executing"`, `"blocked"`, `"complete"`). |
| `nextAction` | `string \| undefined` | Recommended next action from the state machine (e.g. `"dispatch"`, `"complete"`). |
| `artifacts` | `string[] \| undefined` | Paths to artifacts created or modified during the session. |
| `commits` | `string[] \| undefined` | Git commit SHAs created during the session. |
### Status → Exit Code Mapping
| Status | Exit Code | Constant | Meaning |
|--------|-----------|----------|---------|
| `success` | `0` | `EXIT_SUCCESS` | Unit or milestone completed successfully |
| `error` | `1` | `EXIT_ERROR` | Runtime error or LLM failure |
| `timeout` | `1` | `EXIT_ERROR` | `--timeout` deadline exceeded |
| `blocked` | `10` | `EXIT_BLOCKED` | Execution blocked — needs human intervention |
| `cancelled` | `11` | `EXIT_CANCELLED` | Cancelled by user or orchestrator |
### Cost Object
| Field | Type | Description |
|-------|------|-------------|
| `cost.total` | `number` | Total cost in USD for the session. |
| `cost.input_tokens` | `number` | Number of input tokens consumed. |
| `cost.output_tokens` | `number` | Number of output tokens generated. |
| `cost.cache_read_tokens` | `number` | Number of tokens served from prompt cache. |
| `cost.cache_write_tokens` | `number` | Number of tokens written to prompt cache. |
## Parsing Patterns
### Decision-Making After Each Step
```bash
RESULT=$(sf headless --output-format json next 2>/dev/null)
EXIT=$?
case $EXIT in
0)
PHASE=$(echo "$RESULT" | jq -r '.phase')
NEXT=$(echo "$RESULT" | jq -r '.nextAction')
echo "Success — phase: $PHASE, next: $NEXT"
;;
1)
STATUS=$(echo "$RESULT" | jq -r '.status')
echo "Failed — status: $STATUS"
;;
10)
echo "Blocked — needs intervention"
sf headless query | jq '.state'
;;
11)
echo "Cancelled"
;;
esac
```
### Cost Tracking
```bash
RESULT=$(sf headless --output-format json next 2>/dev/null)
COST=$(echo "$RESULT" | jq -r '.cost.total')
INPUT=$(echo "$RESULT" | jq -r '.cost.input_tokens')
OUTPUT=$(echo "$RESULT" | jq -r '.cost.output_tokens')
echo "Cost: \$$COST (${INPUT} in / ${OUTPUT} out)"
```
### Session Resumption
```bash
# First run — capture session ID
RESULT=$(sf headless --output-format json next 2>/dev/null)
SESSION_ID=$(echo "$RESULT" | jq -r '.sessionId')
# Resume the same session later
sf headless --resume "$SESSION_ID" --output-format json next 2>/dev/null
```
### Artifact Collection
```bash
RESULT=$(sf headless --output-format json auto 2>/dev/null)
# List files created/modified
echo "$RESULT" | jq -r '.artifacts[]?'
# List commits made
echo "$RESULT" | jq -r '.commits[]?'
```
## Example Result
```json
{
"status": "success",
"exitCode": 0,
"sessionId": "abc123def456",
"duration": 45200,
"cost": {
"total": 0.42,
"input_tokens": 15000,
"output_tokens": 3500,
"cache_read_tokens": 8000,
"cache_write_tokens": 2000
},
"toolCalls": 12,
"events": 87,
"milestone": "M001",
"phase": "executing",
"nextAction": "dispatch",
"artifacts": [
".sf/milestones/M001/slices/S01/tasks/T01-SUMMARY.md"
],
"commits": [
"a1b2c3d"
]
}
```
## Combined with `query` for Full Picture
The `HeadlessJsonResult` captures what happened during a session. Use `query` for the current project state:
```bash
# What happened in this step?
RESULT=$(sf headless --output-format json next 2>/dev/null)
echo "$RESULT" | jq '{status, cost: .cost.total, phase}'
# What's the overall project state now?
sf headless query | jq '{phase: .state.phase, progress: .state.progress, totalCost: .cost.total}'
```

View file

@ -1,20 +0,0 @@
# [Product Name]
## What
[One paragraph: what this product does. Be concrete — "A CLI tool that converts CSV files to JSON" not "A data transformation solution".]
## Requirements
- [User can DO something specific and observable]
- [User can DO another specific thing]
- [System DOES something automatically]
- [Error case: system handles X gracefully]
## Technical Constraints
- Language: [Node.js / Python / Go / Rust / etc.]
- Framework: [Express / FastAPI / none / etc.]
- External dependencies: [list APIs, databases, services]
- Environment: [Node >= 22 / Python 3.12+ / etc.]
## Out of Scope
- [Explicit exclusion 1 — prevents scope creep]
- [Explicit exclusion 2]

View file

@ -1,184 +0,0 @@
# Build From Spec
End-to-end workflow: take a product idea or specification, produce working software.
## Prerequisites
- `sf` CLI installed (`npm install -g sf-run`)
- A directory for the project (can be empty)
- Git initialized in the directory
## Process
### Step 1: Prepare the project directory
```bash
PROJECT_DIR="/tmp/my-project-name"
mkdir -p "$PROJECT_DIR"
cd "$PROJECT_DIR"
git init 2>/dev/null # SF needs a git repo
```
### Step 2: Write the spec file
Write a spec file that describes what to build. More detail = better results.
```bash
cat > spec.md << 'SPEC'
# Product Name
## What
[Concrete description of what to build]
## Requirements
- [Specific, testable requirement 1]
- [Specific, testable requirement 2]
- [Specific, testable requirement 3]
## Technical Constraints
- [Language, framework, or platform requirements]
- [External services or APIs involved]
- [Performance or security requirements]
## Out of Scope
- [Things explicitly NOT included]
SPEC
```
**Spec quality matters.** Vague specs produce vague results. Include:
- What the user can DO when it's done (not what code to write)
- Technical constraints (language, framework, Node version)
- What's out of scope (prevents scope creep)
### Step 3: Launch the build
**Fire-and-forget (simplest — SF does everything):**
```bash
cd "$PROJECT_DIR"
RESULT=$(sf headless --output-format json --timeout 0 --context spec.md new-milestone --auto 2>/dev/null)
EXIT=$?
```
`--timeout 0` disables the timeout for long builds. `--auto` chains milestone creation into execution.
**With budget limit:**
```bash
# Use step-by-step mode with budget checks instead of auto
# See workflows/step-by-step.md
```
**For CI or ecosystem runs (no user config):**
```bash
RESULT=$(sf headless --bare --output-format json --timeout 0 --context spec.md new-milestone --auto 2>/dev/null)
EXIT=$?
```
### Step 4: Handle the result
```bash
case $EXIT in
0)
# Success — verify deliverables
STATUS=$(echo "$RESULT" | jq -r '.status')
COST=$(echo "$RESULT" | jq -r '.cost.total')
COMMITS=$(echo "$RESULT" | jq -r '.commits | length')
echo "Build complete: $STATUS, cost: \$$COST, commits: $COMMITS"
# Inspect what was built
sf headless query | jq '.state.progress'
# Check the actual files
ls -la "$PROJECT_DIR"
;;
1)
# Error — inspect and decide
echo "Build failed"
echo "$RESULT" | jq '{status: .status, phase: .phase}'
# Check state for details
sf headless query | jq '.state'
;;
10)
# Blocked — needs intervention
echo "Build blocked — needs human input"
sf headless query | jq '{phase: .state.phase, blockers: .state.blockers}'
# Options: steer, supply answers, or escalate
# See workflows/monitor-and-poll.md for blocker handling
;;
11)
echo "Build was cancelled"
;;
esac
```
### Step 5: Verify deliverables
After a successful build, verify the output:
```bash
cd "$PROJECT_DIR"
# Check project state
sf headless query | jq '{
phase: .state.phase,
progress: .state.progress,
cost: .cost.total
}'
# Check git log for what was built
git log --oneline
# Run the project's own tests if they exist
[ -f package.json ] && npm test 2>/dev/null
[ -f Makefile ] && make test 2>/dev/null
```
## Complete Example
```bash
# 1. Setup
mkdir -p /tmp/todo-api && cd /tmp/todo-api && git init
# 2. Write spec
cat > spec.md << 'SPEC'
# Todo API
Build a REST API for managing todo items using Node.js and Express.
## Requirements
- GET /todos — list all todos
- POST /todos — create a todo (title, completed)
- PUT /todos/:id — update a todo
- DELETE /todos/:id — delete a todo
- Todos stored in-memory (no database)
- Input validation with descriptive error messages
- Health check endpoint at GET /health
## Technical Constraints
- Node.js with ESM modules
- Express framework
- No external database — in-memory array
- Port configurable via PORT env var (default 3000)
## Out of Scope
- Authentication
- Persistent storage
- Frontend
SPEC
# 3. Launch
RESULT=$(sf headless --output-format json --timeout 0 --context spec.md new-milestone --auto 2>/dev/null)
EXIT=$?
# 4. Report
if [ $EXIT -eq 0 ]; then
COST=$(echo "$RESULT" | jq -r '.cost.total')
echo "Build complete (\$$COST)"
echo "Files created:"
find . -not -path './.sf/*' -not -path './.git/*' -type f
else
echo "Build failed (exit $EXIT)"
echo "$RESULT" | jq .
fi
```

View file

@ -1,187 +0,0 @@
# Monitor and Poll
Check status of a SF project, handle blockers, track costs, and decide next actions.
## Checking Project State
The `query` command is your primary monitoring tool. It's instant (~50ms), costs nothing (no LLM), and returns the full project snapshot.
```bash
cd /path/to/project
sf headless query
```
### Key fields to inspect
```bash
# Overall status
sf headless query | jq '{
phase: .state.phase,
milestone: .state.activeMilestone.id,
slice: .state.activeSlice.id,
task: .state.activeTask.id,
progress: .state.progress,
cost: .cost.total
}'
# What should happen next
sf headless query | jq '.next'
# Returns: { "action": "dispatch", "unitType": "execute-task", "unitId": "M001/S01/T01" }
# Is it done?
sf headless query | jq '.state.phase'
# "complete" = done, "blocked" = needs you, anything else = in progress
```
### Phase meanings
| Phase | Meaning | Your action |
|-------|---------|-------------|
| `pre-planning` | Milestone exists, no slices planned yet | Run `auto` or `next` |
| `needs-discussion` | Ambiguities need resolution | Supply answers or run with defaults |
| `discussing` | Discussion in progress | Wait |
| `researching` | Codebase/library research | Wait |
| `planning` | Creating task plans | Wait |
| `executing` | Writing code | Wait |
| `verifying` | Checking must-haves | Wait |
| `summarizing` | Recording what happened | Wait |
| `advancing` | Moving to next task/slice | Wait |
| `evaluating-gates` | Quality checks before execution | Wait or run `next` |
| `validating-milestone` | Final milestone checks | Wait |
| `completing-milestone` | Archiving and cleanup | Wait |
| `complete` | Done | Verify deliverables |
| `blocked` | Needs human input | Handle blocker (see below) |
| `paused` | Explicitly paused | Resume with `auto` |
## Handling Blockers
When exit code is `10` or phase is `blocked`:
```bash
# 1. Understand the blocker
sf headless query | jq '{phase: .state.phase, blockers: .state.blockers, nextAction: .state.nextAction}'
# 2. Option A: Steer around it
sf headless steer "Skip the database dependency, use in-memory storage instead"
# 3. Option B: Supply pre-built answers
cat > fix.json << 'EOF'
{
"questions": { "blocked_question_id": "workaround_option" },
"defaults": { "strategy": "first_option" }
}
EOF
sf headless --answers fix.json auto
# 4. Option C: Force a specific phase
sf headless dispatch replan
# 5. Option D: Escalate to user
echo "SF build blocked. Phase: $(sf headless query | jq -r '.state.phase')"
echo "Manual intervention required."
```
## Cost Tracking
```bash
# Current cumulative cost
sf headless query | jq '.cost.total'
# Per-worker breakdown
sf headless query | jq '.cost.workers'
# After a step (from HeadlessJsonResult)
RESULT=$(sf headless --output-format json next 2>/dev/null)
echo "$RESULT" | jq '.cost'
```
### Budget enforcement pattern
```bash
MAX_BUDGET=15.00
check_budget() {
TOTAL=$(sf headless query | jq -r '.cost.total')
OVER=$(echo "$TOTAL > $MAX_BUDGET" | bc -l)
if [ "$OVER" = "1" ]; then
echo "Budget exceeded: \$$TOTAL > \$$MAX_BUDGET"
sf headless stop
return 1
fi
return 0
}
```
## Poll-and-React Loop
For agents that need to periodically check on a build:
```bash
cd /path/to/project
poll_project() {
STATE=$(sf headless query 2>/dev/null)
if [ -z "$STATE" ]; then
echo "NO_PROJECT"
return
fi
PHASE=$(echo "$STATE" | jq -r '.state.phase')
COST=$(echo "$STATE" | jq -r '.cost.total')
PROGRESS=$(echo "$STATE" | jq -r '"\(.state.progress.milestones.done)/\(.state.progress.milestones.total) milestones, \(.state.progress.tasks.done)/\(.state.progress.tasks.total) tasks"')
case "$PHASE" in
complete)
echo "COMPLETE cost=\$$COST progress=$PROGRESS"
;;
blocked)
BLOCKER=$(echo "$STATE" | jq -r '.state.nextAction // "unknown"')
echo "BLOCKED reason=$BLOCKER cost=\$$COST"
;;
*)
NEXT=$(echo "$STATE" | jq -r '.next.action // "none"')
echo "IN_PROGRESS phase=$PHASE next=$NEXT cost=\$$COST progress=$PROGRESS"
;;
esac
}
```
## Resuming Work
If a build was interrupted or you need to continue:
```bash
cd /path/to/project
# Check current state
sf headless query | jq '.state.phase'
# Resume from where it left off
sf headless --output-format json auto 2>/dev/null
# Or resume a specific session
sf headless --resume "$SESSION_ID" --output-format json auto 2>/dev/null
```
## Reading Build Artifacts
After completion, inspect what SF produced:
```bash
cd /path/to/project
# Project summary
cat .sf/PROJECT.md
# What was decided
cat .sf/DECISIONS.md
# Requirements and their validation status
cat .sf/REQUIREMENTS.md
# Milestone summary
cat .sf/milestones/M001-*/M001-*-SUMMARY.md 2>/dev/null
# Git history (SF commits per-slice)
git log --oneline
```

View file

@ -1,156 +0,0 @@
# Step-by-Step Execution
Run SF one unit at a time with decision points between steps. Use this when you need
control over execution — budget enforcement, progress reporting, conditional logic,
or the ability to steer mid-build.
## When to use this vs `auto`
| Approach | Use when |
|----------|----------|
| `auto` | You trust the build, just want the result |
| `next` loop | You need budget checks, progress updates, or intervention points |
## Core Loop
```bash
cd /path/to/project
MAX_BUDGET=20.00
TOTAL_COST=0
while true; do
# Run one unit
RESULT=$(sf headless --output-format json next 2>/dev/null)
EXIT=$?
# Parse result
STATUS=$(echo "$RESULT" | jq -r '.status')
STEP_COST=$(echo "$RESULT" | jq -r '.cost.total')
PHASE=$(echo "$RESULT" | jq -r '.phase // empty')
SESSION_ID=$(echo "$RESULT" | jq -r '.sessionId // empty')
# Handle exit codes
case $EXIT in
0) ;; # success — continue
1)
echo "Step failed: $STATUS"
break
;;
10)
echo "Blocked — needs intervention"
sf headless query | jq '.state'
break
;;
11)
echo "Cancelled"
break
;;
esac
# Check if milestone complete
CURRENT_PHASE=$(sf headless query | jq -r '.state.phase')
if [ "$CURRENT_PHASE" = "complete" ]; then
TOTAL_COST=$(sf headless query | jq -r '.cost.total')
echo "Milestone complete. Total cost: \$$TOTAL_COST"
break
fi
# Budget check
TOTAL_COST=$(sf headless query | jq -r '.cost.total')
OVER=$(echo "$TOTAL_COST > $MAX_BUDGET" | bc -l)
if [ "$OVER" = "1" ]; then
echo "Budget limit (\$$MAX_BUDGET) exceeded at \$$TOTAL_COST"
sf headless stop
break
fi
# Progress report
PROGRESS=$(sf headless query | jq -r '"\(.state.progress.tasks.done)/\(.state.progress.tasks.total) tasks"')
echo "Step done ($STATUS). Phase: $CURRENT_PHASE, Progress: $PROGRESS, Cost: \$$TOTAL_COST"
done
```
## Step-by-Step with Spec Creation
Complete flow from idea to working code with full control:
```bash
# 1. Setup
PROJECT_DIR="/tmp/my-project"
mkdir -p "$PROJECT_DIR" && cd "$PROJECT_DIR" && git init 2>/dev/null
# 2. Write spec
cat > spec.md << 'SPEC'
[Your spec here]
SPEC
# 3. Create the milestone (planning only, no execution)
RESULT=$(sf headless --output-format json --context spec.md new-milestone 2>/dev/null)
EXIT=$?
if [ $EXIT -ne 0 ]; then
echo "Milestone creation failed"
echo "$RESULT" | jq .
exit 1
fi
echo "Milestone created. Starting execution..."
# 4. Execute step-by-step
STEP=0
while true; do
STEP=$((STEP + 1))
RESULT=$(sf headless --output-format json next 2>/dev/null)
EXIT=$?
[ $EXIT -ne 0 ] && break
PHASE=$(sf headless query | jq -r '.state.phase')
COST=$(sf headless query | jq -r '.cost.total')
echo "Step $STEP complete. Phase: $PHASE, Cost: \$$COST"
[ "$PHASE" = "complete" ] && break
done
echo "Build finished in $STEP steps"
```
## Intervention Patterns
### Steer mid-execution
If you detect the build going in the wrong direction:
```bash
# Check what's happening
sf headless query | jq '{phase: .state.phase, task: .state.activeTask}'
# Redirect
sf headless steer "Use SQLite instead of PostgreSQL for storage"
# Continue
sf headless --output-format json next 2>/dev/null
```
### Skip a stuck unit
```bash
sf headless skip
sf headless --output-format json next 2>/dev/null
```
### Undo last completed unit
```bash
sf headless undo --force
sf headless --output-format json next 2>/dev/null
```
### Force a specific phase
```bash
sf headless dispatch replan # Re-plan the current slice
sf headless dispatch execute # Skip to execution
sf headless dispatch uat # Jump to user acceptance testing
```

View file

@ -1,7 +1,7 @@
{
"name": "sf-run",
"name": "singularity-forge",
"version": "2.75.0",
"description": "sf-run — Singularity Forge runtime core",
"description": "Singularity Forge runtime core",
"license": "MIT",
"repository": {
"type": "git",

View file

@ -10,7 +10,7 @@ import type {
StreamFunction,
} from "../types.js";
import { AssistantMessageEventStream } from "../utils/event-stream.js";
import { adjustMaxTokensForThinking, buildBaseOptions } from "./simple-options.js";
import { adjustMaxTokensForThinking, buildBaseOptions, isAutoReasoning, resolveReasoningLevel } from "./simple-options.js";
import {
type AnthropicOptions,
mapThinkingLevelToEffort,
@ -105,8 +105,17 @@ export const streamSimpleAnthropicVertex: StreamFunction<"anthropic-vertex", Sim
return streamAnthropicVertex(model, context, { ...base, thinkingEnabled: false } satisfies AnthropicOptions);
}
if (isAutoReasoning(options.reasoning) && (supportsAdaptiveThinking(model.id) || model.capabilities?.thinkingNoBudget)) {
return streamAnthropicVertex(model, context, {
...base,
thinkingEnabled: true,
} satisfies AnthropicOptions);
}
const effectiveReasoning = resolveReasoningLevel(model, options.reasoning)!;
if (supportsAdaptiveThinking(model.id)) {
const effort = mapThinkingLevelToEffort(options.reasoning, model.id);
const effort = mapThinkingLevelToEffort(effectiveReasoning, model.id);
return streamAnthropicVertex(model, context, {
...base,
thinkingEnabled: true,
@ -117,7 +126,7 @@ export const streamSimpleAnthropicVertex: StreamFunction<"anthropic-vertex", Sim
const adjusted = adjustMaxTokensForThinking(
base.maxTokens || 0,
model.maxTokens,
options.reasoning,
effectiveReasoning,
options.thinkingBudgets,
);

Binary file not shown.

View file

@ -798,7 +798,7 @@ if (!process.stdin.isTTY || !process.stdout.isTTY) {
// Welcome screen — shown on every fresh interactive session before TUI takes over.
// Skip when the first-run banner was already printed in loader.ts (prevents double banner).
if (!process.env.SF_FIRST_RUN_BANNER && !process.env.SF_FIRST_RUN_BANNER) {
if (!process.env.SF_FIRST_RUN_BANNER) {
const { printWelcomeScreen } = await import('./welcome-screen.js')
let remoteChannel: string | undefined
try {

View file

@ -13,6 +13,7 @@ import type {
AssistantMessageEventStream,
Context,
Model,
RequestedThinkingLevel,
SimpleStreamOptions,
ThinkingLevel,
ToolCall,
@ -730,9 +731,12 @@ export function buildSdkOptions(
modelId: string,
prompt: string,
overrides?: { permissionMode?: "bypassPermissions" | "acceptEdits" | "default" | "plan" },
extraOptions: Record<string, unknown> & { reasoning?: ThinkingLevel } = {},
extraOptions: Record<string, unknown> & { reasoning?: RequestedThinkingLevel } = {},
): Record<string, unknown> {
const { reasoning, ...sdkExtraOptions } = extraOptions;
const { reasoning: requestedReasoning, ...sdkExtraOptions } = extraOptions;
// "auto" → let Claude's adaptive thinking pick effort itself (no explicit level)
const reasoning: ThinkingLevel | undefined = requestedReasoning === "auto" ? undefined : requestedReasoning;
const autoReasoning = requestedReasoning === "auto";
const mcpServers = buildWorkflowMcpServers();
const permissionMode = overrides?.permissionMode ?? "bypassPermissions";
const disallowedTools = ["AskUserQuestion"];
@ -760,8 +764,9 @@ export function buildSdkOptions(
// Bug B: SDK requires thinking:{type:"adaptive"} alongside effort for adaptive thinking to activate.
// Bug C: SDK requires thinking:{type:"disabled"} to actually stop adaptive thinking when reasoning is off;
// omitting the field leaves the SDK in its adaptive default (or persisted session state).
// "auto": request adaptive thinking with no explicit effort (SDK picks).
const thinkingConfig = supportsAdaptive
? effort
? effort || autoReasoning
? { thinking: { type: "adaptive" } }
: { thinking: { type: "disabled" } }
: undefined;

View file

@ -68,7 +68,7 @@ async function recordUnitOutcome(unit: UnitMetrics): Promise<void> {
unitType: unit.type,
unitId: unit.id,
succeeded: true, // metrics.json entry implies completion
retries: 0, // TODO: extract from session entries if possible
// retries omitted — UnitMetrics does not track retry count; defaults to 0 in outcome-recorder
escalated: !!unit.modelDowngraded,
verification_passed: null,
blocker_discovered: false,