# SF Agent Mode System

> **Status:** Draft specification. Promoted from `copilot-thoughts.md` research notes.
> **Scope:** TUI mode surface, command structure, orthogonal state axes, skills, background work, runtime target.
> **Decision authority:** Product + architecture review required before implementation.

---

## 1. Problem Statement

SF's current command surface (`/sf autonomous`, `/sf next`, `/sf pause`, `/sf stop`) treats mode switching as separate commands rather than persistent states. There is no visible indicator of the current mode, and the `/sf` prefix positions SF as a plugin rather than the system itself.

Competitors (Copilot CLI, Factory Droid, Amp) have cleaner mode surfaces with visible state and orthogonal controls. SF has deeper autonomous machinery but weaker presentation.

**Goal:** Make SF's mode system as obvious as Vim's insert/normal mode indicator, with the control depth of Factory Droid's autonomy levels and the skill system of Amp.

---

## 2. Orthogonal State Axes

SF state is five independent axes, not one overloaded "mode."

```text
workMode:          chat | plan | build | review | repair | research
runControl:        manual | assisted | autonomous
permissionProfile: restricted | normal | trusted | unrestricted
modelMode:         fast | smart | deep
surface:           tui | web | headless | rpc
```

### 2.1 Axis Definitions

| Axis | Question It Answers | Values |
|------|---------------------|--------|
| `workMode` | What kind of work is SF doing? | `chat`, `plan`, `build`, `review`, `repair`, `research` |
| `runControl` | Who advances the loop? | `manual` (user), `assisted` (one unit then pause), `autonomous` (continuous) |
| `permissionProfile` | What may proceed without approval? | `restricted`, `normal`, `trusted`, `unrestricted` |
| `modelMode` | Speed/cost/reasoning posture? | `fast` (cheap), `smart` (balanced), `deep` (reasoning) |
| `surface` | How is the user connected? | `tui`, `web`, `headless`, `rpc` |

### 2.2 Example Combinations

Each line reads `workMode | runControl | permissionProfile | modelMode`; `surface` is omitted.

```text
plan     | manual     | normal     | deep  → user plans with reasoning model
build    | autonomous | trusted    | smart → continuous implementation
repair   | assisted   | normal     | smart → one repair unit at a time
research | autonomous | restricted | deep  → continuous research, read-only
review   | manual     | restricted | deep  → user reviews with reasoning model
```

### 2.3 Rules

- `permissionProfile` never implies `runControl`. An autonomous run with `restricted` permissions is valid.
- `runControl` never implies `permissionProfile`. A manual run with `unrestricted` permissions is valid.
- Denylists and safety gates override `permissionProfile` regardless of value.
- Every risk decision logs all five axis values.
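One way to keep the axes independent in code is to model each as its own union type and to carry all five on every logged risk decision. The sketch below is illustrative only: the type names, `RiskDecisionLog`, and `logRiskDecision` are hypothetical and not taken from SF's codebase; only the axis values come from this section. Because no axis is derived from another, combinations such as autonomous plus `restricted` are representable without special cases.

```typescript
// Hypothetical type names; the values mirror the five axes defined above.
type WorkMode = "chat" | "plan" | "build" | "review" | "repair" | "research";
type RunControl = "manual" | "assisted" | "autonomous";
type PermissionProfile = "restricted" | "normal" | "trusted" | "unrestricted";
type ModelMode = "fast" | "smart" | "deep";
type Surface = "tui" | "web" | "headless" | "rpc";

interface AgentState {
  workMode: WorkMode;
  runControl: RunControl;
  permissionProfile: PermissionProfile;
  modelMode: ModelMode;
  surface: Surface;
}

// A logged risk decision: all five axis values plus the outcome.
interface RiskDecisionLog extends AgentState {
  action: string;
  allowed: boolean;
}

function logRiskDecision(state: AgentState, action: string, allowed: boolean): RiskDecisionLog {
  return { ...state, action, allowed };
}

// An autonomous run with restricted permissions is a valid state.
const researchRun: AgentState = {
  workMode: "research",
  runControl: "autonomous",
  permissionProfile: "restricted",
  modelMode: "deep",
  surface: "headless",
};

// Changing one axis leaves the other four untouched.
const pausedForReview: AgentState = { ...researchRun, runControl: "manual" };

const entry: RiskDecisionLog = logRiskDecision(researchRun, "write-file", false);
```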

---

## 3. Work Modes

### 3.1 `chat`

Default conversational mode: questions, explanations, low-commitment exploration. No durable artifacts are created without an explicit user request.

### 3.2 `plan`

Research, clarify, write or update specs, derive tasks, and produce an explicit acceptance point before implementation. The primary user journey starts here.

**Plan → Build handoff:**

```text
plan | manual | normal | deep
accept plan
build | autonomous | selected-permission-profile | smart
```

Surfaces:

- TUI: plan acceptance prompt includes "run autonomously" button
- Web: plan acceptance button includes "run autonomously"
- Headless: `--autonomous` chains into direct `/autonomous`
- RPC: machine event records transition explicitly

### 3.3 `build`

Implement, test, lint, typecheck, verify, prepare commit-ready changes. The autonomous default.

### 3.4 `review`

Inspect diffs, tests, risks, regressions, security issues, missing evidence. Requires reasoning model (`deep`).

### 3.5 `repair`

Fix SF health, repo health, runtime drift, broken generated state, bad command surfaces, failing workflow infrastructure, stale locks, broken installed runtime copies.

**Doctor is the diagnostic engine, not the mode.** `/doctor` inspects. `/repair` switches work mode.

Commands:

```text
/doctor              → inspect and report
/doctor fix          → deterministic auto-fix
/doctor heal         → LLM-assisted deep healing
/repair              → switch workMode to repair
/repair --autonomous → repair until clean, blocked, or limit-hit
```

**Auto-transition to repair:**

```text
build | autonomous | trusted | smart
  → repair | autonomous | normal | smart
```

Allowed when:

- pre-dispatch health gate fails
- installed runtime drift detected
- SF cannot dispatch safely
- repo workflow state corrupted

Policy: configurable per project. Options: `auto`, `ask`, `log-only`.
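The policy options could be applied roughly as follows. This is a sketch under stated assumptions, not SF's implementation: `decideAutoRepair`, the trigger names, and the decision shape are hypothetical, and the reading of `log-only` as "record the trigger without switching" is an interpretation of the option name. The drop to `normal` permissions on switch follows the transition example above.

```typescript
type AutoRepairPolicy = "auto" | "ask" | "log-only";

// Hypothetical identifiers for the four "Allowed when" conditions.
type RepairTrigger =
  | "health-gate-failed"
  | "runtime-drift"
  | "unsafe-dispatch"
  | "workflow-state-corrupted";

interface RunState {
  workMode: string;
  runControl: string;
  permissionProfile: string;
  modelMode: string;
}

type RepairDecision =
  | { action: "switch"; reason: RepairTrigger; next: RunState }
  | { action: "ask"; reason: RepairTrigger }
  | { action: "log"; reason: RepairTrigger };

function decideAutoRepair(
  policy: AutoRepairPolicy,
  current: RunState,
  trigger: RepairTrigger,
): RepairDecision {
  switch (policy) {
    case "auto":
      // Mirrors the example: workMode becomes repair, permissions drop to
      // normal, runControl and modelMode are carried over unchanged.
      return {
        action: "switch",
        reason: trigger,
        next: { ...current, workMode: "repair", permissionProfile: "normal" },
      };
    case "ask":
      // The caller is expected to prompt the user before switching.
      return { action: "ask", reason: trigger };
    case "log-only":
      // Assumption: record the trigger and leave the current state as is.
      return { action: "log", reason: trigger };
  }
}
```

Returning a decision object instead of mutating state keeps the policy check pure, so the same function can serve the TUI prompt, headless runs, and the transition log.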

### 3.6 `research`

Longer-form codebase, competitor, design, API, or dependency research. Uses web search, local code exploration, cross-repo research, helper agents.

---

## 4. Run Control

| Value | Behavior |
|-------|----------|
| `manual` | User drives every step. Tool calls require approval. |
| `assisted` | SF executes one unit, then pauses for user review. |
| `autonomous` | SF continues until done, blocked, interrupted, budget-hit, or limit-hit. |

### 4.1 Commands

```text
/control manual
/control assisted
/control autonomous
/autonomous → alias for /control autonomous
/next       → alias for /control assisted (one unit)
/pause      → pause autonomous, preserve state
/stop       → stop autonomous, clear state
```

### 4.2 Transition Scopes

| Scope | Behavior |
|-------|----------|
| `now` | Apply immediately if no tool is active; abort the current tool only if policy allows. |
| `after-current-tool` | Finish the active tool, then switch. |
| `after-current-unit` | Finish the current SF unit, then switch. |
| `next-milestone` | Switch after the current milestone completes. |

Changes made during an autonomous run affect future decisions only; they never mutate an active tool call mid-execution.

### 4.3 Transition Logging

Every transition persists:

```json
{
  "timestamp": "2026-05-08T10:00:00Z",
  "from": {"workMode": "build", "runControl": "autonomous"},
  "to": {"workMode": "repair", "runControl": "autonomous"},
  "reason": "pre-dispatch health gate failed",
  "scope": "after-current-unit",
  "sessionId": "..."
}
```
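The scopes from 4.2 and the record above can be tied together as in the following sketch. All names here (`Activity`, `isDue`, `buildRecord`) are hypothetical, and the rule that a wider scope also waits for any narrower activity to finish is an assumption. Note that `Date.prototype.toISOString` emits milliseconds, so the timestamp differs slightly in form from the example record.

```typescript
type TransitionScope = "now" | "after-current-tool" | "after-current-unit" | "next-milestone";

// Hypothetical snapshot of what the runtime is currently doing.
interface Activity {
  toolActive: boolean;
  unitActive: boolean;
  milestoneActive: boolean;
}

// Decides whether a pending transition may be applied yet.
function isDue(scope: TransitionScope, activity: Activity, mayAbortTool: boolean): boolean {
  switch (scope) {
    case "now":
      // Immediate when idle; otherwise only if policy lets us abort the tool.
      return !activity.toolActive || mayAbortTool;
    case "after-current-tool":
      return !activity.toolActive;
    case "after-current-unit":
      return !activity.toolActive && !activity.unitActive;
    case "next-milestone":
      return !activity.toolActive && !activity.unitActive && !activity.milestoneActive;
  }
}

interface AxisSnapshot {
  workMode: string;
  runControl: string;
}

// Field names follow the JSON record above.
interface TransitionRecord {
  timestamp: string;
  from: AxisSnapshot;
  to: AxisSnapshot;
  reason: string;
  scope: TransitionScope;
  sessionId: string;
}

function buildRecord(
  from: AxisSnapshot,
  to: AxisSnapshot,
  reason: string,
  scope: TransitionScope,
  sessionId: string,
  at: Date,
): TransitionRecord {
  return { timestamp: at.toISOString(), from, to, reason, scope, sessionId };
}
```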

---

## 5. Permission Profiles

| Profile | Description |
|---------|-------------|
| `restricted` | Read-only and explicitly allowlisted actions. |
| `normal` | Safe edits, non-destructive local commands. |
| `trusted` | Build/test/install/local commits and bounded repo automation. |
| `unrestricted` | High-risk orchestration only in intentionally trusted environments. |

### 5.1 Enforcement

Permission profile is enforced at three layers:

1. **Tool registry:** Each tool declares its required profile. Tools that require a higher profile than the active one are hidden from the model.
2. **Execution gate:** Each tool call re-checks the profile at invocation. A violation is an error.
3. **Safety harness:** Destructive operations (delete, push to production, etc.) require explicit confirmation regardless of profile.
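The three layers can be sketched as two small functions. This is a minimal illustration, assuming the profiles form a strict order `restricted < normal < trusted < unrestricted`; `ToolSpec`, `visibleTools`, `checkCall`, and the sample tool names are hypothetical.

```typescript
// Assumed ordering, lowest to highest privilege.
const PROFILE_ORDER = ["restricted", "normal", "trusted", "unrestricted"] as const;
type PermissionProfile = (typeof PROFILE_ORDER)[number];

interface ToolSpec {
  name: string;
  requiredProfile: PermissionProfile;
  destructive: boolean;
}

function rank(profile: PermissionProfile): number {
  return PROFILE_ORDER.indexOf(profile);
}

// Layer 1 (tool registry): hide tools the active profile cannot use.
function visibleTools(tools: ToolSpec[], active: PermissionProfile): ToolSpec[] {
  return tools.filter((tool) => rank(tool.requiredProfile) <= rank(active));
}

type GateResult = "allow" | "confirm" | "deny";

// Layers 2 and 3 (execution gate, safety harness): re-check at call time,
// then require confirmation for destructive operations at any profile.
function checkCall(tool: ToolSpec, active: PermissionProfile): GateResult {
  if (rank(tool.requiredProfile) > rank(active)) return "deny";
  if (tool.destructive) return "confirm";
  return "allow";
}

// Hypothetical tools for illustration.
const readFile: ToolSpec = { name: "read_file", requiredProfile: "restricted", destructive: false };
const gitCommit: ToolSpec = { name: "git_commit", requiredProfile: "trusted", destructive: false };
const deleteBranch: ToolSpec = { name: "delete_branch", requiredProfile: "normal", destructive: true };
```

With these definitions, `checkCall(deleteBranch, "unrestricted")` still yields `"confirm"`, which is the "regardless of profile" property of the safety harness. The repeated check in `checkCall` is deliberate: it covers calls to tools that were visible when the model planned but are no longer permitted after a `/trust` change.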

### 5.2 Commands

```text
/trust restricted
/trust normal
/trust trusted
/trust unrestricted
```

---

## 6. Model Modes

| Mode | Use Case | Routing Hint |
|------|----------|--------------|
| `fast` | Small bounded tasks | Cheapest available model |
| `smart` | Default balanced work | Default routing table |
| `deep` | Planning, debugging, research, review | Reasoning model (o1, Claude Opus, etc.) |

`modelMode` guides routing. It does not replace explicit `/model` selection.

---

## 7. Mode Switching UX

### 7.1 Direct Commands

```text
/mode chat
/mode plan
/mode build
/mode review
/mode repair
/mode research
/control manual
/control assisted
/control autonomous
/trust restricted
/trust normal
/trust trusted
/trust unrestricted
/model-mode fast
/model-mode smart
/model-mode deep
```

### 7.2 Combined Forms

```text
/mode repair --autonomous --trust normal
/mode build --autonomous --trust trusted
/mode research --autonomous --trust restricted --model-mode deep
```
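A combined form sets several axes in one command. The parser below is a sketch of how the three examples above might be read; `parseModeCommand` and `ModeCommand` are hypothetical names, and only the flags that appear in the examples (`--autonomous`, `--trust`, `--model-mode`) are recognized. Axes that are not mentioned stay `undefined`, so the caller can leave them unchanged.

```typescript
const WORK_MODES = ["chat", "plan", "build", "review", "repair", "research"] as const;
const PROFILES = ["restricted", "normal", "trusted", "unrestricted"] as const;
const MODEL_MODES = ["fast", "smart", "deep"] as const;

type WorkMode = (typeof WORK_MODES)[number];
type PermissionProfile = (typeof PROFILES)[number];
type ModelMode = (typeof MODEL_MODES)[number];

interface ModeCommand {
  workMode: WorkMode;
  runControl?: "autonomous";
  permissionProfile?: PermissionProfile;
  modelMode?: ModelMode;
}

// Type guard: is `value` one of the allowed literals?
function isOneOf<T extends string>(values: readonly T[], value: string | undefined): value is T {
  return value !== undefined && (values as readonly string[]).indexOf(value) !== -1;
}

function parseModeCommand(input: string): ModeCommand {
  const tokens = input.trim().split(/\s+/);
  if (tokens[0] !== "/mode") throw new Error("expected a /mode command");

  const workMode = tokens[1];
  if (!isOneOf(WORK_MODES, workMode)) throw new Error(`unknown workMode: ${String(workMode)}`);

  const command: ModeCommand = { workMode };
  for (let i = 2; i < tokens.length; i++) {
    const flag = tokens[i];
    if (flag === "--autonomous") {
      command.runControl = "autonomous";
    } else if (flag === "--trust") {
      const value = tokens[++i]; // consume the flag's argument
      if (!isOneOf(PROFILES, value)) throw new Error(`unknown profile: ${String(value)}`);
      command.permissionProfile = value;
    } else if (flag === "--model-mode") {
      const value = tokens[++i];
      if (!isOneOf(MODEL_MODES, value)) throw new Error(`unknown modelMode: ${String(value)}`);
      command.modelMode = value;
    } else {
      throw new Error(`unknown flag: ${String(flag)}`);
    }
  }
  return command;
}
```

Rejecting unknown values instead of ignoring them matters here: a typo in `--trust` should never silently fall back to a more permissive profile.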

### 7.3 Autonomous Steering

```text
/steer mode repair
/steer mode review after-current-unit
/steer trust restricted now
/steer model-mode deep for-next-unit
```

### 7.4 Keyboard Shortcuts

| Shortcut | Action |
|----------|--------|
| `Ctrl+Shift+M` | Cycle workMode: chat → plan → build → review → repair → research → chat |
| `Ctrl+Shift+A` | Set runControl to autonomous |
| `Ctrl+Shift+S` | Set runControl to assisted (step) |
| `Ctrl+Shift+I` | Set runControl to manual (interactive) |
| `Ctrl+Shift+R` | Set workMode to repair |
| `Ctrl+Shift+P` | Cycle permissionProfile: restricted → normal → trusted → unrestricted → restricted |

---

## 8. Status and Mode Badge

### 8.1 Full Status Line

```text
SF build | autonomous | trusted | smart
```

### 8.2 Compact Badge Form

For narrow terminals (< 80 cols):

```text
[B][A][T][S]
```

### 8.3 Critical State Labels

When workMode is `repair` or `review`, show full labels regardless of width:

```text
repair | autonomous | normal | smart
review | assisted | normal | deep
```
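The width and critical-state rules in 8.1 to 8.3 can be combined into one rendering function. The sketch below assumes the compact form is the upper-cased first letter of each axis value, which matches `[B][A][T][S]` for `build | autonomous | trusted | smart`; `renderBadge` and `BadgeState` are hypothetical names, and the leading `SF` label is left to the caller.

```typescript
interface BadgeState {
  workMode: string;
  runControl: string;
  permissionProfile: string;
  modelMode: string;
}

// Work modes that always render with full labels (section 8.3).
const CRITICAL_WORK_MODES: readonly string[] = ["repair", "review"];

function renderBadge(state: BadgeState, columns: number): string {
  const parts = [state.workMode, state.runControl, state.permissionProfile, state.modelMode];
  const isCritical = CRITICAL_WORK_MODES.indexOf(state.workMode) !== -1;

  // Full labels on terminals of 80 columns or more, and always for critical modes.
  if (columns >= 80 || isCritical) {
    return parts.join(" | ");
  }
  // Compact form: one bracketed initial per axis.
  return parts.map((value) => `[${value.charAt(0).toUpperCase()}]`).join("");
}
```

Under this assumption `review`, `repair`, and `research` all share the initial `R`. Because `repair` and `review` never use the compact form, an `[R]` in the first slot can only mean `research`.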

### 8.4 Badge Placement

| Surface | Placement |
|---------|-----------|
| TUI header | Left side, after "SF" logo |
| TUI status bar | Bottom line when header hidden |
| tmux/terminal title | `SF[build\|A\|trusted\|smart] project-name` |
| Web | Top bar, color-coded chip |

### 8.5 Badge During Auto Mode

The current code hides the header and footer during auto mode (`if (isAutoActive()) return []`), which also hides the badge. This must change:

- Show a **minimal header** during auto: badge + project name only
- Or show the badge in a **dedicated status bar** separate from header/footer
- Badge color pulses slowly during autonomous execution (subtle animation)

### 8.6 Badge Colors

| Axis | Value | Color |
|------|-------|-------|
| workMode | `chat` | dim |
| workMode | `plan` | accent |
| workMode | `build` | success |
| workMode | `review` | warning |
| workMode | `repair` | error |
| workMode | `research` | info |
| runControl | `manual` | dim |
| runControl | `assisted` | warning |
| runControl | `autonomous` | success (pulsing) |
| permissionProfile | `restricted` | success |
| permissionProfile | `normal` | dim |
| permissionProfile | `trusted` | warning |
| permissionProfile | `unrestricted` | error |

---

## 9. Background Work Surface (`/tasks`)

Unified view of all background work. It consolidates the work-inspection details currently scattered across `/status`, `/queue`, and `/parallel status`; those commands keep their own roles (see 9.3).

### 9.1 What `/tasks` Shows

- autonomous units (current + queued)
- parallel workers
- scheduled autonomous dispatches
- background shell sessions
- stuck or resumable sessions
- remote questions waiting for answers
- current cost/budget state
- last checkpoint and next action

### 9.2 Data Model

SQLite tables:

```sql
-- Durable task state
CREATE TABLE tasks (
  id TEXT PRIMARY KEY,
  work_mode TEXT NOT NULL,
  run_control TEXT NOT NULL,
  permission_profile TEXT NOT NULL,
  model_mode TEXT NOT NULL,
  status TEXT NOT NULL,        -- pending | running | review | done | retrying | failed | cancelled
  dependency_blockers TEXT,    -- JSON array of task IDs
  retry_count INTEGER DEFAULT 0,
  max_retries INTEGER DEFAULT 3,
  checkpoint_ref TEXT,         -- git ref or patch file
  cost_budget REAL,
  cost_spent REAL DEFAULT 0,
  created_at TEXT,             -- Temporal.Instant ISO
  started_at TEXT,
  completed_at TEXT,
  next_action_at TEXT,         -- Temporal.ZonedDateTime for scheduled
  intent_claim TEXT            -- for parallel workers: "I will edit src/foo.ts lines 10-50"
);

-- Ephemeral running state
CREATE TABLE task_runtime (
  task_id TEXT PRIMARY KEY REFERENCES tasks(id), -- column-level foreign key; no separate FOREIGN KEY clause needed
  process_pid INTEGER,
  worktree_path TEXT,
  current_model TEXT,
  context_usage_percent REAL,
  last_heartbeat_at TEXT       -- Temporal.Instant
);

-- Transition log
CREATE TABLE task_transitions (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  task_id TEXT NOT NULL,
  from_status TEXT NOT NULL,
  to_status TEXT NOT NULL,
  reason TEXT,
  scope TEXT,
  timestamp TEXT NOT NULL      -- Temporal.Instant
);
```

### 9.3 Complementary Commands

`/tasks` does not replace:

- `/status` → project health dashboard
- `/queue` → milestone/slice dispatch order
- `/parallel status` → parallel orchestrator detail
- `/session-report` → cost/token summary
- `/logs` → activity logs
- `/forensics` → execution forensics

---

## 10. Skills System

### 10.1 Directory Structure

```text
.agents/skills/<skill-name>/
  SKILL.md    -- skill definition with YAML frontmatter
  scripts/    -- supporting scripts
  schemas/    -- JSON schemas for inputs/outputs
  checklists/ -- verification checklists
  mcp.json    -- MCP server config if applicable
```

### 10.2 Skill Frontmatter

```yaml
---
name: forge-command-surface
description: Use when changing SF slash commands, browser command parity, or headless command dispatch.
user-invocable: true
model-invocable: true
side-effects: code-edits
permission-profile: normal
---
```

Fields:

- `name`: unique identifier
- `description`: when to use this skill
- `user-invocable`: can user explicitly invoke?
- `model-invocable`: can model auto-invoke when relevant?
- `side-effects`: `none`, `code-edits`, `production-mutation`, etc.
- `permission-profile`: minimum profile required

### 10.3 Skill Categories

| Type | Example | `model-invocable` |
|------|---------|-------------------|
| Background knowledge | `forge-autonomous-runtime` | true |
| User tool | `production-deploy` | false |
| Shared capability | `forge-command-surface` | true |

Dangerous skills (`production-mutation`) are never model-invoked by default.
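Taken together, the frontmatter fields and the rule above suggest an invocation check along these lines. This is a sketch, not SF's implementation: `canInvoke` is a hypothetical function, the camelCase field names are an assumed mapping of the kebab-case YAML keys, and the profile ordering `restricted < normal < trusted < unrestricted` is an assumption.

```typescript
// Assumed ordering, lowest to highest privilege.
const PROFILE_ORDER = ["restricted", "normal", "trusted", "unrestricted"] as const;
type PermissionProfile = (typeof PROFILE_ORDER)[number];

// Assumed in-memory form of the SKILL.md frontmatter.
interface SkillFrontmatter {
  name: string;
  description: string;
  userInvocable: boolean;
  modelInvocable: boolean;
  sideEffects: string; // "none", "code-edits", "production-mutation", ...
  permissionProfile: PermissionProfile; // minimum profile required
}

function rank(profile: PermissionProfile): number {
  return PROFILE_ORDER.indexOf(profile);
}

function canInvoke(
  skill: SkillFrontmatter,
  invoker: "user" | "model",
  active: PermissionProfile,
): boolean {
  // The active profile must meet the skill's declared minimum.
  if (rank(skill.permissionProfile) > rank(active)) return false;
  if (invoker === "user") return skill.userInvocable;
  // Dangerous skills are never model-invoked by default, whatever the flag says.
  if (skill.sideEffects === "production-mutation") return false;
  return skill.modelInvocable;
}

// The example skill from 10.2.
const commandSurface: SkillFrontmatter = {
  name: "forge-command-surface",
  description: "Use when changing SF slash commands, browser command parity, or headless command dispatch.",
  userInvocable: true,
  modelInvocable: true,
  sideEffects: "code-edits",
  permissionProfile: "normal",
};
```

Checking `sideEffects` separately from `modelInvocable` means a mislabeled frontmatter file cannot, on its own, make a production-mutating skill available to the model.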

### 10.4 Auto-Creation Flow

1. Detect repeated repo-specific evidence (same files, commands, failure modes, rules)
2. Propose skill in manual/restricted contexts
3. Generate/update automatically only when policy allows
4. Record source evidence in `.sf` state
5. Keep narrow and testable
6. Commit with repo when accepted

### 10.5 Skill Eval Cases

Every auto-created skill needs eval cases:

```text
.agents/skills/<skill-name>/evals/
  case-1/
    task.md   -- user-like prompt
    grader.js -- deterministic checker
    hidden/   -- reference answers (not visible to agent)
    work/     -- agent workspace
```

Graders inspect: files, artifacts, `answer.json`, `trace.jsonl`, result state.
Failed trials preserve workspace for debugging.

---

## 11. Migration from `/sf` Commands

### 11.1 Command Mapping

| Old | New | Status |
|-----|-----|--------|
| `/sf` | `/next` | migrate |
| `/sf autonomous` | `/autonomous` | migrate |
| `/sf next` | `/next` | migrate |
| `/sf stop` | `/stop` | migrate |
| `/sf pause` | `/pause` | migrate |
| `/sf status` | `/status` | migrate |
| `/sf doctor` | `/doctor` | migrate |
| `/sf rate` | `/rate` | migrate |
| `/sf session-report` | `/session-report` | migrate |
| `/sf parallel` | `/parallel` | migrate |
| `/sf remote` | `/remote` | migrate |
| `/sf tasks` | `/tasks` | new |

### 11.2 Migration Timeline

| Phase | Action |
|-------|--------|
| Phase 1 (now) | Accept both `/sf X` and `/X`. Log deprecation warning for `/sf`. |
| Phase 2 (2 releases) | `/sf X` shows warning: "Use /X instead. /sf will be removed." |
| Phase 3 (4 releases) | `/sf X` errors: "Unknown command. Did you mean /X?" |
| Phase 4 (6 releases) | Remove `/sf` handler entirely. |
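The four phases could be handled by a single shim in the command dispatcher. The sketch below is hypothetical: `resolveCommand`, `Resolution`, and the `notice` shape are illustrative names, and the Phase 4 message assumes that a removed handler falls through to a generic unknown-command error. Bare `/sf` maps to `/next`, as in the mapping table.

```typescript
type Phase = 1 | 2 | 3 | 4;

type Resolution =
  | { kind: "run"; command: string; notice?: { channel: "log" | "user"; text: string } }
  | { kind: "error"; message: string };

function resolveCommand(input: string, phase: Phase): Resolution {
  const trimmed = input.trim();
  // Matches "/sf" alone or "/sf <rest>", but not unrelated commands such as "/sfoo".
  const match = /^\/sf(?:\s+(.+))?$/.exec(trimmed);
  if (match === null) return { kind: "run", command: trimmed };

  const rest = match[1];
  const replacement = rest === undefined ? "/next" : `/${rest}`;

  switch (phase) {
    case 1:
      // Accept the legacy form; the deprecation warning goes to the log only.
      return {
        kind: "run",
        command: replacement,
        notice: { channel: "log", text: `/sf is deprecated; use ${replacement}` },
      };
    case 2:
      // Still runs, but the user now sees the warning.
      return {
        kind: "run",
        command: replacement,
        notice: { channel: "user", text: `Use ${replacement} instead. /sf will be removed.` },
      };
    case 3:
      return { kind: "error", message: `Unknown command. Did you mean ${replacement}?` };
    case 4:
      // Assumption: with the handler removed there is no suggestion left to offer.
      return { kind: "error", message: "Unknown command." };
  }
}
```

Keeping the phase as an explicit parameter makes each release's behavior testable before the switch is flipped.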

### 11.3 Shell Surface

Machine surface remains prefixed:

```text
sf headless autonomous
sf headless --autonomous ...
```

---

## 12. Runtime Target: Node 26

### 12.1 Policy

```text
current compatibility floor: Node 26.1+
internal target runtime:     Node 26.1
canonical baseline:          Node 26.1
Node 25:                     skip except quick probes
```

### 12.2 Why Node 26

- `Temporal` enabled by default
- V8 14.6 baseline
- Undici 8 HTTP/fetch baseline
- Removes legacy APIs, hardens against old assumptions

### 12.3 Temporal Adoption

Store semantic type, not just formatted string:

| Concept | Temporal Type | Use Case |
|---------|---------------|----------|
| Exact instant | `Temporal.Instant` | Journal events, checkpoints, lock leases |
| Local time | `Temporal.ZonedDateTime` | Reminders, schedules, audits |
| Calendar date | `Temporal.PlainDate` | Daily reports, milestone reviews |
| Wall-clock time | `Temporal.PlainTime` | Recurring policies |
| Time amount | `Temporal.Duration` | Budgets, leases, cooldowns, retry delays |

### 12.4 Adoption Priority

1. `sf schedule`: highest user-visible impact
2. Lock/lease: highest operational correctness
3. Journals/traces: highest debugging impact
4. Session reports: nice to have
5. Background tasks: future work

### 12.5 Gate

```text
node@26 --version
npm run lint
npm run typecheck:extensions
npm run build
npm test
sf --version
sf --help
sf --print "ping"
```

---

## 13. Implementation Pull-Through

### 13.1 Already Directionally Right

- UOK lifecycle records carry `runControl`
- UOK lifecycle records carry `permissionProfile`
- Schedule command state uses `autonomous_dispatch`
- DB-backed state, recovery, verification, scheduling, captures, forensics
- Skills and project-specific skill paths exist
- Parallel orchestration and remote-question infrastructure

### 13.2 Still Needed

| Priority | Item | Effort |
|----------|------|--------|
| P0 | Remove `/sf` internal dispatch, docs, tests, help text | Medium |
| P0 | Make `workMode` durable state (SQLite + `.sf/`) | Medium |
| P0 | Add direct `/mode`, `/control`, `/trust`, `/model-mode` commands | Medium |
| P0 | Add visible mode badge to TUI header/status bar | Small |
| P1 | Make `--autonomous` chain into direct `/autonomous` | Small |
| P1 | Expose autonomous continuation limits in settings and status | Small |
| P1 | Add `/tasks` with durable + ephemeral state | Large |
| P1 | Make `repair` first-class workflow over `doctor` | Medium |
| P2 | Policy-aware project skill suggestion/generation | Large |
| P2 | Skill eval cases for generated skills | Large |
| P2 | Schema-backed task frontmatter (risk, mutation, verification) | Medium |
| P2 | Intent/claim records for parallel workers | Medium |
| P2 | Audit subagent provider/model/permission inheritance | Medium |
| P2 | Audit remote steering as full-session surface | Medium |

---

## 14. Open Questions

1. Should `plan` mode show badge `[P]` or `plan` text in full?
2. Should paused autonomous show previous badge dimmed, or `[P]` for paused?
3. Should mode be per-session or per-project? (Current: per-session)
4. Should badge appear in tmux/terminal window titles?
5. Should mode transitions have sound/notification?
6. Should `repair` auto-transition be `ask` by default for new projects?
7. Should skill eval cases run in CI or only on-demand?
8. Should `/tasks` be a TUI overlay or a separate scrollable panel?

---

## 15. References

- GitHub Docs, "Allowing GitHub Copilot CLI to work autonomously": <https://docs.github.com/en/copilot/concepts/agents/copilot-cli/autopilot>
- Factory Droid, "Autonomy Level": <https://docs.factory.ai/cli/user-guides/auto-run>
- Amp manual: <https://ampcode.com/manual>
- Smelt (mode cycling): <https://github.com/leonardcser/smelt>
- ORCH (task state machine): <https://github.com/oxgeneral/ORCH>
- AgentPlane (schema-first tasks): <https://github.com/basilisk-labs/agentplane>
- Relay (channels and tickets): <https://github.com/jcast90/relay>
- Sage (runtime-neutral orchestration): <https://github.com/youwangd/SageCLI>
- Wit (symbol-level locks): <https://github.com/amaar-mc/wit>