662 lines
21 KiB
Markdown
662 lines
21 KiB
Markdown
# SF Agent Mode System
|
|
|
|
> **Status:** Draft specification. Promoted from `copilot-thoughts.md` research notes.
|
|
> **Scope:** TUI mode surface, command structure, orthogonal state axes, skills, background work, runtime target.
|
|
> **Decision authority:** Product + architecture review required before implementation.
|
|
|
|
---
|
|
|
|
## 1. Problem Statement
|
|
|
|
SF's old command surface treated mode switching as separate commands rather than persistent states. There was no visible indicator of the current mode, and the `/sf` prefix positioned SF as a plugin rather than the system itself.
|
|
|
|
Competitors (Copilot CLI, Factory Droid, Amp) have cleaner mode surfaces with visible state and orthogonal controls. SF has deeper autonomous machinery but weaker presentation.
|
|
|
|
**Goal:** Make SF's mode system as obvious as Vim's insert/normal mode indicator, with the control depth of Factory Droid's autonomy levels and the skill system of Amp.
|
|
|
|
---
|
|
|
|
## 2. Orthogonal State Axes
|
|
|
|
SF state is five independent axes, not one overloaded "mode."
|
|
|
|
```text
|
|
workMode: chat | plan | build | review | repair | research
|
|
runControl: manual | assisted | autonomous
|
|
permissionProfile: restricted | normal | trusted | unrestricted
|
|
modelMode: fast | smart | deep
|
|
surface: tui | web | headless | rpc
|
|
```
|
|
|
|
### 2.1 Axis Definitions
|
|
|
|
| Axis | Question It Answers | Values |
|
|
|------|---------------------|--------|
|
|
| `workMode` | What kind of work is SF doing? | `chat`, `plan`, `build`, `review`, `repair`, `research` |
|
|
| `runControl` | Who advances the loop? | `manual` (user), `assisted` (one unit then pause), `autonomous` (continuous) |
|
|
| `permissionProfile` | What may proceed without approval? | `restricted`, `normal`, `trusted`, `unrestricted` |
|
|
| `modelMode` | Speed/cost/reasoning posture? | `fast` (cheap), `smart` (balanced), `deep` (reasoning) |
|
|
| `surface` | How is the user connected? | `tui`, `web`, `headless`, `rpc` |
|
|
|
|
### 2.2 Example Combinations
|
|
|
|
```text
|
|
plan | manual | normal | deep → user plans with reasoning model
|
|
build | autonomous | trusted | smart → continuous implementation
|
|
repair | assisted | normal | smart → one repair unit at a time
|
|
research | autonomous | restricted | deep → continuous research, read-only
|
|
review | manual | restricted | deep → user reviews with reasoning model
|
|
```
|
|
|
|
### 2.3 Rules
|
|
|
|
- `permissionProfile` never implies `runControl`. Autonomous run with `restricted` permissions is valid.
|
|
- `runControl` never implies `permissionProfile`. Manual run with `unrestricted` permissions is valid.
|
|
- Denylists and safety gates override `permissionProfile` regardless of value.
|
|
- Every risk decision logs all five axis values.
|
|
- `sandboxProfile` may become a sixth axis later. It is separate from `permissionProfile`: sandboxing controls process/filesystem/network containment, while permission profile controls what SF may approve.
|
|
|
|
---
|
|
|
|
## 3. Work Modes
|
|
|
|
### 3.1 `chat`
|
|
|
|
Default conversational mode. Questions, explanations, low-commitment exploration. No durable artifacts created without explicit user request.
|
|
|
|
### 3.2 `plan`
|
|
|
|
Research, clarify, write/update specs, derive tasks, produce explicit acceptance point before implementation. Primary user journey starts here.
|
|
|
|
**Plan → Build handoff:**
|
|
```text
|
|
plan | manual | normal | deep
|
|
accept plan
|
|
build | autonomous | selected-permission-profile | smart
|
|
```
|
|
|
|
Surfaces:
|
|
- TUI: plan acceptance prompt includes "run autonomously" button
|
|
- Web: plan acceptance button includes "run autonomously"
|
|
- Headless: `--autonomous` chains into direct `/autonomous`
|
|
- RPC: machine event records transition explicitly
|
|
|
|
### 3.3 `build`
|
|
|
|
Implement, test, lint, typecheck, verify, prepare commit-ready changes. The autonomous default.
|
|
|
|
### 3.4 `review`
|
|
|
|
Inspect diffs, tests, risks, regressions, security issues, missing evidence. Requires reasoning model (`deep`).
|
|
|
|
### 3.5 `repair`
|
|
|
|
Fix SF health, repo health, runtime drift, broken generated state, bad command surfaces, failing workflow infrastructure, stale locks, broken installed runtime copies, failed gates, generated/runtime drift, and broken state.
|
|
|
|
**Doctor is the diagnostic engine, not the mode.** `/doctor` inspects. `/repair` switches work mode.
|
|
`repair` is a `workMode`, not a separate subsystem.
|
|
|
|
Commands:
|
|
```text
|
|
/doctor → inspect and report
|
|
/doctor fix → deterministic auto-fix
|
|
/doctor heal → LLM-assisted deep healing
|
|
/repair → switch workMode to repair
|
|
/repair --autonomous → repair until clean, blocked, or limit-hit
|
|
```
|
|
|
|
**Auto-transition to repair:**
|
|
```text
|
|
build | autonomous | trusted | smart
|
|
→ repair | autonomous | normal | smart
|
|
```
|
|
|
|
Allowed when:
|
|
- pre-dispatch health gate fails
|
|
- installed runtime drift detected
|
|
- SF cannot dispatch safely
|
|
- repo workflow state corrupted
|
|
|
|
Policy: configurable per project. Options: `auto`, `ask`, `log-only`.
|
|
|
|
### 3.6 `research`
|
|
|
|
Longer-form codebase, competitor, design, API, or dependency research. Uses web search, local code exploration, cross-repo research, helper agents.
|
|
|
|
---
|
|
|
|
## 4. Run Control
|
|
|
|
| Value | Behavior |
|
|
|-------|----------|
|
|
| `manual` | User drives every step. Tool calls require approval. |
|
|
| `assisted` | SF executes one unit, then pauses for user review. |
|
|
| `autonomous` | SF continues until done, blocked, interrupted, budget-hit, or limit-hit. |
|
|
|
|
### 4.1 Commands
|
|
|
|
```text
|
|
/control manual
|
|
/control assisted
|
|
/control autonomous
|
|
/autonomous → alias for /control autonomous
|
|
/next → alias for /control assisted (one unit)
|
|
/pause → pause autonomous, preserve state
|
|
/stop → stop autonomous, clear state
|
|
```
|
|
|
|
### 4.2 Transition Scopes
|
|
|
|
| Scope | Behavior |
|
|
|-------|----------|
|
|
| `now` | Apply immediately if no tool active. Abort current tool if policy allows. |
|
|
| `after-current-tool` | Finish active tool, then switch. |
|
|
| `after-current-unit` | Finish current SF unit, then switch. |
|
|
| `next-milestone` | Switch after current milestone completes. |
|
|
|
|
Autonomous changes affect future decisions, never mutate active tool calls mid-execution.
|
|
|
|
### 4.3 Transition Logging
|
|
|
|
Every transition persists:
|
|
|
|
```json
|
|
{
|
|
"timestamp": "2026-05-08T10:00:00Z",
|
|
"from": {"workMode": "build", "runControl": "autonomous"},
|
|
"to": {"workMode": "repair", "runControl": "autonomous"},
|
|
"reason": "pre-dispatch health gate failed",
|
|
"scope": "after-current-unit",
|
|
"sessionId": "..."
|
|
}
|
|
```
|
|
|
|
---
|
|
|
|
## 5. Permission Profiles
|
|
|
|
| Profile | Description |
|
|
|---------|-------------|
|
|
| `restricted` | Read-only and explicitly allowlisted actions. |
|
|
| `normal` | Safe edits, non-destructive local commands. |
|
|
| `trusted` | Build/test/install/local commits and bounded repo automation. |
|
|
| `unrestricted` | High-risk orchestration only in intentionally trusted environments. |
|
|
|
|
### 5.1 Enforcement
|
|
|
|
Permission profile is enforced at three layers:
|
|
|
|
1. **Tool registry:** Each tool declares required profile. Tools below profile are hidden from model.
|
|
2. **Execution gate:** Each tool call checks profile at invocation. Violation = error.
|
|
3. **Safety harness:** Destructive operations (delete, push to production, etc.) require explicit confirmation regardless of profile.
|
|
|
|
### 5.2 Commands
|
|
|
|
```text
|
|
/permission-profile restricted
|
|
/permission-profile normal
|
|
/permission-profile trusted
|
|
/permission-profile unrestricted
|
|
```
|
|
|
|
---
|
|
|
|
## 6. Model Modes
|
|
|
|
| Mode | Use Case | Routing Hint |
|
|
|------|----------|--------------|
|
|
| `fast` | Small bounded tasks | Cheapest available model |
|
|
| `smart` | Default balanced work | Default routing table |
|
|
| `deep` | Planning, debugging, research, review | Reasoning model (o1, Claude Opus, etc.) |
|
|
|
|
`modelMode` guides routing. It does not replace explicit `/model` selection.
|
|
|
|
---
|
|
|
|
## 7. Mode Switching UX
|
|
|
|
### 7.1 Direct Commands
|
|
|
|
```text
|
|
/mode chat
|
|
/mode plan
|
|
/mode build
|
|
/mode review
|
|
/mode repair
|
|
/mode research
|
|
/control manual
|
|
/control assisted
|
|
/control autonomous
|
|
/permission-profile restricted
|
|
/permission-profile normal
|
|
/permission-profile trusted
|
|
/permission-profile unrestricted
|
|
/model-mode fast
|
|
/model-mode smart
|
|
/model-mode deep
|
|
```
|
|
|
|
### 7.2 Combined Forms
|
|
|
|
```text
|
|
/mode repair --autonomous --permission-profile normal
|
|
/mode build --autonomous --permission-profile trusted
|
|
/mode research --autonomous --permission-profile restricted --model-mode deep
|
|
```
|
|
|
|
### 7.3 Autonomous Steering
|
|
|
|
```text
|
|
/steer mode repair
|
|
/steer mode review after-current-unit
|
|
/steer permission-profile restricted now
|
|
/steer model-mode deep for-next-unit
|
|
```
|
|
|
|
### 7.4 Keyboard Shortcuts
|
|
|
|
| Shortcut | Action |
|
|
|----------|--------|
|
|
| `Ctrl+Shift+M` | Cycle workMode: chat → plan → build → review → repair → research → chat |
|
|
| `Ctrl+Shift+A` | Set runControl to autonomous |
|
|
| `Ctrl+Shift+S` | Set runControl to assisted (step) |
|
|
| `Ctrl+Shift+I` | Set runControl to manual (interactive) |
|
|
| `Ctrl+Shift+R` | Set workMode to repair |
|
|
| `Ctrl+Shift+P` | Cycle permissionProfile: restricted → normal → trusted → unrestricted → restricted |
|
|
|
|
---
|
|
|
|
## 8. Status and Mode Badge
|
|
|
|
### 8.1 Full Status Line
|
|
|
|
```text
|
|
SF build | autonomous | trusted | smart
|
|
```
|
|
|
|
### 8.2 Compact Badge Form
|
|
|
|
For narrow terminals (< 80 cols):
|
|
|
|
```text
|
|
[B][A][T][S]
|
|
```
|
|
|
|
### 8.3 Critical State Labels
|
|
|
|
When workMode is `repair` or `review`, show full labels regardless of width:
|
|
|
|
```text
|
|
repair | autonomous | normal | smart
|
|
review | assisted | normal | deep
|
|
```
|
|
|
|
### 8.4 Badge Placement
|
|
|
|
| Surface | Placement |
|
|
|---------|-----------|
|
|
| TUI header | Left side, after "SF" logo |
|
|
| TUI status bar | Bottom line when header hidden |
|
|
| tmux/terminal title | `SF[build|A|trusted|smart] project-name` |
|
|
| Web | Top bar, color-coded chip |
|
|
|
|
### 8.5 Badge During Auto Mode
|
|
|
|
Current code hides header/footer during auto mode (`if (isAutoActive()) return []`). This must change:
|
|
|
|
- Show **minimal header** during auto: badge + project name only
|
|
- Or show badge in **dedicated status bar** separate from header/footer
|
|
- Badge color pulses slowly during autonomous execution (subtle animation)
|
|
|
|
### 8.6 Badge Colors
|
|
|
|
| Axis | Value | Color |
|
|
|------|-------|-------|
|
|
| workMode | `chat` | dim |
|
|
| workMode | `plan` | accent |
|
|
| workMode | `build` | success |
|
|
| workMode | `review` | warning |
|
|
| workMode | `repair` | error |
|
|
| workMode | `research` | info |
|
|
| runControl | `manual` | dim |
|
|
| runControl | `assisted` | warning |
|
|
| runControl | `autonomous` | success (pulsing) |
|
|
| permissionProfile | `restricted` | success |
|
|
| permissionProfile | `normal` | dim |
|
|
| permissionProfile | `trusted` | warning |
|
|
| permissionProfile | `unrestricted` | error |
|
|
|
|
---
|
|
|
|
## 9. Background Work Surface (`/tasks`)
|
|
|
|
Unified view of all background work. Replaces scattered `/status`, `/queue`, `/parallel status` for work inspection.
|
|
|
|
### 9.1 What `/tasks` Shows
|
|
|
|
- autonomous task lifecycle rows
|
|
- parallel workers
|
|
- scheduled autonomous dispatches and queued scheduler rows
|
|
- background shell sessions
|
|
- stuck or resumable sessions
|
|
- remote questions waiting for answers
|
|
- current cost/budget state
|
|
- last checkpoint and next action
|
|
|
|
### 9.2 Data Model
|
|
|
|
SQLite tables:
|
|
|
|
```sql
|
|
-- Durable task state
|
|
CREATE TABLE tasks (
|
|
id TEXT PRIMARY KEY,
|
|
work_mode TEXT NOT NULL,
|
|
run_control TEXT NOT NULL,
|
|
permission_profile TEXT NOT NULL,
|
|
model_mode TEXT NOT NULL,
|
|
status TEXT NOT NULL, -- todo | running | verifying | reviewing | done | blocked | paused | failed | cancelled | retrying
|
|
dependency_blockers TEXT, -- JSON array of task IDs
|
|
retry_count INTEGER DEFAULT 0,
|
|
max_retries INTEGER DEFAULT 3,
|
|
checkpoint_ref TEXT, -- git ref or patch file
|
|
cost_budget REAL,
|
|
cost_spent REAL DEFAULT 0,
|
|
created_at TEXT, -- Temporal.Instant ISO
|
|
started_at TEXT,
|
|
completed_at TEXT,
|
|
next_action_at TEXT, -- Temporal.ZonedDateTime for scheduled
|
|
intent_claim TEXT -- for parallel workers: "I will edit src/foo.ts lines 10-50"
|
|
);
|
|
|
|
-- Scheduler state is separate from task lifecycle state
|
|
CREATE TABLE task_scheduler (
|
|
task_id TEXT PRIMARY KEY REFERENCES tasks(id),
|
|
status TEXT NOT NULL, -- queued | due | claimed | dispatched | consumed | expired
|
|
due_at TEXT,
|
|
claimed_by TEXT,
|
|
dispatched_at TEXT,
|
|
consumed_at TEXT,
|
|
expires_at TEXT
|
|
);
|
|
|
|
-- Ephemeral running state
|
|
CREATE TABLE task_runtime (
|
|
task_id TEXT PRIMARY KEY REFERENCES tasks(id),
|
|
process_pid INTEGER,
|
|
worktree_path TEXT,
|
|
current_model TEXT,
|
|
context_usage_percent REAL,
|
|
last_heartbeat_at TEXT, -- Temporal.Instant
|
|
FOREIGN KEY (task_id) REFERENCES tasks(id)
|
|
);
|
|
|
|
-- Transition log
|
|
CREATE TABLE task_transitions (
|
|
id INTEGER PRIMARY KEY AUTOINCREMENT,
|
|
task_id TEXT NOT NULL,
|
|
from_status TEXT NOT NULL,
|
|
to_status TEXT NOT NULL,
|
|
reason TEXT,
|
|
scope TEXT,
|
|
timestamp TEXT NOT NULL -- Temporal.Instant
|
|
);
|
|
```
|
|
|
|
Parallel workers must stay worktree-isolated and report heartbeat/status into
|
|
`.sf` state. Scheduler rows may use `queued`; task lifecycle rows use `todo`.
|
|
|
|
### 9.3 Complementary Commands
|
|
|
|
`/tasks` does not replace:
|
|
- `/status` → project health dashboard
|
|
- `/queue` → milestone/slice dispatch order
|
|
- `/parallel status` → parallel orchestrator detail
|
|
- `/session-report` → cost/token summary
|
|
- `/logs` → activity logs
|
|
- `/forensics` → execution forensics
|
|
|
|
---
|
|
|
|
## 10. Skills System
|
|
|
|
### 10.1 Directory Structure
|
|
|
|
```text
|
|
.agents/skills/<skill-name>/
|
|
SKILL.md -- skill definition with YAML frontmatter
|
|
scripts/ -- supporting scripts
|
|
schemas/ -- JSON schemas for inputs/outputs
|
|
checklists/ -- verification checklists
|
|
mcp.json -- MCP server config if applicable
|
|
```
|
|
|
|
### 10.2 Skill Frontmatter
|
|
|
|
```yaml
|
|
---
|
|
name: forge-command-surface
|
|
description: Use when changing SF slash commands, browser command parity, or headless command dispatch.
|
|
user-invocable: true
|
|
model-invocable: true
|
|
side-effects: code-edits
|
|
permission-profile: normal
|
|
---
|
|
```
|
|
|
|
Fields:
|
|
- `name`: unique identifier
|
|
- `description`: when to use this skill
|
|
- `user-invocable`: can user explicitly invoke?
|
|
- `model-invocable`: can model auto-invoke when relevant?
|
|
- `side-effects`: `none`, `code-edits`, `production-mutation`, etc.
|
|
- `permission-profile`: minimum profile required
|
|
|
|
### 10.3 Skill Categories
|
|
|
|
| Type | Example | `model-invocable` |
|
|
|------|---------|-------------------|
|
|
| Background knowledge | `forge-autonomous-runtime` | true |
|
|
| User tool | `production-deploy` | false |
|
|
| Shared capability | `forge-command-surface` | true |
|
|
|
|
Dangerous skills (`production-mutation`) are never model-invoked by default.
|
|
|
|
### 10.4 Auto-Creation Flow
|
|
|
|
1. Detect repeated repo-specific evidence (same files, commands, failure modes, rules)
|
|
2. Propose skill in manual/restricted contexts
|
|
3. Generate/update automatically only when policy allows
|
|
4. Record repeated source evidence in `.sf` state
|
|
5. Keep narrow, linted, and evaled like code
|
|
6. Commit with repo when accepted
|
|
|
|
### 10.5 Skill Eval Cases
|
|
|
|
Every auto-created skill needs eval cases:
|
|
|
|
```text
|
|
.agents/skills/<skill-name>/evals/
|
|
case-1/
|
|
task.md -- user-like prompt
|
|
grader.js -- deterministic checker
|
|
hidden/ -- reference answers (not visible to agent)
|
|
work/ -- agent workspace
|
|
```
|
|
|
|
Graders inspect: files, artifacts, `answer.json`, `trace.jsonl`, result state.
|
|
Failed trials preserve workspace for debugging.
|
|
|
|
---
|
|
|
|
## 11. Command Surfaces
|
|
|
|
### 11.1 Human Slash Commands
|
|
|
|
SF registers direct command roots only:
|
|
|
|
```text
|
|
/status
|
|
/autonomous
|
|
/doctor
|
|
/rate
|
|
/session-report
|
|
/parallel
|
|
/remote
|
|
/tasks
|
|
```
|
|
|
|
`/sf` is not a command root. TUI and browser command parity tests reject it so
|
|
compatibility shims do not grow back.
|
|
|
|
`/remote` is a full-session steering surface. Remote answers may change
|
|
`workMode`, `runControl`, `permissionProfile`, and `modelMode`; they are not
|
|
limited to question delivery.
|
|
|
|
### 11.2 Shell Surface
|
|
|
|
Machine surface remains prefixed:
|
|
|
|
```text
|
|
sf headless autonomous
|
|
sf headless --autonomous ...
|
|
```
|
|
|
|
The shell prefix is the executable name, not an interactive slash-command
|
|
namespace.
|
|
|
|
---
|
|
|
|
## 12. Runtime Target: Node 26
|
|
|
|
### 12.1 Policy
|
|
|
|
```text
|
|
current compatibility floor: Node 26.1+
|
|
internal target runtime: Node 26.1
|
|
canonical baseline: Node 26.1
|
|
Node 25: skip except quick probes
|
|
```
|
|
|
|
### 12.2 Why Node 26
|
|
|
|
- `Temporal` enabled by default
|
|
- V8 14.6 baseline
|
|
- Undici 8 HTTP/fetch baseline
|
|
- Removes legacy APIs, hardens against old assumptions
|
|
|
|
### 12.3 Temporal Adoption
|
|
|
|
Store semantic type, not just formatted string:
|
|
|
|
| Concept | Temporal Type | Use Case |
|
|
|---------|---------------|----------|
|
|
| Exact instant | `Temporal.Instant` | Journal events, checkpoints, lock leases |
|
|
| Local time | `Temporal.ZonedDateTime` | Reminders, schedules, audits |
|
|
| Calendar date | `Temporal.PlainDate` | Daily reports, milestone reviews |
|
|
| Wall-clock time | `Temporal.PlainTime` | Recurring policies |
|
|
| Time amount | `Temporal.Duration` | Budgets, leases, cooldowns, retry delays |
|
|
|
|
### 12.4 Adoption Priority
|
|
|
|
1. `sf schedule` — highest user-visible impact
|
|
2. Lock/lease — highest operational correctness
|
|
3. Journals/traces — highest debugging impact
|
|
4. Session reports — nice to have
|
|
5. Background tasks — future work
|
|
|
|
### 12.5 Gate
|
|
|
|
```text
|
|
node@26 --version
|
|
npm run lint
|
|
npm run typecheck:extensions
|
|
npm run build
|
|
npm test
|
|
sf --version
|
|
sf --help
|
|
sf --print "ping"
|
|
```
|
|
|
|
---
|
|
|
|
## 13. Implementation Pull-Through
|
|
|
|
### 13.1 Already Directionally Right
|
|
|
|
- UOK lifecycle records carry `runControl`
|
|
- UOK lifecycle records carry `permissionProfile`
|
|
- Schedule command state uses `autonomous_dispatch`
|
|
- DB-backed state, recovery, verification, scheduling, captures, forensics
|
|
- Skills and project-specific skill paths exist
|
|
- Parallel orchestration and remote-question infrastructure
|
|
|
|
### 13.2 Still Needed
|
|
|
|
| Priority | Item | Effort |
|
|
|----------|------|--------|
|
|
| P2 | Decide whether `sandboxProfile` becomes a sixth persisted axis | Medium |
|
|
| P2 | Remove `/sf` from docs/web/tests (Phase 2 deprecation) | Small |
|
|
|
|
### 13.4 Recently Completed (This Session)
|
|
|
|
| Priority | Item | Status |
|
|
|----------|------|--------|
|
|
| P1 | Centralized metrics system (`metrics-central.js`) | ✓ |
|
|
| P1 | Cost command (`/cost`) with DB + ledger queries | ✓ |
|
|
| P1 | Explicit stage commands (`/research`, `/plan`, `/implement`) | ✓ |
|
|
| P2 | Reasoning assist foundation (`reasoning-assist.js`) | ✓ |
|
|
| P2 | Self-feedback → workMode auto-transition | ✓ |
|
|
| P2 | UOK events carry workMode + modelMode | ✓ |
|
|
| P2 | `/sf` prefix deprecation warning (Phase 1) | ✓ |
|
|
|
|
### 13.3 Completed
|
|
|
|
| Priority | Item | Status |
|
|
|----------|------|--------|
|
|
| P0 | Make mode state durable in SQLite | ✓ |
|
|
| P0 | Add direct `/mode`, `/control`, `/permission-profile`, `/model-mode` commands | ✓ |
|
|
| P0 | Add visible mode badge to TUI header/status bar | ✓ |
|
|
| P1 | Make `--autonomous` chain into direct `/autonomous` | ✓ |
|
|
| P1 | Expose autonomous continuation limits in settings and status | ✓ |
|
|
| P1 | Add `/tasks` backed by DB execution graph state | ✓ |
|
|
| P1 | Make `repair` first-class workflow over `doctor` | ✓ |
|
|
| P1 | Enhanced `/steer` with mode/permission-profile/model-mode transitions | ✓ |
|
|
| P1 | TUI keyboard shortcuts for mode cycling (Ctrl+Shift+M/R/A/S/P) | ✓ |
|
|
| P1 | Minimal auto-mode header/footer (badge visible during autonomy) | ✓ |
|
|
| P1 | Remove `/sf` namespace registration and parity-test against fallback | ✓ |
|
|
| P1 | Parallel worker intent/claim registry backed by UOK SQLite coordination | ✓ |
|
|
| P1 | Skill eval harness foundation | ✓ |
|
|
| P1 | Terminal title mode indicator | ✓ |
|
|
| P2 | Policy-aware project skill suggestion/generation with DB cooldown | ✓ |
|
|
| P2 | Schema-backed task frontmatter (risk, mutation, verification, approval) | ✓ |
|
|
| P2 | Subagent provider/model/permission inheritance audit and guard | ✓ |
|
|
| P2 | Remote steering as full-session surface from remote answers | ✓ |
|
|
|
|
---
|
|
|
|
## 14. Open Questions
|
|
|
|
1. Should `plan` mode show badge `[P]` or `plan` text in full?
|
|
2. Should paused autonomous show previous badge dimmed, or `[P]` for paused?
|
|
3. Should mode be per-session or per-project? (Current: per-session)
|
|
4. Should badge appear in tmux/terminal window titles?
|
|
5. Should mode transitions have sound/notification?
|
|
6. Should `repair` auto-transition be `ask` by default for new projects?
|
|
7. Should skill eval cases run in CI or only on-demand?
|
|
8. Should `/tasks` be a TUI overlay or a separate scrollable panel?
|
|
9. Should reasoning assist call a fast model automatically, or only prepare prompts for now?
|
|
|
|
---
|
|
|
|
## 15. References
|
|
|
|
- GitHub Docs, "Allowing GitHub Copilot CLI to work autonomously" — <https://docs.github.com/en/copilot/concepts/agents/copilot-cli/autopilot>
|
|
- Factory Droid, "Autonomy Level" — <https://docs.factory.ai/cli/user-guides/auto-run>
|
|
- Amp manual — <https://ampcode.com/manual>
|
|
- Smelt (mode cycling) — <https://github.com/leonardcser/smelt>
|
|
- ORCH (task state machine) — <https://github.com/oxgeneral/ORCH>
|
|
- AgentPlane (schema-first tasks) — <https://github.com/basilisk-labs/agentplane>
|
|
- Relay (channels and tickets) — <https://github.com/jcast90/relay>
|
|
- Sage (runtime-neutral orchestration) — <https://github.com/youwangd/SageCLI>
|
|
- Wit (symbol-level locks) — <https://github.com/amaar-mc/wit>
|