singularity-forge/docs/specs/sf-operating-model.md
2026-05-14 19:54:56 +02:00

# SF Operating Model
## Working Model Inputs
- Generator: `sf-spec-projections@1`
- Database: `.sf/sf.db` (present)
- Guidance: `.sf/PRINCIPLES.md` (present)
- Guidance: `.sf/TASTE.md` (present)
- Guidance: `.sf/ANTI-GOALS.md` (present)
- Optional knowledge: `.sf/KNOWLEDGE.md` (missing)
- Optional preferences: `.sf/PREFERENCES.md` (present)
- Source schema version: 45
- DB planning rows: milestones=1, slices=0, tasks=0
- DB spec rows: milestone_specs=1, slice_specs=0, task_specs=0
- Source roots analyzed as implementation evidence: `src/resources/extensions/sf/`, `src/headless*.ts`, `src/cli.ts`, `src/help-text.ts`, `web/`, `vscode-extension/`, `packages/`

This file is a human export for review, navigation, and git history. Generated docs are allowed to change because Git keeps the human-facing history. If SF needs operational history or future-use knowledge, store it in `.sf`/DB-backed state instead of relying on this export.

SF has one workflow engine and one database-first working model. TUI, CLI, web, editor integrations, and non-interactive automation must all drive the same flow:
```text
intent -> structured state/evidence -> UOK/policy gates -> execution -> journal/evidence -> projected status
```
The names below are separate axes. Do not use one as a synonym for another.
## Flow
The flow is the product behavior: how SF captures intent, plans work, applies policy, executes tasks, records evidence, and reports status. Flow behavior must not fork by UI surface. If a TUI run and a non-interactive run receive the same state, run control, and permission profile, they should follow the same control model.
## Product Modes
SF's user-facing product modes are about how much intent has been captured and
how much control the operator wants to retain. They are not agent personas.
- **Assisted build** — SF works step by step with the operator. It asks
clarifying questions when needed, proposes bounded next actions, and pauses at
important gates before continuing.
- **Plan/discuss** — the operator explores what they want, constraints, taste,
risks, and scope. On exit, SF converts the discussion into structured
database-backed planning state: milestones, slices, tasks, requirements,
decisions, or follow-up questions.
- **Autonomous** — SF attempts the full loop: research missing context, plan the
work, execute bounded units, verify, record evidence, update state, and keep
going until completion, policy, budget, confidence, or a gate stops it.

Bundled agents are implementation machinery inside these modes. They should
not be presented as the product model.
## Surface
A surface is where a person or program drives or observes the flow.
- **TUI surface** — interactive terminal UI.
- **CLI surface** — explicit commands and single-shot prompt entrypoints.
- **Web surface** — browser UI around the same state and flow.
- **Editor surface** — IDE/editor integration, usually through an adapter.
- **Machine surface** — non-interactive runner for CI, scripts, schedulers, and parent processes.

`sf headless` is the current command name for the machine surface. It means "without the TUI"; it does not mean "JSON", "autonomous", or a separate flow.
## Protocol
A protocol is how a surface or adapter talks to SF or to another agent process.
- **RPC** — SF's current child-process control channel.
- **stdio JSON-RPC** — process transport used by existing RPC-style adapters.
- **ACP** — editor/client protocol adapter for agent/editor interoperability.
- **HTTP/RPC** — possible daemon/web integration protocol.
- **Wire** — a low-level internal message layer, if introduced, below surfaces and adapters.

Protocols are adapters around the flow. They should translate messages, not invent policy or planning semantics.
## Output Format
An output format is only the encoding of a response or event stream.
- `text` — human-readable progress and results.
- `json` — one machine-readable result object.
- `stream-json` / JSONL — event stream for parent processes and monitors.

JSON is not a surface, run control, or permission profile. It is one output format that the machine surface can emit.
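
The point that an output format is only an encoding can be illustrated with a minimal sketch. The `FlowEvent` shape and the `encodeEvents` function are hypothetical names introduced for this example; the real SF event schema is not described in this document.

```typescript
// The three output formats named above.
type OutputFormat = "text" | "json" | "stream-json";

// Hypothetical event shape, used only for illustration.
interface FlowEvent {
  kind: string;
  message: string;
}

// Encode the same events three ways. Only the encoding differs;
// the events themselves are identical in every branch.
function encodeEvents(events: FlowEvent[], format: OutputFormat): string {
  switch (format) {
    case "text":
      // Human-readable progress lines.
      return events.map((e) => `[${e.kind}] ${e.message}`).join("\n");
    case "json":
      // One machine-readable result object.
      return JSON.stringify({ events });
    case "stream-json":
      // JSONL: one JSON object per line.
      return events.map((e) => JSON.stringify(e)).join("\n");
  }
}
```

In this sketch the choice of format never touches run control or permissions, which matches the separation the section describes.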
## Run Control
Run control describes how far SF continues through the flow before stopping for the operator.
- **manual** — user approves each consequential step.
- **assisted** — SF proposes and executes bounded steps, with human approval for important or uncertain actions.
- **autonomous** — SF continues through the flow until policy, evidence, budget, or completion stops it.

`auto` is not a run-control mode. Use **autonomous** for continuous run control; use **assisted** for bounded human-guided progression.

> Competitor note: Copilot CLI calls continuous run control autopilot.
> SF does not use that product name. The SF term is autonomous mode,
> and it stays separate from permission profiles, surfaces, protocols,
> and output formats.

UOK kernel records carry `runControl` as a first-class lifecycle field. Workflow phases such as planning, building, verification, and finalization are separate execution stages, not run-control modes.
## Permission Profile
A permission profile describes what SF is allowed to touch when a run-control mode asks it to act.
- **restricted** — read-mostly or planning-artifact-only permissions.
- **normal** — default workspace permissions for ordinary project work.
- **trusted** — wider local permissions for a trusted operator/workspace.
- **unrestricted** — explicit danger profile; use for logs, policy records, or deliberate bypass flows, not as friendly default product language.

Run control and permission profile are independent. For example, `autonomous + restricted` can keep going with narrow permissions, while `manual + trusted` still asks before each consequential step but can perform broader approved actions.

UOK kernel records and execution-policy decisions carry `permissionProfile` as the trust posture. Permission expansion never implies autonomous continuation.
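
The independence of the two axes can be sketched in TypeScript. Only the field names `runControl` and `permissionProfile` and their values come from the text above; the `RunPosture` and `Step` shapes and both functions are assumptions made for illustration, not SF's actual kernel types.

```typescript
type RunControl = "manual" | "assisted" | "autonomous";
type PermissionProfile = "restricted" | "normal" | "trusted" | "unrestricted";

// Hypothetical record carrying both axes side by side.
interface RunPosture {
  runControl: RunControl;
  permissionProfile: PermissionProfile;
}

// Hypothetical description of a step, for this sketch only.
interface Step {
  consequential: boolean;
  important: boolean;
  uncertain: boolean;
}

// Decides whether to pause for the operator. It reads runControl only,
// so widening permissions cannot change the answer.
function needsOperatorApproval(posture: RunPosture, step: Step): boolean {
  switch (posture.runControl) {
    case "manual":
      return step.consequential;
    case "assisted":
      return step.important || step.uncertain;
    case "autonomous":
      // Policy, budget, and evidence gates are not modeled here.
      return false;
  }
}

// Changing the permission profile leaves run control untouched.
function withPermissionProfile(
  posture: RunPosture,
  permissionProfile: PermissionProfile,
): RunPosture {
  return { ...posture, permissionProfile };
}
```

Under these assumptions, `manual + trusted` still pauses on a consequential step, and moving it to `unrestricted` does not make the run continue on its own.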
## Work Mode
A work mode describes the kind of work SF is doing. `repair` is one work mode, not a separate subsystem. It owns self-healing, stale locks, installed-runtime drift, broken state, failed gates, generated/runtime drift, and other cases where SF must repair its own ability to continue safely.
`doctor` remains a diagnostic engine. It can inspect and report problems, but switching into repair work is a `workMode` transition.
## Task And Scheduler Status
Durable task lifecycle state uses the ORCH-style status machine:
```text
todo -> running -> verifying -> reviewing -> done | blocked | paused | failed | cancelled | retrying
```
Use `todo`, not `queued`, for work that exists but has not started. `queued` belongs to scheduler state only:
```text
queued -> due -> claimed -> dispatched -> consumed | expired
```
Parallel workers must stay worktree-isolated and report heartbeat/status into `.sf` state. Their lifecycle rows use `task_status`; timed dispatch and reminder rows use `task_scheduler.status`.
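
The two vocabularies above can be kept apart at the type level. This is a minimal sketch: the status values are taken from the two diagrams, while the constant and function names are hypothetical and do not refer to existing SF code.

```typescript
// Task lifecycle statuses (durable task state).
const TASK_STATUSES = [
  "todo", "running", "verifying", "reviewing", "done",
  "blocked", "paused", "failed", "cancelled", "retrying",
] as const;

// Scheduler statuses (timed dispatch and reminder rows).
const SCHEDULER_STATUSES = [
  "queued", "due", "claimed", "dispatched", "consumed", "expired",
] as const;

type TaskStatus = (typeof TASK_STATUSES)[number];
type SchedulerStatus = (typeof SCHEDULER_STATUSES)[number];

// Type guard: true only for task lifecycle statuses.
function isTaskStatus(value: string): value is TaskStatus {
  return (TASK_STATUSES as readonly string[]).indexOf(value) !== -1;
}

// Type guard: true only for scheduler statuses.
function isSchedulerStatus(value: string): value is SchedulerStatus {
  return (SCHEDULER_STATUSES as readonly string[]).indexOf(value) !== -1;
}
```

With separate unions, writing `queued` into a task lifecycle field fails type checking instead of silently mixing the two state machines.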
## Remote Steering
Remote is a full-session steering surface. It may change `workMode`, `runControl`, `permissionProfile`, and `modelMode`; it is not only a question delivery channel.
## Future Sandbox Profile
`sandboxProfile` may become a sixth independent axis later. Keep it separate from `permissionProfile`: sandbox profile controls containment, while permission profile controls what SF may approve.
## Naming Rules
- Say **flow** for the shared planning/execution engine.
- Say **surface** for TUI, CLI, web, editor, or machine entrypoints.
- Say **protocol** for ACP, RPC, stdio JSON-RPC, HTTP, or wire messages.
- Say **output format** for `text`, `json`, and `stream-json`.
- Say **run control** for `manual`, `assisted`, and `autonomous`.
- Say **permission profile** for `restricted`, `normal`, `trusted`, and `unrestricted`.
- Say **task status** for `todo`, `running`, `verifying`, `reviewing`, `done`, `blocked`, `paused`, `failed`, `cancelled`, and `retrying`.
- Say **scheduler status** for `queued`, `due`, `claimed`, `dispatched`, `consumed`, and `expired`.
- Use **headless** only for the current `sf headless` command and implementation path. Product docs should explain it as the machine surface.
## Working State Contract
SF working state is database-first. An initialized SF repo has `.sf/sf.db`, and runtime tools use it as the canonical structured store for planning hierarchy, ordering, gates, ledgers, schedules, and validation-sensitive state.
Markdown under `.sf/` has two roles:
- working guidance and knowledge that the runtime loads, such as `PRINCIPLES.md`, `TASTE.md`, `ANTI-GOALS.md`, `KNOWLEDGE.md`, and `PREFERENCES.md`;
- human-readable projections from DB-owned records, such as rendered decisions, requirements, roadmap, plan, summary, and state files.

Markdown under `docs/specs/` is a human export for review, navigation, and git history. Generated docs can change; Git records that human-facing history. If SF needs its own operational history, it should store that in `.sf`/DB-backed state. Plans should record any surface, protocol, output-format, run-control, or permission-profile impact explicitly when a milestone changes integration behavior.
## Source Placement
SF source placement follows the same axis model. New code should extend the owning axis instead of creating parallel trees.
SF is not trying to be a general CLI coder with a menu of personas. SF is a
purpose-to-software runtime: it captures intent, plans work, dispatches bounded
autonomous operations, records evidence, and exposes operator control surfaces.
Agents, skills, prompt parts, and workflow templates exist to support that
runtime.
### Core Flow
- `src/resources/extensions/sf/` owns the SF workflow extension: planning tools, UOK/runtime state, `/next` commands, prompts, templates, doctors, schedule, and DB-backed state.
- `src/resources/extensions/` owns bundled extension packages loaded into the runtime.
- `src/resources/agents/`, `src/resources/skills/`, and `src/resources/workflows/` own bundled runtime resources, not independent product flows.
- SF is primarily designed for autonomous operation. Bundled agents are an internal worker pool for orchestration by default, not a user-facing marketplace or a CLI-coder persona menu. `src/resources/agents/*.md` is the simple one-body worker-agent format; `src/resources/extensions/sf/agents/*.agent.yaml` is the structured format for SF-owned workflow agents that need `promptParts`, tool-policy contracts, or gate output contracts.
### Surfaces
- `src/cli.ts` and `src/help-text.ts` own CLI/session entrypoint behavior and command help.
- `src/headless*.ts` owns the existing `sf headless` machine-surface command path. Keep the command name; describe it as the machine surface in product language.
- `web/` owns the browser surface.
- `vscode-extension/` owns the editor surface.
- `packages/pi-tui/` owns reusable TUI primitives and terminal UI components.
### Protocols And Adapters
- `packages/rpc-client/` owns reusable RPC client protocol code.
- RPC child-process orchestration stays in the machine-surface path unless promoted into a reusable protocol package.
- ACP, stdio JSON-RPC, HTTP, and future wire layers are protocol/adapters. They should translate messages to the same SF flow, not fork planning semantics.
### Workspace Packages
- `packages/pi-agent-core/` owns reusable agent-core primitives.
- `packages/pi-ai/` owns provider/model integration.
- `packages/pi-coding-agent/` owns reusable coding-agent substrate inherited from Pi.
- `packages/daemon/` owns daemonized background service code.
- `packages/native/` and `rust-engine/` own native/Rust performance paths.
### State And Projections
- `.sf/sf.db` is the canonical structured runtime state store for initialized SF repos. Treat a missing or unreadable DB as bootstrap/recovery, not a normal alternate source of truth.
- `.sf/DECISIONS.md`, `.sf/REQUIREMENTS.md`, milestone roadmaps, and similar files are rendered working projections when database-backed tools own the data. They are useful to humans and agents but must not compete with DB rows.
- `.sf/PRINCIPLES.md`, `.sf/TASTE.md`, `.sf/ANTI-GOALS.md`, `.sf/KNOWLEDGE.md`, and `.sf/PREFERENCES.md` are repo-local working guidance files when present.
- Generated `.sf/` runtime files are evidence, projections, or import/recovery artifacts.
- Durable human-facing exports belong in `docs/specs/`, `docs/adr/`, or `docs/plans/`. They are reviewable projections and git-history artifacts, not a second planning database.
### Placement Rules
- Do not create a second implementation because a feature appears in another surface. Add an adapter to the same flow.
- Do not name output encodings as surfaces. JSON belongs to output formats.
- Do not name permission expansion as run control. `autonomous` means the loop continues; `trusted` or `unrestricted` means the permission profile widened.
- Do not change whether human questions are asked because a run uses `headless`. Questions come from run-control and permission-policy gates; the surface only determines delivery.
## Planning Schema
SF uses a three-table planning hierarchy stored in `.sf/sf.db`. Each level owns a different granularity of work and carries distinct lifecycle state.
### Table: `milestones`
Primary key: `id` (e.g. `M001`, `M002`).
Stores the top-level milestone record with its current status, title, and parent goal reference. The `milestone_specs` table holds the spec-authored metadata (description, success criteria, constraints).
### Table: `slices`
Primary key: `(milestone_id, id)` (e.g. `M001 / S01`).
Stores one vertical slice of a milestone — a self-contained deliverable. Slices carry their own status, sequence number, and optional blocking relationships. The `slice_specs` table holds the spec-authored metadata (acceptance criteria, deliverables, risk level).
### Table: `tasks`
Primary key: `(milestone_id, slice_id, id)` (e.g. `M001 / S01 / T01`).
Stores one atomic unit of implementation work. Tasks carry:
- Planning fields: `title`, `description`, `estimate`, `files` (JSON array), `verify`, `inputs`, `expected_output`, `full_plan_md`
- Frontmatter fields: `risk`, `mutation_scope`, `verification_type`, `plan_approval`, `task_status`, `estimated_effort`, `dependencies` (JSON array), `blocks_parallel`, `requires_user_input`, `auto_retry`, `max_retries`, `frontmatter_version`
- Lifecycle fields: `status` (ORCH-style), `sequence`, `run_index`, `retries`, `evidence`, `error`

The `task_specs` table holds spec-authored content and carries its own `spec_version` field. The `frontmatter_version` field on the `tasks` table records which frontmatter schema version was used when the row was last written, enabling forward-compatible migrations as the frontmatter schema evolves.
### Relationship
```text
milestones 1──* slices 1──* tasks
                │           │
                slice_specs task_specs
```
Spec tables (`milestone_specs`, `slice_specs`, `task_specs`) are append-only: a new spec row is inserted when the spec changes, keeping history. The planning tables (`milestones`, `slices`, `tasks`) are live mutable state, updated as work progresses.
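
The difference between the two kinds of table can be shown with an in-memory sketch. The `PlanningStore` class, its row shapes, and the assumption that `specVersion` increases by one per inserted row are all hypothetical; the real tables live in `.sf/sf.db` and their exact columns may differ.

```typescript
// Hypothetical stand-ins for rows in `tasks` and `task_specs`.
interface TaskRow {
  id: string;
  status: string;
}

interface TaskSpecRow {
  taskId: string;
  specVersion: number;
  content: string;
}

class PlanningStore {
  private tasks = new Map<string, TaskRow>();
  private taskSpecs: TaskSpecRow[] = [];

  // Planning rows are live state: one row per task, updated in place.
  upsertTask(row: TaskRow): void {
    this.tasks.set(row.id, { ...row });
  }

  setStatus(id: string, status: string): void {
    const row = this.tasks.get(id);
    if (row === undefined) throw new Error(`unknown task ${id}`);
    row.status = status;
  }

  getTask(id: string): TaskRow | undefined {
    return this.tasks.get(id);
  }

  // Spec rows are never rewritten: each change inserts a new row.
  recordSpec(taskId: string, content: string): TaskSpecRow {
    const specVersion = this.specHistory(taskId).length + 1;
    const row: TaskSpecRow = { taskId, specVersion, content };
    this.taskSpecs.push(row);
    return row;
  }

  specHistory(taskId: string): TaskSpecRow[] {
    return this.taskSpecs.filter((r) => r.taskId === taskId);
  }

  currentSpec(taskId: string): TaskSpecRow | undefined {
    const history = this.specHistory(taskId);
    return history[history.length - 1];
  }
}
```

In this sketch a status change overwrites the single planning row, while a spec change leaves every earlier spec row readable.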
### ID Conventions
- Milestone IDs: `M001`, `M002`, …
- Slice IDs: `S01`, `S02`, … (scoped within a milestone)
- Task IDs: `T01`, `T02`, … (scoped within a slice)

Never use raw integer IDs or UUIDs for planning hierarchy references. The `M/S/T` prefixed IDs appear in prompts, logs, and evidence, and must be human-readable.
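
The ID conventions above can be sketched as a small formatter and parser. The zero-padding widths (three digits for milestones, two for slices and tasks) are inferred from the examples in this document, and `formatTaskRef` and `parseTaskRef` are hypothetical helper names, not existing SF functions.

```typescript
interface TaskRef {
  milestoneId: string;
  sliceId: string;
  taskId: string;
}

// Left-pad a number with zeros to a fixed width.
function pad(n: number, width: number): string {
  let s = String(n);
  while (s.length < width) s = "0" + s;
  return s;
}

// Build a reference such as "M001 / S01 / T01" from ordinal numbers.
function formatTaskRef(milestone: number, slice: number, task: number): string {
  return `M${pad(milestone, 3)} / S${pad(slice, 2)} / T${pad(task, 2)}`;
}

// Accepts "M001 / S01 / T01" or "M001/S01/T01"; widths are an assumption.
const TASK_REF = /^(M\d{3})\s*\/\s*(S\d{2})\s*\/\s*(T\d{2})$/;

// Parse a reference, returning null for anything that is not M/S/T prefixed.
function parseTaskRef(ref: string): TaskRef | null {
  const match = TASK_REF.exec(ref.trim());
  if (match === null) return null;
  return { milestoneId: match[1], sliceId: match[2], taskId: match[3] };
}
```

A parser like this rejects raw integers and UUIDs, so only prefixed, human-readable references reach prompts, logs, and evidence.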