# Product Sense

## The Core Thesis

Autonomous execution is the end gate. SF exists to take a multi-phase software project — a milestone with slices and tasks — and run it to completion without human intervention, producing a clean git history, passing tests, and a deployable artifact.

Every design decision should be evaluated against this question: **does it make autonomous execution more reliable, more observable, or more recoverable?**

## User Goals

- Hand off a milestone and have it complete without babysitting
- Know the agent won't make irreversible mistakes (write gates, protected files, budget ceilings)
- Resume after a crash without losing work (state-on-disk, crash recovery)
- See what the agent did and why (trace files, decision register, records keeper)
- Steer mid-run without breaking the loop (message queue, steering gate)

## Non-Goals

- Being a chat interface — use the Pi interactive mode for exploratory conversation
- Replacing CI — SF triggers verification but does not replace your existing CI pipeline
- Working without context — SF needs a spec, a roadmap, and a task plan; it does not invent work from nothing

## What Good Product Judgment Looks Like

**Fresh context per unit, not accumulated context.** Each task gets a new session with exactly the context it needs pre-injected (task plan, slice plan, prior summaries, relevant skills). This prevents quality degradation from context accumulation — one of the primary failure modes of naive LLM agents on long projects.
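A minimal sketch of what "fresh context per unit" could look like in code. The names here (`UnitInputs`, `UnitSession`, `freshSession`) are hypothetical and not taken from the SF codebase; the point is only that each session is assembled from explicit inputs and inherits nothing implicitly from the previous one.

```typescript
// Hypothetical shapes for illustration; not SF's actual types.
interface UnitInputs {
  taskPlan: string;
  slicePlan: string;
  priorSummaries: string[];
  skills: string[];
}

interface UnitSession {
  readonly unitId: string;
  readonly context: readonly string[];
}

// Build a brand-new session for one unit. The only thing carried over from
// earlier units is whatever the caller passes in as priorSummaries.
function freshSession(unitId: string, inputs: UnitInputs): UnitSession {
  const context = [
    `# Task plan\n${inputs.taskPlan}`,
    `# Slice plan\n${inputs.slicePlan}`,
    ...inputs.priorSummaries.map((s, i) => `# Prior summary ${i + 1}\n${s}`),
    ...inputs.skills.map((s) => `# Skill\n${s}`),
  ];
  return { unitId, context };
}

const first = freshSession("T01", {
  taskPlan: "Add the parser",
  slicePlan: "Slice S01: input handling",
  priorSummaries: [],
  skills: ["testing"],
});

// The second unit sees a summary of the first, not its full transcript.
const second = freshSession("T02", {
  taskPlan: "Add the validator",
  slicePlan: "Slice S01: input handling",
  priorSummaries: ["T01: parser added"],
  skills: ["testing"],
});
```

Passing summaries rather than transcripts keeps the context size roughly constant per unit, which is the property the paragraph above depends on.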

**State machine, not LLM guessing.** The loop is deterministic: read STATE.md → validate → dispatch → post-unit → verify → advance. The LLM executes work inside a unit; it does not decide what the next unit is. Separating orchestration from execution keeps the system predictable.
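The separation between orchestration and execution can be sketched as a small transition table. This is an illustrative model under assumptions, not SF's implementation: the phase names mirror the loop described above, while `LoopState`, `step`, and `runAll` are hypothetical. The executor callback stands in for the LLM and has no way to influence which unit or phase comes next.

```typescript
type Phase = "read" | "validate" | "dispatch" | "post-unit" | "verify" | "advance";

// Fixed transition table: the next phase never depends on executor output.
const NEXT: Record<Phase, Phase> = {
  read: "validate",
  validate: "dispatch",
  dispatch: "post-unit",
  "post-unit": "verify",
  verify: "advance",
  advance: "read",
};

// Hypothetical loop state; a real system would load this from disk.
interface LoopState {
  units: string[]; // ordered unit ids taken from the plan
  index: number;   // position of the current unit
  phase: Phase;
}

// Advance one phase. The executor only runs during "dispatch" and returns
// nothing that could change the ordering.
function step(state: LoopState, runUnit: (unitId: string) => void): LoopState {
  const unit = state.units[state.index];
  if (state.phase === "dispatch" && unit !== undefined) {
    runUnit(unit);
  }
  const index = state.phase === "advance" ? state.index + 1 : state.index;
  return { ...state, index, phase: NEXT[state.phase] };
}

// Drive the loop until every unit has advanced, recording each phase visited.
function runAll(units: string[], runUnit: (unitId: string) => void): string[] {
  const trace: string[] = [];
  let state: LoopState = { units, index: 0, phase: "read" };
  while (state.index < units.length) {
    trace.push(`${state.units[state.index]}:${state.phase}`);
    state = step(state, runUnit);
  }
  return trace;
}

const executed: string[] = [];
const trace = runAll(["T01", "T02"], (id) => executed.push(id));
```

Because the table is data rather than model output, the same plan always yields the same sequence of phases, which is what makes the trace reproducible.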

**Spec-first.** No behavior change without a failing test first. No completion without a real consumer. This is the iron law — not a suggestion. An agent that completes tasks without specs is just making things up.

**Crash recovery must be invisible.** A crashed session should resume within seconds with no visible data loss. If recovery requires human intervention, it is a product failure.
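One way to get this behavior is to checkpoint after every completed unit and skip completed units on restart. The sketch below is an assumption-laden illustration: `FakeDisk` is an in-memory stand-in for a real file, and the `state.json` path and `Checkpoint` shape are hypothetical rather than SF's actual on-disk format.

```typescript
// In-memory stand-in for a file on disk. A real implementation would likely
// write to a temporary file and rename it, so a crash cannot leave a
// half-written checkpoint behind.
class FakeDisk {
  private files = new Map<string, string>();
  read(path: string): string | undefined {
    return this.files.get(path);
  }
  write(path: string, data: string): void {
    this.files.set(path, data);
  }
}

// Hypothetical checkpoint format.
interface Checkpoint {
  completed: string[];
}

const STATE_PATH = "state.json"; // hypothetical file name

function loadCheckpoint(disk: FakeDisk): Checkpoint {
  const raw = disk.read(STATE_PATH);
  return raw === undefined ? { completed: [] } : (JSON.parse(raw) as Checkpoint);
}

// Run units in order, persisting progress after each one. On restart,
// anything already recorded as completed is skipped.
function runWithCheckpoints(
  disk: FakeDisk,
  units: string[],
  runUnit: (unitId: string) => void,
): void {
  const completed = loadCheckpoint(disk).completed.slice();
  for (const unit of units) {
    if (completed.includes(unit)) continue;
    runUnit(unit); // may throw, simulating a crash mid-unit
    completed.push(unit);
    disk.write(STATE_PATH, JSON.stringify({ completed }));
  }
}

// Simulate a crash during T02, then a restart against the same disk.
const disk = new FakeDisk();
const ran: string[] = [];
let crashOnce = true;
const flaky = (unit: string): void => {
  if (unit === "T02" && crashOnce) {
    crashOnce = false;
    throw new Error("simulated crash");
  }
  ran.push(unit);
};

try {
  runWithCheckpoints(disk, ["T01", "T02", "T03"], flaky);
} catch {
  // The process died here; nothing in memory survives.
}
runWithCheckpoints(disk, ["T01", "T02", "T03"], flaky);
```

After the restart, T01 is not re-run and the loop picks up at T02 without any human input, which is the invisibility the paragraph above asks for.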

**User stays in the loop via gates, not via interrupts.** Discussion gates, write gates, budget ceilings, and approval prompts are the designed points of human interaction. The agent should not need to ask for help in the middle of a task.
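Gates of this kind can be modeled as pure checks that run before an action rather than as questions put to the user mid-task. The following is a hedged sketch: `GateConfig`, `checkWrite`, the example paths, and the unitless budget number are all hypothetical and only illustrate the idea of a write gate, protected files, and a budget ceiling.

```typescript
type Decision = { allowed: true } | { allowed: false; reason: string };

// Hypothetical gate configuration; field names are illustrative only.
interface GateConfig {
  writeGateClosed: boolean;  // e.g. closed while a discussion is in progress
  protectedPaths: string[];  // files or directories the agent may not modify
  budgetCeiling: number;     // maximum spend, in whatever unit the system tracks
}

// Decide whether a write may proceed. The checks are deterministic, so the
// agent never has to stop and ask; it is either allowed or told why not.
function checkWrite(cfg: GateConfig, path: string, spent: number): Decision {
  if (cfg.writeGateClosed) {
    return { allowed: false, reason: "write gate closed" };
  }
  if (cfg.protectedPaths.some((p) => path === p || path.startsWith(p + "/"))) {
    return { allowed: false, reason: "protected file" };
  }
  if (spent >= cfg.budgetCeiling) {
    return { allowed: false, reason: "budget ceiling reached" };
  }
  return { allowed: true };
}

const gateConfig: GateConfig = {
  writeGateClosed: false,
  protectedPaths: ["SPEC.md", "docs/adr"], // hypothetical locations
  budgetCeiling: 10,
};

const okWrite = checkWrite(gateConfig, "src/index.ts", 2);
const specWrite = checkWrite(gateConfig, "SPEC.md", 2);
const overBudget = checkWrite(gateConfig, "src/index.ts", 10);
```

A denied decision carries its reason, so the refusal can be logged or surfaced at the next designed interaction point instead of interrupting the task.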

## Tradeoffs

| Choice | What we gave up | Why |
|--------|----------------|-----|
| Fresh session per unit | Conversational continuity across units | Quality and predictability over convenience |
| State on disk (not in memory) | Speed of in-memory state | Crash recovery and multi-process visibility |
| Write gate during queue | Faster iteration in planning | Safety: prevents accidental file mutations during discussion |
| Protected files (ADRs, SPEC.md) | Agent autonomy over architecture docs | Human oversight over durable decisions |
| Serial execution default | Throughput | Correctness before parallelism; parallel locking is deferred debt |