Mikael Hugo 16ff608d80 feat: implement ADR-001 gitignore split and fill placeholder docs

Gitignore (core change):
- Remove stale blanket .sf/ entries from .gitignore (migrated to
  .git/info/exclude on 2026-04-29, never cleaned up)
- gitignore.ts: split SF_RUNTIME_EXCLUSION_PATTERNS into two modes —
  SF_SYMLINK_EXCLUSION_PATTERNS (blanket .sf for symlink repos where
  git cannot traverse the symlink) and SF_RUNTIME_EXCLUSION_PATTERNS
  (granular runtime-only patterns for directory repos, enabling
  .sf/milestones/ and other durable planning artifacts to be tracked)
- ensureGitInfoExclude() now detects symlink vs directory and writes
  the correct patterns, handling transitions between modes cleanly
- ADR-001 status: Proposed → Accepted

Docs:
- Fill 11 placeholder scaffold docs with real SF-specific content:
  PLANS, DESIGN, PRODUCT_SENSE, QUALITY_SCORE, RELIABILITY, SECURITY,
  design-docs/index.md, exec-plans/active, exec-plans/completed,
  exec-plans/tech-debt-tracker, records/index
- Add records note: docs/records/2026-05-01-repo-vcs-and-notifications.md
- ADR-008 status: Accepted → Proposed (deferred — not applicable to
  current usage model where Claude Code assists externally, not as a
  Pi provider inside SF's dispatch loop)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-05-01 22:32:28 +02:00

Product Sense

The Core Thesis

Autonomous execution is the end gate. SF exists to take a multi-phase software project — a milestone with slices and tasks — and run it to completion without human intervention, producing a clean git history, passing tests, and a deployable artifact.

Every design decision should be evaluated against this question: does it make autonomous execution more reliable, more observable, or more recoverable?

User Goals

- Hand off a milestone and have it complete without babysitting
- Know the agent won't make irreversible mistakes (write gates, protected files, budget ceilings)
- Resume after a crash without losing work (state-on-disk, crash recovery)
- See what the agent did and why (trace files, decision register, records keeper)
- Steer mid-run without breaking the loop (message queue, steering gate)

Non-Goals

- Being a chat interface: use the Pi interactive mode for exploratory conversation
- Replacing CI: SF triggers verification but does not replace your existing CI pipeline
- Working without context: SF needs a spec, a roadmap, and a task plan; it does not invent work from nothing

What Good Product Judgment Looks Like

Fresh context per unit, not accumulated context. Each task gets a new session with exactly the context it needs pre-injected (task plan, slice plan, prior summaries, relevant skills). This prevents quality degradation from context accumulation — one of the primary failure modes of naive LLM agents on long projects.
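As a rough illustration of pre-injected context, the sketch below assembles a prompt for one unit from curated inputs only. The `UnitContext` shape and `buildUnitPrompt` are hypothetical names chosen for this example; SF's real types and prompt layout may differ.

```typescript
// Hypothetical shape: the four inputs named in the paragraph above.
interface UnitContext {
  taskPlan: string;
  slicePlan: string;
  priorSummaries: string[];
  skills: string[];
}

// Build the prompt for a single unit from scratch. No transcript from an
// earlier session is carried over; only the curated inputs are included.
function buildUnitPrompt(ctx: UnitContext): string {
  const sections: Array<[string, string]> = [
    ["Task plan", ctx.taskPlan],
    ["Slice plan", ctx.slicePlan],
    ["Prior summaries", ctx.priorSummaries.join("\n")],
    ["Relevant skills", ctx.skills.join("\n")],
  ];
  return sections
    .filter(([, body]) => body.trim().length > 0) // drop empty sections
    .map(([title, body]) => `## ${title}\n${body}`)
    .join("\n\n");
}
```

Because the function takes everything it needs as arguments, two units with the same inputs get the same prompt regardless of what ran before them.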

State machine, not LLM guessing. The loop is deterministic: read STATE.md → validate → dispatch → post-unit → verify → advance. The LLM executes work inside a unit; it does not decide what the next unit is. Separating orchestration from execution keeps the system predictable.
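The loop described above can be sketched as a fixed transition table. This is a minimal model under stated assumptions: the phase names mirror the sequence in the text, while `LoopState`, `step`, and the idea of an integer unit index are illustrative inventions, not SF's actual code.

```typescript
type Phase = "read" | "validate" | "dispatch" | "postUnit" | "verify" | "advance";

// The next phase is a pure lookup; no model output is consulted.
const NEXT: Record<Phase, Phase> = {
  read: "validate",
  validate: "dispatch",
  dispatch: "postUnit",
  postUnit: "verify",
  verify: "advance",
  advance: "read",
};

// Hypothetical state: the current phase plus a pointer into an ordered unit list.
interface LoopState {
  phase: Phase;
  unitIndex: number;
}

// One deterministic transition. Entering "advance" moves the pointer to the
// next unit; every other transition leaves it unchanged.
function step(state: LoopState): LoopState {
  const phase = NEXT[state.phase];
  const unitIndex = phase === "advance" ? state.unitIndex + 1 : state.unitIndex;
  return { phase, unitIndex };
}
```

In this model the LLM would only be invoked inside the `dispatch` phase; which unit runs next is decided entirely by `step`.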

Spec-first. No behavior change without a failing test first. No completion without a real consumer. This is the iron law — not a suggestion. An agent that completes tasks without specs is just making things up.

Crash recovery must be invisible. A crashed session should resume within seconds with no visible data loss. If recovery requires human intervention, it is a product failure.

User stays in the loop via gates, not via interrupts. Discussion gates, write gates, budget ceilings, and approval prompts are the designed points of human interaction. The agent should not need to ask for help in the middle of a task.
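One way to picture gates as fixed checkpoints is a pure check that runs before every write. The sketch below is hypothetical: `GateConfig`, `checkWrite`, the dollar-denominated budget, and the path-prefix rule for protected files are assumptions made for illustration, not SF's real gate implementation.

```typescript
type GateDecision = { allowed: true } | { allowed: false; reason: string };

// Hypothetical configuration covering three of the gates named above.
interface GateConfig {
  writeGateClosed: boolean;  // e.g. closed while a discussion is in progress
  protectedPaths: string[];  // files or directories the agent may not modify
  budgetCeilingUsd: number;  // assumed unit; spending at or above this stops writes
}

// Decide whether a write may proceed. Checks run in a fixed order so the
// same inputs always yield the same decision and reason.
function checkWrite(path: string, spentUsd: number, cfg: GateConfig): GateDecision {
  if (cfg.writeGateClosed) {
    return { allowed: false, reason: "write gate closed" };
  }
  const isProtected = cfg.protectedPaths.some(
    (p) => path === p || path.startsWith(p + "/"),
  );
  if (isProtected) {
    return { allowed: false, reason: "protected file" };
  }
  if (spentUsd >= cfg.budgetCeilingUsd) {
    return { allowed: false, reason: "budget ceiling reached" };
  }
  return { allowed: true };
}
```

A denied decision would surface to the user at the gate, rather than the agent stopping mid-task to ask for help.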

Tradeoffs

Choice	What we gave up	Why
Fresh session per unit	Conversational continuity across units	Quality and predictability over convenience
State on disk (not in memory)	Speed of in-memory state	Crash recovery and multi-process visibility
Write gate during queue	Faster iteration in planning	Safety: prevents accidental file mutations during discussion
Protected files (ADRs, SPEC.md)	Agent autonomy over architecture docs	Human oversight over durable decisions
Serial execution default	Throughput	Correctness before parallelism; parallel locking is deferred debt
