Tacit knowledge files captured in tracked .sf/ artifacts (per ADR-001):
- PRINCIPLES.md: durable design philosophy, with PDD as the canonical
change method (purpose / consumer / contract / failure boundary /
evidence / non-goals / invariants — all 7 fields required)
- TASTE.md: what good code looks like in SF — verbose names, domain >
layer, behavior-is-the-spec, minimum change, idempotent dispatch,
fail-non-fatal, structured blocker format, PDD discipline
- ANTI-GOALS.md: 25 rule-coded anti-patterns (SF001-SF025) covering bare
errors, type lies, magic strings, partial migrations, Ralph-loop retry,
central federation, MCP between first-party services, implementation-
mirror tests, coding-before-PDD-fields, happy-path-only, etc.
Translated from ACE-coder's STYLEGUIDE.md as the model. Anchored on
purpose-driven-development as the canonical change method. These three
files plus KNOWLEDGE.md plus DECISIONS.md are the tacit-knowledge layer
auto-injected into every agent context (via system-context.ts mtime cache).
Closes the "smart human gap" identified in this session: the difference
between SF behaving like a competent engineer in this codebase vs. a
generic LLM is the accumulated tacit knowledge available to the agent.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds explicit Tier 1 / Tier 2 / Tier 3 escalation guidance to every system
prompt. Tier 1 = code lookup (sift, source, .sf/DECISIONS.md). Tier 2 =
external lookup (WebSearch, WebFetch, Context7, MCP servers). Tier 3 = ask
user (in auto/step) or exit-with-structured-blocker (in autonomous).
- bootstrap/system-context.ts: buildEscalationPolicyBlock injected at top
of SF system-context section, mode-aware via isCanAskUser()
- bootstrap/ask-gate.ts: gateAskUserQuestions() runtime safety net,
blocks ask_user_questions in autonomous mode at the tool layer with a
structured rejection that escalates back to Tier 1/2
- tests: 18 escalation-policy + 16 ask-gate, all pass
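The tool-layer gate described above can be sketched roughly as follows. This is a minimal illustration, not the code in bootstrap/ask-gate.ts: the `RunMode` union, the `AskGateResult` shape, the rejection wording, and the `mode` parameter on `isCanAskUser` are all assumptions made for the example.

```typescript
// Hypothetical shapes: the real gateAskUserQuestions() signature is not shown in this log.
type RunMode = "auto" | "step" | "autonomous";

interface AskGateResult {
  allowed: boolean;
  // Present only when the call is blocked.
  rejection?: {
    reason: string;
    nextSteps: string[];
  };
}

// Only the interactive modes may reach Tier 3 "ask user".
function isCanAskUser(mode: RunMode): boolean {
  return mode === "auto" || mode === "step";
}

// Returns a structured rejection in autonomous mode so the agent is
// pointed back at Tier 1/2 research instead of waiting on a human.
function gateAskUserQuestions(mode: RunMode): AskGateResult {
  if (isCanAskUser(mode)) {
    return { allowed: true };
  }
  return {
    allowed: false,
    rejection: {
      reason: "ask_user_questions is unavailable in autonomous mode",
      nextSteps: [
        "Tier 1: search the code and .sf/DECISIONS.md",
        "Tier 2: consult external documentation",
        "Tier 3: exit with a structured blocker if still stuck",
      ],
    },
  };
}
```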
Implements the user's "solve it like a smart human, not Ralph Wiggum"
philosophy: in autonomous mode the agent must do the research a competent
human would do, and only stop with a blocker when even a human couldn't
proceed.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- project-research-policy.ts: replace throw stubs with real imports from
schemas/parsers.ts — parseProject and parseRequirements now live
- deep-project-setup-policy.ts: remove redundant inline stubs now that
schemas/validate.ts is ported
- tests/runtime-root-redirect.test.ts: new test for root redirect
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- sf-home.ts: new — resolves ~/.sf/ path and SF home dir helpers (port of gsd-home.ts)
- memory-embeddings.ts: new — embedding helpers for memory similarity search
- component-types.ts: new — Component, ComponentManifest, ComponentHook type defs
- workflow-install.ts: new — workflow installation from local/remote sources
- auto-post-unit.ts: clearEvidenceFromDisk after successful verification
- routing-history.ts: add cost-per-token tracking to routing decisions
- workflow-{manifest,templates}.ts: hardening sweep
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Performance fix from audit:
- bootstrap/system-context.ts: cachedReadFile() with mtime-keyed in-process
cache for KNOWLEDGE.md (global + project) and ARCHITECTURE.md. Eliminates
3-4 sync readFileSync calls per agent turn on the common case where these
files haven't changed. Live edits still picked up via mtime invalidation.
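A minimal sketch of an mtime-keyed read cache, assuming the cache also compares file size and returns `undefined` for a missing file (neither detail is stated in this log). The `statOrUndefined` helper and the `CacheEntry` shape are illustrative names.

```typescript
import * as fs from "node:fs";

interface CacheEntry {
  mtimeMs: number;
  size: number;
  content: string;
}

// In-process cache, keyed by file path.
const fileCache = new Map<string, CacheEntry>();

function statOrUndefined(filePath: string): fs.Stats | undefined {
  try {
    return fs.statSync(filePath);
  } catch {
    return undefined;
  }
}

// Returns the file content, re-reading only when mtime or size changed.
// A stat call still happens on every invocation; only the read is skipped.
function cachedReadFile(filePath: string): string | undefined {
  const stat = statOrUndefined(filePath);
  if (stat === undefined) {
    fileCache.delete(filePath);
    return undefined;
  }
  const hit = fileCache.get(filePath);
  if (hit && hit.mtimeMs === stat.mtimeMs && hit.size === stat.size) {
    return hit.content;
  }
  const content = fs.readFileSync(filePath, "utf8");
  fileCache.set(filePath, { mtimeMs: stat.mtimeMs, size: stat.size, content });
  return content;
}
```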
Docstring sweep on the notification + detection cluster:
- headless-events.ts: 17 JSDoc blocks (exit codes + every classification fn)
- notification-store.ts, notification-overlay.ts, notification-widget.ts,
notifications.ts: ~17 blocks
- detection.ts, codebase-generator.ts: ~5 blocks
Typecheck clean. 3/3 perf tests pass.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Last batch from the parallel swarm session: docstring tweaks,
verification-gate doc additions, workflow-reconcile and worktree-command
follow-ups, doctor-environment cleanup. Typecheck clean.
Most of the session work landed in earlier commits (8be8f4774, 3045538cb,
038938f2a, ed85252fc, 4f4b584e5, etc.); this commit is the residual
working-tree state after all swarms reported.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Touches auto.ts, auto/loop.ts, preferences.ts, safety/git-checkpoint.ts,
token-counter.ts, tools/complete-slice.ts, verification-gate.ts,
workflow-logger.ts, workflow-migration.ts, plus new
tests/record-promoter.test.ts.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- src/headless-events.ts: add case "reload" → EXIT_RELOAD (12).
EXIT_RELOAD sentinel was defined but unused — "reload" status fell
through to EXIT_ERROR (1).
- src/resources/extensions/sf/notification-store.ts:109: use <= for
dedup window so a second identical notification at exactly
DEDUP_WINDOW_MS still gets suppressed (was off-by-one at boundary).
- src/resources/extensions/sf/definition-loader.ts: pending docstring
tweaks from autonomous sweep.
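The two fixes above can be illustrated with a short sketch. `EXIT_ERROR` (1) and `EXIT_RELOAD` (12) come from this message; `EXIT_SUCCESS`, the `"success"` status, the function names, and the 5000 ms window are placeholders, since the real values are not given here.

```typescript
// EXIT_ERROR and EXIT_RELOAD values are from the commit text; EXIT_SUCCESS is illustrative.
const EXIT_SUCCESS = 0;
const EXIT_ERROR = 1;
const EXIT_RELOAD = 12;

function mapStatusToExitCode(status: string): number {
  switch (status) {
    case "success":
      return EXIT_SUCCESS;
    case "reload":
      return EXIT_RELOAD; // without this case, "reload" hits the default branch
    default:
      return EXIT_ERROR;
  }
}

// Placeholder value; the real DEDUP_WINDOW_MS is not stated in this log.
const DEDUP_WINDOW_MS = 5000;

// Inclusive comparison: a repeat arriving exactly at the window edge
// still counts as a duplicate and is suppressed.
function isWithinDedupWindow(lastSeenMs: number, nowMs: number): boolean {
  return nowMs - lastSeenMs <= DEDUP_WINDOW_MS;
}
```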
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- new worktree-root.ts / worktree-session-state.ts: track and restore
original project root after /worktree merge or /worktree return
- new tools/skip-slice.ts: cascade skip to tasks in the slice so milestone
completion isn't blocked by pending tasks (#4375)
- auto/run-unit.ts: anchor cwd to basePath before newSession() captures it
(GAP-10) — prevents tool runtime / system prompt from rooting on drifted
cwd from async_bash, background jobs, or prior unit cleanup
- safety/git-checkpoint.ts: harden HEAD-rev-parse against execFileSync
errors, surface stderr properly
- broad JSDoc / docstring pass across the rest of the SF extension surface
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`/sf autonomous full` (or `--full`) plumbs through to AutoSession.fullAutonomy,
to be consumed at milestone-complete to skip the human-review pause and
auto-merge + chain to the next milestone. Git revert is the safety net
(see ADR-019/021 conversation on autonomy and reversibility).
Plumbing path:
- commands/handlers/auto.ts: parses `full` / `--full` modifier, threads
fullAutonomy through launchAuto options
- commands/catalog.ts: completion entries for `full` and `--full`
- auto.ts: startAuto and startAutoDetached accept fullAutonomy in options;
startAuto pins it on the session up-front so resume paths preserve it
- auto/session.ts: AutoSession.fullAutonomy field with full docstring
Behavior change is staged: the milestone-complete consumer that auto-merges
and chains is intentionally not in this commit (parallel session is active
in auto-post-unit.ts and auto/loop.ts; will land in a follow-up).
Also adds JSDoc to the functions on the touched path:
- handleAutoCommand (full command-family doc)
- launchAuto (headless vs detached routing)
- startAutoDetached (fire-and-forget rationale, why it diverges from startAuto)
- AutoSession.fullAutonomy (full inline doc)
Typecheck clean.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
loop.ts:
- saveStuckState on main dev path (was only on custom-engine path — P1 fix)
- Add pid to stuck-state JSON to prevent test pollution across process runs
- Use atomicWriteSync in saveCustomVerifyRetryCounts for crash-safety
- Add enforceMinRequestInterval + call before both runUnitPhaseViaContract sites
- Update s.lastRequestTimestamp from requestDispatchedAt on each unit
session.ts:
- Add lastRequestTimestamp and lastUnitAgentEndMessages fields
phases.ts:
- Add consecutiveSessionTimeouts + exponential-backoff auto-resume (up to 3x)
for session-creation timeouts before pausing for manual review
- Add loadEvidenceFromDisk after resetEvidence to rehydrate evidence on restart
- Add USER_DRIVEN_DEEP_UNITS + isAwaitingUserInput guard to skip artifact
verification when a deep-planning unit is paused awaiting user input
- Store s.lastUnitAgentEndMessages after each unit run
- Add requestDispatchedAt to runUnitPhase return type
evidence-collector.ts: add loadEvidenceFromDisk export
auto-post-unit.ts: add USER_DRIVEN_DEEP_UNITS set + re-export isAwaitingUserInput
user-input-boundary.ts: port from gsd2 (isAwaitingUserInput + approval helpers)
run-unit.ts: capture requestDispatchedAt at API dispatch time
kernel.ts: remove redundant !legacyFallback guard (enabled already encodes it)
tests/uok-kernel-path.test.ts: add SF_UOK_AUDIT_ENVELOPE env var assertions
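The pacing and backoff logic listed above might look roughly like this. It is a sketch under assumptions: the interval, the base delay, and the doubling schedule are illustrative, and the function names are not the ones in loop.ts or phases.ts. Only the cap of three auto-resumes comes from this message.

```typescript
// Interval and base delay are illustrative; only the 3x cap is from the commit text.
const MIN_REQUEST_INTERVAL_MS = 1000;
const BASE_BACKOFF_MS = 2000;
const MAX_AUTO_RESUMES = 3;

// How long to wait before dispatching so that requests stay at least
// MIN_REQUEST_INTERVAL_MS apart. Zero when no wait is needed.
function remainingIntervalMs(
  lastRequestTimestamp: number | undefined,
  nowMs: number,
): number {
  if (lastRequestTimestamp === undefined) return 0;
  const elapsed = nowMs - lastRequestTimestamp;
  return Math.max(0, MIN_REQUEST_INTERVAL_MS - elapsed);
}

// Delay before the next auto-resume, doubling per consecutive timeout.
// Returns undefined once the retry budget is spent, which is the point
// where the loop would pause for manual review.
function nextResumeDelayMs(consecutiveSessionTimeouts: number): number | undefined {
  if (consecutiveSessionTimeouts < 1 || consecutiveSessionTimeouts > MAX_AUTO_RESUMES) {
    return undefined;
  }
  return BASE_BACKOFF_MS * 2 ** (consecutiveSessionTimeouts - 1);
}
```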
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The test was checking for a literal single-line ternary in auto-post-unit.ts,
but the formatter naturally renders the same ternary multi-line. The semantic
content is identical; the test was failing on whitespace alone.
Normalize runs of whitespace before substring-matching so the assertion
survives prettier/biome formatting changes.
After this fix: 39/39 uok tests pass.
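A minimal sketch of the normalization, assuming a helper pair like the one below (the names are illustrative, and the ternary in the usage note is made up rather than taken from auto-post-unit.ts):

```typescript
// Collapse every run of whitespace (spaces, tabs, newlines) to one space.
function normalizeWhitespace(text: string): string {
  return text.replace(/\s+/g, " ").trim();
}

// Substring match that ignores how the formatter wrapped the code.
function containsIgnoringLayout(source: string, snippet: string): boolean {
  return normalizeWhitespace(source).includes(normalizeWhitespace(snippet));
}
```

With this, a source string holding `isFull\n  ? "full"\n  : "staged"` matches the single-line snippet `isFull ? "full" : "staged"`.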
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- loop.ts: add DispatchContract type, AutoLoopOptions, resolveDispatchNodeKind,
runUnitPhaseViaContract — kernel path routes unit execution through
ExecutionGraphScheduler; legacy path passes through directly
- loop.ts: export runUokKernelLoop (contract=uok-scheduler) and
runLegacyAutoLoop (contract=legacy-direct)
- auto-loop.ts: re-export both new loop functions
- auto.ts: use runUokKernelLoop/runLegacyAutoLoop at both call sites
- phases.ts: use uokFlags.planningFlow for plan gate (was bypassing
legacyFallback via raw pref read)
- auto-dispatch.ts: use hasFinalizedMilestoneContext for execution-entry
context check (picks up SF_PROJECT_ROOT artifact fallback)
- tests: port uok-writer, uok-parity-report, uok-loop-adapter-writer,
uok-kernel-path test files from gsd2 — all 8 tests pass
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Follow-up to commit 39e2dc70c. Two small improvements that surfaced when
the parallel Phase D subagent finished and inspected the worktree:
- commands-scaffold-sync.ts:
- Tighten ScaffoldKeeperFn to match Phase D's actual dispatcher signature
(basePath, ctx) => Promise<number>. Define a local minimal
ScaffoldKeeperCtxShape for the lazy loader so we don't form a hard
import dependency on scaffold-keeper.ts.
- Remove duplicated "Upgradable" line from the report table; keep only
"Pending", since ADR-021 §10 names that as the user-facing label.
- tests/scaffold-keeper.test.ts: better-typed notify stub; covers Phase E
arg-parser helpers (parseScaffoldSyncArgs, matchesOnly, applyOnlyFilter).
Typecheck clean. 49/49 scaffold tests pass.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Phase D: scaffold-keeper background agent
- scaffold-keeper.ts: dispatchScaffoldKeeperIfNeeded fires async after milestone
completion and on stopAuto cleanup. Detects editing-drift items, writes
<file>.proposed artifacts (template-only stub for now; later wires the
records-keeper skill subagent for code-as-fact merging), emits a structured
approval_request notification with stable dedupe_key so repeated runs don't
spam the user.
- Wired into auto-post-unit.ts and auto.ts:stopAuto via fire-and-forget so
the auto loop is never blocked by scaffold work.
- Failure modes non-fatal: try/catch around the dispatch, errors logged via
logWarning("scaffold").
Phase E: /sf scaffold sync command (escape hatch)
- commands-scaffold-sync.ts: parseScaffoldSyncArgs + handleScaffoldSync.
- Flags:
    --dry-run           report what would change, no writes
    --include-editing   run scaffold-keeper synchronously for editing-drift items
    --only=<glob>       scope to a path glob (suffix/prefix match)
- Wired into the SF command system via commands-bootstrap.ts, commands/catalog.ts,
and commands/handlers/ops.ts following the existing /sf <verb> pattern.
- Reuses ensureAgenticDocsScaffold from Phase C — doesn't reimplement sync logic.
Doctor finding (checkScaffoldFreshness) refined to reference the new command.
Tests: 8 new cases in scaffold-keeper.test.ts. All 49 scaffold tests green.
Together with Phases A-C, this completes ADR-021. Documents are now versioned,
upgrades are automatic for the safe cases, and editing-drift surfaces through
.proposed artifacts and structured notifications. The scaffold-keeper agent
body is currently a template-only stub; replacing it with a real records-keeper
subagent dispatch is a follow-up that the architecture now enables.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Phase C (automatic silent sync) had no dedicated tests when committed.
Added 8 cases covering:
- ensureAgenticDocsScaffold on empty dir creates files with markers
- old-version pending marker silently re-renders to current
- editing-drift file left untouched
- legacy unmarked file that matches the archive is promoted to pending
- migrateLegacyScaffold idempotency
Total scaffold test count: 41 (was 33).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The user-visible "automatic" upgrade behavior. After this lands, projects
pointed at SF silently catch up to the current scaffold without any user
action — for the simple cases.
Drift-aware ensureAgenticDocsScaffold:
- Step 1: migrateLegacyScaffold runs first to promote unmarked-but-recognised
files via SCAFFOLD_VERSION_ARCHIVE hash matching
- Step 2: per-template walk:
- Missing → create + stamp + manifest entry (existing behavior)
- Present, marker, state=pending, version drifted, hash matches stamp
→ silent re-render with current template + restamp (NEW)
- Editing/completed/customized → leave alone (Phase D handles editing-drift)
- Silent contract: no stdout/stderr, only logWarning("scaffold") for I/O
failures. All failure modes non-fatal.
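The per-template walk in Step 2 can be read as a small decision function. The sketch below is an illustration only: the `TemplateStatus` fields and the `ScaffoldAction` names are assumptions, and the real ensureAgenticDocsScaffold also stamps files and updates the manifest.

```typescript
type ScaffoldState = "pending" | "editing" | "completed" | "customized";

// Hypothetical summary of what the walk knows about one template file.
interface TemplateStatus {
  exists: boolean;
  hasMarker: boolean;
  state?: ScaffoldState;
  versionDrifted: boolean;
  hashMatchesStamp: boolean;
}

type ScaffoldAction = "create" | "rerender" | "leave";

function decideScaffoldAction(status: TemplateStatus): ScaffoldAction {
  if (!status.exists) return "create";
  // Only a marked, still-pending file whose body matches its stamp is
  // known to be untouched by the user and safe to overwrite silently.
  const untouched =
    status.hasMarker && status.state === "pending" && status.hashMatchesStamp;
  if (untouched && status.versionDrifted) return "rerender";
  // Editing, completed, customized, or already current: leave alone.
  return "leave";
}
```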
SCAFFOLD_VERSION_ARCHIVE bootstrap:
- Lazily seeded with current SF version's body hashes from SCAFFOLD_FILES
- Future SF releases append entries when templates change so legacy projects
can match against any prior version
checkScaffoldFreshness doctor finding (ADR-021 §8):
- Surfaces missing/upgradable/editing-drift counts as "scaffold_drift" warning
- Auto-fix runs ensureAgenticDocsScaffold to handle missing+pending
- Non-fatal warning, never blocks dispatch
- Editing-drift left for Phase D (scaffold-keeper background agent)
Tests pass: 33/33 across scaffold-versioning + scaffold-drift suites.
Typecheck clean.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Subagent split scaffold tests into scaffold-versioning.test.ts (Phase A)
and scaffold-drift.test.ts (Phase B). Fixed an ESM-incompatible
require("node:fs") in one drift test that was breaking with
--experimental-strip-types. All 33 tests pass.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Wire plan-gate in runDispatch() and verification gate in runFinalize()
- Add planningFlow gate persistence in guided-flow.ts
- Add execution-graph gate event in auto-dispatch.ts
- Flip all UOK feature flags from opt-in (=== true) to on-by-default (?? true)
- Port dispatch-envelope.ts, parity-report.ts, writer.ts from gsd2
- Add DispatchReasonCode, UokDispatchEnvelope, WriterToken, WriteRecord,
WriteSequence, DispatchExplanation to contracts.ts
- Add "refine" to UokNodeKind
- Extend auto-worktree.ts with workspace.after_create hook support
- Add workspace.after_create to preferences-types and preferences-validation
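The flag flip from opt-in to on-by-default comes down to how an unset preference is read. A minimal sketch, with hypothetical helper names:

```typescript
// Opt-in: only an explicit `true` enables the feature.
function isEnabledOptIn(value: boolean | undefined): boolean {
  return value === true;
}

// On-by-default: an unset value enables the feature,
// while an explicit `false` still disables it.
function isEnabledByDefault(value: boolean | undefined): boolean {
  return value ?? true;
}
```

The only input where the two differ is `undefined`, so projects that never set a UOK flag are the ones whose behavior changes.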
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Phase 4-D fixes from the Phase 3 validation report. ace-coder is a
uv-managed Python repo with Rust crates in subdirectories; SF was
mis-detecting it in ways that would have failed every autonomous
verification.
1. detectPackageManager: return undefined when no root package.json
(previously hallucinated "npm" as default, leaking into reports)
2. detectVerificationCommands: only synthesize npm runner when
package.json actually present at root
3. ROOT_ONLY_PROJECT_FILES: expanded with Cargo.toml, go.mod,
pyproject.toml, setup.py, pom.xml, pubspec.yaml, Package.swift,
mix.exs — these are root-only signals; nested instances are
handled explicitly by emitter logic
4. Cargo block: distinguishes workspace-root vs single-crate-root vs
nested-only-crates layouts; emits per-crate bash loop for the last
case (mirrors the Go multi-module branch pattern)
5. pyprojectHasTool: matches both [tool.X] and [tool.X.subkey] so
ace-coder's [tool.ruff.lint] / [tool.ruff.format] are detected
6. Makefile branch: skip `make test` when (a) test command already
emitted by another block, or (b) the test target depends on
_verify_nix or similar nix-shell gates (ace-coder's case)
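The header matching in item 5 could be done with a regular expression along these lines. This is a sketch, not the implementation in the detection module: the signature (raw pyproject.toml text plus a tool name) and the `escapeRegExp` helper are assumptions, and it deliberately ignores array-of-tables headers such as `[[tool.x]]`.

```typescript
// Escape characters that have special meaning inside a RegExp.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// True when the text has a table header [tool.<name>] or
// [tool.<name>.<subkey>...] at the start of a line.
function pyprojectHasTool(pyprojectText: string, tool: string): boolean {
  const header = new RegExp(
    `^\\s*\\[tool\\.${escapeRegExp(tool)}(\\.[^\\]]+)?\\]`,
    "m",
  );
  return header.test(pyprojectText);
}
```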
After these fixes, detectProjectSignals on ace-coder yields the
expected output: no spurious "npm", per-crate cargo loops, ruff/pyright
detected, no nix-gated `make test`.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Phase 1 — close SF-side polish gaps:
- codebase-generator: distinguish uv/poetry/pdm in Python stack-signals;
surface configured tooling (ruff/mypy/pyright) when config files exist
- doctor-environment: new checkPythonEnvironment — detects uv/poetry/pdm
via lockfile, verifies binary on PATH, warns with install hint when missing
- doctor-environment: new checkSiftAvailable — recommends sift install for
repos > 5000 source files when not on PATH
- tech-debt-tracker: documented future memory-as-sub-extension extraction
(defer until real backend-swap requirement)
Phase 2 — internal wire architecture:
- ADR-020: singularity-grpc as shared schema repo; gRPC + typed clients
for first-party services; MCP façade only at external-tool boundary
- ADR-019: trimmed MCP scope section to a 3-line summary linking to ADR-020
to avoid the wire-format table living in two places
- design-docs/index.md: ADR-020 added to ADR table
These changes make SF stronger for autonomous work on Python repos
(particularly ace-coder) and capture the internal wire architecture
decision as a durable ADR before any singularity-grpc code lands.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds __pycache__, .pytest_cache, .mypy_cache, .ruff_cache, .tox, .eggs,
and htmlcov to RECURSIVE_SCAN_IGNORED_DIRS so SF doesn't walk into them
when scanning project files. These directories can contain thousands of
files in mature Python projects and were slowing down detection / scan
operations on Python codebases.
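A sketch of how a scanner can skip these directories by name. The directory names are the ones added in this commit; the `listProjectFiles` walker is a hypothetical stand-in for SF's scanner, and the real set presumably holds other entries as well.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Only the entries named in this commit; the real set is likely larger.
const RECURSIVE_SCAN_IGNORED_DIRS = new Set([
  "__pycache__",
  ".pytest_cache",
  ".mypy_cache",
  ".ruff_cache",
  ".tox",
  ".eggs",
  "htmlcov",
]);

// Iterative directory walk that never descends into an ignored directory.
function listProjectFiles(root: string): string[] {
  const files: string[] = [];
  const stack: string[] = [root];
  while (stack.length > 0) {
    const dir = stack.pop() as string;
    for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
      const fullPath = path.join(dir, entry.name);
      if (entry.isDirectory()) {
        if (!RECURSIVE_SCAN_IGNORED_DIRS.has(entry.name)) stack.push(fullPath);
      } else if (entry.isFile()) {
        files.push(fullPath);
      }
    }
  }
  return files.sort();
}
```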
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
ADR-019 framing corrections:
- SF is single-machine, single-user, single-repo by design — character, not
limitation. Stays a standalone app permanently; does not get absorbed into ACE.
- Phase 6 reframed: "pattern transfer" not "orchestration convergence." ACE
ports patterns from SF, both apps remain independent.
- Phase 2 reframed: SF stays local. Federation is an ACE concern; SF doesn't
wire memory-store remote-mode against singularity-memory.
Detection strengthened for Python (priority for ace-coder work):
- Detect uv / poetry / pdm and prefix verification commands accordingly
- Emit ruff check when configured (file or [tool.ruff] in pyproject.toml)
- Emit mypy / pyright when configured — skip when no config to avoid false fails
- pyprojectHasTool helper for [tool.<name>] section detection
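A minimal sketch of the prefixing step, assuming detection by lockfile name at the repo root and a fixed priority order (uv, then poetry, then pdm). The function names and the order are assumptions; `uv run`, `poetry run`, and `pdm run` are the tools' own subcommands.

```typescript
type PythonManager = "uv" | "poetry" | "pdm";

// Standard lockfile names for each tool; the priority order is an assumption.
const LOCKFILES: Array<[string, PythonManager]> = [
  ["uv.lock", "uv"],
  ["poetry.lock", "poetry"],
  ["pdm.lock", "pdm"],
];

function detectPythonManager(rootFiles: ReadonlySet<string>): PythonManager | undefined {
  for (const [lockfile, manager] of LOCKFILES) {
    if (rootFiles.has(lockfile)) return manager;
  }
  return undefined;
}

// Run the command inside the manager's environment when one was detected.
function prefixCommand(command: string, manager: PythonManager | undefined): string {
  return manager ? `${manager} run ${command}` : command;
}
```

For a repo containing `uv.lock`, `prefixCommand("ruff check", detectPythonManager(files))` yields `uv run ruff check`.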
Detection strengthened for Rust:
- cargo fmt --check (fastest, catches style first)
- cargo check (type-only, faster than test)
- cargo clippy -- -D warnings (warnings as errors)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
When metadata is present, skip the text fallback entirely — the emitter
declared the event kind explicitly and the regex should not override it.
Add regression test file covering all acceptance criteria: metadata-first
classification, legacy fallback, dedupe_key dedup, and the key invariant
that automated notices cannot produce terminal/blocked signals.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace brittle string-matching in headless-events.ts with structured
source/kind/blocking/dedupe_key metadata on notify() events. String
matching is preserved as a fallback for the ~940 untagged call sites.
- Add NotificationMetadata type to headless-types.ts (canonical definition)
- Extend rpc-types.ts notify event with optional metadata field
- Extend ExtensionUIContext.notify() signature with optional 3rd arg
- Pass metadata through RPC notify implementation in rpc-mode.ts
- Update headless-events.ts: isTerminalNotification, isBlockedNotification,
isMilestoneReadyNotification, isPauseNotification all check metadata first
- Update notification-store.ts: store metadata on NotificationEntry; use
metadata.dedupe_key as dedup key when provided (falls back to message hash)
- Update notify-interceptor.ts to thread metadata through to store + original
- Tag critical emit sites with structured metadata:
    stopAuto                    → { kind: "terminal" } (+ blocking: true when reason includes "block")
    pauseAuto                   → { kind: "terminal", blocking: true }
    guided-flow milestone ready → { kind: "approval_request", blocking: true }
- Update notification-overlay.ts to prefer metadata.source for [label] display
- Add 17-test regression suite (notification-event-model.test.ts)
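The metadata-first pattern with a text fallback can be sketched as below. The four field names come from this message; their types, the legacy regex, the reading of "blocked" as `blocking === true`, and the use of the raw message as the fallback dedup key (the real store falls back to a message hash) are assumptions for illustration.

```typescript
// Field names are from the commit text; the types are assumed.
interface NotificationMetadata {
  source?: string;
  kind?: string;
  blocking?: boolean;
  dedupe_key?: string;
}

// Hypothetical legacy pattern; the real regexes in headless-events.ts are not shown.
const LEGACY_BLOCKED_PATTERN = /\bblocked\b/i;

function isBlockedNotification(message: string, metadata?: NotificationMetadata): boolean {
  if (metadata !== undefined) {
    // The emitter declared the event explicitly, so the text is never consulted.
    return metadata.blocking === true;
  }
  // Untagged call site: fall back to string matching.
  return LEGACY_BLOCKED_PATTERN.test(message);
}

// Prefer the stable key when the emitter supplied one.
function dedupKey(message: string, metadata?: NotificationMetadata): string {
  return metadata?.dedupe_key ?? `msg:${message}`;
}
```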
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add harness/ directory to SF repo (specs/, evals/, graders/ with AGENTS.md)
and seed harness/specs/bootstrap.md (agent-legibility verification)
- Extend agentic-docs-scaffold.ts: new repos get harness/ + ADR-TEMPLATE.md
and just adr / just spec / just harness-spec recipes via justfile
- Sync SF_RUNTIME_PATTERNS (gitignore.ts canonical) → git-service.ts and
worktree-manager.ts: add audit/, exec/, model-benchmarks/, reports/,
notifications.jsonl, routing-history.json, self-feedback.jsonl, repo-meta.json,
and milestone continue-marker patterns
- Inject ARCHITECTURE.md into system prompt via loadArchitectureBlock() in
system-context.ts (capped at 8,000 chars, after KNOWLEDGE block)
- Write real ARCHITECTURE.md for this repo (system map, .sf/ layout, key flows)
- Add ADR-TEMPLATE.md to docs/design-docs/
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>