Commit graph

98 commits

Author SHA1 Message Date
Mikael Hugo
1115437cec feat(swarm): event streaming + outcome derivation for runUnitViaSwarm
- Forward onEvent through swarm-dispatch → agent-runner → runSubagent
- Collect toolcall_end events in runUnitViaSwarm to build real tool-use blocks
- Detect checkpoint tool outcome for accurate unit completion signal
- Add headless.ts graceful shutdown (async signal handler, 2.5s timeout)
- RPC client stop() now awaits flush and propagates stop to child sessions

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-15 04:54:58 +02:00
Mikael Hugo
12f5eb2279 feat(triage): wire --apply CLI + canonical resolve_issue evidence kinds
Three coupled changes that together complete the operator-facing
--apply surface for sf headless triage:

1. headless.ts: parse --apply from commandArgs and forward to
   handleTriage. The triage option flow now distinguishes inspect
   (--list, --json), one-shot (--run), and orchestrated apply
   (--apply) cleanly.

2. help-text.ts: triage subcommand line + examples block now document
   the --apply mode (triage-decider → rubber-duck pipeline).

3. bootstrap/db-tools.js: resolve_issue tool now accepts the full
   canonical evidence-kind set instead of hardcoding "agent-fix":
   - agent-fix (default; commit-based fix evidence)
   - human-clear (stale, superseded, false positive, intentional close)
   - promoted-to-requirement (with required requirement_id)
   The tool surfaces a clear error when promoted-to-requirement is
   used without requirement_id. The promptGuidelines are updated to walk
   callers through choosing the right kind.

   self-feedback-db.test.mjs extended with coverage for all three
   evidence kinds + the missing-requirement_id rejection path.

Together these make sf headless triage --apply genuinely useful: the
agent can produce a plan with any outcome, rubber-duck reviews it,
and the runner applies via resolve_issue with the right evidence
kind per decision.
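
The evidence-kind rules listed in item 3 could be sketched roughly as below. This is a minimal illustration, not the code in bootstrap/db-tools.js: only the three kind strings, the agent-fix default, and the requirement_id rule come from this message; the type names, `validateEvidence`, and the error wording are hypothetical.

```typescript
// Hypothetical shapes: only the kind strings and the requirement_id rule
// are taken from the commit message.
type EvidenceKind = "agent-fix" | "human-clear" | "promoted-to-requirement";

interface ResolveIssueArgs {
  id: string;
  kind?: string; // assumed to default to "agent-fix" when omitted
  requirement_id?: string; // needed for "promoted-to-requirement"
}

type Validation =
  | { ok: true; kind: EvidenceKind }
  | { ok: false; error: string };

const EVIDENCE_KINDS: readonly EvidenceKind[] = [
  "agent-fix",
  "human-clear",
  "promoted-to-requirement",
];

function validateEvidence(args: ResolveIssueArgs): Validation {
  const requested = args.kind ?? "agent-fix";
  const kind = EVIDENCE_KINDS.find((k) => k === requested);
  if (kind === undefined) {
    return { ok: false, error: `unknown evidence kind: ${requested}` };
  }
  if (kind === "promoted-to-requirement" && !args.requirement_id) {
    return {
      ok: false,
      error: "promoted-to-requirement requires requirement_id",
    };
  }
  return { ok: true, kind };
}
```

Returning a tagged result instead of throwing keeps the rejection path easy to assert on in tests; whether the real tool reports errors this way is an assumption.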

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 17:23:10 +02:00
Mikael Hugo
34521814cc feat(headless): sf headless triage --run — dispatch via @singularity-forge/ai
Adds runTriage to self-feedback-drain.js, mirroring runReflection in
reflection.js: provider-agnostic dispatch via @singularity-forge/ai's
completeSimple, dependency-injectable for tests, 8-minute timeout race,
clean-finish detection on the canonical "Self-feedback triage complete"
terminator.
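
A rough sketch of the timeout race and clean-finish check described above. The terminator string is the one quoted in this message; `Complete`, `DispatchResult`, `isCleanFinish`, and `runWithTimeout` are hypothetical names, and the injected `complete` function stands in for the real provider call so the sketch stays dependency-free.

```typescript
// Hypothetical names throughout; only the terminator string and the
// idea of racing the dispatch against a timeout come from the commit.
const TERMINATOR = "Self-feedback triage complete";

// Injected so tests can supply a fake instead of a real provider call.
type Complete = (prompt: string) => Promise<string>;

interface DispatchResult {
  ok: boolean;
  content: string;
  cleanFinish: boolean;
  error?: string;
}

// Assumption: the terminator only appears when the model finished its output.
function isCleanFinish(text: string): boolean {
  return text.includes(TERMINATOR);
}

async function runWithTimeout(
  complete: Complete,
  prompt: string,
  timeoutMs: number,
): Promise<DispatchResult> {
  let rejectOnTimeout: (err: Error) => void = () => {};
  const timeout = new Promise<never>((_resolve, reject) => {
    rejectOnTimeout = reject;
  });
  const timer = setTimeout(
    () => rejectOnTimeout(new Error(`timed out after ${timeoutMs} ms`)),
    timeoutMs,
  );
  try {
    // Whichever settles first wins: the completion or the timeout.
    const content = await Promise.race([complete(prompt), timeout]);
    return { ok: true, content, cleanFinish: isCleanFinish(content) };
  } catch (err) {
    const error = err instanceof Error ? err.message : String(err);
    return { ok: false, content: "", cleanFinish: false, error };
  } finally {
    clearTimeout(timer); // avoid keeping the process alive after a fast finish
  }
}
```

Under these assumptions the 8-minute limit would be passed as `8 * 60 * 1000`, and a test would pass a stub `complete` that resolves, rejects, or never settles.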

`sf headless triage --run [--model provider/modelId]` now dispatches the
canonical triage prompt and writes the model's decision text to
.sf/triage/decisions/<ts>.md. Operators apply the decisions (resolve_issue
calls, code edits) — a tool-enabled variant that lets the model close
entries directly is follow-up work.

Default model: google-gemini-cli/gemini-3-pro-preview (matches
DEFAULT_REFLECTION_MODEL).

Continues chipping away at sf-mp4rxkwb-l4baga in bounded steps: triage now has
both an operator-pipe path (default) and a one-shot dispatch path (--run).
The full unit-type registration that wires this into the autonomous
dispatcher's idle path is the remaining slice of that entry.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 07:29:29 +02:00
Mikael Hugo
8fde12301f feat(headless): sf headless triage — operator-driven self-feedback drain
Adds a deterministic, turn-independent path to drain the self-feedback
queue. Modes:
  - default: emits the canonical buildInlineFixPrompt() output for
    piping into any model (sf headless triage | sf headless -p -)
  - --list:  human-readable digest sorted by impact↓ effort↑ ts↑
  - --json:  structured candidate list for tooling
  - --max N: cap candidates
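
The --list ordering (impact descending, then effort ascending, then timestamp ascending) and the --max cap might look like the comparator below. This is a sketch under assumptions: the `Candidate` shape with numeric `impact`, `effort`, and `ts` fields and the function names are hypothetical, since the message does not show how candidates are stored.

```typescript
// Hypothetical candidate shape; the commit only names the three sort keys.
interface Candidate {
  id: string;
  impact: number; // higher sorts first
  effort: number; // lower sorts first
  ts: number; // older sorts first
}

function compareCandidates(a: Candidate, b: Candidate): number {
  // Each term is 0 on a tie, so || falls through to the next key.
  return b.impact - a.impact || a.effort - b.effort || a.ts - b.ts;
}

function sortForDigest(
  candidates: readonly Candidate[],
  max?: number,
): Candidate[] {
  const sorted = [...candidates].sort(compareCandidates); // copy, do not mutate input
  return max === undefined ? sorted : sorted.slice(0, max);
}
```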

Why this matters (partial step toward sf-mp4rxkwb-l4baga): the existing
session_start drain queues triage as `triggerTurn:true,
deliverAs:"followUp"`. When autonomous mode bails at milestone
validation before any turn runs, the followUp gets dropped and the
queue stays unprocessed. This command sidesteps that by rendering the
prompt synchronously to stdout — operators can pipe it into any model
without depending on autonomous-loop turn semantics. The full
unit-type registration that fixes the underlying dispatcher gap is
larger work tracked in the parent entry.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 07:04:01 +02:00
Mikael Hugo
62b19d7ba4 feat(reflection): wire LLM dispatch (sf headless reflect --run)
Phase 1B of the reflection layer: complete the operator-driven loop by
adding actual LLM dispatch. Phase 1A (commit e161a59e2) shipped the
corpus assembler + prompt template + the prompt-emit operator surface.
This commit wires the dispatch end so `sf headless reflect --run`
produces a real report on disk without manual model piping.

Why shell out to the gemini CLI and not SF's provider abstraction:
reflection is a single-prompt one-shot inference. Going through SF's
full agent dispatch would require a session, model registry, tool
registration, recovery shell — overkill for "render this prompt,
capture text." The gemini CLI handles auth (~/.gemini/oauth_creds.json),
Code Assist project discovery, and protocol drift on SF's behalf.
Subprocess cost is paid once per reflection (rare).

Implementation:

- reflection.js: runGeminiReflection(prompt, options) spawns
  `gemini --yolo --model <model> -p "<directive>"` and pipes the giant
  rendered template via stdin (gemini -p reads stdin and appends).
  Returns { ok, content, cleanFinish, exitCode, error, stderr }; never
  throws. Defaults to gemini-3-pro-preview (0% used on AI Ultra,
  strongest agentic model with quota). 8-minute timeout.

  cleanFinish detected by REFLECTION_COMPLETE terminator (emitted by
  the prompt template's output contract) — operator gets a warning when
  the report is truncated.

- headless-reflect.ts: --run flag triggers dispatch + report write
  via writeReflectionReport. --model overrides the default. Errors
  surface as JSON or text per --json. Successful runs emit the report
  path on stdout; failures emit error + truncated stderr.

- help-text.ts: documents --run and --model flags.

- Tests (4 new, 13 total): use a fake `gemini` binary on PATH to
  exercise the spawn path without real OAuth/network — covers
  ok+cleanFinish, non-zero exit, hang/timeout, missing-terminator.

All 1538 SF extension tests pass; typecheck clean.

Phase 2 follow-up (still gated on sf-mp4rxkwb-l4baga
triage-not-a-first-class-unit-type landing): reflection-pass becomes a
real autonomous-loop unit type, milestone-close auto-triggers it, the
report's `Recommended new self-feedback entries` section gets parsed
and the entries auto-filed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 04:33:16 +02:00
Mikael Hugo
e161a59e2f feat(reflection): add Phase 1A reflection layer (corpus + prompt + sf headless reflect)
Addresses self-feedback entry sf-mp4uzvcd-pazg6v
(architecture-defect:no-reflection-layer-over-self-feedback-corpus): SF
detected symptoms and triaged individual entries but had no layer that
reasoned about the corpus to recognize recurring structural patterns.
The same architectural pressure expressed itself across multiple entries
with different exact-kind strings; nothing escalated the pattern to a
class. The cognitive work fell on the operator.

This commit ships Phase 1A — the data-assembly + prompt half of the
reflection layer + an operator-driven entry point. Phase 1B (LLM dispatch
via the autonomous loop as a real unit type) lands once
sf-mp4rxkwb-l4baga (triage-not-a-first-class-unit-type) is in.

Files:
- src/resources/extensions/sf/reflection.js (new)
  - assembleReflectionCorpus(basePath): bundles open + recent-resolved
    self-feedback (full json), last 50 commits via git log, milestone +
    slice + task state, all milestone validation verdicts, and prior
    reflection report into one struct. Returns null on prerequisite
    failure (DB closed) so callers downgrade gracefully.
  - renderReflectionCorpusBrief(corpus): renders the corpus into a
    markdown brief the LLM consumes in one turn.
  - writeReflectionReport(basePath, content): persists to
    .sf/reflection/<timestamp>-report.md so next pass detects "what
    changed since last reflection."

- src/resources/extensions/sf/prompts/reflection-pass.md (new)
  - {{include:working-directory}} prefix.
  - Reasoning order: cluster by structural shape (not exact kind),
    identify recurring patterns, identify commit/ledger gaps, identify
    stale validation drift, identify the deepest architectural concern,
    compare against prior report.
  - Output contract: structured markdown report with named sections,
    terminator REFLECTION_COMPLETE for clean-finish detection.
  - Constraints: don't fix anything (reflection layer not executor),
    don't resolve entries without commit-SHA evidence, don't invent IDs.

- src/headless-reflect.ts (new) — sf headless reflect [--json]
  - Pre-opens the project DB via auto-start.openProjectDbIfPresent
    (one-shot bypass path doesn't run the full SF agent bootstrap).
  - Default: emits the rendered prompt brief (template + corpus) for
    operators to pipe into any model. Lets the corpus-assembly layer
    ship and validate before the LLM-dispatch layer is wired.
  - --json: emits raw corpus snapshot for tooling.

- src/headless.ts: registers the new "reflect" command after the
  existing usage block.
- src/help-text.ts: documents it in the headless command list.

- src/resources/extensions/sf/tests/reflection.test.mjs (new, 9 tests):
  null-when-DB-closed; collects open + recent-resolved; excludes >30d
  resolutions; captures milestone/slice/task tree; captures validation
  verdicts; commits returned as array (best-effort tmpdir is ok); brief
  renders all major sections; entry IDs/severity/kind appear in brief;
  writeReflectionReport round-trips through assembleReflectionCorpus's
  previousReport read.

Live smoke verified: sf headless reflect against the real .sf/sf.db
returns 15 open + 23 recent-resolved entries, 50 commits, 2 milestones,
1 validation file (correctly surfacing M001's stale needs-attention
verdict against actual 5/5 slices done — exactly the case that
motivated this layer).

Total: +848 LOC, full SF extension suite (1534 tests) passes,
typecheck clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 04:27:29 +02:00
Mikael Hugo
383e495085 feat(headless,gemini-cli): add sf headless usage + unify gemini quota path
Adds a machine-readable headless surface for live LLM-provider usage and
unifies the gemini-cli quota fetch through one helper, removing the
duplication that existed between usage-bar.js and the new package.

1. snapshotGeminiCliAccount in @singularity-forge/google-gemini-cli-provider

   - Single source of truth for { projectId, userTierId, userTierName,
     paidTier, models[] } via setupUser + retrieveUserQuota.
   - Dedups buckets per modelId, keeping the worst (lowest remainingFraction)
     so consumers always see the most-restrictive window. Code Assist
     sometimes returns multiple buckets per model; the pessimistic choice
     is what every consumer needs.
   - discoverGeminiCliModels(cwd?) wraps it for catalog-cache callers that
     only need the IDs.

2. sf headless usage subcommand

   - New src/headless-usage.ts handler. text (default) and --json output.
     Uses the package's snapshot directly — no RPC child, no jiti
     gymnastics — matching the shape of headless-uok-status / headless-doctor.
   - Wired into src/headless.ts after the doctor block.
   - Help text adds the command line.

3. usage-bar.js refactored to delegate

   - fetchGeminiUsage no longer imports gemini-cli-core directly. It calls
     snapshotGeminiCliAccount and reshapes the result into the existing
     { provider, displayName, windows[] } UI contract.
   - Eliminates the duplicate setupUser + retrieveUserQuota code path.
   - The fast existsSync(~/.gemini/oauth_creds.json) pre-flight stays
     so unauth'd users get a friendly message without paying for OAuth
     bootstrap.

4. Model registry refactor (separate track committed alongside)

   - src/resources/extensions/sf/model-registry.ts (new) consolidates
     canonical model identity, capability tier, and generation tags into
     one source of truth that auto-model-selection, benchmark-selector,
     and model-router now consume instead of maintaining parallel maps.
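
The per-model bucket dedup from item 1 (keep the lowest remainingFraction) can be illustrated with a short sketch. The `QuotaBucket` shape and the function name are hypothetical; only `modelId`, `remainingFraction`, and the keep-the-worst rule come from this message.

```typescript
// Hypothetical bucket shape; real Code Assist buckets likely carry more fields.
interface QuotaBucket {
  modelId: string;
  remainingFraction: number; // 0 = exhausted, 1 = untouched
}

function keepWorstBucketPerModel(
  buckets: readonly QuotaBucket[],
): QuotaBucket[] {
  const worst = new Map<string, QuotaBucket>();
  for (const bucket of buckets) {
    const seen = worst.get(bucket.modelId);
    // Keep the most restrictive window for each model.
    if (seen === undefined || bucket.remainingFraction < seen.remainingFraction) {
      worst.set(bucket.modelId, bucket);
    }
  }
  return Array.from(worst.values());
}
```

Picking the minimum means a consumer never reports more headroom than the tightest window allows.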

All 1487 tests pass (151 files); typecheck clean for both the package
and the SF extensions.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 03:42:53 +02:00
Mikael Hugo
d22df007a7 fix(headless): correct log message to show actual command format
The log message said '/sf ${command}' but the actual command sent is
'/${command}' (without the sf namespace). Fix to match actual dispatch.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-12 17:04:11 +02:00
Mikael Hugo
0426aafad2 fix(headless): drop /sf prefix so typed commands route through extension dispatch
headless.ts was sending `/sf {subcommand} {args}` to the RPC session, but
commands are registered without the sf namespace (e.g. 'todo', 'autonomous').
_tryExecuteExtensionCommand parsed commandName='sf', found no match, and the
LLM handled the request instead of the typed backend.

Fix: send `/{subcommand} {args}` directly — matches what registerSFCommands
registers and what the TUI already uses.
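
The routing failure described above can be reproduced with a toy dispatcher. This is a sketch, not the real `_tryExecuteExtensionCommand`: `parseSlashCommand`, `dispatch`, the `Handler` type, and the `"fallback:llm"` marker are hypothetical and only model the behaviour the message describes, where the first token after the slash is taken as the command name.

```typescript
type Handler = (args: string) => string;

// Assumed parsing rule: the first token after "/" is the command name.
function parseSlashCommand(
  input: string,
): { name: string; args: string } | null {
  if (!input.startsWith("/")) return null;
  const body = input.slice(1).trim();
  if (body === "") return null;
  const space = body.indexOf(" ");
  return space === -1
    ? { name: body, args: "" }
    : { name: body.slice(0, space), args: body.slice(space + 1).trim() };
}

function dispatch(registry: Map<string, Handler>, input: string): string {
  const parsed = parseSlashCommand(input);
  const handler = parsed ? registry.get(parsed.name) : undefined;
  // No typed handler found: the request falls through to the LLM.
  return parsed && handler ? handler(parsed.args) : "fallback:llm";
}

// Commands are registered without the "sf" namespace, as in the commit.
const registry = new Map<string, Handler>([
  ["todo", (args) => `todo:${args}`],
]);

const oldForm = dispatch(registry, "/sf todo list"); // name parsed as "sf"
const newForm = dispatch(registry, "/todo list"); // name parsed as "todo"
```

With this registry, `oldForm` misses the typed handler while `newForm` reaches it, which mirrors the before and after of the fix.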

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-12 15:55:46 +02:00
Mikael Hugo
797db16ae8 feat(sf): S03/T04 — add UOK gate health to sf headless status uok
Adds a new `sf headless status uok` subcommand that queries
gate-run stats and circuit-breaker state from sf.db and formats
them as a markdown table or JSON (--json flag).

- src/headless-uok-status.ts: handler that loads sf-db-gates
  directly (avoids the unimported getDistinctGateIds in gate-runner)
- src/headless.ts: bypass RPC, route 'status uok' to handler
- src/help-text.ts: document the new subcommand
- tests/headless-uok-status.test.mjs: 19 node:test cases

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-11 18:31:03 +02:00
Mikael Hugo
02a4339a51 refactor: rename pi-* packages to forge-native names (Phase 1)
Rename all four packages/pi-* directories to forge-native names,
stripping the 'pi' identity and establishing forge's own:

- packages/pi-coding-agent → packages/coding-agent
- packages/pi-ai → packages/ai
- packages/pi-agent-core → packages/agent-core
- packages/pi-tui → packages/tui

Package names updated:
- @singularity-forge/pi-coding-agent → @singularity-forge/coding-agent
- @singularity-forge/pi-ai → @singularity-forge/ai
- @singularity-forge/pi-agent-core → @singularity-forge/agent-core
- @singularity-forge/pi-tui → @singularity-forge/tui

All import references, bare string references, path references,
internal variable names (_bundledPi*), and dist files updated.
@mariozechner/pi-* third-party compat aliases preserved.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-10 11:28:01 +02:00
Mikael Hugo
b93409cfa4 feat(headless): add -y / --yolo CLI flag to sf headless
- HeadlessOptions.yolo added
- parseHeadlessArgs handles --yolo and -y (short form)
- SF_YOLO=1 is injected into the RPC child env when flag is set
- AutoSession._loadPersistedModeState() checks SF_YOLO=1 and
  auto-activates YOLO mode (build+autonomous+deep+unrestricted)
  on session startup

Usage:
  sf headless -y autonomous       # YOLO + autonomous mode
  sf headless --yolo next         # YOLO + run next unit
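
A minimal sketch of the flag handling and env injection listed above, assuming the flag may appear anywhere in the argument list. `parseYoloFlag`, `childEnv`, and the `rest` field are hypothetical; the real `parseHeadlessArgs` and `HeadlessOptions` handle many more options.

```typescript
// Hypothetical, trimmed-down options: the real HeadlessOptions has more fields.
interface YoloOptions {
  yolo: boolean;
  rest: string[]; // everything that is not the yolo flag
}

function parseYoloFlag(argv: readonly string[]): YoloOptions {
  const rest: string[] = [];
  let yolo = false;
  for (const arg of argv) {
    if (arg === "--yolo" || arg === "-y") yolo = true;
    else rest.push(arg);
  }
  return { yolo, rest };
}

// Build the child environment; SF_YOLO=1 is only added when the flag is set.
function childEnv(
  base: Record<string, string | undefined>,
  opts: YoloOptions,
): Record<string, string | undefined> {
  return opts.yolo ? { ...base, SF_YOLO: "1" } : { ...base };
}
```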

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-09 19:05:32 +02:00
Mikael Hugo
995a57335b fix(surfaces): stamp correct surface in AutoSession + /mode yolo headless command
Surface stamp:
- AutoSession._loadPersistedModeState() now calls detectSurface() to stamp
  the correct surface (headless/web/tui) from env vars on every startup.
  Persisted surface value was the previous launch's surface — wrong when
  switching between TUI and headless on the same project.
  SF_HEADLESS=1 → 'headless', SF_WEB_BRIDGE_TUI=1 → 'web', else 'tui'.
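
The env-to-surface mapping above could be written as follows. `detectSurface` is the name used in this message, but the signature (taking the environment as a parameter) and the precedence when both variables are set are assumptions made for the sketch.

```typescript
type Surface = "headless" | "web" | "tui";

// Assumption: SF_HEADLESS is checked first, following the order in the commit.
function detectSurface(env: Record<string, string | undefined>): Surface {
  if (env["SF_HEADLESS"] === "1") return "headless";
  if (env["SF_WEB_BRIDGE_TUI"] === "1") return "web";
  return "tui";
}
```

Deriving the surface from the environment on every startup, rather than trusting the persisted value, is what avoids carrying over the previous launch's surface.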

/mode yolo:
- handleModeCommand now recognises 'yolo' as a toggleable special case.
  Headless callers can now run: sf headless --command '/mode yolo'
  Same behaviour as Ctrl+Y: full-autonomy slam + settingsManager bypass.
  /mode catalog description updated to list 'yolo' as an option.

Documentation:
- headless.ts /query and /doctor short-circuits annotated as intentional
  architecture trade-offs with a note to keep them in sync with the extension.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-09 17:03:33 +02:00
Mikael Hugo
b5893d1c28 Make SF direct command surface baseline
2026-05-08 01:34:07 +02:00
Mikael Hugo
6fc054e7c3 sf snapshot: uncommitted changes after 49m inactivity
2026-05-08 01:07:24 +02:00
Mikael Hugo
89677b7e9b sf snapshot: uncommitted changes after 110m inactivity
2026-05-08 00:17:47 +02:00
Mikael Hugo
deeb4dbd4e sf snapshot: uncommitted changes after 61m inactivity
2026-05-07 16:39:39 +02:00
Mikael Hugo
343ee5c89e sf snapshot: uncommitted changes after 158m inactivity
2026-05-07 10:01:56 +02:00
Mikael Hugo
5157223e4c fix: record requested headless command
2026-05-07 00:40:05 +02:00
Mikael Hugo
2d465b11fd test: add comprehensive Phase 1 coverage for dispatch loop (48 tests)
- Add metrics.test.ts: 21 tests for unit outcome recording, model performance tracking, fire-and-forget safety, persistence, error handling
- Add triage-self-feedback.test.ts: 27 tests for report classification, confidence thresholds, auto-fix, deduplication, severity categorization, async safety

Purpose: Increase coverage of critical autonomous dispatch paths from 40% to 60%+.
Covers fire-and-forget patterns (metrics recording and auto-fix application must not
block dispatch), concurrent recording safety, graceful degradation on error.

Tests validate:
  ✓ Unit outcome recording without blocking
  ✓ Per-task-type model performance tracking
  ✓ Fire-and-forget error handling (metrics/fixes don't break dispatch)
  ✓ Concurrent metric recording race conditions
  ✓ Persistence atomicity
  ✓ Report classification by type/severity
  ✓ Confidence thresholds (0.85-0.95 per type)
  ✓ Auto-fix deduplication and prioritization
  ✓ Async triage without blocking dispatch

Phase 1 complete: 48 tests, all passing.
Phase 2: Recovery path hardening (recovery/forensics)
Phase 3: Property-based FSM testing (fast-check)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-07 00:38:19 +02:00
Mikael Hugo
553ba23b89 integrate: hook quick wins into UOK dispatch loop
Integration of 3 quick wins into existing UOK infrastructure:

1. Model Learning (Quick Win #2) → metrics.js
   - Record outcomes to model-learner for per-task-type performance tracking
   - Hook: recordUnitOutcome() now calls ModelLearner.recordOutcome()
   - Fire-and-forget: never blocks outcome recording on learning failure
   - Enables adaptive model routing decisions in downstream gates

2. Self-Report Fixing (Quick Win #1) → triage-self-feedback.js
   - Auto-fix high-confidence reports (>0.85) in applyTriageReport()
   - Hook: After triage and requirement promotion, apply auto-fixes
   - Fire-and-forget: never blocks report application on fix failure
   - Returns reportsAutoFixed count for triage metrics

3. Knowledge Injection (Quick Win #3) → already integrated in auto-prompts.js
   - Already active in execute-task prompt template
   - Semantic matching with graceful degradation

All integration points:
- Fire-and-forget: learning/fixing failures never block dispatch
- UOK-native: use existing outcome recording, db, gates
- Backward compatible: applyTriageReport now async, but callers handle it
- No new dependencies: all modules already in codebase
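
The fire-and-forget rule repeated in these integration points can be sketched as a small helper. `fireAndForget` and `recordOutcomeSketch` are hypothetical names, not the functions in metrics.js; the sketch only shows the property the message claims, namely that a failing secondary hook never breaks the primary write.

```typescript
// Runs a secondary task without letting its failure reach the caller.
function fireAndForget(
  task: () => unknown,
  onError: (err: unknown) => void = () => {},
): void {
  try {
    // Promise.resolve adopts a returned promise, so rejections are handled too.
    Promise.resolve(task()).catch(onError);
  } catch (err) {
    onError(err); // the task threw synchronously
  }
}

// Hypothetical stand-in for outcome recording with a learning hook.
function recordOutcomeSketch(
  outcomes: string[],
  outcome: string,
  learn: (outcome: string) => unknown,
): void {
  outcomes.push(outcome); // primary write happens first
  fireAndForget(() => learn(outcome)); // learner failures are swallowed
}
```

In practice `onError` would probably log the failure; the default no-op is only there to keep the sketch short.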

Testing: 2934 tests pass (no regressions from integration)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-06 22:34:41 +02:00
Mikael Hugo
42c651d106 fix: show verbose prompt traces
2026-05-06 06:45:15 +02:00
Mikael Hugo
a95e2947df fix: reconcile sift warmup observability
2026-05-06 06:22:09 +02:00
Mikael Hugo
76b218762b fix: harden sf autonomous runtime
2026-05-06 06:02:46 +02:00
Mikael Hugo
a1fd6cfc05 fix: separate headless transport from autonomous mode
2026-05-06 02:24:15 +02:00
Mikael Hugo
46db1e95ef refactor: remove legacy autonomous aliases
2026-05-05 18:47:50 +02:00
Mikael Hugo
ab6cad4c84 fix: clean provider surfaces and core build
2026-05-05 16:31:53 +02:00
Mikael Hugo
4c98cb8c33 fix: make autonomous mode canonical
2026-05-05 15:42:10 +02:00
Mikael Hugo
2d9c2018af chore: clean repo quality gates
2026-05-05 14:55:11 +02:00
Mikael Hugo
f11c877224 style: format repository with biome
2026-05-05 14:31:16 +02:00
Mikael Hugo
ed4a4bc93a chore: commit current worktree state
2026-05-04 19:28:39 +02:00
Mikael Hugo
44204e0424 chore(sf): add optional token telemetry
2026-05-02 11:50:34 +02:00
Mikael Hugo
26be0b4153 fix(sf): stabilize headless auto flow
2026-05-02 11:34:41 +02:00
Mikael Hugo
12538bbfa3 sf snapshot: pre-dispatch, uncommitted changes after 32m inactivity
2026-05-02 11:25:51 +02:00
Mikael Hugo
1990d2a2ee feat: Renamed textBuffer to assistantTextBuffer in headless.ts and vali…
- src/headless.ts
- .sf/REQUIREMENTS.md

SF-Task: S01/T04
2026-05-02 08:48:44 +02:00
Mikael Hugo
3a3ea29c51 chore(sf): test backfill, parse helpers, parallel session pickups
2026-05-02 02:26:01 +02:00
Mikael Hugo
dda9793cd6 feat(sf): port sf-home, memory-embeddings, component-types, workflow-install + sweep
- sf-home.ts: new — resolves ~/.sf/ path and SF home dir helpers (port of gsd-home.ts)
- memory-embeddings.ts: new — embedding helpers for memory similarity search
- component-types.ts: new — Component, ComponentManifest, ComponentHook type defs
- workflow-install.ts: new — workflow installation from local/remote sources
- auto-post-unit.ts: clearEvidenceFromDisk after successful verification
- routing-history.ts: add cost-per-token tracking to routing decisions
- workflow-{manifest,templates}.ts: hardening sweep

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-02 02:22:13 +02:00
Mikael Hugo
9e8361da23 chore(sf): minor self-feedback + workflow-template tweaks
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-02 02:21:13 +02:00
Mikael Hugo
df8fca8cc7 feat(sf): workflow-plugins port, sf-db expansions, worktree-manager hardening
- workflow-plugins.ts: new — unified plugin discovery, 4 execution modes
  (oneshot, yaml-step, markdown-phase, auto-milestone), hot-reload support
- sf-db.ts: add milestone ghosting/reservation, hook_runs table, memory
  embedding schema, subscription token usage tracking
- worktree-manager.ts: active-worktree tracking, health check cascade,
  dangling-ref pruning, sync-on-switch
- atomic-write.ts: add writeJsonAtomic convenience wrapper
- workflow-logger.ts: add "plugins" LogComponent variant
- workflow-templates.ts: template hot-reload + validation sweep
- scaffold-versioning.ts: versioned drift detection improvements
- preferences-migrations.ts: v3→v4 subscription cost fields migration
- self-feedback.ts: feedback loop dedup window
- headless.ts: EXIT_RELOAD + notification dedup boundary (final)
- tests/auto-vs-autonomous.test.ts: expand coverage for both code paths

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-02 02:20:14 +02:00
Mikael Hugo
abb3d76ffa chore(sf): minor sweep — gate-registry dedup, token-counter, worktree-health
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-02 02:18:03 +02:00
Mikael Hugo
86026c9e4f feat(sf): final UOK parity pass + secondary agent sweep
Evidence-collector (matches gsd2 exactly):
- recordToolCall now takes toolCallId as first arg (parallel-call fix)
- recordToolResult matches by toolCallId, not last-unresolved heuristic
- saveEvidenceToDisk now atomic tmp-rename JSON (not appendFileSync JSONL)
- clearEvidenceFromDisk added; resetEvidence takes no args
- stricter isEvidenceArray validator
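
Matching results by toolCallId rather than by "last unresolved call" matters once calls run in parallel. The class below is a hedged sketch of that idea only: `recordToolCall` and `recordToolResult` are names from this message, but the `ToolEvidence` shape, the extra parameters, the boolean return value, and `snapshot` are hypothetical.

```typescript
// Hypothetical evidence entry; the real collector records more detail.
interface ToolEvidence {
  toolCallId: string;
  tool: string;
  result?: string;
}

class EvidenceCollector {
  private readonly calls = new Map<string, ToolEvidence>();

  recordToolCall(toolCallId: string, tool: string): void {
    this.calls.set(toolCallId, { toolCallId, tool });
  }

  // Returns false when no call with that id was recorded.
  recordToolResult(toolCallId: string, result: string): boolean {
    const entry = this.calls.get(toolCallId);
    if (entry === undefined) return false;
    entry.result = result;
    return true;
  }

  // Copies entries so callers cannot mutate collector state.
  snapshot(): ToolEvidence[] {
    return Array.from(this.calls.values(), (e) => ({ ...e }));
  }
}
```

With two calls in flight, results that arrive out of order still land on the right entry because the lookup key is the id, not the arrival position.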

auto/loop.ts:
- PID guard in loadStuckState prevents cross-test state pollution
- pid field added to saveStuckState payload
- saveCustomVerifyRetryCounts uses atomicWriteSync (crash-safe)

auto/run-unit.ts:
- chdir failure marked isTransient:true (dir may exist on retry)

auto/session.ts:
- canAskUser field added with reset() support

auto/phases.ts:
- currentUnit = null in closeoutAndStop (no stale refs after stop)

bootstrap/provider-error-resume.ts:
- resetTransientRetryState injectable via ProviderErrorResumeDeps

Secondary sweep (worktree, workflow, token-counter, verification-gate,
activity-log, doctor-environment, json-persistence, scaffold-keeper tests)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-02 02:17:21 +02:00
Mikael Hugo
9db94ed77e chore(sf): residual session work — final consolidation
Last batch from the parallel swarm session: docstring tweaks,
verification-gate doc additions, workflow-reconcile and worktree-command
follow-ups, doctor-environment cleanup. Typecheck clean.

Most of the session work landed in earlier commits (8be8f4774, 3045538cb,
038938f2a, ed85252fc, 4f4b584e5, etc.); this commit is the residual
working-tree state after all swarms reported.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-02 02:17:03 +02:00
Mikael Hugo
f1cef7c476 feat(sf): multi-agent sweep — paths, verification, auto closeout, bootstrap, worktree
- paths.ts: add resolveSliceSummaryPath, resolveCheckpointPath, task-summary helpers
- bootstrap/system-context.ts: worktree active context + codebase-map inject
- auto.ts: plumb autonomousMode flag, startAuto options expansion
- auto/loop.ts: Math.max(0,...) clock-skew guard in enforceMinRequestInterval
- auto/session.ts: add lastUnitAgentEndMessages and PreExecFailure tracking
- auto-post-unit.ts: clearEvidenceFromDisk after verification, isDeterministicPolicyError
- auto-unit-closeout.ts: populate lastPreExecFailure on gate failures
- cache.ts: fix TTL helper arg counts
- codebase-generator.ts: add incremental refresh helpers
- commands/handlers/auto.ts: wire autonomousMode and plan-v2 flags
- context-budget.ts: remove stale context-budget trimming (was dead code)
- dispatch-guard.ts: trim unused guards
- doctor-{environment,runtime-checks}.ts: expand health checks
- execution-instruction-guard.ts: add approval-boundary guard
- gate-registry.ts: de-dup gate registration on reload
- gitignore.ts: add .sf/worktrees to default gitignore
- notification-store.ts: add dedup window + category grouping
- pre-execution-checks.ts: add provider-readiness pre-check
- preferences.ts: subscription cost helpers + allow_flat_rate_providers
- production-mutation-approval.ts: approval-required flag on mutation tools
- state.ts: remove redundant fallback (now handled in deriveState)
- token-counter.ts: subscription token usage tracking
- verification-gate.ts: gate retry on bounded failure class
- workflow-{projections,reconcile,template-compiler,templates}: hardening
- worktree-{command,manager}: path normalization + active-worktree tracking
- tests/verification-evidence.test.ts: new — evidence load/save/clear coverage
- tests/provider-errors.test.ts: add missing provider-delay tests
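
One way to read the Math.max(0,...) clock-skew guard in enforceMinRequestInterval is sketched below. The message does not say which quantity is clamped, so this is an assumption: `remainingDelayMs` and its parameters are hypothetical, and the sketch clamps the elapsed time so that a clock stepping backwards cannot produce a wait longer than the minimum interval.

```typescript
// Hypothetical helper: how long to wait before the next request may be sent.
function remainingDelayMs(
  lastRequestAt: number,
  now: number,
  minIntervalMs: number,
): number {
  // If the wall clock moved backwards, now - lastRequestAt is negative;
  // clamping to 0 keeps the wait bounded by minIntervalMs.
  const elapsed = Math.max(0, now - lastRequestAt);
  // Never return a negative delay once the interval has already passed.
  return Math.max(0, minIntervalMs - elapsed);
}
```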

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-02 02:16:13 +02:00
Mikael Hugo
a611cd5792 feat: introduce repo-vcs skill and add JSDoc annotations across core modules
- Add repository-vcs-context.ts to detect and inject VCS context (Git/Jujutsu)
  into the agent system prompt; wire in repo-vcs bundled skill trigger
- Add src/resources/skills/repo-vcs/ skill for commit, push, and safe-push workflows
- Add JSDoc Purpose/Consumer annotations to app-paths, bundled-extension-paths,
  errors, extension-discovery, extension-registry, headless-types, headless, and traces
- Add justfile and just to flake.nix devShell
- Fill out new-user-onboarding.md spec (Draft) and core-beliefs.md (Status: Accepted)
- Add notification-event-model.md design doc and notification-source-hygiene.md spec

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-01 21:36:32 +02:00
Mikael Hugo
12e7333f1c feat: stabilize autonomous workflow system
2026-05-01 20:18:50 +02:00
Mikael Hugo
2111da8e60 sf snapshot: pre-dispatch, uncommitted changes after 53m inactivity
2026-04-30 19:10:38 +02:00
Mikael Hugo
d8a9d63c87 feat: Replaced bare error writes in cli.ts, headless.ts, and startup-mo…
- src/cli.ts
- src/headless.ts
- src/startup-model-validation.ts

SF-Task: S04/T03
2026-04-30 15:43:29 +02:00
Mikael Hugo
8677e73046 sf snapshot: pre-dispatch, uncommitted changes after 97m inactivity
2026-04-30 15:11:45 +02:00
Mikael Hugo
62d430ab23 Add provider smoke benchmark and headless updates
2026-04-30 10:19:18 +02:00
Mikael Hugo
6ccce42c62 Add headless bootstrap and TODO triage tests
2026-04-30 09:21:24 +02:00