- Forward onEvent through swarm-dispatch → agent-runner → runSubagent
- Collect toolcall_end events in runUnitViaSwarm to build real tool-use blocks
- Detect checkpoint tool outcome for accurate unit completion signal
- Add headless.ts graceful shutdown (async signal handler, 2.5s timeout)
- RPC client stop() now awaits flush and propagates stop to child sessions
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Three coupled changes that together complete the operator-facing
--apply surface for sf headless triage:
1. headless.ts: parse --apply from commandArgs and forward to
handleTriage. The triage option flow now distinguishes inspect
(--list, --json), one-shot (--run), and orchestrated apply
(--apply) cleanly.
2. help-text.ts: triage subcommand line + examples block now document
the --apply mode (triage-decider → rubber-duck pipeline).
3. bootstrap/db-tools.js: resolve_issue tool now accepts the full
canonical evidence-kind set instead of hardcoding "agent-fix":
- agent-fix (default; commit-based fix evidence)
- human-clear (stale, superseded, false positive, intentional close)
- promoted-to-requirement (with required requirement_id)
The tool surfaces a clear error when promoted-to-requirement is
used without requirement_id. The promptGuidelines are updated to walk
callers through choosing the right kind.
self-feedback-db.test.mjs extended with coverage for all three
evidence kinds + the missing-requirement_id rejection path.
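A minimal sketch of the evidence-kind check described in item 3, assuming a plain validation function. Only the three kind names, the agent-fix default, and the requirement_id rule come from this message; `validateResolveIssue`, the argument shape, and the error wording are hypothetical.

```typescript
// Evidence kinds named in the commit message.
type EvidenceKind = "agent-fix" | "human-clear" | "promoted-to-requirement";

const EVIDENCE_KINDS: readonly EvidenceKind[] = [
  "agent-fix",
  "human-clear",
  "promoted-to-requirement",
];

// Hypothetical raw tool input; the real resolve_issue schema is not shown here.
interface ResolveIssueArgs {
  id: string;
  kind?: string;
  requirement_id?: string;
}

type Validation =
  | { ok: true; kind: EvidenceKind; requirementId?: string }
  | { ok: false; error: string };

function isEvidenceKind(value: string): value is EvidenceKind {
  return (EVIDENCE_KINDS as readonly string[]).indexOf(value) !== -1;
}

function validateResolveIssue(args: ResolveIssueArgs): Validation {
  // agent-fix stays the default when no kind is supplied.
  const kind = args.kind ?? "agent-fix";
  if (!isEvidenceKind(kind)) {
    return {
      ok: false,
      error: `unknown evidence kind "${kind}"; expected one of: ${EVIDENCE_KINDS.join(", ")}`,
    };
  }
  if (kind === "promoted-to-requirement") {
    if (!args.requirement_id) {
      return { ok: false, error: "promoted-to-requirement requires requirement_id" };
    }
    return { ok: true, kind, requirementId: args.requirement_id };
  }
  return { ok: true, kind };
}
```

Returning a tagged result instead of throwing keeps the rejection path easy to surface as a tool error, which is the behaviour the tests above are said to cover.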
Together these make sf headless triage --apply genuinely useful: the
agent can produce a plan with any outcome, rubber-duck reviews it,
and the runner applies via resolve_issue with the right evidence
kind per decision.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds runTriage to self-feedback-drain.js, mirroring runReflection in
reflection.js: provider-agnostic dispatch via @singularity-forge/ai's
completeSimple, dependency-injectable for tests, 8-minute timeout race,
clean-finish detection on the canonical "Self-feedback triage complete"
terminator.
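The dispatch shape described above (injected completion function, timeout race, terminator check) might look roughly like the sketch below. The `Complete` type is a stand-in: the real completeSimple signature is not shown in this message, and `runTriageSketch` and `TriageResult` are hypothetical names.

```typescript
// Terminator string quoted in the commit message.
const TRIAGE_TERMINATOR = "Self-feedback triage complete";

// Assumed shape of the injected completion call; tests can pass a fake.
type Complete = (prompt: string) => Promise<string>;

interface TriageResult {
  ok: boolean;
  content: string;
  cleanFinish: boolean;
  error?: string;
}

async function runTriageSketch(
  prompt: string,
  complete: Complete,
  timeoutMs: number = 8 * 60 * 1000, // 8-minute default from the message
): Promise<TriageResult> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`timed out after ${timeoutMs}ms`)),
      timeoutMs,
    );
  });
  try {
    // Whichever settles first wins: the model reply or the deadline.
    const content = await Promise.race([complete(prompt), timeout]);
    return {
      ok: true,
      content,
      cleanFinish: content.includes(TRIAGE_TERMINATOR),
    };
  } catch (err) {
    // Never throws: failures and timeouts come back as a result object.
    return {
      ok: false,
      content: "",
      cleanFinish: false,
      error: err instanceof Error ? err.message : String(err),
    };
  } finally {
    clearTimeout(timer);
  }
}
```

A reply that arrives without the terminator still counts as `ok`, but `cleanFinish` is false so the caller can warn about a possibly truncated decision file.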
`sf headless triage --run [--model provider/modelId]` now dispatches the
canonical triage prompt and writes the model's decision text to
.sf/triage/decisions/<ts>.md. Operators apply the decisions (resolve_issue
calls, code edits) — a tool-enabled variant that lets the model close
entries directly is follow-up work.
Default model: google-gemini-cli/gemini-3-pro-preview (matches
DEFAULT_REFLECTION_MODEL).
Continues the bounded chipping-away at sf-mp4rxkwb-l4baga: triage now has
both an operator-pipe path (default) and a one-shot dispatch path (--run).
The full unit-type registration that wires this into the autonomous
dispatcher's idle path is the remaining slice of that entry.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a deterministic, turn-independent path to drain the self-feedback
queue. Modes:
- default: emits the canonical buildInlineFixPrompt() output for
piping into any model (sf headless triage | sf headless -p -)
- --list: human-readable digest sorted by impact↓ effort↑ ts↑
- --json: structured candidate list for tooling
- --max N: cap candidates
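The --list ordering (impact descending, then effort ascending, then ts ascending) combined with the --max cap could be sketched as follows. The `Candidate` fields are assumptions: numeric impact and effort, and ts as epoch milliseconds; the real record shape is not shown in this message.

```typescript
// Hypothetical candidate shape for illustration only.
interface Candidate {
  id: string;
  impact: number;
  effort: number;
  ts: number;
}

// impact descending, then effort ascending, then ts ascending.
function compareCandidates(a: Candidate, b: Candidate): number {
  return b.impact - a.impact || a.effort - b.effort || a.ts - b.ts;
}

function digest(candidates: Candidate[], max?: number): Candidate[] {
  // Copy before sorting so the caller's array is left untouched.
  const sorted = [...candidates].sort(compareCandidates);
  return max === undefined ? sorted : sorted.slice(0, max);
}
```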
Why this matters (partial step toward sf-mp4rxkwb-l4baga): the existing
session_start drain queues triage as `triggerTurn:true,
deliverAs:"followUp"`. When autonomous mode bails at milestone
validation before any turn runs, the followUp gets dropped and the
queue stays unprocessed. This command sidesteps that by rendering the
prompt synchronously to stdout — operators can pipe it into any model
without depending on autonomous-loop turn semantics. The full
unit-type registration that fixes the underlying dispatcher gap is
larger work tracked in the parent entry.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 1B of the reflection layer: complete the operator-driven loop by
adding actual LLM dispatch. Phase 1A (commit e161a59e2) shipped the
corpus assembler + prompt template + the prompt-emit operator surface.
This commit wires the dispatch end so `sf headless reflect --run`
produces a real report on disk without manual model piping.
Why shell-out to the gemini CLI and not SF's provider abstraction:
reflection is a single-prompt one-shot inference. Going through SF's
full agent dispatch would require a session, model registry, tool
registration, recovery shell — overkill for "render this prompt,
capture text." The gemini CLI handles auth (~/.gemini/oauth_creds.json),
Code Assist project discovery, and protocol drift on our behalf.
Subprocess cost is paid once per reflection (rare).
Implementation:
- reflection.js: runGeminiReflection(prompt, options) spawns
`gemini --yolo --model <model> -p "<directive>"` and pipes the giant
rendered template via stdin (gemini -p reads stdin and appends).
Returns { ok, content, cleanFinish, exitCode, error, stderr }; never
throws. Defaults to gemini-3-pro-preview (0% used on AI Ultra,
strongest agentic model with quota). 8-minute timeout.
cleanFinish detected by REFLECTION_COMPLETE terminator (emitted by
the prompt template's output contract) — operator gets a warning when
the report is truncated.
- headless-reflect.ts: --run flag triggers dispatch + report write
via writeReflectionReport. --model overrides the default. Errors
surface as JSON or text per --json. Successful runs emit the report
path on stdout; failures emit error + truncated stderr.
- help-text.ts: documents --run and --model flags.
- Tests (4 new, 13 total): use a fake `gemini` binary on PATH to
exercise the spawn path without real OAuth/network — covers
ok+cleanFinish, non-zero exit, hang/timeout, missing-terminator.
All 1538 SF extension tests pass; typecheck clean.
Phase 2 follow-up (still gated on sf-mp4rxkwb-l4baga
triage-not-a-first-class-unit-type landing): reflection-pass becomes a
real autonomous-loop unit type, milestone-close auto-triggers it, the
report's `Recommended new self-feedback entries` section gets parsed
and the entries auto-filed.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Addresses self-feedback entry sf-mp4uzvcd-pazg6v
(architecture-defect:no-reflection-layer-over-self-feedback-corpus): SF
detected symptoms and triaged individual entries but had no layer that
reasoned about the corpus to recognize recurring structural patterns.
The same architectural pressure expressed itself across multiple entries
with different exact-kind strings; nothing escalated the pattern to a
class. The cognitive work fell on the operator.
This commit ships Phase 1A — the data-assembly + prompt half of the
reflection layer + an operator-driven entry point. Phase 1B (LLM dispatch
via the autonomous loop as a real unit type) lands once
sf-mp4rxkwb-l4baga (triage-not-a-first-class-unit-type) is in.
Files:
- src/resources/extensions/sf/reflection.js (new)
- assembleReflectionCorpus(basePath): bundles open + recent-resolved
self-feedback (full json), last 50 commits via git log, milestone +
slice + task state, all milestone validation verdicts, and prior
reflection report into one struct. Returns null on prerequisite
failure (DB closed) so callers downgrade gracefully.
- renderReflectionCorpusBrief(corpus): renders the corpus into a
markdown brief the LLM consumes in one turn.
- writeReflectionReport(basePath, content): persists to
.sf/reflection/<timestamp>-report.md so next pass detects "what
changed since last reflection."
- src/resources/extensions/sf/prompts/reflection-pass.md (new)
- {{include:working-directory}} prefix.
- Reasoning order: cluster by structural shape (not exact kind),
identify recurring patterns, identify commit/ledger gaps, identify
stale validation drift, identify the deepest architectural concern,
compare against prior report.
- Output contract: structured markdown report with named sections,
terminator REFLECTION_COMPLETE for clean-finish detection.
- Constraints: don't fix anything (reflection layer not executor),
don't resolve entries without commit-SHA evidence, don't invent IDs.
- src/headless-reflect.ts (new) — sf headless reflect [--json]
- Pre-opens the project DB via auto-start.openProjectDbIfPresent
(one-shot bypass path doesn't run the full SF agent bootstrap).
- Default: emits the rendered prompt brief (template + corpus) for
operators to pipe into any model. Lets the corpus-assembly layer
ship and validate before the LLM-dispatch layer is wired.
- --json: emits raw corpus snapshot for tooling.
- src/headless.ts: registers the new "reflect" command after the
existing usage block.
- src/help-text.ts: documents it in the headless command list.
- src/resources/extensions/sf/tests/reflection.test.mjs (new, 9 tests):
null-when-DB-closed; collects open + recent-resolved; excludes >30d
resolutions; captures milestone/slice/task tree; captures validation
verdicts; commits returned as array (best-effort tmpdir is ok); brief
renders all major sections; entry IDs/severity/kind appear in brief;
writeReflectionReport round-trips through assembleReflectionCorpus's
previousReport read.
Live smoke verified: sf headless reflect against the real .sf/sf.db
returns 15 open + 23 recent-resolved entries, 50 commits, 2 milestones,
1 validation file (correctly surfacing M001's stale needs-attention
verdict against actual 5/5 slices done — exactly the case that
motivated this layer).
Total: +848 LOC, full SF extension suite (1534 tests) passes,
typecheck clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a machine-readable headless surface for live LLM-provider usage and
unifies the gemini-cli quota fetch through one helper, removing the
duplication that existed between usage-bar.js and the new package.
1. snapshotGeminiCliAccount in @singularity-forge/google-gemini-cli-provider
- Single source of truth for { projectId, userTierId, userTierName,
paidTier, models[] } via setupUser + retrieveUserQuota.
- Dedups buckets per modelId, keeping the worst (lowest remainingFraction)
so consumers always see the most-restrictive window. Code Assist
sometimes returns multiple buckets per model; the pessimistic choice
is what every consumer needs.
- discoverGeminiCliModels(cwd?) wraps it for catalog-cache callers that
only need the IDs.
2. sf headless usage subcommand
- New src/headless-usage.ts handler. text (default) and --json output.
Uses the package's snapshot directly — no RPC child, no jiti
gymnastics — matching the shape of headless-uok-status / headless-doctor.
- Wired into src/headless.ts after the doctor block.
- Help text adds the command line.
3. usage-bar.js refactored to delegate
- fetchGeminiUsage no longer imports gemini-cli-core directly. It calls
snapshotGeminiCliAccount and reshapes the result into the existing
{ provider, displayName, windows[] } UI contract.
- Eliminates the duplicate setupUser + retrieveUserQuota code path.
- The fast existsSync(~/.gemini/oauth_creds.json) pre-flight stays
so unauth'd users get a friendly message without paying for OAuth
bootstrap.
4. Model registry refactor (separate track committed alongside)
- src/resources/extensions/sf/model-registry.ts (new) consolidates
canonical model identity, capability tier, and generation tags into
one source of truth that auto-model-selection, benchmark-selector,
and model-router now consume instead of maintaining parallel maps.
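The per-model bucket dedup from item 1 (keep the bucket with the lowest remainingFraction) can be illustrated with a short sketch. `QuotaBucket` and `dedupeWorstBucket` are hypothetical names; only the modelId and remainingFraction fields are taken from this message.

```typescript
// Minimal bucket shape; real quota buckets likely carry more fields.
interface QuotaBucket {
  modelId: string;
  remainingFraction: number;
}

function dedupeWorstBucket(buckets: QuotaBucket[]): QuotaBucket[] {
  const worst = new Map<string, QuotaBucket>();
  for (const bucket of buckets) {
    const seen = worst.get(bucket.modelId);
    // Keep the most restrictive window for each model.
    if (!seen || bucket.remainingFraction < seen.remainingFraction) {
      worst.set(bucket.modelId, bucket);
    }
  }
  // Map iteration order follows the first appearance of each modelId.
  return Array.from(worst.values());
}
```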
All 1487 tests pass (151 files); typecheck clean for both the package
and the SF extensions.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The log message said '/sf ${command}' but the actual command sent is
'/${command}' (without the sf namespace). Fix to match actual dispatch.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
headless.ts was sending `/sf {subcommand} {args}` to the RPC session, but
commands are registered without the sf namespace (e.g. 'todo', 'autonomous').
_tryExecuteExtensionCommand parsed commandName='sf', found no match, and the
LLM handled the request instead of the typed backend.
Fix: send `/{subcommand} {args}` directly — matches what registerSFCommands
registers and what the TUI already uses.
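To illustrate the mismatch, here is a simplified stand-in for the lookup: the first token after the slash is treated as the command name and checked against a registry of bare names. This is not the actual _tryExecuteExtensionCommand code; the helper names and the two-entry registry are assumptions.

```typescript
// Hypothetical registry keyed by bare command name, as the message describes.
const registeredCommands = new Set<string>(["todo", "autonomous"]);

// Extracts the token right after the leading slash, or null if there is none.
function parseCommandName(input: string): string | null {
  const match = /^\/(\S+)/.exec(input.trim());
  return match?.[1] ?? null;
}

// True when the input would reach a typed backend instead of the LLM.
function resolvesToTypedBackend(input: string): boolean {
  const name = parseCommandName(input);
  return name !== null && registeredCommands.has(name);
}
```

Under these assumptions `/sf todo list` parses to the name `sf`, which is not registered, while `/todo list` resolves to the registered `todo` command.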
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Adds a new `sf headless status uok` subcommand that queries
gate-run stats and circuit-breaker state from sf.db and formats
them as a markdown table or JSON (--json flag).
- src/headless-uok-status.ts: handler that loads sf-db-gates
directly (avoids the unimported getDistinctGateIds in gate-runner)
- src/headless.ts: bypass RPC, route 'status uok' to handler
- src/help-text.ts: document the new subcommand
- tests/headless-uok-status.test.mjs: 19 node:test cases
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- HeadlessOptions.yolo added
- parseHeadlessArgs handles --yolo and -y (short form)
- SF_YOLO=1 is injected into the RPC child env when flag is set
- AutoSession._loadPersistedModeState() checks SF_YOLO=1 and
auto-activates YOLO mode (build+autonomous+deep+unrestricted)
on session startup
Usage:
sf headless -y autonomous # YOLO + autonomous mode
sf headless --yolo next # YOLO + run next unit
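A rough sketch of the flag handling and env injection listed above. The real parseHeadlessArgs handles many more options; `parseYolo` and `buildChildEnv` are hypothetical helpers that isolate just the --yolo / -y path and the SF_YOLO=1 injection.

```typescript
interface YoloParse {
  yolo: boolean;
  rest: string[]; // remaining args, passed through unchanged
}

function parseYolo(argv: string[]): YoloParse {
  const rest: string[] = [];
  let yolo = false;
  for (const arg of argv) {
    if (arg === "--yolo" || arg === "-y") {
      yolo = true;
    } else {
      rest.push(arg);
    }
  }
  return { yolo, rest };
}

// Returns a copy of the env for the RPC child; SF_YOLO is set only when asked.
function buildChildEnv(
  base: Record<string, string | undefined>,
  yolo: boolean,
): Record<string, string | undefined> {
  return yolo ? { ...base, SF_YOLO: "1" } : { ...base };
}
```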
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Surface stamp:
- AutoSession._loadPersistedModeState() now calls detectSurface() to stamp
the correct surface (headless/web/tui) from env vars on every startup.
Persisted surface value was the previous launch's surface — wrong when
switching between TUI and headless on the same project.
SF_HEADLESS=1 → 'headless', SF_WEB_BRIDGE_TUI=1 → 'web', else 'tui'.
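The env-to-surface mapping above can be sketched as a pure function. The check order when both variables are set is an assumption (headless first, following the order they are listed in), and `ModeState` and `stampSurface` are hypothetical names used to show the persisted value being overwritten on startup.

```typescript
type Surface = "headless" | "web" | "tui";

// Env is passed in explicitly so the function stays pure and testable.
function detectSurfaceSketch(env: Record<string, string | undefined>): Surface {
  if (env["SF_HEADLESS"] === "1") return "headless";
  if (env["SF_WEB_BRIDGE_TUI"] === "1") return "web";
  return "tui";
}

// Hypothetical persisted state; only the surface field matters here.
interface ModeState {
  surface: Surface;
  mode: string;
}

// The persisted surface is always replaced by the one detected for this launch.
function stampSurface(
  persisted: ModeState,
  env: Record<string, string | undefined>,
): ModeState {
  return { ...persisted, surface: detectSurfaceSketch(env) };
}
```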
/mode yolo:
- handleModeCommand now recognises 'yolo' as a toggleable special case.
Headless callers can now run: sf headless --command '/mode yolo'
Same behaviour as Ctrl+Y: full-autonomy slam + settingsManager bypass.
/mode catalog description updated to list 'yolo' as an option.
Documentation:
- headless.ts /query and /doctor short-circuits annotated as intentional
architecture trade-offs with a note to keep them in sync with the extension.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Integration of 3 quick wins into existing UOK infrastructure:
1. Model Learning (Quick Win #2) → metrics.js
- Record outcomes to model-learner for per-task-type performance tracking
- Hook: recordUnitOutcome() now calls ModelLearner.recordOutcome()
- Fire-and-forget: never blocks outcome recording on learning failure
- Enables adaptive model routing decisions in downstream gates
2. Self-Report Fixing (Quick Win #1) → triage-self-feedback.js
- Auto-fix high-confidence reports (>0.85) in applyTriageReport()
- Hook: After triage and requirement promotion, apply auto-fixes
- Fire-and-forget: never blocks report application on fix failure
- Returns reportsAutoFixed count for triage metrics
3. Knowledge Injection (Quick Win #3) → already integrated in auto-prompts.js
- Already active in execute-task prompt template
- Semantic matching with graceful degradation
All integration points:
- Fire-and-forget: learning/fixing failures never block dispatch
- UOK-native: use existing outcome recording, db, gates
- Backward compatible: applyTriageReport now async, but callers handle it
- No new dependencies: all modules already in codebase
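The fire-and-forget rule shared by these integration points might be sketched as below. `fireAndForget` and `recordUnitOutcomeSketch` are hypothetical; the real recordUnitOutcome and ModelLearner.recordOutcome signatures are not shown in this message.

```typescript
// Runs a side task without awaiting it; sync and async failures both go to onError.
function fireAndForget(
  task: () => Promise<unknown>,
  onError: (err: unknown) => void = () => {},
): void {
  try {
    task().catch(onError);
  } catch (err) {
    // The task threw synchronously before returning a promise.
    onError(err);
  }
}

interface UnitOutcome {
  unitId: string;
  ok: boolean;
}

// The primary recording result is returned regardless of what the learner does.
function recordUnitOutcomeSketch(
  outcome: UnitOutcome,
  learn: (outcome: UnitOutcome) => Promise<void>,
  onLearnError: (err: unknown) => void = () => {},
): string {
  const recorded = `recorded:${outcome.unitId}`;
  fireAndForget(() => learn(outcome), onLearnError);
  return recorded;
}
```

Routing errors to a callback rather than swallowing them silently leaves room to log learning or auto-fix failures without letting them block dispatch.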
Testing: 2934 tests pass (no regressions from integration)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- sf-home.ts: new — resolves ~/.sf/ path and SF home dir helpers (port of gsd-home.ts)
- memory-embeddings.ts: new — embedding helpers for memory similarity search
- component-types.ts: new — Component, ComponentManifest, ComponentHook type defs
- workflow-install.ts: new — workflow installation from local/remote sources
- auto-post-unit.ts: clearEvidenceFromDisk after successful verification
- routing-history.ts: add cost-per-token tracking to routing decisions
- workflow-{manifest,templates}.ts: hardening sweep
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Last batch from the parallel swarm session: docstring tweaks,
verification-gate doc additions, workflow-reconcile and worktree-command
follow-ups, doctor-environment cleanup. Typecheck clean.
Most of the session work landed in earlier commits (8be8f4774, 3045538cb,
038938f2a, ed85252fc, 4f4b584e5, etc.); this commit is the residual
working-tree state after all swarms reported.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add repository-vcs-context.ts to detect and inject VCS context (Git/Jujutsu)
into the agent system prompt; wire in repo-vcs bundled skill trigger
- Add src/resources/skills/repo-vcs/ skill for commit, push, and safe-push workflows
- Add JSDoc Purpose/Consumer annotations to app-paths, bundled-extension-paths,
errors, extension-discovery, extension-registry, headless-types, headless, and traces
- Add justfile and just to flake.nix devShell
- Fill out new-user-onboarding.md spec (Draft) and core-beliefs.md (Status: Accepted)
- Add notification-event-model.md design doc and notification-source-hygiene.md spec
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>