Pi-mono Tier 0 #4 — manual port (sf went off-task; ported directly).
undici's default 300s bodyTimeout aborts long local-LLM SSE streams
(e.g. vLLM buffering a large tool call) with UND_ERR_BODY_TIMEOUT.
retry.provider.timeoutMs cannot lift this cap — it controls the
provider SDK's AbortController, not undici's per-socket idle timer.
Pass {bodyTimeout: 0, headersTimeout: 0} to EnvHttpProxyAgent. Provider
SDKs continue to enforce their own deadlines.
Type-check passes.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
pkg/dist/core/export-html/template.js is a tracked dist mirror that
needs the same HTML escape fix as packages/pi-coding-agent/src/core/
export-html/template.js (committed in 701ec8fb8).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Without this, every fresh project inherits sf's user-level dogfooding
defaults (npm run typecheck:extensions, test:sf-light) — which run sf's
own dev scripts against unrelated repos and produce universal false
negatives. Hit in dr-repo (Go): T01-VERIFY.json showed all_fail because
those npm scripts don't exist there, even though T01's actual work passed
verification per its SUMMARY.
- ensurePreferences() now calls detectProjectSignals() and embeds the
auto-detected commands in the YAML frontmatter on first init. Detection
failure is non-fatal — falls back to the bare template.
- detectVerificationCommands() Go branch now handles multi-module repos
(no root go.mod, only nested ones — common pattern for repos like
dr-repo/{dr-agent,portal,gateway,installer,cmd/installer}). Generates
a per-module loop instead of running go vet/test from the repo root,
which would fail since each subdir is its own Go module.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Pi-mono Tier 0 #2 — sf-driven port of PR #3650.
Some LLM providers reject API calls when `tools: []` is sent (an empty
array), but accept the call when the tools field is omitted entirely.
This guards each provider's request-body builder to omit `tools` when
the tool list is empty, instead of serialising the empty array.
Files (5 provider builders):
- packages/pi-ai/src/providers/openai-completions.ts
- packages/pi-ai/src/providers/openai-responses.ts
- packages/pi-ai/src/providers/openai-codex-responses.ts
- packages/pi-ai/src/providers/azure-openai-responses.ts
- packages/pi-ai/src/providers/anthropic-shared.ts (covers anthropic
and anthropic-vertex which both import buildParams from it)
Pattern: `if (context.tools)` → `if (context.tools && context.tools.length > 0)`.
Preserved: the `else if (hasToolHistory(context.messages))` branch in
openai-completions.ts that intentionally emits `tools: []` for
LiteLLM/Anthropic-proxy compatibility is unchanged.
Type-check passes.
Co-Authored-By: sf v2.75.1 (session 38ed0a48)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Pi-mono Tier 0 #1 (security) — sf-driven port.
Two upstream security fixes (pi-mono PR #3819, #3883) that escape
user-controlled session content before embedding in HTML exports.
Crafted session content (image mime types, image data, model IDs,
tool names, entry IDs) could otherwise inject markup at the export
boundary.
What sf changed in
packages/pi-coding-agent/src/core/export-html/template.js:
- Image tags: escape `mimeType` and `data` attributes for both
tool-result and user-message image renders (PR #3819).
- Session metadata: escape `msg.toolName`, `msg.role`, `entry.modelId`,
`entry.thinkingLevel`, `entry.type`, `entry.id`, and
`globalStats.models` (PR #3883).
- DOM id construction: renamed `entryId` → `entryDomId` and escape
`entry.id` to prevent attribute-breakout from a crafted id.
The existing `escapeHtml()` helper was used at every site; no new
helper introduced. Type-check passes.
Co-Authored-By: sf v2.75.1 (session 150fe2c1)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Pi-mono Tier 0 #5 — first sf-driven port. sf-from-source dispatched the
task in print mode and produced this fix autonomously.
Adds getModelMatchCandidates(modelId, modelName?) helper that normalizes
both inputs to lowercase and dash-separated form
(s.replace(/[\s_.:]+/g, "-")). Inference profile ARNs don't embed the
model name; the helper lets capability checks match against either the
inference profile ARN or the underlying model name.
Updated:
- supportsAdaptiveThinking — uses the helper; consolidates the
opus-4.6/opus-4-6 dot-vs-dash variants.
- mapThinkingLevelToEffort — same pattern.
- supportsPromptCaching — same pattern (also from pi-mono PR #3527).
- streamSimpleBedrock and buildAdditionalModelRequestFields — pass
model.name through to capability checks.
Type-check passes (cd packages/pi-ai && npx tsc --noEmit).
Co-Authored-By: sf v2.75.1 (session 911dd2de)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
OpenRouter is already neutered via the provider_model_allow allowlist
(see d38e5ea09 fix(schema): auto-coerce string → [string] for sf_* list
fields + provider_model_allow tests). The 248 model entries in
models.generated.ts are inert — no dispatch path reaches them.
Removing the data entries would be aesthetic cleanup with zero
behavioral effect. Not worth a Tier-1 follow-up.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Milestone-end workflow that compares declared product intent (VISION.md,
RUNBOOKS.md, etc.) against actual code/test/deploy/docs evidence and
emits structured gaps with severity. Soft gates — adds follow-up slices
but doesn't hard-block merge.
Slim port (4 new files + 1 registration) — extracts only the audit
feature itself, not bunker's parallel rewrite of dispatch/prompts/
benchmark-selector that came with it in commit 2aa785475.
Created:
- prompts/product-audit.md — prompt verbatim, gsd_*→sf_* and .gsd→.sf
- tools/product-audit-tool.ts — slim file-write implementation,
atomicWriteAsync to .sf/active/{mid}/
PRODUCT-AUDIT.{json,md}; no DB deps
- bootstrap/product-audit-tool.ts — pi-coding-agent tool registration,
TypeBox schema for sf_product_audit
- workflow-templates/product-audit.md — workflow template
Modified:
- bootstrap/register-extension.ts — 2 lines: import + add to nonCriticalRegistrations
- workflow-templates/registry.json — registry entry
- package.json — version 2.75.0 → 2.75.1
Verdict logic (no-gaps | gaps-found | contract-underspecified) is the
load-bearing innovation: contract-underspecified forces the auditor to
flag unverifiable docs as a real gap rather than rubber-stamping
no-gaps when the product contract is silent.
Out of scope: phase enum changes, dispatch hookup. Wire-up to the phase
machine is a follow-up; the prompt + tool + template stand alone.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds direct xiaomi token-plan API access alongside the existing
OpenRouter-routed xiaomi entries. ADDITIVE only — OpenRouter cleanup is
a separate follow-up.
Three new region providers:
- xiaomi-token-plan-ams (Amsterdam, default for plain `xiaomi`)
- xiaomi-token-plan-sgp (Singapore)
- xiaomi-token-plan-cn (China)
All use Anthropic Messages API. Env-var resolution: XIAOMI_API_KEY →
XIAOMI_TOKEN_PLAN_API_KEY → MIMO_API_KEY (in that fallback order).
Three xiaomi MiMo models registered under each direct provider:
- mimo-v2-flash (256k ctx, 64k output, text-only, reasoning)
- mimo-v2-omni (256k ctx, 128k output, text+image, reasoning)
- mimo-v2-pro (1M ctx, 128k output, text-only, reasoning)
Same model literals × 4 provider keys, different baseUrls per region.
Test count assertion bumped 22 → 26 providers.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Three small UX fixes for headless / autopilot logs:
1. Add `zz-notifications` to TUI_FOOTER_STATUS_KEYS — these are sticky
notification dots from the interactive TUI footer; they have no
meaning in headless and were spamming the log.
2. Categorize notification messages by prefix so headless output is
scannable: [mcp] for MCP-client-ready, [search] for web search status,
[parallel] for slice-parallel/subagent dispatch. Falls through to
the existing important/non-important formatting for everything else.
3. Distinguish phase transitions from generic status updates: phase:/
milestone:/slice:/task: prefixed keys get [phase]; everything else
gets [status]. Previously both used [phase], which was misleading.
Patterns based on bunker commits 14ec4d97f / c15afb45f (which were the
research source) but written fresh against our existing
TUI_FOOTER_STATUS_KEYS structure rather than cherry-picked.
The assistant-text-preview commit (cf0274c63) is a separate, larger
refactor in headless.ts and is deferred to v3.1.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
We have .serena/ configured (cache, memories, project.local.yml) but no
prompt mentioned Serena anywhere. Agents weren't using it for symbol
lookup or cross-file architecture mapping; they fell straight to rg/find.
Added a one-sentence Serena hint to the code-exploration step in:
- research-slice.md
- research-milestone.md
- plan-slice.md
- plan-milestone.md
- guided-research-slice.md
Phrased generically ("If a repo-intelligence MCP (e.g. Serena) is
configured...") so it degrades cleanly when Serena isn't set up.
Pattern based on bunker commit 4ba746888 but written fresh against our
post-rename prompt structure rather than cherry-picked.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
After attempting cluster B (4 surgical agent-session fixes), even the
first commit conflicted because of structural namespace divergence
(gsd_*→sf_* rename, @sf-run/*→@singularity-forge/* rename, prior
pi-mono direct cherry-picks). The conflicts are real semantic
divergence, not noise.
Conclusion: sf is a fork; we do not periodically sync from
gsd-build/gsd-2. Pretending we still track upstream means weeks of
merge work for diminishing return.
BUILD_PLAN.md adds an explicit "Upstream stance" section documenting
the fork posture and the rationale for the three irreversible naming
choices.
UPSTREAM_CHERRY_PICK_CANDIDATES.md is reframed as a reference list,
not an action plan. The clusters and SHAs remain useful as an
intelligence source — port specific fixes by hand when one bites us;
do not run automated cherry-picks against the list.
Pi-mono SDK syncs continue separately — that path doesn't have the
same divergence problem.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The origin↔upstream divergence is 4,589 commits. This file picks the
high-leverage subset (~70 commits across 16 topical clusters) worth
considering for cherry-pick. Recommended order at the bottom.
Each cluster lists candidate SHAs with one-line context and effort
estimates. Total estimated work if all clusters A-N are taken: ~10-15
hours plus conflict resolution. Cluster O (UnitContextManifest /
Composer rewrite, ~15 commits) is deferred — likely conflicts heavily
with our work and should be revisited during v3 schema reconciliation.
Cluster P (memories table cutover, 1 commit) is flagged as READ FIRST
because it's upstream's answer to what BUILD_PLAN calls Singularity
Memory integration; reading it may change the recommended integration
path.
This is a candidate list for human decision, not an action plan.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit captures uncommitted modifications that accumulated in the
working tree across multiple in-progress workstreams. It is a snapshot
to clear the deck before sf v3 work begins; individual workstreams
should land separately on top of this.
Notable additions:
- trace-collector.ts, traces.ts, src/tests/trace-export.test.ts —
trace export plumbing
- biome.json — Biome linter configuration
- .gitignore — exclude native/npm/**/*.node compiled binaries
The bulk of the diff is across src/resources/extensions/sf/ (301 files)
and src/resources/extensions/sf/tests/ (277 files), reflecting the
ongoing sf extension work. Specific feature commits should follow this
snapshot rather than being archaeology'd out of it.
The 76MB native/npm/linux-x64-gnu/forge_engine.node compiled binary
was left out of the commit — it's now gitignored and built locally.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Of the 56 NEW items in SPEC.md, not all are worth building for v3.
This plan groups them by tier:
- Tier 1 ESSENTIAL (~5 weeks): Vault resolver, sm integration decision,
schema reconciliation, config alignment.
- Tier 2 STRONG (~3-4 weeks): doc-sync, intent chapters, PhaseReview
3-pass, turn_status marker, last_error cap, cost_micro_usd.
- Tier 3 NICE (v3.1+): persistent agents, inter-agent messaging,
workflow content pinning, runs table, pending_retain.
- Tier 4 DEFER: SSH workers, HTTP API auth, trace_index, PhaseUAT —
build when a deployment demands it.
- Tier 5 DROP: items from late adversarial-review iterations that
don't earn their keep (workflow_pins separate table, snap_ columns,
agent_capabilities separate index).
Includes a recommended ~6-8 week v3.0 schedule and four decision
points that should be settled before starting work.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Imports SPEC.md (v1.0-draft) from singularity-ng/crush#docs/spec — the
forward-looking contract for sf v3. Annotated section-by-section and
item-by-item with implementation status against current sf:
- EXISTS — already implemented in sf, matches the spec
- PARTIAL — implemented but diverges from spec; needs alignment work
- NEW — not yet implemented
Conformance breakdown (123 items total):
- 37 EXISTS
- 30 PARTIAL
- 56 NEW
The NEW items concentrate in: persistent-agent inbox model (§17/§18),
Singularity Memory integration (§16/§24), SSH worker extension (§22),
several supervisor refinements (§9), and policy/operations details
(audit fields, trace metadata, version pinning) introduced during the
v0.x adversarial review iterations.
The PARTIAL items concentrate in: schema reconciliation (sf has 3
tables — milestones/slices/tasks — vs spec's single units table),
config schema alignment, runs-table unification with audit_events,
and several worker-attempt lifecycle details that exist in different
shapes today.
This is an informational import. Implementing v3 against this spec
is its own work; the next step is deciding which NEW items are
actually wanted vs deferred, and whether to migrate the 3-table
planning schema to the single-units shape or keep what sf has and
update the spec.
Spec source: https://github.com/singularity-ng/crush/blob/docs/spec/SPEC.md
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Codex-rescue output (a299c461 / bnr88iy59) — the 'Git merge approved once'
followed seconds later by 'Git merge declined by user' bug we hit on
M002 complete-milestone. Same gate, same agent run, opposite verdicts.
Single source of truth for the merge-gate state in guardrails/index.ts.
Approval is now sticky — re-asks return the cached approval until consumed
or explicitly revoked, never auto-flip to decline. Timeout converts to
pause+log instead of decline. Adds tests/safe-git-merge-gate.test.ts.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: OpenAI Codex <noreply@openai.com>
Two codex-rescue tasks landed together:
1. Auto-coerce JSON-schema validator: when a tool field declares
{type:"array", items:{type:"string"}} and the model sends a single
string, wrap it in [string] before validation instead of hard-rejecting.
Fixes the recurring "keyDecisions: must be array" rejection on
sf_complete_task that wasted retries.
2. Provider_model_allow filter (proper implementation with helpers):
- resolveProviderModelAllowList / isProviderModelAllowed /
filterModelsByProviderModelAllow helpers in preferences-models
- Wired into model-registry and auto-model-selection
- New tests/provider-model-allow.test.ts
Tools coerced: sf_complete_task, sf_complete_milestone, sf_plan_milestone,
sf_plan_slice, sf_replan_slice, sf_reassess_roadmap (key list fields).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: OpenAI Codex <noreply@openai.com>
Cherry-pick of gsd-build/gsd-2 65ca5aa2e — applies the security hardening
hunks that conflicted minimally:
- mcp-server/env-writer: validate writes against a strict allowlist
- web/api/files: enforce path containment via web/lib/secure-path
- vscode-extension: read binaryPath/autoStart only from trusted
global/default scopes (resolveTrustedSfStartupConfig), avoiding
workspace-controlled override (renamed Gsd → Sf for sf naming)
- New regression tests: mcp-client-security, vscode-startup-security,
web-files-symlink
Skipped hunks (drifted): mcp-server/server.ts, mcp-client/index.ts,
mcp-server/README.md.
Co-Authored-By: Jeremy <jeremy@fluxlabs.net>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Cherry-pick of gsd-build/gsd-2 a09e01640 — withFileLockSync now actually
acquires a proper-lockfile (was previously a no-op when proper-lockfile
wasn't required) and throws on ELOCKED contention by default. Adds
onLocked: 'skip' option for best-effort callers that tolerate dropped
entries (audit, journal). Modernizes import style (createRequire/join
from imports rather than ad-hoc require). Path-renames preserved
(gsd-pi → sf-run).
Co-Authored-By: Jeremy <jeremy@fluxlabs.net>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Cherry-pick of gsd-build/gsd-2 53babec29 — lock-wrapped append half.
Wraps appends to .sf/journal/, .sf/audit/events.jsonl, and the
workflow-logger error log in withFileLockSync (onLocked: skip),
preserving best-effort semantics while preventing torn writes
under contention.
Companion to the atomic-write half landed in 3df56cb94. Path-renames
(gsdRoot→sfRoot, gsd-db→sf-db) preserved during conflict resolution.
Co-Authored-By: Jeremy <jeremy@fluxlabs.net>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Cherry-pick of gsd-build/gsd-2 9340f1e9b (#4423) — doctor self-heal
detection for symlinked staging directories that can cause silent
data loss. Skips native-git-bridge.ts and git-service test (drifted).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Cherry-pick of gsd-build/gsd-2 a4f78731f — handles worktree context fallback
and sanitizes paths in paused session resumption. Skips uok-plan-v2-wiring
test hunk (drifted in sf).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Cherry-pick of gsd-build/gsd-2 851507913 (#4056) — defensive parsing
so a corrupt or non-array tasks blob in a milestone row doesn't crash
sf-db reads. Test hunk skipped (sf-db.test.ts has drifted).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Cherry-pick of gsd-build/gsd-2 53babec29 (Jeremy <jeremy@fluxlabs.net>)
— atomic-write half only. Eliminates torn-write risk on PROJECT.md
queue sync and reports.json/HTML index regeneration by switching
writeFileSync → atomicWriteSync (tmp+rename).
The companion lock-wrapped-append changes (journal.ts, uok/audit.ts,
workflow-logger.ts) are deferred — they need proper-lockfile +
withFileLockSync helper introduced first.
Co-Authored-By: Jeremy <jeremy@fluxlabs.net>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Generalize the code-intelligence hook to support multiple indexer
backends, with sift (rupurt/sift) as a new option next to the existing
project-rag MCP server. Backend is selected via CodebaseMapPreferences.
- code-intelligence.ts: new abstraction + sift backend (detect, resolve,
status, context-block contribution)
- preferences-types.ts: codebaseIndexer field (project-rag | sift | none)
- preferences-validation.ts: validate the new field
- bootstrap/system-context.ts, commands-codebase.ts: dispatch on backend
- tests/code-intelligence.test.ts: sift detection/resolution/status tests
(19 pass, 0 fail)
project-rag path unchanged and continues to work.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
SubagentBackgroundJobManager tracks long-running subagent jobs with
status, abort support, and TTL-based eviction of completed results.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Human-oriented documentation of SF capabilities, with a script that
keeps it in sync with workflow-tools.ts and extension manifests.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Extracting a class method as a bare reference loses its 'this' context,
causing 'Cannot read properties of undefined' when minimax (or any
provider) triggers the flat-rate auth-mode lookup.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Documents the dist-vs-source distinction that caused the memoriesSection
fix to not take effect, the c8 coverage runner process leak, and the
template variable maintenance contract.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
buildExecuteTaskPrompt was not passing memoriesSection to loadPrompt,
causing headless auto to fail with a template variable error. Also
updated plan-slice-prompt.test.ts to supply the four template variables
(memoriesSection, runtimeContext, phaseAnchorSection, gatesToClose) that
were missing from the test fixture.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The resolver guarded on context.parentURL.includes('/src/') to identify
in-repo source files, but @google/gemini-cli-core installs to
node_modules/@google/gemini-cli-core/dist/src/ which also contains '/src/'.
Relative imports from that dist package (e.g. './config/config.js') were
incorrectly rewritten to './config/config.ts', causing ERR_MODULE_NOT_FOUND
on every test that transitively imports the google-gemini provider.
Fix: add !context.parentURL.includes('/node_modules/') guard.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
blocked-models.ts (new):
Persistent per-project blocklist at .sf/runtime/blocked-models.json.
loadBlockedModels / isModelBlocked / blockModel (file-lock-safe write).
milestone-summary-classifier.ts (new):
classifyMilestoneSummaryContent → "success" | "failure" | "unknown".
isTerminalMilestoneSummaryContent: failure summaries are NOT terminal —
lets auto-mode re-enter a milestone after a failed recovery summary.
state.ts:
Phase 1 (completeMilestoneIds) and Phase 2 (registry) now check
isTerminalMilestoneSummaryContent before treating a SUMMARY as complete.
A failure SUMMARY no longer prematurely parks a milestone.
error-classifier.ts:
Add "unsupported-model" ErrorClass kind with regex detection
(model + not-supported/unavailable/no-access + account/plan/tier).
Checked before "permanent" so /account/i in PERMANENT_RE doesn't swallow it.
auto-model-selection.ts:
Wire isModelBlocked() gate in selectAndApplyModel candidate loop:
skips provider-rejected models and continues to fallbacks.
bootstrap/agent-end-recovery.ts:
Handle cls.kind === "unsupported-model": blockModel(), try fallback chain
skipping already-blocked models, pause if no usable fallback.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Ports commit 7fb35ca58 from gsd2 (PR #4769) covering four issues:
#4761 — resolveCanonicalMilestoneRoot in worktree-manager.ts routes
validate-milestone through the live worktree path instead of stale
project-root state when a milestone worktree is active.
#4762 — auditOrphanedMilestoneBranches in auto-start.ts now surfaces
in-progress milestone branches with unmerged commits ahead of main
(previously only complete milestones were audited). Gated on
isClosedStatus so parked/other closed statuses are unaffected.
#4764 — worktree-telemetry.ts: typed emit helpers (emitWorktreeCreated,
emitWorktreeMerged, emitWorktreeOrphaned, emitAutoExit, emitWorktreeSync,
emitCanonicalRootRedirect, emitSliceMerged, emitMilestoneResquash) plus
summarizeWorktreeTelemetry aggregator and nearest-rank percentile().
Wired in: worktree-resolver.ts (create/merge events), auto-start.ts
(orphan telemetry), auto.ts stopAuto (auto-exit with normalized reason),
worktree-manager.ts (canonical-root-redirect). Surfaced in forensics.ts
via detectWorktreeOrphans and Worktree Telemetry sections.
#4765 — slice-cadence.ts: mergeSliceToMain squash-merges each slice's
commits onto main as soon as the slice passes validation (opt-in via
git.collapse_cadence: "slice"). resquashMilestoneOnMain collapses N
per-slice commits into one milestone commit at completion. Wired in
auto-post-unit.ts (slice merge after complete-slice with stopAuto on
conflict/error) and worktree-resolver.ts (resquash at mergeAndExit).
AutoSession.milestoneStartShas tracks the pre-first-slice SHA.
GitPreferences and preferences-validation.ts extended with
collapse_cadence and milestone_resquash fields.
Also ports /sf scan command: commands-scan.ts with parseScanArgs,
resolveScanDocuments, buildScanOutputPaths, and handleScan dispatching
a focused codebase assessment prompt to .sf/codebase/.
journal.ts: 9 new JournalEventType values for the telemetry events.
All changes are additive; default behavior (cadence="milestone") unchanged.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
reassess-roadmap: flip default from true → false. Most reassess units
conclude "roadmap is fine" burning a session for no change; the
plan-slice prompt now carries a JIT preamble at zero cost. (#4778)
tool-execution: always prefer toolDefinition.label when non-empty,
even when label === name — allows tools to display their canonical
name explicitly. (#4758)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds support for project-local SF extension plugins dropped in
.sf/extensions/. Trust-gated (requires pi trust), symlink-escape safe.
- ecosystem/sf-extension-api.ts: SFExtensionAPI wrapper exposing
getPhase() and getActiveUnit() to third-party handlers; updateSnapshot
refreshes state before_agent_start so handlers see current phase/unit
- ecosystem/loader.ts: discovers .sf/extensions/*.js, loads them via
dynamic import, dispatches factory(api) for each
- register-extension.ts: initializes ecosystemHandlers array, wires loader
- register-hooks.ts: before_agent_start refreshes snapshot then dispatches
ecosystem handlers before returning SF system prompt
- types.ts: SFActiveUnit interface (milestoneId/sliceId/taskId + titles)
- workflow-logger.ts: "ecosystem" added to LogComponent union
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Fixes a bug where per-unit tool narrowing poisoned the policy gate for
subsequent units, causing "Model policy denied dispatch before prompt send"
errors on complete-slice and discuss-milestone (100% Win repro).
Four-part port from gsd2@817031b2a:
- ModelPolicyDispatchBlockedError class with per-model deny reasons
- TOOL_BASELINE WeakMap + clearToolBaseline/restoreToolBaseline lifecycle
- auto-model-selection: use getRequiredWorkflowToolsForAutoUnit as requiredTools
- auto/loop: catch ModelPolicyDispatchBlockedError as non-retryable (pause)
- auto.ts: wire clearToolBaseline at startAuto (fresh only) and stopAuto
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>