singularity/singularity-forge

Author	SHA1	Message	Date
Mikael Hugo	dea4c2dbc1	docs: update Tier 0 with port status; flag SSE parser refactor as bigger work 5 of 9 Tier 0 items landed: - #1 HTML export escape (security) `701ec8fb8` + `92c6d933c` - #2 Empty tools array fix `58b1d7c60` - #4 undici 5min timeout `d0907b6d8` - #5 Bedrock inference profile `7c487bb60` Deferred: - #3 Anthropic SSE proxy event tolerance — fix applies to pi-mono's custom SSE parser, but we still use @anthropic-ai/sdk directly. To get protection we'd need to port the full "own Anthropic SSE parsing" refactor (3 commits, ~200 LOC). Added as a separate Tier 0 item. Remaining TODO from Tier 0: items #6-#9 (symlinked dedup, setWorkingVisible extension API, Cloudflare provider, Azure Cognitive Services). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-29 14:35:55 +02:00
Mikael Hugo	d0907b6d87	port(pi-mono): disable undici body/headers idle timeouts on global dispatcher (refs ea90a6783) Pi-mono Tier 0 #4 — manual port (sf went off-task; ported directly). undici's default 300s bodyTimeout aborts long local-LLM SSE streams (e.g. vLLM buffering a large tool call) with UND_ERR_BODY_TIMEOUT. retry.provider.timeoutMs cannot lift this cap — it controls the provider SDK's AbortController, not undici's per-socket idle timer. Pass {bodyTimeout: 0, headersTimeout: 0} to EnvHttpProxyAgent. Provider SDKs continue to enforce their own deadlines. Type-check passes. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-29 14:35:08 +02:00
Mikael Hugo	92c6d933ce	chore(pkg/dist): sync template.js with source after HTML escape port (refs `701ec8fb8`) pkg/dist/core/export-html/template.js is a tracked dist mirror that needs the same HTML escape fix as packages/pi-coding-agent/src/core/ export-html/template.js (committed in `701ec8fb8`). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-29 14:28:33 +02:00
Mikael Hugo	6248e79a7a	feat(init): auto-seed PREFERENCES.md with detected verification_commands Without this, every fresh project inherits sf's user-level dogfooding defaults (npm run typecheck:extensions, test:sf-light) — which run sf's own dev scripts against unrelated repos and produce universal false negatives. Hit in dr-repo (Go): T01-VERIFY.json showed all_fail because those npm scripts don't exist there, even though T01's actual work passed verification per its SUMMARY. - ensurePreferences() now calls detectProjectSignals() and embeds the auto-detected commands in the YAML frontmatter on first init. Detection failure is non-fatal — falls back to the bare template. - detectVerificationCommands() Go branch now handles multi-module repos (no root go.mod, only nested ones — common pattern for repos like dr-repo/{dr-agent,portal,gateway,installer,cmd/installer}). Generates a per-module loop instead of running go vet/test from the repo root, which would fail since each subdir is its own Go module. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-29 14:26:49 +02:00
Mikael Hugo	58b1d7c601	port(pi-mono): omit tools field instead of sending empty array (refs 3e0ee69b5) Pi-mono Tier 0 #2 — sf-driven port of PR #3650. Some LLM providers reject API calls when `tools: []` is sent (an empty array), but accept the call when the tools field is omitted entirely. This guards each provider's request-body builder to omit `tools` when the tool list is empty, instead of serialising the empty array. Files (5 provider builders): - packages/pi-ai/src/providers/openai-completions.ts - packages/pi-ai/src/providers/openai-responses.ts - packages/pi-ai/src/providers/openai-codex-responses.ts - packages/pi-ai/src/providers/azure-openai-responses.ts - packages/pi-ai/src/providers/anthropic-shared.ts (covers anthropic and anthropic-vertex which both import buildParams from it) Pattern: `if (context.tools)` → `if (context.tools && context.tools.length > 0)`. Preserved: the `else if (hasToolHistory(context.messages))` branch in openai-completions.ts that intentionally emits `tools: []` for LiteLLM/Anthropic-proxy compatibility is unchanged. Type-check passes. Co-Authored-By: sf v2.75.1 (session 38ed0a48) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-29 14:22:31 +02:00
Mikael Hugo	701ec8fb88	port(pi-mono): escape session metadata + image data in HTML export (refs 7617c1ad9, 57787b655) Pi-mono Tier 0 #1 (security) — sf-driven port. Two upstream security fixes (pi-mono PR #3819, #3883) that escape user-controlled session content before embedding in HTML exports. Crafted session content (image mime types, image data, model IDs, tool names, entry IDs) could otherwise inject markup at the export boundary. What sf changed in packages/pi-coding-agent/src/core/export-html/template.js: - Image tags: escape `mimeType` and `data` attributes for both tool-result and user-message image renders (PR #3819). - Session metadata: escape `msg.toolName`, `msg.role`, `entry.modelId`, `entry.thinkingLevel`, `entry.type`, `entry.id`, and `globalStats.models` (PR #3883). - DOM id construction: renamed `entryId` → `entryDomId` and escape `entry.id` to prevent attribute-breakout from a crafted id. The existing `escapeHtml()` helper was used at every site; no new helper introduced. Type-check passes. Co-Authored-By: sf v2.75.1 (session 150fe2c1) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-29 14:20:23 +02:00
Mikael Hugo	7c487bb60e	port(pi-mono): normalize Bedrock model names for inference profiles (refs ed4bc7308) Pi-mono Tier 0 #5 — first sf-driven port. sf-from-source dispatched the task in print mode and produced this fix autonomously. Adds getModelMatchCandidates(modelId, modelName?) helper that normalizes both inputs to lowercase and dash-separated form (s.replace(/[\s_.:]+/g, "-")). Inference profile ARNs don't embed the model name; the helper lets capability checks match against either the inference profile ARN or the underlying model name. Updated: - supportsAdaptiveThinking — uses the helper; consolidates the opus-4.6/opus-4-6 dot-vs-dash variants. - mapThinkingLevelToEffort — same pattern. - supportsPromptCaching — same pattern (also from pi-mono PR #3527). - streamSimpleBedrock and buildAdditionalModelRequestFields — pass model.name through to capability checks. Type-check passes (cd packages/pi-ai && npx tsc --noEmit). Co-Authored-By: sf v2.75.1 (session 911dd2de) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-29 14:14:17 +02:00
Mikael Hugo	a3c487c918	docs: add Tier 0 (pi-mono ports) and Tier 0.5 (gsd-2 manual ports) — sf does these first Tier 0 (pi-mono — should land cleanly via cherry-pick, no namespace divergence): 9 items ranked security → bug-fixes → infra → features. Critical: 1. HTML export escape (security) 2. Empty tools array fix (provider compatibility) 3. Anthropic SSE proxy event tolerance 4. Long local-LLM SSE 5min timeout fix Infrastructure: 5. Bedrock inference profile normalization 6. Symlinked packages dedup 7. ctx.ui.setWorkingVisible() extension API Features: 8. Cloudflare Workers AI provider 9. Azure Cognitive Services endpoint Tier 0.5 (gsd-2 — must be MANUALLY ported; cherry-pick fails on namespace): Critical fixes (11): 1-6. bash race, security hardening, web_search injection narrowing, symlinked staging self-heal, KNOWLEDGE budget, mcp-server deadlock 7-10. agent_end transition fixes (4 commits) 11. claude-code-cli Always-Allow persistence Normal-value features (6): 12. /gsd eval-review slim port (prompt + tool + template) 13. Workflow state machine hardening (5 commits as unit) 14. Proactive rate limiting (min_request_interval_ms) 15. Per-call token telemetry (opt-in pi-coding-agent hooks) 16. Worktree TUI commands 17. Doctor check for orphan milestone directories Skipped from each upstream is documented. All in BUILD_PLAN.md so sf can work the list systematically. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-29 14:04:31 +02:00
Mikael Hugo	bb1c68b7ab	docs: drop OpenRouter-removal follow-up OpenRouter is already neutered via the provider_model_allow allowlist (see `d38e5ea09` fix(schema): auto-coerce string → [string] for sf_* list fields + provider_model_allow tests). The 248 model entries in models.generated.ts are inert — no dispatch path reaches them. Removing the data entries would be aesthetic cleanup with zero behavioral effect. Not worth a Tier-1 follow-up. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-29 13:58:33 +02:00
Mikael Hugo	310ce963ea	docs: add session follow-ups to BUILD_PLAN Six items surfaced during 2026-04-29 ports/refactors that didn't get tracked anywhere: - Tier 1: Remove OpenRouter (~248 model entries; user confirmed unused) - Tier 1: Minimax search tests (deferred from initial port) - Tier 2: Search provider registry refactor (rid of 9-file-per-provider) - Tier 2: Product-audit phase machine wire-up (slim port shipped tool; phase dispatch not yet wired) - Tier 2: Headless assistant-text preview (bunker pattern, deferred from headless UX commit) - Tier 3: Pi-mono SDK sync cadence Each entry has rationale + effort estimate. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-29 13:56:55 +02:00
Mikael Hugo	a8cf2cd941	feat(workflow): add product-audit (slim port) Milestone-end workflow that compares declared product intent (VISION.md, RUNBOOKS.md, etc.) against actual code/test/deploy/docs evidence and emits structured gaps with severity. Soft gates — adds follow-up slices but doesn't hard-block merge. Slim port (4 new files + 1 registration) — extracts only the audit feature itself, not bunker's parallel rewrite of dispatch/prompts/ benchmark-selector that came with it in commit 2aa785475. Created: - prompts/product-audit.md — prompt verbatim, gsd_→sf_ and .gsd→.sf - tools/product-audit-tool.ts — slim file-write implementation, atomicWriteAsync to .sf/active/{mid}/ PRODUCT-AUDIT.{json,md}; no DB deps - bootstrap/product-audit-tool.ts — pi-coding-agent tool registration, TypeBox schema for sf_product_audit - workflow-templates/product-audit.md — workflow template Modified: - bootstrap/register-extension.ts — 2 lines: import + add to nonCriticalRegistrations - workflow-templates/registry.json — registry entry - package.json — version 2.75.0 → 2.75.1 Verdict logic (no-gaps | gaps-found | contract-underspecified) is the load-bearing innovation: contract-underspecified forces the auditor to flag unverifiable docs as a real gap rather than rubber-stamping no-gaps when the product contract is silent. Out of scope: phase enum changes, dispatch hookup. Wire-up to the phase machine is a follow-up; the prompt + tool + template stand alone. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-29 13:55:23 +02:00
Mikael Hugo	2eebeccb93	feat(search): add MiniMax web search provider New search backend alongside tavily/brave/serper/exa/ollama. API key resolution: MINIMAX_CODE_PLAN_KEY → MINIMAX_CODING_API_KEY → MINIMAX_API_KEY (fallback order matches MiniMax's documented aliases). Wired through every existing seam: - type union: SearchProvider = 'tavily' | 'minimax' | 'brave' | 'ollama' - VALID_PREFERENCES set + selection logic in provider.ts - native-search routing (Anthropic native web_search delegates correctly) - /search-provider CLI command (tab completion, select UI, parser) - tool-search.ts: search execution path - tool-llm-context.ts: prefetch / context-builder path - preferences-types + preferences-validation - configuration.md user docs - extension-manifest description Tests not added in this commit — the bunker reference tests don't match our preferences/provider export shape (we have serper/exa/combosearch that bunker doesn't). Tests for getMiniMaxSearchApiKey priority order, resolveSearchProvider returning "minimax", /search-provider minimax CLI behavior, no-key error messages, and executeMiniMaxSearch request shape are TODO. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-29 13:55:04 +02:00
Mikael Hugo	ae0bbe32fc	feat(providers): add xiaomi direct API (token-plan-{ams,sgp,cn}) — additive Adds direct xiaomi token-plan API access alongside the existing OpenRouter-routed xiaomi entries. ADDITIVE only — OpenRouter cleanup is a separate follow-up. Three new region providers: - xiaomi-token-plan-ams (Amsterdam, default for plain `xiaomi`) - xiaomi-token-plan-sgp (Singapore) - xiaomi-token-plan-cn (China) All use Anthropic Messages API. Env-var resolution: XIAOMI_API_KEY → XIAOMI_TOKEN_PLAN_API_KEY → MIMO_API_KEY (in that fallback order). Three xiaomi MiMo models registered under each direct provider: - mimo-v2-flash (256k ctx, 64k output, text-only, reasoning) - mimo-v2-omni (256k ctx, 128k output, text+image, reasoning) - mimo-v2-pro (1M ctx, 128k output, text-only, reasoning) Same model literals × 4 provider keys, different baseUrls per region. Test count assertion bumped 22 → 26 providers. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-29 13:54:43 +02:00
Mikael Hugo	dff0df5fdc	fix(headless): suppress notification spam, categorize messages, distinguish phase vs status Three small UX fixes for headless / autopilot logs: 1. Add `zz-notifications` to TUI_FOOTER_STATUS_KEYS — these are sticky notification dots from the interactive TUI footer; they have no meaning in headless and were spamming the log. 2. Categorize notification messages by prefix so headless output is scannable: [mcp] for MCP-client-ready, [search] for web search status, [parallel] for slice-parallel/subagent dispatch. Falls through to the existing important/non-important formatting for everything else. 3. Distinguish phase transitions from generic status updates: phase:/ milestone:/slice:/task: prefixed keys get [phase]; everything else gets [status]. Previously both used [phase], which was misleading. Patterns based on bunker commits 14ec4d97f / c15afb45f (which were the research source) but written fresh against our existing TUI_FOOTER_STATUS_KEYS structure rather than cherry-picked. The assistant-text-preview commit (cf0274c63) is a separate, larger refactor in headless.ts and is deferred to v3.1. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-29 13:43:40 +02:00
Mikael Hugo	c41912ff55	fix(prompts): tell agents about Serena (repo-intelligence MCP) for code exploration We have .serena/ configured (cache, memories, project.local.yml) but no prompt mentioned Serena anywhere. Agents weren't using it for symbol lookup or cross-file architecture mapping; they fell straight to rg/find. Added a one-sentence Serena hint to the code-exploration step in: - research-slice.md - research-milestone.md - plan-slice.md - plan-milestone.md - guided-research-slice.md Phrased generically ("If a repo-intelligence MCP (e.g. Serena) is configured...") so it degrades cleanly when Serena isn't set up. Pattern based on bunker commit 4ba746888 but written fresh against our post-rename prompt structure rather than cherry-picked. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-29 13:41:33 +02:00
Mikael Hugo	7a6169705a	docs: lock in fork stance, reframe cherry-pick list as reference-only After attempting cluster B (4 surgical agent-session fixes), even the first commit conflicted because of structural namespace divergence (gsd_→sf_ rename, @sf-run/→@singularity-forge/ rename, prior pi-mono direct cherry-picks). The conflicts are real semantic divergence, not noise. Conclusion: sf is a fork; we do not periodically sync from gsd-build/gsd-2. Pretending we still track upstream means weeks of merge work for diminishing return. BUILD_PLAN.md adds an explicit "Upstream stance" section documenting the fork posture and the rationale for the three irreversible naming choices. UPSTREAM_CHERRY_PICK_CANDIDATES.md is reframed as a reference list, not an action plan. The clusters and SHAs remain useful as an intelligence source — port specific fixes by hand when one bites us; do not run automated cherry-picks against the list. Pi-mono SDK syncs continue separately — that path doesn't have the same divergence problem. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-29 12:57:44 +02:00
Mikael Hugo	a80beb83b5	docs: enumerate high-value upstream cherry-pick candidates The origin↔upstream divergence is 4,589 commits. This file picks the high-leverage subset (~70 commits across 16 topical clusters) worth considering for cherry-pick. Recommended order at the bottom. Each cluster lists candidate SHAs with one-line context and effort estimates. Total estimated work if all clusters A-N are taken: ~10-15 hours plus conflict resolution. Cluster O (UnitContextManifest / Composer rewrite, ~15 commits) is deferred — likely conflicts heavily with our work and should be revisited during v3 schema reconciliation. Cluster P (memories table cutover, 1 commit) is flagged as READ FIRST because it's upstream's answer to what BUILD_PLAN calls Singularity Memory integration; reading it may change the recommended integration path. This is a candidate list for human decision, not an action plan. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-29 12:53:46 +02:00
Mikael Hugo	b24f426f2b	batch: snapshot of in-flight v2 work This commit captures uncommitted modifications that accumulated in the working tree across multiple in-progress workstreams. It is a snapshot to clear the deck before sf v3 work begins; individual workstreams should land separately on top of this. Notable additions: - trace-collector.ts, traces.ts, src/tests/trace-export.test.ts — trace export plumbing - biome.json — Biome linter configuration - .gitignore — exclude native/npm/*/.node compiled binaries The bulk of the diff is across src/resources/extensions/sf/ (301 files) and src/resources/extensions/sf/tests/ (277 files), reflecting the ongoing sf extension work. Specific feature commits should follow this snapshot rather than being archaeology'd out of it. The 76MB native/npm/linux-x64-gnu/forge_engine.node compiled binary was left out of the commit — it's now gitignored and built locally. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-29 12:42:31 +02:00
Mikael Hugo	31842885ea	docs: add BUILD_PLAN.md — tiered cut of v3 NEW items Of the 56 NEW items in SPEC.md, not all are worth building for v3. This plan groups them by tier: - Tier 1 ESSENTIAL (~5 weeks): Vault resolver, sm integration decision, schema reconciliation, config alignment. - Tier 2 STRONG (~3-4 weeks): doc-sync, intent chapters, PhaseReview 3-pass, turn_status marker, last_error cap, cost_micro_usd. - Tier 3 NICE (v3.1+): persistent agents, inter-agent messaging, workflow content pinning, runs table, pending_retain. - Tier 4 DEFER: SSH workers, HTTP API auth, trace_index, PhaseUAT — build when a deployment demands it. - Tier 5 DROP: items from late adversarial-review iterations that don't earn their keep (workflow_pins separate table, snap_ columns, agent_capabilities separate index). Includes a recommended ~6-8 week v3.0 schedule and four decision points that should be settled before starting work. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-29 12:33:07 +02:00
Mikael Hugo	57a1bc6505	docs: import sf v3 spec from singularity-crush, annotated for status Imports SPEC.md (v1.0-draft) from singularity-ng/crush#docs/spec — the forward-looking contract for sf v3. Annotated section-by-section and item-by-item with implementation status against current sf: - EXISTS — already implemented in sf, matches the spec - PARTIAL — implemented but diverges from spec; needs alignment work - NEW — not yet implemented Conformance breakdown (123 items total): - 37 EXISTS - 30 PARTIAL - 56 NEW The NEW items concentrate in: persistent-agent inbox model (§17/§18), Singularity Memory integration (§16/§24), SSH worker extension (§22), several supervisor refinements (§9), and policy/operations details (audit fields, trace metadata, version pinning) introduced during the v0.x adversarial review iterations. The PARTIAL items concentrate in: schema reconciliation (sf has 3 tables — milestones/slices/tasks — vs spec's single units table), config schema alignment, runs-table unification with audit_events, and several worker-attempt lifecycle details that exist in different shapes today. This is an informational import. Implementing v3 against this spec is its own work; the next step is deciding which NEW items are actually wanted vs deferred, and whether to migrate the 3-table planning schema to the single-units shape or keep what sf has and update the spec. Spec source: https://github.com/singularity-ng/crush/blob/docs/spec/SPEC.md Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-29 12:15:02 +02:00
Mikael Hugo	6eaf5926ad	sf snapshot: uncommitted changes after 248m inactivity	2026-04-28 21:10:17 +02:00
Mikael Hugo	d30d91bf2f	sf snapshot: uncommitted changes after 41m inactivity	2026-04-28 17:01:26 +02:00
Mikael Hugo	5d3c204006	fix(git-merge): no auto-flip from approved to declined; cached approval is sticky Codex-rescue output (a299c461 / bnr88iy59) — the 'Git merge approved once' followed seconds later by 'Git merge declined by user' bug we hit on M002 complete-milestone. Same gate, same agent run, opposite verdicts. Single source of truth for the merge-gate state in guardrails/index.ts. Approval is now sticky — re-asks return the cached approval until consumed or explicitly revoked, never auto-flip to decline. Timeout converts to pause+log instead of decline. Adds tests/safe-git-merge-gate.test.ts. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Co-Authored-By: OpenAI Codex <noreply@openai.com>	2026-04-28 16:20:08 +02:00
Mikael Hugo	d38e5ea092	fix(schema): auto-coerce string → [string] for sf_* list fields + provider_model_allow tests Two codex-rescue tasks landed together: 1. Auto-coerce JSON-schema validator: when a tool field declares {type:"array", items:{type:"string"}} and the model sends a single string, wrap it in [string] before validation instead of hard-rejecting. Fixes the recurring "keyDecisions: must be array" rejection on sf_complete_task that wasted retries. 2. Provider_model_allow filter (proper implementation with helpers): - resolveProviderModelAllowList / isProviderModelAllowed / filterModelsByProviderModelAllow helpers in preferences-models - Wired into model-registry and auto-model-selection - New tests/provider-model-allow.test.ts Tools coerced: sf_complete_task, sf_complete_milestone, sf_plan_milestone, sf_plan_slice, sf_replan_slice, sf_reassess_roadmap (key list fields). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Co-Authored-By: OpenAI Codex <noreply@openai.com>	2026-04-28 12:30:55 +02:00
Mikael Hugo	f98a1e360e	batch: codex-rescue session output (multiple in-flight tasks) Combined output of multiple parallel codex-rescue runs that produced working-tree edits but didn't commit. Tasks contributing: - prefs: per-provider model allow-list (provider_model_allow) — manual - TUI scroll + unresponsive (a7884d1a / bt3fpn4y2) - planningMeeting required (aa09e904 / br127l763) - Logs UX 4-pack (a5c65314 / btcplhu7f) - Gate auto-resolve + completion nudge (ae4c8b64 / bw1w1fjkp) - sf_task_complete atomic + retry (a7a079b4 / b20cy5owv) - Multi-model meeting + minimax M2.7 + draft promotion (a756faac / task-moifjknd-lwjc98) - Per-role slice prompts (a94c3e1a) - Per-role vision-meeting prompts (afd165a0 / task-moifple5-lcwtjl) - Schema sweep (ac994b1e / task-moifq7pu-83coqz) - Flow audit (ad26ecfd / bttj4vrqm) Typecheck passes. Tests not run as a full suite — spot-check after merge. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Co-Authored-By: OpenAI Codex <noreply@openai.com>	2026-04-28 11:52:42 +02:00
Mikael Hugo	66ff949c11	cherry-pick(security): harden project-controlled surfaces (PR #4755 partial) Cherry-pick of gsd-build/gsd-2 65ca5aa2e — applies the security hardening hunks that conflicted minimally: - mcp-server/env-writer: validate writes against a strict allowlist - web/api/files: enforce path containment via web/lib/secure-path - vscode-extension: read binaryPath/autoStart only from trusted global/default scopes (resolveTrustedSfStartupConfig), avoiding workspace-controlled override (renamed Gsd → Sf for sf naming) - New regression tests: mcp-client-security, vscode-startup-security, web-files-symlink Skipped hunks (drifted): mcp-server/server.ts, mcp-client/index.ts, mcp-server/README.md. Co-Authored-By: Jeremy <jeremy@fluxlabs.net> Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-28 05:37:07 +02:00
Mikael Hugo	bf727173e7	cherry-pick(file-lock): make file-lock actually lock and throw on contention Cherry-pick of gsd-build/gsd-2 a09e01640 — withFileLockSync now actually acquires a proper-lockfile (was previously a no-op when proper-lockfile wasn't required) and throws on ELOCKED contention by default. Adds onLocked: 'skip' option for best-effort callers that tolerate dropped entries (audit, journal). Modernizes import style (createRequire/join from imports rather than ad-hoc require). Path-renames preserved (gsd-pi → sf-run). Co-Authored-By: Jeremy <jeremy@fluxlabs.net> Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-28 05:28:36 +02:00
Mikael Hugo	22d4579690	cherry-pick(state): lock-wrapped appends for journal, audit, workflow-logger Cherry-pick of gsd-build/gsd-2 53babec29 — lock-wrapped append half. Wraps appends to .sf/journal/, .sf/audit/events.jsonl, and the workflow-logger error log in withFileLockSync (onLocked: skip), preserving best-effort semantics while preventing torn writes under contention. Companion to the atomic-write half landed in `3df56cb94`. Path-renames (gsdRoot→sfRoot, gsd-db→sf-db) preserved during conflict resolution. Co-Authored-By: Jeremy <jeremy@fluxlabs.net> Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-28 05:27:44 +02:00
Mikael Hugo	f1f4b840e1	cherry-pick(doctor): self-heal symlinked .sf staging to prevent silent data loss Cherry-pick of gsd-build/gsd-2 9340f1e9b (#4423) — doctor self-heal detection for symlinked staging directories that can cause silent data loss. Skips native-git-bridge.ts and git-service test (drifted). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-28 05:25:56 +02:00
Mikael Hugo	7fd4672e55	cherry-pick(auto): handle worktree context fallback + sanitize paused session paths Cherry-pick of gsd-build/gsd-2 a4f78731f — handles worktree context fallback and sanitizes paths in paused session resumption. Skips uok-plan-v2-wiring test hunk (drifted in sf). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-28 05:25:40 +02:00
Mikael Hugo	93402643f4	cherry-pick(sf-db): tolerate corrupt task arrays in milestone rows Cherry-pick of gsd-build/gsd-2 851507913 (#4056) — defensive parsing so a corrupt or non-array tasks blob in a milestone row doesn't crash sf-db reads. Test hunk skipped (sf-db.test.ts has drifted). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-28 05:25:21 +02:00
Mikael Hugo	3df56cb94f	cherry-pick(state): atomic-writes for guided-flow-queue and reports Cherry-pick of gsd-build/gsd-2 53babec29 (Jeremy <jeremy@fluxlabs.net>) — atomic-write half only. Eliminates torn-write risk on PROJECT.md queue sync and reports.json/HTML index regeneration by switching writeFileSync → atomicWriteSync (tmp+rename). The companion lock-wrapped-append changes (journal.ts, uok/audit.ts, workflow-logger.ts) are deferred — they need proper-lockfile + withFileLockSync helper introduced first. Co-Authored-By: Jeremy <jeremy@fluxlabs.net> Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-28 05:16:39 +02:00
Mikael Hugo	8e827147c9	feat(code-intelligence): add sift indexer backend alongside project-rag Generalize the code-intelligence hook to support multiple indexer backends, with sift (rupurt/sift) as a new option next to the existing project-rag MCP server. Backend is selected via CodebaseMapPreferences. - code-intelligence.ts: new abstraction + sift backend (detect, resolve, status, context-block contribution) - preferences-types.ts: codebaseIndexer field (project-rag | sift | none) - preferences-validation.ts: validate the new field - bootstrap/system-context.ts, commands-codebase.ts: dispatch on backend - tests/code-intelligence.test.ts: sift detection/resolution/status tests (19 pass, 0 fail) project-rag path unchanged and continues to work. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-28 05:05:26 +02:00
Mikael Hugo	0606983d97	feat(subagent): add background job manager and tests SubagentBackgroundJobManager tracks long-running subagent jobs with status, abort support, and TTL-based eviction of completed results. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-28 04:18:17 +02:00
Mikael Hugo	efd5e14e0a	feat: add FEATURES.md capability map and generator Human-oriented documentation of SF capabilities, with a script that keeps it in sync with workflow-tools.ts and extension manifests. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-28 04:18:12 +02:00
Mikael Hugo	25797129e2	sf snapshot: pre-dispatch, uncommitted changes after 38m inactivity	2026-04-28 00:21:39 +02:00
Mikael Hugo	0d286b991b	sf snapshot: pre-dispatch, uncommitted changes after 2902m inactivity	2026-04-27 23:42:51 +02:00
Mikael Hugo	260d50a823	docs: warn against Python for managed-resources hash; causes resync hang Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-25 23:20:15 +02:00
Mikael Hugo	f0da5b6d21	fix: bind getProviderAuthMode to registry instance to avoid undefined 'this' Extracting a class method as a bare reference loses its 'this' context, causing 'Cannot read properties of undefined' when minimax (or any provider) triggers the flat-rate auth-mode lookup. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-25 19:22:39 +02:00
Mikael Hugo	7be540480e	docs: add CLAUDE.md with dev guide for build pipeline and test runner Documents the dist-vs-source distinction that caused the memoriesSection fix to not take effect, the c8 coverage runner process leak, and the template variable maintenance contract. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-25 18:56:03 +02:00
Mikael Hugo	7289933909	fix: populate memoriesSection in execute-task prompt and fix stale dist buildExecuteTaskPrompt was not passing memoriesSection to loadPrompt, causing headless auto to fail with a template variable error. Also updated plan-slice-prompt.test.ts to supply the four template variables (memoriesSection, runtimeContext, phaseAnchorSection, gatesToClose) that were missing from the test fixture. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-25 18:46:55 +02:00
Mikael Hugo	a30a7692e3	fix: dist-redirect.mjs incorrectly rewrites .js→.ts for node_modules paths containing /src/ The resolver guarded on context.parentURL.includes('/src/') to identify in-repo source files, but @google/gemini-cli-core installs to node_modules/@google/gemini-cli-core/dist/src/ which also contains '/src/'. Relative imports from that dist package (e.g. './config/config.js') were incorrectly rewritten to './config/config.ts', causing ERR_MODULE_NOT_FOUND on every test that transitively imports the google-gemini provider. Fix: add !context.parentURL.includes('/node_modules/') guard. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-25 18:04:23 +02:00
Mikael Hugo	2e32c96fa0	Port gsd2 functional parity: turn-epoch, abandon-detect, reapplyThinking, exec chain, memory chain, onboarding-state - auto/turn-epoch.ts: AsyncLocalStorage-backed stale-write dropping for timeout recovery - journal.ts: isStaleWrite() guard drops superseded turn writes - auto/run-unit.ts: wrap agent_end Promise.race in runWithTurnGeneration - auto/session.ts: ThinkingLevelSnapshot type + autoModeStartThinkingLevel/originalThinkingLevel fields - auto-model-selection.ts: reapplyThinkingLevel() called after every successful setModel() - auto/phases.ts: pass autoModeStartThinkingLevel to selectAndApplyModel + hook override restore - abandon-detect.ts: two-signal milestone abandon detection in rewrite-docs overrides - auto-post-unit.ts: use detectAbandonMilestone + parkMilestone in rewrite-docs handler - preferences-types.ts: ContextModeConfig + isContextModeEnabled - exec-sandbox.ts: sandboxed bash/node/python subprocess with .sf/exec/ persistence - exec-history.ts: read-side scan of .sf/exec/*.meta.json - compaction-snapshot.ts: ≤2 KB markdown digest written before context compaction - tools/exec-tool.ts: sf_exec MCP tool executor - tools/exec-search-tool.ts: sf_exec_search MCP tool executor - tools/resume-tool.ts: sf_resume MCP tool executor - bootstrap/exec-tools.ts: registers sf_exec/sf_exec_search/sf_resume - memory-relations.ts: knowledge-graph edges between memories (traverseGraph) - tools/memory-tools.ts: capture_thought/memory_query/sf_graph executors - bootstrap/memory-tools.ts: registers capture_thought/memory_query/sf_graph - bootstrap/register-extension.ts: wire exec-tools + memory-tools into registration - onboarding-state.ts: onboarding completion record at ~/.sf/agent/onboarding.json Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-25 10:58:39 +02:00
Mikael Hugo	5887ea3fd1	port gsd2: blocked-models gate, milestone-summary classifier, unsupported-model recovery blocked-models.ts (new): Persistent per-project blocklist at .sf/runtime/blocked-models.json. loadBlockedModels / isModelBlocked / blockModel (file-lock-safe write). milestone-summary-classifier.ts (new): classifyMilestoneSummaryContent → "success" | "failure" | "unknown". isTerminalMilestoneSummaryContent: failure summaries are NOT terminal, which lets auto-mode re-enter a milestone after a failed recovery summary. state.ts: Phase 1 (completeMilestoneIds) and Phase 2 (registry) now check isTerminalMilestoneSummaryContent before treating a SUMMARY as complete. A failure SUMMARY no longer prematurely parks a milestone. error-classifier.ts: Add "unsupported-model" ErrorClass kind with regex detection (model + not-supported/unavailable/no-access + account/plan/tier). Checked before "permanent" so /account/i in PERMANENT_RE doesn't swallow it. auto-model-selection.ts: Wire isModelBlocked() gate in selectAndApplyModel candidate loop: skips provider-rejected models and continues to fallbacks. bootstrap/agent-end-recovery.ts: Handle cls.kind === "unsupported-model": blockModel(), try fallback chain skipping already-blocked models, pause if no usable fallback. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-25 10:13:27 +02:00
Mikael Hugo	6cb6de4fd2	perf: parallelize I/O, add runtime cache, extend nix devenv - unit-context-composer: resolve artifact keys in parallel (Promise.all) - unit-runtime: add in-memory cache to avoid repeated disk reads per dispatch - auto-timers: share 15s idle watchdog tick with context-pressure check - auto-prompts: 1s TTL budget cache to coalesce repeated loadEffectiveSFPreferences calls - native-git-bridge: extend nativeHasChanges TTL 10s→30s - auto-dashboard: remove pulsing dot animation (CPU churn, no UX value) - flake.nix: add nodePackages.typescript to dev shell Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-25 10:12:32 +02:00
Mikael Hugo	12aabd863e	port gsd2 #4769: worktree telemetry, slice-cadence, canonical-root fix + /sf scan Ports commit 7fb35ca58 from gsd2 (PR #4769) covering four issues: #4761: resolveCanonicalMilestoneRoot in worktree-manager.ts routes validate-milestone through the live worktree path instead of stale project-root state when a milestone worktree is active. #4762: auditOrphanedMilestoneBranches in auto-start.ts now surfaces in-progress milestone branches with unmerged commits ahead of main (previously only complete milestones were audited). Gated on isClosedStatus so parked/other closed statuses are unaffected. #4764: worktree-telemetry.ts: typed emit helpers (emitWorktreeCreated, emitWorktreeMerged, emitWorktreeOrphaned, emitAutoExit, emitWorktreeSync, emitCanonicalRootRedirect, emitSliceMerged, emitMilestoneResquash) plus summarizeWorktreeTelemetry aggregator and nearest-rank percentile(). Wired in: worktree-resolver.ts (create/merge events), auto-start.ts (orphan telemetry), auto.ts stopAuto (auto-exit with normalized reason), worktree-manager.ts (canonical-root-redirect). Surfaced in forensics.ts via detectWorktreeOrphans and Worktree Telemetry sections. #4765: slice-cadence.ts: mergeSliceToMain squash-merges each slice's commits onto main as soon as the slice passes validation (opt-in via git.collapse_cadence: "slice"). resquashMilestoneOnMain collapses N per-slice commits into one milestone commit at completion. Wired in auto-post-unit.ts (slice merge after complete-slice with stopAuto on conflict/error) and worktree-resolver.ts (resquash at mergeAndExit). AutoSession.milestoneStartShas tracks the pre-first-slice SHA. GitPreferences and preferences-validation.ts extended with collapse_cadence and milestone_resquash fields. Also ports /sf scan command: commands-scan.ts with parseScanArgs, resolveScanDocuments, buildScanOutputPaths, and handleScan dispatching a focused codebase assessment prompt to .sf/codebase/. journal.ts: 9 new JournalEventType values for the telemetry events. All changes are additive; default behavior (cadence="milestone") unchanged. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-25 09:03:56 +02:00
Mikael Hugo	2911d3b93d	port gsd2: reassess-roadmap opt-in (ADR-003 §4) + prefer toolDefinition.label reassess-roadmap: flip default from true → false. Most reassess units conclude "roadmap is fine" burning a session for no change; the plan-slice prompt now carries a JIT preamble at zero cost. (#4778) tool-execution: always prefer toolDefinition.label when non-empty, even when label === name — allows tools to display their canonical name explicitly. (#4758) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-25 08:33:50 +02:00
Mikael Hugo	d4cdcb582d	port gsd2 #3338: ecosystem plugin loader for .sf/extensions/ Adds support for project-local SF extension plugins dropped in .sf/extensions/. Trust-gated (requires pi trust), symlink-escape safe. - ecosystem/sf-extension-api.ts: SFExtensionAPI wrapper exposing getPhase() and getActiveUnit() to third-party handlers; updateSnapshot refreshes state before_agent_start so handlers see current phase/unit - ecosystem/loader.ts: discovers .sf/extensions/*.js, loads them via dynamic import, dispatches factory(api) for each - register-extension.ts: initializes ecosystemHandlers array, wires loader - register-hooks.ts: before_agent_start refreshes snapshot then dispatches ecosystem handlers before returning SF system prompt - types.ts: SFActiveUnit interface (milestoneId/sliceId/taskId + titles) - workflow-logger.ts: "ecosystem" added to LogComponent union Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-25 08:27:55 +02:00
Mikael Hugo	6c36d62f35	port gsd2 #4961: stop using active-tool snapshot as model-policy gate Fixes a bug where per-unit tool narrowing poisoned the policy gate for subsequent units, causing "Model policy denied dispatch before prompt send" errors on complete-slice and discuss-milestone (100% Win repro). Four-part port from gsd2@817031b2a: - ModelPolicyDispatchBlockedError class with per-model deny reasons - TOOL_BASELINE WeakMap + clearToolBaseline/restoreToolBaseline lifecycle - auto-model-selection: use getRequiredWorkflowToolsForAutoUnit as requiredTools - auto/loop: catch ModelPolicyDispatchBlockedError as non-retryable (pause) - auto.ts: wire clearToolBaseline at startAuto (fresh only) and stopAuto Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-25 08:04:04 +02:00
Mikael Hugo	4fdd8700a3	port gsd2 upstream features: scope classifier, composer v2, GPT-5.5, test timeout - milestone-scope-classifier: add getMilestonePipelineVariant + milestoneRowToScopeInput wired into auto-dispatch trivial-skip for research/validation phases (#4781) - auto-prompts: rename GSD→SF identifiers, add isSummaryCleanForSkip, prefs param on checkNeedsReassessment, buildExtractionStepsBlock from commands-extract-learnings - unit-context-manifest + unit-context-composer: port v2 typed computed artifacts (#4924) - skill-manifest: per-unit-type skill filter resolver (#4788, #4792) - escalation: stub for ADR-011 mid-execution escalation (full port deferred) - auto-start: extract decideSurvivorAction for testability (#4832) - models: add gpt-5.5 + gpt-5.4-mini to cost table, router, and models.generated.ts - types: EscalationArtifact, context_window_override, skip_clean_reassess, mid_execution_escalation, sketch_scope on SliceRow - tool-execution: add visibleWidth import (was undefined) - package.json: add --test-timeout=30000 to prevent parallel tests from freezing machine Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-25 08:08:11 +02:00
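
The guard described in a30a7692e3 (rewrite `.js` to `.ts` only for in-repo source files, never for files under `node_modules`) can be sketched roughly as below. This is a minimal illustration, not the actual contents of dist-redirect.mjs: the function names `isInRepoSource` and `rewriteSpecifier` and the relative-import check are assumptions made for the example; only the two `includes()` conditions come from the commit message.

```typescript
// Hypothetical helper (name assumed): decides whether the importing file is
// in-repo source. The '/src/' check alone also matches
// node_modules/@google/gemini-cli-core/dist/src/, hence the second condition.
function isInRepoSource(parentURL: string | undefined): boolean {
  if (!parentURL) return false;
  return parentURL.includes("/src/") && !parentURL.includes("/node_modules/");
}

// Hypothetical helper (name assumed): rewrites a relative './x.js' specifier
// to './x.ts' only when the importer is in-repo source; otherwise returns the
// specifier unchanged.
function rewriteSpecifier(specifier: string, parentURL: string | undefined): string {
  const isRelative = specifier.startsWith("./") || specifier.startsWith("../");
  if (isRelative && specifier.endsWith(".js") && isInRepoSource(parentURL)) {
    return specifier.replace(/\.js$/, ".ts");
  }
  return specifier;
}
```

With this shape, an import of `./config/config.js` from a file under `node_modules/@google/gemini-cli-core/dist/src/` is left alone, while the same specifier imported from an in-repo `src/` file is redirected to the `.ts` source.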

3635 commits