singularity/singularity-forge

Author	SHA1	Message	Date
Mikael Hugo	7c487bb60e	port(pi-mono): normalize Bedrock model names for inference profiles (refs ed4bc7308) Pi-mono Tier 0 #5 — first sf-driven port. sf-from-source dispatched the task in print mode and produced this fix autonomously. Adds getModelMatchCandidates(modelId, modelName?) helper that normalizes both inputs to lowercase and dash-separated form (s.replace(/[\s_.:]+/g, "-")). Inference profile ARNs don't embed the model name; the helper lets capability checks match against either the inference profile ARN or the underlying model name. Updated: - supportsAdaptiveThinking — uses the helper; consolidates the opus-4.6/opus-4-6 dot-vs-dash variants. - mapThinkingLevelToEffort — same pattern. - supportsPromptCaching — same pattern (also from pi-mono PR #3527). - streamSimpleBedrock and buildAdditionalModelRequestFields — pass model.name through to capability checks. Type-check passes (cd packages/pi-ai && npx tsc --noEmit). Co-Authored-By: sf v2.75.1 (session 911dd2de) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-29 14:14:17 +02:00
Mikael Hugo	a3c487c918	docs: add Tier 0 (pi-mono ports) and Tier 0.5 (gsd-2 manual ports) — sf does these first Tier 0 (pi-mono — should land cleanly via cherry-pick, no namespace divergence): 9 items ranked security → bug-fixes → infra → features. Critical: 1. HTML export escape (security) 2. Empty tools array fix (provider compatibility) 3. Anthropic SSE proxy event tolerance 4. Long local-LLM SSE 5min timeout fix Infrastructure: 5. Bedrock inference profile normalization 6. Symlinked packages dedup 7. ctx.ui.setWorkingVisible() extension API Features: 8. Cloudflare Workers AI provider 9. Azure Cognitive Services endpoint Tier 0.5 (gsd-2 — must be MANUALLY ported; cherry-pick fails on namespace): Critical fixes (11): 1-6. bash race, security hardening, web_search injection narrowing, symlinked staging self-heal, KNOWLEDGE budget, mcp-server deadlock 7-10. agent_end transition fixes (4 commits) 11. claude-code-cli Always-Allow persistence Normal-value features (6): 12. /gsd eval-review slim port (prompt + tool + template) 13. Workflow state machine hardening (5 commits as unit) 14. Proactive rate limiting (min_request_interval_ms) 15. Per-call token telemetry (opt-in pi-coding-agent hooks) 16. Worktree TUI commands 17. Doctor check for orphan milestone directories Skipped from each upstream is documented. All in BUILD_PLAN.md so sf can work the list systematically. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-29 14:04:31 +02:00
Mikael Hugo	bb1c68b7ab	docs: drop OpenRouter-removal follow-up OpenRouter is already neutered via the provider_model_allow allowlist (see `d38e5ea09` fix(schema): auto-coerce string → [string] for sf_* list fields + provider_model_allow tests). The 248 model entries in models.generated.ts are inert — no dispatch path reaches them. Removing the data entries would be aesthetic cleanup with zero behavioral effect. Not worth a Tier-1 follow-up. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-29 13:58:33 +02:00
Mikael Hugo	310ce963ea	docs: add session follow-ups to BUILD_PLAN Six items surfaced during 2026-04-29 ports/refactors that didn't get tracked anywhere: - Tier 1: Remove OpenRouter (~248 model entries; user confirmed unused) - Tier 1: Minimax search tests (deferred from initial port) - Tier 2: Search provider registry refactor (rid of 9-file-per-provider) - Tier 2: Product-audit phase machine wire-up (slim port shipped tool; phase dispatch not yet wired) - Tier 2: Headless assistant-text preview (bunker pattern, deferred from headless UX commit) - Tier 3: Pi-mono SDK sync cadence Each entry has rationale + effort estimate. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-29 13:56:55 +02:00
Mikael Hugo	a8cf2cd941	feat(workflow): add product-audit (slim port) Milestone-end workflow that compares declared product intent (VISION.md, RUNBOOKS.md, etc.) against actual code/test/deploy/docs evidence and emits structured gaps with severity. Soft gates — adds follow-up slices but doesn't hard-block merge. Slim port (4 new files + 1 registration) — extracts only the audit feature itself, not bunker's parallel rewrite of dispatch/prompts/benchmark-selector that came with it in commit 2aa785475. Created: - prompts/product-audit.md — prompt verbatim, gsd_→sf_ and .gsd→.sf - tools/product-audit-tool.ts — slim file-write implementation, atomicWriteAsync to .sf/active/{mid}/PRODUCT-AUDIT.{json,md}; no DB deps - bootstrap/product-audit-tool.ts — pi-coding-agent tool registration, TypeBox schema for sf_product_audit - workflow-templates/product-audit.md — workflow template Modified: - bootstrap/register-extension.ts — 2 lines: import + add to nonCriticalRegistrations - workflow-templates/registry.json — registry entry - package.json — version 2.75.0 → 2.75.1 Verdict logic (no-gaps | gaps-found | contract-underspecified) is the load-bearing innovation: contract-underspecified forces the auditor to flag unverifiable docs as a real gap rather than rubber-stamping no-gaps when the product contract is silent. Out of scope: phase enum changes, dispatch hookup. Wire-up to the phase machine is a follow-up; the prompt + tool + template stand alone. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-29 13:55:23 +02:00
Mikael Hugo	2eebeccb93	feat(search): add MiniMax web search provider New search backend alongside tavily/brave/serper/exa/ollama. API key resolution: MINIMAX_CODE_PLAN_KEY → MINIMAX_CODING_API_KEY → MINIMAX_API_KEY (fallback order matches MiniMax's documented aliases). Wired through every existing seam: - type union: SearchProvider = 'tavily' | 'minimax' | 'brave' | 'ollama' - VALID_PREFERENCES set + selection logic in provider.ts - native-search routing (Anthropic native web_search delegates correctly) - /search-provider CLI command (tab completion, select UI, parser) - tool-search.ts: search execution path - tool-llm-context.ts: prefetch / context-builder path - preferences-types + preferences-validation - configuration.md user docs - extension-manifest description Tests not added in this commit — the bunker reference tests don't match our preferences/provider export shape (we have serper/exa/combosearch that bunker doesn't). Tests for getMiniMaxSearchApiKey priority order, resolveSearchProvider returning "minimax", /search-provider minimax CLI behavior, no-key error messages, and executeMiniMaxSearch request shape are TODO. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-29 13:55:04 +02:00
Mikael Hugo	ae0bbe32fc	feat(providers): add xiaomi direct API (token-plan-{ams,sgp,cn}) — additive Adds direct xiaomi token-plan API access alongside the existing OpenRouter-routed xiaomi entries. ADDITIVE only — OpenRouter cleanup is a separate follow-up. Three new region providers: - xiaomi-token-plan-ams (Amsterdam, default for plain `xiaomi`) - xiaomi-token-plan-sgp (Singapore) - xiaomi-token-plan-cn (China) All use Anthropic Messages API. Env-var resolution: XIAOMI_API_KEY → XIAOMI_TOKEN_PLAN_API_KEY → MIMO_API_KEY (in that fallback order). Three xiaomi MiMo models registered under each direct provider: - mimo-v2-flash (256k ctx, 64k output, text-only, reasoning) - mimo-v2-omni (256k ctx, 128k output, text+image, reasoning) - mimo-v2-pro (1M ctx, 128k output, text-only, reasoning) Same model literals × 4 provider keys, different baseUrls per region. Test count assertion bumped 22 → 26 providers. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-29 13:54:43 +02:00
Mikael Hugo	dff0df5fdc	fix(headless): suppress notification spam, categorize messages, distinguish phase vs status Three small UX fixes for headless / autopilot logs: 1. Add `zz-notifications` to TUI_FOOTER_STATUS_KEYS — these are sticky notification dots from the interactive TUI footer; they have no meaning in headless and were spamming the log. 2. Categorize notification messages by prefix so headless output is scannable: [mcp] for MCP-client-ready, [search] for web search status, [parallel] for slice-parallel/subagent dispatch. Falls through to the existing important/non-important formatting for everything else. 3. Distinguish phase transitions from generic status updates: phase:/ milestone:/slice:/task: prefixed keys get [phase]; everything else gets [status]. Previously both used [phase], which was misleading. Patterns based on bunker commits 14ec4d97f / c15afb45f (which were the research source) but written fresh against our existing TUI_FOOTER_STATUS_KEYS structure rather than cherry-picked. The assistant-text-preview commit (cf0274c63) is a separate, larger refactor in headless.ts and is deferred to v3.1. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-29 13:43:40 +02:00
Mikael Hugo	c41912ff55	fix(prompts): tell agents about Serena (repo-intelligence MCP) for code exploration We have .serena/ configured (cache, memories, project.local.yml) but no prompt mentioned Serena anywhere. Agents weren't using it for symbol lookup or cross-file architecture mapping; they fell straight to rg/find. Added a one-sentence Serena hint to the code-exploration step in: - research-slice.md - research-milestone.md - plan-slice.md - plan-milestone.md - guided-research-slice.md Phrased generically ("If a repo-intelligence MCP (e.g. Serena) is configured...") so it degrades cleanly when Serena isn't set up. Pattern based on bunker commit 4ba746888 but written fresh against our post-rename prompt structure rather than cherry-picked. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-29 13:41:33 +02:00
Mikael Hugo	7a6169705a	docs: lock in fork stance, reframe cherry-pick list as reference-only After attempting cluster B (4 surgical agent-session fixes), even the first commit conflicted because of structural namespace divergence (gsd_→sf_ rename, @sf-run/→@singularity-forge/ rename, prior pi-mono direct cherry-picks). The conflicts are real semantic divergence, not noise. Conclusion: sf is a fork; we do not periodically sync from gsd-build/gsd-2. Pretending we still track upstream means weeks of merge work for diminishing return. BUILD_PLAN.md adds an explicit "Upstream stance" section documenting the fork posture and the rationale for the three irreversible naming choices. UPSTREAM_CHERRY_PICK_CANDIDATES.md is reframed as a reference list, not an action plan. The clusters and SHAs remain useful as an intelligence source — port specific fixes by hand when one bites us; do not run automated cherry-picks against the list. Pi-mono SDK syncs continue separately — that path doesn't have the same divergence problem. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-29 12:57:44 +02:00
Mikael Hugo	a80beb83b5	docs: enumerate high-value upstream cherry-pick candidates The origin↔upstream divergence is 4,589 commits. This file picks the high-leverage subset (~70 commits across 16 topical clusters) worth considering for cherry-pick. Recommended order at the bottom. Each cluster lists candidate SHAs with one-line context and effort estimates. Total estimated work if all clusters A-N are taken: ~10-15 hours plus conflict resolution. Cluster O (UnitContextManifest / Composer rewrite, ~15 commits) is deferred — likely conflicts heavily with our work and should be revisited during v3 schema reconciliation. Cluster P (memories table cutover, 1 commit) is flagged as READ FIRST because it's upstream's answer to what BUILD_PLAN calls Singularity Memory integration; reading it may change the recommended integration path. This is a candidate list for human decision, not an action plan. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-29 12:53:46 +02:00
Mikael Hugo	b24f426f2b	batch: snapshot of in-flight v2 work This commit captures uncommitted modifications that accumulated in the working tree across multiple in-progress workstreams. It is a snapshot to clear the deck before sf v3 work begins; individual workstreams should land separately on top of this. Notable additions: - trace-collector.ts, traces.ts, src/tests/trace-export.test.ts — trace export plumbing - biome.json — Biome linter configuration - .gitignore — exclude native/npm/*/.node compiled binaries The bulk of the diff is across src/resources/extensions/sf/ (301 files) and src/resources/extensions/sf/tests/ (277 files), reflecting the ongoing sf extension work. Specific feature commits should follow this snapshot rather than being archaeology'd out of it. The 76MB native/npm/linux-x64-gnu/forge_engine.node compiled binary was left out of the commit — it's now gitignored and built locally. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-29 12:42:31 +02:00
Mikael Hugo	31842885ea	docs: add BUILD_PLAN.md — tiered cut of v3 NEW items Of the 56 NEW items in SPEC.md, not all are worth building for v3. This plan groups them by tier: - Tier 1 ESSENTIAL (~5 weeks): Vault resolver, sm integration decision, schema reconciliation, config alignment. - Tier 2 STRONG (~3-4 weeks): doc-sync, intent chapters, PhaseReview 3-pass, turn_status marker, last_error cap, cost_micro_usd. - Tier 3 NICE (v3.1+): persistent agents, inter-agent messaging, workflow content pinning, runs table, pending_retain. - Tier 4 DEFER: SSH workers, HTTP API auth, trace_index, PhaseUAT — build when a deployment demands it. - Tier 5 DROP: items from late adversarial-review iterations that don't earn their keep (workflow_pins separate table, snap_ columns, agent_capabilities separate index). Includes a recommended ~6-8 week v3.0 schedule and four decision points that should be settled before starting work. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-29 12:33:07 +02:00
Mikael Hugo	57a1bc6505	docs: import sf v3 spec from singularity-crush, annotated for status Imports SPEC.md (v1.0-draft) from singularity-ng/crush#docs/spec — the forward-looking contract for sf v3. Annotated section-by-section and item-by-item with implementation status against current sf: - EXISTS — already implemented in sf, matches the spec - PARTIAL — implemented but diverges from spec; needs alignment work - NEW — not yet implemented Conformance breakdown (123 items total): - 37 EXISTS - 30 PARTIAL - 56 NEW The NEW items concentrate in: persistent-agent inbox model (§17/§18), Singularity Memory integration (§16/§24), SSH worker extension (§22), several supervisor refinements (§9), and policy/operations details (audit fields, trace metadata, version pinning) introduced during the v0.x adversarial review iterations. The PARTIAL items concentrate in: schema reconciliation (sf has 3 tables — milestones/slices/tasks — vs spec's single units table), config schema alignment, runs-table unification with audit_events, and several worker-attempt lifecycle details that exist in different shapes today. This is an informational import. Implementing v3 against this spec is its own work; the next step is deciding which NEW items are actually wanted vs deferred, and whether to migrate the 3-table planning schema to the single-units shape or keep what sf has and update the spec. Spec source: https://github.com/singularity-ng/crush/blob/docs/spec/SPEC.md Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-29 12:15:02 +02:00
Mikael Hugo	6eaf5926ad	sf snapshot: uncommitted changes after 248m inactivity	2026-04-28 21:10:17 +02:00
Mikael Hugo	d30d91bf2f	sf snapshot: uncommitted changes after 41m inactivity	2026-04-28 17:01:26 +02:00
Mikael Hugo	5d3c204006	fix(git-merge): no auto-flip from approved to declined; cached approval is sticky Codex-rescue output (a299c461 / bnr88iy59) — the 'Git merge approved once' followed seconds later by 'Git merge declined by user' bug we hit on M002 complete-milestone. Same gate, same agent run, opposite verdicts. Single source of truth for the merge-gate state in guardrails/index.ts. Approval is now sticky — re-asks return the cached approval until consumed or explicitly revoked, never auto-flip to decline. Timeout converts to pause+log instead of decline. Adds tests/safe-git-merge-gate.test.ts. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Co-Authored-By: OpenAI Codex <noreply@openai.com>	2026-04-28 16:20:08 +02:00
Mikael Hugo	d38e5ea092	fix(schema): auto-coerce string → [string] for sf_* list fields + provider_model_allow tests Two codex-rescue tasks landed together: 1. Auto-coerce JSON-schema validator: when a tool field declares {type:"array", items:{type:"string"}} and the model sends a single string, wrap it in [string] before validation instead of hard-rejecting. Fixes the recurring "keyDecisions: must be array" rejection on sf_complete_task that wasted retries. 2. Provider_model_allow filter (proper implementation with helpers): - resolveProviderModelAllowList / isProviderModelAllowed / filterModelsByProviderModelAllow helpers in preferences-models - Wired into model-registry and auto-model-selection - New tests/provider-model-allow.test.ts Tools coerced: sf_complete_task, sf_complete_milestone, sf_plan_milestone, sf_plan_slice, sf_replan_slice, sf_reassess_roadmap (key list fields). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Co-Authored-By: OpenAI Codex <noreply@openai.com>	2026-04-28 12:30:55 +02:00
Mikael Hugo	f98a1e360e	batch: codex-rescue session output (multiple in-flight tasks) Combined output of multiple parallel codex-rescue runs that produced working-tree edits but didn't commit. Tasks contributing: - prefs: per-provider model allow-list (provider_model_allow) — manual - TUI scroll + unresponsive (a7884d1a / bt3fpn4y2) - planningMeeting required (aa09e904 / br127l763) - Logs UX 4-pack (a5c65314 / btcplhu7f) - Gate auto-resolve + completion nudge (ae4c8b64 / bw1w1fjkp) - sf_task_complete atomic + retry (a7a079b4 / b20cy5owv) - Multi-model meeting + minimax M2.7 + draft promotion (a756faac / task-moifjknd-lwjc98) - Per-role slice prompts (a94c3e1a) - Per-role vision-meeting prompts (afd165a0 / task-moifple5-lcwtjl) - Schema sweep (ac994b1e / task-moifq7pu-83coqz) - Flow audit (ad26ecfd / bttj4vrqm) Typecheck passes. Tests not run as a full suite — spot-check after merge. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Co-Authored-By: OpenAI Codex <noreply@openai.com>	2026-04-28 11:52:42 +02:00
Mikael Hugo	66ff949c11	cherry-pick(security): harden project-controlled surfaces (PR #4755 partial) Cherry-pick of gsd-build/gsd-2 65ca5aa2e — applies the security hardening hunks that conflicted minimally: - mcp-server/env-writer: validate writes against a strict allowlist - web/api/files: enforce path containment via web/lib/secure-path - vscode-extension: read binaryPath/autoStart only from trusted global/default scopes (resolveTrustedSfStartupConfig), avoiding workspace-controlled override (renamed Gsd → Sf for sf naming) - New regression tests: mcp-client-security, vscode-startup-security, web-files-symlink Skipped hunks (drifted): mcp-server/server.ts, mcp-client/index.ts, mcp-server/README.md. Co-Authored-By: Jeremy <jeremy@fluxlabs.net> Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-28 05:37:07 +02:00
Mikael Hugo	bf727173e7	cherry-pick(file-lock): make file-lock actually lock and throw on contention Cherry-pick of gsd-build/gsd-2 a09e01640 — withFileLockSync now actually acquires a proper-lockfile (was previously a no-op when proper-lockfile wasn't required) and throws on ELOCKED contention by default. Adds onLocked: 'skip' option for best-effort callers that tolerate dropped entries (audit, journal). Modernizes import style (createRequire/join from imports rather than ad-hoc require). Path-renames preserved (gsd-pi → sf-run). Co-Authored-By: Jeremy <jeremy@fluxlabs.net> Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-28 05:28:36 +02:00
Mikael Hugo	22d4579690	cherry-pick(state): lock-wrapped appends for journal, audit, workflow-logger Cherry-pick of gsd-build/gsd-2 53babec29 — lock-wrapped append half. Wraps appends to .sf/journal/, .sf/audit/events.jsonl, and the workflow-logger error log in withFileLockSync (onLocked: skip), preserving best-effort semantics while preventing torn writes under contention. Companion to the atomic-write half landed in `3df56cb94`. Path-renames (gsdRoot→sfRoot, gsd-db→sf-db) preserved during conflict resolution. Co-Authored-By: Jeremy <jeremy@fluxlabs.net> Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-28 05:27:44 +02:00
Mikael Hugo	f1f4b840e1	cherry-pick(doctor): self-heal symlinked .sf staging to prevent silent data loss Cherry-pick of gsd-build/gsd-2 9340f1e9b (#4423) — doctor self-heal detection for symlinked staging directories that can cause silent data loss. Skips native-git-bridge.ts and git-service test (drifted). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-28 05:25:56 +02:00
Mikael Hugo	7fd4672e55	cherry-pick(auto): handle worktree context fallback + sanitize paused session paths Cherry-pick of gsd-build/gsd-2 a4f78731f — handles worktree context fallback and sanitizes paths in paused session resumption. Skips uok-plan-v2-wiring test hunk (drifted in sf). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-28 05:25:40 +02:00
Mikael Hugo	93402643f4	cherry-pick(sf-db): tolerate corrupt task arrays in milestone rows Cherry-pick of gsd-build/gsd-2 851507913 (#4056) — defensive parsing so a corrupt or non-array tasks blob in a milestone row doesn't crash sf-db reads. Test hunk skipped (sf-db.test.ts has drifted). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-28 05:25:21 +02:00
Mikael Hugo	3df56cb94f	cherry-pick(state): atomic-writes for guided-flow-queue and reports Cherry-pick of gsd-build/gsd-2 53babec29 (Jeremy <jeremy@fluxlabs.net>) — atomic-write half only. Eliminates torn-write risk on PROJECT.md queue sync and reports.json/HTML index regeneration by switching writeFileSync → atomicWriteSync (tmp+rename). The companion lock-wrapped-append changes (journal.ts, uok/audit.ts, workflow-logger.ts) are deferred — they need proper-lockfile + withFileLockSync helper introduced first. Co-Authored-By: Jeremy <jeremy@fluxlabs.net> Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-28 05:16:39 +02:00
Mikael Hugo	8e827147c9	feat(code-intelligence): add sift indexer backend alongside project-rag Generalize the code-intelligence hook to support multiple indexer backends, with sift (rupurt/sift) as a new option next to the existing project-rag MCP server. Backend is selected via CodebaseMapPreferences. - code-intelligence.ts: new abstraction + sift backend (detect, resolve, status, context-block contribution) - preferences-types.ts: codebaseIndexer field (project-rag | sift | none) - preferences-validation.ts: validate the new field - bootstrap/system-context.ts, commands-codebase.ts: dispatch on backend - tests/code-intelligence.test.ts: sift detection/resolution/status tests (19 pass, 0 fail) project-rag path unchanged and continues to work. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-28 05:05:26 +02:00
Mikael Hugo	0606983d97	feat(subagent): add background job manager and tests SubagentBackgroundJobManager tracks long-running subagent jobs with status, abort support, and TTL-based eviction of completed results. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-28 04:18:17 +02:00
Mikael Hugo	efd5e14e0a	feat: add FEATURES.md capability map and generator Human-oriented documentation of SF capabilities, with a script that keeps it in sync with workflow-tools.ts and extension manifests. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-28 04:18:12 +02:00
Mikael Hugo	25797129e2	sf snapshot: pre-dispatch, uncommitted changes after 38m inactivity	2026-04-28 00:21:39 +02:00
Mikael Hugo	0d286b991b	sf snapshot: pre-dispatch, uncommitted changes after 2902m inactivity	2026-04-27 23:42:51 +02:00
Mikael Hugo	260d50a823	docs: warn against Python for managed-resources hash; causes resync hang Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-25 23:20:15 +02:00
Mikael Hugo	f0da5b6d21	fix: bind getProviderAuthMode to registry instance to avoid undefined 'this' Extracting a class method as a bare reference loses its 'this' context, causing 'Cannot read properties of undefined' when minimax (or any provider) triggers the flat-rate auth-mode lookup. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-25 19:22:39 +02:00
Mikael Hugo	7be540480e	docs: add CLAUDE.md with dev guide for build pipeline and test runner Documents the dist-vs-source distinction that caused the memoriesSection fix to not take effect, the c8 coverage runner process leak, and the template variable maintenance contract. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-25 18:56:03 +02:00
Mikael Hugo	7289933909	fix: populate memoriesSection in execute-task prompt and fix stale dist buildExecuteTaskPrompt was not passing memoriesSection to loadPrompt, causing headless auto to fail with a template variable error. Also updated plan-slice-prompt.test.ts to supply the four template variables (memoriesSection, runtimeContext, phaseAnchorSection, gatesToClose) that were missing from the test fixture. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-25 18:46:55 +02:00
Mikael Hugo	a30a7692e3	fix: dist-redirect.mjs incorrectly rewrites .js→.ts for node_modules paths containing /src/ The resolver guarded on context.parentURL.includes('/src/') to identify in-repo source files, but @google/gemini-cli-core installs to node_modules/@google/gemini-cli-core/dist/src/ which also contains '/src/'. Relative imports from that dist package (e.g. './config/config.js') were incorrectly rewritten to './config/config.ts', causing ERR_MODULE_NOT_FOUND on every test that transitively imports the google-gemini provider. Fix: add !context.parentURL.includes('/node_modules/') guard. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-25 18:04:23 +02:00
Mikael Hugo	2e32c96fa0	Port gsd2 functional parity: turn-epoch, abandon-detect, reapplyThinking, exec chain, memory chain, onboarding-state - auto/turn-epoch.ts: AsyncLocalStorage-backed stale-write dropping for timeout recovery - journal.ts: isStaleWrite() guard drops superseded turn writes - auto/run-unit.ts: wrap agent_end Promise.race in runWithTurnGeneration - auto/session.ts: ThinkingLevelSnapshot type + autoModeStartThinkingLevel/originalThinkingLevel fields - auto-model-selection.ts: reapplyThinkingLevel() called after every successful setModel() - auto/phases.ts: pass autoModeStartThinkingLevel to selectAndApplyModel + hook override restore - abandon-detect.ts: two-signal milestone abandon detection in rewrite-docs overrides - auto-post-unit.ts: use detectAbandonMilestone + parkMilestone in rewrite-docs handler - preferences-types.ts: ContextModeConfig + isContextModeEnabled - exec-sandbox.ts: sandboxed bash/node/python subprocess with .sf/exec/ persistence - exec-history.ts: read-side scan of .sf/exec/*.meta.json - compaction-snapshot.ts: ≤2 KB markdown digest written before context compaction - tools/exec-tool.ts: sf_exec MCP tool executor - tools/exec-search-tool.ts: sf_exec_search MCP tool executor - tools/resume-tool.ts: sf_resume MCP tool executor - bootstrap/exec-tools.ts: registers sf_exec/sf_exec_search/sf_resume - memory-relations.ts: knowledge-graph edges between memories (traverseGraph) - tools/memory-tools.ts: capture_thought/memory_query/sf_graph executors - bootstrap/memory-tools.ts: registers capture_thought/memory_query/sf_graph - bootstrap/register-extension.ts: wire exec-tools + memory-tools into registration - onboarding-state.ts: onboarding completion record at ~/.sf/agent/onboarding.json Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-25 10:58:39 +02:00
Mikael Hugo	5887ea3fd1	port gsd2: blocked-models gate, milestone-summary classifier, unsupported-model recovery blocked-models.ts (new): Persistent per-project blocklist at .sf/runtime/blocked-models.json. loadBlockedModels / isModelBlocked / blockModel (file-lock-safe write). milestone-summary-classifier.ts (new): classifyMilestoneSummaryContent → "success" | "failure" | "unknown". isTerminalMilestoneSummaryContent: failure summaries are NOT terminal — lets auto-mode re-enter a milestone after a failed recovery summary. state.ts: Phase 1 (completeMilestoneIds) and Phase 2 (registry) now check isTerminalMilestoneSummaryContent before treating a SUMMARY as complete. A failure SUMMARY no longer prematurely parks a milestone. error-classifier.ts: Add "unsupported-model" ErrorClass kind with regex detection (model + not-supported/unavailable/no-access + account/plan/tier). Checked before "permanent" so /account/i in PERMANENT_RE doesn't swallow it. auto-model-selection.ts: Wire isModelBlocked() gate in selectAndApplyModel candidate loop: skips provider-rejected models and continues to fallbacks. bootstrap/agent-end-recovery.ts: Handle cls.kind === "unsupported-model": blockModel(), try fallback chain skipping already-blocked models, pause if no usable fallback. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-25 10:13:27 +02:00
Mikael Hugo	6cb6de4fd2	perf: parallelize I/O, add runtime cache, extend nix devenv - unit-context-composer: resolve artifact keys in parallel (Promise.all) - unit-runtime: add in-memory cache to avoid repeated disk reads per dispatch - auto-timers: share 15s idle watchdog tick with context-pressure check - auto-prompts: 1s TTL budget cache to coalesce repeated loadEffectiveSFPreferences calls - native-git-bridge: extend nativeHasChanges TTL 10s→30s - auto-dashboard: remove pulsing dot animation (CPU churn, no UX value) - flake.nix: add nodePackages.typescript to dev shell Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-25 10:12:32 +02:00
Mikael Hugo	12aabd863e	port gsd2 #4769 : worktree telemetry, slice-cadence, canonical-root fix + /sf scan Ports commit 7fb35ca58 from gsd2 (PR #4769) covering four issues: #4761 — resolveCanonicalMilestoneRoot in worktree-manager.ts routes validate-milestone through the live worktree path instead of stale project-root state when a milestone worktree is active. #4762 — auditOrphanedMilestoneBranches in auto-start.ts now surfaces in-progress milestone branches with unmerged commits ahead of main (previously only complete milestones were audited). Gated on isClosedStatus so parked/other closed statuses are unaffected. #4764 — worktree-telemetry.ts: typed emit helpers (emitWorktreeCreated, emitWorktreeMerged, emitWorktreeOrphaned, emitAutoExit, emitWorktreeSync, emitCanonicalRootRedirect, emitSliceMerged, emitMilestoneResquash) plus summarizeWorktreeTelemetry aggregator and nearest-rank percentile(). Wired in: worktree-resolver.ts (create/merge events), auto-start.ts (orphan telemetry), auto.ts stopAuto (auto-exit with normalized reason), worktree-manager.ts (canonical-root-redirect). Surfaced in forensics.ts via detectWorktreeOrphans and Worktree Telemetry sections. #4765 — slice-cadence.ts: mergeSliceToMain squash-merges each slice's commits onto main as soon as the slice passes validation (opt-in via git.collapse_cadence: "slice"). resquashMilestoneOnMain collapses N per-slice commits into one milestone commit at completion. Wired in auto-post-unit.ts (slice merge after complete-slice with stopAuto on conflict/error) and worktree-resolver.ts (resquash at mergeAndExit). AutoSession.milestoneStartShas tracks the pre-first-slice SHA. GitPreferences and preferences-validation.ts extended with collapse_cadence and milestone_resquash fields. Also ports /sf scan command: commands-scan.ts with parseScanArgs, resolveScanDocuments, buildScanOutputPaths, and handleScan dispatching a focused codebase assessment prompt to .sf/codebase/. journal.ts: 9 new JournalEventType values for the telemetry events. All changes are additive; default behavior (cadence="milestone") unchanged. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-25 09:03:56 +02:00
Mikael Hugo	2911d3b93d	port gsd2: reassess-roadmap opt-in (ADR-003 §4) + prefer toolDefinition.label reassess-roadmap: flip default from true → false. Most reassess units conclude "roadmap is fine", burning a session for no change; the plan-slice prompt now carries a JIT preamble at zero cost. (#4778) tool-execution: always prefer toolDefinition.label when non-empty, even when label === name, which allows tools to display their canonical name explicitly. (#4758) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-25 08:33:50 +02:00
Mikael Hugo	d4cdcb582d	port gsd2 #3338: ecosystem plugin loader for .sf/extensions/ Adds support for project-local SF extension plugins dropped in .sf/extensions/. Trust-gated (requires pi trust), symlink-escape safe. - ecosystem/sf-extension-api.ts: SFExtensionAPI wrapper exposing getPhase() and getActiveUnit() to third-party handlers; updateSnapshot refreshes state before_agent_start so handlers see current phase/unit - ecosystem/loader.ts: discovers .sf/extensions/*.js, loads them via dynamic import, dispatches factory(api) for each - register-extension.ts: initializes ecosystemHandlers array, wires loader - register-hooks.ts: before_agent_start refreshes snapshot then dispatches ecosystem handlers before returning SF system prompt - types.ts: SFActiveUnit interface (milestoneId/sliceId/taskId + titles) - workflow-logger.ts: "ecosystem" added to LogComponent union Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-25 08:27:55 +02:00
Mikael Hugo	6c36d62f35	port gsd2 #4961: stop using active-tool snapshot as model-policy gate Fixes a bug where per-unit tool narrowing poisoned the policy gate for subsequent units, causing "Model policy denied dispatch before prompt send" errors on complete-slice and discuss-milestone (100% Win repro). Four-part port from gsd2@817031b2a: - ModelPolicyDispatchBlockedError class with per-model deny reasons - TOOL_BASELINE WeakMap + clearToolBaseline/restoreToolBaseline lifecycle - auto-model-selection: use getRequiredWorkflowToolsForAutoUnit as requiredTools - auto/loop: catch ModelPolicyDispatchBlockedError as non-retryable (pause) - auto.ts: wire clearToolBaseline at startAuto (fresh only) and stopAuto Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-25 08:15:04 +02:00
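The TOOL_BASELINE lifecycle named in the commit above is not shown in the log. One way such a baseline could work is sketched below. ToolHost, captureToolBaseline, and the choice of key are assumptions made for illustration; only the names TOOL_BASELINE, clearToolBaseline, and restoreToolBaseline come from the commit message.

```typescript
// Hypothetical stand-in for whatever object owns the active tool list.
interface ToolHost {
  activeTools: string[];
}

// Baselines are keyed by host object, so an entry disappears when its host is garbage-collected.
const TOOL_BASELINE = new WeakMap<ToolHost, string[]>();

// Assumed helper: remember the full tool set once, before any per-unit narrowing.
function captureToolBaseline(host: ToolHost): void {
  if (!TOOL_BASELINE.has(host)) {
    TOOL_BASELINE.set(host, [...host.activeTools]);
  }
}

// Undo per-unit narrowing so the next unit starts from the full set.
function restoreToolBaseline(host: ToolHost): void {
  const baseline = TOOL_BASELINE.get(host);
  if (baseline) host.activeTools = [...baseline];
}

// Forget the baseline, for example when a fresh run starts or the run stops.
function clearToolBaseline(host: ToolHost): void {
  TOOL_BASELINE.delete(host);
}
```

In this sketch a narrowed tool list from one unit never becomes the reference point for the next, which is the failure mode the commit describes.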
Mikael Hugo	4fdd8700a3	port gsd2 upstream features: scope classifier, composer v2, GPT-5.5, test timeout - milestone-scope-classifier: add getMilestonePipelineVariant + milestoneRowToScopeInput wired into auto-dispatch trivial-skip for research/validation phases (#4781) - auto-prompts: rename GSD→SF identifiers, add isSummaryCleanForSkip, prefs param on checkNeedsReassessment, buildExtractionStepsBlock from commands-extract-learnings - unit-context-manifest + unit-context-composer: port v2 typed computed artifacts (#4924) - skill-manifest: per-unit-type skill filter resolver (#4788, #4792) - escalation: stub for ADR-011 mid-execution escalation (full port deferred) - auto-start: extract decideSurvivorAction for testability (#4832) - models: add gpt-5.5 + gpt-5.4-mini to cost table, router, and models.generated.ts - types: EscalationArtifact, context_window_override, skip_clean_reassess, mid_execution_escalation, sketch_scope on SliceRow - tool-execution: add visibleWidth import (was undefined) - package.json: add --test-timeout=30000 to prevent parallel tests from freezing machine Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-25 08:08:11 +02:00
Mikael Hugo	e2147c0694	sf snapshot: pre-dispatch, uncommitted changes after 43m inactivity	2026-04-25 06:34:49 +02:00
Mikael Hugo	7b6c9dd099	sf snapshot: pre-dispatch, uncommitted changes after 4703m inactivity	2026-04-25 05:51:29 +02:00
ace-pm	e625d20a59	fix: add self to flake outputs	2026-04-21 23:27:40 +02:00
ace-pm	c744bdf6c1	fix: atomic writes, parse radix, lossy json, silent worker spawn 8 fixes from 3rd-pass scan: 1. web/components/sf/tempCodeRunnerFile.tsx: remove orphan VS Code 'Code Runner' artifact (850+ lines duplicated from shell-terminal.tsx). Unreferenced but compiled into tsc project. 2. sf/phase-anchor.ts: writePhaseAnchor used plain writeFileSync, so a crash mid-write would corrupt the handoff checkpoint that readPhaseAnchor then silently returns null for, losing cross-phase context. Switched to atomicWriteSync (already used by sibling files). 3. sf/forensics.ts: same non-atomic writeFileSync on active-forensics.json marker. Race with a concurrent reader produces an empty object and the forensics session is lost. Switched to atomicWriteSync. 4. web/auto-dashboard-service.ts: paused-session.json existence was the intended signal but a corrupt body silently dropped the paused flag so the UI showed active. Now reports paused on file existence regardless of body integrity, and warns on corruption. 5. sf/visualizer-data.ts: doctor-history.jsonl parser did .map(JSON.parse) inside an outer catch. One corrupt line discarded 19 valid entries. Per-line try/catch preserves the valid rows. 6. sf/files.ts: three parseInt calls without radix (step, total_steps, totalSteps), also missing the || 0 fallback for NaN. 7. cli.ts: parseInt(process.versions.node) without radix. Split on '.' and use radix 10 explicitly. 8. sf/slice-parallel-orchestrator.ts: silent 'catch {}' around spawn() masked worker-spawn failures as 'no workers available'. Now logs via logWarning, matching the sibling parallel-orchestrator.ts pattern.
Skipped from the scan (need a real lock mechanism, not safe as a one-line fix): - sf/auto-dispatch.ts:164 (UAT counter race) - sf/captures.ts:107 (CAPTURES.md append race)
Deferred (low-value): - preferences-models.ts, key-manager.ts, auto-timers.ts silent catches - dead variable in visualizer-data.ts - google-gemini-cli.ts maxTokens clamp interaction
tsc --noEmit green at root.	2026-04-21 02:13:10 +02:00
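Items 5 and 6 of the commit above describe per-line JSONL parsing and radix-safe integer parsing. A minimal sketch of both patterns follows. The helper names parseJsonLines and toInt and the returned shape are assumptions for illustration, not the code in sf/visualizer-data.ts or sf/files.ts.

```typescript
// Hypothetical helper: parse JSONL text line by line so one corrupt line
// is skipped instead of discarding every valid row.
function parseJsonLines<T = unknown>(text: string): { rows: T[]; skipped: number } {
  const rows: T[] = [];
  let skipped = 0;
  for (const line of text.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue; // blank lines are not errors
    try {
      rows.push(JSON.parse(trimmed) as T);
    } catch {
      skipped += 1; // count the bad line and keep going
    }
  }
  return { rows, skipped };
}

// Hypothetical helper: explicit radix 10, with NaN mapped to 0
// (the same effect as the "|| 0" fallback named in the commit).
function toInt(raw: string): number {
  const n = Number.parseInt(raw, 10);
  return Number.isNaN(n) ? 0 : n;
}
```

Returning the skipped count alongside the rows lets a caller warn about corruption without losing the valid entries.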
ace-pm	51b65fd490	fix: symlink extensions + silent catches masking real errors Real bugs from 2nd-pass scan: 1. extension-registry.ts: discoverAllManifests skipped symlinked extension dirs because Dirent.isDirectory() returns false for symlinks. Dev-workflow symlinks under ~/.sf/agent/extensions/ were invisible to list/enable/disable/info. Matches the regression documented in symlink-extension-discovery.test.ts: the test inlines the correct logic, but this callsite still had the buggy form. Now accepts isDirectory() || isSymbolicLink(). 2. headless.ts SIGINT handler: client.stop() failures were double-silenced (inner .catch(()=>{}), outer try{}catch{}). Interactive mode logs stop errors to stderr. Restored head/headless parity: still fire-and-forget (exit code is forced via process.exit) but failures are observable. 3. openai-codex-responses.ts SSE parser: malformed data frames were silently dropped so broken streams looked identical to clean ones. Now debug-logs the parse error with the chunk context so broken streams are distinguishable in logs. Stream continues on bad chunk (one bad frame shouldn't kill the whole generation). 4. web/cleanup-service.ts generated script: bare 'catch {}' around four native git calls (nativeBranchList, nativeDetectMainBranch, nativeBranchListMerged, nativeForEachRef). A failed main-branch detection silently left mainBranch undefined-shaped, then the next native call operated on garbage. Now emits console.warn so failures surface in the subprocess log. 5. web/undo-service.ts generated script: git revert failure was silenced; when --no-commit failed, user saw commitsReverted=0 with no reason. Now logs the revert error before attempting --abort (abort itself remains best-effort silent).
False positives from the same scan (investigated and dismissed): - auto-worktree.ts #2505: code uses ':(exclude).sf/milestones' pathspec + shelter-and-restore, which is a better fix than the 'drop --include-untracked' approach the test comment describes. Test comment is stale; source is correct. - Lifecycle handler unhandled rejections across 5 extensions: extensions/runner.ts already try/catches handler invocations and routes to emitError. Wrapping the individual handlers would be redundant.	2026-04-21 02:01:41 +02:00
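Item 1 of the commit above hinges on directory entries that are symlinks reporting isDirectory() as false. The check it describes can be sketched as follows. DirentLike is a structural stand-in for the two fs.Dirent methods involved, and the function names are hypothetical, not those in extension-registry.ts.

```typescript
// Minimal structural stand-in for the fs.Dirent methods used by the check.
interface DirentLike {
  name: string;
  isDirectory(): boolean;
  isSymbolicLink(): boolean;
}

// Hypothetical predicate mirroring "isDirectory() || isSymbolicLink()" from the commit message.
function isExtensionDirCandidate(entry: DirentLike): boolean {
  return entry.isDirectory() || entry.isSymbolicLink();
}

// Hypothetical helper: names of entries worth probing for a manifest.
function candidateExtensionDirs(entries: DirentLike[]): string[] {
  return entries.filter(isExtensionDirCandidate).map((entry) => entry.name);
}
```

A symlink can also point at a regular file or at nothing, so a caller would presumably still need to handle a missing manifest after this filter.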
ace-pm	0f94341b43	fix(loader): fall back to src/resources when SF-WORKFLOW.md missing from dist Build sometimes copies dist/resources/extensions/ without the top-level markdown files (observed: SF-WORKFLOW.md absent in dist/resources/ while extensions/ was present). existsSync(distRes) was true either way, so SF_WORKFLOW_PATH pointed at a non-existent path and /sf failed with ENOENT. Check for the specific file instead of the directory.	2026-04-21 01:39:18 +02:00
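The loader fix above replaces a directory existence check with a check for the specific file. A minimal sketch of that idea, assuming a hypothetical resolveWorkflowPath function and an injected exists callback that stands in for existsSync from node:fs:

```typescript
// Hypothetical sketch: the function name and path layout are assumptions.
// `exists` is injected so the logic can be exercised without touching the filesystem.
function resolveWorkflowPath(
  distRes: string,
  srcRes: string,
  exists: (path: string) => boolean,
): string {
  const distFile = `${distRes}/SF-WORKFLOW.md`;
  // Probe the file itself: the dist directory can exist without the markdown file in it.
  return exists(distFile) ? distFile : `${srcRes}/SF-WORKFLOW.md`;
}
```

In real use the third argument could simply be existsSync; the injection here only keeps the sketch self-contained.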

3629 commits