singularity/singularity-forge

Author	SHA1	Message	Date
Mikael Hugo	44fcfb643c	fix(sift): use bm25 only for repo-root — phrase retriever hangs on full scope Root cause: the sift binary's phrase retriever hangs indefinitely when queried against the full repo-root scope (57K+ files). Earlier tests mistook this for a general slowness, but isolated testing confirms: - bm25 alone on repo root: works (1m 30s cold, instant warm) - phrase alone on repo root: hangs forever - bm25+phrase on repo root: hangs forever (phrase path blocks) - all retrievers on scoped subdirs: work correctly The earlier Rust panic was from a corrupted cache state left by killing a mid-build vector process. After clearing the cache, bm25 alone works. Fix: chooseSiftRetrievers now returns retrievers: "bm25" (not "bm25,phrase") for repo-root scope. Scoped subdirs still get bm25+phrase+vector with position-aware reranking. Tests: updated 3 assertions in sift-retriever-scope.test.mjs. Full suite: 183 files / 1958 tests pass. Type check: clean. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-15 14:28:23 +02:00
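The scope rule in the commit above can be sketched roughly as follows. This is a hypothetical illustration, not the repository's code: only the name chooseSiftRetrievers, the `retrievers` / `reranking` field names, and the values "bm25" and "position-aware" come from the log. The comma-joined "bm25,phrase,vector" string, the "none" reranking for repo root, the helper stripTrailingSeparators, and the requirement that both paths arrive already absolute are assumptions.

```javascript
// Hypothetical sketch; the real chooseSiftRetrievers may resolve paths differently.
// Assumes scopePath and projectRoot are already absolute, normalized paths.
function stripTrailingSeparators(p) {
  // Drop trailing "/" or "\" but keep a bare root such as "/" intact.
  const trimmed = p.replace(/[\\\/]+$/, "");
  return trimmed === "" ? p : trimmed;
}

function chooseSiftRetrievers(scopePath, projectRoot) {
  const isRepoRoot =
    stripTrailingSeparators(scopePath) === stripTrailingSeparators(projectRoot);
  if (isRepoRoot) {
    // The phrase retriever hangs on the full repo scope, so only bm25 is requested.
    return { retrievers: "bm25", reranking: "none" };
  }
  // Scoped subdirectories keep the full retriever set with structural reranking.
  return { retrievers: "bm25,phrase,vector", reranking: "position-aware" };
}
```

For example, `chooseSiftRetrievers("/repo/src", "/repo")` selects the vector-enabled set, while `chooseSiftRetrievers("/repo/", "/repo")` falls back to bm25 alone.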
Mikael Hugo	1b5348e28e	feat(providers): live discovery for opencode, opencode-go, minimax Three providers were missing from PROVIDER_CATALOG_CONFIG so their model lists couldn't be auto-discovered. Their wire ids only existed in packages/ai/src/models.generated.ts as hand-coded entries, meaning new model variants from these providers required manual catalog edits. Verified live endpoints respond to /v1/models with bearer auth: - opencode → https://opencode.ai/zen/v1/models (6 free models) - opencode-go → https://opencode.ai/zen/go/v1/models (15 models) - minimax → https://api.minimax.io/v1/models (works) Added entries: opencode: baseUrl https://opencode.ai/zen, modelsPath /v1/models opencode-go: baseUrl https://opencode.ai/zen/go, modelsPath /v1/models minimax: baseUrl https://api.minimax.io, modelsPath /v1/models (international endpoint; Chinese-network api.minimaxi.com still handled separately in the SDK) Auth keys already wired: OPENCODE_API_KEY, OPENCODE_GO_API_KEY (with OPENCODE_API_KEY fallback), MINIMAX_API_KEY. No env-api-keys.ts changes. Combined with `385e0b448` (dynamic canonicalIdFor resolver), new model variants from these three providers will be auto-grouped in .sf/model-performance.json without hand-editing CANONICAL_BY_ROUTE.
Live counts after fresh discovery will reveal experimental models absent from static catalog (e.g. opencode's "big-pickle", opencode-go's deepseek-v4-pro, mimo-v2.5-pro, hy3-preview). The model-router tolerates unconventional wire IDs — no naming constraints. To populate cache: rm -rf ~/.sf/runtime/model-catalog/ + relaunch sf. Tests: 13 new in provider-catalog-discovery.test.mjs (catalog shape, modelsPath presence, DISCOVERABLE_PROVIDER_IDS inclusion). Full suite 183 files / 1940 tests pass, zero regressions. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-15 14:19:08 +02:00
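To make the three added catalog entries concrete, here is a minimal sketch of how a discovery URL could be derived from them. The baseUrl and modelsPath values are copied from the commit message; the object name PROVIDER_CATALOG_SKETCH, its shape, and the helper modelsUrlFor are assumptions for illustration and may not match PROVIDER_CATALOG_CONFIG in the repository.

```javascript
// Values come from the commit message; the object shape and helper are hypothetical.
const PROVIDER_CATALOG_SKETCH = {
  opencode: { baseUrl: "https://opencode.ai/zen", modelsPath: "/v1/models" },
  "opencode-go": { baseUrl: "https://opencode.ai/zen/go", modelsPath: "/v1/models" },
  minimax: { baseUrl: "https://api.minimax.io", modelsPath: "/v1/models" },
};

function modelsUrlFor(providerId) {
  // Own-property check so inherited keys like "toString" are not treated as providers.
  if (!Object.prototype.hasOwnProperty.call(PROVIDER_CATALOG_SKETCH, providerId)) {
    return null;
  }
  const entry = PROVIDER_CATALOG_SKETCH[providerId];
  // Avoid a double slash if a baseUrl ever ends with "/".
  return entry.baseUrl.replace(/\/+$/, "") + entry.modelsPath;
}
```

The resulting URLs match the endpoints listed as verified in the commit, for instance `modelsUrlFor("opencode-go")` yields `https://opencode.ai/zen/go/v1/models`.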
Mikael Hugo	db3525b933	chore(model-registry): prune 15 redundant identity-strip aliases After `385e0b448` added the dynamic discovery-cache resolver to canonicalIdFor, the 15 identity-strip aliases added in `089bf0cbe` for discovered providers became pure redundancy — the dynamic path returns the same bare modelId from the discovery cache.
Removed (all canonical == bare modelId, all providers in discovery cache): - minimax/MiniMax-M2.7, minimax/MiniMax-M2.7-highspeed - mistral/codestral-latest, mistral/devstral-2512, mistral/devstral-small-2507, mistral/mistral-large-latest, mistral/mistral-medium-latest, mistral/mistral-small-latest - zai/glm-4.5, zai/glm-4.5-air, zai/glm-4.6, zai/glm-4.7, zai/glm-5, zai/glm-5-turbo, zai/glm-5.1 Kept (real aliases — canonical differs from wire id, NOT identity strips): - kimi-coding/kimi-for-coding → kimi-k2.6 (Moonshot alias) - mistral/devstral-medium-2507 → devstral-medium-latest (alias to latest) - minimax/MiniMax-M2 family lowercase mappings (case-change aliases) Also kept: - zai/glm-4.5-flash, zai/glm-4.7-flash (not yet in discovery cache; flash variants may launch before cache refresh — fast-path safety) - kimi-coding/kimi-k2.6 + kimi-k2-thinking (kimi-coding cache only has kimi-for-coding; these resolve via _ENTRY_BY_ROUTE fallback) Tests: 15 new regression tests in canonical-id-dynamic.test.mjs verify each removed entry STILL resolves correctly via dynamic discovery. Total 21/21 in that file, plus 101 model-registry tests, plus 16 canonical-id-mapping tests — all pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-15 14:17:06 +02:00
Mikael Hugo	385e0b4480	feat(model-learner): canonicalIdFor consults discovery cache as fallback After commit `089bf0cbe` added 23 hand-written aliases for production route keys, the right structural fix is to also consult the dynamic model-discovery cache (~/.sf/agent/discovery-cache.json). Otherwise every new model variant from a discovered provider (ollama-cloud +39 models, openrouter +24, etc.) requires another round of hand-editing. canonicalIdFor now resolves in this order: 1. CANONICAL_BY_ROUTE (static fast path, retains real aliases like kimi-coding/kimi-for-coding → kimi-k2.6 where canonical differs) 2. _ENTRY_BY_ROUTE (existing static path) 3. canonicalIdFromDiscovery — reads ~/.sf/agent/discovery-cache.json, finds (provider, modelId) pair, returns bare modelId In-memory cache with 60s TTL (DISCOVERY_CACHE_TTL_MS) so the readFileSync on the hot path becomes one disk read per minute at most. canonicalIdFor is per-dispatch, not per-token, so the overhead is negligible. Test hook __setDiscoveryCacheForTest lets vitest inject a cache without touching the fs. Tests: 6 new in canonical-id-dynamic.test.mjs (dynamic hit, static-alias wins over dynamic, cache miss → null, null cache graceful, missing-models graceful, multiple models per provider).
Combined with existing canonical-id-mapping: 22/22 pass. Full suite 1912 pass, no regressions. Sanity verified: canonicalIdFor("ollama-cloud/glm-5.1") → "glm-5.1" (dynamic-only, not in static table); canonicalIdFor("unknown/never") → null. Follow-up (in flight, separate agent): prune the static identity-strip aliases from CANONICAL_BY_ROUTE for providers in the discovery cache since they're now redundant with the dynamic resolver. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-15 14:14:04 +02:00
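The three-step resolution order and the 60s TTL described above might look something like the sketch below. It is an assumption-laden miniature: the factory createCanonicalIdFor, the injected `readDiscoveryCache` and `now` parameters, and the cache shape `{ provider: { models: [{ id }] } }` are hypothetical; only the lookup order, the constant name DISCOVERY_CACHE_TTL_MS, and the "return the bare modelId or null" behaviour come from the commit message.

```javascript
const DISCOVERY_CACHE_TTL_MS = 60_000;

// Hypothetical factory; the file read and the clock are injected so the
// sketch stays free of fs access.
function createCanonicalIdFor({ canonicalByRoute, entryByRoute, readDiscoveryCache, now = Date.now }) {
  let cached = null;
  let loadedAt = -Infinity;

  function discoveryCache() {
    // Re-read at most once per TTL window; a failed read degrades to "no cache".
    if (now() - loadedAt >= DISCOVERY_CACHE_TTL_MS) {
      try {
        cached = readDiscoveryCache();
      } catch {
        cached = null;
      }
      loadedAt = now();
    }
    return cached;
  }

  return function canonicalIdFor(route) {
    if (canonicalByRoute.has(route)) return canonicalByRoute.get(route); // 1. static aliases
    if (entryByRoute.has(route)) return entryByRoute.get(route); // 2. static entries
    const slash = route.indexOf("/");
    if (slash < 0) return null;
    const provider = route.slice(0, slash);
    const modelId = route.slice(slash + 1);
    const models = discoveryCache()?.[provider]?.models; // 3. discovery cache
    if (!Array.isArray(models)) return null;
    return models.some((m) => m.id === modelId) ? modelId : null;
  };
}
```

Because the static table is consulted first, a real alias such as kimi-coding/kimi-for-coding → kimi-k2.6 still wins over the dynamic path, which can only ever return the bare modelId.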
Mikael Hugo	2a58f4ebec	feat(model-routing): autonomous fallback strict to enabledModels allowlist Autonomous mode's model-fallback chain bypassed enabledModels — when zai 429'd, the chain happily fell through to mistral/codestral-latest even though only minimax/, kimi-coding/, zai/, ollama-cloud/ were allowed. Of 52 dispatches in this repo's journal this session, 10 (~19%) escaped the allowlist (mistral×2, opencode-go×3, google-gemini-cli×5). enabledModels was honored by interactive cycling (settings-manager.ts) and by self-feedback-drain.js for triage routing, but auto-model-selection.js's fallback chain in selectAndApplyModel never read it. Now: isModelInEnabledList(provider, modelId, enabledModels) filters each fallback candidate. Supports exact "provider/model" or "provider/*" wildcard. Empty/undefined list = open behavior (no regression for setups without an allowlist). readEnabledModels reads ~/.sf/agent/settings.json once per chain; swallows IO errors → undefined → no constraint (safe failure mode). Escape hatch: SF_BYPASS_ENABLED_MODELS=1 disables the check for emergency / misconfigured cases. When ALL candidates are filtered out and the chain exhausts, throws a clear error directing the operator to add to allowlist or unset.
Tests: 13 in enabled-models-fallback.test.mjs covering pattern matrix, multi-candidate chain skipping, bypass env, and exhaustion path. Full suite 1906 pass, no regressions. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-15 14:02:58 +02:00
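The allowlist matching described in this commit is simple enough to sketch. The signature isModelInEnabledList(provider, modelId, enabledModels), the exact and "provider/*" patterns, the open behaviour for an empty list, and the SF_BYPASS_ENABLED_MODELS=1 escape hatch are taken from the message; the helper filterFallbackChain, the candidate shape `{ provider, modelId }`, and the error text are assumptions.

```javascript
function isModelInEnabledList(provider, modelId, enabledModels) {
  // An empty or missing allowlist means "no constraint".
  if (!Array.isArray(enabledModels) || enabledModels.length === 0) return true;
  const route = `${provider}/${modelId}`;
  return enabledModels.some(
    (pattern) => pattern === route || pattern === `${provider}/*`,
  );
}

// Hypothetical wrapper showing how a fallback chain could apply the filter.
function filterFallbackChain(candidates, enabledModels, env = {}) {
  if (env.SF_BYPASS_ENABLED_MODELS === "1") return candidates;
  const allowed = candidates.filter((c) =>
    isModelInEnabledList(c.provider, c.modelId, enabledModels),
  );
  if (allowed.length === 0 && candidates.length > 0) {
    throw new Error(
      "All fallback candidates are outside enabledModels; add one to the allowlist or unset it.",
    );
  }
  return allowed;
}
```

With `["zai/*", "minimax/MiniMax-M2.7"]` as the allowlist, a mistral/codestral-latest candidate is skipped instead of being dispatched.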
Mikael Hugo	089bf0cbeb	fix(model-learner): resolve canonical-id lazy-load race + 23 wire-id aliases Of 52 dispatches in this repo's journal this session, 51 landed in .sf/model-performance.json's _unmapped bucket — meaning the live-outcome learner couldn't tell which provider/model succeeded or failed. Only 1 dispatch (google-gemini-cli/gemini-3-flash-preview) bucketed correctly. Root cause was NOT just missing aliases — it was a lazy-load race: - model-learner.js declared canonicalIdFor as a fire-and-forget dynamic import side-effect at module bottom - metrics.js called recordOutcome() synchronously after `await import("./model-learner.js")` resolved — before the registry injection promise settled - Result: _canonicalIdForFn was null for the first dispatch every session. Every session. Since the file shipped. Why nobody noticed: _unmapped is a bucket, not an error. No throw, no warning, no UI surface. Selection still worked because benchmark-selector + static hand-tuned scores carry the routing decision. Only the feedback loop (recordOutcome → adjust scores) was silently severed. Fix: - model-learner.js: export `registryReady` promise instead of swallowing it - metrics.js: await registryReady before recordOutcome() - model-registry.ts: 23 new CANONICAL_BY_ROUTE entries covering the actual production fallback chain — zai/glm-4.5{-air,-flash,5,5.1,5-turbo,4.6,4.7,4.7-flash}, mistral/codestral-latest + devstral-2512 + devstral-{small,medium}-* + mistral-{large,medium,small}-latest, google-gemini-cli/gemini-{2.5-pro,3-flash-preview,3.1-pro-preview}, opencode-go/{glm-5,glm-5.1,mimo-v2-omni,mimo-v2-pro} Also adds opt-in backfillModelPerformanceFromJournal(basePath) to reclassify the existing 51 _unmapped records from past journal events. Never auto-runs; backs up the old file before overwriting. Tests: 16 in canonical-id-mapping.test.mjs covering pattern matching, non-mappable cases, bare canonical-id passthrough, and the backfill path. 
Full suite 1906 pass, no regressions. Known follow-up: CANONICAL_BY_ROUTE uses mixed casing (MiniMax-M2.7 vs minimax-m2) — should be standardized lowercase in a future pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-15 14:02:58 +02:00
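The lazy-load race described above can be reproduced in miniature. In this hypothetical sketch only the names registryReady, recordOutcome, and the _unmapped bucket come from the commit message; createLearner, the injected loadRegistry, and the bucket counters are illustrative assumptions rather than the module's real structure.

```javascript
// Hypothetical miniature of a learner whose resolver is injected asynchronously.
function createLearner(loadRegistry) {
  let canonicalIdForFn = null;
  const buckets = {};

  // Exposed so callers can wait for the resolver instead of racing it.
  const registryReady = Promise.resolve()
    .then(loadRegistry)
    .then((registry) => {
      canonicalIdForFn = registry.canonicalIdFor;
    });

  function recordOutcome(route, ok) {
    // Until the registry settles there is no resolver, so everything is unmapped.
    const id = canonicalIdForFn ? canonicalIdForFn(route) : null;
    const key = id === null || id === undefined ? "_unmapped" : id;
    const bucket = buckets[key] || (buckets[key] = { ok: 0, fail: 0 });
    if (ok) bucket.ok += 1;
    else bucket.fail += 1;
    return key;
  }

  return { registryReady, recordOutcome, buckets };
}
```

Calling `recordOutcome` immediately after construction lands in `_unmapped`, which is the silent failure the commit describes; calling it after `await learner.registryReady` uses the injected resolver.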
Mikael Hugo	5f92320c7d	fix(auto): timeout silent swarm turns despite heartbeats	2026-05-15 13:55:04 +02:00
Mikael Hugo	85f6650852	fix(auto): keep solver checkpoint pass out of swarm	2026-05-15 13:35:20 +02:00
Mikael Hugo	bd3fbda9cb	feat(journal): swarm-dispatch event per dispatch — cross-repo telemetry The swarm dispatch path is default in headless (`ea8a3d935`) but the journal didn't tag events with which dispatch path was used. Result: grep "swarm" .sf/journal/.jsonl returned zero hits across this repo, ~/code/dr-repo, ~/code/centralcloud/dr — even where swarm IS running. Cross-repo telemetry was blind to swarm adoption. Now both swarm dispatch sites emit a journal event per call: runUnitViaSwarm (auto/run-unit.js): - success: outcome from worker checkpoint or "continue", via "autonomous-unit" - no-reply: outcome "no-reply" with error field - throw: outcome "error" with error field runSingleAgentViaSwarm (subagent/index.js): - success: outcome "agent-reply", via "subagent-extension", agentName - no-reply / catch: same outcome scheme as run-unit Event shape: { ts, eventType: "swarm-dispatch", data: { unitType, unitId, targetAgent, workMode, toolCallCount, outcome, via, agentName?, error? } } All six emitJournalEvent calls wrapped in try/catch — journal write failure must not break dispatch (mirrors crash-recovery.js pattern). Tests: 68 new assertions across the two files (5 + 4 test groups covering happy path, no-reply, throw). Full suite 1872 pass, no regressions.
Once landed everywhere this enables: - grep swarm-dispatch .sf/journal/.jsonl shows adoption - ~/.sf/agent/upstream-feedback.jsonl rolls up swarm vs legacy ratio - "is this repo using swarms?" becomes a one-line query Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-15 13:22:28 +02:00
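As a rough illustration of the event shape and the "never break dispatch" wrapper, consider the sketch below. The `{ ts, eventType: "swarm-dispatch", data }` shape and the emitJournalEvent name come from the commit message; buildSwarmDispatchEvent, emitSafely, the ISO-string timestamp, and passing the emitter as a parameter are assumptions.

```javascript
// Hypothetical helpers; the ts format (ISO string) is an assumption.
function buildSwarmDispatchEvent(data, now = new Date()) {
  return { ts: now.toISOString(), eventType: "swarm-dispatch", data };
}

function emitSafely(emitJournalEvent, event) {
  try {
    emitJournalEvent(event);
    return true;
  } catch {
    // A journal write failure must not break dispatch, so the error is swallowed.
    return false;
  }
}
```

A dispatch site would build the event with fields such as `unitType`, `outcome`, and `via`, then hand it to `emitSafely` and carry on regardless of the return value.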
Mikael Hugo	c42c13b882	feat(auto): trigger sift index warmup at start of every autonomous loop Previously, sift warmup only ran during sf init/auto-start, which meant repos launched via sf headless or entered mid-session never got their index built. The first sift_search/codebase_search call would then block for minutes while the cold cache was built. Now autoLoop() calls ensureSiftIndexWarmup() at loop entry. The warmup runs detached (background process) and is skipped if already running or if a recent marker exists. This ensures every repo SF operates on gets indexed regardless of entry path. - Best-effort: wrapped in try/catch so warmup failures never block the loop - Lazy import to avoid circular dependencies - Debug-logged for observability Tests: 179 files / 1863 tests pass. Type check: clean. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-15 13:17:44 +02:00
Mikael Hugo	8b4123cccc	fix(self-feedback): JSONL header is JSON-valid meta marker, not # comment Phase 2 (`216b1d43f`) wrote "# generated from .sf/sf.db ..." as line 1 of .sf/self-feedback.jsonl. readJsonl tolerated it via try/catch around JSON.parse, but the doctor's stricter JSONL syntax check flagged it as "invalid jsonl syntax: line 1: Unexpected token '#'". Replace the # comment with a JSON-valid meta marker: {"_meta":"generated from .sf/sf.db","_warning":"do not edit directly; use the resolve_issue tool or sf headless triage --apply"} readJsonl now skips entries carrying `_meta` so downstream consumers don't see the marker as a self-feedback record. Tests updated to match the new marker shape. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-15 12:39:16 +02:00
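A reader that tolerates both header styles could look like the sketch below. The `_meta` marker and the try/catch-around-JSON.parse tolerance are described in the commit; the function name parseSelfFeedbackJsonl and the choice to take the file contents as a string are assumptions, and the real readJsonl may differ.

```javascript
// Hypothetical stand-in for readJsonl, operating on text instead of a file path.
function parseSelfFeedbackJsonl(text) {
  const entries = [];
  for (const line of text.split("\n")) {
    if (line.trim() === "") continue;
    let parsed;
    try {
      parsed = JSON.parse(line);
    } catch {
      continue; // tolerate legacy non-JSON lines such as a "# ..." header
    }
    // Skip the generated-file marker so it is never treated as a record.
    if (parsed && typeof parsed === "object" && "_meta" in parsed) continue;
    entries.push(parsed);
  }
  return entries;
}
```

Both a legacy `# generated ...` line and the newer `{"_meta": ...}` line are dropped, so downstream consumers only see real entries.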
Mikael Hugo	216b1d43f1	feat(self-feedback): DB-first migration — JSONL + Markdown as render targets Phase 2 of the DB-first planning state migration (proposal `f3571475d`, Phase 1 `ec65b4d88` covered VALIDATION.md). Same approach for self-feedback: DB is canonical; .sf/self-feedback.jsonl and .sf/SELF-FEEDBACK.md are projections regenerated from DB. Solves a real pain: 4 self-feedback entries were stuck visible in sf headless triage --list because the resolution path (markResolved) read JSONL while the entries lived only in DB after autonomous wrote them through the structured ledger. Hand-edit fixes were obsolete-bound under the divergent-stores design. markResolved (self-feedback.js:870-940): success branch now calls regenerateSelfFeedbackJsonl + regenerateSelfFeedbackMarkdown after the DB write (resolveSelfFeedbackEntry), replacing the appendResolutionToJsonl + regenerate-markdown sequence. Legacy in-place JSONL rewrite path retained only for !isForgeRepo (upstream log).
New helpers: - regenerateSelfFeedbackJsonl(basePath): writes JSONL from DB via listSelfFeedbackEntries(); first line is "# generated from .sf/sf.db — do not edit directly; use the resolve_issue tool" (readJsonl already tolerates non-JSON lines via try/catch in JSON.parse, no parser change needed) - backfillSelfFeedbackJsonl(basePath): calls importLegacyJsonlToDb then regenerateSelfFeedbackJsonl; idempotent and exact-byte stable on repeated calls Bootstrap (register-hooks.js): backfillSelfFeedbackJsonl runs on every session start before compactSelfFeedbackMarkdown. No-op when DB unavailable. DB schema unchanged: acceptanceCriteria lives in full_json column and is surfaced via rowToSelfFeedback's ...parsed spread; markResolved's AC-file-touch verification works without change. Tests: 6 new in self-feedback-db.test.mjs (DB-only entry resolves without JSONL, both projections reflect resolution, backfill idempotent + byte-stable, generated-header present, 4 flagged entries resolve cleanly via the new path). 28 tests in the file pass; full suite 179 files / 1863 tests pass, no regressions. Live verification: backfillSelfFeedbackJsonl ran against production .sf/sf.db; all 50 DB entries now in JSONL including the 4 previously stuck entries — resolve_issue calls for them now succeed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-15 12:29:39 +02:00
Mikael Hugo	7c78994612	fix(auto): pause on out-of-scope task changes	2026-05-15 12:17:20 +02:00
Mikael Hugo	32362a83bc	feat(sift): add --verbose flag and vector-index progress logging Adds three improvements to sift diagnostics: 1. --verbose flag: When SF_SIFT_LOG_LEVEL=debug|trace, sift search calls now include --verbose for richer stderr output from the Rust binary. Applied to sift_search, codebase_search, and warmup paths. 2. Vector-index progress poller: During searches that include the 'vector' retriever, a 30-second interval polls the global sift cache (~/.cache/sift/search/artifacts/indexes/*/sectors/) and writes progress lines to the log file: [2026-05-15T11:00:00Z] vector-index progress: 32 sectors (80 MB total) This lets an operator tail the log during long cold-cache embedding builds instead of staring at a silent process. 3. estimateVectorIndexProgress / countVectorSectors helpers count sector files across all index directories and report total count + size. Tests: 179 files / 1858 tests pass. Type check: clean. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-15 11:23:54 +02:00
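The progress line quoted in this commit could be produced by a formatter along these lines. Only the output format is taken from the log; the function name formatVectorProgressLine, the binary-megabyte rounding, and the millisecond stripping are assumptions about how the repository's helpers might render it.

```javascript
// Hypothetical formatter; 1 MB is assumed to mean 1024 * 1024 bytes here.
function formatVectorProgressLine(now, sectorCount, totalBytes) {
  // toISOString() always includes milliseconds; drop them to match the logged format.
  const stamp = now.toISOString().replace(/\.\d{3}Z$/, "Z");
  const megabytes = Math.round(totalBytes / (1024 * 1024));
  return `[${stamp}] vector-index progress: ${sectorCount} sectors (${megabytes} MB total)`;
}
```

A poller on a 30-second interval would append one such line per tick to the log file while the vector retriever is building its index.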
Mikael Hugo	9b42404149	fix(sift): change reranking from invalid 'rerank' to 'position-aware' chooseSiftRetrievers returned reranking: 'rerank' which is not a valid sift CLI value. Valid values are: none, position-aware, llm, jina, gemma. This caused vector searches to fail with 'invalid value for --reranking'. Fix: use 'position-aware' for scoped subdir searches. This is the structural reranking that pairs with the vector retriever strategy. Tests: 9/9 in sift-retriever-scope.test.mjs updated and passing. Full suite: 178 files / 1845 tests pass. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-15 11:06:33 +02:00
Mikael Hugo	5e478d6506	fix(auto): avoid duplicate swarm checkpoints	2026-05-15 11:01:08 +02:00
Mikael Hugo	7a4a62e244	fix(auto): cap checkpoint repairs before retries	2026-05-15 10:58:02 +02:00
Mikael Hugo	604ebbf824	feat(sift): structured stderr logging — last-search.log + RUST_LOG=info Adds operator/agent visibility into sift's indexing + retrieval stages. The 30-min cold full-repo vector indexing test went silent for the full budget because SF's wrappers never enabled sift's tracing layer; CPU and disk activity were the only externally visible signals. resolveSiftLogging(projectRoot) (code-intelligence.js:897) returns { env: { RUST_LOG: level }, logPath } honoring SF_SIFT_LOG_LEVEL (default "info"; "off"/"none"/"" disables). Default destination: ${projectRoot}/.sf/runtime/sift/last-search.log, truncated per call so it always reflects the most recent invocation. Wired into three spawn sites: - ensureSiftIndexWarmup (code-intelligence.js): detached child's stderr fd opened with openSync(logPath, "a") and passed as stdio[2] - runSift (tools/sift-search-tool.js): execFile env merges logEnv, stderr appended to logPath in the execFile callback - codebase_search execute (subagent/index.js): proc.stderr.on("data") tees to logPath via fs.appendFileSync alongside the existing in-memory buffer for tool output When a sift result is empty or times out, the tool reply now includes "(stage diagnostic: .sf/runtime/sift/last-search.log)" so the agent sees immediately where to look.
Tests: 11 new in sift-logging.test.mjs — env resolution matrix, log-file truncate/write contract, hint-string format on timeout/no-output/disabled. Full suite 1857/1857, no regressions. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-15 10:56:32 +02:00
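The env-resolution rule for sift logging can be sketched as a pure function. The return shape `{ env: { RUST_LOG: level }, logPath }`, the SF_SIFT_LOG_LEVEL variable, the "info" default, the disabling values, and the log path come from the commit message; the name resolveSiftLoggingSketch, the injected `env` parameter, the lowercasing, and returning null when disabled are assumptions.

```javascript
// Hypothetical pure variant of resolveSiftLogging; env is injected instead of
// reading process.env, and "disabled" is represented as null (an assumption).
function resolveSiftLoggingSketch(projectRoot, env = {}) {
  const raw = env.SF_SIFT_LOG_LEVEL;
  const level = raw === undefined ? "info" : String(raw).trim().toLowerCase();
  if (level === "" || level === "off" || level === "none") return null;
  return {
    env: { RUST_LOG: level },
    logPath: `${projectRoot}/.sf/runtime/sift/last-search.log`,
  };
}
```

A spawn site would merge the returned `env` into the child's environment and direct stderr to `logPath`, skipping both steps when the result is null.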
Mikael Hugo	091168303c	fix(auto): abort swarm checkpoint loops	2026-05-15 10:55:37 +02:00
Mikael Hugo	22760e03d5	fix(sift): increase timeouts for vector retriever + scope-aware retriever for codebase_search Vector retriever was disabled everywhere because it appeared to hang. It was actually doing a first-time embedding index build for 57K files, which takes ~60-90 min. Re-enable vector by increasing timeouts and letting scope-aware retriever selection decide when vector is safe. Changes: - sift_search: retriever timeout 30s->300s, total 60s->600s - codebase_search: total timeout 120s->600s - warmup: retriever timeout 30s->300s, hard timeout 600s->3600s - codebase_search now uses chooseSiftRetrievers() instead of hardcoded bm25+phrase: repo-root -> bm25+phrase (fast), scoped subdirs -> vector - Comments updated to reflect "slow first build" not "hang" Tests: 178 files / 1845 tests, all pass. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-15 10:46:35 +02:00
Mikael Hugo	427324fb93	fix(plan): update existing milestone specs without stale params	2026-05-15 10:45:18 +02:00
Mikael Hugo	6e40b829f2	feat(sift): scope-aware retriever selection — vector for scoped, bm25 for repo-root Commit `1a98d8f9a` hardcoded --retrievers bm25,phrase across all sift calls to work around the full-repo vector inference hang. But vector retrieval works fine on scoped subdirectory queries (empirically: ~30s on src/resources/extensions/sf/uok with real semantic scoring). The hang is the full-repo indexing scope, not the inference path. This commit replaces the universal bm25 restriction with a scope-aware selector chooseSiftRetrievers(scopePath, projectRoot): - scopePath resolves to repo root → bm25+phrase, no rerank (safe) - scopePath resolves to anything else → bm25+phrase+vector, rerank enabled (semantic ranking unlocked) ensureSiftIndexWarmup behavior unchanged (scope is "." → repo-root → bm25+phrase). buildSiftArgs in the codebase_search tool now defaults to vector when the caller passes a scoped path; explicit retrievers overrides still win. Unlocks the high-leverage uses described earlier this session (memory ranking, plan/research context pre-fetch) for free — those always scope to a sub-tree.
Tests: 9 new in sift-retriever-scope.test.mjs cover the dispatch matrix (repo-root variants get bm25, subdir variants get vector, explicit override wins, regression guard for warmup default). Full suite: 178 files / 1844 tests, no regressions. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-15 10:25:22 +02:00
Mikael Hugo	d90ac1fd69	fix(codebase_search): disable vector retriever to prevent hang The vector retriever in sift hangs indefinitely during embedding model inference, causing all codebase_search calls to timeout. Apply the same fix as sift_search: restrict retrievers to bm25+phrase and disable ML reranking. - buildCodebaseSearchArgs: add --retrievers bm25,phrase --reranking none - Update tool description from (BM25 + Vector) to (BM25 + phrase) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-15 10:13:31 +02:00
Mikael Hugo	1a98d8f9af	fix(sift): disable vector retriever + ML reranking to prevent hang The sentence-transformers/all-MiniLM-L6-v2 embedding model inference hangs indefinitely during sift search, causing: - Warmup to never complete (TTL expired 62+ min ago) - All page-index-hybrid searches to timeout - The search cache to become stale Fix: Restrict warmup and search to bm25+phrase retrievers with no ML reranking. This gives fast lexical results while avoiding the hanging embedding inference path. Also expose --retrievers and --reranking params in sift_search tool so callers can override per-query if needed. Closes #vector-hang-fix Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-15 09:45:49 +02:00
Mikael Hugo	ec65b4d881	feat(planning-state): DB-first VALIDATION.md migration (proposal MVP) Implements Phase 1 of docs/dev/proposals/db-first-planning-state.md (commit `f3571475d`). VALIDATION.md is now a render target; DB is canonical. Three read sites switched to DB: - tools/complete-milestone.js: getMilestoneValidationAssessment(id)?.status replaces readFile + extractVerdict (lines 126-137 → 126-140) - workspace-index.js: same swap in the indexWorkspace loop (was resolveMilestoneFile → loadFile → extractVerdict per milestone) - state-shared.js:readMilestoneValidationVerdict was already DB-first (prefers DB, file fallback only when no DB) — no change needed Write path regenerates: - tools/validate-milestone.js:renderValidationMarkdown now prepends <!-- generated from .sf/sf.db — do not edit directly; use the validate_milestone tool --> so the file is unambiguously a projection - verdict-parser.js:extractVerdict strips the comment header before frontmatter parsing so legacy readers (reflection.js, auto-prompts.js) still work on generated files Doctor check retired (clean delete): - doctor-engine-checks.js: db_projection_validation_drift detector removed entirely. Drift is structurally impossible once the write path always regenerates from DB. Comment block explains the removal. 
Tests: - New: db-first-validation.test.mjs — 6 tests covering regeneration, three read-site overrides, hand-edit override, doctor non-emission - Updated: doctor-db-projection-drift.test.mjs now asserts the check is NOT emitted (was previously asserting it WAS) Full suite: 469 passed, 0 failed, 3 skipped. No regressions. Closes the same class as the self-feedback DB/JSONL divergence pain — the M001-6377a4-VALIDATION.md doctor warning that's been firing repeatedly this session is gone by construction. Other planning artifacts (CONTEXT.md, ROADMAP.md, SUMMARY.md) follow in later phases per the proposal. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-15 09:35:28 +02:00
Mikael Hugo	7dbf8ad430	feat(model-policy): wire lineage-diverse-from-worker into selector Round 8's `e7cf16882` declared the adversary role and the lineage-diverse-from-worker constraint but left actual filtering as a TODO in selectAndApplyModel. This wires the filter end-to-end. selectAndApplyModel now accepts (role, workerModelId) trailing params: - role: from modelRoleForUnitType(unitType) (extended to recognize "adversary"/"challenge"/"red-team" unit types as the adversary role) - workerModelId: explicit caller-supplied override, else falls back to _lastWorkerModelId (process-local cache populated whenever a worker- role dispatch resolves a model) When role is adversary or reviewer AND the role-policy includes lineage-diverse-from-worker, applyLineageDiverseFilter strips candidates that share root vendor with the worker model (via isSameRootVendor from model-role-policy.js). If filtering would leave zero candidates, a warning is logged and the unfiltered set is used (better a same-vendor reviewer than no reviewer). phases-unit.js threads modelRoleForUnitType(unitType) into selectAndApplyModel — the only producer site that needed the role parameter. Tests: 13 new (7 pure unit on applyLineageDiverseFilter — vendor mapping matrix + edge cases; 6 integration on selectAndApplyModel + modelRoleForUnitType wiring). All 37 tests in the affected files pass, no regressions. Concern: if the per-unit model config (from disk prefs) maps exclusively to the worker's vendor and has no fallback candidates, returns appliedModel: null — operator-configurable. Documented in tests. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-15 09:24:50 +02:00
Mikael Hugo	f3454de58a	fix(triage): --run routes through runTriageApply{dryRun:true} via SF router Closes sf-mp5khix3-9beona architecture-defect:triage-run-bypasses-sf-routing. The legacy `runTriage` in self-feedback-drain.js hardcoded DEFAULT_TRIAGE_MODEL="google-gemini-cli/gemini-3-pro-preview" and dispatched via @singularity-forge/ai completeSimple (text-only, no tools). The result: an autonomous triage path that produced a markdown decision matrix operators had to manually apply via resolve_issue. Now `--run` goes through runTriageApply with a new `dryRun: true` option that: - uses the same Phase 1/2 pipeline as --apply (triage-decider + review) - pre-resolves the model via SF's router (rankTriageModelsViaRouter), no hardcoded model - skips Phase 3 applyTriagePlan (read-only by design) - uses permissionProfile="low" and relaxes the trusted-source + custom-runner guards for the inspection path - prefixes flowId with "triage-run-" for clean trace separation Legacy runTriage kept as @deprecated (still exercised by self-feedback-drain.test.mjs unit tests that target completeSimple dispatch directly). Tests: 6 new in headless-triage-run-routing.test.ts covering dryRun short-circuit, no ledger mutations, guard relaxation, router not hardcoded, disagreement surfaces deciderOutput. Full triage suite: 35 tests pass, 0 regressions. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-15 09:20:43 +02:00
Mikael Hugo	a5dd5db354	fix(self-feedback): align report kinds and isolate watchdog tests	2026-05-15 09:19:27 +02:00
Mikael Hugo	ff31258629	chore: capture autonomous in-flight self-improvements Snapshot uncommitted work autonomous made in this session: - run-unit.js +54: enrich runUnitViaSwarm with completedItems / remainingItems / verificationEvidence pass-through from worker checkpoint args - self-feedback.js +10 - 2 test files updated to match the new shape All 72 affected tests pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-15 09:03:42 +02:00
Mikael Hugo	d57cd84d9a	fix(auto): make halt watchdog observable	2026-05-15 08:09:02 +02:00
Mikael Hugo	f9c147a08b	fix(swarm): ignore heartbeats for silent worker timeout	2026-05-15 08:00:35 +02:00
Mikael Hugo	e464a1bd6e	fix(swarm): bound silent worker responses	2026-05-15 07:35:31 +02:00
Mikael Hugo	81425230f5	fix(headless): do not restart graceful child exits	2026-05-15 07:25:06 +02:00
Mikael Hugo	9ba9b55f7a	fix(uok): import memory extractor from closeout	2026-05-15 07:12:10 +02:00
Mikael Hugo	c5850c8039	fix(verify): ignore stale broad cargo preferences	2026-05-15 07:06:17 +02:00
Mikael Hugo	d1ca3d035c	fix(auto): count only unproductive runaway iterations	2026-05-15 06:55:05 +02:00
Mikael Hugo	5faa789f52	fix: ensure shared/tui.js stub is tracked for build/test stability Prevents ERR_MODULE_NOT_FOUND and unblocks builds/tests. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-15 06:48:49 +02:00
Copilot	cf9203aee0	feat(swarm): forward parent permission profile to in-process worker sessions In-process swarm workers get a fresh headless AgentSession whose permission extension defaults to read-only minimal. This blocks normal autonomous edits (e.g., write_file, edit) even when the parent session runs at normal or trusted level. - run-unit.js: add legacyPermissionLevelForProfile mapping and include executorPermissionLevel in the dispatch envelope. - swarm-dispatch.js: forward executorPermissionLevel from envelope to runAgentTurn as permissionLevel. - agent-runner.js: accept permissionLevel option and pass it to runSubagent config. - subagent-runner.ts: add permissionLevel to SubagentConfig; when set, temporarily set SF_PERMISSION_LEVEL env and run extension lifecycle so the permission extension reads the level before tool hooks execute. - Tests for envelope field, dispatch forwarding, and run-unit integration. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-15 06:38:42 +02:00
Mikael Hugo	f3571475d5	docs: DB-first planning state migration proposal Design doc for moving SF's milestone planning state from markdown-as-source-of-truth to DB-as-source-of-truth, with markdown becoming a render target. 463 lines, ~4500 words. Includes: - Survey of all markdown artifacts under .sf/milestones/M/ and who writes/reads each today (drift authoritative-ness is ambiguous in most cases) - MVP picks -VALIDATION.md as first artifact to migrate — three read-site fixes, no schema change, the doctor's db_projection_validation_drift check retires immediately - Hybrid editing UX (option c): CONTEXT-DRAFT and in-progress PLAN stay LLM-writable markdown; tool-call-bounded artifacts (validate_milestone, complete_slice, etc.) become DB-first with generated <!-- generated --> headers - 5-phase rollout plan - Open question flagged: git atomicity for milestone-level syncMilestoneLevelFiles calls — needs explicit tracing before Phase 4/5 No source-code changes. Implementation comes later. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-15 06:35:02 +02:00
Mikael Hugo	19e33f7239	feat(subagent): SF_SUBAGENT_VIA_SWARM=1 routes /delegate via swarm dispatch Add runSingleAgentViaSwarm as an opt-in path in subagent/index.js. When SF_SUBAGENT_VIA_SWARM=1 (or =true), /delegate, /rubber-duck, /ask, /share, /sidekicks dispatch through swarmDispatchAndWait instead of calling runSubagent directly. This consolidates the subagent extension onto the same dispatch path autonomous unit work uses (Round 4's runUnitViaSwarm). Gains memory inheritance from MessageBus, durable bus audit trail, and the same event-streaming + onEvent plumbing built up through Rounds 2-7. Default (flag unset) is byte-identical to today — no regression in the in-process runSubagent path; existing TUI live update panel still works via the same processSubagentEventLine adapter. Tests: 9 passing in subagent-via-swarm.test.mjs covering: - flag unset → existing path, swarmDispatchAndWait not called - flag=1 → swarmDispatchAndWait called with composed prompt and tools - result shape parity with existing path - onEvent forwards through processSubagentEventLine Confirms end-to-end tool registration works in the worker session: test output shows "tool count after bindExtensions: 3 (read, bash, Skill)" — Round 7's bindExtensions + _refreshToolRegistry wiring is live. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-15 06:35:02 +02:00
Mikael Hugo	1478579069	docs: AgentRuntime unification proposal Design doc for collapsing the five parallel agent-dispatch sites (defaultAgentRunner, runHeadlessPrompt, runSingleAgent, runUnitViaSwarm, slice-parallel-orchestrator) onto one runtime with three orthogonal axes — persistence, isolation, routing. 590 lines, ~5200 words. Includes: - Problem statement with five concrete pain points from this session's swarm convergence rounds (spawn hangs, inbox cache, checkpoint synthesis, ledger isolation, etc.) - Worked-out TypeScript interface - Mapping of each existing site to runtime options (table) - 8-step migration plan in blast-radius order (~4-5 days focused work) - Open questions No source-code changes. Implementation comes later. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-15 06:32:28 +02:00
Copilot	1e99bd669e	fix(auto): heartbeat before unit execution to prevent false-positive watchdog stalls The HaltWatchdog fires when the loop goes >10s without a heartbeat. Each iteration ends with a heartbeat, but unit execution itself can take 3+ minutes. Without a heartbeat at the start of the unit phase, the watchdog detects idle and emits a false-positive 'possible stuck iteration' error. Add watchdog.heartbeat() immediately before both runUnitPhaseViaContract calls (one in the custom-engine path, one in the dev path) so the watchdog timer is reset before the long-running work begins. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-15 06:30:40 +02:00
Mikael Hugo	e7cf168824	feat(model-policy): adversary role + lineage-diverse-from-worker constraint Add `adversary` to SUPPORTED_MODEL_ROLES and a new symbolic constraint `lineage-diverse-from-worker` to SUPPORTED_MODEL_ROLE_CONSTRAINTS. Default constraints for `adversary` and `reviewer` now include `lineage-diverse-from-worker` so the reviewer/adversary CANNOT be a lineage-twin of the model that produced the artifact under review — prevents "yeah looks fine to me" rubber-stamp from same-family models. Helpers exported alongside the policy: - rootVendorFor(modelId) → "anthropic" | "openai" | "google" | "moonshot" | "mistral" | "minimax" | "zhipu" | "meituan" | "unknown" - isSameRootVendor(candidateId, workerId) → boolean (fail-open on unknown) These are the building blocks the selector needs. The actual filter wiring in auto-model-selection's selectAndApplyModel is left as a documented TODO — the function doesn't currently thread role context through, so plugging in lineage filtering needs a small refactor that is out of scope here. Tests: 24 pass (was 6 + 18 new). Coverage: role registration, constraint registration, defaults, validation, rootVendor mapping matrix, isSameRootVendor predicate. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-15 06:30:08 +02:00
Mikael Hugo	8832be0785	chore(headless): surface v2 init failure reason in fallback warning The catch block was swallowing the actual error, leaving operators with "v2 init failed, falling back to v1 string-matching" and no diagnostic to act on. Found out this session that the failure was build staleness (packages/coding-agent dist was not rebuilt by copy-resources) — would have been instant to diagnose if the reason had been logged. Now: "[headless] Warning: v2 init failed (Timeout waiting for response to init...), falling back to v1 string-matching" Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-15 06:28:41 +02:00
Mikael Hugo	996b82001f	fix(auto): keep swarm continue checkpoints actionable	2026-05-15 06:26:30 +02:00
Mikael Hugo	3464db441c	fix(auto): repair empty continue checkpoints	2026-05-15 06:21:58 +02:00
Mikael Hugo	7e2f62ead3	fix(verify): ignore stale repo verification commands	2026-05-15 06:11:57 +02:00
Mikael Hugo	50383eb2bf	fix(auto): honor solver swarm tool counts	2026-05-15 05:54:02 +02:00
Mikael Hugo	dbfaca61cf	fix(swarm): surface worker tool call count to bypass parent-ledger guard Round 7 dogfood failed with "0 tool calls — context exhaustion" even though the swarm worker's session DID call tools. Root cause: the phases-unit.js zero-tool-call guard reads from the PARENT session's message ledger via snapshotUnitMetrics. The swarm worker runs in an ISOLATED subagent session — its tool calls never appear in the parent's messages, so the guard always sees 0 and fires a false- positive context-exhaustion retry. Fix: - runUnitViaSwarm now returns swarmToolCallCount on the UnitResult, surfacing the real worker tool call count from the onEvent stream (collectedToolCalls.length, accurate end-to-end). - phases-unit.js zero-tool-call guard checks unitResult._via === "swarm" && swarmToolCallCount > 0 and bypasses the false-positive retry, logging "zero-tool-calls-swarm-bypass". Also adds a debug stderr line in subagent-runner.ts printing the tool count after bindExtensions, confirming the worker session HAS the full tool set (checkpoint + built-ins) — Hypotheses 1 and 2 from the Round 8 brief ruled out by direct observation. Tests: 3 new (swarmToolCallCount = 0 / N / 1-on-checkpoint-only); 2518 tests pass total, 0 regressions. 
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-15 05:46:17 +02:00
Copilot	ea8a3d9354	feat(swarm): default SF_AUTONOMOUS_VIA_SWARM on in headless mode The swarm dispatch path is now automatically enabled when SF_HEADLESS=1 without requiring the operator to set SF_AUTONOMOUS_VIA_SWARM=1. This makes headless mode use the swarm execution engine by default, which is the intended architecture for autonomous execution. - Explicit SF_AUTONOMOUS_VIA_SWARM=1/true still works. - Explicit SF_AUTONOMOUS_VIA_SWARM=0/false disables it even in headless. - When unset + SF_HEADLESS=1, swarm is used. - When unset + SF_HEADLESS!=1, legacy path is used (unchanged). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-15 05:34:01 +02:00
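
The four resolution rules listed in `ea8a3d9354` can be read as a small decision function. The sketch below is illustrative only: `shouldUseSwarm` and the `Env` type are hypothetical names, not taken from the repository, and treating an unrecognized flag value like an unset one is an assumption the commit message does not state.

```typescript
// Hypothetical helper; the real implementation in the repository may differ.
type Env = Record<string, string | undefined>;

function shouldUseSwarm(env: Env): boolean {
  const flag = env.SF_AUTONOMOUS_VIA_SWARM?.trim().toLowerCase();

  // An explicit opt-in or opt-out always wins, regardless of headless mode.
  if (flag === "1" || flag === "true") return true;
  if (flag === "0" || flag === "false") return false;

  // Unset (or, by assumption here, unrecognized): follow SF_HEADLESS.
  return env.SF_HEADLESS === "1";
}

// Example: headless with the flag unset selects the swarm path.
const useSwarm = shouldUseSwarm({ SF_HEADLESS: "1" });
```

Taking the environment as a parameter rather than reading `process.env` directly keeps the sketch easy to test against each of the four cases.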

4523 commits