singularity/singularity-forge

Author	SHA1	Message	Date
Mikael Hugo	6d8fc62243	fix: use shared sf webserver project config	2026-05-17 22:09:28 +02:00
Mikael Hugo	c26de39afa	feat: add source-mounted sf server self-deploy	2026-05-17 22:00:01 +02:00
Mikael Hugo	cc67970fa0	fix(sf-db): share open-DB state across module instances via globalThis Two SQLite connections were being opened in the same Node process when the same module loaded under two graphs: - the autonomous-loop side loads sf-db modules via normal ESM resolution - src/headless-feedback.ts re-imports them via jiti.createJiti() so the in-server `sf headless feedback ...` drain can call them without bringing the agent extension into the rpc-mode bundle Module-level `let currentDb / currentPath / currentPid` etc. lived on two independent module instances, so each instance opened its own SQLite handle to .sf/sf.db. WAL mode lets readers share, but two writer connections in the same process produced SQLITE_BUSY / writer stalls — the hang we saw on sf-mpa4g46x and the wedged-drainer recurrence after the server restart at 19:35. Fix: hoist the connection slot onto globalThis under a well-known Symbol so every module instance points at the same record. All five fields formerly module-level become `_sf.<field>` and live in one shared object. Codex's original diagnosis (split module-graph DB-writer contention) was right; I dismissed it earlier because I missed that headless-feedback uses jiti even though rpc-mode itself doesn't import sf-db directly. Verification: - Syntax check: clean - sf-db-migration.test.mjs: 12/13 pass. The one failure (openDatabase_migrates_v27_tasks_without_created_at_through_spec_backfill expects schema version 72, actual 73) is unrelated — a schema migration landed elsewhere without bumping that test's expected version. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 21:47:01 +02:00
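
The fix described in the commit above (hoisting module-level connection state onto globalThis under a well-known Symbol) can be sketched roughly as follows. This is a minimal illustration, not the actual sf-db code: the registry key `sf.db.shared-state` and the helper `getSharedDbState` are hypothetical names, and only the field names `currentDb`, `currentPath`, and `currentPid` come from the commit message.

```javascript
// Hypothetical registry key. Symbol.for() returns the same symbol from every
// module instance in the process, which a plain Symbol() would not.
const SF_DB_STATE = Symbol.for("sf.db.shared-state");

// Return the single process-wide record, creating it on first use.
function getSharedDbState() {
  if (!globalThis[SF_DB_STATE]) {
    globalThis[SF_DB_STATE] = { currentDb: null, currentPath: null, currentPid: null };
  }
  return globalThis[SF_DB_STATE];
}

// Two "module instances" (for example one loaded via normal ESM resolution and
// one via jiti) each call the helper and end up holding the same object.
const _sfA = getSharedDbState();
const _sfB = getSharedDbState();
_sfA.currentPath = ".sf/sf.db";
console.log(_sfB.currentPath === ".sf/sf.db"); // true: both see the same slot
```

Because the record lives on globalThis rather than in module scope, loading the module a second time under a different module graph reuses the existing slot instead of opening a second connection.
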
Mikael Hugo	a3469f2334	feat(detectors): wire gate-deadlock-classifier into the autonomous loop Four changes that close the gap between the gate-deadlock-classifier landed in `ab2c99686` and a working detection signal. (1) Detector wrapper now returns outcome=manual-attention (not fail) when a deadlock fires. The whole point of detecting the deadlock is to escape it; returning `fail` would add another refusal and compound the lockout. Same precedent as periodicDetectorSweepGate. (2) New auto/gate-refusal-recorder.js: in-process ring buffer (cap 32, TTL 30 min) that records UokGate refusals from the dispatcher. Storage is intentionally in-memory; refusals are operational signals, not durable state. (3) auto/run-unit.js: calls recordGateRefusal() at the inline-route-refused branch, passing the rationale (already includes `[gate-id]` prefix + R-id status fragments the detector parses) plus unitType/unitId. (4) detectors/periodic-runner.js: adds a `gate-deadlock` entry to the default detector list, pulling ctx.gateRefusals from the caller OR falling back to recentGateRefusals() from the recorder. ctx can also override requirementCoverageByMilestone + resolveMilestoneId for tests.
After this change, an inline-route refusal flows: inlineRuntimeGate.execute → outcome=fail → run-unit.js records the refusal in gate-refusal-recorder → periodic-runner sweep picks it up via recentGateRefusals() → detectGateDeadlock cross-references against milestone coverage → if overlap: detectorsFired includes {name:"gate-deadlock", signature} → periodicDetectorSweepGate surfaces as manual-attention Tests: 16 detector + 10 existing periodic-runner = 26/26 pass. The existing periodic-runner test exercises the default detector list, so adding the new entry is implicitly validated. Follow-up still open: have the periodic sweep file a self_feedback entry when the gate-deadlock detector fires, so the operator and SF's autonomous triage both see the signal without polling logs. That belongs in the sweep handler, not the detector — separate commit. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 21:19:29 +02:00
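
The in-memory refusal recorder described in the commit above (cap 32, TTL 30 min) might look something like the sketch below. The function names `recordGateRefusal` and `recentGateRefusals` appear in the commit message, but their signatures, the `createRefusalRecorder` factory, and the injectable `now` clock are assumptions made here for illustration.

```javascript
// Sketch of a capped, TTL-bounded in-memory recorder. The factory and its
// options are hypothetical; the default cap and TTL follow the commit message.
function createRefusalRecorder({ cap = 32, ttlMs = 30 * 60 * 1000, now = Date.now } = {}) {
  const entries = []; // oldest first

  function recordGateRefusal({ rationale, unitType, unitId }) {
    entries.push({ rationale, unitType, unitId, at: now() });
    // Drop the oldest entries once the cap is exceeded.
    while (entries.length > cap) entries.shift();
  }

  function recentGateRefusals() {
    const cutoff = now() - ttlMs;
    // Expired entries are filtered out on read; nothing is persisted.
    return entries
      .filter((e) => e.at >= cutoff)
      .map(({ rationale, unitType, unitId }) => ({ rationale, unitType, unitId }));
  }

  return { recordGateRefusal, recentGateRefusals };
}

// Usage with default settings:
const recorder = createRefusalRecorder();
recorder.recordGateRefusal({
  rationale: "[inline-runtime] R020=active R066=active",
  unitType: "reassess-roadmap",
  unitId: "M048/S05",
});
console.log(recorder.recentGateRefusals().length); // 1
```

Injecting the clock keeps the TTL behaviour testable without waiting in real time.
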
Mikael Hugo	ab2c996866	feat(detectors): gate-deadlock-classifier — Wiggums detector for R074 self-deadlock The R074 inlineRuntimeGate refused inline dispatch for M048/S05 reassess-roadmap because R020 and R066 are still 'active' — but those slices ARE the work that validates R066. Autonomous mode stopped with no way to escape. Filed earlier as sf-mpa4f9k1-jm01rc. This detector classifies the pattern at runtime: parseGateRefusal(rationale) extracts gateId + refused requirement ids from gate-refusal text matching shape "[gate-id] ... R020=active R066=active ..." detectGateDeadlock(ctx, options) ctx.gateRefusals: recent gate refusal events ({rationale, unitType, unitId}) ctx.requirementCoverageByMilestone: milestone -> R-ids in its DoD/coverage ctx.resolveMilestoneId: optional unit -> milestone resolver (default: strip after '/', require M-prefix) Returns { stuck, reason: "gate-deadlock", signature: { gateId, deadlockedRequirements, refusedUnits, examples, suggestedAction }} when any refused unit's milestone coverage overlaps the gate's refused requirements. Per-gateId throttle prevents repeat firings within 60s. gateDeadlockClassifierGate UokGate (type=verification per ADR-0075) wrapping the detector for integration into periodicDetectorSweepGate + post-finalize sweeps. Registered in uok/gate-registry-bootstrap.js between inlineRuntimeGate and the existing detector chain. Also re-exported from detectors/index.js for the common detector import surface. Test coverage: - parseGateRefusal: 5 cases (inline shape, dedup, missing reqs, missing gate, empty) - detectGateDeadlock: 7 cases (empty input, fire-on-overlap, no-overlap, empty coverage, throttle, custom resolver, examples cap) - UokGate wrapper: 3 cases (contract shape, pass, fail-with-findings) - Threshold export sanity: 1 case 16/16 tests pass. 
The wiring from autonomous-loop output (where gate refusals are emitted) into the detector's gateRefusals input is a follow-up — this commit lands the detector with a stable contract and tests it can be wired against. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 21:15:21 +02:00
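
As an illustration of the refusal-text shape quoted in the commit above ("[gate-id] ... R020=active R066=active ..."), a parser along these lines would extract the gate id and the deduplicated requirement ids. This is a sketch under assumptions: the return shape (`gateId`, `requirementIds`) and the choice to return `null` when the gate or the requirement ids are missing are guesses, not the detector's confirmed contract.

```javascript
// Hypothetical re-implementation of the parsing step, for illustration only.
function parseGateRefusal(rationale) {
  if (typeof rationale !== "string") return null;

  // The gate id is the bracketed prefix at the start of the rationale.
  const gate = rationale.match(/^\s*\[([^\]]+)\]/);
  if (!gate) return null;

  // Collect "R<digits>=<status>" fragments; a Set removes duplicates.
  const ids = new Set();
  for (const m of rationale.matchAll(/\b(R\d+)=\w+/g)) ids.add(m[1]);
  if (ids.size === 0) return null;

  return { gateId: gate[1], requirementIds: [...ids] };
}

console.log(parseGateRefusal("[inline-runtime] refused: R020=active R066=active R020=active"));
```
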
Mikael Hugo	acd907fec2	fix: harden sf server control loop	2026-05-17 21:13:12 +02:00
Mikael Hugo	70d89eebec	feat(dev-server): auto-reload on SF extension + coding-agent + git upgrades Before: dev-server watched packages/daemon/src + dev scripts + package.json. SF extension source edits in src/resources/extensions/sf/ AND coding-agent edits in packages/coding-agent/src/ did NOT trigger restart. Operators had to restart manually after copy-resources / git pull / coding-agent edits. Adds three watched paths: 1. packages/coding-agent/src — rpc-mode hosts sf_feedback / start_autonomous handlers, lives here. Edits must restart the sf child. 2. dist/resources/.sf-resource-build-stamp — atomic stamp updated by copy-resources. Watching the stamp (not the dist tree) avoids heavy recursive walk while picking up extension upgrades the moment they land. Idempotent: ensure-source-resources only updates the stamp when an actual rebuild ran, so no restart-loop on identical re-runs. 3. .git/HEAD — changes on pull / branch switch / commit. Catches upgrade flows where source moved outside this process. Native (packages/native/) intentionally not watched — Rust build is 5–10 min, auto-trigger would loop. Operator triggers native rebuild manually per the existing ensure-source-resources policy. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 21:03:49 +02:00
Mikael Hugo	dd03d17089	chore: auto-commit after challenge SF-Unit: M048/S04/challenge	2026-05-17 20:33:12 +02:00
Mikael Hugo	d8fd70e57f	fix(sf): keep web autonomy on proven routes	2026-05-17 20:24:51 +02:00
Mikael Hugo	8f097f8dca	chore: auto-commit after challenge SF-Unit: M048/S03/challenge	2026-05-17 20:16:24 +02:00
Mikael Hugo	cf2d1a768e	feat(sf): route server control through rpc	2026-05-17 20:07:36 +02:00
Mikael Hugo	3adcb833ed	refactor(sf): separate daemon from server identity	2026-05-17 19:18:33 +02:00
Mikael Hugo	187d736930	fix(sf): run source server with live web host	2026-05-17 19:13:10 +02:00
Mikael Hugo	f7b262f33a	fix(sf): harden server pid lifecycle	2026-05-17 19:00:21 +02:00
Mikael Hugo	3568972059	fix(sf): use fixed server port	2026-05-17 18:55:21 +02:00
Mikael Hugo	425bba7d39	fix: restore full content of R074/R075 swarm files from worktrees The prior commit (`cc32ab79d`) accidentally landed truncated versions of the new R074 + R075 files due to a cherry-pick partial-state. Restored: - inline-runtime-gate.js: 74→96 LOC - inline-runtime-gate.test.mjs: 115→273 LOC (15 tests; 2 sonnet-imagined bootstrapGateRegistry/BOOTSTRAP_GATES tests rewritten to assert SF's actual side-effect-on-import registry pattern) - adversarial-budget.js: 86→106 LOC - adversarial-budget.test.mjs: 63→132 LOC (9 tests) - adversarial-finding-bridge.js: 123→191 LOC - adversarial-finding-bridge.test.mjs: 98→216 LOC (14 tests) 45/45 tests pass across the four affected files. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 18:54:39 +02:00
Mikael Hugo	cc32ab79d9	fix(docs): remove stale hashline-{read,edit}.ts rows post-fold Hashline read/edit tool wrappers were folded into Edit({match}) and Read({format}) modes in commit `ffdec0fee`. The two rows in FILE-SYSTEM-MAP.md pointed to files that no longer exist. Updated the surviving hashline.ts row to note its new consumer relationship with Edit/Read. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 18:48:34 +02:00
Mikael Hugo	781a7e7319	chore(safety): narrow autonomous-rollback to flag-flip only (R066 D1) Remove git-revert authority per operator decision M048-D1. Crash-loop classifier sees runtime evidence, not commit attribution; reverting on runtime symptoms risks reverting the wrong commit. On quarantine trigger, smoke_gate is flipped false to halt ledger writes and a self-feedback entry (kind: crash-loop-detected, severity: high) is filed with a manual-review suggestion. Operator retains sole authority to git-revert. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-17 18:24:00 +02:00
Mikael Hugo	c2f101734f	feat: enforce purpose-first adversarial review	2026-05-17 18:15:15 +02:00
Mikael Hugo	acafee06e2	fix: iter-completion-reconciler test uses relative timestamps Test had fixed literal timestamps (TS_X = "2026-05-17T12:42:05.618Z") that became stale once the calendar moved past them — the reconciler's default maxAgeMs (1h, "older drift is operator territory") filtered them out. By 3h after the original write the test failed: reconciled.length was 0 because no entry passed the age filter. Switch to NOW-relative timestamps (5/30/1 min back from Date.now()) so the fixture always lands inside the default age window regardless of when the test runs. Sonnet #13 (tool rename) report flagged this test as failing alongside the 4 known pre-existing failures. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 17:49:11 +02:00
Mikael Hugo	623af869b1	remove: SF voice IVR / ElevenLabs paging — migrated to centralcloud Per operator-direction 2026-05-17 (R089 — Migrate Voice IVR / ElevenLabs On-Call Paging Infrastructure out of SF). Migration target landed in centralcloud monorepo: - centralcloud_core/lib/centralcloud_core/voice.ex (TwiML + ElevenLabs) - centralcloud_staff/lib/.../controllers/voice_controller.ex (Phoenix) - centralcloud_staff/lib/.../controllers/voice_prompt_controller.ex - centralcloud_staff/lib/.../router.ex (/twilio scope) SF removal: - web/app/api/voice/route.ts - web/app/api/voice/prompt/route.ts - web/app/api/voice/ directory - src/tests/integration/web-voice-ivr-contract.test.ts Operator-paging infra was historical drift in SF (per-project compiler); belongs in centralcloud (org-level ops). R088 (Pre-Removal Test-Import Safety Gate) not yet built — operator manually verified safety scan: TWILIO_/ELEVENLABS_ env vars only referenced in the deleted files; no internal SF callers; centralcloud version verified present. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 17:42:16 +02:00
Mikael Hugo	ffdec0feee	fold: hashline_edit + hashline_read → Edit({match}) + Read({format}) modes Per operator R-entry sf-mp9wo7e3-sdxqss + no-compat directive. - Edit gains `match: "substring"\|"anchor"` arg; anchor mode routes to the existing applyHashlineEdits logic. Substring stays default. - Read gains `format: "plain"\|"tagged"` arg; tagged mode emits LINE#HASH prefixes via formatHashLines. - Delete hashline-edit.ts, hashline-read.ts. KEEP hashline.ts (helpers are now Edit/Read internals). - tools/index.ts: drop the two tools + the createHashlineCodingTools preset. - agent-session.ts: setEditMode no longer swaps tool instances (single tool surface; mode preserved for system-prompt context only). - sdk.ts + index.ts: remove hashline tool re-exports. - headless-ui.ts + test: remove hashline_edit case. Net agent-visible tool surface: -2 tools. Capability preserved as modes. No backward-compat alias for the removed tool names. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-17 17:39:59 +02:00
Mikael Hugo	d03758d803	feat: replace launchd with systemd user-unit install path Operator-direction 2026-05-17 "we will never use mac" — no compat preservation. Single-cutover replacement. - new packages/daemon/src/systemd.ts: install/uninstall/status using systemctl --user + ~/.config/systemd/user/sf-server.service - new packages/daemon/src/systemd.test.ts: ports launchd tests, same shape, mocked systemctl via RunCommandFn injection + SF_SYSTEMD_USER_DIR env override for real filesystem tests - cli-main.ts: switch import + update help text + status messages - index.ts: re-export systemd module (installSystemdUnit, uninstallSystemdUnit, systemdUnitStatus, generateUnit, getServicePath, SystemdStatus, SystemdUnitOptions) - DELETED: launchd.ts (253 LOC), launchd.test.ts (379 LOC) - docs/dev/drafts/M053-per-repo-supervisor.md: remove "launchd" mention - CHANGELOG.md: document systemd-only install path Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-17 17:33:34 +02:00
Mikael Hugo	44915b73d4	rename: tool names → Claude-Code-aligned (Bash/Read/Write/Edit/Grep/Glob/LS); remove run_command/read_output/hashline duplicates Per operator-direction 2026-05-17 (sf-mp9w20y1-nld9hc + "DONT KEE COMPAT" stance + adversarial-review override). Cross-vendor frontier LLMs are trained on PascalCase Claude Code tool names; calling them by SF's lowercase + novel names increases tool-call error rates. Single atomic cutover, no aliases. Internal implementations preserved; only the LLM-facing names + registrations change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 17:26:36 +02:00
Mikael Hugo	57fef5979d	feat: make sf server the operator entrypoint	2026-05-17 17:23:46 +02:00
Mikael Hugo	6e3b3d3c54	feat: add Serena-style AST tools (ReplaceSymbol, InsertAroundSymbol, AstGrep) Wraps the native AST primitives from @singularity-forge/native/{edit,ast} as LLM tools so agents can do tree-sitter-anchored code edits instead of substring-based Edit or line-anchor hashline. - replace-symbol.ts (+117): wraps replaceSymbol(file, symbolPath, newBody); matches function/class/method declarations via tree-sitter, returns matched=false sentinel when the symbol isn't located. - insert-around-symbol.ts (+122): wraps insertAroundSymbol with position enum BeforeDecl/AfterDecl/AtBodyStart/AtBodyEnd. - ast-grep.ts (+152): wraps astGrep for pattern matching across files with $VAR/$$$ARGS meta-variables; returns ranked matches with byte/line/column + captured meta-variable bindings. Each tool: - typebox schema matching the existing AgentTool pattern (edit.ts) - notifyFileChanged() into the LSP layer on write ops - resolveToCwd() for path normalization - catches native errors + returns isError result with the NativeUnavailableError message pointing operators to `nix develop` + `node rust-engine/scripts/build.js --dev` Wire-in: - tools/index.ts: re-exports + imports + entries in `allTools` map and createAllTools() factory. - extension-manifest.json: ReplaceSymbol / InsertAroundSymbol / AstGrep appended to provides.tools so SF extension agents see them. Higher value than substring/line-anchor for code in tree-sitter-supported languages (TS/JS/TSX/Python/Rust). Edit + hashline remain for non-code files. PascalCase names per the Claude-Code-aligned convention from sf-mp9w20y1-nld9hc. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 17:14:12 +02:00
Mikael Hugo	19b10eb67c	feat: make sf-server own swarm registry sync	2026-05-17 17:05:16 +02:00
Mikael Hugo	eeb80bbbdd	fix: register 6 detector gates + add adversarial-finding kind + watchdog log rotation Three concrete fixes from open self-feedback assessment 2026-05-17: - uok/gate-registry-bootstrap.js: register all 6 R081 detector gates (same-unit-loop, zero-progress, repeated-feedback-kind, artifact-flap, stale-lock, periodic-detector-sweep) alongside drift-detection and iter-completion-reconciler. Closes the gap reported by sf-mp9udspu-fsf7si — bootstrap previously registered 2 of 8 gates. - self-feedback.js ALLOWED_KIND_DOMAINS: add `adversarial-finding`. Closes gap reported by sf-mp9u4i25-fczmcj — R075 (autonomous adversarial review) challenge unit had no kind to file findings under. - sf-autonomous-watchdog.sh: delete watchdog-run-*.log files older than 60 minutes at each cycle start. Without rotation .sf/ grew to 1.9 GB in 24h (today's snapshot). 60 min retention captures last cycle for post-incident triage; older state is already in DB + iterations.jsonl. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 16:08:05 +02:00
Mikael Hugo	077fd0a2a7	remove A2A; swarm enrollment + status projection + web swarms view; headless refactor - A2A removal per M054/R071 cancellation 2026-05-17 (-2294 lines): - docs/plans/A2A_ADOPTION_PLAN.md, MISSION-A2A-ADOPTION.md deleted - src/resources/extensions/sf/uok/a2a-agent-server.js, a2a-transport.js deleted - tests/a2a-auth.test.mjs deleted - swarm-dispatch.js purged of A2A-conditional code paths - New: scripts/sf-swarm-enroll.mjs + test (operator-facing swarm enrollment, replaces former A2A pairing flow) - New: src/status-projection.ts + test, web/lib/swarm-status.ts + test, web/components/sf/swarms-view.tsx, web/app/api/swarms/ (web swarms-view surface — direct visibility into running swarm state without requiring TUI; aligns with project_tui_deprecating) - headless-{answers,query,ui,headless}.ts: coordinated tweaks consistent with the headless-as-default direction (R124 proposal) - docs/dev/drafts/M053-per-repo-supervisor.md: design refinement - .sf/REQUIREMENTS.md: small text fixes (6/6 churn) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 16:04:06 +02:00
Mikael Hugo	1cd7890d64	fix: auto-version-bump swallowed operator-direction; ptrmap + lock guards - sf-db-schema.js: auto_vacuum INCREMENTAL → NONE. The "Bad ptr map entry" corruption on 2026-05-17 was incremental-autovacuum ptrmap drift under concurrent writers. Recovered DB has no ptrmap; future fresh DBs must match. incremental_vacuum() callers in sf-db-core.js become no-ops. - bin/sf-from-source: lock allowlist extended to skip readonly sf headless subcommands (--help, query, status, usage, reflect, feedback list, triage --list/--json). Previously every sf headless invocation tried to acquire the project lock — operator couldn't even inspect SF state while autonomous was running. - self-feedback.js triageBlockedEntries: (1) treat empty/null/undefined sfVersion as unknown, not zero; (2) exempt operator-direction kinds (improvement-idea, architecture-defect, missing-feature, gap) from auto-version-bump close. Both were needed to prevent the R124 incident recurring. - headless-feedback.ts handleAdd: populate sfVersion via getCurrentSfVersion + detect repoIdentity via isForgeRepo, not hardcoded "external"/"". An empty sfVersion sorts below any real semver, so the resolver retry-closed every operator-filed entry within seconds. Net effect: R124 proposal (filed via sf headless feedback add) is no longer auto-resolved as version-stale. Larger architectural fix (single- writer SF daemon / RPC for all DB writes — M040 territory) tracked as follow-up R-entry. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 15:51:36 +02:00
Mikael Hugo	87e9729c13	fix: shard sift search and project requirements	2026-05-17 15:38:55 +02:00
Mikael Hugo	3e5b6fc511	fix: reconcile iteration completion drift	2026-05-17 15:06:40 +02:00
Mikael Hugo	f643272a91	fix: preserve requirements projection fidelity	2026-05-17 15:02:25 +02:00
Mikael Hugo	4289946e11	fix: clear task verification status on revert	2026-05-17 14:59:20 +02:00
Mikael Hugo	3e002ca698	refactor: consolidate loop signals and gate registry wiring	2026-05-17 14:45:12 +02:00
Mikael Hugo	4d2266e57d	fix: consolidate loop supervision gates	2026-05-17 14:35:40 +02:00
Mikael Hugo	625a830d2f	wire R053-R056 detectors into auto-runaway-guard + R081 UokGate retrofit - uok/auto-runaway-guard.js: invoke runDetectorSweep alongside the existing zero-progress check (fire-and-forget for sync-tick compatibility; results consumed on next tick via sweepState ring buffer). Passes unitId, unitMetrics, sessionFingerprint, lockPaths, and a 30-min DB-windowed recentFeedback slice. - detectors/{same-unit-loop, zero-progress, repeated-feedback-kind, artifact-flap, stale-lock, periodic-runner}.js: each detector now also exports a UokGate wrapper (id/type/execute -> GateResult per ADR-0075). Plain detector functions kept for existing consumers. - detectors/index.js: single import surface for the gate exports. - detector-stale-lock.test.mjs (9), detector-periodic-runner.test.mjs (10), detector-gates-contract.test.mjs: fills the R055/R056 test gap filed earlier today + proves UokGate contract conformance. - 41/41 detector tests green; copy-resources clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 14:18:54 +02:00
Mikael Hugo	d5664f7142	meta-supervisor (node daemon) + R091 triage gate + R091-R094 spec - scripts/sf-meta-supervisor.mjs: pure-node daemon supervising scripts/sf-autonomous-watchdog.sh. Tick=60s, restarts watchdog if dead, emits .sf/meta-status.json, halt via .sf/meta-supervisor.halt. Uses only node builtins (no SF dist deps) so it survives dist breakage. - src/headless.ts: R091 — gate the per-cycle handleTriage call on a time interval (SF_TRIAGE_INTERVAL_MS, default 30 min) and bump batch size (SF_TRIAGE_MAX, default 25, was 5). Drops the ~8min triage hit from every cycle while letting daily drain capacity rise. - .sf/REQUIREMENTS.md: R091 (triage sidecar) + R092 (PDD-completeness as routing signal) + R093 (pin model per orchestration agent.yaml) + R094 (swarm-role model tier specialization — 8 roles already exist in uok/swarm-roles.js; model field per role missing). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 14:08:30 +02:00
Mikael Hugo	e93d17a3b4	spec + ADR annotations + dormant-code cleanup - .sf/REQUIREMENTS.md: today's R-entries (R066..R090) covering parallel-rescue targets — bus deliver verify, drift detection gate, PDD typed contracts, lane split, Wiggums detector family, repo supervisor design. - ADR-014/019/020: SF-first banners (operator direction: get SF working before ACE/wire-architecture changes land downstream). - docs/records + drafts: 2026-05-07 strategy + cli-agent survey index refresh; SF/ACE pattern draft annotations. - roadmap-mutations.js removed (dormant — never imported; reachable shape verified against handler-relative + dynamic import audit). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 13:45:00 +02:00
Mikael Hugo	d2ff4e84ba	land 6 parallel codex/sonnet rescue outputs - R016 swarm bus deliver verify (uok/swarm-dispatch.js + test): _busDispatch now force-refreshes target inbox and verifies messageId visibility before returning ok:true; ack-without-deliver class closed. - R082 drift detection UokGate (uok/drift-detection-gate.js + test): single-task + sweep scope; 3 drift classes (artifact-missing, prose-status mismatch, broken-import); follows ADR-0075 id/type/execute -> GateResult contract. - R087 PDD typed contracts (engine-types.js + test): ADR-0000 8 PDD fields + 7-dim run-control policy + ADR-0075 GateResult typedefs and validators. - R090 planning-execute lane split (auto/unit-lanes.js + auto/loop.js + 2 tests): lane classifier + capacity-aware tick dispatcher; SF_LANES=0 fallback is byte-equivalent to pre-R090. - R053 + R054 Wiggums detectors (detectors/repeated-feedback-kind.js + detectors/artifact-flap.js + 2 tests); R055 stale-lock + R056 periodic-runner source landed without tests (gap filed as self-feedback). - M053 per-repo supervisor design + skeleton (supervisor/repo-supervisor.js + test + design doc): RepoSupervisor class, zero module-global state, tick stub, failure isolation; M056 trust-boundary called out as follow-up. 85/85 tests green across the 8 new test files. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 13:44:42 +02:00
Mikael Hugo	eaac4f0bd3	sf snapshot: uncommitted changes after 187m inactivity	2026-05-17 12:04:55 +02:00
Mikael Hugo	14fe3fa20a	spec(R057): rename to "System Lane" + introduce lane primitive Operator coined "system lane" — better than my "side-track". Frames the architecture cleanly. The lane primitive unifies: - R046 (multi-unit lanes) parallel slice dispatch - R049 (per-lane model routing) different LLM per lane - R057 (system lane) non-unit work alongside unit lane Today autoLoop is 1 unit lane. System lane runs alongside for memory consolidation, triage drain, doctor audits, log compaction, reflection assembly, catalog refresh — all currently queued between units. Single-writer DB met by sf-db.js serial queue. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 08:45:30 +02:00
Mikael Hugo	9bd7067b69	fix(wiggums): permission level — "normal" + default fallback to "medium" legacyPermissionLevelForProfile had a switch with cases for restricted/trusted/unrestricted only, no case for "normal" (the DEFAULT autonomous session profile per auto/session.js:377). "normal" fell through to default → "low" — too restrictive for autonomous work. Witnessed M010/S04/T01: solver note "TypeScript compilation and git diff blocked by low permission level" — SF couldn't verify its own deliverable because permissions were locked down despite running in autonomous mode. Fix: - "normal" → "medium" (allows tsc, git, npm test) - default → "medium" (was "low"); unknown profiles shouldn't cripple autonomous executors. Operators wanting strict mode set profile: "restricted" explicitly. Per operator intent 2026-05-17: "SF should have permission even if it can limit its agents and only allow orchestrator or whatever." Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 08:38:10 +02:00
Mikael Hugo	80ede48f06	sf snapshot: uncommitted changes after 246m inactivity	2026-05-17 08:28:04 +02:00
Mikael Hugo	c7b13607b5	fix(wiggums): exponential backoff on autoLoop halt-watchdog-break When the halt-watchdog detects stuck state, the autoLoop was logging "halt-watchdog-break" every iteration but otherwise tight-spinning through dispatch-resolve at ~2s/iteration. 2026-05-17 dogfood logged 60+ such events in a 30s window — pure CPU burn while the actual stuck condition stayed stuck. Fix: exponential backoff (1s → 2s → 4s → 8s → 16s → capped at 30s) based on how many halt thresholds have elapsed. Heartbeat() resets when real progress resumes (existing behavior). Backoff costs nothing when the loop is healthy. One of the 14 Ralph-Wiggum patterns surfaced this session. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 04:21:21 +02:00
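
The backoff schedule in the commit above (1s, 2s, 4s, 8s, 16s, then capped at 30s) corresponds to a capped doubling. The helper below is a sketch: the name `haltBackoffMs` and the assumption that the count of elapsed halt thresholds starts at 0 are illustrative, not taken from the autoLoop source.

```javascript
// Capped exponential backoff: the delay doubles per elapsed halt threshold.
function haltBackoffMs(thresholdsElapsed, baseMs = 1000, capMs = 30000) {
  const n = Math.max(0, Math.floor(thresholdsElapsed));
  return Math.min(capMs, baseMs * 2 ** n);
}

console.log([0, 1, 2, 3, 4, 5, 6].map((n) => haltBackoffMs(n)));
// [1000, 2000, 4000, 8000, 16000, 30000, 30000]
```

When progress resumes, the caller would reset the elapsed-threshold count to zero, so a healthy loop never pays the delay.
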
Mikael Hugo	24d2b37562	fix(wiggums): verify PID liveness before "Another session running" message Two sites told operator to "kill PID X" without checking X was alive: - interrupted-session.js:formatInterruptedSessionRunningMessage - auto.js autonomous-start blocking notification Both report stale locks from crashed prior sessions as if a live session exists, confusing operator and blocking restart. Session-lock.js already has auto-recovery for stale-PID locks; these two surfaces just needed matching liveness checks to label dead-PID locks correctly. Now: dead-PID → "Stale lock from dead PID X — will be auto-recovered" alive-PID → original "kill X" message Catches one of the 14 Ralph-Wiggum-obvious patterns surfaced this session. Reduces operator confusion + dovetails with R055 (M038/S05) when stale-lock auto-recovery becomes a core-loop detector. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 04:20:00 +02:00
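
The liveness check described in the commit above can be sketched with Node's `process.kill(pid, 0)`, which tests for the existence of a process without delivering a signal. The helper names `isPidAlive` and `describeLock` are hypothetical, and the message wording is paraphrased from the commit rather than copied from the source files.

```javascript
// Return true if a process with this PID currently exists.
function isPidAlive(pid) {
  // Reject 0 and negatives: they address process groups, not a single process.
  if (!Number.isInteger(pid) || pid <= 0) return false;
  try {
    process.kill(pid, 0); // signal 0: existence check only
    return true;
  } catch (err) {
    // EPERM means the process exists but is owned by another user.
    return err.code === "EPERM";
  }
}

// Choose the operator-facing message based on liveness.
function describeLock(pid) {
  return isPidAlive(pid)
    ? `Another session is running (PID ${pid}); kill ${pid} to take over`
    : `Stale lock from dead PID ${pid}; will be auto-recovered`;
}

console.log(describeLock(process.pid));
```
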
Mikael Hugo	a737af318d	fix(dispatch): reassess-roadmap loop on slice with ASSESSMENT.md checkNeedsReassessment looked up the slice's assessment file with suffix='ASSESS', but actual files are 'S03-ASSESSMENT.md'. The resolveFile pattern requires at least one char before the suffix (/^S03-.*-ASSESS\.md$/), so 'S03-ASSESSMENT.md' never matched and the helper returned {sliceId} on every poll → dispatcher kept firing reassess-roadmap forever. Fix: try 'ASSESSMENT' first, fall back to legacy 'ASSESS'. Now S03-ASSESSMENT.md properly satisfies the "already reassessed" check and the dispatcher advances to the next slice (S04). Verified: resolveSliceFile('M010','S03','ASSESSMENT') returns the real path; with the fallback, this resolves on first call. The 70+ degenerate reassess iterations on M010/S03 (witnessed 2026-05-17) won't recur. Ralph Wiggum approved. (per operator: "sf should clear these stuck itself ralph wiggums would fix") Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 04:12:15 +02:00
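
The suffix mismatch in the commit above can be reproduced in a few lines. The sketch below assumes a resolver that accepts either an exact `<slice>-<SUFFIX>.md` name or the descriptive `<slice>-<anything>-<SUFFIX>.md` pattern quoted in the message; the exact-name branch, the function names, and the in-memory file list are assumptions, since the real `resolveFile` is not shown.

```javascript
// Hypothetical resolver: exact name first, then the descriptive pattern.
function resolveFile(files, sliceId, suffix) {
  const exact = `${sliceId}-${suffix}.md`;
  const described = new RegExp(`^${sliceId}-.*-${suffix}\\.md$`);
  return files.find((f) => f === exact || described.test(f)) ?? null;
}

// The fix: try the current suffix first, then fall back to the legacy one.
function resolveAssessment(files, sliceId) {
  return (
    resolveFile(files, sliceId, "ASSESSMENT") ??
    resolveFile(files, sliceId, "ASSESS")
  );
}

const files = ["S03-ASSESSMENT.md"];
console.log(resolveFile(files, "S03", "ASSESS")); // null: the old lookup misses
console.log(resolveAssessment(files, "S03"));     // "S03-ASSESSMENT.md"
```

With the old lookup returning nothing on every poll, the dispatcher had no evidence that the slice was already reassessed, which is what kept the loop going.
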
Mikael Hugo	0c0608fa50	fix(swarm): recognize unit-specific completion tools as implicit complete Detected via supervisory check 2026-05-17: SF stuck in degenerate reassess- roadmap loop on M010/S03 (5 iterations in 8min, all returning outcome=continue). Root cause: synthesized-checkpoint in runUnitViaSwarm only treats the generic `checkpoint` tool as a completion signal — but units routinely complete via their unit-specific tool (reassess_roadmap with verdict=roadmap-confirmed, validate_milestone, complete_milestone, complete_slice, save_summary). The LLM correctly emitted the unit's specific completion tool + assistant text "<turn_status>complete</turn_status>", but workerSignaledOutcome stayed null → synthesized checkpoint fell back to continue → solver re-iterated. Fix: recognize UNIT_COMPLETION_TOOLS = {reassess_roadmap, validate_milestone, complete_milestone, complete_slice, save_summary} as implicit "complete" signals. The check fires when those tools are called and an earlier explicit checkpoint hasn't already said "complete" or "blocked". This resolves sf-mp94lth4-ew26om and should prevent future degenerate-iteration loops on reassess-roadmap and milestone completion units. 13/13 existing M010 tests still pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 03:59:30 +02:00
Mikael Hugo	7a273262f1	fix(benchmark-coverage): tier-strip fallbacks downgraded to 'approx' proxy User caught: flash-lite ≠ flash (different model tier, different scores). Previous fix counted flash-lite as fully covered via flash proxy, which overstated coverage and could mislead routing. benchmarkLookupVariants now tags variants with kind: - 'exact' → date/version strip + -latest alias (same model line) - 'approx' → tier strip (flash-lite→flash, X-lite→X) — different model computeBenchmarkCoverage promotes 'exact' matches to covered; 'approx' matches stay in uncovered with `approximatedBy` field so operators see when a real benchmark is still needed. Honest report: 64 exact covered / 1 proxy-only / 104 genuine uncovered (was 65/0/104 with the overcount). R049 + R050 added to traceability (M036/M037 future milestones). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 03:52:29 +02:00
Mikael Hugo	1dc7c2e278	feat(benchmark-coverage): variant-fallback lookup (R050 step 1) benchmark-coverage.js: new benchmarkLookupVariants() returns ordered fallback keys for a model id, and computeBenchmarkCoverage tries each variant before flagging uncovered. Patterns covered: - date/version suffix strip ("mistral-medium-2505" → "mistral-medium") - tier strip ("X-flash-lite" → "X-flash", "Y-lite" → "Y") - "-latest" append for bare names ("mistral-medium" → "mistral-medium-latest") The audit reports the matched variant via `matchedVia` so operators can see when fallback applied (vs adding a real entry). Verified: coverage 62/169 (37%) → 65/169 (38.4%). Sample fallback matches: google-gemini-cli/gemini-2.5-flash-lite → gemini-2.5-flash mistral/mistral-medium → mistral-medium-latest mistral/magistral-small-2509 → magistral-small R050 now active: full closure requires auto-benchmark of remaining 104 uncovered models via bulk-import of published scores or live eval. This step shrinks the gap via cheap structural fallback; future work adds the real scoring loop. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 03:50:21 +02:00
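
The variant-fallback lookup from the two benchmark-coverage commits above, including the later `exact` / `approx` tagging, can be sketched as follows. The function name `benchmarkLookupVariants` and the `matchedVia` field come from the commit messages; the variant ordering, the inclusion of the original id as the first variant, the `-\d{4,}` date pattern, and the `lookupBenchmark` helper are assumptions for illustration.

```javascript
// Produce ordered fallback keys for a model id, tagged by how trustworthy
// the match would be ('exact' = same model line, 'approx' = different tier).
function benchmarkLookupVariants(modelId) {
  const variants = [{ key: modelId, kind: "exact" }];
  const add = (key, kind) => {
    if (!variants.some((v) => v.key === key)) variants.push({ key, kind });
  };

  // Date/version suffix strip: "mistral-medium-2505" -> "mistral-medium".
  const undated = modelId.replace(/-\d{4,}$/, "");
  add(undated, "exact");

  // "-latest" alias for bare names: "mistral-medium" -> "mistral-medium-latest".
  if (!undated.endsWith("-latest")) add(`${undated}-latest`, "exact");

  // Tier strip: "X-flash-lite" -> "X-flash". A different model, so only 'approx'.
  if (undated.endsWith("-lite")) add(undated.slice(0, -"-lite".length), "approx");

  return variants;
}

// Only 'exact' matches count as covered; 'approx' matches are reported
// through matchedVia but stay uncovered.
function lookupBenchmark(modelId, table) {
  for (const { key, kind } of benchmarkLookupVariants(modelId)) {
    if (Object.hasOwn(table, key)) return { covered: kind === "exact", matchedVia: key, kind };
  }
  return { covered: false, matchedVia: null, kind: null };
}

const table = { "gemini-2.5-flash": 1, "mistral-medium-latest": 1 };
console.log(lookupBenchmark("gemini-2.5-flash-lite", table));
console.log(lookupBenchmark("mistral-medium", table));
```

Keeping the `kind` on each variant lets the coverage report distinguish a real benchmark entry from a tier proxy, which is the overcount the later commit corrects.
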


3400 commits