Two SQLite connections were being opened in the same Node process when
the same module loaded under two graphs:
- the autonomous-loop side loads sf-db modules via normal ESM resolution
- src/headless-feedback.ts re-imports them via jiti.createJiti() so the
in-server `sf headless feedback ...` drain can call them without
bringing the agent extension into the rpc-mode bundle
Module-level `let currentDb / currentPath / currentPid` etc. lived on
two independent module instances, so each instance opened its own
SQLite handle to .sf/sf.db. WAL mode lets readers share, but two writer
connections in the same process produced SQLITE_BUSY / writer stalls —
the hang we saw on sf-mpa4g46x and the wedged-drainer recurrence after
the server restart at 19:35.
Fix: hoist the connection slot onto globalThis under a well-known
Symbol so every module instance points at the same record. All five
fields formerly module-level become `_sf.<field>` and live in one
shared object.
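
A minimal sketch of the shared-slot pattern, not the actual sf-db code. It
assumes the Symbol is created with `Symbol.for` (a plain `Symbol()` would be
distinct per module instance and defeat the purpose). The key name
"sf.db.connection-slot" and the `getSharedSlot` helper are hypothetical, and
only three of the five fields are shown.

```javascript
// Hypothetical key name. Symbol.for() uses the process-wide symbol registry,
// so every module instance (ESM-loaded or jiti-loaded) resolves the same key.
const SLOT_KEY = Symbol.for("sf.db.connection-slot");

// Stand-in for the top of the sf-db module: each evaluation of the module
// body calls this once and keeps the returned record as `_sf`.
function getSharedSlot() {
  if (!globalThis[SLOT_KEY]) {
    globalThis[SLOT_KEY] = { currentDb: null, currentPath: null, currentPid: null };
  }
  return globalThis[SLOT_KEY];
}

// Simulate the same module being evaluated under two module graphs.
const instanceA = { _sf: getSharedSlot() };
const instanceB = { _sf: getSharedSlot() };

instanceA._sf.currentPath = ".sf/sf.db";
console.log(instanceB._sf.currentPath); // ".sf/sf.db"
```

Because both instances hold the same record, a handle opened through one graph
is visible to the other, which is what removes the second writer connection.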
Codex's original diagnosis (split module-graph DB-writer contention)
was right; I dismissed it earlier because I missed that
headless-feedback uses jiti even though rpc-mode itself doesn't import
sf-db directly.
Verification:
- Syntax check: clean
- sf-db-migration.test.mjs: 12/13 pass. The one failure
(openDatabase_migrates_v27_tasks_without_created_at_through_spec_backfill
expects schema version 72, actual 73) is unrelated — a schema
migration landed elsewhere without bumping that test's expected
version.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Four changes that close the gap between the gate-deadlock-classifier
landed in ab2c99686 and a working detection signal.
(1) Detector wrapper now returns outcome=manual-attention (not fail) when
a deadlock fires. The whole point of detecting the deadlock is to
escape it — returning `fail` would add another refusal and compound
the lockout. Same precedent as periodicDetectorSweepGate.
(2) New auto/gate-refusal-recorder.js — in-process ring buffer (cap 32,
TTL 30 min) that records UokGate refusals from the dispatcher.
Storage is intentionally in-memory; refusals are operational signals,
not durable state.
(3) auto/run-unit.js — calls recordGateRefusal() at the inline-route-refused
branch, passing the rationale (already includes `[gate-id]` prefix +
R-id status fragments the detector parses) plus unitType/unitId.
(4) detectors/periodic-runner.js — adds a `gate-deadlock` entry to the
default detector list, pulling ctx.gateRefusals from the caller OR
falling back to recentGateRefusals() from the recorder. ctx can also
override requirementCoverageByMilestone + resolveMilestoneId for tests.
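
A rough sketch of the recorder described in (2), assuming a plain array as
the ring buffer. The cap and TTL are from this commit; the argument shape of
recordGateRefusal and the injectable `now` parameter are assumptions made to
keep the example testable.

```javascript
const CAP = 32;                // max retained refusals
const TTL_MS = 30 * 60 * 1000; // 30 min

const refusals = [];

// Append a refusal and drop the oldest entries once the cap is exceeded.
function recordGateRefusal({ rationale, unitType, unitId }, now = Date.now()) {
  refusals.push({ rationale, unitType, unitId, at: now });
  if (refusals.length > CAP) refusals.splice(0, refusals.length - CAP);
}

// Return only refusals that are still inside the TTL window.
function recentGateRefusals(now = Date.now()) {
  return refusals.filter((r) => now - r.at <= TTL_MS);
}
```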
After this change, an inline-route refusal flows:
inlineRuntimeGate.execute → outcome=fail
→ run-unit.js records the refusal in gate-refusal-recorder
→ periodic-runner sweep picks it up via recentGateRefusals()
→ detectGateDeadlock cross-references against milestone coverage
→ if overlap: detectorsFired includes {name:"gate-deadlock", signature}
→ periodicDetectorSweepGate surfaces as manual-attention
Tests: 16 detector + 10 existing periodic-runner = 26/26 pass. The
existing periodic-runner test exercises the default detector list, so
adding the new entry is implicitly validated.
Follow-up still open: have the periodic sweep file a self_feedback entry
when the gate-deadlock detector fires, so the operator and SF's autonomous
triage both see the signal without polling logs. That belongs in the
sweep handler, not the detector — separate commit.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The R074 inlineRuntimeGate refused inline dispatch for M048/S05 reassess-roadmap
because R020 and R066 are still 'active' — but those slices ARE the work that
validates R066. Autonomous mode stopped with no way to escape. Filed earlier as
sf-mpa4f9k1-jm01rc.
This detector classifies the pattern at runtime:
parseGateRefusal(rationale)
extracts gateId + refused requirement ids from gate-refusal text
matching shape "[gate-id] ... R020=active R066=active ..."
detectGateDeadlock(ctx, options)
ctx.gateRefusals: recent gate refusal events ({rationale, unitType, unitId})
ctx.requirementCoverageByMilestone: milestone -> R-ids in its DoD/coverage
ctx.resolveMilestoneId: optional unit -> milestone resolver
(default: strip after '/', require M-prefix)
Returns { stuck, reason: "gate-deadlock", signature: {
gateId, deadlockedRequirements, refusedUnits, examples, suggestedAction
}} when any refused unit's milestone coverage overlaps the gate's refused
requirements. Per-gateId throttle prevents repeat firings within 60s.
gateDeadlockClassifierGate
UokGate (type=verification per ADR-0075) wrapping the detector for
integration into periodicDetectorSweepGate + post-finalize sweeps.
Registered in uok/gate-registry-bootstrap.js between inlineRuntimeGate and the
existing detector chain. Also re-exported from detectors/index.js for the
common detector import surface.
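
A simplified sketch of the parse-and-overlap logic, assuming
requirementCoverageByMilestone is a plain object and that unparseable input
yields null. The throttle, examples cap, and suggestedAction are omitted, and
the gate id in the usage note is made up for illustration.

```javascript
// Parses text shaped like "[gate-id] ... R020=active R066=active ...".
function parseGateRefusal(rationale) {
  const text = String(rationale ?? "");
  const gate = /\[([^\]]+)\]/.exec(text);
  if (!gate) return null;
  const ids = [...text.matchAll(/\b(R\d+)=[\w-]+/g)].map((m) => m[1]);
  const requirementIds = [...new Set(ids)]; // dedup, keep first-seen order
  return requirementIds.length > 0 ? { gateId: gate[1], requirementIds } : null;
}

// Default resolver: keep the part before '/', require an M prefix.
function defaultResolveMilestoneId(unitId) {
  const head = String(unitId ?? "").split("/")[0];
  return head.startsWith("M") ? head : null;
}

function detectGateDeadlock(ctx) {
  const resolve = ctx.resolveMilestoneId ?? defaultResolveMilestoneId;
  const coverageByMilestone = ctx.requirementCoverageByMilestone ?? {};
  for (const refusal of ctx.gateRefusals ?? []) {
    const parsed = parseGateRefusal(refusal.rationale);
    if (!parsed) continue;
    const coverage = coverageByMilestone[resolve(refusal.unitId)] ?? [];
    const deadlocked = parsed.requirementIds.filter((id) => coverage.includes(id));
    if (deadlocked.length > 0) {
      return {
        stuck: true,
        reason: "gate-deadlock",
        signature: {
          gateId: parsed.gateId,
          deadlockedRequirements: deadlocked,
          refusedUnits: [refusal.unitId],
        },
      };
    }
  }
  return { stuck: false };
}
```

With a refusal for unit "M048/S05" whose rationale is
"[inline-runtime] refused: R020=active R066=active" and coverage
`{ M048: ["R066"] }`, the sketch reports a deadlock on R066.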
Test coverage:
- parseGateRefusal: 5 cases (inline shape, dedup, missing reqs, missing gate, empty)
- detectGateDeadlock: 7 cases (empty input, fire-on-overlap, no-overlap,
empty coverage, throttle, custom resolver,
examples cap)
- UokGate wrapper: 3 cases (contract shape, pass, fail-with-findings)
- Threshold export sanity: 1 case
16/16 tests pass.
The wiring from autonomous-loop output (where gate refusals are emitted) into
the detector's gateRefusals input is a follow-up — this commit lands the
detector with a stable contract and tests it can be wired against.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Before: dev-server watched packages/daemon/src + dev scripts + package.json.
SF extension source edits in src/resources/extensions/sf/ AND coding-agent
edits in packages/coding-agent/src/ did NOT trigger restart. Operators had to
restart manually after copy-resources / git pull / coding-agent edits.
Adds three watched paths:
1. packages/coding-agent/src (home of rpc-mode, which hosts the sf_feedback /
start_autonomous handlers). Edits must restart the sf child.
2. dist/resources/.sf-resource-build-stamp — atomic stamp updated by
copy-resources. Watching the stamp (not the dist tree) avoids heavy
recursive walk while picking up extension upgrades the moment they land.
Idempotent: ensure-source-resources only updates the stamp when an actual
rebuild ran, so no restart-loop on identical re-runs.
3. .git/HEAD — changes on pull / branch switch / commit. Catches upgrade
flows where source moved outside this process.
Native (packages/native/) is intentionally not watched: the Rust build takes
5–10 min, so an auto-trigger would loop. The operator triggers native rebuilds
manually per the existing ensure-source-resources policy.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The prior commit (cc32ab79d) accidentally landed truncated versions of the
new R074 + R075 files due to a cherry-pick partial-state. Restored:
- inline-runtime-gate.js: 74→96 LOC
- inline-runtime-gate.test.mjs: 115→273 LOC (15 tests; 2 sonnet-imagined
bootstrapGateRegistry/BOOTSTRAP_GATES tests rewritten to assert SF's
actual side-effect-on-import registry pattern)
- adversarial-budget.js: 86→106 LOC
- adversarial-budget.test.mjs: 63→132 LOC (9 tests)
- adversarial-finding-bridge.js: 123→191 LOC
- adversarial-finding-bridge.test.mjs: 98→216 LOC (14 tests)
45/45 tests pass across the four affected files.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Hashline read/edit tool wrappers were folded into Edit({match}) and
Read({format}) modes in commit ffdec0fee. The two rows in FILE-SYSTEM-MAP.md
pointed to files that no longer exist. Updated the surviving hashline.ts row
to note its new consumer relationship with Edit/Read.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Remove git-revert authority per operator decision M048-D1. Crash-loop
classifier sees runtime evidence, not commit attribution; reverting on
runtime symptoms risks reverting the wrong commit. On quarantine trigger,
smoke_gate is flipped to false to halt ledger writes, and a self-feedback entry
(kind: crash-loop-detected, severity: high) is filed with a manual-review
suggestion. Operator retains sole authority to git-revert.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Test had fixed literal timestamps (TS_X = "2026-05-17T12:42:05.618Z")
that became stale once the calendar moved past them — the reconciler's
default maxAgeMs (1h, "older drift is operator territory") filtered
them out. By 3h after the original write the test failed: reconciled.length
was 0 because no entry passed the age filter.
Switch to NOW-relative timestamps (5/30/1 min back from Date.now()) so
the fixture always lands inside the default age window regardless of
when the test runs.
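
A small sketch of the fixture change, assuming entries carry an ISO timestamp
in a field called `ts` (hypothetical name) and that the reconciler's age
filter behaves like `withinAgeWindow` below.

```javascript
const MAX_AGE_MS = 60 * 60 * 1000; // reconciler default window (1h)

// Build an ISO timestamp n minutes before `now`.
const minutesAgo = (n, now = Date.now()) =>
  new Date(now - n * 60_000).toISOString();

// Stand-in for the reconciler's age filter.
function withinAgeWindow(entries, now = Date.now(), maxAgeMs = MAX_AGE_MS) {
  return entries.filter((e) => now - Date.parse(e.ts) <= maxAgeMs);
}

const now = Date.now();
// NOW-relative fixture: always inside the window, whenever the test runs.
const fixture = [5, 30, 1].map((n) => ({ ts: minutesAgo(n, now) }));
// Old literal fixture, evaluated a few hours after it was written.
const stale = [{ ts: "2026-05-17T12:42:05.618Z" }];
const laterThatDay = Date.parse("2026-05-17T16:00:00.000Z");

console.log(withinAgeWindow(fixture, now).length);        // 3
console.log(withinAgeWindow(stale, laterThatDay).length); // 0
```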
Sonnet #13 (tool rename) report flagged this test as failing alongside
the 4 known pre-existing failures.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Per operator-direction 2026-05-17 (R089 — Migrate Voice IVR / ElevenLabs
On-Call Paging Infrastructure out of SF). Migration target landed in
centralcloud monorepo:
- centralcloud_core/lib/centralcloud_core/voice.ex (TwiML + ElevenLabs)
- centralcloud_staff/lib/.../controllers/voice_controller.ex (Phoenix)
- centralcloud_staff/lib/.../controllers/voice_prompt_controller.ex
- centralcloud_staff/lib/.../router.ex (/twilio scope)
SF removal:
- web/app/api/voice/route.ts
- web/app/api/voice/prompt/route.ts
- web/app/api/voice/ directory
- src/tests/integration/web-voice-ivr-contract.test.ts
Operator-paging infra was historical drift in SF (per-project compiler);
belongs in centralcloud (org-level ops). R088 (Pre-Removal Test-Import
Safety Gate) not yet built — operator manually verified safety scan:
TWILIO_/ELEVENLABS_ env vars only referenced in the deleted files; no
internal SF callers; centralcloud version verified present.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Per operator-direction 2026-05-17 (sf-mp9w20y1-nld9hc + "DONT KEE COMPAT"
stance + adversarial-review override). Cross-vendor frontier LLMs are trained
on PascalCase Claude Code tool names; calling them by SF's lowercase + novel
names increases tool-call error rates. Single atomic cutover, no aliases.
Internal implementations preserved; only the LLM-facing names + registrations
change.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Wraps the native AST primitives from @singularity-forge/native/{edit,ast} as
LLM tools so agents can do tree-sitter-anchored code edits instead of
substring-based Edit or line-anchor hashline.
- replace-symbol.ts (+117): wraps replaceSymbol(file, symbolPath, newBody);
matches function/class/method declarations via tree-sitter, returns
matched=false sentinel when the symbol isn't located.
- insert-around-symbol.ts (+122): wraps insertAroundSymbol with position
enum BeforeDecl/AfterDecl/AtBodyStart/AtBodyEnd.
- ast-grep.ts (+152): wraps astGrep for pattern matching across files with
$VAR/$$$ARGS meta-variables; returns ranked matches with byte/line/column
+ captured meta-variable bindings.
Each tool:
- typebox schema matching the existing AgentTool pattern (edit.ts)
- notifyFileChanged() into the LSP layer on write ops
- resolveToCwd() for path normalization
- catches native errors + returns isError result with the
NativeUnavailableError message pointing operators to
`nix develop` + `node rust-engine/scripts/build.js --dev`
Wire-in:
- tools/index.ts: re-exports + imports + entries in `allTools` map and
createAllTools() factory.
- extension-manifest.json: ReplaceSymbol / InsertAroundSymbol / AstGrep
appended to provides.tools so SF extension agents see them.
Higher value than substring/line-anchor for code in tree-sitter-supported
languages (TS/JS/TSX/Python/Rust). Edit + hashline remain for non-code
files. PascalCase names per the Claude-Code-aligned convention from
sf-mp9w20y1-nld9hc.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three concrete fixes from open self-feedback assessment 2026-05-17:
- uok/gate-registry-bootstrap.js: register all 6 R081 detector gates
(same-unit-loop, zero-progress, repeated-feedback-kind, artifact-flap,
stale-lock, periodic-detector-sweep) alongside drift-detection and
iter-completion-reconciler. Closes the gap reported by
sf-mp9udspu-fsf7si — bootstrap previously registered 2 of 8 gates.
- self-feedback.js ALLOWED_KIND_DOMAINS: add `adversarial-finding`.
Closes gap reported by sf-mp9u4i25-fczmcj — R075 (autonomous
adversarial review) challenge unit had no kind to file findings under.
- sf-autonomous-watchdog.sh: delete watchdog-run-*.log files older than
60 minutes at each cycle start. Without rotation .sf/ grew to 1.9 GB
in 24h (today's snapshot). 60 min retention captures last cycle for
post-incident triage; older state is already in DB + iterations.jsonl.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- sf-db-schema.js: auto_vacuum INCREMENTAL → NONE. The "Bad ptr map entry"
corruption on 2026-05-17 was incremental-autovacuum ptrmap drift under
concurrent writers. Recovered DB has no ptrmap; future fresh DBs must
match. incremental_vacuum() callers in sf-db-core.js become no-ops.
- bin/sf-from-source: lock allowlist extended to skip readonly sf headless
subcommands (--help, query, status, usage, reflect, feedback list,
triage --list/--json). Previously every sf headless invocation tried
to acquire the project lock — operator couldn't even inspect SF state
while autonomous was running.
- self-feedback.js triageBlockedEntries: (1) treat empty/null/undefined
sfVersion as unknown, not zero; (2) exempt operator-direction kinds
(improvement-idea, architecture-defect, missing-feature, gap) from
auto-version-bump close. Both were needed to prevent the R124 incident
recurring.
- headless-feedback.ts handleAdd: populate sfVersion via getCurrentSfVersion
+ detect repoIdentity via isForgeRepo, not hardcoded "external"/"". An
empty sfVersion sorts below any real semver, so the resolver retry-closed
every operator-filed entry within seconds.
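
A hedged sketch of the two triage rules, not the real triageBlockedEntries.
The function names, the entry shape, and the simple three-part version parser
are assumptions; only the exempt kinds and the "unknown is not zero" rule come
from this commit.

```javascript
// Kinds named in this commit as exempt from version-bump auto-close.
const OPERATOR_DIRECTION_KINDS = new Set([
  "improvement-idea", "architecture-defect", "missing-feature", "gap",
]);

// Returns [major, minor, patch], or null for empty/null/undefined/unparseable
// input. null means "unknown", never "version zero".
function parseVersion(v) {
  const m = /^(\d+)\.(\d+)\.(\d+)/.exec(v ?? "");
  return m ? m.slice(1).map(Number) : null;
}

function isOlder(a, b) {
  for (let i = 0; i < 3; i++) {
    if (a[i] !== b[i]) return a[i] < b[i];
  }
  return false;
}

function shouldAutoCloseAsVersionStale(entry, currentVersion) {
  if (OPERATOR_DIRECTION_KINDS.has(entry.kind)) return false;
  const filed = parseVersion(entry.sfVersion);
  const current = parseVersion(currentVersion);
  if (!filed || !current) return false; // unknown version: leave the entry open
  return isOlder(filed, current);
}
```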
Net effect: R124 proposal (filed via sf headless feedback add) is no
longer auto-resolved as version-stale. Larger architectural fix (single-
writer SF daemon / RPC for all DB writes — M040 territory) tracked as
follow-up R-entry.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- uok/auto-runaway-guard.js: invoke runDetectorSweep alongside the existing
zero-progress check (fire-and-forget for sync-tick compatibility; results
consumed on next tick via sweepState ring buffer). Passes unitId,
unitMetrics, sessionFingerprint, lockPaths, and a 30-min DB-windowed
recentFeedback slice.
- detectors/{same-unit-loop, zero-progress, repeated-feedback-kind,
artifact-flap, stale-lock, periodic-runner}.js: each detector now also
exports a UokGate wrapper (id/type/execute -> GateResult per ADR-0075).
Plain detector functions kept for existing consumers.
- detectors/index.js: single import surface for the gate exports.
- detector-stale-lock.test.mjs (9), detector-periodic-runner.test.mjs (10),
detector-gates-contract.test.mjs: fills the R055/R056 test gap filed
earlier today + proves UokGate contract conformance.
- 41/41 detector tests green; copy-resources clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- scripts/sf-meta-supervisor.mjs: pure-node daemon supervising
scripts/sf-autonomous-watchdog.sh. Tick=60s, restarts watchdog if dead,
emits .sf/meta-status.json, halt via .sf/meta-supervisor.halt. Uses
only node builtins (no SF dist deps) so it survives dist breakage.
- src/headless.ts: R091 — gate the per-cycle handleTriage call on a time
interval (SF_TRIAGE_INTERVAL_MS, default 30 min) and bump batch size
(SF_TRIAGE_MAX, default 25, was 5). Drops the ~8min triage hit from
every cycle while letting daily drain capacity rise.
- .sf/REQUIREMENTS.md: R091 (triage sidecar) + R092 (PDD-completeness
as routing signal) + R093 (pin model per orchestration agent.yaml) +
R094 (swarm-role model tier specialization — 8 roles already exist
in uok/swarm-roles.js; model field per role missing).
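
A minimal sketch of the R091 interval gate, assuming the last triage time is
tracked as a millisecond timestamp. The helper names are hypothetical; the
environment variable names and defaults are the ones listed above.

```javascript
const DEFAULT_INTERVAL_MS = 30 * 60 * 1000; // SF_TRIAGE_INTERVAL_MS default
const DEFAULT_MAX = 25;                     // SF_TRIAGE_MAX default (was 5)

// Parse a positive integer from an env string, else use the fallback.
function readPositiveInt(value, fallback) {
  const n = Number.parseInt(value ?? "", 10);
  return Number.isFinite(n) && n > 0 ? n : fallback;
}

// Decide whether this cycle should run triage and with what batch size.
function triagePlan(lastTriageAt, now, env = {}) {
  const intervalMs = readPositiveInt(env.SF_TRIAGE_INTERVAL_MS, DEFAULT_INTERVAL_MS);
  const max = readPositiveInt(env.SF_TRIAGE_MAX, DEFAULT_MAX);
  const due = lastTriageAt == null || now - lastTriageAt >= intervalMs;
  return { due, max };
}
```

A cycle that starts 10 minutes after the last triage gets `due: false` and
skips the call; the first cycle at or past 30 minutes runs it with a batch of
25.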
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Operator coined "system lane" — better than my "side-track". Frames
the architecture cleanly. The lane primitive unifies:
- R046 (multi-unit lanes): parallel slice dispatch
- R049 (per-lane model routing): different LLM per lane
- R057 (system lane): non-unit work alongside unit lane
Today autoLoop is 1 unit lane. System lane runs alongside for memory
consolidation, triage drain, doctor audits, log compaction, reflection
assembly, catalog refresh — all currently queued between units.
The single-writer DB requirement is met by the sf-db.js serial queue.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
legacyPermissionLevelForProfile had a switch with cases for
restricted/trusted/unrestricted only, no case for "normal" (the
DEFAULT autonomous session profile per auto/session.js:377). "normal"
fell through to default → "low" — too restrictive for autonomous work.
Witnessed M010/S04/T01: solver note "TypeScript compilation and git
diff blocked by low permission level" — SF couldn't verify its own
deliverable because permissions were locked down despite running in
autonomous mode.
Fix:
- "normal" → "medium" (allows tsc, git, npm test)
- default → "medium" (was "low"); unknown profiles shouldn't cripple
autonomous executors. Operators wanting strict mode set
profile: "restricted" explicitly.
Per operator intent 2026-05-17: "SF should have permission even if
it can limit its agents and only allow orchestrator or whatever."
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
When the halt-watchdog detects stuck state, the autoLoop was logging
"halt-watchdog-break" every iteration but otherwise tight-spinning
through dispatch-resolve at ~2s/iteration. 2026-05-17 dogfood logged
60+ such events in a 30s window — pure CPU burn while the actual
stuck condition stayed stuck.
Fix: exponential backoff (1s → 2s → 4s → 8s → 16s → capped at 30s)
based on how many halt thresholds have elapsed. Heartbeat() resets the
count when real progress resumes (existing behavior), so backoff costs
nothing when the loop is healthy.
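
The schedule can be sketched as a pure function, assuming the input is the
number of halt thresholds elapsed since the last heartbeat. The function name
is hypothetical.

```javascript
const BASE_MS = 1000;
const CAP_MS = 30_000;

// thresholdsElapsed: 0 while healthy, 1 at the first halt threshold, and so on.
function haltBackoffMs(thresholdsElapsed) {
  if (thresholdsElapsed <= 0) return 0; // healthy loop: no delay
  return Math.min(BASE_MS * 2 ** (thresholdsElapsed - 1), CAP_MS);
}

console.log([1, 2, 3, 4, 5, 6, 7].map(haltBackoffMs));
// [ 1000, 2000, 4000, 8000, 16000, 30000, 30000 ]
```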
One of the 14 Ralph-Wiggum patterns surfaced this session.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two sites told operator to "kill PID X" without checking X was alive:
- interrupted-session.js:formatInterruptedSessionRunningMessage
- auto.js autonomous-start blocking notification
Both reported stale locks from crashed prior sessions as if a live session
existed, confusing the operator and blocking restart. session-lock.js already
has auto-recovery for stale-PID locks; these two surfaces just needed
matching liveness checks to label dead-PID locks correctly.
Now: dead-PID → "Stale lock from dead PID X — will be auto-recovered"
alive-PID → original "kill X" message
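
A sketch of a liveness check of this kind, assuming Node's
`process.kill(pid, 0)` probe. The helper names and the wording of the
alive-PID message are placeholders, not the strings in
interrupted-session.js or auto.js.

```javascript
// Signal 0 checks that the process exists without actually signalling it.
function isPidAlive(pid) {
  try {
    process.kill(pid, 0);
    return true;
  } catch (err) {
    // EPERM: the process exists but belongs to another user. ESRCH: no such PID.
    return err.code === "EPERM";
  }
}

// `alive` is injectable so the formatting can be exercised without a real PID.
function lockMessage(pid, alive = isPidAlive(pid)) {
  return alive
    ? `Session is running as PID ${pid}; kill ${pid} to take over`
    : `Stale lock from dead PID ${pid}, will be auto-recovered`;
}
```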
Catches one of the 14 Ralph-Wiggum-obvious patterns surfaced this
session. Reduces operator confusion + dovetails with R055 (M038/S05)
when stale-lock auto-recovery becomes a core-loop detector.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
checkNeedsReassessment looked up the slice's assessment file with
suffix='ASSESS', but actual files are 'S03-ASSESSMENT.md'. The
resolveFile pattern (/^S03-.*-ASSESS\.md$/) requires a second '-'
directly before the suffix and '.md' directly after it, so
'S03-ASSESSMENT.md' never matched and the helper returned {sliceId} on
every poll → dispatcher kept firing reassess-roadmap forever.
Fix: try 'ASSESSMENT' first, fall back to legacy 'ASSESS'. Now
S03-ASSESSMENT.md properly satisfies the "already reassessed" check
and the dispatcher advances to the next slice (S04).
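
A sketch of the suffix fallback only. The matcher here is hypothetical (it
accepts either "S03-<SUFFIX>.md" or "S03-<descriptor>-<SUFFIX>.md") and stands
in for the real resolveFile, whose full matching rules are not shown in this
commit.

```javascript
// Hypothetical matcher standing in for resolveFile.
function findSliceFile(fileNames, sliceId, suffix) {
  const exact = `${sliceId}-${suffix}.md`;
  const tail = `-${suffix}.md`;
  const hit = fileNames.find(
    (name) => name === exact || (name.startsWith(`${sliceId}-`) && name.endsWith(tail)),
  );
  return hit ?? null;
}

// Try the current suffix first, then the legacy one.
function findAssessmentFile(fileNames, sliceId) {
  for (const suffix of ["ASSESSMENT", "ASSESS"]) {
    const hit = findSliceFile(fileNames, sliceId, suffix);
    if (hit) return hit;
  }
  return null;
}

const files = ["S03-PLAN.md", "S03-ASSESSMENT.md"];
console.log(findSliceFile(files, "S03", "ASSESS")); // null (the old lookup)
console.log(findAssessmentFile(files, "S03"));      // "S03-ASSESSMENT.md"
```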
Verified: resolveSliceFile('M010','S03','ASSESSMENT') returns the
real path; with the fallback, this resolves on first call. The
70+ degenerate reassess iterations on M010/S03 (witnessed
2026-05-17) won't recur.
Ralph Wiggum approved. (per operator: "sf should clear these stuck
itself ralph wiggums would fix")
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Detected via supervisory check 2026-05-17: SF stuck in degenerate reassess-
roadmap loop on M010/S03 (5 iterations in 8min, all returning
outcome=continue). Root cause: synthesized-checkpoint in runUnitViaSwarm
only treats the generic `checkpoint` tool as a completion signal — but
units routinely complete via their unit-specific tool (reassess_roadmap
with verdict=roadmap-confirmed, validate_milestone, complete_milestone,
complete_slice, save_summary). The LLM correctly emitted the unit's
specific completion tool + assistant text "<turn_status>complete</turn_status>",
but workerSignaledOutcome stayed null → synthesized checkpoint fell back
to continue → solver re-iterated.
Fix: recognize UNIT_COMPLETION_TOOLS = {reassess_roadmap,
validate_milestone, complete_milestone, complete_slice, save_summary}
as implicit "complete" signals. The check fires when those tools are
called and an earlier explicit checkpoint hasn't already said
"complete" or "blocked".
This resolves sf-mp94lth4-ew26om and should prevent future
degenerate-iteration loops on reassess-roadmap and milestone completion
units. 13/13 existing M010 tests still pass.
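
A simplified sketch of the outcome rule described in the fix, assuming tool
calls arrive as an ordered list of `{ name, args }` and that the checkpoint
tool carries its verdict in `args.outcome`. Both shapes and the function name
are hypothetical.

```javascript
const UNIT_COMPLETION_TOOLS = new Set([
  "reassess_roadmap", "validate_milestone", "complete_milestone",
  "complete_slice", "save_summary",
]);

function inferWorkerOutcome(toolCalls) {
  // The last explicit checkpoint, if any, is the strongest signal.
  let explicit = null;
  for (const call of toolCalls) {
    if (call.name === "checkpoint" && call.args?.outcome) explicit = call.args.outcome;
  }
  if (explicit === "complete" || explicit === "blocked") return explicit;

  // Otherwise a unit-specific completion tool counts as an implicit "complete".
  if (toolCalls.some((call) => UNIT_COMPLETION_TOOLS.has(call.name))) return "complete";

  return explicit ?? "continue";
}
```

Under this rule a turn that only calls reassess_roadmap resolves to
"complete" instead of falling back to "continue".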
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
User caught: flash-lite ≠ flash (different model tier, different scores).
Previous fix counted flash-lite as fully covered via flash proxy, which
overstated coverage and could mislead routing.
benchmarkLookupVariants now tags variants with kind:
- 'exact' → date/version strip + -latest alias (same model line)
- 'approx' → tier strip (flash-lite→flash, X-lite→X) — different model
computeBenchmarkCoverage promotes 'exact' matches to covered; 'approx'
matches stay in uncovered with `approximatedBy` field so operators see
when a real benchmark is still needed.
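
A sketch of the exact/approx split, assuming benchmark keys are held in a Set
and variants arrive as `{ key, kind }` records. The signature shown for
computeBenchmarkCoverage and the tiny `variantsOf` stand-in are assumptions,
not the real code.

```javascript
// variantsOf(id) -> [{ key, kind: "exact" | "approx" }], in lookup order.
function computeBenchmarkCoverage(modelIds, benchmarkKeys, variantsOf) {
  const covered = [];
  const uncovered = [];
  for (const id of modelIds) {
    const hits = variantsOf(id).filter((v) => benchmarkKeys.has(v.key));
    const exact = hits.find((v) => v.kind === "exact");
    const approx = hits.find((v) => v.kind === "approx");
    if (exact) covered.push({ id, matchedVia: exact.key });
    else if (approx) uncovered.push({ id, approximatedBy: approx.key });
    else uncovered.push({ id });
  }
  return { covered, uncovered };
}

// Tiny stand-in for benchmarkLookupVariants: the id itself plus a "-lite" strip.
function variantsOf(id) {
  const out = [{ key: id, kind: "exact" }];
  if (id.endsWith("-lite")) out.push({ key: id.slice(0, -"-lite".length), kind: "approx" });
  return out;
}

const report = computeBenchmarkCoverage(
  ["gemini-2.5-flash", "gemini-2.5-flash-lite", "unknown-model"],
  new Set(["gemini-2.5-flash"]),
  variantsOf,
);
console.log(report.covered.length, report.uncovered.length); // 1 2
```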
Honest report: 64 exact covered / 1 proxy-only / 104 genuine uncovered
(was 65/0/104 with the overcount).
R049 + R050 added to traceability (M036/M037 future milestones).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
benchmark-coverage.js: new benchmarkLookupVariants() returns ordered
fallback keys for a model id, and computeBenchmarkCoverage tries each
variant before flagging uncovered. Patterns covered:
- date/version suffix strip ("mistral-medium-2505" → "mistral-medium")
- tier strip ("X-flash-lite" → "X-flash", "Y-lite" → "Y")
- "-latest" append for bare names ("mistral-medium" → "mistral-medium-latest")
The audit reports the matched variant via `matchedVia` so operators can
see when fallback applied (vs adding a real entry).
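
A sketch of the three fallback patterns, assuming benchmark keys omit the
provider prefix (as the sample matches below suggest) and that a date/version
suffix is a trailing run of 4 to 8 digits. The regexes and ordering are
illustrative guesses.

```javascript
function benchmarkLookupVariants(modelId) {
  // Assumption: "provider/model" ids are looked up by the bare model name.
  const slash = modelId.indexOf("/");
  const bare = slash >= 0 ? modelId.slice(slash + 1) : modelId;

  const variants = [bare];

  // Date/version suffix strip: "mistral-medium-2505" -> "mistral-medium".
  const undated = bare.replace(/-\d{4,8}$/, "");
  variants.push(undated);

  // "-latest" append for bare names: "mistral-medium" -> "mistral-medium-latest".
  if (!undated.endsWith("-latest")) variants.push(`${undated}-latest`);

  // Tier strip: "X-flash-lite" -> "X-flash", "Y-lite" -> "Y".
  variants.push(undated.replace(/-lite$/, ""));

  return [...new Set(variants)]; // ordered, without duplicates
}

console.log(benchmarkLookupVariants("mistral/magistral-small-2509"));
// [ 'magistral-small-2509', 'magistral-small', 'magistral-small-latest' ]
```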
Verified: coverage 62/169 (36.7%) → 65/169 (38.5%). Sample fallback matches:
google-gemini-cli/gemini-2.5-flash-lite → gemini-2.5-flash
mistral/mistral-medium → mistral-medium-latest
mistral/magistral-small-2509 → magistral-small
R050 now active: full closure requires auto-benchmark of remaining
104 uncovered models via bulk-import of published scores or live eval.
This step shrinks the gap via cheap structural fallback; future work
adds the real scoring loop.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>