Two SF processes writing to the same .sf/sf.db over WAL caused torn
pages and "database disk image is malformed" corruption (observed
2026-05-17 in dogfood-5 — the project DB ended up with B-tree
pointer-map desync at page 69, requiring a backup restore). The
session-lock in src/resources/extensions/sf/session-lock.js exists
but is only acquired from auto-start.js when autonomous mode starts.
Interactive sf or pre-autonomous-start work did not take it, so a
second sf could open the same DB and contend.
Promote the lock to the shell wrapper so EVERY sf invocation in a
write-capable mode acquires a project-level flock on .sf/sf.lock
BEFORE node is launched. Read-only commands (logs, status, dash,
sessions, list, --version, --help) skip the lock to keep concurrent
read use-cases working. SF_SKIP_LOCK=1 escape hatch for tests that
intentionally exercise concurrent paths.
On collision the wrapper prints the current lock holder (pid + args
+ cwd + started timestamp) so the operator can identify the
conflicting session, then exits with 75 (EX_TEMPFAIL). The lock is
released automatically when the wrapper bash exits — no stale-lock
recovery needed since flock is kernel-owned and dies with the fd.
The fd opens in read+write mode (`<>`) WITHOUT truncating so the
collision branch can still cat the existing holder; truncation
happens only after flock succeeds, preventing two racers from
clobbering each other's metadata.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sets SF_SUBAGENT_VIA_SWARM=1 by default in the wrapper so all sf
launches route subagent calls through runSingleAgentViaSwarm (uok
message-bus / uok_messages table) instead of spawning a child sf
process via runSubagent. Operators can opt out with
SF_SUBAGENT_VIA_SWARM=0 (or =false) in env.
Leaves the runSingleAgent code default (opt-in) unchanged so the
existing tests/subagent-via-swarm.test.mjs "unset → subprocess"
assertion keeps holding. The flip lives at the wrapper layer where
every interactive/headless sf launch picks it up but tests and
direct dev-cli launches stay on documented opt-in semantics.
Note: this is Layer 1 of the inline-execution path. Layer 2 (full
in-process unit dispatch via runUnitInline) is tracked separately
in REQUIREMENTS.md R013/R014 and is not addressed here.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
bun was the wrong runtime for our environment, two ways:
1. bun doesn't ship node:sqlite. sf-db.ts falls back through node:sqlite
→ better-sqlite3 → null. Result: 'No SQLite provider available' and
degraded-mode filesystem-state derivation, even though sqlite is
actually available (node:sqlite under node, bun:sqlite under bun —
both valid, but our code only knows the node names).
2. bun's loader doesn't inherit the system library search path under
Nix. libz.so.1 isn't found for forge_engine.node, so the native
addon falls through to JS implementations (slower).
Both warnings ("Native addon not available", "DB unavailable —
degraded mode") were the symptom of "we're running under bun".
Fix: use node + the existing src/resources/extensions/sf/tests/
resolve-ts.mjs loader hook (which already handles .js → .ts
import-specifier remapping for runtime resolution) +
--experimental-strip-types (node 22+, native in 24).
Result: from-source via node loads cleanly. No native warning.
No sqlite warning. No degraded mode. Exec: `./bin/sf-from-source
--print "..."` returns the model output and nothing else.
Drops the LD_LIBRARY_PATH zlib-injection hack that was added in
4912f6ea8 — that was working around the bun native-loader issue
that doesn't exist under node.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
bun's loader doesn't inherit the same library search path as node under
Nix, so require('forge_engine.linux-x64.node') fails with
'libz.so.1: cannot open shared object file' even when the native addon
exists at the expected path. Result: sf-from-source ran in
JS-fallback mode, and we'd been working around it by switching to
node dist/loader.js — which forces a manual `npm run copy-resources`
after every src/ change to keep dist in sync.
This wraps sf-from-source to find a Nix-store zlib at startup and
prepend it to LD_LIBRARY_PATH before exec'ing bun. The native addon
loads cleanly; from-source becomes the reliable default again; no
more dist drift to worry about.
Find pattern: /nix/store/*-zlib-*/lib/libz.so.1 at maxdepth 4
(maxdepth 2 was too shallow — the hash dir is depth 1, lib is depth 2,
the .so.1 file is depth 3, plus we want the parent dir for
LD_LIBRARY_PATH so '%h' on a depth-3 match gives the lib dir).
Outside Nix (no /nix/store), this is a no-op and falls through to
the existing exec.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Updates channel prefixes, log messages, comments, and configuration values
across daemon, mcp-server, and related packages to complete the rebrand from
gsd to sf-run naming.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>