Commit graph

841 commits

Author SHA1 Message Date
Mikael Hugo
0acb0f9be0 feat: harden sf server build and routing
2026-05-18 02:33:28 +02:00
Mikael Hugo
06b1fefd35 fix(circular): break coding-agent core mega-cycle + skip function-body imports
Cycle 2 (the 13-node coding-agent mega) closed via two changes:

1. scripts/check-circular-deps.mjs — track function-body depth and
   skip require()/import() calls inside function bodies. They run on
   call, not at module evaluation, and therefore cannot cause
   module-graph cycles — same reasoning as the existing dynamic
   `await import()` skip. Generic improvement; benefits any pattern
   that uses lazy CommonJS require() to break a static cycle.

2. packages/coding-agent/src/core/extensions/loader.ts — removed the
   static `import * as _bundledCodingAgent from "../../index.js"`
   self-reference, which was the cycle-closer. It only populated
   STATIC_BUNDLED_MODULES for the Bun virtualModules path
   (`isBunBinary` branch in getJitiOptions), and SF is Node-26-only
   per operator policy (no Bun) — so the Bun branch is dead at
   runtime and dropping the static self-reference is safe. The two
   map entries that referenced it (@singularity-forge/coding-agent
   and the @mariozechner alias) are commented out at the same site
   with a pointer to the top-of-file note.
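
As a rough illustration of why the walker can skip these imports, the sketch below runs a depth-first cycle search that follows only edges evaluated at module load. The `ImportKind` labels, the `ModuleGraph` shape, and `hasLoadTimeCycle` are hypothetical names for this example and are not taken from scripts/check-circular-deps.mjs.

```typescript
// Hypothetical edge labels; only "static" imports run during module evaluation.
type ImportKind = "static" | "type-only" | "dynamic" | "function-body";

interface ImportEdge {
  to: string;
  kind: ImportKind;
}

type ModuleGraph = Record<string, ImportEdge[]>;

// Returns true when a cycle exists that consists solely of load-time imports.
function hasLoadTimeCycle(graph: ModuleGraph): boolean {
  const state = new Map<string, "visiting" | "done">();

  const visit = (node: string): boolean => {
    const current = state.get(node);
    if (current === "visiting") return true; // back edge: cycle found
    if (current === "done") return false;
    state.set(node, "visiting");
    for (const edge of graph[node] ?? []) {
      // Lazy require()/import() inside a function body runs on call,
      // so it is not followed here.
      if (edge.kind === "static" && visit(edge.to)) return true;
    }
    state.set(node, "done");
    return false;
  };

  return Object.keys(graph).some((node) => visit(node));
}
```

Relabelling a single edge of a two-module loop from "static" to "function-body" is enough for the search to report no cycle, which mirrors the lazy-require pattern described above.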

Net effect across the full session:
  start of session:      9 cycles
  walker false-positive
    cleanups landed:     dropped 6 type-only + dynamic-import false
                         positives
  tui ↔ overlay-layout:  CURSOR_MARKER moved to overlay-types.ts
  SF autonomous-rollback
    chain (3 targeted
    cuts):               experimental → preferences-serializer,
                         classifier → lazy rollback import,
                         preferences-models → runaway-defaults.js
  this commit:           coding-agent loader self-reference dropped

Final state:  zero circular dependencies in 1193 scanned files.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 00:42:09 +02:00
Mikael Hugo
66309b235f fix(circular): skip type-only imports + break tui ↔ overlay-layout cycle
Two changes (one walker, one real code):

1. scripts/check-circular-deps.mjs — skip type-only imports.
   `import type { X } from "..."` and `export type { X } from "..."`
   are erased by tsc at compile time and cannot cause runtime cycles.
   Walker now drops them, matching the precedent set by skipping
   dynamic `await import(...)`. Net effect on full-repo scan:
     before: 9 cycles
     after:  3 cycles (the 6 that disappeared were all `import type`
       false-positives — none were real runtime cycles).

2. packages/tui — break the last 2-file cycle.
   tui.ts and overlay-layout.ts had a real RUNTIME cycle:
     - tui.ts → overlay-layout.ts:  applyLineResets, compositeOverlays,
       extractCursorPosition, isOverlayVisible (4 fns)
     - overlay-layout.ts → tui.ts:  CURSOR_MARKER (1 const)
   Both files already imported `./overlay-types.ts` (no cycle there).
   Moved CURSOR_MARKER from tui.ts into overlay-types.ts and re-exported
   from tui.ts so existing `from "./tui.js"` call sites keep working.
   No behavior change.
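
A minimal sketch of the type-only filter from item 1, assuming a line-based scan. The regular expression and the `isTypeOnlyImport` helper are illustrative assumptions, not the walker's actual code, and they deliberately leave inline `import { type X }` specifiers alone.

```typescript
// Matches `import type { X } from`, `export type { X } from`,
// `import type * as ns from`, and `import type Foo from`.
const TYPE_ONLY_IMPORT =
  /^\s*(?:import|export)\s+type\s+(?:\{|\*|[A-Za-z_$][\w$]*\s+from\b)/;

// Returns true when the whole statement is erased at compile time.
function isTypeOnlyImport(line: string): boolean {
  return TYPE_ONLY_IMPORT.test(line);
}

// Keeps only the lines that can create runtime edges in the module graph.
function runtimeImportLines(lines: string[]): string[] {
  return lines.filter((line) => !isTypeOnlyImport(line));
}
```

Note that `import type from "./a.js"` (a default import bound to the name `type`) is a runtime import, so the pattern is written to leave it in the graph.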

Remaining cycles after both fixes (3 real-runtime ones, separate slices):
  - safety/autonomous-rollback chain (9 files, SF extension)
  - packages/coding-agent core mega-cycle (12 files)
  - (one more, see `npm run check:circular`)

These are foundational refactors worth their own commits, not bundled
into this one.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 00:28:53 +02:00
Mikael Hugo
f8e53840da fix(rpc, web): integrate drain into forceShutdown + healthz-503 on shutdown
Three fixes addressing codex's adversarial review of the earlier orphan-
recovery / graceful-shutdown landing:

(1) Codex point B — single shutdown path. Removed the parallel
    installGracefulShutdown() handler in rpc-mode.ts that was adding
    a second SIGTERM listener and racing forceShutdown()'s teardown.
    The drain is now the FIRST step inside forceShutdown() (before
    killTrackedDetachedChildren / extension session_shutdown / etc.)
    so DB writes complete cleanly while child processes are still
    alive to flush. Race-free against the existing shutdown ordering.

(2) Codex point D — recovery-before-each-drain. Cloud-volume mtime
    visibility lags between containers can mean an orphan `.draining`
    file from a previous container isn't visible during the startup
    scan but appears moments later. drainQueuedSfFeedbackCommands()
    now runs recoverOrphanedFeedbackDrains() as its first step, so
    each dispatch's drain sees the latest filesystem state.

(3) Codex point E — healthz returns 503 during shutdown. New module
    src/web/shutdown-state.ts holds a per-process flag, auto-registers
    SIGTERM/SIGINT/SIGHUP handlers on first read, and exposes a
    snapshot (signal, startedAt, elapsedMs) for diagnostics. The
    healthz route imports isShuttingDown() and returns 503 when set,
    so k8s readinessProbe / Forgejo blue-green probes drain traffic
    BEFORE we actually stop responding.
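
The flag-plus-snapshot behavior in (3) could look roughly like the sketch below. It is a simplified, hypothetical stand-in for src/web/shutdown-state.ts: the class shape, the injected clock, and `healthzStatus` are assumptions, and signal-handler registration is left out.

```typescript
interface ShutdownSnapshot {
  shuttingDown: boolean;
  signal: string | null;
  startedAt: number | null;
  elapsedMs: number | null;
}

class ShutdownState {
  private signal: string | null = null;
  private startedAt: number | null = null;

  // Idempotent: only the first call records the signal and the start time.
  markShuttingDown(signal: string | null = null, now: number = Date.now()): void {
    if (this.startedAt !== null) return;
    this.signal = signal;
    this.startedAt = now;
  }

  isShuttingDown(): boolean {
    return this.startedAt !== null;
  }

  snapshot(now: number = Date.now()): ShutdownSnapshot {
    return {
      shuttingDown: this.startedAt !== null,
      signal: this.signal,
      startedAt: this.startedAt,
      elapsedMs: this.startedAt === null ? null : now - this.startedAt,
    };
  }
}

// A health route would answer 503 once shutdown has begun, 200 otherwise.
function healthzStatus(state: ShutdownState): number {
  return state.isShuttingDown() ? 503 : 200;
}
```

Passing the clock in as a parameter keeps the elapsed-time logic testable without real signals or timers.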

Tests:
  - rpc-mode-orphan-recovery.test.ts: 8/8 still green
  - web-shutdown-state.test.ts: 5/5 new — default false, mark sets
    flag, idempotent, signal exposed via snapshot, null signal for
    manual mark

Deferred to a follow-up commit (codex didn't flag, but noted for
completeness): a SIGTERM-drain child-process integration test that
spawns rpc-mode + sends a real signal. The 5 unit tests cover the
flag logic; the integration test would cover the full process tree
and is bulkier than the current commit warrants.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 22:35:50 +02:00
Mikael Hugo
68178a9260 fix(rpc-test): use .js extension for recovery module import
tsgo rejects `.ts` extensions in imports without allowImportingTsExtensions.
Updated the test to import from "./feedback-queue-recovery.js" which is
both ESM-compatible and matches the rest of the package convention.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 22:30:10 +02:00
Mikael Hugo
d54f18c95f feat(rpc): orphan-recovery + 10-min graceful shutdown for safe container swap
Two related changes to make blue/green upgrades (per scripts/upgrade-vega-
source-server.mjs) safe for in-flight self-feedback writes.

1. Startup orphan recovery (feedback-queue-recovery.ts, extracted module).
   Scans .sf/runtime/ for sf-feedback-queue.jsonl.<pid>(.<sid>)?.draining
   files left by previous processes. For each:
     - if our own session id: leave alone (live drain)
     - if PID is alive: leave alone (foreign drainer)
     - else: rename back to queue (only if no active queue file exists)
   Crash safety: when both an orphan AND an active queue exist, we DEFER
   recovery rather than merge — appending then unlinking would risk
   duplicate replay on crash. The next restart's recovery picks it up
   once the queue is naturally drained. Supports legacy filenames
   (.<pid>.draining, pre-session-id) for backward compat.

   Added SF_DRAIN_SESSION_ID (per-process 6-byte hex) stamped into the
   .draining filename. PID reuse across container restarts is normally
   safe because /proc clears, but the session id is a stronger guarantee
   that we don't trample a foreign drainer that happens to land on the
   same PID.

2. SIGTERM/SIGINT drain-then-exit handler (installGracefulShutdown).
   Drains the queue once on signal, then exits. Bounded by
   SF_RPC_SHUTDOWN_GRACE_MS (default 600_000 = 10 min). Rationale: if
   a drain is in flight, it MUST finish — losing self-feedback writes
   across a server upgrade is worse than a long wait. Normal drains
   complete in <1s; the 10-min ceiling is for pathological lock
   contention. Operator overrides via env var, or docker kill /
   kubectl delete --force for hard bypass.

   Upgrader script bumped to docker stop --timeout 610 (10s safety
   margin past the grace). k8s deployments must set
   terminationGracePeriodSeconds≥610 for the rolling-update path.

Tests: rpc-mode-orphan-recovery.test.ts — 7 cases covering empty,
no-orphans, dead-PID single recovery, both-files-deferred (codex's
crash-safety fix), live-PID untouched, multiple-dead-PIDs, malformed-
filename ignored.
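
The recovery rules and test cases above can be sketched as a pure decision function. The action names, the `RecoveryContext` shape, and the assumption that the session id is 12 lowercase hex characters (6 bytes) are illustrative; the real feedback-queue-recovery.ts may be organized differently.

```typescript
type RecoveryAction =
  | "ignore-malformed"
  | "leave-own-session"
  | "leave-live-pid"
  | "defer-active-queue"
  | "rename-to-queue";

interface RecoveryContext {
  ownSessionId: string;
  activeQueueExists: boolean;
  isPidAlive: (pid: number) => boolean; // injected so the rule stays testable
}

// Accepts both `.<pid>.<sid>.draining` and the legacy `.<pid>.draining` form.
const DRAINING_NAME =
  /^sf-feedback-queue\.jsonl\.(\d+)(?:\.([0-9a-f]{12}))?\.draining$/;

function decideRecovery(fileName: string, ctx: RecoveryContext): RecoveryAction {
  const match = DRAINING_NAME.exec(fileName);
  if (match === null) return "ignore-malformed";

  const pid = Number(match[1]);
  const sessionId = match[2]; // undefined for legacy filenames

  if (sessionId !== undefined && sessionId === ctx.ownSessionId) {
    return "leave-own-session"; // live drain owned by this process
  }
  if (ctx.isPidAlive(pid)) return "leave-live-pid"; // foreign drainer
  // Never merge into an existing queue: defer until it has drained.
  if (ctx.activeQueueExists) return "defer-active-queue";
  return "rename-to-queue";
}
```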

Refs sf-mpa5kdpu (drainer orphans never recovered), sf-mpa4g46x
(original RPC hang). Codex adversarial-reviewed; the PID-reuse hardening
and crash-safety deferral landed per its feedback. Open follow-ups:
shutdown-aware /api/healthz returning 503 (codex point E), integrate
with existing forceShutdown ordering (codex point C).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 22:29:24 +02:00
Mikael Hugo
55a498603f fix(rpc): don't unref the sf-feedback drain timer
The drainer was scheduled via setTimeout(0) with timer.unref(). An
unref'd timer does not keep the event loop alive on its own. That is
fine in a long-running rpc-mode child where the process has plenty of
other event-loop handles, but fatal in the packaged-standalone path
where the rpc subprocess has nothing else to keep it alive. The process
exited before the timer fired, so the queue file was renamed to
.<pid>.draining and then stranded forever.

Removed timer.unref(). The setTimeout(0) still lets the RPC response go
back to the caller first (no synchronous blocking on the drain), but the
timer now keeps the process alive until the drain handler runs, and the
drain's own async I/O keeps it alive until done.

Refs sf-mpa6wuhm-wwddd1.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 21:55:23 +02:00
Mikael Hugo
acd907fec2 fix: harden sf server control loop
2026-05-17 21:13:12 +02:00
Mikael Hugo
cf2d1a768e feat(sf): route server control through rpc 2026-05-17 20:07:36 +02:00
Mikael Hugo
3adcb833ed refactor(sf): separate daemon from server identity 2026-05-17 19:18:33 +02:00
Mikael Hugo
cc32ab79d9 fix(docs): remove stale hashline-{read,edit}.ts rows post-fold
Hashline read/edit tool wrappers were folded into Edit({match}) and
Read({format}) modes in commit ffdec0fee. The two rows in FILE-SYSTEM-MAP.md
pointed to files that no longer exist. Updated the surviving hashline.ts row
to note its new consumer relationship with Edit/Read.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 18:48:34 +02:00
Mikael Hugo
623af869b1 remove: SF voice IVR / ElevenLabs paging — migrated to centralcloud
Per operator-direction 2026-05-17 (R089 — Migrate Voice IVR / ElevenLabs
On-Call Paging Infrastructure out of SF). Migration target landed in
centralcloud monorepo:
  - centralcloud_core/lib/centralcloud_core/voice.ex (TwiML + ElevenLabs)
  - centralcloud_staff/lib/.../controllers/voice_controller.ex (Phoenix)
  - centralcloud_staff/lib/.../controllers/voice_prompt_controller.ex
  - centralcloud_staff/lib/.../router.ex (/twilio scope)

SF removal:
  - web/app/api/voice/route.ts
  - web/app/api/voice/prompt/route.ts
  - web/app/api/voice/ directory
  - src/tests/integration/web-voice-ivr-contract.test.ts

Operator-paging infra was historical drift in SF (per-project compiler);
belongs in centralcloud (org-level ops). R088 (Pre-Removal Test-Import
Safety Gate) not yet built — operator manually verified safety scan:
TWILIO_/ELEVENLABS_ env vars only referenced in the deleted files; no
internal SF callers; centralcloud version verified present.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 17:42:16 +02:00
Mikael Hugo
ffdec0feee fold: hashline_edit + hashline_read → Edit({match}) + Read({format}) modes
Per operator R-entry sf-mp9wo7e3-sdxqss + no-compat directive.

- Edit gains `match: "substring"|"anchor"` arg; anchor mode routes to the
  existing applyHashlineEdits logic. Substring stays default.
- Read gains `format: "plain"|"tagged"` arg; tagged mode emits LINE#HASH
  prefixes via formatHashLines.
- Delete hashline-edit.ts, hashline-read.ts. KEEP hashline.ts (helpers
  are now Edit/Read internals).
- tools/index.ts: drop the two tools + the createHashlineCodingTools
  preset.
- agent-session.ts: setEditMode no longer swaps tool instances (single
  tool surface; mode preserved for system-prompt context only).
- sdk.ts + index.ts: remove hashline tool re-exports.
- headless-ui.ts + test: remove hashline_edit case.

Net agent-visible tool surface: -2 tools. Capability preserved as modes.
No backward-compat alias for the removed tool names.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-17 17:39:59 +02:00
Mikael Hugo
d03758d803 feat: replace launchd with systemd user-unit install path
Operator-direction 2026-05-17 "we will never use mac" — no compat
preservation. Single-cutover replacement.

- new packages/daemon/src/systemd.ts: install/uninstall/status using
  systemctl --user + ~/.config/systemd/user/sf-server.service
- new packages/daemon/src/systemd.test.ts: ports launchd tests, same
  shape, mocked systemctl via RunCommandFn injection + SF_SYSTEMD_USER_DIR
  env override for real filesystem tests
- cli-main.ts: switch import + update help text + status messages
- index.ts: re-export systemd module (installSystemdUnit, uninstallSystemdUnit,
  systemdUnitStatus, generateUnit, getServicePath, SystemdStatus, SystemdUnitOptions)
- DELETED: launchd.ts (253 LOC), launchd.test.ts (379 LOC)
- docs/dev/drafts/M053-per-repo-supervisor.md: remove "launchd" mention
- CHANGELOG.md: document systemd-only install path
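
For orientation, a user unit of the kind this install path writes might be generated as in the sketch below. The option fields, the `Restart=on-failure` policy, and the two-argument `getServicePath` signature are assumptions made for the example; only the unit filename, the default directory, and the SF_SYSTEMD_USER_DIR override come from the notes above.

```typescript
// Hypothetical option shape; the real SystemdUnitOptions may differ.
interface UnitSketchOptions {
  description: string;
  execStart: string;
}

// Renders a minimal systemd user unit as text.
function generateUnitSketch(options: UnitSketchOptions): string {
  return [
    "[Unit]",
    `Description=${options.description}`,
    "",
    "[Service]",
    `ExecStart=${options.execStart}`,
    "Restart=on-failure",
    "",
    "[Install]",
    "WantedBy=default.target",
    "",
  ].join("\n");
}

// Resolves the unit path, honoring the SF_SYSTEMD_USER_DIR test override.
function getServicePathSketch(
  env: Record<string, string | undefined>,
  homeDir: string,
): string {
  const dir = env["SF_SYSTEMD_USER_DIR"] ?? `${homeDir}/.config/systemd/user`;
  return `${dir}/sf-server.service`;
}
```

After writing the file, an installer would typically run `systemctl --user daemon-reload` followed by `systemctl --user enable --now sf-server.service`.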

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-17 17:33:34 +02:00
Mikael Hugo
57fef5979d feat: make sf server the operator entrypoint 2026-05-17 17:23:46 +02:00
Mikael Hugo
6e3b3d3c54 feat: add Serena-style AST tools (ReplaceSymbol, InsertAroundSymbol, AstGrep)
Wraps the native AST primitives from @singularity-forge/native/{edit,ast} as
LLM tools so agents can do tree-sitter-anchored code edits instead of
substring-based Edit or line-anchor hashline.

- replace-symbol.ts (+117): wraps replaceSymbol(file, symbolPath, newBody);
  matches function/class/method declarations via tree-sitter, returns
  matched=false sentinel when the symbol isn't located.
- insert-around-symbol.ts (+122): wraps insertAroundSymbol with position
  enum BeforeDecl/AfterDecl/AtBodyStart/AtBodyEnd.
- ast-grep.ts (+152): wraps astGrep for pattern matching across files with
  $VAR/$$$ARGS meta-variables; returns ranked matches with byte/line/column
  + captured meta-variable bindings.

Each tool:
  - typebox schema matching the existing AgentTool pattern (edit.ts)
  - notifyFileChanged() into the LSP layer on write ops
  - resolveToCwd() for path normalization
  - catches native errors + returns isError result with the
    NativeUnavailableError message pointing operators to
    `nix develop` + `node rust-engine/scripts/build.js --dev`

Wire-in:
- tools/index.ts: re-exports + imports + entries in `allTools` map and
  createAllTools() factory.
- extension-manifest.json: ReplaceSymbol / InsertAroundSymbol / AstGrep
  appended to provides.tools so SF extension agents see them.

Higher value than substring/line-anchor for code in tree-sitter-supported
languages (TS/JS/TSX/Python/Rust). Edit + hashline remain for non-code
files. PascalCase names per the Claude-Code-aligned convention from
sf-mp9w20y1-nld9hc.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 17:14:12 +02:00
Mikael Hugo
19b10eb67c feat: make sf-server own swarm registry sync 2026-05-17 17:05:16 +02:00
Mikael Hugo
0f5a606923 fix: native loader — loud banner on fallback + structured load log + helpers
- Stderr banner on fallback now multi-line with concrete fix steps
  (nix develop → node rust-engine/scripts/build.js --dev) so an operator
  scanning a 280MB cycle log can't miss it. The old single-line warning
  was easy to overlook (today's "WHY HAS NOBODY SEEN IF LOUD" check).
- Structured load record per process at .sf/runtime/native-engine-load.jsonl:
  {ts, pid, platformTag, source, binaryPath, sha256, loaded, errors?}.
  Lets operators audit which binary each SF process loaded — and detect
  ABI mismatches across daemon↔worker boundaries when different sha256
  values appear for the same platformTag (the "rare but real" concern
  flagged earlier today).
- Proxy error message now points to the build/install commands instead
  of just saying "not available". NativeUnavailableError is named for
  consumer try/catch chains.
- Fixed _loadedSuccessfully ordering — was set true BEFORE the require,
  leaving stale-true after a failed first attempt.
- New helpers isNativeLoaded(), nativeBinaryPath(), nativeBinarySha256()
  for diagnostic surfaces (sf headless query, doctor checks).
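
One way an operator tool might use the load log to spot ABI mismatches is sketched below. The record type covers only a subset of the fields listed above, and `findShaMismatches` is a hypothetical helper, not part of the loader.

```typescript
// Subset of the per-process record written to native-engine-load.jsonl.
interface NativeLoadRecord {
  pid: number;
  platformTag: string;
  sha256: string;
  loaded: boolean;
}

// Groups successful loads by platformTag and reports tags that were
// served by more than one distinct binary hash.
function findShaMismatches(records: NativeLoadRecord[]): Record<string, string[]> {
  const hashesByTag = new Map<string, Set<string>>();
  for (const record of records) {
    if (!record.loaded) continue; // fallback loads carry no binary to compare
    const hashes = hashesByTag.get(record.platformTag) ?? new Set<string>();
    hashes.add(record.sha256);
    hashesByTag.set(record.platformTag, hashes);
  }

  const mismatches: Record<string, string[]> = {};
  hashesByTag.forEach((hashes, tag) => {
    if (hashes.size > 1) mismatches[tag] = Array.from(hashes).sort();
  });
  return mismatches;
}
```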

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 16:34:02 +02:00
Mikael Hugo
9a84d82cdb chore(release): 2.75.3 → 2.75.4 + workspace dependency refresh
Bumps version across the workspace (root + 10 @singularity-forge/*
packages) and lands the pending dependency refresh that had been
sitting uncommitted:

  @anthropic-ai/sdk         0.95.1 → 0.96.0
  @anthropic-ai/vertex-sdk  0.14.4 → 0.16.0
  @google/genai             2.0    → 2.3
  @logtape/{file,logtape,pretty,redaction}  2.0.7 → 2.0.9
  @smithy/node-http-handler 4.7.0  → 4.7.3
  @clack/prompts            1.3    → 1.4
  @types/mime-types         2.1    → 3.0

Inter-package refs in packages/{daemon,ai}/package.json bumped to
^2.75.4 so the workspace stays self-consistent. package-lock.json
regenerated via `npm install --package-lock-only --legacy-peer-deps`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-16 23:59:14 +02:00
Mikael Hugo
f55d490e1d fix(subagent-runner): drop spurious 10s STUCK warning on session.prompt
The phaseWatchdog at 10s fired "STUCK phase=session.prompt" on every
healthy LLM call longer than 10 seconds. Verified via strace on the
running dogfood sf: bytes were actively flowing on the TLS socket
(fd 29) to the LLM provider while STUCK was being logged — the
session.prompt was never actually stuck, the watchdog was just
diagnostic-only and oblivious to stream activity.

The noOutputTimeoutMs watchdog (set to 60s for triage in commit
d80060fec) is the actual kill mechanism. It is already event-aware:
every meaningful subagent event resets the timer via armNoOutputTimer
+ isMeaningfulSubagentOutputEvent. The 10s STUCK warning was added
in commit 67e5ac9db as investigation infrastructure for the
sf-mp8e02m1-zpk903 family of bugs, but now it is just noise that
makes legitimate 30-200s LLM responses look broken.

Keeps the 10s STUCK watchdog for the three setup phases
(resourceLoader.reload, createAgentSession, bindExtensions), where
10s of silence is a real hang signal: those phases normally complete
in under a second.
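
To make the difference between the two watchdogs concrete, here is a sketch of an event-aware no-output timer with an injected clock. The class, the `isMeaningfulEvent` rule (heartbeats do not count), and the event shape are assumptions for illustration and do not reproduce armNoOutputTimer or isMeaningfulSubagentOutputEvent.

```typescript
interface SubagentEvent {
  type: string;
}

// Hypothetical rule: everything except heartbeats counts as output.
function isMeaningfulEvent(event: SubagentEvent): boolean {
  return event.type !== "heartbeat";
}

class NoOutputTimer {
  private readonly timeoutMs: number;
  private lastOutputAt: number;

  constructor(timeoutMs: number, startedAt: number) {
    this.timeoutMs = timeoutMs;
    this.lastOutputAt = startedAt;
  }

  // Meaningful events push the deadline forward; others are ignored.
  observe(event: SubagentEvent, now: number): void {
    if (isMeaningfulEvent(event)) this.lastOutputAt = now;
  }

  // True once timeoutMs has passed without meaningful output.
  isExpired(now: number): boolean {
    return now - this.lastOutputAt >= this.timeoutMs;
  }
}
```

A fixed 10s phase timer has no equivalent of `observe`, which is why it fired on healthy long-running prompts while bytes were still streaming.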

Also includes:
- biome.json: bump $schema URL from 2.4.14 to 2.4.15 to match the
  current biome CLI (clears the deserialize warning)
- scripts/check-test-imports.{,test.}mjs: format + drop a useless
  regex escape that biome flagged in landed code

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-16 23:49:43 +02:00
Mikael Hugo
365c6bbc3b chore: formatter / linter touch-up (230 files)
Pure formatting / lint-fix pass that ran during `npm run build:core`
in the session that landed the agent-runner / quota / coverage /
phase-2 routing work. No logic changes — indentation, trailing
commas, import sort, etc. Captured separately so the actual feature
commits stay scoped.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-16 21:19:53 +02:00
Mikael Hugo
67e5ac9db1 diag(subagent-runner): per-phase timing + stuck-watchdog for sf-mp8e02m1-zpk903
Adds visible diagnostics to runSubagent so the next time the
"session initialized but no LLM call" bug fires, the log identifies
which setup phase hangs.

Phases instrumented:
  - resourceLoader.reload()
  - createAgentSession()
  - bindExtensions(runLifecycle=...)
  - session.prompt() entry → return

Output format (stderr, prefixed with [subagent:<name>]):
  phase=resourceLoader.reload 23ms
  phase=createAgentSession 142ms
  phase=bindExtensions 89ms runLifecycle=true
  phase=session.prompt-entered taskLen=8421 timeoutMs=480000 noOutputMs=180000
  phase=session.prompt-returned 16234ms          ← normal completion
  STUCK phase=<X> 10000ms (no completion signal ...)   ← when watchdog fires

Each phase has a soft 10s watchdog that emits a STUCK line if the
await doesn't complete in time. The watchdog never aborts; it only
logs. The existing timeoutMs / noOutputTimeoutMs settings handle
actual termination.

This is investigation infrastructure for the third prompt-never-sent
seam (coding-agent/subagent-runner). The agent-runner.js seam
(sf-mp8g4rcd-w01tkh) was fixed in commit 8ee4d8358 with bounded
retries. This commit doesn't fix the underlying bug — it makes the
bug self-reporting next time it fires so operator and autonomous
loop both get actionable signal.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-16 20:40:17 +02:00
Mikael Hugo
b5764af27b sf snapshot: uncommitted changes after 33m inactivity 2026-05-16 17:00:13 +02:00
Mikael Hugo
da0c41d375 sf snapshot: uncommitted changes after 56m inactivity 2026-05-16 14:59:40 +02:00
Mikael Hugo
e2e096c5c7 feat(rpc): configurable RPC init timeout via SF_RPC_INIT_TIMEOUT_MS
Add resolveRpcInitTimeoutMs() helper and wire it into RpcClient.init().
Default init timeout increased from 30s to 120s. Override via env var.
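
A helper of this kind might look like the following sketch. The 120s default and the env var name come from the message above; taking the environment as a parameter and falling back to the default on blank, non-numeric, or non-positive values are assumptions about behavior the commit does not spell out.

```typescript
const DEFAULT_RPC_INIT_TIMEOUT_MS = 120_000;

// Resolves the init timeout from an environment-like record.
function resolveRpcInitTimeoutMsSketch(
  env: Record<string, string | undefined>,
): number {
  const raw = env["SF_RPC_INIT_TIMEOUT_MS"];
  if (raw === undefined || raw.trim() === "") return DEFAULT_RPC_INIT_TIMEOUT_MS;

  const parsed = Number(raw);
  // Assumed policy: reject NaN, Infinity, zero, and negative values.
  if (!Number.isFinite(parsed) || parsed <= 0) return DEFAULT_RPC_INIT_TIMEOUT_MS;
  return Math.floor(parsed);
}
```

In a Node process the caller would pass `process.env`, for example after launching with `SF_RPC_INIT_TIMEOUT_MS=300000`.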

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-15 20:00:26 +02:00
Mikael Hugo
3a14fe86a7 test(list-models): isolate from developer's discovery-cache
Tests were picking up the developer's real
~/.sf/agent/discovery-cache.json and seeing unexpected models in
output. Pin tests to a guaranteed-missing path via the new
_discoveryCacheFilePath option so the env they observe is solely
what the test constructs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-15 16:37:11 +02:00
Mikael Hugo
7ba469cff1 feat(memory): add debug logging to memory extraction pipeline
The memory extraction system has infrastructure (DB tables, LLM prompts,
unit closeout wiring, embedding backfill) but zero processed units and
only self-feedback-resolution memories. This suggests extraction is
failing silently.

Add debugLog() calls throughout extractMemoriesFromUnit() so we can
observe:
- Skip reasons (mutex busy, rate limited, already processed, file too small)
- Start/done lifecycle per unit
- LLM call and parse outcomes
- Error messages on failure and retry

This makes the extraction pipeline observable via --debug or the
journal/debug log without changing behavior.

Tests: 185 files / 1993 tests pass.
Type check: clean.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-15 16:09:36 +02:00
Mikael Hugo
d57cd84d9a fix(auto): make halt watchdog observable 2026-05-15 08:09:02 +02:00
Mikael Hugo
f9c147a08b fix(swarm): ignore heartbeats for silent worker timeout 2026-05-15 08:00:35 +02:00
Mikael Hugo
e464a1bd6e fix(swarm): bound silent worker responses 2026-05-15 07:35:31 +02:00
Copilot
cf9203aee0 feat(swarm): forward parent permission profile to in-process worker sessions
In-process swarm workers get a fresh headless AgentSession whose permission
extension defaults to read-only minimal. This blocks normal autonomous edits
(e.g., write_file, edit) even when the parent session runs at normal or
trusted level.

- run-unit.js: add legacyPermissionLevelForProfile mapping and include
  executorPermissionLevel in the dispatch envelope.
- swarm-dispatch.js: forward executorPermissionLevel from envelope to
  runAgentTurn as permissionLevel.
- agent-runner.js: accept permissionLevel option and pass it to
  runSubagent config.
- subagent-runner.ts: add permissionLevel to SubagentConfig; when set,
  temporarily set SF_PERMISSION_LEVEL env and run extension lifecycle so
  the permission extension reads the level before tool hooks execute.
- Tests for envelope field, dispatch forwarding, and run-unit integration.
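
The "temporarily set SF_PERMISSION_LEVEL" step can be pictured with a small scoped-override helper. `withEnvOverride` is a hypothetical name, the helper is synchronous for brevity, and it works on any environment-like record so that it can be exercised without touching `process.env`.

```typescript
// Sets env[key] for the duration of fn, then restores the previous state,
// including the "key was absent" case, even if fn throws.
function withEnvOverride<T>(
  env: Record<string, string | undefined>,
  key: string,
  value: string,
  fn: () => T,
): T {
  const hadKey = Object.prototype.hasOwnProperty.call(env, key);
  const previous = env[key];
  env[key] = value;
  try {
    return fn();
  } finally {
    if (hadKey) {
      env[key] = previous;
    } else {
      delete env[key];
    }
  }
}
```

An async variant would need to await the callback inside the `try` block so that the restore runs only after the extension lifecycle has read the level.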

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-15 06:38:42 +02:00
Mikael Hugo
dbfaca61cf fix(swarm): surface worker tool call count to bypass parent-ledger guard
Round 7 dogfood failed with "0 tool calls — context exhaustion" even
though the swarm worker's session DID call tools. Root cause: the
phases-unit.js zero-tool-call guard reads from the PARENT session's
message ledger via snapshotUnitMetrics. The swarm worker runs in an
ISOLATED subagent session — its tool calls never appear in the
parent's messages, so the guard always sees 0 and fires a false-
positive context-exhaustion retry.

Fix:
- runUnitViaSwarm now returns swarmToolCallCount on the UnitResult,
  surfacing the real worker tool call count from the onEvent stream
  (collectedToolCalls.length, accurate end-to-end).
- phases-unit.js zero-tool-call guard checks
  unitResult._via === "swarm" && swarmToolCallCount > 0 and bypasses
  the false-positive retry, logging "zero-tool-calls-swarm-bypass".
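
The guard logic described in the fix can be condensed into the sketch below. The `_via` and `swarmToolCallCount` fields and the bypass label come from the message above; the function name, the other outcome labels, and passing the parent ledger count as a number are assumptions.

```typescript
interface UnitResultSketch {
  _via?: string;
  swarmToolCallCount?: number;
}

type GuardOutcome =
  | "ok"
  | "zero-tool-calls-swarm-bypass"
  | "retry-context-exhaustion";

// parentToolCalls is what the parent session's message ledger reports.
function evaluateZeroToolCallGuard(
  parentToolCalls: number,
  unitResult: UnitResultSketch,
): GuardOutcome {
  if (parentToolCalls > 0) return "ok";

  // Swarm workers run in an isolated session, so their tool calls never
  // reach the parent ledger; trust the count surfaced on the result instead.
  const workerCalls = unitResult.swarmToolCallCount ?? 0;
  if (unitResult._via === "swarm" && workerCalls > 0) {
    return "zero-tool-calls-swarm-bypass";
  }
  return "retry-context-exhaustion";
}
```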

Also adds a debug stderr line in subagent-runner.ts printing the tool
count after bindExtensions, confirming the worker session HAS the
full tool set (checkpoint + built-ins) — Hypotheses 1 and 2 from the
Round 8 brief ruled out by direct observation.

Tests: 3 new (swarmToolCallCount = 0 / N / 1-on-checkpoint-only);
2518 tests pass total, 0 regressions.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-15 05:46:17 +02:00
Mikael Hugo
46d9d45279 fix(bash): block wrong project python runtime 2026-05-15 05:33:28 +02:00
Mikael Hugo
54ac56d9bd feat(swarm): honor worker checkpoint outcomes 2026-05-15 04:59:15 +02:00
Mikael Hugo
1115437cec feat(swarm): event streaming + outcome derivation for runUnitViaSwarm
- Forward onEvent through swarm-dispatch → agent-runner → runSubagent
- Collect toolcall_end events in runUnitViaSwarm to build real tool-use blocks
- Detect checkpoint tool outcome for accurate unit completion signal
- Add headless.ts graceful shutdown (async signal handler, 2.5s timeout)
- RPC client stop() now awaits flush and propagates stop to child sessions

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-15 04:54:58 +02:00
Mikael Hugo
903cdd4d9d feat(subagent): event streaming for in-process runSubagent
Add RunSubagentOptions.onEvent callback so callers (TUI live update panel
for /delegate, /rubber-duck, etc.) get every session event without polling.
Errors from the callback are caught so a buggy caller cannot crash the agent.
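
The "a buggy caller cannot crash the agent" guarantee amounts to wrapping each callback invocation, roughly as sketched here. `emitSafely`, the event shape, and the boolean return value are hypothetical; the real runSubagent may log or count failures differently.

```typescript
interface SessionEventSketch {
  type: string;
}

type OnEventCallback = (event: SessionEventSketch) => void;

// Delivers one event to an optional listener. Returns true when the
// listener ran to completion, false when it was absent or threw.
function emitSafely(
  onEvent: OnEventCallback | undefined,
  event: SessionEventSketch,
): boolean {
  if (onEvent === undefined) return false;
  try {
    onEvent(event);
    return true;
  } catch (_err) {
    // Swallow listener errors so the agent loop keeps running.
    return false;
  }
}
```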

Chain caller-supplied AbortSignal through a local AbortController in
runSingleAgent and register it in a new liveSubagentControllers set so
stopLiveSubagents aborts in-process subagents alongside the legacy spawn-based
processes (cmux split, sift codebase_search).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-15 04:04:52 +02:00
Mikael Hugo
62f886430c fix: run subagents in process by default 2026-05-15 03:59:34 +02:00
Mikael Hugo
8b0f0bbd65 fix: harden headless dogfood self-healing 2026-05-15 03:53:15 +02:00
Mikael Hugo
f0c3eaf999 refactor(extensions): merge ttsr into guardrails
TTSR (Time Traveling Stream Rules) monitored streaming output against regex
patterns. Guardrails blocked dangerous actions and redacted secrets. Both are
safety/guardrail concerns — merging them into one extension reduces surface
area and simplifies the safety model.

Changes:
- Copied ttsr-rule-loader.js, ttsr-manager.js, ttsr-interrupt.md into guardrails/
- Updated guardrails extension-manifest.json with ttsr hooks (turn_start,
  message_update, turn_end, agent_end)
- Integrated TTSR session_start/turn_start/message_update/turn_end/agent_end
  handlers into guardrails/index.js
- Deleted ttsr/ extension directory

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-15 02:28:40 +02:00
Mikael Hugo
2d5a05a48b fix(security): resolve 7 findings from full-repo code review
- Create web/middleware.ts to authenticate all API routes via bearer token
  and origin checks (previously unauthenticated due to missing middleware file)

- Fix path traversal in browse-directories: replace startsWith with
  realpathSync + relative + isAbsolute containment checks

- Fix XSS in session HTML export: escape raw HTML blocks via marked renderer

- Fix PTY process leak: destroy session on SSE stream cancellation

- Fix unhandled exception in terminal sessions POST: wrap getOrCreateSession
  in try/catch with structured JSON error response

- Fix silent child-process failure in headless dispatch: add exit handler
  to write failed claim when sf headless triage exits non-zero

- Fix TypeError on malformed claim JSON: add Array.isArray guard before
  accessing claim.ids.length
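
The path traversal fix above can be sketched as follows. This is an illustration of the general realpathSync + relative + isAbsolute technique, not the code from browse-directories; `isInsideRoot` is a hypothetical name. A plain `startsWith` check is insufficient because `/srv/data` is a string prefix of `/srv/data-private`, and because symlinks can point outside the root.

```typescript
import { realpathSync } from "node:fs";
import { isAbsolute, relative, sep } from "node:path";

// Returns true when `candidate` resolves to `root` itself or to a path inside it.
// Both paths must exist, because realpathSync throws on missing entries.
function isInsideRoot(root: string, candidate: string): boolean {
  // Resolve symlinks first so a link inside the root cannot point outside it.
  const realRoot = realpathSync(root);
  const realCandidate = realpathSync(candidate);
  const rel = relative(realRoot, realCandidate);

  // An empty result means both paths are the same directory.
  if (rel === "") return true;

  // A leading ".." segment means the candidate is above or beside the root.
  // An absolute result (for example a different drive on Windows) also escapes.
  const escapes = rel === ".." || rel.startsWith(".." + sep);
  return !escapes && !isAbsolute(rel);
}
```

Comparing against `".." + sep` rather than just `".."` keeps legitimate entries whose names merely begin with two dots inside the root.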

All changes type-check cleanly.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-15 02:18:43 +02:00
Mikael Hugo
def1edefa9 sf snapshot: uncommitted changes after 268m inactivity 2026-05-15 02:08:06 +02:00
Mikael Hugo
2e4bdd292c fix: keep hidden sf commands callable in print mode 2026-05-14 21:25:18 +02:00
Mikael Hugo
f88b48b0aa fix: show print mode liveness 2026-05-14 20:59:19 +02:00
Mikael Hugo
487237a32c fix: bound sf print mode and chat routing 2026-05-14 20:55:00 +02:00
Mikael Hugo
47867c1236 feat: route clear sf chat commands 2026-05-14 20:21:37 +02:00
Mikael Hugo
ab1a1edcf9 refactor: tier sf slash commands 2026-05-14 20:14:09 +02:00
Mikael Hugo
7ea41b89ae feat(ai,coding-agent): wireModelId — provider deployment alias
Adds an optional wireModelId field to the Model interface and a
resolveWireModelId helper. Forge's canonical model.id stays stable for
selection, capability scoring, policy, and history; providers now send
model.wireModelId on the wire when set, model.id otherwise.
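
A minimal sketch of the fallback rule described above. The `Model` interface here lists only the two fields under discussion, `buildRequest` is a hypothetical helper, and treating an empty string as "not set" is an assumption rather than confirmed behavior of the real `resolveWireModelId`.

```typescript
// Only the two fields discussed here; the real Model interface has more.
interface Model {
  id: string;
  wireModelId?: string;
}

// Pick the identifier to send to the provider.
// Assumption: an empty string counts as "not set".
function resolveWireModelId(model: Model): string {
  return model.wireModelId ? model.wireModelId : model.id;
}

// Hypothetical request builder: the wire ID goes to the provider,
// while model.id stays available for history and policy.
function buildRequest(model: Model, prompt: string): { model: string; input: string } {
  return { model: resolveWireModelId(model), input: prompt };
}

const canonical: Model = { id: "gpt-example" };
const azure: Model = { id: "gpt-example", wireModelId: "my-azure-deployment" };

console.log(buildRequest(canonical, "hi").model); // uses model.id
console.log(buildRequest(azure, "hi").model); // uses model.wireModelId
```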

Use cases: Azure deployment names, vendor model slugs that differ
from Forge's canonical identity, A/B routing where the operator wants
canonical history but a specific deployment.

Wired through every provider in @singularity-forge/ai (anthropic,
amazon-bedrock, azure-openai-responses, google, google-vertex,
google-gemini-cli, mistral, openai-codex-responses, openai-completions,
openai-responses) plus @singularity-forge/coding-agent's
ModelRegistry (model definitions + per-model overrides).

Tests: openai-completions wireModelId payload coverage +
model-registry-auth-mode coverage for the override + definition fields.
Full pi-ai + coding-agent suite: 956/956 ✓ (7 unrelated skipped).

This realizes the model-registry contract drafted in 1d753af6b.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 09:25:21 +02:00
Mikael Hugo
a342868068 feat(packages): extract @singularity-forge/openai-codex-provider
Mirrors the @singularity-forge/google-gemini-cli-provider package layout
for the codex CLI integration boundary. The new package owns:

- CodexAppServerClient (the JSON-RPC subprocess client; previously
  packages/ai/src/providers/codex-app-server-client.ts, no pi-ai
  internal coupling)
- snapshotCodexCliAccount / discoverCodexCliModels (reads
  ~/.codex/models_cache.json with visibility=list ∧ supported_in_api
  filter; previously inline in src/resources/extensions/sf/openai-codex-catalog.js)
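
The `visibility=list ∧ supported_in_api` filter above could look roughly like this. Only the two field names come from this message; the `id` field, the entry shape, and the `listableModelIds` name are assumptions for illustration, and the actual layout of models_cache.json may differ.

```typescript
// Assumed entry shape: only `visibility` and `supported_in_api` are named
// in the commit message; `id` is a hypothetical identifier field.
interface CachedModelEntry {
  id: string;
  visibility?: string;
  supported_in_api?: boolean;
}

// Keep entries that are both listed and usable through the API.
function listableModelIds(entries: CachedModelEntry[]): string[] {
  return entries
    .filter((entry) => entry.visibility === "list" && entry.supported_in_api === true)
    .map((entry) => entry.id);
}
```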

openai-codex-responses.ts (the stream-shaping provider) intentionally
stays in @singularity-forge/ai because it depends on pi-ai stream-event
internals and is not reusable outside the provider — same scope as
google-gemini-cli.ts vs google-gemini-cli-provider.

The SF extension's openai-codex-catalog.js is now a thin SF-side cache
writer that delegates to discoverCodexCliModels, mirroring how
gemini-catalog.js delegates to discoverGeminiCliModels. readCodexAvailableModels
became async to match the dynamic-import path; tests updated.

Closes sf-mp4u5fcz-wh6ac9 (with documented AC2 narrowing — see
resolution).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 06:48:19 +02:00
Mikael Hugo
f68ab20953 fix(ai): backfill MiniMax M2/M2.1 cacheRead pricing 2026-05-14 04:55:46 +02:00
Mikael Hugo
383e495085 feat(headless,gemini-cli): add sf headless usage + unify gemini quota path
Adds a machine-readable headless surface for live LLM-provider usage and
unifies the gemini-cli quota fetch through one helper, removing the
duplication that existed between usage-bar.js and the new package.

1. snapshotGeminiCliAccount in @singularity-forge/google-gemini-cli-provider

   - Single source of truth for { projectId, userTierId, userTierName,
     paidTier, models[] } via setupUser + retrieveUserQuota.
   - Deduplicates buckets per modelId, keeping the worst (lowest remainingFraction)
     so consumers always see the most restrictive window. Code Assist
     sometimes returns multiple buckets per model; the pessimistic choice
     is what every consumer needs.
   - discoverGeminiCliModels(cwd?) wraps it for catalog-cache callers that
     only need the IDs.

2. sf headless usage subcommand

   - New src/headless-usage.ts handler with text (default) and --json output.
     Uses the package's snapshot directly (no RPC child, no jiti
     gymnastics), matching the shape of headless-uok-status / headless-doctor.
   - Wired into src/headless.ts after the doctor block.
   - Help text adds the command line.

3. usage-bar.js refactored to delegate

   - fetchGeminiUsage no longer imports gemini-cli-core directly. It calls
     snapshotGeminiCliAccount and reshapes the result into the existing
     { provider, displayName, windows[] } UI contract.
   - Eliminates the duplicate setupUser + retrieveUserQuota code path.
   - The fast existsSync(~/.gemini/oauth_creds.json) pre-flight stays
     so unauthenticated users get a friendly message without paying for OAuth
     bootstrap.

4. Model registry refactor (separate track committed alongside)

   - src/resources/extensions/sf/model-registry.ts (new) consolidates
     canonical model identity, capability tier, and generation tags into
     one source of truth that auto-model-selection, benchmark-selector,
     and model-router now consume instead of maintaining parallel maps.

All 1487 tests pass (151 files); typecheck clean for both the package
and the SF extensions.
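
The per-model bucket selection described in item 1 can be sketched as below. `modelId` and `remainingFraction` are the field names used in this message; `QuotaBucket` and `keepWorstBucketPerModel` are hypothetical names, and the real bucket objects likely carry more fields.

```typescript
// Minimal bucket shape; real quota buckets likely carry more fields.
interface QuotaBucket {
  modelId: string;
  remainingFraction: number;
}

// For each modelId keep the bucket with the lowest remainingFraction,
// i.e. the most restrictive quota window.
function keepWorstBucketPerModel(buckets: QuotaBucket[]): QuotaBucket[] {
  const worst = new Map<string, QuotaBucket>();
  for (const bucket of buckets) {
    const current = worst.get(bucket.modelId);
    if (!current || bucket.remainingFraction < current.remainingFraction) {
      worst.set(bucket.modelId, bucket);
    }
  }
  // Map iteration follows first-insertion order of each modelId.
  return [...worst.values()];
}
```

Choosing the minimum is deliberately pessimistic: a consumer that plans against the tightest window never over-estimates the remaining quota.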

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 03:42:53 +02:00