In-process swarm workers get a fresh headless AgentSession whose permission
extension defaults to the read-only minimal level. This blocks ordinary
autonomous edits (e.g., write_file, edit) even when the parent session runs
at the normal or trusted level.
- run-unit.js: add legacyPermissionLevelForProfile mapping and include
executorPermissionLevel in the dispatch envelope.
- swarm-dispatch.js: forward executorPermissionLevel from envelope to
runAgentTurn as permissionLevel.
- agent-runner.js: accept permissionLevel option and pass it to
runSubagent config.
- subagent-runner.ts: add permissionLevel to SubagentConfig; when set,
temporarily set SF_PERMISSION_LEVEL env and run extension lifecycle so
the permission extension reads the level before tool hooks execute.
- Tests for envelope field, dispatch forwarding, and run-unit integration.
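The temporary SF_PERMISSION_LEVEL override described above can be sketched roughly as follows. This is a minimal illustration, not the code in subagent-runner.ts: withTemporaryEnv and the Env type are hypothetical names, the env object is passed in rather than read from process.env, and the sketch is synchronous for brevity.

```typescript
// Hypothetical helper (not from the commit): run `fn` with `key` temporarily
// set in `env`, then restore the previous value or remove the key again.
type Env = Record<string, string | undefined>;

function withTemporaryEnv<T>(
  env: Env,
  key: string,
  value: string | undefined,
  fn: () => T,
): T {
  // No override requested: leave the environment untouched.
  if (value === undefined) return fn();

  const hadKey = Object.prototype.hasOwnProperty.call(env, key);
  const previous = env[key];
  env[key] = value;
  try {
    // The permission extension would read env[key] while this runs.
    return fn();
  } finally {
    // Restore even if `fn` throws, so the parent's level is never changed.
    if (hadKey) env[key] = previous;
    else delete env[key];
  }
}
```

Restoring in a finally block matters here because the parent session shares the same process environment in the in-process case.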
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Round 7 dogfood failed with "0 tool calls — context exhaustion" even
though the swarm worker's session DID call tools. Root cause: the
phases-unit.js zero-tool-call guard reads from the PARENT session's
message ledger via snapshotUnitMetrics. The swarm worker runs in an
ISOLATED subagent session — its tool calls never appear in the
parent's messages, so the guard always sees 0 and fires a false-
positive context-exhaustion retry.
Fix:
- runUnitViaSwarm now returns swarmToolCallCount on the UnitResult,
surfacing the real worker tool call count from the onEvent stream
(collectedToolCalls.length, accurate end-to-end).
- phases-unit.js zero-tool-call guard checks
unitResult._via === "swarm" && swarmToolCallCount > 0 and bypasses
the false-positive retry, logging "zero-tool-calls-swarm-bypass".
Also adds a debug stderr line in subagent-runner.ts printing the tool
count after bindExtensions, confirming the worker session HAS the
full tool set (checkpoint + built-ins). This rules out Hypotheses 1 and 2
from the Round 8 brief by direct observation.
Tests: 3 new (swarmToolCallCount = 0 / N / 1-on-checkpoint-only);
2518 tests pass total, 0 regressions.
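The guard logic described in the fix can be sketched as below. The _via and swarmToolCallCount fields and the log reason come from this message; the function name, the result shape, and the parentToolCallCount parameter are assumptions made for illustration.

```typescript
// Minimal sketch of the zero-tool-call guard with the swarm bypass.
interface UnitResult {
  _via?: string;
  swarmToolCallCount?: number;
}

interface GuardDecision {
  retry: boolean;
  reason?: string;
}

function zeroToolCallGuard(
  parentToolCallCount: number, // count read from the parent session's ledger
  unitResult: UnitResult,
): GuardDecision {
  // The parent saw tool calls: nothing suspicious, no retry.
  if (parentToolCallCount > 0) return { retry: false };

  // The parent saw none, but an isolated swarm worker reported its own calls.
  const swarmCalls = unitResult.swarmToolCallCount ?? 0;
  if (unitResult._via === "swarm" && swarmCalls > 0) {
    return { retry: false, reason: "zero-tool-calls-swarm-bypass" };
  }

  // Genuinely zero tool calls: treat as possible context exhaustion.
  return { retry: true };
}
```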
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Forward onEvent through swarm-dispatch → agent-runner → runSubagent
- Collect toolcall_end events in runUnitViaSwarm to build real tool-use blocks
- Detect checkpoint tool outcome for accurate unit completion signal
- Add headless.ts graceful shutdown (async signal handler, 2.5s timeout)
- RPC client stop() now awaits flush and propagates stop to child sessions
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add RunSubagentOptions.onEvent callback so callers (TUI live update panel
for /delegate, /rubber-duck, etc.) get every session event without polling.
Errors from the callback are caught so a buggy caller cannot crash the agent.
Chain caller-supplied AbortSignal through a local AbortController in
runSingleAgent and register it in a new liveSubagentControllers set so
stopLiveSubagents aborts in-process subagents alongside the legacy spawn-based
processes (cmux split, sift codebase_search).
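A rough sketch of the two mechanisms above: a guarded event callback, and a local AbortController chained to the caller's signal and tracked in a live set. All names here (safeEmit, registerLocalController, stopAllLive, liveControllers) are hypothetical stand-ins, not the identifiers used in the commit.

```typescript
type SessionEvent = { type: string };

// Deliver an event to the caller; a throwing callback must not crash the agent.
function safeEmit(
  onEvent: ((event: SessionEvent) => void) | undefined,
  event: SessionEvent,
): void {
  if (!onEvent) return;
  try {
    onEvent(event);
  } catch {
    // Intentionally swallowed: a buggy caller is isolated from the run.
  }
}

// Stand-in for the set of in-process subagent controllers.
const liveControllers = new Set<AbortController>();

// Create a local controller that aborts when the caller's signal aborts,
// and register it so a global stop can reach it as well.
function registerLocalController(parent?: AbortSignal): AbortController {
  const local = new AbortController();
  if (parent) {
    if (parent.aborted) local.abort();
    else parent.addEventListener("abort", () => local.abort(), { once: true });
  }
  liveControllers.add(local);
  return local;
}

// Abort every registered in-process subagent and empty the registry.
function stopAllLive(): void {
  liveControllers.forEach((controller) => controller.abort());
  liveControllers.clear();
}
```

Using a local controller, rather than the caller's signal directly, gives the stop path something it can abort even when the caller supplied no signal.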
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Create web/middleware.ts to authenticate all API routes via bearer token
and origin checks (previously unauthenticated due to missing middleware file)
- Fix path traversal in browse-directories: replace startsWith with
realpathSync + relative + isAbsolute containment checks
- Fix XSS in session HTML export: escape raw HTML blocks via marked renderer
- Fix PTY process leak: destroy session on SSE stream cancellation
- Fix unhandled exception in terminal sessions POST: wrap getOrCreateSession
in try/catch with structured JSON error response
- Fix silent child-process failure in headless dispatch: add exit handler
to write failed claim when sf headless triage exits non-zero
- Fix TypeError on malformed claim JSON: add Array.isArray guard before
accessing claim.ids.length
All changes type-check cleanly.
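To illustrate why the startsWith check was replaced, here is a minimal containment sketch using relative + isAbsolute. It assumes both paths were already canonicalized (for example with fs.realpathSync, as the fix describes), uses path.posix so the behavior does not depend on the host platform, and isInsideRoot is a hypothetical name.

```typescript
import * as path from "node:path";

// Returns true only when `candidate` is `root` itself or lies beneath it.
// Both arguments are assumed to be absolute, symlink-resolved paths.
function isInsideRoot(root: string, candidate: string): boolean {
  const rel = path.posix.relative(root, candidate);
  if (rel === "") return true; // same directory
  // Escapes show up as ".." or "../..."; an absolute result means the
  // candidate cannot be expressed relative to the root at all.
  return rel !== ".." && !rel.startsWith("../") && !path.posix.isAbsolute(rel);
}
```

A prefix test accepts "/srv/data-evil" for the root "/srv/data" because the strings share a prefix; the relative-path form rejects it because the result starts with "../".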
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Adds an optional wireModelId field to the Model interface and a
resolveWireModelId helper. Forge's canonical model.id stays stable for
selection, capability scoring, policy, and history; providers now send
model.wireModelId on the wire when set, model.id otherwise.
Use cases: Azure deployment names, vendor model slugs that differ
from Forge's canonical identity, A/B routing where the operator wants
canonical history but a specific deployment.
Wired through every provider in @singularity-forge/ai (anthropic,
amazon-bedrock, azure-openai-responses, google, google-vertex,
google-gemini-cli, mistral, openai-codex-responses, openai-completions,
openai-responses) plus @singularity-forge/coding-agent's
ModelRegistry (model definitions + per-model overrides).
Tests: openai-completions wireModelId payload coverage +
model-registry-auth-mode coverage for the override + definition fields.
Full pi-ai + coding-agent suite: 956/956 ✓ (7 unrelated skipped).
This realizes the model-registry contract drafted in 1d753af6b.
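The lookup rule stated above (wireModelId when set, model.id otherwise) amounts to something like the following. The Model interface is trimmed to the two relevant fields, and the body is an assumed implementation of the helper, not a copy of it; in this sketch an empty string counts as set.

```typescript
// Trimmed-down Model shape: only the fields relevant to wire resolution.
interface Model {
  id: string; // canonical identity used for selection, policy, history
  wireModelId?: string; // optional provider-facing identifier
}

// Assumed behavior: prefer the wire ID, fall back to the canonical ID.
function resolveWireModelId(model: Model): string {
  return model.wireModelId ?? model.id;
}
```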
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Mirrors the @singularity-forge/google-gemini-cli-provider package layout
for the codex CLI integration boundary. The new package owns:
- CodexAppServerClient (the JSON-RPC subprocess client; previously
packages/ai/src/providers/codex-app-server-client.ts, no pi-ai
internal coupling)
- snapshotCodexCliAccount / discoverCodexCliModels (reads
~/.codex/models_cache.json with visibility=list ∧ supported_in_api
filter; previously inline in src/resources/extensions/sf/openai-codex-catalog.js)
openai-codex-responses.ts (the stream-shaping provider) intentionally
stays in @singularity-forge/ai because it depends on pi-ai stream-event
internals and is not reusable outside the provider — same scope as
google-gemini-cli.ts vs google-gemini-cli-provider.
The SF extension's openai-codex-catalog.js is now a thin SF-side cache
writer that delegates to discoverCodexCliModels, mirroring how
gemini-catalog.js delegates to discoverGeminiCliModels. readCodexAvailableModels
became async to match the dynamic-import path; tests updated.
Closes sf-mp4u5fcz-wh6ac9 (with documented AC2 narrowing — see
resolution).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a machine-readable headless surface for live LLM-provider usage and
unifies the gemini-cli quota fetch through one helper, removing the
duplication that existed between usage-bar.js and the new package.
1. snapshotGeminiCliAccount in @singularity-forge/google-gemini-cli-provider
- Single source of truth for { projectId, userTierId, userTierName,
paidTier, models[] } via setupUser + retrieveUserQuota.
- Dedups buckets per modelId, keeping the worst (lowest remainingFraction)
so consumers always see the most-restrictive window. Code Assist
sometimes returns multiple buckets per model; the pessimistic choice
is what every consumer needs.
- discoverGeminiCliModels(cwd?) wraps it for catalog-cache callers that
only need the IDs.
2. sf headless usage subcommand
- New src/headless-usage.ts handler. text (default) and --json output.
Uses the package's snapshot directly — no RPC child, no jiti
gymnastics — matching the shape of headless-uok-status / headless-doctor.
- Wired into src/headless.ts after the doctor block.
- Help text adds the command line.
3. usage-bar.js refactored to delegate
- fetchGeminiUsage no longer imports gemini-cli-core directly. It calls
snapshotGeminiCliAccount and reshapes the result into the existing
{ provider, displayName, windows[] } UI contract.
- Eliminates the duplicate setupUser + retrieveUserQuota code path.
- The fast existsSync(~/.gemini/oauth_creds.json) pre-flight stays
so unauthenticated users get a friendly message without paying for
the OAuth bootstrap.
4. Model registry refactor (separate track committed alongside)
- src/resources/extensions/sf/model-registry.ts (new) consolidates
canonical model identity, capability tier, and generation tags into
one source of truth that auto-model-selection, benchmark-selector,
and model-router now consume instead of maintaining parallel maps.
All 1487 tests pass (151 files); typecheck clean for both the package
and the SF extensions.
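The per-model bucket dedup from item 1 can be sketched as below. QuotaBucket is reduced to the two fields the rule needs and keepWorstBucketPerModel is a hypothetical name; the real snapshot carries more fields (usedFraction, resetTime).

```typescript
interface QuotaBucket {
  modelId: string;
  remainingFraction: number; // 0 = exhausted, 1 = untouched
}

// Keep one bucket per modelId: the one with the lowest remainingFraction,
// so consumers always see the most restrictive window.
function keepWorstBucketPerModel(buckets: QuotaBucket[]): QuotaBucket[] {
  const worst = new Map<string, QuotaBucket>();
  for (const bucket of buckets) {
    const current = worst.get(bucket.modelId);
    if (!current || bucket.remainingFraction < current.remainingFraction) {
      worst.set(bucket.modelId, bucket);
    }
  }
  return Array.from(worst.values());
}
```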
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two related fixes for the google-gemini-cli provider, both motivated by today's
dogfood diagnosis: SF was pinned to a single model (gemini-3-flash-preview)
even though the AI Ultra account has access to seven (verified via the live
gemini-cli-core probe), and a transient "No capacity available for model X
on the server" was classified as `unknown` so SF gave up instead of retrying.
1. Account snapshot + model discovery in @singularity-forge/google-gemini-cli-provider
- Add `snapshotGeminiCliAccount(cwd?)` returning { projectId, userTierId,
userTierName, paidTier, models } where `models[]` carries each modelId
with usedFraction, remainingFraction, and resetTime. Built on the same
setupUser + CodeAssistServer.retrieveUserQuota path usage-bar.js
already uses, but extracted to the dedicated package so any consumer
(model picker, capacity diagnostics, catalog cache) can call one helper.
- Add `discoverGeminiCliModels(cwd?)` as a thin "just the IDs" wrapper.
- Both are best-effort: any failure (OAuth expired, no project, network)
returns null silently — never throws.
2. SF-side cache writer at src/resources/extensions/sf/gemini-catalog.js
- Delegates discovery to the package; only handles cache file path,
6-hour TTL, and the session_start lifecycle hook.
- Cache lands at .sf/runtime/model-catalog/google-gemini-cli.json with
the same shape as the generic model-catalog-cache, so getKnownModelIds
and the model picker pick it up transparently.
- Wired into bootstrap/register-hooks.js session_start in parallel with
the existing scheduleModelCatalogRefresh (the generic REST + API-key
path can't reach gemini-cli's OAuth-only Code Assist endpoint).
3. Capacity error classification fix
- error-classifier.js SERVER_RE now matches "no capacity (available|left)",
"capacity (unavailable|exhausted)", and "no capacity ... on the server".
Previously these fell through to kind=unknown, which is not transient,
so agent-end-recovery never retried — even though the same handler
already caps gemini-cli rate-limit backoff at 30s for exactly this
class of transient error. With the pattern matched as `server`, the existing
retry-with-backoff path covers it.
The full extension test suite (1386 tests) passes. Typecheck clean for both
the package and the SF extensions.
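For illustration, the three capacity phrasings from item 3 can be matched with a pattern along these lines. CAPACITY_RE and classifyCapacityMessage are hypothetical; the actual SERVER_RE in error-classifier.js covers other server errors too and may be written differently.

```typescript
// Assumed pattern covering only the capacity phrasings named in item 3.
const CAPACITY_RE =
  /no capacity (available|left)|capacity (unavailable|exhausted)|no capacity .* on the server/i;

// Classify a provider error message; "server" is treated as transient
// by the retry path, "unknown" is not.
function classifyCapacityMessage(message: string): "server" | "unknown" {
  return CAPACITY_RE.test(message) ? "server" : "unknown";
}
```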
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Split reorderForCaching into a structured reorderAndSplitForCaching that
returns {before, after} at the semi-static→dynamic section boundary.
- prompt-ordering.js: export reorderAndSplitForCaching — returns null if no
dynamic sections, otherwise {before: static+semi-static, after: dynamic}
- auto.js: import and wire reorderAndSplitForCaching into deps
- phases-unit.js: use split function; pass promptParts to runUnit when split
succeeds; fall back to flat reorderForCaching when null
- run-unit.js: when promptParts is present, send a two-block content array
[{type:text, text:before, cache_control:{type:ephemeral}}, {type:text, text:after}]
so Anthropic-compatible providers cache the stable prefix
- openai-completions.ts: preserve cache_control on text parts in convertMessages;
skip maybeAddOpenRouterAnthropicCacheControl if any part already has cache_control
Tests: 5 new contract tests for reorderAndSplitForCaching; all 4502 unit tests pass.
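The two-block payload described for run-unit.js looks roughly like this. buildPromptContent and the type names are hypothetical, and wrapping the flat fallback in a single-block array is an assumption; the real code may send the flat prompt in a different shape.

```typescript
interface TextBlock {
  type: "text";
  text: string;
  cache_control?: { type: "ephemeral" };
}

interface PromptParts {
  before: string; // static + semi-static sections (stable prefix)
  after: string; // dynamic sections
}

// Build message content: mark the stable prefix as cacheable when the
// split succeeded, otherwise fall back to one uncached block.
function buildPromptContent(flatPrompt: string, parts: PromptParts | null): TextBlock[] {
  if (!parts) return [{ type: "text", text: flatPrompt }];
  return [
    { type: "text", text: parts.before, cache_control: { type: "ephemeral" } },
    { type: "text", text: parts.after },
  ];
}
```

Only the first block carries cache_control, so the dynamic tail can change on every unit without invalidating the cached prefix.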
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Add BUILT_IN_DEFAULT_TIMEOUT_SECS = 120 constant to bash tool
- Compute effectiveTimeout = timeout ?? resolvedDefaultTimeout so LLM
calls without a timeout get the 120s guard automatically
- Add defaultTimeoutSeconds? to BashToolOptions for override at creation
- Dynamic bashSchemaWithDefault describes the actual default in the LLM
tool description, improving model awareness
- Add BashSettings interface + getBashDefaultTimeoutSeconds() to
SettingsManager so users can override or disable via settings.json
- Wire defaultTimeoutSeconds into agent-session.ts _buildRuntime()
Root cause: npx sf --help triggered an npm package download that hung for
4+ minutes with no timeout, consuming the entire autonomous run budget.
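The timeout resolution above reduces to two nullish fallbacks. The constant name and value come from this message; resolveEffectiveTimeout is a hypothetical wrapper, and the settings.json "disable" case is not modeled because its representation is not stated here.

```typescript
const BUILT_IN_DEFAULT_TIMEOUT_SECS = 120;

// Explicit per-call timeout wins, then the creation-time default,
// then the built-in 120s guard.
function resolveEffectiveTimeout(
  timeout: number | undefined,
  defaultTimeoutSeconds?: number,
): number {
  const resolvedDefaultTimeout = defaultTimeoutSeconds ?? BUILT_IN_DEFAULT_TIMEOUT_SECS;
  return timeout ?? resolvedDefaultTimeout;
}
```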
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
rf-01: add ECONNREFUSED to isTransientNetworkError in anthropic-shared.ts,
aligning with the NETWORK_RE pattern in error-classifier.js
rf-02: add scripts/validate-model-cost-table.mjs to report coverage gaps
and price divergence between model-cost-table.js and models.generated.ts;
add 'validate-cost-table' script to package.json
rf-11: extract 10 pure resource-display utility functions from
interactive-mode.ts into packages/coding-agent/src/modes/interactive/
resource-display.ts, reducing interactive-mode.ts by ~282 lines
All 4375 tests pass.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Delete ghost package packages/pi-agent-core (no dist, no consumers,
TS build errors; JS source sf-db.js had 3 commits not mirrored in TS)
- Remove build:pi-agent-core from root package.json build:pi pipeline
- Merge all models from MODEL_COST_PER_1K_INPUT into BUNDLED_COST_TABLE
(model-cost-table.js is now the single canonical cost source)
- Remove duplicate MODEL_COST_PER_1K_INPUT object and getModelCost()
from model-router.js; use lookupModelCost() from model-cost-table.js
- Replace hand-rolled isTransientNetworkError in preferences-models.js
with delegation to classifyError() in error-classifier.js
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- FallbackResolver.setUnitContext() stores {unitType,unitId} from autonomous dispatch
- run-unit.js calls pi.setFallbackUnitContext() before/after each unit
- _findAnyAvailableFallback uses real unitType/unitId from context, not sentinel
- Schema v59: failure_mode column in llm_task_outcomes
- insertLlmTaskOutcome accepts failure_mode (rate_limit, quota_exhausted, auth_error)
- register-hooks.js passes event.classification.reason as failure_mode
- register-hooks.js uses real event.unitId when available
- ExtensionRuntimeActions.setFallbackUnitContext added to pi API surface
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
When a model fails and FallbackResolver picks a replacement, it now:
1. Fires the before_model_select hook with reason='fallback' and the
failing model's ID — the learning system records the failure outcome
and returns the best Bayesian-blended replacement from llm_task_outcomes
2. Falls back to the existing heuristic sort (reasoning + context window)
if the hook is unavailable or returns no override
Changes:
- BeforeModelSelectEvent: add optional currentModelId and reason fields
- FallbackResolver: accept emitBeforeModelSelect in constructor; make
_findAnyAvailableFallback async; fire hook before heuristic fallback
- agent-session.ts: inject lazy emitBeforeModelSelect closure into resolver
- register-hooks.js: record failure outcome when reason='fallback' before
returning selectLearnedModel result
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Add packages/coding-agent/src/utils/format.ts as the canonical source
for formatDuration, formatTokenCount, truncateWithEllipsis, sparkline,
formatDateShort, fileLink, stripAnsi, normalizeStringArray — all already
exported from @singularity-forge/coding-agent via index.ts.
- Convert shared/format-utils.js to a compatibility shim that re-exports
the 8 functions from @singularity-forge/coding-agent. All 13 importers
continue to work with no import changes required.
- Convert shared/path-display.js to a compatibility shim that re-exports
toPosixPath from @singularity-forge/coding-agent. Implementation in
packages/coding-agent/src/utils/path-display.ts was already canonical.
- shared/frontmatter.js is intentionally NOT shimmed: splitFrontmatter/
parseFrontmatterMap have a different API from the package's parseFrontmatter/
stripFrontmatter (flat-map vs {frontmatter, body} object).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Create packages/coding-agent/src/core/providers/web-search-middleware.ts with
WebSearchMiddleware class: injects web_search tool, enforces session budget (#1309),
strips thinking blocks from history, and respects PREFERENCES.md search_provider.
- Wire webSearchMiddleware.applyToPayload into sdk.ts onPayload callback (before
extension hook dispatch) so injection runs as compiled TypeScript with zero
jiti-dispatch overhead.
- Export WebSearchMiddleware, webSearchMiddleware singleton, setPreferBraveResolver,
CUSTOM_SEARCH_TOOL_NAMES, MAX_NATIVE_SEARCHES_PER_SESSION, and stripThinkingFromHistory
from @singularity-forge/coding-agent so the extension can delegate to the same instance.
- Refactor search-the-web/native-search.js: remove self-contained injection logic;
import and delegate before_provider_request to webSearchMiddleware singleton.
Use tri-state isAnthropicProvider (null/false/true) to synthesize a provider hint
when event.model is absent but model_select has already fired — prevents the
model-name heuristic from wrongly injecting into Copilot claude-* requests.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Create config.ts with McpServerConfig types and readMcpConfigs/getServerConfig
- Create auth.ts with buildHttpTransportOpts and createCliOAuthProvider
- Create connection-manager.ts with McpConnectionManager class
- Create index.ts re-exporting the public API
- Export McpConnectionManager and helpers from @singularity-forge/coding-agent
- Rewrite mcp-client extension as thin wrapper using McpConnectionManager
- Rewrite auth.js as re-export shim from @singularity-forge/coding-agent
- Update test to import buildHttpTransportOpts from @singularity-forge/coding-agent
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The Ink bridge added today was a misguided gradual-migration wrapper:
- Components still rendered via the old string-line protocol (no Ink layout)
- Decoded keys were re-encoded to escape sequences, which keys.ts then decoded again (double round-trip bug)
- The _useInk / _inkHandle path blocked TTY start unconditionally via process.stdout.isTTY check
Removed: ink-bridge.tsx, ink-bridge.test.ts, useInk() method, _useInk/_inkHandle fields,
startInkRenderer import/export, Ink branch in start()/stop()/requestRender().
Removed ink and react from packages/tui dependencies and peerDependencies.
Reverted tsconfig.extensions.json jsx settings (only needed for the .tsx bridge file).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Delete memory-backfill.js — not imported anywhere, dead code
- Rename memory-sleeper.js → tool-watchdog.js — misnamed; it is a
tool-output watchdog with no relation to the memory store
- Collapse memory-embeddings-llm-gateway.js into memory-embeddings.js —
removes the lazy-import split; loadGatewayConfigFromEnv,
createGatewayEmbedFn, and rerankCandidates are now direct exports
- Remove buildEmbeddingFn() dead stub (always returned null)
- Enable packages/coding-agent memory extraction extension by default
(memory.enabled ?? true) so session-level extraction is active
- Update all import sites and tests
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- doRender() now catches render errors and emits a fallback line
- autonomousStatus ANSI formatting extracted to renderAutonomousStatus()
with named color constants instead of raw escape strings
- parseCellSizeResponse extracted to pure function with proper validation
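The doRender() change follows a common pattern: catch the render error and emit a single fallback line instead of crashing the UI. A minimal sketch, assuming the renderer returns an array of lines; renderWithFallback and the fallback text are hypothetical.

```typescript
// Run a line-based render function; on failure return one diagnostic line
// so the frame still draws and the error stays visible.
function renderWithFallback(render: () => string[]): string[] {
  try {
    return render();
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return [`[render error: ${message}]`];
  }
}
```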
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- TUI.useInk() opts into Ink-backed rendering (call before start())
- In start(): if _useInk || process.stdout.isTTY, mount Ink renderer via
startInkRenderer() and skip the legacy differential render path entirely
- In stop(): unmount Ink handle and return early; legacy terminal cleanup
(cursor repositioning, showCursor, terminal.stop) is skipped since Ink
handles terminal restoration itself
- Passes this.render()/invalidate() via a plain Component wrapper to avoid
the private handleInput TypeScript conflict
- Two new contract tests: useInk() flag and stop() Ink handle teardown
- 80/80 tests pass; legacy path unchanged for non-TTY (CI/tests)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Install ink@7.0.2 + react@19.2.6. Add JSX/react-jsx support to
packages/tui tsconfig. Create ink-bridge.tsx: LegacyComponentView wraps
existing Component objects as React nodes, startInkRenderer drives the
Ink render loop around any legacy Component tree.
Exports startInkRenderer from @singularity-forge/tui public API.
All 78 existing tui tests pass; 3 new ink-bridge tests added.
This is the infrastructure step for migrating components one-by-one from
the custom differential renderer to native Ink React components, without
breaking interactive mode.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Remove vestigial experimentalDecorators/emitDecoratorMetadata from all
package tsconfigs (no actual decorators in source — flags were from
pi-mono vendor copy)
- Add @typescript/native-preview (tsgo) for faster type checking (nominally
8-10x; measured 4.6x on this repo: tsc 6.5s vs tsgo 1.4s)
- Fix tsconfig.extensions.json: remove baseUrl (removed in tsgo/TS7) and
use relative paths in paths mappings — compatible with both tsc and tsgo
- Add typecheck/typecheck:extensions scripts using tsgo
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>