singularity/singularity-forge

Author	SHA1	Message	Date
Mikael Hugo	3a14fe86a7	test(list-models): isolate from developer's discovery-cache Tests were picking up the developer's real ~/.sf/agent/discovery-cache.json and seeing unexpected models in output. Pin tests to a guaranteed-missing path via the new _discoveryCacheFilePath option so the env they observe is solely what the test constructs. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-15 16:37:11 +02:00
Mikael Hugo	7ba469cff1	feat(memory): add debug logging to memory extraction pipeline The memory extraction system has infrastructure (DB tables, LLM prompts, unit closeout wiring, embedding backfill) but zero processed units and only self-feedback-resolution memories. This suggests extraction is failing silently. Add debugLog() calls throughout extractMemoriesFromUnit() so we can observe: - Skip reasons (mutex busy, rate limited, already processed, file too small) - Start/done lifecycle per unit - LLM call and parse outcomes - Error messages on failure and retry This makes the extraction pipeline observable via --debug or the journal/debug log without changing behavior. Tests: 185 files / 1993 tests pass. Type check: clean. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-15 16:09:36 +02:00
Mikael Hugo	d57cd84d9a	fix(auto): make halt watchdog observable	2026-05-15 08:09:02 +02:00
Mikael Hugo	f9c147a08b	fix(swarm): ignore heartbeats for silent worker timeout	2026-05-15 08:00:35 +02:00
Mikael Hugo	e464a1bd6e	fix(swarm): bound silent worker responses	2026-05-15 07:35:31 +02:00
Copilot	cf9203aee0	feat(swarm): forward parent permission profile to in-process worker sessions In-process swarm workers get a fresh headless AgentSession whose permission extension defaults to read-only minimal. This blocks normal autonomous edits (e.g., write_file, edit) even when the parent session runs at normal or trusted level. - run-unit.js: add legacyPermissionLevelForProfile mapping and include executorPermissionLevel in the dispatch envelope. - swarm-dispatch.js: forward executorPermissionLevel from envelope to runAgentTurn as permissionLevel. - agent-runner.js: accept permissionLevel option and pass it to runSubagent config. - subagent-runner.ts: add permissionLevel to SubagentConfig; when set, temporarily set SF_PERMISSION_LEVEL env and run extension lifecycle so the permission extension reads the level before tool hooks execute. - Tests for envelope field, dispatch forwarding, and run-unit integration. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-15 06:38:42 +02:00
Mikael Hugo	dbfaca61cf	fix(swarm): surface worker tool call count to bypass parent-ledger guard Round 7 dogfood failed with "0 tool calls — context exhaustion" even though the swarm worker's session DID call tools. Root cause: the phases-unit.js zero-tool-call guard reads from the PARENT session's message ledger via snapshotUnitMetrics. The swarm worker runs in an ISOLATED subagent session — its tool calls never appear in the parent's messages, so the guard always sees 0 and fires a false-positive context-exhaustion retry. Fix: - runUnitViaSwarm now returns swarmToolCallCount on the UnitResult, surfacing the real worker tool call count from the onEvent stream (collectedToolCalls.length, accurate end-to-end). - phases-unit.js zero-tool-call guard checks unitResult._via === "swarm" && swarmToolCallCount > 0 and bypasses the false-positive retry, logging "zero-tool-calls-swarm-bypass". Also adds a debug stderr line in subagent-runner.ts printing the tool count after bindExtensions, confirming the worker session HAS the full tool set (checkpoint + built-ins) — Hypotheses 1 and 2 from the Round 8 brief ruled out by direct observation. Tests: 3 new (swarmToolCallCount = 0 / N / 1-on-checkpoint-only); 2518 tests pass total, 0 regressions. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-15 05:46:17 +02:00
Mikael Hugo	46d9d45279	fix(bash): block wrong project python runtime	2026-05-15 05:33:28 +02:00
Mikael Hugo	1115437cec	feat(swarm): event streaming + outcome derivation for runUnitViaSwarm - Forward onEvent through swarm-dispatch → agent-runner → runSubagent - Collect toolcall_end events in runUnitViaSwarm to build real tool-use blocks - Detect checkpoint tool outcome for accurate unit completion signal - Add headless.ts graceful shutdown (async signal handler, 2.5s timeout) - RPC client stop() now awaits flush and propagates stop to child sessions Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-15 04:54:58 +02:00
Mikael Hugo	903cdd4d9d	feat(subagent): event streaming for in-process runSubagent Add RunSubagentOptions.onEvent callback so callers (TUI live update panel for /delegate, /rubber-duck, etc.) get every session event without polling. Errors from the callback are caught so a buggy caller cannot crash the agent. Chain caller-supplied AbortSignal through a local AbortController in runSingleAgent and register it in a new liveSubagentControllers set so stopLiveSubagents aborts in-process subagents alongside the legacy spawn-based processes (cmux split, sift codebase_search). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-15 04:04:52 +02:00
Mikael Hugo	62f886430c	fix: run subagents in process by default	2026-05-15 03:59:34 +02:00
Mikael Hugo	8b0f0bbd65	fix: harden headless dogfood self-healing	2026-05-15 03:53:15 +02:00
Mikael Hugo	2d5a05a48b	fix(security): resolve 7 findings from full-repo code review - Create web/middleware.ts to authenticate all API routes via bearer token and origin checks (previously unauthenticated due to missing middleware file) - Fix path traversal in browse-directories: replace startsWith with realpathSync + relative + isAbsolute containment checks - Fix XSS in session HTML export: escape raw HTML blocks via marked renderer - Fix PTY process leak: destroy session on SSE stream cancellation - Fix unhandled exception in terminal sessions POST: wrap getOrCreateSession in try/catch with structured JSON error response - Fix silent child-process failure in headless dispatch: add exit handler to write failed claim when sf headless triage exits non-zero - Fix TypeError on malformed claim JSON: add Array.isArray guard before accessing claim.ids.length All changes type-check cleanly. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-15 02:18:43 +02:00
Mikael Hugo	def1edefa9	sf snapshot: uncommitted changes after 268m inactivity	2026-05-15 02:08:06 +02:00
Mikael Hugo	2e4bdd292c	fix: keep hidden sf commands callable in print mode	2026-05-14 21:25:18 +02:00
Mikael Hugo	f88b48b0aa	fix: show print mode liveness	2026-05-14 20:59:19 +02:00
Mikael Hugo	487237a32c	fix: bound sf print mode and chat routing	2026-05-14 20:55:00 +02:00
Mikael Hugo	47867c1236	feat: route clear sf chat commands	2026-05-14 20:21:37 +02:00
Mikael Hugo	ab1a1edcf9	refactor: tier sf slash commands	2026-05-14 20:14:09 +02:00
Mikael Hugo	7ea41b89ae	feat(ai,coding-agent): wireModelId — provider deployment alias Adds an optional wireModelId field to the Model interface and a resolveWireModelId helper. Forge's canonical model.id stays stable for selection, capability scoring, policy, and history; providers now send model.wireModelId on the wire when set, model.id otherwise. Use cases: Azure deployment names, vendor model slugs that differ from Forge's canonical identity, A/B routing where the operator wants canonical history but a specific deployment. Wired through every provider in @singularity-forge/ai (anthropic, amazon-bedrock, azure-openai-responses, google, google-vertex, google-gemini-cli, mistral, openai-codex-responses, openai-completions, openai-responses) plus @singularity-forge/coding-agent's ModelRegistry (model definitions + per-model overrides). Tests: openai-completions wireModelId payload coverage + model-registry-auth-mode coverage for the override + definition fields. Full pi-ai + coding-agent suite: 956/956 ✓ (7 unrelated skipped). This realizes the model-registry contract drafted in `1d753af6b`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-14 09:25:21 +02:00
Mikael Hugo	ca7368e5f1	fix(bash): add 120s default timeout to prevent autonomous mode hangs - Add BUILT_IN_DEFAULT_TIMEOUT_SECS = 120 constant to bash tool - Compute effectiveTimeout = timeout ?? resolvedDefaultTimeout so LLM calls without a timeout get the 120s guard automatically - Add defaultTimeoutSeconds? to BashToolOptions for override at creation - Dynamic bashSchemaWithDefault describes the actual default in the LLM tool description, improving model awareness - Add BashSettings interface + getBashDefaultTimeoutSeconds() to SettingsManager so users can override or disable via settings.json - Wire defaultTimeoutSeconds into agent-session.ts _buildRuntime() Root cause: npx sf --help triggered npm package download, hanging for 4+ minutes without timeout, consuming entire autonomous run budget. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-11 19:12:33 +02:00
Mikael Hugo	338c75fc6f	refactor: complete rf-01/rf-02/rf-11 blocked todos rf-01: add ECONNREFUSED to isTransientNetworkError in anthropic-shared.ts, aligning with the NETWORK_RE pattern in error-classifier.js rf-02: add scripts/validate-model-cost-table.mjs to report coverage gaps and price divergence between model-cost-table.js and models.generated.ts; add 'validate-cost-table' script to package.json rf-11: extract 10 pure resource-display utility functions from interactive-mode.ts into packages/coding-agent/src/modes/interactive/ resource-display.ts, reducing interactive-mode.ts by ~282 lines All 4375 tests pass. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-11 16:45:39 +02:00
Mikael Hugo	1adc7f119c	refactor(rf-06): split auto/phases.js into per-phase modules 3538-line monolith → 6 focused modules + thin barrel: - phases-helpers.js (223 lines): shared helpers (generateMilestoneReport, closeoutAndStop, emitCancelledUnitEnd, maybeFireProductAudit, _resolveReportBasePath, recordLearningOutcomeForUnit) - phases-dispatch.js (486 lines): runDispatch + assessUokDiagnosticsDispatchGate - phases-guards.js (497 lines): runGuards + guard helpers - phases-pre-dispatch.js (760 lines): runPreDispatch - phases-unit.js (1477 lines): runUnitPhase + session timeout state - phases-finalize.js (542 lines): runFinalize - phases.js (13 lines): barrel re-export preserving original import surface Removed dead runPhaseReview export (zero callers confirmed). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-11 15:14:49 +02:00
Mikael Hugo	0b5fa75c0d	fix(lint): fix all pre-existing lint failures - check-sf-extension-inventory.mjs: expand parseDirectRegisteredCommands() scan to include 7 more files (guards/inturn.js, notifications/notify.js, permissions/index.js, ui/usage-bar.js, commands/legacy/audit.js, commands/legacy/create-extension.js, commands/legacy/create-slash-command.js) and filter results by BASE_RUNTIME_COMMAND_NAMES to exclude doc-string false positives ("name" in create-slash-command.js template text) - extension-manifest.json: remove 'clear' (subcommand of logs/notifications, never a top-level pi.registerCommand) - packages/pi-agent-core/src/db/sf-db.ts: fix 23 noVoidTypeReturn errors - openDatabase: void → boolean (caller uses return value at line 5625) - claimEscalationOverride: void → boolean (caller checks at escalation.js:243) - resolveSelfFeedbackEntry: void → boolean (caller checks at self-feedback.js:387) - copyWorktreeDb: void → boolean (caller checks at reconcileWorktreeDb) - compactUokMessages: void → {before,after} (caller returns value at message-bus.js:238) - insertSessionTurn: void → bigint|null (caller uses id at session-recorder.js:104) - expireStaleMemories: void → number (caller uses count at auto-start.js:1047) - deleteMemorySourceRow: void → boolean (caller returns value at memory-source-store.js:107) - deleteMemoryEmbedding: void → boolean (caller returns value at memory-embeddings.js:328) - updateBacklogItemStatus: remove dead return expression (callers discard value) - removeBacklogItem: remove dead return expression (callers discard value) - updateGateCircuitBreaker: remove dead return {total,avgMs,...} (wrong-type code accidentally merged from getGateLatencyStats, never reachable) - markUokMessageRead: remove dead return true/false (callers discard value) - Auto-fix formatting and organizeImports in ~30 source files (biome --write) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-11 04:02:31 +02:00
Mikael Hugo	e50321b62b	feat(selection): thread unitType + failure_mode into fallback outcome records - FallbackResolver.setUnitContext() stores {unitType,unitId} from autonomous dispatch - run-unit.js calls pi.setFallbackUnitContext() before/after each unit - _findAnyAvailableFallback uses real unitType/unitId from context, not sentinel - Schema v59: failure_mode column in llm_task_outcomes - insertLlmTaskOutcome accepts failure_mode (rate_limit, quota_exhausted, auth_error) - register-hooks.js passes event.classification.reason as failure_mode - register-hooks.js uses real event.unitId when available - ExtensionRuntimeActions.setFallbackUnitContext added to pi API surface Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-10 23:14:22 +02:00
Mikael Hugo	009651e86f	feat(selection): wire before_model_select into FallbackResolver for outcome-aware fallback When a model fails and FallbackResolver picks a replacement, it now: 1. Fires the before_model_select hook with reason='fallback' and the failing model's ID — the learning system records the failure outcome and returns the best Bayesian-blended replacement from llm_task_outcomes 2. Falls back to the existing heuristic sort (reasoning + context window) if the hook is unavailable or returns no override Changes: - BeforeModelSelectEvent: add optional currentModelId and reason fields - FallbackResolver: accept emitBeforeModelSelect in constructor; make _findAnyAvailableFallback async; fire hook before heuristic fallback - agent-session.ts: inject lazy emitBeforeModelSelect closure into resolver - register-hooks.js: record failure outcome when reason='fallback' before returning selectLearnedModel result Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-10 23:05:33 +02:00
Mikael Hugo	fb1bd3e5fa	refactor(shared): deduplicate shared/ utilities against coding-agent package exports - Add packages/coding-agent/src/utils/format.ts as the canonical source for formatDuration, formatTokenCount, truncateWithEllipsis, sparkline, formatDateShort, fileLink, stripAnsi, normalizeStringArray — all already exported from @singularity-forge/coding-agent via index.ts. - Convert shared/format-utils.js to a compatibility shim that re-exports the 8 functions from @singularity-forge/coding-agent. All 13 importers continue to work with no import changes required. - Convert shared/path-display.js to a compatibility shim that re-exports toPosixPath from @singularity-forge/coding-agent. Implementation in packages/coding-agent/src/utils/path-display.ts was already canonical. - shared/frontmatter.js is intentionally NOT shimmed: splitFrontmatter/ parseFrontmatterMap have a different API from the package's parseFrontmatter/ stripFrontmatter (flat-map vs {frontmatter, body} object). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-10 22:41:03 +02:00
Mikael Hugo	7227912a29	perf(search): move web-search provider injection from extension hook to native middleware - Create packages/coding-agent/src/core/providers/web-search-middleware.ts with WebSearchMiddleware class: injects web_search tool, enforces session budget (#1309), strips thinking blocks from history, and respects PREFERENCES.md search_provider. - Wire webSearchMiddleware.applyToPayload into sdk.ts onPayload callback (before extension hook dispatch) so injection runs as compiled TypeScript with zero jiti-dispatch overhead. - Export WebSearchMiddleware, webSearchMiddleware singleton, setPreferBraveResolver, CUSTOM_SEARCH_TOOL_NAMES, MAX_NATIVE_SEARCHES_PER_SESSION, and stripThinkingFromHistory from @singularity-forge/coding-agent so the extension can delegate to the same instance. - Refactor search-the-web/native-search.js: remove self-contained injection logic; import and delegate before_provider_request to webSearchMiddleware singleton. Use tri-state isAnthropicProvider (null/false/true) to synthesize a provider hint when event.model is absent but model_select has already fired — prevents the model-name heuristic from wrongly injecting into Copilot claude-* requests. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-10 22:37:42 +02:00
Mikael Hugo	3fba4bcb03	refactor(mcp): move MCP connection manager to packages/coding-agent/src/core/mcp/ - Create config.ts with McpServerConfig types and readMcpConfigs/getServerConfig - Create auth.ts with buildHttpTransportOpts and createCliOAuthProvider - Create connection-manager.ts with McpConnectionManager class - Create index.ts re-exporting the public API - Export McpConnectionManager and helpers from @singularity-forge/coding-agent - Rewrite mcp-client extension as thin wrapper using McpConnectionManager - Rewrite auth.js as re-export shim from @singularity-forge/coding-agent - Update test to import buildHttpTransportOpts from @singularity-forge/coding-agent Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-10 22:19:46 +02:00
Mikael Hugo	a77e1551d2	refactor(memory): consolidate memory system, remove dead code - Delete memory-backfill.js — not imported anywhere, dead code - Rename memory-sleeper.js → tool-watchdog.js — misnamed; it is a tool-output watchdog with no relation to the memory store - Collapse memory-embeddings-llm-gateway.js into memory-embeddings.js — removes the lazy-import split; loadGatewayConfigFromEnv, createGatewayEmbedFn, and rerankCandidates are now direct exports - Remove buildEmbeddingFn() dead stub (always returned null) - Enable packages/coding-agent memory extraction extension by default (memory.enabled ?? true) so session-level extraction is active - Update all import sites and tests Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-10 18:17:49 +02:00
Mikael Hugo	3ffd882c8c	sf snapshot: uncommitted changes after 56m inactivity	2026-05-10 17:16:30 +02:00
Mikael Hugo	924383b6f7	sf snapshot: uncommitted changes after 197m inactivity	2026-05-10 15:59:33 +02:00
Mikael Hugo	d447095bd7	build: switch full build pipeline to TypeScript 7 native (tsgo) Replace tsc with tsgo in all build scripts — 5.6x faster emit. tsgo has full emit parity for this codebase (NodeNext, ES2022, strict). - build:core: tsc → tsgo (root tsconfig.json) - copy-resources.cjs: typescript/bin/tsc → @typescript/native-preview/bin/tsgo.js - All workspace packages (agent-core, ai, coding-agent, daemon, google-gemini-cli-provider, native, rpc-client, tui): tsc → tsgo Benchmarks (root project): tsc --project tsconfig.json: 7.7s tsgo --project tsconfig.json: 1.4s (5.6x faster) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-10 11:58:58 +02:00
Mikael Hugo	e09eb8f899	build: add TypeScript 7 (native preview) for fast type checking - Remove vestigial experimentalDecorators/emitDecoratorMetadata from all package tsconfigs (no actual decorators in source — flags were from pi-mono vendor copy) - Add @typescript/native-preview for 8-10x faster type checking (measured 4.6x on this repo: tsc 6.5s vs tsgo 1.4s) - Fix tsconfig.extensions.json: remove baseUrl (removed in tsgo/TS7) and use relative paths in paths mappings — compatible with both tsc and tsgo - Add typecheck/typecheck:extensions scripts using tsgo Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-10 11:53:22 +02:00
Mikael Hugo	cab8b5decc	refactor: strip internal pi branding (Phase 2A) - CURSOR_MARKER: \x1b_pi:c\x07 → \x1b_sf:c\x07 - process.title: "pi" → "sf" - PiManifest → SFManifest (with pi field backwards compat) - readPiManifest → readSFManifest (loader.ts and package-manager.ts) - readPiManifestFile → readSFManifestFile (package-manager.ts) - .pi/skills → .sf/skills (keeps .pi/skills for backwards compat) - User-facing path strings updated to .sf/ where appropriate - ARCHITECTURE.md: "Pi coding-agent extension" → "coding-agent extension" - Temp editor file: pi-editor-.pi.md → sf-editor-.sf.md - Test fixtures: appName "pi" → "sf", pi manifest field → sf Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-10 11:50:55 +02:00
Mikael Hugo	6725a55591	feat(web): add error boundaries, expand test coverage, add README - Add class-based ErrorBoundary component wrapping all 7 main views inside WorkspaceChrome; fallback shows view name, error, reload button - Add 30 new unit tests (boot null-project path × 9, onboarding pure-function logic × 21); all 43 web/lib tests pass - Add web/README.md: architecture, auth flow, 7 views, dev setup, API route pattern, test instructions Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-10 11:24:40 +02:00
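
Commit ca7368e5f1 in the log above describes the bash tool's timeout fallback as `effectiveTimeout = timeout ?? resolvedDefaultTimeout`, with a built-in default of 120 seconds. A minimal sketch of that fallback chain follows. The function name `resolveEffectiveTimeout` and the trimmed-down `BashToolOptions` shape are assumptions for illustration, not the repository's actual code, and the settings-based "disable" path mentioned in the commit is not modelled because the log does not say how it is encoded.

```typescript
// Built-in guard named in the commit message.
const BUILT_IN_DEFAULT_TIMEOUT_SECS = 120;

// Hypothetical, reduced options shape; the real BashToolOptions has more fields.
interface BashToolOptions {
  defaultTimeoutSeconds?: number; // override supplied when the tool is created
}

// Hypothetical helper: picks the timeout that will actually be enforced.
// Precedence: per-call timeout, then creation-time default, then built-in 120s.
function resolveEffectiveTimeout(
  timeout: number | undefined,
  options: BashToolOptions = {},
): number {
  const resolvedDefaultTimeout =
    options.defaultTimeoutSeconds ?? BUILT_IN_DEFAULT_TIMEOUT_SECS;
  return timeout ?? resolvedDefaultTimeout;
}

// A call without a timeout gets the built-in guard.
console.log(resolveEffectiveTimeout(undefined));
// An explicit per-call timeout always wins over any default.
console.log(resolveEffectiveTimeout(30, { defaultTimeoutSeconds: 300 }));
```

Using `??` rather than `||` means only `undefined` or `null` triggers the fallback, so an explicit numeric value passed by the caller is never silently replaced by the default.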

36 commits
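
Commit 7ea41b89ae in the log above states that providers send `model.wireModelId` on the wire when it is set and `model.id` otherwise, via a `resolveWireModelId` helper. The sketch below illustrates that rule only. The reduced `Model` shape, the helper's exact signature, and the model and deployment names are assumptions for illustration; the real interface and helper may differ.

```typescript
// Hypothetical, reduced Model shape; the real interface carries more fields.
interface Model {
  id: string; // canonical identity: selection, capability scoring, policy, history
  wireModelId?: string; // optional deployment alias sent to the provider
}

// Assumed signature: returns the identifier to place in the provider payload.
function resolveWireModelId(model: Model): string {
  return model.wireModelId ?? model.id;
}

// Made-up names, used only to show both branches.
const canonical: Model = { id: "example-model" };
const aliased: Model = { id: "example-model", wireModelId: "example-deployment" };

console.log(resolveWireModelId(canonical)); // falls back to the canonical id
console.log(resolveWireModelId(aliased)); // uses the deployment alias
```

Both records keep the same canonical `id`, so history and scoring stay keyed to one identity while the request payload can target a specific deployment.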