Commit graph

4341 commits

Author SHA1 Message Date
Mikael Hugo
ca7368e5f1 fix(bash): add 120s default timeout to prevent autonomous mode hangs
- Add BUILT_IN_DEFAULT_TIMEOUT_SECS = 120 constant to bash tool
- Compute effectiveTimeout = timeout ?? resolvedDefaultTimeout so LLM
  calls without a timeout get the 120s guard automatically
- Add defaultTimeoutSeconds? to BashToolOptions for override at creation
- Dynamic bashSchemaWithDefault describes the actual default in the LLM
  tool description, improving model awareness
- Add BashSettings interface + getBashDefaultTimeoutSeconds() to
  SettingsManager so users can override or disable via settings.json
- Wire defaultTimeoutSeconds into agent-session.ts _buildRuntime()

Root cause: npx sf --help triggered npm package download, hanging for
4+ minutes without timeout, consuming entire autonomous run budget.
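The timeout resolution described above can be sketched roughly as follows. This is a minimal illustration, not the actual bash tool code: only `BUILT_IN_DEFAULT_TIMEOUT_SECS`, `defaultTimeoutSeconds`, and the `timeout ?? resolvedDefaultTimeout` expression come from the commit message. The function names, and the assumption that a non-positive setting disables the guard, are hypothetical.

```javascript
// Sketch only: helper names are hypothetical; the constant and the `??`
// expression are taken from the commit message.
const BUILT_IN_DEFAULT_TIMEOUT_SECS = 120;

// Resolve the default once, at tool creation time.
// Assumption: a configured value of 0 or less disables the default guard.
function resolveDefaultTimeout(options = {}) {
  const configured = options.defaultTimeoutSeconds;
  if (configured === undefined) return BUILT_IN_DEFAULT_TIMEOUT_SECS;
  return configured > 0 ? configured : undefined;
}

// Per call: an explicit timeout wins; otherwise the resolved default applies.
function effectiveTimeoutFor(timeout, resolvedDefaultTimeout) {
  return timeout ?? resolvedDefaultTimeout;
}
```

Under these assumptions, a call that passes no timeout gets the 120s guard, while a call that passes its own value keeps it.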

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-11 19:12:33 +02:00
Mikael Hugo
7ef58422b1 TODO: feature requests for batch backlog ingestion + probe-based resolution
Real dogfood for the auto-triage feature: this is the unstructured dump
that the autonomous cycle should pick up and process into proper backlog
items the next time it runs. Until auto-triage is wired up, the contents
serve as a written spec for what's needed.

Two flagship features:

- Auto-triage TODO.md on each autonomous cycle. `commands-todo.js`
  already implements `/todo triage` (manual). Wire it to the autonomous
  orchestrator and skip when TODO.md == _EMPTY_TODO.

- When the LLM would ask a clarifying question, replace with parallel
  combatant + partner probes (adversarial-challenge + collaborative-
  research) and only fall back to asking a human if probes diverge AND
  interactive mode is available. This unblocks unattended
  `headless new-milestone` (the gap that blocked batch backlog
  ingestion today).

Plus five smaller items (headless milestone stall fix, bulk
import-roadmap, TTY-free plan list, hand-authorable milestone scaffold,
discoverable --answers schema) carried over from the
centralcloud-ops SF-IMPROVEMENTS.md observations.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-11 19:09:26 +02:00
Mikael Hugo
4e5fc12e81 feat(sf): fix gate health — import, DB fallback, and enrich status uok
Three follow-up fixes from S03/T04:

1. gate-runner.js: add missing getDistinctGateIds import from sf-db.js.
   UokGateRunner.getHealthSummary() called it when registry was empty but
   it was never imported — runtime ReferenceError in headless contexts.

2. sf-db-gates.js: getDistinctGateIds + getGateRunStats fall back to the
   quality_gates DB table when no trace events are found (e.g. after trace
   file rotation). Ensures gate health survives trace cleanup.

3. headless-uok-status.ts: replace generic Type column with real Scope
   (task/slice/milestone) from quality_gates DB, and show actual Last
   Evaluated timestamp from DB even when outside the 24h stats window.
   Tests updated to match (21 pass).

Closes backlog items: bl-gate-runner-import-bug, bl-gate-stats-trace-vs-db,
bl-uok-status-enrich.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-11 18:47:42 +02:00
Mikael Hugo
797db16ae8 feat(sf): S03/T04 — add UOK gate health to sf headless status uok
Adds a new `sf headless status uok` subcommand that queries
gate-run stats and circuit-breaker state from sf.db and formats
them as a markdown table or JSON (--json flag).

- src/headless-uok-status.ts: handler that loads sf-db-gates
  directly (avoids the unimported getDistinctGateIds in gate-runner)
- src/headless.ts: bypass RPC, route 'status uok' to handler
- src/help-text.ts: document the new subcommand
- tests/headless-uok-status.test.mjs: coverage via 19 node:test tests

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-11 18:31:03 +02:00
Mikael Hugo
4132ecc1db feat(sf): S03/T03 — wire OutcomeLearningGate into adaptive verification policy
Adds adaptive-verification-policy.js which reads OutcomeLearningGate
trace events from the last 24h and adjusts verification_max_retries /
verification_auto_fix in project preferences:
- >60% verification/artifact/execution failures → reduce retries to 1, disable auto-fix
- 0% failures across ≥5 samples → bump retries (capped at 3)
- all other cases → no change (returns null)

Wires into auto-verification.js after OutcomeLearningGate runs when
outcomeLearning flag is enabled. Includes 12 node:test tests.
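The three rules above can be read as a small pure function. The sketch below is an interpretation, not the contents of adaptive-verification-policy.js: the function name, the input shape, and the choice to bump retries by exactly one are assumptions; the thresholds and preference keys are from the commit message.

```javascript
// Hypothetical sketch of the decision rules; input shape is assumed.
const MAX_RETRIES_CAP = 3;

function adjustVerificationPolicy({ samples, failures, currentRetries }) {
  if (samples === 0) return null; // nothing observed in the window

  const failureRate = failures / samples;

  // More than 60% failures: back off to one retry and disable auto-fix.
  if (failureRate > 0.6) {
    return { verification_max_retries: 1, verification_auto_fix: false };
  }

  // No failures across at least 5 samples: bump retries, capped at 3.
  // Assumption: the bump is +1 per evaluation.
  if (failures === 0 && samples >= 5) {
    return {
      verification_max_retries: Math.min(currentRetries + 1, MAX_RETRIES_CAP),
    };
  }

  // All other cases: no change.
  return null;
}
```

Returning null for the "no change" case lets the caller skip writing preferences entirely.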

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-11 17:40:22 +02:00
Mikael Hugo
7b225696cc feat(sf): add cross-slice and milestone integrity checks to post-execution checks
- Add checkCrossSliceConsistency() to detect key_file conflicts across slices
- Add checkMilestoneIntegrity() to verify completed slices have summaries
  and no active requirements are orphaned
- Extend runPostExecutionChecks() signature with optional milestoneId
  and allSliceTasks parameters
- Wire cross-slice task gathering into auto-verification.js call site
- Add comprehensive node:test suite for both new checks
2026-05-11 17:22:11 +02:00
Mikael Hugo
338c75fc6f refactor: complete rf-01/rf-02/rf-11 blocked todos
rf-01: add ECONNREFUSED to isTransientNetworkError in anthropic-shared.ts,
  aligning with the NETWORK_RE pattern in error-classifier.js

rf-02: add scripts/validate-model-cost-table.mjs to report coverage gaps
  and price divergence between model-cost-table.js and models.generated.ts;
  add 'validate-cost-table' script to package.json

rf-11: extract 10 pure resource-display utility functions from
  interactive-mode.ts into packages/coding-agent/src/modes/interactive/
  resource-display.ts, reducing interactive-mode.ts by ~282 lines

All 4375 tests pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-11 16:45:39 +02:00
Mikael Hugo
0aaf8f2c0e refactor: split state.js into state-shared/db/legacy modules
state.js was a 2012-line monolith combining shared helpers, DB-backed
derivation, and legacy filesystem derivation. Split into four files:

- state-shared.js (114 lines): helpers used by both DB and legacy paths
  isGhostMilestone, isSliceComplete, isMilestoneComplete, isValidationTerminal,
  readMilestoneValidationVerdict, loadTerminalSummary, stripMilestonePrefix,
  canonicalMilestonePrefix, extractContextTitle

- state-db.js (841 lines): deriveStateFromDb() and its exclusive helpers
  reconcileDiskToDb, buildRegistryAndFindActive, handleNoActiveMilestone,
  handleAllSlicesDone, resolveSliceDependencies, reconcileSliceTasks,
  detectBlockers, checkReplanTrigger, checkInterruptedWork

- state-legacy.js (895 lines): _deriveStateImpl() — filesystem-only path

- state.js (228 lines): thin barrel — invalidateStateCache, getActiveMilestoneId,
  deriveState, re-exports from sub-modules

All 1195 tests pass. No behavior change.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-11 16:25:20 +02:00
Mikael Hugo
1adc7f119c refactor(rf-06): split auto/phases.js into per-phase modules
3538-line monolith → 6 focused modules + thin barrel:
- phases-helpers.js (223 lines): shared helpers (generateMilestoneReport,
  closeoutAndStop, emitCancelledUnitEnd, maybeFireProductAudit,
  _resolveReportBasePath, recordLearningOutcomeForUnit)
- phases-dispatch.js (486 lines): runDispatch + assessUokDiagnosticsDispatchGate
- phases-guards.js (497 lines): runGuards + guard helpers
- phases-pre-dispatch.js (760 lines): runPreDispatch
- phases-unit.js (1477 lines): runUnitPhase + session timeout state
- phases-finalize.js (542 lines): runFinalize
- phases.js (13 lines): barrel re-export preserving original import surface

Removed dead runPhaseReview export (zero callers confirmed).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-11 15:14:49 +02:00
Mikael Hugo
aa6ecce384 refactor: fix all remaining inline error ternaries across 20 files
Used perl regex to replace all patterns of the form
  X instanceof Error ? X.message : String(X)
with getErrorMessage(X) for any variable name.

Added getErrorMessage imports to 6 files that lacked it.
Leaves only 2 intentional .stack || .message variants unchanged.
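For reference, a helper equivalent to the replaced ternary would look roughly like this. It is a sketch inferred from the pattern quoted above; the real implementation in error-utils.js is not shown in this log and may handle more cases.

```javascript
// Sketch: the same behavior as the inline ternary it replaces.
function getErrorMessage(err) {
  return err instanceof Error ? err.message : String(err);
}
```

Call sites then shrink from the ternary form to a single `getErrorMessage(x)` call, whatever the variable is named.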

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-11 14:50:01 +02:00
Mikael Hugo
dac14043cd refactor: consolidate remaining error ternaries (error variable)
Replace all remaining inline error ternaries using the 'error' variable name
with getErrorMessage(error). Added imports to 3 files that lacked it.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-11 14:48:28 +02:00
Mikael Hugo
04322f110a refactor: replace all inline error message ternaries with getErrorMessage()
Eliminates ~120 repetitions of `err instanceof Error ? err.message : String(err)`
across the entire extension source tree. All callers now import and use
`getErrorMessage` from the canonical `./error-utils.js`.

Files updated (56 files):
- auto.js, auto-worktree.js, auto-recovery.js, auto-dashboard.js, auto-timers.js
- auto-prompts.js, auto-start.js, auto-post-unit.js, auto-model-selection.js
- auto/phases.js, auto/loop.js, auto/infra-errors.js
- autonomous-solver-eval.js, bootstrap/agent-end-recovery.js, bootstrap/db-tools.js
- bootstrap/exec-tools.js, bootstrap/journal-tools.js, bootstrap/register-extension.js
- bootstrap/register-hooks.js, canonical-milestone-plan.js, changelog.js
- clean-root-preflight.js, code-intelligence.js, commands-add-tests.js
- commands-debug.js, commands-eval-review.js, commands-handlers.js
- commands-maintenance.js, commands-pr-branch.js, commands-scan.js, commands-ship.js
- commands-todo.js, commands-worktree.js, definition-io.js, doctor.js
- doctor-config-checks.js, doctor-engine-checks.js, ecosystem/loader.js
- eval-review-schema.js, exec-sandbox.js, execution-instruction-guard.js
- graph-context.js, hook-emitter.js, index.js, learning/runtime.js
- lifecycle-hooks.js, onboarding-state.js, orphan-worktree-sweep.js
- planning-depth.js, quick.js, scaffold-keeper.js, sf-db/sf-db-core.js
- slice-cadence.js, sm-client.js, spec-projections.js, subagent/background-jobs.js
- subagent/isolation.js, sync-scheduler.js, tools/exec-tool.js
- tools/sift-search-tool.js, tools/workflow-tool-executors.js, ui/index.js
- uok/a2a-agent-server.js, uok/auto-dispatch.js, uok/auto-unit-closeout.js
- uok/auto-verification.js, uok/chaos-monkey.js, uok/gate-runner.js
- vault-resolver.js, workflow-install.js, workflow-plugins.js, worktree-manager.js
- worktree-resolver.js

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-11 14:46:30 +02:00
Mikael Hugo
8a7f6de782 refactor: centralize skills directory constants in skill-discovery.js
Export SKILLS_DIR, CLAUDE_SKILLS_DIR, PI_SKILLS_DIR from skill-discovery.js
instead of repeating join(homedir(), ...) inline across 5 files.

Consumers updated:
- preferences-skills.js: replace 2 inline join(homedir()...) with SKILLS_DIR/CLAUDE_SKILLS_DIR
- skill-health.js: replace 2 inline join(homedir()...) with constants; remove homedir import
- skill-catalog.js: replace 2 inline join(homedir()...) with constants; remove homedir import
- skill-telemetry.js: replace 4 inline join(homedir()...) with constants; remove homedir import

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-11 14:39:10 +02:00
Mikael Hugo
ec224f96ac refactor: replace all process.env.HOME/.sf patterns with sfHome()
- guided-flow.js: SF-WORKFLOW.md path now uses sfHome()
- commands-config.js: both auth.json path sites use sfHome()

Eliminates the last 3 inline ~/.sf path patterns; all .sf paths
now route through sfHome() which respects SF_HOME env override
and uses the platform-safe homedir() fallback.
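A minimal sketch of what such a resolver might look like, assuming the behavior stated above (SF_HOME override, homedir() fallback). The `env` parameter and the `agentDir` helper are illustrative additions, and the `resolve()` call is an assumption based on the canonicalization mentioned for sf-home.js elsewhere in this log.

```javascript
const { homedir } = require("node:os");
const { join, resolve } = require("node:path");

// Assumed shape: SF_HOME wins when set; otherwise fall back to ~/.sf.
// `env` is injectable here only to make the sketch easy to exercise.
function sfHome(env = process.env) {
  return resolve(env.SF_HOME || join(homedir(), ".sf"));
}

// Hypothetical caller: append only the sub-path, never another ".sf" segment.
function agentDir(env = process.env) {
  return join(sfHome(env), "agent");
}
```

Routing every path through one function means an SF_HOME override cannot be applied at one call site and missed at another.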

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-11 14:34:08 +02:00
Mikael Hugo
d3d7342370 refactor: use sfHome() for SF-WORKFLOW.md paths and skills dir; deduplicate errorMessage
- commands-handlers.js: replace process.env.HOME/.sf/agent/SF-WORKFLOW.md with sfHome() at both call sites (lines 62 and 412)
- skills/directory.js: replace process.env.HOME/.sf/skills with sfHome()
- tools/tool-helpers.js: remove duplicate errorMessage implementation; re-export getErrorMessage from error-utils.js under the errorMessage alias

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-11 14:32:08 +02:00
Mikael Hugo
181a19ac65 refactor: wire worktree-session-state.js and auto-runtime-state.js
Instead of deleting these planned-extraction modules, implement them
properly:

worktree-session-state.js:
- Upgraded to canonical module with JSDoc, node:path imports
- Fixed getActiveWorktreeName() to use normalize/join/basename (was
  using fragile string.replaceAll + split('/') approach)
- Fixed ensureWorktreeOriginalCwdFromPath() to use sep instead of regex
- worktree-command.js now imports/re-exports all state functions from
  this module and removes its local 'let originalCwd = null'
- registerWorktreeCommand() recovery logic replaced with
  ensureWorktreeOriginalCwdFromPath() call

auto-runtime-state.js:
- Fixed to use getAutoSession() singleton instead of 'new AutoSession()'
  (was creating an isolated instance disconnected from auto.js state)
- auto.js now re-exports isAutoActive, isAutoPaused, markToolStart,
  markToolEnd from this module, removing duplicate implementations
- All state reads in auto-runtime-state.js delegate to the same
  singleton that auto.js manages

Test: updated worktree-fixes.test.mjs guard to match clearWorktreeOriginalCwd()

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-11 14:24:50 +02:00
Mikael Hugo
5be5d6d438 refactor: remove two dead files never wired to any consumer
- worktree-session-state.js: planned extraction for worktree originalCwd
  state; worktree-command.js kept its own module-level var and never
  imported this file. Dead since creation in 47c806d73.

- auto-runtime-state.js: planned extraction of isAutoActive/isAutoPaused
  and AutoSession wrapper; auto.js already exports all the same functions.
  No file in the codebase imported auto-runtime-state.js.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-11 14:16:09 +02:00
Mikael Hugo
e18a0001bb refactor(sf-ext): remove local sfHome() clone in preferences.js
preferences.js had its own copy of sfHome() (without resolve() canonicalization).
Replace with import from sf-home.js — single source of truth.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-11 14:12:11 +02:00
Mikael Hugo
90dc3c6798 refactor(sf-ext): split sf-db.js (9073 lines) into 18 domain modules
sf-db.js is now a pure barrel re-export. All logic lives in sf-db/:

- sf-db-core.js       — adapter, schema, transactions, shared helpers
- sf-db-mode-state.js — Ask/Build/YOLO mode state
- sf-db-decisions.js  — ADR / decision records
- sf-db-artifacts.js  — file artifacts and attachments
- sf-db-milestones.js — milestone CRUD
- sf-db-slices.js     — slice CRUD
- sf-db-tasks.js      — task CRUD
- sf-db-worktree.js   — worktree state
- sf-db-evidence.js   — retrieval evidence
- sf-db-spec.js       — spec/contract records
- sf-db-gates.js      — UOK gate records
- sf-db-uok.js        — unit-of-knowledge state
- sf-db-session-store — session store / FTS
- sf-db-backlog.js    — backlog items
- sf-db-learning.js   — model learning / performance
- sf-db-memory.js     — memory / embeddings
- sf-db-profile.js    — user profile
- sf-db-self-feedback — self-feedback triage

sf-db/index.js re-exports sf-db.js for backward compat.
All 4375 tests pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-11 13:51:44 +02:00
Mikael Hugo
756355abf1 refactor(sf-ext): replace inline sfHome patterns with canonical sfHome()
Fix bug in auto.js where SF_HOME env var caused double '.sf' path segment.
Convert 11 files from inline homedir()/.sf or SF_HOME constructs to sfHome().

Files updated:
- auto.js: bug fix (join(SF_HOME, '.sf', 'agent') → join(sfHome(), 'agent'))
- key-manager.js: process.env.SF_HOME || join(HOME, '.sf') → sfHome()
- ui/color-band.js: os.homedir()/.sf → sfHome(); remove os import
- ui/prompt-history.js: homedir()/.sf → sfHome(); remove homedir import
- ui/usage-bar.js: homedir()/.sf/agent/auth.json → sfHome()
- ui/marketplace.js: 2 occurrences — extensions dir → sfHome()
- skill-telemetry.js: 2 occurrences — legacy skills dir → sfHome()
- preferences-skills.js: legacy skills dir → sfHome()
- preferences-models.js: models.json path → sfHome()
- memory-embeddings.js: auth.json path → sfHome(); remove homedir import
- commands/handlers/core.js: dynamic import homedir → static sfHome()

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-11 10:45:35 +02:00
Mikael Hugo
0ece0e5413 refactor(sf-ext): consolidate sfHome, counters, tool helpers, settings path, post-mutation hook
- rf2-01: replace 23 inline `process.env.SF_HOME || join(homedir(), '.sf')` patterns
  across 19 files with canonical `sfHome()` from sf-home.js; removes 5 private
  sfHome/getSfHome function definitions and unused os/homedir imports
- rf2-05: extract `ensureWritableParent` and `errorMessage` from complete-task.js
  and complete-slice.js into new tools/tool-helpers.js
- rf2-06: add `runPostMutationHook` to tool-helpers.js; replace 8 identical
  try/catch blocks (plan-task, plan-slice, plan-milestone, replan-slice,
  reassess-roadmap, reopen-slice, reopen-task, reopen-milestone) with single call
- rf2-09: add `makeDiskCounter` factory in auto-dispatch.js; consolidate 4 counter
  functions (rewrite/uat get/set/increment) from duplicated if/else DB-vs-disk
  logic into thin factory wrappers (~35 lines removed)
- rf2-10: export `getSfAgentSettingsPath()` from preferences.js; update
  notifications/notify.js and permissions/permission-core.js to use it
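The counter consolidation in rf2-09 follows a common factory pattern, sketched below. This is not the `makeDiskCounter` code from auto-dispatch.js: the name `makeCounter`, its signature, and the Map-based `store` (standing in for the real DB-or-disk backend) are all hypothetical.

```javascript
// Hypothetical factory: `store` is any object with get(key)/set(key, value),
// standing in for the DB-vs-disk backend the commit describes.
function makeCounter(store, key) {
  const get = () => store.get(key) ?? 0;
  const set = (value) => {
    store.set(key, value);
  };
  const increment = () => {
    const next = get() + 1;
    set(next);
    return next;
  };
  return { get, set, increment };
}

// Two counters built from the same factory instead of duplicated functions.
const counterStore = new Map();
const rewriteCounter = makeCounter(counterStore, "rewrite");
const uatCounter = makeCounter(counterStore, "uat");
```

The backend choice lives in one place (the store), so each named counter becomes a thin wrapper.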

All 4375 unit tests pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-11 10:17:58 +02:00
Mikael Hugo
9dc244eb68 refactor: rf-10/rf-03 ask-gate wiring and skills frontmatter consolidation
- rf-10: Wire gateAskUserQuestions (ask-gate.js) into ask-user-questions execute() via dynamic import; blocks autonomous ask_user_questions calls at tool layer
- rf-03: Replace FRONTMATTER_RE + manual body extraction in skills/frontmatter.js with shared splitFrontmatter(); keep custom parseYaml() for skill-specific YAML handling

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-11 09:09:24 +02:00
Mikael Hugo
9756edfe0b refactor: rf-09/rf-08/rf-12/rf-05 cleanup and deduplication
- rf-09: Remove isTransientNetworkError from preferences-models.js/preferences.js/preferences-models.d.ts (canonical is error-classifier.js)
- rf-08: Extract Gemini token counting to google-gemini-token-counter.js; update register-hooks.js import
- rf-12: Remove 3 dead _allRequirements/_allDecisions fetch blocks from db-writer.js
- rf-05: Extract resolveSfBin() and monitorNdjsonStdout() to spawn-worker.js; both orchestrators now import from there

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-11 08:59:51 +02:00
Mikael Hugo
96d751555f fix(lint): fix all pre-existing lint warnings (unused vars/imports/params)
- Prefix unused params/vars with _ in db-writer.js, system-context.js,
  record-promoter.js, a2a-transport.js
- Remove unused imports: createServer (a2a-agent-server.js),
  dirname/join/resolve (a2a-transport.js), KNOWN_PREFERENCE_KEYS (preferences.js)
- Remove unused private field _lastInputAt from pty-chat-parser.ts
- Prefix unused test variable currentProject in uok-metrics-exposition.test.mjs

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-11 08:32:30 +02:00
Mikael Hugo
64ddbd950f refactor(extensions): consolidate duplicate code into canonical modules
- Delete ghost package packages/pi-agent-core (no dist, no consumers,
  TS build errors; JS source sf-db.js had 3 commits not mirrored in TS)
- Remove build:pi-agent-core from root package.json build:pi pipeline
- Merge all models from MODEL_COST_PER_1K_INPUT into BUNDLED_COST_TABLE
  (model-cost-table.js is now the single canonical cost source)
- Remove duplicate MODEL_COST_PER_1K_INPUT object and getModelCost()
  from model-router.js; use lookupModelCost() from model-cost-table.js
- Replace hand-rolled isTransientNetworkError in preferences-models.js
  with delegation to classifyError() in error-classifier.js

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-11 08:28:49 +02:00
Mikael Hugo
5ea96143ca chore(todo): remove Cloudflare Workers AI provider task
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-11 04:16:01 +02:00
Mikael Hugo
0b5fa75c0d fix(lint): fix all pre-existing lint failures
- check-sf-extension-inventory.mjs: expand parseDirectRegisteredCommands()
  scan to include 7 more files (guards/inturn.js, notifications/notify.js,
  permissions/index.js, ui/usage-bar.js, commands/legacy/audit.js,
  commands/legacy/create-extension.js, commands/legacy/create-slash-command.js)
  and filter results by BASE_RUNTIME_COMMAND_NAMES to exclude doc-string false
  positives ("name" in create-slash-command.js template text)

- extension-manifest.json: remove 'clear' (subcommand of logs/notifications,
  never a top-level pi.registerCommand)

- packages/pi-agent-core/src/db/sf-db.ts: fix 23 noVoidTypeReturn errors
  - openDatabase: void → boolean (caller uses return value at line 5625)
  - claimEscalationOverride: void → boolean (caller checks at escalation.js:243)
  - resolveSelfFeedbackEntry: void → boolean (caller checks at self-feedback.js:387)
  - copyWorktreeDb: void → boolean (caller checks at reconcileWorktreeDb)
  - compactUokMessages: void → {before,after} (caller returns value at message-bus.js:238)
  - insertSessionTurn: void → bigint|null (caller uses id at session-recorder.js:104)
  - expireStaleMemories: void → number (caller uses count at auto-start.js:1047)
  - deleteMemorySourceRow: void → boolean (caller returns value at memory-source-store.js:107)
  - deleteMemoryEmbedding: void → boolean (caller returns value at memory-embeddings.js:328)
  - updateBacklogItemStatus: remove dead return expression (callers discard value)
  - removeBacklogItem: remove dead return expression (callers discard value)
  - updateGateCircuitBreaker: remove dead return {total,avgMs,...} (wrong-type
    code accidentally merged from getGateLatencyStats, never reachable)
  - markUokMessageRead: remove dead return true/false (callers discard value)

- Auto-fix formatting and organizeImports in ~30 source files (biome --write)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-11 04:02:31 +02:00
Mikael Hugo
65da855c5e refactor(state): extract loadTerminalSummary helper, dedup 5 fail-closed SUMMARY checks
The 'read SUMMARY → check if readable AND terminal' pattern appeared five
times in state.js after the Cluster F polarity fix. Extract it to a
private loadTerminalSummary(summaryFile, loadFn) helper so the fail-closed
semantics live in one place and can't drift between call sites.

- loadTerminalSummary returns the content if readable AND terminal, null otherwise
- All 5 call sites replaced: 2 in getActiveMilestoneId(), 3 in _deriveStateImpl()
- Phase 2 'no roadmap' case reuses returned content for parseSummary().title
- isTerminalMilestoneSummaryContent now only referenced inside the helper
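The fail-closed contract described above can be sketched as follows. The terminal-content check is a stand-in (the real `isTerminalMilestoneSummaryContent` is not shown in this log), and both the synchronous `loadFn` and the try/catch around it are assumptions.

```javascript
// Stand-in predicate: the real terminal check is not shown in the log.
function isTerminalSummaryContent(content) {
  return /status:\s*(complete|done)/i.test(content);
}

// Returns the content only if it is readable AND terminal; null otherwise.
// Assumption: loadFn is synchronous and may return null or throw.
function loadTerminalSummary(summaryFile, loadFn) {
  if (!summaryFile) return null;
  let content;
  try {
    content = loadFn(summaryFile);
  } catch {
    return null; // unreadable: fail closed
  }
  return content != null && isTerminalSummaryContent(content) ? content : null;
}
```

Because unreadable and non-terminal summaries both map to null, a call site cannot accidentally treat a missing read as a completed milestone.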

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-11 03:46:36 +02:00
Mikael Hugo
3f0a02fe13 chore(todo): mark Cluster F, Always Allow port, and Mermaid diagram as done
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-11 03:39:17 +02:00
Mikael Hugo
159c8b0c4d refactor(git-service): rename GitServiceImpl → GitService
No interface exists for the class, so the Impl suffix is vestigial
Java-style naming. Rename throughout: git-service.js, auto-start.js,
auto.js, worktree.js, worktree-detect.js, worktree-resolver.js,
quick.js, and the two test files that imported it directly.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-11 03:38:53 +02:00
Mikael Hugo
c1df4249b8 fix(state): Cluster F — fail-closed SUMMARY checks in state.js and dispatch-guard.js
Three fail-open bugs allowed unreadable (null) SUMMARY files to be treated as
terminal, incorrectly marking milestones as complete when the content could not
be read.

Gap 1 — dispatch-guard.js line 50:
  Any SUMMARY file existence = milestone complete (fail-open).
  Fix: DB-first check via getMilestone()+isClosedStatus(); filesystem fallback
  reads SUMMARY content and calls classifyMilestoneSummaryContent() so only
  non-failure summaries skip the milestone.

Gap 2 — state.js getActiveMilestoneId():
  'if (summaryFile) continue' skipped any milestone with ANY SUMMARY.
  'if (!summaryFile) return mid' fell through incorrectly for failure SUMMARYs.
  Fix: read content; only skip/continue if sc != null && isTerminal(sc).

Gap 3 — state.js _deriveStateImpl() Phase 1 + Phase 2:
  '!sc || isTerminalMilestoneSummaryContent(sc)' — null content = fail-open.
  Fix: 'sc && isTerminalMilestoneSummaryContent(sc)' — null content = fail-closed.
  Applied to all 6 occurrences (lines 1233, 1247, 1257, 1284, 1356, 1391).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-11 03:34:48 +02:00
Mikael Hugo
70afabedb7 refactor(uok): move auto-dispatch, auto-verification, auto-runaway-guard, auto-unit-closeout into sf/uok/
Per checkpoint-008/009 next-steps: these 4 autonomous-loop modules belong in
the UOK subsystem alongside the other orchestration primitives.

- auto-dispatch.js → uok/auto-dispatch.js
  - Dispatch table + resolveDispatch() is a core UOK orchestration primitive
  - Updated 3 static importers + 1 dynamic await import + 3 test files
- auto-verification.js → uok/auto-verification.js
  - Post-unit verification gate delegates to UOK gates (ChaosMonkey, Security,
    CostGuard, OutcomeLearning, etc.)
  - Updated 1 importer (auto.js)
- auto-runaway-guard.js → uok/auto-runaway-guard.js
  - Diagnostic budget guard; no local relative imports
  - Updated 4 importers (auto-timers.js, preferences-models.js, auto/phases.js,
    auto/run-unit.js)
- auto-unit-closeout.js → uok/auto-unit-closeout.js
  - Unit metrics snapshot + activity log + memory extraction helper
  - Updated 3 importers (auto-timers.js, auto-post-unit.js, auto.js)

Each original file is now a 1-line re-export shim preserving public API.
All 4 are added to uok/index.js as the UOK barrel.

26 dispatch tests pass; full unit suite 4374 tests pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-11 03:02:52 +02:00
Mikael Hugo
adb449d642 fix: consolidate extensions into sf, migrate kernel.ts, fix test suite
- Fold sf-usage-bar, sf-notify, sf-inturn-guard, sf-permissions,
  slash-commands into sf extension (ui/, notifications/, guards/,
  permissions/, commands/legacy/)
- Delete vectordrive extension
- Migrate uok/kernel.js to TypeScript (kernel.ts) with full interfaces
- Add allowJs/checkJs:false to tsconfig.resources.json for incremental TS migration
- Add symlink dedup to extension-discovery.ts (seenRealPaths Set)
- Add before_provider_request delegate back to native-search.js so
  session budget tests exercise the middleware end-to-end
- Fix parseSfNativeTools() to return all SF manifest tools (drop sf_ filter)
- Fix test assertions: plan_milestone/complete_task/validate_milestone
- Remove subagent from app-smoke.test.ts (folded into sf/subagent/)
- Remove sf-permissions/sf-inturn-guard/subagent from features-inventory test
- Fix resolveSearchProvider autonomous mode test to pass 'auto' explicitly
- Remove legacy /clear slash command (conflicts with built-in clear_terminal)
- Update web-command-parity-contract.test.ts for clear removal

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-11 02:40:52 +02:00
Mikael Hugo
24592507c3 sf snapshot: uncommitted changes after 53m inactivity
2026-05-11 01:54:55 +02:00
Mikael Hugo
852bf8c5aa sf snapshot: uncommitted changes after 78m inactivity
2026-05-11 01:01:03 +02:00
Mikael Hugo
605cd712be refactor: capability-tier isHeavyModelId, search provider registry, frontmatter_version field, schema docs
- preferences-models.js: replace 6-regex isHeavyModelId() with MODEL_CAPABILITY_TIER
  lookup + regex fallback for unknown models; new models in model-router.js
  are automatically reflected without touching preferences-models.js
- search-the-web/provider.js: replace ~200-line per-provider waterfall with
  PROVIDER_REGISTRY array + firstAvailable()/resolveWithFallback() helpers;
  preserves Tavily→Brave→Serper→Exa→Ollama→MiniMax auto-fallback order
- sf-db.js: bump SCHEMA_VERSION 58→60 (v59 now reachable); add
  frontmatter_version column to tasks table via v60 migration and CREATE
  TABLE definition; wire frontmatter_version into upsertTaskPlanning() SQL
  and .run() params
- task-frontmatter.js: add frontmatterVersion:1 to DEFAULT_TASK_FRONTMATTER,
  add validation block in validateTaskFrontmatter(), add frontmatterVersion
  mapping in taskFrontmatterFromRecord()
- sf-db-migration.test.mjs: update hardcoded version assertion 58→60
- docs/specs/sf-operating-model.md: add Planning Schema section documenting
  the 3-table model (milestones/slices/tasks, their PKs, spec tables, and
  ID naming conventions)
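The registry approach used for search-the-web/provider.js can be illustrated with a short sketch. Only the fallback order and the names `PROVIDER_REGISTRY`, `firstAvailable`, and `resolveWithFallback` come from the commit message; the entry shape, the function signatures, and the `configured` set are assumptions made for illustration.

```javascript
// Order mirrors the auto-fallback chain named in the commit message.
const PROVIDER_ORDER = ["tavily", "brave", "serper", "exa", "ollama", "minimax"];

// Hypothetical entry shape: each provider reports its own availability.
const PROVIDER_REGISTRY = PROVIDER_ORDER.map((name) => ({
  name,
  isAvailable: (configured) => configured.has(name),
}));

// First provider in registry order that is available, or null.
function firstAvailable(configured) {
  const entry = PROVIDER_REGISTRY.find((p) => p.isAvailable(configured));
  return entry ? entry.name : null;
}

// Use the preferred provider if available, otherwise walk the fallback chain.
function resolveWithFallback(preferred, configured) {
  const entry = PROVIDER_REGISTRY.find((p) => p.name === preferred);
  if (entry && entry.isAvailable(configured)) return entry.name;
  return firstAvailable(configured);
}
```

With this shape, adding a provider is one more array entry rather than another branch in a waterfall.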

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-10 23:42:29 +02:00
Mikael Hugo
b228bc9f5c feat(learning): weight failure_mode in Bayesian blender — rate_limit=0.7, quota=0.2, auth=0.0
- AGGREGATE_ONE/GROUPED_SQL: compute effective_success_rate with CASE WHEN failure_mode
- AggregatedStats: add effective_success_rate, hard_failure_count fields
- computeObservedScore: uses effective_success_rate when available; 0.5x penalty if >50% hard failures
- Tests: verify rate_limit ranked above quota_exhausted; hard failure penalty verified
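One way to read the weighting above is that each failure earns partial credit according to its failure_mode. The sketch below follows that reading and is not the project's SQL or scoring code: the in-memory aggregation, the treatment of successes as full credit, and the choice of which modes count as "hard" failures are all assumptions. The weights, field names, and 0.5x penalty are from the commit message.

```javascript
// Weights from the commit subject; mode names from the related schema commit.
const FAILURE_MODE_WEIGHT = { rate_limit: 0.7, quota_exhausted: 0.2, auth_error: 0.0 };

// Assumption: quota and auth failures are the "hard" ones.
const HARD_FAILURE_MODES = new Set(["quota_exhausted", "auth_error"]);

// In-memory stand-in for the SQL aggregation.
function aggregate(outcomes) {
  let credit = 0;
  let hardFailureCount = 0;
  for (const o of outcomes) {
    if (o.success) {
      credit += 1;
    } else {
      credit += FAILURE_MODE_WEIGHT[o.failure_mode] ?? 0;
      if (HARD_FAILURE_MODES.has(o.failure_mode)) hardFailureCount += 1;
    }
  }
  const total = outcomes.length;
  return {
    total,
    effective_success_rate: total > 0 ? credit / total : 0,
    hard_failure_count: hardFailureCount,
  };
}

// Apply the 0.5x penalty when more than half of the outcomes are hard failures.
function computeObservedScore(stats) {
  const penalized = stats.total > 0 && stats.hard_failure_count / stats.total > 0.5;
  return penalized ? stats.effective_success_rate * 0.5 : stats.effective_success_rate;
}
```

Under these assumptions, a model that only hit rate limits scores well above one that exhausted its quota, which matches the ranking the commit's tests check.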

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-10 23:20:33 +02:00
Mikael Hugo
2dea73398d fix(learning): add save_knowledge to manifest, failure_mode to aggregator SELECT + index
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-10 23:18:02 +02:00
Mikael Hugo
e50321b62b feat(selection): thread unitType + failure_mode into fallback outcome records
- FallbackResolver.setUnitContext() stores {unitType,unitId} from autonomous dispatch
- run-unit.js calls pi.setFallbackUnitContext() before/after each unit
- _findAnyAvailableFallback uses real unitType/unitId from context, not sentinel
- Schema v59: failure_mode column in llm_task_outcomes
- insertLlmTaskOutcome accepts failure_mode (rate_limit, quota_exhausted, auth_error)
- register-hooks.js passes event.classification.reason as failure_mode
- register-hooks.js uses real event.unitId when available
- ExtensionRuntimeActions.setFallbackUnitContext added to pi API surface

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-10 23:14:22 +02:00
Mikael Hugo
009651e86f feat(selection): wire before_model_select into FallbackResolver for outcome-aware fallback
When a model fails and FallbackResolver picks a replacement, it now:
1. Fires the before_model_select hook with reason='fallback' and the
   failing model's ID — the learning system records the failure outcome
   and returns the best Bayesian-blended replacement from llm_task_outcomes
2. Falls back to the existing heuristic sort (reasoning + context window)
   if the hook is unavailable or returns no override
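
The two steps above can be sketched roughly as follows. This is an illustration under stated assumptions: `ModelInfo`, `SelectHook`, and `sketchPickFallback` are hypothetical names, the hook is assumed to return a model ID or nothing, and the heuristic is assumed to prefer reasoning models first and larger context windows second.

```typescript
// Hypothetical shapes; the real FallbackResolver types are not shown in this log.
interface ModelInfo {
  id: string;
  reasoning: boolean;
  contextWindow: number;
}

interface SelectEvent {
  reason: "fallback";
  currentModelId: string;
}

type SelectHook = (event: SelectEvent) => Promise<string | undefined>;

async function sketchPickFallback(
  failedId: string,
  candidates: ModelInfo[],
  hook?: SelectHook,
): Promise<ModelInfo | undefined> {
  const available = candidates.filter((m) => m.id !== failedId);

  // Step 1: ask the hook (if any) for a learned replacement.
  if (hook !== undefined) {
    try {
      const overrideId = await hook({ reason: "fallback", currentModelId: failedId });
      const learned = available.find((m) => m.id === overrideId);
      if (learned !== undefined) return learned;
    } catch (_err) {
      // A failing hook must not block the fallback; continue to the heuristic.
    }
  }

  // Step 2: heuristic sort, reasoning models first, then larger context window.
  const sorted = [...available].sort(
    (a, b) =>
      Number(b.reasoning) - Number(a.reasoning) || b.contextWindow - a.contextWindow,
  );
  return sorted[0];
}
```

In this sketch an override naming an unavailable model, or the failed model itself, is ignored and the heuristic result is used.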

Changes:
- BeforeModelSelectEvent: add optional currentModelId and reason fields
- FallbackResolver: accept emitBeforeModelSelect in constructor; make
  _findAnyAvailableFallback async; fire hook before heuristic fallback
- agent-session.ts: inject lazy emitBeforeModelSelect closure into resolver
- register-hooks.js: record failure outcome when reason='fallback' before
  returning selectLearnedModel result

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-10 23:05:33 +02:00
Mikael Hugo
fb1bd3e5fa refactor(shared): deduplicate shared/ utilities against coding-agent package exports
- Add packages/coding-agent/src/utils/format.ts as the canonical source
  for formatDuration, formatTokenCount, truncateWithEllipsis, sparkline,
  formatDateShort, fileLink, stripAnsi, normalizeStringArray — all already
  exported from @singularity-forge/coding-agent via index.ts.

- Convert shared/format-utils.js to a compatibility shim that re-exports
  the 8 functions from @singularity-forge/coding-agent. All 13 importers
  continue to work with no import changes required.

- Convert shared/path-display.js to a compatibility shim that re-exports
  toPosixPath from @singularity-forge/coding-agent. Implementation in
  packages/coding-agent/src/utils/path-display.ts was already canonical.

- shared/frontmatter.js is intentionally NOT shimmed: splitFrontmatter/
  parseFrontmatterMap have a different API from the package's parseFrontmatter/
  stripFrontmatter (flat-map vs {frontmatter, body} object).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-10 22:41:03 +02:00
Mikael Hugo
7227912a29 perf(search): move web-search provider injection from extension hook to native middleware
- Create packages/coding-agent/src/core/providers/web-search-middleware.ts with
  WebSearchMiddleware class: injects web_search tool, enforces session budget (#1309),
  strips thinking blocks from history, and respects PREFERENCES.md search_provider.

- Wire webSearchMiddleware.applyToPayload into sdk.ts onPayload callback (before
  extension hook dispatch) so injection runs as compiled TypeScript with zero
  jiti-dispatch overhead.

- Export WebSearchMiddleware, webSearchMiddleware singleton, setPreferBraveResolver,
  CUSTOM_SEARCH_TOOL_NAMES, MAX_NATIVE_SEARCHES_PER_SESSION, and stripThinkingFromHistory
  from @singularity-forge/coding-agent so the extension can delegate to the same instance.

- Refactor search-the-web/native-search.js: remove self-contained injection logic;
  import and delegate before_provider_request to webSearchMiddleware singleton.
  Use tri-state isAnthropicProvider (null/false/true) to synthesize a provider hint
  when event.model is absent but model_select has already fired — prevents the
  model-name heuristic from wrongly injecting into Copilot claude-* requests.
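
One way to read the tri-state logic, as a hedged sketch: `sketchShouldInject`, its parameters, the `"anthropic"` provider string, and the `claude-` prefix check are illustrative assumptions, not the code in native-search.js.

```typescript
// null = model_select has not fired yet, so the provider is unknown.
type TriState = boolean | null;

function sketchShouldInject(
  eventProvider: string | undefined, // provider from event.model, when present
  modelName: string,
  isAnthropicProvider: TriState, // remembered from an earlier model_select
): boolean {
  // An explicit provider on the event always wins.
  if (eventProvider !== undefined) return eventProvider === "anthropic";
  // Otherwise use the remembered answer as a synthesized provider hint.
  if (isAnthropicProvider !== null) return isAnthropicProvider;
  // Only when nothing is known does the model-name heuristic apply.
  return modelName.startsWith("claude-");
}
```

With a remembered `false`, a `claude-*` model served through another provider is not treated as a native Anthropic request, which is the misfire the commit describes.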

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-10 22:37:42 +02:00
Mikael Hugo
a798aa1f6e feat(swarm): wire @a2a-js/sdk as real A2A transport for SF_A2A_ENABLED dispatch path
- Install @a2a-js/sdk v0.3.13 as a dependency
- Add a2a-transport.js: A2ATransport class with spawnAgent, dispatch,
  getOrSpawnAgent, and buildAgentCard; spawns pi subprocesses with
  SF_A2A_AGENT_* env vars and dispatches envelopes via A2A JSON-RPC
- Add a2a-agent-server.js: A2A HTTP server entrypoint for spawned agent
  processes; starts express + A2AExpressApp with DefaultRequestHandler,
  handles incoming DispatchEnvelopes via SwarmAgentExecutor, writes
  envelope to SQLite MessageBus, and signals readiness via stdout JSON
- Update swarm-dispatch.js: split dispatch() into _busDispatch()
  (existing SQLite path) and _a2aDispatch() (new A2A path); lazy-load
  A2ATransport singleton only when SF_A2A_ENABLED is set; default
  path unchanged for all existing callers
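
A rough sketch of the flag-gated split with a lazily created singleton. All names here (`Transport`, `getA2ATransport`, `sketchDispatch`, the string return values) are hypothetical stand-ins, the environment is passed in as a plain object to keep the example self-contained, and "set" is assumed to mean any non-empty value.

```typescript
interface Transport {
  send(envelope: string): string;
}

let a2aTransport: Transport | undefined;
let a2aConstructions = 0;

// Created on first use only, so the default path never pays for it.
function getA2ATransport(): Transport {
  if (a2aTransport === undefined) {
    a2aConstructions += 1;
    a2aTransport = { send: (envelope) => `a2a:${envelope}` };
  }
  return a2aTransport;
}

// Exposed so callers can observe how often the transport was built.
function a2aConstructionCount(): number {
  return a2aConstructions;
}

// Stand-in for the existing SQLite bus path.
function busDispatch(envelope: string): string {
  return `bus:${envelope}`;
}

function sketchDispatch(
  envelope: string,
  env: Record<string, string | undefined>,
): string {
  // Assumption: any non-empty value counts as "set".
  const enabled = (env["SF_A2A_ENABLED"] ?? "") !== "";
  return enabled ? getA2ATransport().send(envelope) : busDispatch(envelope);
}
```

Callers that never set the flag stay on the bus path and never construct the transport.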

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-10 22:33:01 +02:00
Mikael Hugo
3fba4bcb03 refactor(mcp): move MCP connection manager to packages/coding-agent/src/core/mcp/
- Create config.ts with McpServerConfig types and readMcpConfigs/getServerConfig
- Create auth.ts with buildHttpTransportOpts and createCliOAuthProvider
- Create connection-manager.ts with McpConnectionManager class
- Create index.ts re-exporting the public API
- Export McpConnectionManager and helpers from @singularity-forge/coding-agent
- Rewrite mcp-client extension as thin wrapper using McpConnectionManager
- Rewrite auth.js as re-export shim from @singularity-forge/coding-agent
- Update test to import buildHttpTransportOpts from @singularity-forge/coding-agent

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-10 22:19:46 +02:00
Mikael Hugo
9e484e67b7 refactor(sf): fold sf-tui extension into sf/ui/ — remove separate extension layer
sf-tui was a 'bundled' extension with zero features independent of the sf/
extension. Every hook, shortcut, tool, header and footer render depended
on sf/ internals (getAutoSession, isAutoActive, projectRoot,
getExperimentalFlag). The separation was artificial.

Changes:
- Moved all sf-tui/*.js into sf/ui/ (header, footer, git, color-band, emoji,
  prompt-history, marketplace, powerline, shared)
- Fixed imports: ../sf/ → ../ (one level up from ui/)
- Registered sf/ui/index.js from sf/index.js in a try/catch so a UI failure
  can't take out the core SF commands
- Merged sf-tui manifest entries (9 commands, 3 shortcuts, agent_start hook)
  into sf/extension-manifest.json
- Deleted src/resources/extensions/sf-tui/ entirely
- Fixed prompt-history.test.mjs import path

Result: one fewer extension to discover, load and validate at startup.
sf is now the single extension that owns both planning state and UI chrome.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-10 22:04:00 +02:00
Mikael Hugo
9e55528c95 revert(tui): remove Ink bridge, restore pure custom differential renderer
The Ink bridge added today was a misguided gradual-migration wrapper:
- Components still rendered via the old string-line protocol (no Ink layout)
- Decoded keys were re-encoded to escape sequences, which keys.ts then decoded again (double round-trip bug)
- The _useInk / _inkHandle path blocked TTY start unconditionally via a process.stdout.isTTY check

Removed: ink-bridge.tsx, ink-bridge.test.ts, useInk() method, _useInk/_inkHandle fields,
startInkRenderer import/export, Ink branch in start()/stop()/requestRender().

Removed ink and react from packages/tui dependencies and peerDependencies.
Reverted tsconfig.extensions.json jsx settings (only needed for the .tsx bridge file).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-10 21:38:54 +02:00
Mikael Hugo
8c764f6c98 fix(tsconfig): add jsx/jsxImportSource to tsconfig.extensions.json for tsgo compat
tsgo (TS7 native port) requires explicit jsx setting when .tsx files are
in scope. tsc 6 was lenient; tsgo errors without it.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-10 21:31:53 +02:00
Mikael Hugo
702ec3fc0e refactor(sf): rename guidance files TASTE.md→STYLE.md, ANTI-GOALS.md→NON-GOALS.md
More self-explanatory names. No behavioral change — same files, same purpose.

- .sf/TASTE.md → .sf/STYLE.md (# Taste → # Style)
- .sf/ANTI-GOALS.md → .sf/NON-GOALS.md (# Anti-goals → # Non-goals)
- All code references updated: auto-bootstrap-context, system-context,
  gitignore, milestone-framing-check, scaffold-constants, spec-projections
- Section headings injected into agent context updated to match

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-10 21:28:31 +02:00
Mikael Hugo
48a01dd764 refactor(prefs): remove all legacy PREFERENCES.md / preferences.md support
preferences.yaml is now the only preferences file. No fallback chains,
no .md parsing paths, no legacy path getters.

- preferences.js: remove globalPreferencesPath, globalPreferencesPathUppercase,
  legacyGlobalPreferencesPath, projectPreferencesPath, projectPreferencesPathUppercase,
  getLegacyGlobalSFPreferencesPath; simplify load functions to yaml-only;
  parsePreferencesMarkdown kept as thin deprecated shim over parsePreferencesYaml
- commands-prefs-wizard.js: remove parseFrontmatterMap/splitFrontmatter usage,
  .md branch in savePreferencesFile/ensurePreferencesFile, legacyGlobal display
- auto-dashboard.js: parsePreferencesMarkdown → parsePreferencesYaml
- guided-flow.js / worktree-root.js: remove PREFERENCES.md existence checks
- detection.js: remove .md fallbacks from all 3 detection functions
- auto-bootstrap-context.js: remove .sf/PREFERENCES.md from priority list
- auto-worktree.js: remove LEGACY_PREFERENCES_FILES array and all copy fallbacks
- deep-project-setup-policy.js: only check preferences.yaml
- gitignore.js: ensurePreferences checks yaml only
- planning-depth.js: returns plain string path (not {path,isYaml}); yaml-only
- preferences-template-upgrade.js: remove .md branch; always write raw YAML
- tests: update fixtures to preferences.yaml with plain YAML content
- docs/learning: update all remaining PREFERENCES.md references

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-10 21:14:43 +02:00
Mikael Hugo
48dbb175c0 feat(prefs): migrate canonical preferences file from PREFERENCES.md to preferences.yaml
New installations create .sf/preferences.yaml (pure YAML, no frontmatter
markers) and ~/.sf/preferences.yaml. Existing .md files are read as fallbacks
with no migration required for current users.

Changes:
- preferences.js: add yaml path getters, load chain tries .yaml first, add
  parsePreferencesYaml() for direct YAML parse without frontmatter extraction
- templates/preferences.yaml: new canonical template (pure YAML with comment
  header pointing to preferences-reference.md)
- gitignore.js: ensurePreferences() creates preferences.yaml; simplified by
  removing scaffold-versioning dependency
- init-wizard.js: buildPreferencesFile() produces pure YAML, writes preferences.yaml
- commands-prefs-wizard.js: savePreferencesFile() helper handles .yaml vs .md;
  ensurePreferencesFile uses yaml template for yaml paths
- preferences-template-upgrade.js: yaml files get raw YAML on upgrade
- planning-depth.js: returns {path, isYaml}, handles both formats
- deep-project-setup-policy.js: isWorkflowPrefsCaptured() tries all 3 paths
- detection.js: preferences.yaml added to all detection checks
- auto-worktree.js: canonical=yaml, LEGACY_PREFERENCES_FILES=["PREFERENCES.md","preferences.md"]
- auto-bootstrap-context.js: preferences.yaml before PREFERENCES.md in list
- guided-flow.js / worktree-root.js: existence checks include preferences.yaml
- User-visible strings / comments updated throughout
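
The yaml-first load order described in this commit might look roughly like the sketch below. The file names come from the commit message; `sketchResolvePreferencesFile`, the injected `exists` predicate, and the simple `/` path join are assumptions made to keep the example self-contained.

```typescript
const CANONICAL_PREFERENCES = "preferences.yaml";
const LEGACY_PREFERENCES_FILES = ["PREFERENCES.md", "preferences.md"];

interface ResolvedPreferences {
  path: string;
  isYaml: boolean;
}

// `exists` is injected so the sketch does not depend on a real filesystem.
function sketchResolvePreferencesFile(
  dir: string,
  exists: (path: string) => boolean,
): ResolvedPreferences | undefined {
  // Try the canonical YAML file first.
  const yamlPath = `${dir}/${CANONICAL_PREFERENCES}`;
  if (exists(yamlPath)) return { path: yamlPath, isYaml: true };

  // Fall back to the legacy markdown files, in order.
  for (const name of LEGACY_PREFERENCES_FILES) {
    const legacyPath = `${dir}/${name}`;
    if (exists(legacyPath)) return { path: legacyPath, isYaml: false };
  }
  return undefined;
}
```

A project that has both files resolves to the YAML one, so adding preferences.yaml takes precedence without deleting the old file.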

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-10 21:05:10 +02:00