singularity/singularity-forge

Author	SHA1	Message	Date
Mikael Hugo	6e95c3542c	fix(bootstrap): always dispatch self-feedback triage on session_start The session_start hook only invoked dispatchSelfFeedbackInlineFixIfNeeded when triage.stillBlocked contained at least one high/critical entry. After the previous commit rewired the worker as a triage queue that returns every open forge-local entry (not just high/critical), this gate stranded medium/low backlog forever at startup — the unit was never given a chance to triage them. The dispatcher's own selectInlineFixCandidates is now the source of truth for eligibility; the call site should call unconditionally. Keep the high/critical-specific notify (still useful operator signal when the loud ones are present) but stop using it to gate the dispatch. The turn_end hook at the bottom of register-hooks.js already calls the dispatcher unconditionally, so this change aligns the two paths. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-13 22:59:13 +02:00
Mikael Hugo	ce58d32231	fix(self-feedback,state): close two state-drift gaps 1. Self-feedback JSONL is now a real append-only audit log. Previously markResolved updated the DB row in place but never echoed the resolution to JSONL, so a DB rebuild via importLegacyJsonlToDb would re-import all entries with their original pre-resolution state and silently lose every resolution that had ever landed. The JSONL was a half event log — creations yes, resolutions no. - Introduce a `recordType: "resolution"` JSONL record shape. Append one of these to the project JSONL whenever markResolved succeeds against the DB. Best-effort: failure to append never blocks the resolution itself. - Extend importLegacyJsonlToDb to handle both record types. Entry creations go through insertSelfFeedbackEntry (ON CONFLICT DO NOTHING — idempotent). Resolution events go through resolveSelfFeedbackEntry, which is already a no-op on missing or already-resolved rows, so replay is idempotent. - Tests cover: the appended record shape; a DB rebuild correctly reconstructing resolved_at/resolved_evidence_json from a JSONL audit trail; orphan resolution events (entry never existed) are a silent no-op. Closes self-feedback entry sf-mp4ikbta-2zcbhh. 2. The reconcile path at state-db.js:reconcileSliceTasks warns when an on-disk SUMMARY.md exists for a task whose DB row is still pending and refuses to silently import — a safety check so autonomous runs can't promote themselves to complete by writing a SUMMARY without a real DB transition. But operators had no remediation path when the drift was real (lost DB write, hand edit). They had to mutate the DB by hand. - New `state-reconcile.js` with `reconcileTaskFromSummary` exposes the remediation explicitly. Parses the SUMMARY via the existing parseSummary helper, validates via isValidTaskSummary, and writes status / completed_at / verification_result / blocker / key_files / full_summary_md into the DB row through a new `setTaskSummaryFields` helper in sf-db-tasks. 
- Returns structured { ok, reason, applied } outcomes — never throws — so operator tooling can branch on `db-unavailable`, `summary-missing`, `summary-invalid`, `task-not-in-db`, `already-done`. - The reconcile warning text now points at the helper. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-13 22:55:30 +02:00
Mikael Hugo	5f245b721d	fix(self-feedback): rewire inline-fix worker as triage queue The inline-fix worker was a partial repair queue — it picked only high/critical+blocking entries plus my recent gap/architecture-defect override and left everything else (medium inconsistencies, janitor gaps, architectural-risks, low-severity gaps) sitting open forever. The requirement-promoter clusters by exact `kind` string and never fires on diverse forge-local entries (every open entry currently has a unique kind), so there is no other sweep that ever touches these. They just accumulate. The point of the worker is triage, not just repair: every open entry should get an eyes-on per session and reach one of three outcomes — fix, promote to requirement, or close as not-of-value with reason. Closing deliberately is a valid, expected outcome. Changes: - `selectInlineFixCandidates` now returns every open forge-local entry, modulo the existing credibility check that re-includes suspect resolutions. Severity and blocking filters are gone; the kind-based override is no longer needed because everything qualifies. - The dispatch prompt is rewritten as a three-way triage protocol (Fix / Promote / Close) with explicit guidance per outcome and explicit prohibition on the `auto-version-bump` evidence kind (which would re-open under the credibility check). - Tests collapse the three filter-coverage tests into a single "selects every open forge-local entry" assertion that exercises the full severity × blocking × kind matrix. Upstream feedback is still excluded — those entries describe behavior in other repos that the inline-fix unit cannot directly repair. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-13 22:46:24 +02:00
Mikael Hugo	89b52b6011	fix(self-feedback): widen inline-fix candidate selection + drop upstream The inline-fix dispatcher had three blind spots that left forge-local architectural debt rotting in the ledger: 1. Filter required `severity ∈ {high, critical} AND blocking`. Medium `gap:` and `architecture-defect:` entries — describing the exact class of debt the inline-fix unit was built to repair — were dropped on the floor. The forge-local queue currently has 0 high+blocking open entries and 3 architectural gaps, so the old filter would dispatch on nothing local and fall back to upstream. 2. Resolutions were trusted unconditionally. `auto-version-bump` fires on any sf-version bump without verifying the bump contained a fix, silently burying defects. 3. Upstream feedback was merged into the candidate set. Upstream entries describe behavior observed in OTHER repos (e.g. `flow-audit:repeated-milestone-failure` from /srv/infra/apps/centralcloud_ops) — the inline-fix unit edits forge source and cannot repair issues in those other repos. Including them dispatches work the unit cannot perform. Changes to `selectInlineFixCandidates`: - Add kind-based override: entries with `kind` starting with `gap:` or `architecture-defect:` qualify regardless of severity/blocking. - Add resolution credibility check: re-include entries resolved with evidence kind `auto-version-bump`, or with no evidence kind AND no `resolvedReason` narrative at all. Legacy resolutions with a meaningful operator narrative (the historical format) are still trusted. - Drop `readUpstreamSelfFeedback()` from the candidate merge. Upstream stays readable for SELF-FEEDBACK.md rollups and operator review, just not auto-dispatched to inline-fix. 
Also relax the schedule-e2e readEntries timing assertion from a 100ms threshold to 500ms — the test is a catastrophic-regression guard, not a microbenchmark, and parallel-suite jitter on dev machines routinely adds >100ms even when the underlying read is fast (≤ a few hundred ms). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-13 22:23:57 +02:00
Mikael Hugo	5a2618c05d	fix(auto): re-dispatch on executor refusal instead of pausing The autonomous solver was designed precisely to handle executor refusals (per its own docstring: "the solver role MUST stay on a stable, agentic, refusal-resistant model independent of any per-unit routing choices"), but the refusal handler short-circuited past it and emitted a `blocked` checkpoint, which assessAutonomousSolverTurn unconditionally turns into a `pause` — defeating autonomous mode every time the router selects a capability-mismatched executor. The 1h model-block added in `3f2babb5d` was the right primitive but had no consumer: nothing actually re-dispatched the unit after the model was blocked, so the block only mattered if the operator manually unpaused and retried. This change wires the missing consumer: - Add per-unit `executorRefusalEscalations` counter to solver state plus a `recordExecutorRefusalEscalation` helper. Counter persists across iterations of the same unit and resets on unit change. - On `executor-refused`: block the refusing model and slice-routing entry (unchanged), file self-feedback (unchanged), then synthesize a `continue` checkpoint and return `{ action: "continue" }` directly so the auto loop re-dispatches the unit. selectAndApplyModel will skip the now-blocked model and pick a higher-tier fallback. - Bounded by `MAX_EXECUTOR_REFUSAL_ESCALATIONS=3`. When the budget is exhausted (an entire fallback chain refused on the same unit), fall back to the legacy blocked-and-pause path so the operator can review. - Bypass `assessAutonomousSolverTurn` on the refusal-continue path because its no-op detector would (correctly) reject a continue over a refusal transcript — but here the "no-op" is the whole point: we are explicitly swapping the routed model. Tests cover the new state field's init/persistence/reset semantics and the constant's invariants. Full SF extension suite (1369 tests) passes. 
Refs: sf-mp3bm6u0-2fskt8 (now fully addressed, not just AC1) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-13 21:49:51 +02:00
Mikael Hugo	32cfb6224b	test: migrate node:test imports to vitest and stabilize timing thresholds - Three .test.mjs files now import describe/it from vitest, matching the harness CLAUDE.md mandates for the SF extension suite. - schedule-e2e local readEntries threshold raised 50ms → 100ms with a comment noting full-suite parallelism adds scheduler/filesystem jitter on dev machines (CI threshold unchanged at 200ms). - e2e-smoke "headless new-milestone without --context" timeout raised 10s → 30s so the exit-1 assertion isn't flaky under load. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-13 21:30:21 +02:00
Mikael Hugo	3f2babb5d1	fix(auto): block refusing executor model temporarily to force escalation on retry When classifyExecutorRefusal detects an executor refusal, the model is now temporarily blocked (1-hour TTL) via the existing blocked-models mechanism. This ensures that on retry — whether automatic or manual — the router skips the refusing model and the tier-escalation path in selectAndApplyModel picks a higher-tier alternative. This satisfies AC1 of self-feedback entry sf-mp3bm6u0-2fskt8. AC2 (refusal pattern detection) was already satisfied by the existing apology-no-tools pattern in classifyExecutorRefusal. Refs: sf-mp3bm6u0-2fskt8	2026-05-13 02:40:41 +02:00
Mikael Hugo	2cad6d54f4	fix(doctor): enrich flow-audit repeated-failure rollup with full diagnostic context The flow-audit repeated-milestone-failure rollup now includes: - Active milestone/unit and session pointer (AC1) - Stale dispatched units (AC2) - Runaway history (AC3) - Over-budget child processes (AC3) This satisfies the acceptance criteria of self-feedback entry sf-mp3ati7u-qqxcyi so operators can use the rollup evidence to repair stale dispatch, missing summary, runaway, or child-process handling without needing to re-run the flow audit manually. Refs: sf-mp3ati7u-qqxcyi	2026-05-13 02:25:29 +02:00
Mikael Hugo	65e195a9fd	feat: Created draft mapping of SF patterns to ACE reference draft SF-Task: S05/T01	2026-05-13 02:01:41 +02:00
Mikael Hugo	1ed505669b	fix(sf-db,autonomous-solver): resolve schema-drift and checkpoint runaway loop - sf-db-schema.js: per-migration transaction boundaries (runMigrationStep) so a late migration failure does not roll back earlier successful ones. Post-migration assertion recreates routing_history if missing. - routing-history.js: catch missing routing_history table at init and latch _dbTableAvailable=false so auto-start does not crash. - autonomous-solver.js: sticky identity guard in appendAutonomousSolverCheckpoint pins to orchestrator's unitType/unitId instead of trusting agent's claim. Emit journal event on identity mismatch. Record mismatchedIdentity diagnostic. Hard cap MAX_CHECKPOINTS_PER_ITERATION=5 in assessAutonomousSolverTurn. - Tests: add v52 DB smoke test with auto-start path; add sticky identity tests (4 cases); add excessive-checkpoint pause test. Fixes: sf-mp36kfqm-rjrzju, sf-mp37kjmo-1mfuru	2026-05-13 01:47:19 +02:00
Mikael Hugo	a49ea1da87	feat(sf/prompts): Phase 4 — cache_control breakpoints at static/dynamic boundary Split reorderForCaching into a structured reorderAndSplitForCaching that returns {before, after} at the semi-static→dynamic section boundary. - prompt-ordering.js: export reorderAndSplitForCaching — returns null if no dynamic sections, otherwise {before: static+semi-static, after: dynamic} - auto.js: import and wire reorderAndSplitForCaching into deps - phases-unit.js: use split function; pass promptParts to runUnit when split succeeds; fall back to flat reorderForCaching when null - run-unit.js: when promptParts is present, send a two-block content array [{type:text, text:before, cache_control:{type:ephemeral}}, {type:text, text:after}] so Anthropic-compatible providers cache the stable prefix - openai-completions.ts: preserve cache_control on text parts in convertMessages; skip maybeAddOpenRouterAnthropicCacheControl if any part already has cache_control Tests: 5 new contract tests for reorderAndSplitForCaching; all 4502 unit tests pass. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-13 01:36:22 +02:00
Mikael Hugo	3b83d09692	feat(sf/prompts): Phase 3 v2 — migrate milestone+slice builders to composeUnitContext Migrate buildPlanMilestonePrompt, buildValidateMilestonePrompt, buildCompleteMilestonePrompt, buildReplanSlicePrompt, buildResearchSlicePrompt, and renderSlicePrompt (plan-slice + refine-slice) from imperative inlined[] push loops to the v2 composeUnitContext API (manifest-driven, prepend/computed support). Changes: - unit-context-manifest.js: add 7 new ARTIFACT_KEYS (slice-summaries, blocker-summaries, queue, verification-classes, outstanding-items, previous-validation, prior-milestone-summary); update 7 manifests with correct prepend/inline/computed declarations - auto-prompts.js: import composeUnitContext; migrate all 6 builders; remove orphaned old buildValidateMilestonePrompt tail left by partial prior edit - tests: add auto-prompts-phase3.test.mjs with 7 contract tests covering plan-milestone, replan-slice, validate-milestone, and research-slice prompt generation Pre-computation pattern: complex async logic (blocker scan, slice aggregation, verification classes, prior validation) is computed imperatively before composeUnitContext, then returned from resolveArtifact. This preserves parallel execution of other artifacts. buildPlanMilestonePrompt keeps framingBlock imperative: the framing check wraps the composed inlinedContext rather than going inside the composer boundary. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-13 01:02:48 +02:00
Mikael Hugo	ca5d869e34	feat(prompts): fragment infrastructure + RFC #4782 stub manifests Phase 1 — Fragment infrastructure: - Add {{include:fragment-name}} support to prompt-loader.js - fragmentsDir registered alongside promptsDir/templatesDir - warmCache() now reads prompts/fragments/*.md with 'frg:' prefix - Pre-resolution pass in loadPrompt() resolves {{include:}} before the {{var}} validator (colon is outside validator regex [a-zA-Z0-9_], so unresolved includes are caught as parse errors) - Lazy-load fallback for fragments mirrors existing prompt lazy-load - Create prompts/fragments/working-directory.md (Variant A: full contract including 'Do NOT cd to any other directory') - Create prompts/fragments/working-directory-ops.md (Variant B: ops prompts, no cd restriction) - Replace duplicated 3-line Working Directory boilerplate in 17 prompts with {{include:working-directory}} (12 files) or {{include:working-directory-ops}} (5 ops files) - One fix to Working Directory wording now propagates to all 17 prompts Phase 2 — RFC #4782 stub manifests: - Add deploy, smoke-production, release, rollback, challenge to KNOWN_UNIT_TYPES and UNIT_MANIFESTS in unit-context-manifest.js - All 5 builders already called composeInlinedContext() but returned "" because resolveManifest() found no entry; now they return live content - All 26 unit types now have manifests (resolveManifest returns non-null for every type in KNOWN_UNIT_TYPES) Tests: - 5 new tests in prompt-loader-fragments.test.mjs (include resolution, lazy-load fallback, unknown fragment error, nested var inheritance, variant-B fragment) - Full unit suite: 427 files passed, 4476 tests passed, 0 regressions Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-13 00:30:19 +02:00
Mikael Hugo	55229f6604	fix(auto): split autonomous solver from executor per ADR-0079 - Lock solver model to kimi-k2.6 independent of unit-type router - Executor prompt no longer requires checkpoint tool call - Add dedicated solver pass that reads executor transcript and emits canonical checkpoint - Classify executor refusals as blocker outcomes (already partially implemented) - Classify no-op iterations (continue with zero work) as missing-checkpoint-retry - Add tests for executor prompt block, solver pass prompt, no-op detection, and no-op assessment Fixes sf-mp34nxb6-27zdx7	2026-05-12 23:55:02 +02:00
Mikael Hugo	93d547c65e	fix(headless): skip Ask→Build mode gate in SF_HEADLESS mode In headless mode the showConfirm dialog blocks forever since there is no TUI to answer it. The user already consented by calling /next or /autonomous explicitly — the gate adds no value and hangs the run. Add process.env.SF_HEADLESS !== '1' to the gate condition so headless runs bypass it and proceed directly to autonomous execution. Verified: `sf headless --command next` now completes slice S03 (719 526 tokens, 10 tool calls, $0.027) without hanging. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-12 17:28:09 +02:00
Mikael Hugo	d22df007a7	fix(headless): correct log message to show actual command format The log message said '/sf ${command}' but the actual command sent is '/${command}' (without the sf namespace). Fix to match actual dispatch. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-12 17:04:11 +02:00
Mikael Hugo	16db710468	sf snapshot: uncommitted changes after 49m inactivity	2026-05-12 16:45:04 +02:00
Mikael Hugo	0426aafad2	fix(headless): drop /sf prefix so typed commands route through extension dispatch headless.ts was sending `/sf {subcommand} {args}` to the RPC session, but commands are registered without the sf namespace (e.g. 'todo', 'autonomous'). _tryExecuteExtensionCommand parsed commandName='sf', found no match, and the LLM handled the request instead of the typed backend. Fix: send `/{subcommand} {args}` directly — matches what registerSFCommands registers and what the TUI already uses. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-12 15:55:46 +02:00
Mikael Hugo	2bb9cdbeef	feat(scaffold): ADR-022 scaffold profiles (all phases) Add profile-aware scaffold system so SF does not lay down irrelevant templates in infra/ops/docs repos. ## What ships Phase 1 — data model - scaffold-versioning.js: add 'disabled' to VALID_STATES; readScaffoldManifest returns profile field; recordScaffoldApply preserves manifest.profile (fixes roundtrip bug where profile was stripped on every write). - scaffold-constants.js: PROFILES (app/library/infra/docs/minimal as Set<string>) and PROFILE_NAMES exports. Phase 2 — profile-aware drift detection - scaffold-drift.js: disabled bucket in emptyCounts, resolveActiveProfileSet integration, profile param on detectScaffoldDrift/migrateLegacyScaffold. - doc-checker.js: filter to active profile, skip disabled-state files. Phase 3 — auto-detection on first run - scaffold-profiles.js: detectRepoProfile() heuristics (nix→infra, terraform→infra, react→app, node-no-ui→library, docs-only→docs, else→app). - agentic-docs-scaffold.js: reads profile from manifest, auto-detects on first run, persists to manifest, filters SCAFFOLD_FILES to active profile. Phase 4 — migrate command - commands-scaffold-migrate.js: sf scaffold migrate --profile <name> Re-enables pending files entering the new profile; stamps state=disabled (or prunes with --prune) files leaving it; warns on editing/completed files. - commands/handlers/ops.js, commands/catalog.js: registered and tab-completed. Phase 5 — custom profiles + PREFERENCES.md frontmatter - scaffold-profiles.js: readPreferencesProfile(), loadCustomProfileSet() (~/.sf/profiles/<name>.yaml with extends/add/remove), resolveActiveProfileSet() implementing full ADR-022 §6 precedence. - All callers updated to use resolveActiveProfileSet as the single source of truth. Tests: 28 new tests in adr-022-scaffold-profiles.test.mjs — all passing. Pre-existing node:test stubs (3 files) unaffected. 
ADR: docs/dev/ADR-022-scaffold-profiles.md Misc: triage TODO.md dump into BACKLOG.md (phases-helpers export error T1, /todo triage typed-handler gap T1, structured triage tiers T2, sha-track markdown files T2, cross-repo triage T3). Reset TODO.md to empty template. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-12 15:28:03 +02:00
Mikael Hugo	385cc8a18b	revert(skills): restore lowercase defaults in sf-wiki SKILL.md sf-wiki is a built-in read-only skill — its page name defaults must stay generic (lowercase). The uppercase convention is this repo's project-level choice, documented in system.md and the wiki itself. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-11 19:52:15 +02:00
Mikael Hugo	0d187e53d7	chore(wiki): rename wiki pages to UPPERCASE to match .sf/ convention All .sf/ operational files use UPPERCASE (DECISIONS.md, KNOWLEDGE.md, etc.). Wiki pages now follow the same convention: INDEX.md, ARCHITECTURE.md, WORKFLOWS.md, SUBSYSTEMS.md, GLOSSARY.md. Also updates sf-wiki SKILL.md and system.md prompt references. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-11 19:50:06 +02:00
Mikael Hugo	e679478d1b	feat(wiki): wire .sf/wiki/ as tracked context source - auto-bootstrap-context.js: scan .sf/wiki/*.md in collectAutoBootstrapFiles so wiki pages load as priority context in headless autonomous bootstrap - headless-context.ts: same fix for the TS bootstrap path - system-context.js: loadWikiBlock already existed and was wired into fullSystem; add .sf/wiki/ to Tier 1 escalation policy lookup sources - system.md: add wiki/ to .sf/ directory structure; add Conventions entry explaining wiki is tracked in git (hand edits persist) and injected automatically when present - git-runtime-patterns.js: do NOT gitignore .sf/wiki/ — wiki pages are tracked like DECISIONS.md so hand edits survive commits and clones - .sf/wiki/: seed index.md, architecture.md, workflows.md for this repo Wiki filenames follow sf-wiki SKILL.md convention: lowercase (index.md, architecture.md, workflows.md, subsystems.md, glossary.md). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-11 19:24:23 +02:00
Mikael Hugo	4e5fc12e81	feat(sf): fix gate health — import, DB fallback, and enrich status uok Three follow-up fixes from S03/T04: 1. gate-runner.js: add missing getDistinctGateIds import from sf-db.js. UokGateRunner.getHealthSummary() called it when registry was empty but it was never imported — runtime ReferenceError in headless contexts. 2. sf-db-gates.js: getDistinctGateIds + getGateRunStats fall back to the quality_gates DB table when no trace events are found (e.g. after trace file rotation). Ensures gate health survives trace cleanup. 3. headless-uok-status.ts: replace generic Type column with real Scope (task/slice/milestone) from quality_gates DB, and show actual Last Evaluated timestamp from DB even when outside the 24h stats window. Tests updated to match (21 pass). Closes backlog items: bl-gate-runner-import-bug, bl-gate-stats-trace-vs-db, bl-uok-status-enrich. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-11 18:47:42 +02:00
Mikael Hugo	797db16ae8	feat(sf): S03/T04 — add UOK gate health to sf headless status uok Adds a new `sf headless status uok` subcommand that queries gate-run stats and circuit-breaker state from sf.db and formats them as a markdown table or JSON (--json flag). - src/headless-uok-status.ts: handler that loads sf-db-gates directly (avoids the unimported getDistinctGateIds in gate-runner) - src/headless.ts: bypass RPC, route 'status uok' to handler - src/help-text.ts: document the new subcommand - tests/headless-uok-status.test.mjs: 19 node:test tests Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-11 18:31:03 +02:00
Mikael Hugo	4132ecc1db	feat(sf): S03/T03 — wire OutcomeLearningGate into adaptive verification policy Adds adaptive-verification-policy.js which reads OutcomeLearningGate trace events from the last 24h and adjusts verification_max_retries / verification_auto_fix in project preferences: - >60% verification/artifact/execution failures → reduce retries to 1, disable auto-fix - 0% failures across ≥5 samples → bump retries (capped at 3) - all other cases → no change (returns null) Wires into auto-verification.js after OutcomeLearningGate runs when outcomeLearning flag is enabled. Includes 12 node:test tests. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-11 17:40:22 +02:00
Mikael Hugo	7b225696cc	feat(sf): add cross-slice and milestone integrity checks to post-execution checks - Add checkCrossSliceConsistency() to detect key_file conflicts across slices - Add checkMilestoneIntegrity() to verify completed slices have summaries and no active requirements are orphaned - Extend runPostExecutionChecks() signature with optional milestoneId and allSliceTasks parameters - Wire cross-slice task gathering into auto-verification.js call site - Add comprehensive node:test suite for both new checks	2026-05-11 17:22:11 +02:00
Mikael Hugo	0aaf8f2c0e	refactor: split state.js into state-shared/db/legacy modules state.js was a 2012-line monolith combining shared helpers, DB-backed derivation, and legacy filesystem derivation. Split into four files: - state-shared.js (114 lines): helpers used by both DB and legacy paths isGhostMilestone, isSliceComplete, isMilestoneComplete, isValidationTerminal, readMilestoneValidationVerdict, loadTerminalSummary, stripMilestonePrefix, canonicalMilestonePrefix, extractContextTitle - state-db.js (841 lines): deriveStateFromDb() and its exclusive helpers reconcileDiskToDb, buildRegistryAndFindActive, handleNoActiveMilestone, handleAllSlicesDone, resolveSliceDependencies, reconcileSliceTasks, detectBlockers, checkReplanTrigger, checkInterruptedWork - state-legacy.js (895 lines): _deriveStateImpl() — filesystem-only path - state.js (228 lines): thin barrel — invalidateStateCache, getActiveMilestoneId, deriveState, re-exports from sub-modules All 1195 tests pass. No behavior change. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-11 16:25:20 +02:00
Mikael Hugo	1adc7f119c	refactor(rf-06): split auto/phases.js into per-phase modules 3538-line monolith → 6 focused modules + thin barrel: - phases-helpers.js (223 lines): shared helpers (generateMilestoneReport, closeoutAndStop, emitCancelledUnitEnd, maybeFireProductAudit, _resolveReportBasePath, recordLearningOutcomeForUnit) - phases-dispatch.js (486 lines): runDispatch + assessUokDiagnosticsDispatchGate - phases-guards.js (497 lines): runGuards + guard helpers - phases-pre-dispatch.js (760 lines): runPreDispatch - phases-unit.js (1477 lines): runUnitPhase + session timeout state - phases-finalize.js (542 lines): runFinalize - phases.js (13 lines): barrel re-export preserving original import surface Removed dead runPhaseReview export (zero callers confirmed). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-11 15:14:49 +02:00
Mikael Hugo	aa6ecce384	refactor: fix all remaining inline error ternaries across 20 files Used perl regex to replace all patterns of the form X instanceof Error ? X.message : String(X) with getErrorMessage(X) for any variable name. Added getErrorMessage imports to 6 files that lacked it. Leaves only 2 intentional .stack || .message variants unchanged. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-11 14:50:01 +02:00
Mikael Hugo	dac14043cd	refactor: consolidate remaining error ternaries (error variable) Replace all remaining inline error ternaries using the 'error' variable name with getErrorMessage(error). Added imports to 3 files that lacked it. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-11 14:48:28 +02:00
Mikael Hugo	04322f110a	refactor: replace all inline error message ternaries with getErrorMessage() Eliminates ~120 repetitions of `err instanceof Error ? err.message : String(err)` across the entire extension source tree. All callers now import and use `getErrorMessage` from the canonical `./error-utils.js`. Files updated (56 files): - auto.js, auto-worktree.js, auto-recovery.js, auto-dashboard.js, auto-timers.js - auto-prompts.js, auto-start.js, auto-post-unit.js, auto-model-selection.js - auto/phases.js, auto/loop.js, auto/infra-errors.js - autonomous-solver-eval.js, bootstrap/agent-end-recovery.js, bootstrap/db-tools.js - bootstrap/exec-tools.js, bootstrap/journal-tools.js, bootstrap/register-extension.js - bootstrap/register-hooks.js, canonical-milestone-plan.js, changelog.js - clean-root-preflight.js, code-intelligence.js, commands-add-tests.js - commands-debug.js, commands-eval-review.js, commands-handlers.js - commands-maintenance.js, commands-pr-branch.js, commands-scan.js, commands-ship.js - commands-todo.js, commands-worktree.js, definition-io.js, doctor.js - doctor-config-checks.js, doctor-engine-checks.js, ecosystem/loader.js - eval-review-schema.js, exec-sandbox.js, execution-instruction-guard.js - graph-context.js, hook-emitter.js, index.js, learning/runtime.js - lifecycle-hooks.js, onboarding-state.js, orphan-worktree-sweep.js - planning-depth.js, quick.js, scaffold-keeper.js, sf-db/sf-db-core.js - slice-cadence.js, sm-client.js, spec-projections.js, subagent/background-jobs.js - subagent/isolation.js, sync-scheduler.js, tools/exec-tool.js - tools/sift-search-tool.js, tools/workflow-tool-executors.js, ui/index.js - uok/a2a-agent-server.js, uok/auto-dispatch.js, uok/auto-unit-closeout.js - uok/auto-verification.js, uok/chaos-monkey.js, uok/gate-runner.js - vault-resolver.js, workflow-install.js, workflow-plugins.js, worktree-manager.js - worktree-resolver.js Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-11 14:46:30 +02:00
Mikael Hugo	8a7f6de782	refactor: centralize skills directory constants in skill-discovery.js Export SKILLS_DIR, CLAUDE_SKILLS_DIR, PI_SKILLS_DIR from skill-discovery.js instead of repeating join(homedir(), ...) inline across 5 files. Consumers updated: - preferences-skills.js: replace 2 inline join(homedir()...) with SKILLS_DIR/CLAUDE_SKILLS_DIR - skill-health.js: replace 2 inline join(homedir()...) with constants; remove homedir import - skill-catalog.js: replace 2 inline join(homedir()...) with constants; remove homedir import - skill-telemetry.js: replace 4 inline join(homedir()...) with constants; remove homedir import Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-11 14:39:10 +02:00
Mikael Hugo	ec224f96ac	refactor: replace all process.env.HOME/.sf patterns with sfHome() - guided-flow.js: SF-WORKFLOW.md path now uses sfHome() - commands-config.js: both auth.json path sites use sfHome() Eliminates the last 3 inline ~/.sf path patterns; all .sf paths now route through sfHome() which respects SF_HOME env override and uses the platform-safe homedir() fallback. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-11 14:34:08 +02:00
Mikael Hugo	d3d7342370	refactor: use sfHome() for SF-WORKFLOW.md paths and skills dir; deduplicate errorMessage - commands-handlers.js: replace process.env.HOME/.sf/agent/SF-WORKFLOW.md with sfHome() at both call sites (lines 62 and 412) - skills/directory.js: replace process.env.HOME/.sf/skills with sfHome() - tools/tool-helpers.js: remove duplicate errorMessage implementation; re-export getErrorMessage from error-utils.js under the errorMessage alias Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-11 14:32:08 +02:00
Mikael Hugo	181a19ac65	refactor: wire worktree-session-state.js and auto-runtime-state.js Instead of deleting these planned-extraction modules, implement them properly: worktree-session-state.js: - Upgraded to canonical module with JSDoc, node:path imports - Fixed getActiveWorktreeName() to use normalize/join/basename (was using fragile string.replaceAll + split('/') approach) - Fixed ensureWorktreeOriginalCwdFromPath() to use sep instead of regex - worktree-command.js now imports/re-exports all state functions from this module and removes its local 'let originalCwd = null' - registerWorktreeCommand() recovery logic replaced with ensureWorktreeOriginalCwdFromPath() call auto-runtime-state.js: - Fixed to use getAutoSession() singleton instead of 'new AutoSession()' (was creating an isolated instance disconnected from auto.js state) - auto.js now re-exports isAutoActive, isAutoPaused, markToolStart, markToolEnd from this module, removing duplicate implementations - All state reads in auto-runtime-state.js delegate to the same singleton that auto.js manages Test: updated worktree-fixes.test.mjs guard to match clearWorktreeOriginalCwd() Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-11 14:24:50 +02:00
Mikael Hugo	5be5d6d438	refactor: remove two dead files never wired to any consumer - worktree-session-state.js: planned extraction for worktree originalCwd state; worktree-command.js kept its own module-level var and never imported this file. Dead since creation in `47c806d73`. - auto-runtime-state.js: planned extraction of isAutoActive/isAutoPaused and AutoSession wrapper; auto.js already exports all the same functions. No file in the codebase imported auto-runtime-state.js. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-11 14:16:09 +02:00
Mikael Hugo	e18a0001bb	refactor(sf-ext): remove local sfHome() clone in preferences.js preferences.js had its own copy of sfHome() (without resolve() canonicalization). Replace with import from sf-home.js — single source of truth. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-11 14:12:11 +02:00
Mikael Hugo	90dc3c6798	refactor(sf-ext): split sf-db.js (9073 lines) into 18 domain modules sf-db.js is now a pure barrel re-export. All logic lives in sf-db/: - sf-db-core.js — adapter, schema, transactions, shared helpers - sf-db-mode-state.js — Ask/Build/YOLO mode state - sf-db-decisions.js — ADR / decision records - sf-db-artifacts.js — file artifacts and attachments - sf-db-milestones.js — milestone CRUD - sf-db-slices.js — slice CRUD - sf-db-tasks.js — task CRUD - sf-db-worktree.js — worktree state - sf-db-evidence.js — retrieval evidence - sf-db-spec.js — spec/contract records - sf-db-gates.js — UOK gate records - sf-db-uok.js — unit-of-knowledge state - sf-db-session-store — session store / FTS - sf-db-backlog.js — backlog items - sf-db-learning.js — model learning / performance - sf-db-memory.js — memory / embeddings - sf-db-profile.js — user profile - sf-db-self-feedback — self-feedback triage sf-db/index.js re-exports sf-db.js for backward compat. All 4375 tests pass. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-11 13:51:44 +02:00
Mikael Hugo	756355abf1	refactor(sf-ext): replace inline sfHome patterns with canonical sfHome() Fix bug in auto.js where SF_HOME env var caused double '.sf' path segment. Convert 11 files from inline homedir()/.sf or SF_HOME constructs to sfHome(). Files updated: - auto.js: bug fix (join(SF_HOME, '.sf', 'agent') → join(sfHome(), 'agent')) - key-manager.js: process.env.SF_HOME || join(HOME, '.sf') → sfHome() - ui/color-band.js: os.homedir()/.sf → sfHome(); remove os import - ui/prompt-history.js: homedir()/.sf → sfHome(); remove homedir import - ui/usage-bar.js: homedir()/.sf/agent/auth.json → sfHome() - ui/marketplace.js: 2 occurrences — extensions dir → sfHome() - skill-telemetry.js: 2 occurrences — legacy skills dir → sfHome() - preferences-skills.js: legacy skills dir → sfHome() - preferences-models.js: models.json path → sfHome() - memory-embeddings.js: auth.json path → sfHome(); remove homedir import - commands/handlers/core.js: dynamic import homedir → static sfHome() Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-11 10:45:35 +02:00
Mikael Hugo	0ece0e5413	refactor(sf-ext): consolidate sfHome, counters, tool helpers, settings path, post-mutation hook - rf2-01: replace 23 inline `process.env.SF_HOME || join(homedir(), '.sf')` patterns across 19 files with canonical `sfHome()` from sf-home.js; removes 5 private sfHome/getSfHome function definitions and unused os/homedir imports - rf2-05: extract `ensureWritableParent` and `errorMessage` from complete-task.js and complete-slice.js into new tools/tool-helpers.js - rf2-06: add `runPostMutationHook` to tool-helpers.js; replace 8 identical try/catch blocks (plan-task, plan-slice, plan-milestone, replan-slice, reassess-roadmap, reopen-slice, reopen-task, reopen-milestone) with single call - rf2-09: add `makeDiskCounter` factory in auto-dispatch.js; consolidate 4 counter functions (rewrite/uat get/set/increment) from duplicated if/else DB-vs-disk logic into thin factory wrappers (~35 lines removed) - rf2-10: export `getSfAgentSettingsPath()` from preferences.js; update notifications/notify.js and permissions/permission-core.js to use it All 4375 unit tests pass. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-11 10:17:58 +02:00
Mikael Hugo	9dc244eb68	refactor: rf-10/rf-03 ask-gate wiring and skills frontmatter consolidation - rf-10: Wire gateAskUserQuestions (ask-gate.js) into ask-user-questions execute() via dynamic import; blocks autonomous ask_user_questions calls at tool layer - rf-03: Replace FRONTMATTER_RE + manual body extraction in skills/frontmatter.js with shared splitFrontmatter(); keep custom parseYaml() for skill-specific YAML handling Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-11 09:09:24 +02:00
Mikael Hugo	9756edfe0b	refactor: rf-09/rf-08/rf-12/rf-05 cleanup and deduplication - rf-09: Remove isTransientNetworkError from preferences-models.js/preferences.js/preferences-models.d.ts (canonical is error-classifier.js) - rf-08: Extract Gemini token counting to google-gemini-token-counter.js; update register-hooks.js import - rf-12: Remove 3 dead _allRequirements/_allDecisions fetch blocks from db-writer.js - rf-05: Extract resolveSfBin() and monitorNdjsonStdout() to spawn-worker.js; both orchestrators now import from there Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-11 08:59:51 +02:00
Mikael Hugo	96d751555f	fix(lint): fix all pre-existing lint warnings (unused vars/imports/params) - Prefix unused params/vars with _ in db-writer.js, system-context.js, record-promoter.js, a2a-transport.js - Remove unused imports: createServer (a2a-agent-server.js), dirname/join/resolve (a2a-transport.js), KNOWN_PREFERENCE_KEYS (preferences.js) - Remove unused private field _lastInputAt from pty-chat-parser.ts - Prefix unused test variable currentProject in uok-metrics-exposition.test.mjs Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-11 08:32:30 +02:00
Mikael Hugo	64ddbd950f	refactor(extensions): consolidate duplicate code into canonical modules - Delete ghost package packages/pi-agent-core (no dist, no consumers, TS build errors; JS source sf-db.js had 3 commits not mirrored in TS) - Remove build:pi-agent-core from root package.json build:pi pipeline - Merge all models from MODEL_COST_PER_1K_INPUT into BUNDLED_COST_TABLE (model-cost-table.js is now the single canonical cost source) - Remove duplicate MODEL_COST_PER_1K_INPUT object and getModelCost() from model-router.js; use lookupModelCost() from model-cost-table.js - Replace hand-rolled isTransientNetworkError in preferences-models.js with delegation to classifyError() in error-classifier.js Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-11 08:28:49 +02:00
Mikael Hugo	0b5fa75c0d	fix(lint): fix all pre-existing lint failures - check-sf-extension-inventory.mjs: expand parseDirectRegisteredCommands() scan to include 7 more files (guards/inturn.js, notifications/notify.js, permissions/index.js, ui/usage-bar.js, commands/legacy/audit.js, commands/legacy/create-extension.js, commands/legacy/create-slash-command.js) and filter results by BASE_RUNTIME_COMMAND_NAMES to exclude doc-string false positives ("name" in create-slash-command.js template text) - extension-manifest.json: remove 'clear' (subcommand of logs/notifications, never a top-level pi.registerCommand) - packages/pi-agent-core/src/db/sf-db.ts: fix 23 noVoidTypeReturn errors - openDatabase: void → boolean (caller uses return value at line 5625) - claimEscalationOverride: void → boolean (caller checks at escalation.js:243) - resolveSelfFeedbackEntry: void → boolean (caller checks at self-feedback.js:387) - copyWorktreeDb: void → boolean (caller checks at reconcileWorktreeDb) - compactUokMessages: void → {before,after} (caller returns value at message-bus.js:238) - insertSessionTurn: void → bigint|null (caller uses id at session-recorder.js:104) - expireStaleMemories: void → number (caller uses count at auto-start.js:1047) - deleteMemorySourceRow: void → boolean (caller returns value at memory-source-store.js:107) - deleteMemoryEmbedding: void → boolean (caller returns value at memory-embeddings.js:328) - updateBacklogItemStatus: remove dead return expression (callers discard value) - removeBacklogItem: remove dead return expression (callers discard value) - updateGateCircuitBreaker: remove dead return {total,avgMs,...} (wrong-type code accidentally merged from getGateLatencyStats, never reachable) - markUokMessageRead: remove dead return true/false (callers discard value) - Auto-fix formatting and organizeImports in ~30 source files (biome --write) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-11 04:02:31 +02:00
Mikael Hugo	65da855c5e	refactor(state): extract loadTerminalSummary helper, dedup 5 fail-closed SUMMARY checks The 'read SUMMARY → check if readable AND terminal' pattern appeared five times in state.js after the Cluster F polarity fix. Extract it to a private loadTerminalSummary(summaryFile, loadFn) helper so the fail-closed semantics live in one place and can't drift between call sites. - loadTerminalSummary returns the content if readable AND terminal, null otherwise - All 5 call sites replaced: 2 in getActiveMilestoneId(), 3 in _deriveStateImpl() - Phase 2 'no roadmap' case reuses returned content for parseSummary().title - isTerminalMilestoneSummaryContent now only referenced inside the helper Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-11 03:46:36 +02:00
Mikael Hugo	159c8b0c4d	refactor(git-service): rename GitServiceImpl → GitService No interface exists for the class, so the Impl suffix is vestigial Java-style naming. Rename throughout: git-service.js, auto-start.js, auto.js, worktree.js, worktree-detect.js, worktree-resolver.js, quick.js, and the two test files that imported it directly. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-11 03:38:53 +02:00
Mikael Hugo	c1df4249b8	fix(state): Cluster F — fail-closed SUMMARY checks in state.js and dispatch-guard.js Three fail-open bugs allowed unreadable (null) SUMMARY files to be treated as terminal, incorrectly marking milestones as complete when the content could not be read. Gap 1 — dispatch-guard.js line 50: Any SUMMARY file existence = milestone complete (fail-open). Fix: DB-first check via getMilestone()+isClosedStatus(); filesystem fallback reads SUMMARY content and calls classifyMilestoneSummaryContent() so only non-failure summaries skip the milestone. Gap 2 — state.js getActiveMilestoneId(): 'if (summaryFile) continue' skipped any milestone with ANY SUMMARY. 'if (!summaryFile) return mid' fell through incorrectly for failure SUMMARYs. Fix: read content; only skip/continue if sc != null && isTerminal(sc). Gap 3 — state.js _deriveStateImpl() Phase 1 + Phase 2: '!sc || isTerminalMilestoneSummaryContent(sc)' — null content = fail-open. Fix: 'sc && isTerminalMilestoneSummaryContent(sc)' — null content = fail-closed. Applied to all 6 occurrences (lines 1233, 1247, 1257, 1284, 1356, 1391). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-11 03:34:48 +02:00
Mikael Hugo	70afabedb7	refactor(uok): move auto-dispatch, auto-verification, auto-runaway-guard, auto-unit-closeout into sf/uok/ Per checkpoint-008/009 next-steps: these 4 autonomous-loop modules belong in the UOK subsystem alongside the other orchestration primitives. - auto-dispatch.js → uok/auto-dispatch.js - Dispatch table + resolveDispatch() is a core UOK orchestration primitive - Updated 3 static importers + 1 dynamic await import + 3 test files - auto-verification.js → uok/auto-verification.js - Post-unit verification gate delegates to UOK gates (ChaosMonkey, Security, CostGuard, OutcomeLearning, etc.) - Updated 1 importer (auto.js) - auto-runaway-guard.js → uok/auto-runaway-guard.js - Diagnostic budget guard; no local relative imports - Updated 4 importers (auto-timers.js, preferences-models.js, auto/phases.js, auto/run-unit.js) - auto-unit-closeout.js → uok/auto-unit-closeout.js - Unit metrics snapshot + activity log + memory extraction helper - Updated 3 importers (auto-timers.js, auto-post-unit.js, auto.js) Each original file is now a 1-line re-export shim preserving public API. All 4 are added to uok/index.js as the UOK barrel. 26 dispatch tests pass; full unit suite 4374 tests pass. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-11 03:02:52 +02:00
Mikael Hugo	adb449d642	fix: consolidate extensions into sf, migrate kernel.ts, fix test suite - Fold sf-usage-bar, sf-notify, sf-inturn-guard, sf-permissions, slash-commands into sf extension (ui/, notifications/, guards/, permissions/, commands/legacy/) - Delete vectordrive extension - Migrate uok/kernel.js to TypeScript (kernel.ts) with full interfaces - Add allowJs/checkJs:false to tsconfig.resources.json for incremental TS migration - Add symlink dedup to extension-discovery.ts (seenRealPaths Set) - Add before_provider_request delegate back to native-search.js so session budget tests exercise the middleware end-to-end - Fix parseSfNativeTools() to return all SF manifest tools (drop sf_ filter) - Fix test assertions: plan_milestone/complete_task/validate_milestone - Remove subagent from app-smoke.test.ts (folded into sf/subagent/) - Remove sf-permissions/sf-inturn-guard/subagent from features-inventory test - Fix resolveSearchProvider autonomous mode test to pass 'auto' explicitly - Remove legacy /clear slash command (conflicts with built-in clear_terminal) - Update web-command-parity-contract.test.ts for clear removal Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-11 02:40:52 +02:00
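
The fail-closed SUMMARY check described in c1df4249b8, and extracted into a helper in 65da855c5e, can be sketched roughly as follows. This is a minimal illustration, not the repository's code. The signature `loadTerminalSummary(summaryFile, loadFn)` and its "content if readable AND terminal, null otherwise" contract come from the commit message. The function body, the synchronous `loadFn`, treating a throwing read as unreadable, the `isTerminalFailOpen` name, and the stand-in `isTerminalMilestoneSummaryContent` predicate are all assumptions made for the example.

```javascript
// Hypothetical stand-in: the real predicate lives in the repository and its
// rules are not shown in this log. Here "terminal" simply means the text
// contains a "status: complete" line.
function isTerminalMilestoneSummaryContent(content) {
  return /^status:\s*complete\s*$/m.test(content);
}

// Fail-open form that the fix removed (name is illustrative):
// unreadable (null) content is treated as terminal.
function isTerminalFailOpen(sc) {
  return !sc || isTerminalMilestoneSummaryContent(sc);
}

// Fail-closed helper: returns the content only if it is readable AND
// terminal, and null in every other case.
function loadTerminalSummary(summaryFile, loadFn) {
  if (!summaryFile) return null;
  let sc = null;
  try {
    sc = loadFn(summaryFile);
  } catch {
    sc = null; // assumption: a read that throws counts as unreadable
  }
  return sc && isTerminalMilestoneSummaryContent(sc) ? sc : null;
}
```

With this shape, a caller can only skip a milestone when it actually holds terminal content; a missing or unreadable SUMMARY yields null and the milestone stays active.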

3130 commits