Commit graph

4163 commits

Author SHA1 Message Date
Mikael Hugo
89677b7e9b sf snapshot: uncommitted changes after 110m inactivity 2026-05-08 00:17:47 +02:00
Mikael Hugo
d05e7164a9 feat: journal execution policy decisions 2026-05-07 22:27:29 +02:00
Mikael Hugo
e9df932234 feat: add execution policy profiles 2026-05-07 18:21:47 +02:00
Mikael Hugo
b0fce94f9e feat: record retrieval evidence across context tools 2026-05-07 18:17:41 +02:00
Mikael Hugo
05f185256c docs: record local cli survey cross-check 2026-05-07 17:22:03 +02:00
Mikael Hugo
b1a7749763 fix: harden widget and provider auth handling 2026-05-07 17:20:52 +02:00
Mikael Hugo
3c84bd2fed fix: stabilize headless bootstrap and prompt history 2026-05-07 16:46:44 +02:00
Mikael Hugo
deeb4dbd4e sf snapshot: uncommitted changes after 61m inactivity 2026-05-07 16:39:39 +02:00
Mikael Hugo
8088489e38 sf snapshot: uncommitted changes after 258m inactivity 2026-05-07 15:37:55 +02:00
Mikael Hugo
e154dad930 fix: clean workflow helper extraction lint 2026-05-07 11:19:26 +02:00
Mikael Hugo
426fea7334 fix: reload sf source runtime on extension changes 2026-05-07 10:31:34 +02:00
Mikael Hugo
343ee5c89e sf snapshot: uncommitted changes after 158m inactivity 2026-05-07 10:01:56 +02:00
Mikael Hugo
6e0273573c refactor: Extract workflow-helpers module from auto-prompts (D3)
- Extract buildResumeSection and buildCarryForwardSection for continue/carry-forward logic
- Extract checkNeedsReassessment and checkNeedsRunUat for adaptive replanning
- Consolidates workflow state checking and section building
- No behavior change; backward compatible via re-export pattern
- Reduces auto-prompts.js by ~260 LOC

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-07 07:23:43 +02:00
Mikael Hugo
e99d50fbc1 refactor: Extract summary-helpers module from auto-prompts (D2)
- Extract buildSliceSummaryExcerpt to format slice summaries as excerpts
- Extract getPriorTaskSummaryPaths and getDependencyTaskSummaryPaths
- Extract isSummaryCleanForSkip for replan decision logic
- Consolidates summary extraction logic for reuse and testability
- No behavior change; backward compatible via re-export pattern
- Reduces auto-prompts.js by ~120 LOC

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-07 07:16:56 +02:00
Mikael Hugo
d75ed12d89 refactor: Extract io-helpers module from auto-prompts (D1)
- Extract inlineFile, inlineFileOptional, inlineFileSmart to io-helpers.js
- Enables testable file I/O utilities reusable across prompt builders
- No behavior change; backward compatible via re-export pattern
- Reduces auto-prompts.js cognitive load by ~50 LOC

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-07 07:09:46 +02:00
Mikael Hugo
de3990093e style: Organize imports in memory-store.js per Biome
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-07 06:57:36 +02:00
Mikael Hugo
5e518dd7d4 feat: Add SM cross-project recall to memory ranking (Phase 3)
- Import querySmMemories from sm-client.js
- Merge cross-project memories into getRelevantMemoriesRanked
- Cap cross-project confidence at 0.8 with 0.9 reduction (conservative)
- Gracefully degrade: fail-open if SM unavailable
- Preserve cosine ranking with relation boost for merged pool
- Tests: 3821 passing, no regressions

Implements Tier 1.2 Phase 3: Cross-project memory recall via Singularity Memory.
Enables dispatch to leverage patterns from other projects while maintaining
local autonomy via fail-open semantics.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-07 06:56:15 +02:00
Mikael Hugo
bfb892eca3 fix: bind todo backlog triage to project db 2026-05-07 06:40:28 +02:00
Mikael Hugo
1b73500fcf fix: bind inspect command to project db 2026-05-07 06:38:43 +02:00
Mikael Hugo
2aed04608c fix: bind escalate command to project db 2026-05-07 06:37:21 +02:00
Mikael Hugo
87362f27fc docs: remove mcp server roadmap residue 2026-05-07 06:25:59 +02:00
Mikael Hugo
9bd913d4a1 fix: bind uok status to project db 2026-05-07 06:25:03 +02:00
Mikael Hugo
cc08afc3b1 fix: bind memory command to project db 2026-05-07 06:23:42 +02:00
Mikael Hugo
a2184a0a0e feat: store judgment log in db 2026-05-07 06:22:07 +02:00
Mikael Hugo
2178aa8803 fix: isolate uok message bus db per project 2026-05-07 06:09:32 +02:00
Mikael Hugo
95cb13c08d fix: isolate backlog db per project 2026-05-07 05:54:18 +02:00
Mikael Hugo
6beb6fd412 docs: align replan and state source of truth 2026-05-07 05:52:25 +02:00
Mikael Hugo
03ebc02277 fix: stamp replan triggers in db 2026-05-07 05:41:08 +02:00
Mikael Hugo
95b00d8963 test: cover memory tags schema 2026-05-07 05:38:38 +02:00
Mikael Hugo
5c32d91124 feat: promote schedule and self-feedback state to db 2026-05-07 05:34:42 +02:00
Mikael Hugo
cd5926a17a fix: auto-compact uok message bus 2026-05-07 05:23:08 +02:00
Mikael Hugo
5bc3895586 feat: expose uok message bus metrics 2026-05-07 05:19:41 +02:00
Mikael Hugo
268e7ac678 feat: publish uok diagnostics to observer inbox 2026-05-07 05:08:44 +02:00
Mikael Hugo
c0973ac287 fix: complete gate cost micro-usd migration 2026-05-07 05:07:57 +02:00
Mikael Hugo
7c39165c81 Tier 2.7: Migrate cost_usd to cost_micro_usd for accurate accounting
- Schema version bumped to 36
- Add migrateCostUsdToMicroUsd() helper for safe migration
- Convert cost_usd REAL to cost_micro_usd INTEGER in gate_runs
- Migration: multiply USD values by 1,000,000 to avoid float drift
- Update insertGateRun() to support cost_micro_usd field
- Old cost_usd column retained for backward compatibility

Benefits:
- Eliminates floating-point drift on accumulated costs
- Easier reasoning about cost totals
- Integer arithmetic is faster and more predictable
- Idempotent migration (safe to re-run)

Migration runs automatically on first database open for schema < 36.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-07 05:04:35 +02:00
Mikael Hugo
fce0c4c781 Tier 1.1: Implement vault credential resolver for provider keys
- Add vault-credential-resolver.js: Async credential resolution with vault:// URI support
- Integration with vault-resolver.js (low-level Vault client)
- Update doctor-providers.js to detect and report vault URIs
- Synchronous doctor checks (no network I/O) with lazy async resolution
- Fail-open semantics: vault unavailable -> fall back to plaintext
- 28 tests for credential resolver (all passing)
- ADR-0078: Architecture and auth chain documentation

Features:
- vault://secret/path/to/secret#fieldname URI format
- Auth chain: VAULT_TOKEN -> ~/.vault-token -> AppRole (reserved)
- Helper functions: couldBeVaultUri, hasProviderCredentialEnvVar, resolveProviderCredential, getCredentialValue, formatCredentialInfo
- Full backward compatibility with plaintext keys and auth.json

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-07 04:59:07 +02:00
Mikael Hugo
9ceb0bf229 fix: store backlog items in db 2026-05-07 04:50:13 +02:00
Mikael Hugo
59cfc4f7c3 test: guard against sf mcp server regression 2026-05-07 04:46:09 +02:00
Mikael Hugo
ffde54e05a fix: persist live planning specs in db 2026-05-07 04:44:09 +02:00
Mikael Hugo
8f5f33611a test: cover adaptive uok circuit breaker 2026-05-07 04:42:12 +02:00
Mikael Hugo
856ce4d530 test: cover uok metrics cache refresh 2026-05-07 04:36:08 +02:00
Mikael Hugo
79896b4377 Tier 1.3 Phase 4: Add evidence recording to plan and complete tools
- Updated plan-milestone, plan-slice, plan-task to record planning evidence
- Updated complete-milestone, complete-slice, complete-task to record completion evidence
- All evidence includes relevant spec fields (goals, narratives, decisions, etc.)
- Evidence recorded atomically within transactions
- Enables audit trail queries to reconstruct planning and completion decisions

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-07 04:35:03 +02:00
Mikael Hugo
076e8c4894 Tier 1.3 Phase 3: Add evidence management API
Implements data layer functions for managing and querying spec/evidence data.

New export functions:
- insertMilestoneEvidence(): Append evidence for milestone
- insertSliceEvidence(): Append evidence for slice
- insertTaskEvidence(): Append evidence for task
- getMilestoneAuditTrail(): Query full audit trail (spec + evidence + runtime)
- getSliceAuditTrail(): Query slice audit trail with joined spec/evidence
- getTaskAuditTrail(): Query task audit trail with joined spec/evidence
- getMilestoneSpec(): Get spec only (immutable intent)
- getSliceSpec(): Get slice spec only
- getTaskSpec(): Get task spec only

Key properties:
- Evidence functions use timestamp for recording time (set at insertion)
- Audit trail queries JOIN runtime, spec, and evidence tables
- All queries support data archaeology (reconstruct decision history)
- Spec-only queries useful for validation and re-planning
- All functions include JSDoc with purpose and consumer

This completes Phase 3 of Tier 1.3 implementation. Phase 4 (tool updates) and
Phase 5 (integration tests) follow in next PRs.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-07 04:24:31 +02:00
Mikael Hugo
f3761d7f46 Tier 1.3 Phase 2: Migrate existing data to spec tables
Implements automatic population of new spec tables from existing milestone/slice/task columns.

Migration function: populateSpecTablesFromExisting()
- Runs during schema v32 migration (first database open)
- Populates milestone_specs from existing milestone table spec columns
- Populates slice_specs from existing slice table spec columns
- Populates task_specs from existing task table spec columns
- Uses INSERT OR IGNORE to safely handle existing data
- Sets spec_version to 1 for all migrated specs
- Uses current timestamp for created_at if missing

Key properties:
- Non-destructive: existing runtime rows preserved
- Idempotent: safe to re-run (INSERT OR IGNORE)
- Evidence tables left empty: populated as tools create new evidence
- Evidence populated retroactively in future phase

This completes Phase 2 of Tier 1.3. Phases 3-5 (data layer updates, tool updates, tests) follow.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-07 04:22:41 +02:00
Mikael Hugo
87aa04cf05 Tier 1.3: Add spec/runtime/evidence schema separation (v32)
Implements the 3-table normalization model for milestone, slice, and task entities:

- 9 new tables: {milestone,slice,task}_{specs,evidence} + runtime tables
- milestone_specs: immutable record of intent (vision, goals, risks, proof strategy)
- slice_specs: immutable slice-level intent
- task_specs: immutable task verification criteria
- {entity}_evidence: append-only audit trail with timestamps and phase metadata
- Indices on evidence tables for efficient chronological queries

Key improvements:
- Spec immutability: Write-once specs preserve original intent
- Audit trail: Evidence chain enables data archaeology and decision history
- Query efficiency: Each table contains only relevant columns
- Re-planning clarity: Multiple spec versions can exist for same entity ID
- Forensic capability: Timestamp + phase metadata on evidence rows

Migration:
- Schema version bumped to 32
- Migration runs on first open of existing databases
- No data loss; existing milestone/slice/task rows preserved
- Creates spec and evidence tables from existing columns (future work)

This is Phase 1 of Tier 1.3 implementation (schema definition + basic setup).
Phases 2-5 (migration, data layer updates, tool updates, tests) follow in next PRs.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-07 04:20:32 +02:00
Mikael Hugo
e2b51b62fc fix: correct turn-status integration test assertions
Fixed two assertion issues in turn-status-integration.test.ts:
1. Line 52: Changed .toContain('blocked') to .toContain('blocker')
   - Reason field returns 'Agent discovered blocker—...' not 'Agent discovered blocked—...'
2. Line 225: Changed .toBe(100000 + 1) to .toBe(100000)
   - extractTurnStatus() applies trimEnd() to cleanOutput, removing trailing newline

Result: All 65 turn-status tests passing (31 parser + 34 integration)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-07 04:06:32 +02:00
Mikael Hugo
ca431e7e78 Tier 2.5 Phase 5-6: Documentation and integration tests
Added comprehensive documentation and end-to-end test suite for turn_status:

Phase 5 Documentation:
- Added 'turn_status Marker System' section to preferences-reference.md
- Explains three states (complete/blocked/giving_up)
- Covers why, how, and best practices
- Includes doctor check integration docs

Phase 6 Integration Tests:
- Created turn-status-integration.test.ts (34 tests)
- Tests end-to-end signal pipeline (extraction→resolution→action)
- Tests marker placement, format, case-insensitivity
- Tests multi-block agent output (code, JSON, tool output)
- Tests error handling and edge cases
- Tests signal resolution semantics
- Tests validation and introspection functions
- Tests doctor check integration
- Tests real-world scenarios (research, execute, complete slices)
- Tests cross-cutting concerns (idempotency, side effects)

Test Coverage:
- End-to-end signal pipeline: 6 tests
- Marker placement and format: 5 tests
- Multi-block agent output: 3 tests
- Error handling and edge cases: 5 tests
- Signal resolution semantics: 6 tests
- Validation and introspection: 5 tests
- Doctor check integration: 2 tests
- Real-world scenarios: 3 tests
- Cross-cutting concerns: 3 tests

Results:
- 31 turn-status-parser tests passing (existing)
- 34 turn-status-integration tests passing (new)
- Total: 65/65 passing
- Core build: ✓ passing
- No regressions

Tier 2.5 Complete:
- Phase 1: Markers in prompts ✓
- Phase 2: Parser + extraction ✓
- Phase 4: Doctor check ✓
- Phase 5: Documentation ✓
- Phase 6: Integration tests ✓
- Phase 3: Signal transitions (blocked—pending harness context)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-07 04:04:45 +02:00
Mikael Hugo
88cf545821 fix: exclude generated sf milestones from staging 2026-05-07 04:02:34 +02:00
Mikael Hugo
4f39c3f4c8 docs: tighten sf runtime state boundary 2026-05-07 04:00:58 +02:00
Mikael Hugo
4f217cc88c docs: promote sf state guidance 2026-05-07 03:59:38 +02:00