singularity/singularity-forge

Author	SHA1	Message	Date
Mikael Hugo	b0fce94f9e	feat: record retrieval evidence across context tools	2026-05-07 18:17:41 +02:00
Mikael Hugo	05f185256c	docs: record local cli survey cross-check	2026-05-07 17:22:03 +02:00
Mikael Hugo	b1a7749763	fix: harden widget and provider auth handling	2026-05-07 17:20:52 +02:00
Mikael Hugo	8088489e38	sf snapshot: uncommitted changes after 258m inactivity	2026-05-07 15:37:55 +02:00
Mikael Hugo	87362f27fc	docs: remove mcp server roadmap residue	2026-05-07 06:25:59 +02:00
Mikael Hugo	5c32d91124	feat: promote schedule and self-feedback state to db	2026-05-07 05:34:42 +02:00
Mikael Hugo	fce0c4c781	Tier 1.1: Implement vault credential resolver for provider keys - Add vault-credential-resolver.js: Async credential resolution with vault:// URI support - Integration with vault-resolver.js (low-level Vault client) - Update doctor-providers.js to detect and report vault URIs - Synchronous doctor checks (no network I/O) with lazy async resolution - Fail-open semantics: vault unavailable -> fall back to plaintext - 28 tests for credential resolver (all passing) - ADR-0078: Architecture and auth chain documentation Features: - vault://secret/path/to/secret#fieldname URI format - Auth chain: VAULT_TOKEN -> ~/.vault-token -> AppRole (reserved) - Helper functions: couldBeVaultUri, hasProviderCredentialEnvVar, resolveProviderCredential, getCredentialValue, formatCredentialInfo - Full backward compatibility with plaintext keys and auth.json Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-07 04:59:07 +02:00
Mikael Hugo	87aa04cf05	Tier 1.3: Add spec/runtime/evidence schema separation (v32) Implements the 3-table normalization model for milestone, slice, and task entities: - 9 new tables: {milestone,slice,task}_{specs,evidence} + runtime tables - milestone_specs: immutable record of intent (vision, goals, risks, proof strategy) - slice_specs: immutable slice-level intent - task_specs: immutable task verification criteria - {entity}_evidence: append-only audit trail with timestamps and phase metadata - Indices on evidence tables for efficient chronological queries Key improvements: - Spec immutability: Write-once specs preserve original intent - Audit trail: Evidence chain enables data archaeology and decision history - Query efficiency: Each table contains only relevant columns - Re-planning clarity: Multiple spec versions can exist for same entity ID - Forensic capability: Timestamp + phase metadata on evidence rows Migration: - Schema version bumped to 32 - Migration runs on first open of existing databases - No data loss; existing milestone/slice/task rows preserved - Creates spec and evidence tables from existing columns (future work) This is Phase 1 of Tier 1.3 implementation (schema definition + basic setup). Phases 2-5 (migration, data layer updates, tool updates, tests) follow in next PRs. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-07 04:20:32 +02:00
Mikael Hugo	4f217cc88c	docs: promote sf state guidance	2026-05-07 03:59:38 +02:00
Mikael Hugo	932f17b93a	refactor: rename workflow tool boundary	2026-05-07 03:45:41 +02:00
Mikael Hugo	e35cc3c6b8	docs: align schedule and package state wording	2026-05-07 03:36:56 +02:00
Mikael Hugo	3e6827e7dc	docs: remove stale direct db and mcp guidance	2026-05-07 03:33:14 +02:00
Mikael Hugo	9ab0b9fe63	docs: tighten legacy state fallback wording	2026-05-07 03:25:20 +02:00
Mikael Hugo	39382f7e54	docs: clarify db-backed state guidance	2026-05-07 03:20:20 +02:00
Mikael Hugo	2fae96d539	docs: align runtime state and mcp boundaries	2026-05-07 03:09:55 +02:00
Mikael Hugo	f192dbfca0	docs: add ADR-076 for UOK memory integration decisions Document the three-phase integration of SF memory system with UOK: Phase 1: Unit outcome recording (recordUnitOutcomeInMemory) - Records success/failure patterns with 0.9/0.5 confidence - Fire-and-forget async, never blocks execution Phase 2: Dispatch ranking enhancement (enhanceUnitRankingWithMemory) - Queries memory for similar patterns - Boosts matching candidates by up to 15% (conservative limit) - Deterministic embeddings ensure reproducible ranking Phase 3: Gate context enrichment (enrichGateResultWithMemory) - Diagnostic only; never changes gate pass/fail logic - Helps operators understand recurring issues All memory operations gracefully degrade if DB unavailable. 56 test cases validate integration across all phases. Relates to ADR-0075 (UOK gates), ADR-008 (SF tools). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-07 02:05:01 +02:00
Mikael Hugo	a8634d4a3b	docs: add memory system integration guide for developers Practical quick-start guide for using SF's autonomous memory system: - Record unit outcomes (success/failure patterns) - Enhance dispatch ranking with learned patterns - Add context to gate failures - Core memory operations (create, query, relations) - Common integration patterns - Graceful degradation strategy - Performance notes and best practices - Testing with mocked memory - Debugging helpers Guide covers: - Fire-and-forget async pattern - Never blocks dispatch/execution - Testing strategies for memory-enhanced code - Performance characteristics - Architecture decision: memory is SF-internal Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-07 02:03:34 +02:00
Mikael Hugo	b384c8e0df	docs: clarify memory system is SF-internal, not MCP-exposed Add architecture decision: Memory is not exposed as MCP server. - SF is an MCP client only (consumes external MCP tools) - Memory is internal SF infrastructure (uses SQLite, fire-and-forget async) - Memory exposed as SF tools only (capture, query, graph) - No external MCP exposure needed (memory is autonomous learning, not a service) This keeps SF's learning system private and prevents: - External memory pollution - Uncontrolled confidence scoring - Inconsistent learning patterns - Loss of autonomy (memory decisions stay internal) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-07 01:41:33 +02:00
Mikael Hugo	b6ea800e2e	docs: comprehensive SF memory system architecture reference Add MEMORY-SYSTEM-ARCHITECTURE.md documenting: - All 10 memory modules (store, embeddings, relations, etc.) - Core functions and APIs for each module - Storage schema (SQLite tables) - Integration points (UOK, dispatch, gates) - Usage examples and architecture diagram - Performance characteristics - Graceful degradation strategy - Data retention and growth management This serves as: 1. Reference guide for developers using memory system 2. Architecture overview of autonomous learning 3. Integration point documentation for extensions 4. Future enhancement roadmap Discovered during UOK memory integration work: - Memory system already complete (no duplication needed) - Used for pattern learning, dispatch ranking, and diagnostics - Node 24 native SQLite backend (no external deps) - Fire-and-forget async operations (never blocks) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-07 01:36:08 +02:00
Mikael Hugo	3f099e240c	Update test coverage plan: Phase 3 complete - Phase 1: 48 tests (metrics + triage) ✓ - Phase 2: 31 tests (crash recovery) ✓ - Phase 3: 17 tests (property-based FSM) ✓ - Total: 96 critical path tests + 25 env schema tests = 104 new tests - All passing, coverage targets met Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-07 01:01:47 +02:00
Mikael Hugo	2d465b11fd	test: add comprehensive Phase 1 coverage for dispatch loop (48 tests) - Add metrics.test.ts: 21 tests for unit outcome recording, model performance tracking, fire-and-forget safety, persistence, error handling - Add triage-self-feedback.test.ts: 27 tests for report classification, confidence thresholds, auto-fix, deduplication, severity categorization, async safety Purpose: Increase coverage of critical autonomous dispatch paths from 40% to 60%+. Covers fire-and-forget patterns (metrics recording and auto-fix application must not block dispatch), concurrent recording safety, graceful degradation on error. Tests validate: ✓ Unit outcome recording without blocking ✓ Per-task-type model performance tracking ✓ Fire-and-forget error handling (metrics/fixes don't break dispatch) ✓ Concurrent metric recording race conditions ✓ Persistence atomicity ✓ Report classification by type/severity ✓ Confidence thresholds (0.85-0.95 per type) ✓ Auto-fix deduplication and prioritization ✓ Async triage without blocking dispatch Phase 1 complete: 48 tests, all passing. Phase 2: Recovery path hardening (recovery/forensics) Phase 3: Property-based FSM testing (fast-check) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-07 00:38:19 +02:00
Mikael Hugo	6be23806fe	feat: comprehensive environment schema with type-safe validation - Expand env.ts with completeSfEnvSchema covering all 80+ SF_* variables - Organize variables into logical categories (core, directories, performance, debug, extensions, recovery, settings, misc) - Add typed API: getCompleteSfEnv(), parseCompleteSfEnv(), getEnvValidationSummary() - Support graceful degradation (missing config returns partial data, never throws) - Add 25 comprehensive test cases covering schema, parsing, defaults, round-trips - Document in docs/ENV.md with quick start, API reference, migration guide Purpose: Prevent silent misconfiguration by centralizing environment validation, enabling IDE auto-completion, and providing clear defaults. Callers get type-safe access to all config instead of scattered process.env reads. Consumers: loader.ts for startup validation, all modules reading configuration. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-07 00:31:59 +02:00
Mikael Hugo	f2db20b4d6	docs: add SQLite migration guide for Node 24 upgrade Comprehensive guide for migrating from JSON to node:sqlite when Node 24 is available: - Schema design (model_outcomes + model_stats tables) - Phase-by-phase refactoring approach - Data migration from JSON with backward compatibility - Testing strategy with new SQLite-specific tests - Future opportunities: dashboards, trend analysis, A/B testing, federated learning This doc serves as a roadmap for ~2 days of work when Node 24 becomes standard. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-06 23:03:50 +02:00
Mikael Hugo	553ba23b89	integrate: hook quick wins into UOK dispatch loop Integration of 3 quick wins into existing UOK infrastructure: 1. Model Learning (Quick Win #2) → metrics.js - Record outcomes to model-learner for per-task-type performance tracking - Hook: recordUnitOutcome() now calls ModelLearner.recordOutcome() - Fire-and-forget: never blocks outcome recording on learning failure - Enables adaptive model routing decisions in downstream gates 2. Self-Report Fixing (Quick Win #1) → triage-self-feedback.js - Auto-fix high-confidence reports (>0.85) in applyTriageReport() - Hook: After triage and requirement promotion, apply auto-fixes - Fire-and-forget: never blocks report application on fix failure - Returns reportsAutoFixed count for triage metrics 3. Knowledge Injection (Quick Win #3) → already integrated in auto-prompts.js - Already active in execute-task prompt template - Semantic matching with graceful degradation All integration points: - Fire-and-forget: learning/fixing failures never block dispatch - UOK-native: use existing outcome recording, db, gates - Backward compatible: applyTriageReport now async, but callers handle it - No new dependencies: all modules already in codebase Testing: 2934 tests pass (no regressions from integration) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-06 22:34:41 +02:00
Mikael Hugo	0e2edfdebf	feat: implement 3 quick wins for SF self-evolution Quick Win 1: Close Self-Report Feedback Loop [9/10 impact] - Added self-report-fixer.js module with automatic fix classification - Pattern-based detection for high-confidence fixes (e.g., prompt rubrics) - Deduplication and severity-based categorization of reports - Designed for extension into triage-self-feedback pipeline Quick Win 2: Activate Continuous Model Learning [8/10 impact] - Added model-learner.js with ModelPerformanceTracker class - Per-task-type tracking: success rate, latency, cost, token efficiency - Auto-demotion for models failing >50% on specific task types - A/B testing infrastructure for hypothesis testing on low-risk tasks - Failure analysis with pattern detection (e.g., timeouts, quality issues) - Storage: .sf/model-performance.json, .sf/model-failure-log.jsonl Quick Win 3: Automate Knowledge Injection [7/10 impact] - Added knowledge-injector.js with semantic similarity scoring - Integrated into auto-prompts.js for execute-task prompts - queryKnowledge already exists in context-store.js (60% done) - Enhanced with: semantic matching, confidence filtering, contradiction detection - Tracks knowledge usage for feedback loop Integration: - Modified auto-prompts.js to inject knowledge via knowledgeInjection variable - Added getKnowledgeInjection helper for graceful degradation - All new modules pass build check and are in dist/ Status: Core infrastructure in place; ready for integration into dispatch loop. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-06 22:01:37 +02:00
Mikael Hugo	8fd59e156d	sf snapshot: uncommitted changes after 321m inactivity	2026-05-06 21:53:05 +02:00
Mikael Hugo	6471e10245	sf snapshot: uncommitted changes after 64m inactivity	2026-05-06 16:28:31 +02:00
Mikael Hugo	f655188814	sf snapshot: uncommitted changes after 93m inactivity	2026-05-06 11:37:27 +02:00
Mikael Hugo	a73ea845e7	sf snapshot: uncommitted changes after 61m inactivity	2026-05-06 10:04:20 +02:00
Mikael Hugo	500a9d1c1d	fix: move unit runtime under uok ownership	2026-05-06 07:02:28 +02:00
Mikael Hugo	76b218762b	fix: harden sf autonomous runtime	2026-05-06 06:02:46 +02:00
Mikael Hugo	adf28d69b4	feat: run solver eval from autonomous lifecycle	2026-05-06 04:02:40 +02:00
Mikael Hugo	a1fd6cfc05	fix: separate headless transport from autonomous mode	2026-05-06 02:24:15 +02:00
Mikael Hugo	3960e42b26	docs: align sf purpose doctrine and docs	2026-05-06 00:38:36 +02:00
Mikael Hugo	d75ebfe7c3	sf snapshot: uncommitted changes after 43m inactivity	2026-05-05 21:39:56 +02:00
Mikael Hugo	22fa995500	fix: avoid lockfile churn during doctor install	2026-05-05 20:24:30 +02:00
Mikael Hugo	ab6cad4c84	fix: clean provider surfaces and core build	2026-05-05 16:31:53 +02:00
Mikael Hugo	4c98cb8c33	fix: make autonomous mode canonical	2026-05-05 15:42:10 +02:00
Mikael Hugo	55e7dd0e02	fix: clean generated harness residue	2026-05-05 15:04:34 +02:00
Mikael Hugo	00a118ea71	chore: commit current workspace state	2026-05-05 14:46:18 +02:00
Mikael Hugo	47c806d733	fix: version sf extension runtime sources	2026-05-04 23:27:20 +02:00
Mikael Hugo	ed4a4bc93a	chore: commit current worktree state	2026-05-04 19:28:39 +02:00
Mikael Hugo	a37737c4af	docs: memory-relations.ts is now ranker-live Updates `23c5de38b` (which flagged the table as storage-only) to reflect that `55b14c3f7` wired the ranker consumer (graph-boost in getRelevantMemoriesRanked) and `b9bff3762` wired the writer (co-extraction linkage in applyMemoryActions). The graph-aware pipeline is now end-to-end live, with named relation types, auto-linking confidence (0.5), intra-pool boost, and damping (0.4). Honest description for contributors reading top-down. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-03 00:13:56 +02:00
Mikael Hugo	23c5de38bf	docs: clarify memory-relations.ts is storage-only today The architecture.md entry implied memory-relations.ts contributes to ranking ("knowledge-graph edges between memories"). The read consumer doesn't exist yet — getRelevantMemoriesRanked uses cosine + static score, not graph traversal. Relations are written via /sf memory import / createMemoryRelation but never read for ranking. Updated the description so a contributor reading this file knows the graph-traversal pipeline is the next logical extension, not something that currently runs. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-02 23:52:38 +02:00
Mikael Hugo	daa192a572	docs: list memory-* modules in architecture.md The repo's architecture file listed only `memory-extractor.ts` and `memory-store.ts` — the rest of the memory subsystem (`memory-embeddings.ts`, `memory-embeddings-llm-gateway.ts`, `memory-relations.ts`, `memory-source-store.ts`) had no entry, so a new contributor reading the file would miss them entirely. Added one-line descriptions for each, including the gateway adapter's opt-in env-var contract (`SF_LLM_GATEWAY_KEY`). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-02 23:29:03 +02:00
Mikael Hugo	a3ef4bdf3f	fix(sf): remove workflow tool aliases	2026-05-02 18:32:50 +02:00
Mikael Hugo	ba4bab1034	fix(sf): correct stale .sf milestone paths in prompts + ADR-impl absolute links prompts/parallel-research-slices.md step 3 told the dispatcher to verify research at `.sf/{{mid}}/`, but slice research files actually live at `.sf/milestones/{{mid}}/slices/<sliceId>/<sliceId>-RESEARCH.md`. Step 3 verification could only ever fail. prompts/validate-milestone.md sent the three milestone-validation reviewer agents to wrong paths: - parentTrace pointed at `.sf/{{milestoneId}}/S0X-SUMMARY.md` (slice summaries actually live at `.sf/milestones/{{milestoneId}}/slices/S0X/`) - Reviewer A read `.sf/{{milestoneId}}/REQUIREMENTS.md` (the file is at project-level `.sf/REQUIREMENTS.md`) - Reviewer A scanned `.sf/{{milestoneId}}/` for slice SUMMARYs (wrong dir) - Reviewer C read `.sf/{{milestoneId}}/CONTEXT.md` (actual file is `.sf/milestones/{{milestoneId}}/{{milestoneId}}-CONTEXT.md`) Reviewers would either return false MISSING / FAIL verdicts or have to re-discover the layout. docs/dev/ADR-{008,009}-IMPLEMENTATION-PLAN.md "Related ADR" links pointed to absolute paths inside a contributor's old Mac (`/Users/jeremymcspadden/Github/sf-2/...`). Replaced with sibling-file relative paths. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-02 18:06:16 +02:00
Mikael Hugo	21113e18a9	fix: update remaining stale repo and scope refs to singularity-forge After fixing forensics.md and error-classifier.ts last fire, swept the rest of the tree for the same class of stale reference: - scripts/validate-pack.js: criticalPackages list used `@sf` and `@sf-build` scopes, neither of which exists in node_modules; this is in CI (.github/workflows/ci.yml) + prepublishOnly, so the validation step was failing to find anything. Now `@singularity-forge/pi-coding-agent` and `@singularity-forge/rpc-client` (the actual scope). - src/resources/skills/github-workflows/references/gh/SKILL.md: same GraphQL bug as forensics.md (owner:"sf-build" name:"sf-2") and three `gh project` commands using owner sf-build. The gh issue create command above already used singularity-forge/sf-run, so the follow-up calls always failed. Also retitled "sf-2 Backlog" to "sf-run Backlog". - src/resources/extensions/sf/bootstrap/system-context.ts: deprecation warning linked to https://github.com/sf-build/SF/issues/1492. - packages/mcp-server/README.md, packages/rpc-client/README.md: 9 refs to `@sf-build/...` for installable package names, which would mislead anyone copy-pasting into npm install. - docs/user-docs/troubleshooting.md (+ zh-CN): GitHub Issues link pointed at github.com/sf-build/SF/issues. - docs/user-docs/getting-started.md (+ zh-CN): clone URL was correct but the next `cd` was `cd sf-2/docker`, which won't exist after a fresh clone of sf-run. - docs/dev/ci-cd-pipeline.md: GHCR org was `sf-build`. Code comments containing "sf-2" / "sf-build" in non-active places (parsers.ts banner, error message URLs in tests, dev-doc absolute paths from a contributor's Mac) left alone: they're informational and not addressed by users or runtime. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-02 18:01:55 +02:00
Mikael Hugo	61485c5bef	fix(sf): remove legacy completion tool aliases	2026-05-02 17:51:38 +02:00
Mikael Hugo	85a0188fe1	fix(sf): stabilize auto notices and package checks	2026-05-02 12:39:27 +02:00

177 commits