singularity/singularity-forge

Author	SHA1	Message	Date
Jeremy	0d3ef6b545	feat(gsd): add LLM safety harness for auto-mode damage control Unified safety layer that monitors, validates, and constrains LLM behavior during auto-mode execution. All components use warn-and-continue policy by default (log violations, notify user, keep going). Components: - Evidence collector: real-time bash/write/edit tool call tracking - Destructive command guard: classifies 10 dangerous patterns (rm -rf, force push, etc.) - File change validator: compares git diff against task plan's expected output - Evidence cross-reference: detects tasks marked complete with zero bash calls - Git checkpoint: pre-unit refs/gsd/checkpoints/ for optional rollback - Content validator: minimum quality checks on plans and summaries - Timeout scale cap: limits timeout multiplier to 6x (was unlimited) New preference: safety_harness with per-component toggles. Enabled by default, auto_rollback off by default. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 15:00:06 -05:00
Tibsfox	857c45bd6a	fix(gsd): replace remaining empty catch with logWarning	2026-04-05 12:21:36 -07:00
Tibsfox	f4ecfd1a56	fix(gsd): use logWarning instead of raw stderr in catch blocks	2026-04-05 12:14:23 -07:00
Tibsfox	5a51631941	fix(gsd): log error instead of empty catch in STATE.md rebuild	2026-04-05 12:08:19 -07:00
Tibsfox	114bde1788	fix(gsd): log error instead of empty catch in skip_slice	2026-04-05 12:06:11 -07:00
Tibsfox	db90607378	fix(gsd): cast milestone classification to string for type safety	2026-04-05 12:05:02 -07:00
Tibsfox	4b68e8c9d9	test(headless): add extension path alignment test	2026-04-05 11:57:40 -07:00
Tibsfox	469acf53af	test(cli): add update diagnostics regression test	2026-04-05 11:57:25 -07:00
Tibsfox	a3f2de828e	test(cli): add node_modules symlink regression test	2026-04-05 11:56:58 -07:00
Tibsfox	3213cd8c80	test(headless): add multi-turn command classification test	2026-04-05 11:56:44 -07:00
Tibsfox	d4b6eb714c	test(pi-coding-agent): add custom provider registration test	2026-04-05 11:56:29 -07:00
Tibsfox	dbe6f9d292	test(ollama): add authMode regression test	2026-04-05 11:56:15 -07:00
Tibsfox	5b6d7784c2	test(gsd): add zero-slice roadmap guided flow test	2026-04-05 11:55:49 -07:00
Tibsfox	824e8e12a8	test(gsd): add skip-slice STATE.md rebuild regression test	2026-04-05 11:55:35 -07:00
Tibsfox	107efc5bff	test(gsd): add worktree main_branch preference test	2026-04-05 11:55:21 -07:00
Tibsfox	5cb04f54ca	test(gsd): add defer capture stamp regression test	2026-04-05 11:55:07 -07:00
Tibsfox	48dc32eeb5	test(tui): add Image dimension caching regression test	2026-04-05 11:54:47 -07:00
Jeremy McSpadden	3e09184493	Merge pull request #3566 from jeremymcs/fix/complete-slice-string-coercion fix(gsd): coerce string arrays to objects in complete-slice/task tools	2026-04-05 13:44:40 -05:00
Tibsfox	523fcd89a8	fix(headless): sync resources and use agent dir for query	2026-04-05 11:35:11 -07:00
Jeremy	0b7764349c	chore(gsd): remove copyright line from test file	2026-04-05 13:33:13 -05:00
Tibsfox	3bcd55ccfd	fix(cli): show latest version and bypass npm cache in update check	2026-04-05 11:33:03 -07:00
Jeremy	e210b7efdf	fix(gsd): follow CONTRIBUTING standards for #3565 - Move new coercion tests to standalone file using node:test + node:assert/strict (per CONTRIBUTING testing standards) - Remove tests from legacy complete-slice.test.ts to avoid mixing test frameworks in the same file	2026-04-05 13:32:56 -05:00
Jeremy	6046a31c6f	fix(gsd): address Codex adversarial review findings for #3565 - verificationEvidence coercion now uses sentinel values (exitCode: -1, verdict: "unknown") instead of fabricating passing results - String coercion for requirements fields now parses "ID — detail" delimiter format to preserve semantic payload - Added regression tests for sentinel values and delimiter parsing Closes #3565	2026-04-05 13:30:09 -05:00
Jeremy	0742cf3493	fix(gsd): coerce string arrays to objects in complete-slice/task tools (#3565) LLMs sometimes pass plain strings instead of the expected object shape for array fields like filesModified and requires, causing TypeBox validation to reject the input before the execute function runs. This adds Type.Union schemas to accept both formats and normalizes strings to objects with sensible defaults in the execute functions. Closes #3565	2026-04-05 13:23:30 -05:00
Tibsfox	5d75705650	fix(cli): resolve hoisted node_modules for global installs	2026-04-05 11:16:10 -07:00
Tibsfox	21a14e32fc	fix(headless): treat discuss and plan as multi-turn commands	2026-04-05 11:14:24 -07:00
Jeremy	3a1e9e3416	fix(gsd): harden flat-rate routing guard against alias/resolution gaps The flat-rate provider guard from #3552 can fail open in two scenarios: 1. Provider alias mismatch — isFlatRateProvider only matched the exact string "github-copilot", but "copilot" appears as a provider alias in the codebase. Case variations could also bypass the check. Fix: add "copilot" alias and lowercase input before set membership. 2. Unresolved primary model — when resolveModelId returns undefined (stale model ID, registry mismatch), the guard was skipped entirely, allowing dynamic routing to downgrade models on a flat-rate backend. Fix: fall back to autoModeStartModel.provider and ctx.model.provider when primary resolution fails, disabling routing if either indicates a flat-rate provider. Ref: #3453	2026-04-05 13:09:44 -05:00
Tibsfox	935cc9a464	fix(pi-coding-agent): register models.json providers and await Ollama probe in headless mode	2026-04-05 11:09:08 -07:00
Tibsfox	352dd17e76	fix(ollama): use apiKey auth mode to avoid streamSimple crash	2026-04-05 11:06:38 -07:00
Tibsfox	cd87c9937d	fix(gsd): treat zero-slice roadmap as pre-planning in guided flow	2026-04-05 11:00:09 -07:00
Tibsfox	8ccab86aac	fix(gsd): rebuild STATE.md after skip-slice and strengthen rethink prompt	2026-04-05 10:55:30 -07:00
Tibsfox	f93dee3733	fix(gsd): use main_branch preference in worktree creation	2026-04-05 10:43:31 -07:00
Tibsfox	b46b113360	fix(gsd): stamp defer and milestone captures as executed after triage	2026-04-05 10:38:29 -07:00
Tibsfox	31e20e0fe3	fix(tui): treat absolute file paths as plain text, not commands	2026-04-05 10:34:34 -07:00
Tibsfox	11239140db	fix(tui): break infinite re-render loop for images in cmux	2026-04-05 10:30:13 -07:00
Tibsfox	9ab675a843	fix(gsd): disable dynamic model routing for flat-rate providers	2026-04-05 10:24:52 -07:00
Tibsfox	a2f7274a82	fix(gsd): rebuild STATE.md before guided-flow dispatch	2026-04-05 10:13:25 -07:00
Tibsfox	2363c94da6	fix(gsd): defer queued shells in active milestone selection	2026-04-05 10:06:05 -07:00
Tibsfox	fde2aafa64	fix(retry): prevent 429 quota cascade and 30-min lockout	2026-04-05 09:37:28 -07:00
deseltrus	58b58d7290	ci: retrigger — previous failures in unrelated tests (DB reconciliation, state-machine)	2026-04-05 18:15:21 +02:00
deseltrus	44aecc9a3f	fix(auto): resilient transient error recovery — defer to Core RetryHandler and fix cmdCtx race Three related bugs cause auto-mode to permanently stop on transient provider errors (overloaded_error, rate limits, 503s) that should be recoverable: 1. Layer 1/Layer 2 race condition: The extension's handleAgentEnd processes agent_end events BEFORE the Core RetryHandler in _processAgentEvent. For transient errors, both layers reacted simultaneously — the extension called pauseAuto (tearing down the session) while the Core called agent.continue() (in-context retry). This ripped the agent out of its context window mid-recovery. Fix: handleAgentEnd now returns immediately for transient errors, letting the Core retry in-context with full conversation preservation. 2. retryState accumulation across resume cycles: consecutiveTransientCount in agent-end-recovery.ts accumulated across pause/resume cycles without resetting, permanently locking out auto-resume after MAX_TRANSIENT_AUTO_RESUMES total (not per-cycle) errors. Fix: resetTransientRetryState() is called before startAuto() in the resume path. MAX_TRANSIENT_AUTO_RESUMES raised from 3 to 8 to cover ~30 minutes of sustained provider overload. 3. ExtensionContext lacks newSession(): The provider-error resume callback receives an ExtensionContext (from the agent_end hook), not an ExtensionCommandContext. startAuto() overwrote s.cmdCtx with this incomplete context, causing 'newSession is not a function' on every subsequent runUnit() call. Fix: startAuto() now checks for newSession before overwriting — on provider-error resume, it preserves the original ExtensionCommandContext. Bonus: Session creation timeout (category=timeout) now calls pauseAuto instead of stopAuto, matching the provider-pause path. Structural errors (TypeError) still hard-stop to prevent infinite retry loops. Fixes #2492	2026-04-05 18:15:21 +02:00
Jeremy McSpadden	a6b7febc5e	Merge pull request #3545 from jeremymcs/feat/ollama-native-chat-provider feat(ollama): native /api/chat provider with full option exposure	2026-04-05 11:05:47 -05:00
Jeremy McSpadden	092d1c0a9e	Merge pull request #3546 from jeremymcs/worktree-issue-3541-ollama-native fix(gsd): prevent LLM from querying gsd.db directly via bash	2026-04-05 10:51:01 -05:00
Jeremy	563fdae8e2	ci: add scanignore for doctor-heal.md false positive The prompt injection scan flags "You are now responsible" in doctor-heal.md as role injection (matches "you are now [a-z]"). This is a pre-existing legitimate prompt instruction, not injection.	2026-04-05 10:22:03 -05:00
Jeremy	bc20104a44	perf(gsd): trim promptGuidelines to 1 line to reduce per-turn token cost promptGuidelines from every registered tool are injected into the system prompt on every API call. The return shape details were redundant (the JSON response is self-describing). Keep only the sqlite3 prohibition.	2026-04-05 10:11:15 -05:00
Jeremy	7d74081434	fix(gsd): address Codex adversarial review findings 1. Replace ensureDbOpen() with isDbAvailable() in gsd_milestone_status so the read-only tool cannot create/migrate the DB as a side effect 2. Wrap all reads in a BEGIN/COMMIT transaction for snapshot consistency under concurrent WAL writes 3. Broaden negative regex in guardrail tests to catch sqlite3 with flags, relative paths, absolute paths, and quoted paths	2026-04-05 09:56:19 -05:00
Jeremy	4d9eb9ead0	fix(gsd): prevent LLM from querying gsd.db directly via bash (#3541) Add 4-layer defense-in-depth to enforce single-writer WAL discipline: 1. Global anti-pattern in system.md protecting all 35+ auto-mode units 2. DB access safety blocks in 5 high-risk prompts (validate-milestone, complete-milestone, doctor-heal, forensics, reassess-roadmap) 3. New gsd_milestone_status read-only query tool giving the LLM a sanctioned path to inspect milestone/slice/task state 4. 14 regression tests (8 prompt guardrails + 6 tool coverage) Closes #3541	2026-04-05 09:43:56 -05:00
Jeremy	4ba2d5a219	feat(ollama): native /api/chat provider with full option exposure Replace the OpenAI-compat shim with a native Ollama /api/chat streaming provider that exposes all commonly-used Ollama options and surfaces inference performance metrics. Key changes: - Native NDJSON streaming from /api/chat (no more OpenAI shim) - Known models send num_ctx from capability table; unknown models defer to Ollama's default to avoid OOM on constrained hosts - Exposes: temperature, top_p, top_k, repeat_penalty, seed, num_gpu, keep_alive, num_predict via per-model providerOptions - Extracts <think>...</think> blocks for reasoning models (deepseek-r1, qwq) - Surfaces InferenceMetrics (tokens/sec, durations) on AssistantMessage - Adds remove and show actions to ollama_manage LLM tool - Adds "ollama-chat" to KnownApi, providerOptions to Model<TApi> - NDJSON parser uses strict mode for chat (fails on malformed frames) - Mixed content+tool_call chunks handled independently Closes #3544	2026-04-05 09:01:40 -05:00
Jeremy McSpadden	dcf41154b8	Merge pull request #3540 from Tibsfox/fix/seed-requirements-from-markdown fix(gsd): seed requirements table from REQUIREMENTS.md on first update	2026-04-05 08:11:59 -05:00
Jeremy McSpadden	5c7e5efcf4	Merge pull request #3539 from Tibsfox/fix/inject-slice-context-into-prompts fix(gsd): inject S##-CONTEXT.md from slice discussion into all prompt builders	2026-04-05 08:07:19 -05:00
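
The string-coercion fix described in commits 0742cf3493 and 6046a31c6f (accept either a plain string or an object, parse the "ID — detail" delimiter, and use sentinel values rather than fabricated passing results) might look roughly like the sketch below. The interface names, the `id`, `detail`, and `command` fields, and both function names are hypothetical illustrations. Only the delimiter format and the sentinel values `exitCode: -1` and `verdict: "unknown"` come from the commit messages.

```typescript
// Hypothetical shapes; the real tool schemas in the repository may differ.
interface RequirementRef {
  id: string;
  detail: string;
}

interface VerificationEvidence {
  command: string; // assumed field name, not confirmed by the log
  exitCode: number;
  verdict: string;
}

// "\u2014" is the em dash used in the "ID — detail" format named in 6046a31c6f.
const DELIMITER = " \u2014 ";

// Normalize a requirement given either as an object or as a plain string.
function coerceRequirement(input: string | RequirementRef): RequirementRef {
  if (typeof input !== "string") return input;
  const idx = input.indexOf(DELIMITER);
  if (idx === -1) {
    // No delimiter: treat the whole string as the ID.
    return { id: input.trim(), detail: "" };
  }
  return {
    id: input.slice(0, idx).trim(),
    detail: input.slice(idx + DELIMITER.length).trim(),
  };
}

// Normalize evidence; a bare string gets sentinel values, never a fake pass.
function coerceEvidence(input: string | VerificationEvidence): VerificationEvidence {
  if (typeof input !== "string") return input;
  return { command: input, exitCode: -1, verdict: "unknown" };
}
```

Keeping the sentinel values distinct from any real exit code lets downstream checks tell "the model did not report a result" apart from "the command passed".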

3806 commits
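
The flat-rate routing guard hardened in commit 3a1e9e3416 can be pictured with the minimal sketch below. The name `isFlatRateProvider` and the two provider strings come from the commit message. The set constant, the `shouldDisableRouting` helper, and its parameters are hypothetical and stand in for whatever the repository actually does with `autoModeStartModel.provider` and `ctx.model.provider`.

```typescript
// Provider IDs named in the commit message; the real set may hold more entries.
const FLAT_RATE_PROVIDERS: ReadonlySet<string> = new Set(["github-copilot", "copilot"]);

// Lowercase before the membership test so case variations cannot bypass the guard.
function isFlatRateProvider(provider: string | undefined): boolean {
  if (!provider) return false;
  return FLAT_RATE_PROVIDERS.has(provider.toLowerCase());
}

// Hypothetical helper: decide whether dynamic routing should be disabled.
// If the primary model resolved, trust its provider. Otherwise fall back to the
// start-model and context-model providers, and disable routing if either is flat-rate.
function shouldDisableRouting(
  resolvedProvider: string | undefined,
  startModelProvider: string | undefined,
  ctxModelProvider: string | undefined,
): boolean {
  if (resolvedProvider !== undefined) {
    return isFlatRateProvider(resolvedProvider);
  }
  return isFlatRateProvider(startModelProvider) || isFlatRateProvider(ctxModelProvider);
}
```

The point of the fallback is that an unresolved model ID no longer skips the guard, so the check fails closed whenever either fallback indicates a flat-rate backend.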