LLMs sometimes pass simple string-array fields (provides, keyFiles, etc.)
as a plain string instead of a single-element array, causing TypeBox schema
validation to reject the call before the execute function's coercion logic
can run. Fix by accepting Union([Array, String]) in the schema and adding
wrapArray() coercion for all 8 simple array fields in the execute function.
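
A minimal sketch of the coercion step described above. The commit names wrapArray() but does not show its body, so the implementation and the example parameter object here are assumptions:

```typescript
// Assumed shape of wrapArray(): accept a bare string, an array, or nothing,
// and always hand back an array.
function wrapArray(value: string | string[] | undefined): string[] {
  if (value === undefined) return [];
  return Array.isArray(value) ? value : [value];
}

// Illustrative tool-call parameters where the model sent a bare string
// for one field and a proper array for another.
const params: { provides?: string | string[]; keyFiles?: string | string[] } = {
  provides: "auth-middleware",
  keyFiles: ["src/auth.ts", "src/session.ts"],
};

const provides = wrapArray(params.provides);
const keyFiles = wrapArray(params.keyFiles);
```

The schema side (accepting either form) has to be relaxed as well, since validation runs before this code is reached.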
1. Post-execution retry bypass (auto-verification.ts)
- When postExecBlockingFailure is true, skip retry and pause immediately
- Post-exec failures are cross-task consistency issues that retrying won't fix
- Added test in post-exec-retry-bypass.test.ts
2. File path normalization (pre-execution-checks.ts)
- Added normalizeFilePath() to handle ./path vs path equivalence
- Normalizes backslashes, removes duplicate slashes, strips leading ./
- Applied to checkFilePathConsistency() and checkTaskOrdering()
- Added tests for path normalization in pre-execution-checks.test.ts
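
A rough sketch of the three normalization steps listed for normalizeFilePath(). The body is an assumption based on the description, not the code from pre-execution-checks.ts:

```typescript
// Assumed implementation: the order matters, since duplicate slashes are
// collapsed before the leading "./" is stripped.
function normalizeFilePath(filePath: string): string {
  return filePath
    .replace(/\\/g, "/") // backslashes become forward slashes
    .replace(/\/{2,}/g, "/") // collapse runs of slashes
    .replace(/^(\.\/)+/, ""); // strip one or more leading "./"
}
```

With this, "./src/a.ts", ".\\src\\a.ts" and "src//a.ts" all compare equal to "src/a.ts".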
3. Pre-exec fail-closed (auto-post-unit.ts)
- Added try/catch around runPreExecutionChecks() inside runSafely block
- If runPreExecutionChecks throws, set preExecPauseNeeded = true
- Used logError from workflow-logger (not raw stderr)
- Added test in pre-execution-fail-closed.test.ts
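
The fail-closed behaviour in item 3 could look roughly like the following. The result shape (`blocking`) and the function name shouldPauseAfterPreExec are hypothetical; only the try/catch, the pause flag and the use of a logger come from the description above:

```typescript
// Hypothetical result shape for the pre-execution checks.
interface PreExecResult {
  blocking: boolean;
}

// Returns whether auto-mode should pause. A throw from the checks is
// treated the same as a blocking failure (fail closed).
function shouldPauseAfterPreExec(
  runPreExecutionChecks: () => PreExecResult,
  logError: (message: string) => void,
): boolean {
  try {
    return runPreExecutionChecks().blocking;
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    logError(`pre-execution checks threw: ${message}`);
    return true;
  }
}
```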
autoStartTime was never saved to paused-session.json, so cross-session
resume always started with autoStartTime=0 and the widget showed no
elapsed timer. It is now saved on pause and restored on resume, falling
back to Date.now() for paused-session files written before this change.
Also fixes widget layout: elapsed/ETA stays on the header line above
the milestone/branch info line.
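
The restore-with-fallback logic for autoStartTime might be sketched as below. The PausedSession interface is reduced to the one field mentioned above, and the function name is hypothetical:

```typescript
// Only the field discussed in this commit; the real file holds more.
interface PausedSession {
  autoStartTime?: number; // absent in files written before the fix
}

// Hypothetical helper: prefer the saved start time, otherwise fall back
// to the current time so the elapsed timer at least starts from resume.
function restoreAutoStartTime(session: PausedSession, now: number = Date.now()): number {
  const saved = session.autoStartTime;
  return typeof saved === "number" && saved > 0 ? saved : now;
}
```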
The enhanced_verification_* preferences were validated and typed but not
included in mergePreferences(), causing project-level overrides to be
silently ignored. This fix ensures project preferences properly merge
with user-level defaults.
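
A simplified illustration of the merge fix. Only enhanced_verification_strict is named in this log; the second key below is a hypothetical stand-in for the other enhanced_verification_* settings, and the real mergePreferences() handles many more fields:

```typescript
interface Preferences {
  enhanced_verification_strict?: boolean;
  enhanced_verification_pre?: boolean; // hypothetical key, for illustration only
}

// Project-level values override user-level defaults; a key that the project
// leaves unset keeps the user-level value.
function mergePreferences(user: Preferences, project: Preferences): Preferences {
  return {
    enhanced_verification_strict:
      project.enhanced_verification_strict ?? user.enhanced_verification_strict,
    enhanced_verification_pre:
      project.enhanced_verification_pre ?? user.enhanced_verification_pre,
  };
}
```

The bug class is easy to miss because a key omitted from the merge still validates and type-checks; it is simply never copied.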
Integrates pre/post-execution checks into auto-mode:
- auto-verification.ts: runEnhancedPreChecks/runEnhancedPostChecks integration
- auto-post-unit.ts: pause control flow when blocking checks fail
- Respects enhanced_verification_strict preference for blocking vs warning
Control flow: blocking failures trigger auto-mode pause for user review.
Adds 3 post-execution checks that run after task completion:
- Import resolution: verifies relative imports resolve to existing files
- Export verification: confirms exported symbols are defined
- Type consistency: validates function return types match declarations
All checks follow the permissive-by-default pattern (R012) - warnings don't block.
Adds 4 pre-execution checks that run before each task:
- File ops review: surfaces create/edit/delete intent for manual review
- Read-before-create guard: fails when plan reads a file before creating it
- Package existence: verifies npm packages exist before install attempts
- Interface contract: warns on mismatched function signatures
Includes preference types and validation for enhanced_verification settings.
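
As an illustration of the read-before-create guard, here is one way such a check could work. The PlannedOp shape and the function name are hypothetical; the actual plan representation is not shown in this log:

```typescript
// Hypothetical, flattened view of a task plan as an ordered list of file ops.
interface PlannedOp {
  kind: "read" | "create";
  path: string;
}

// Returns every path that is read at some step and only created later.
function findReadBeforeCreate(ops: readonly PlannedOp[]): string[] {
  const readSoFar = new Set<string>();
  const violations = new Set<string>();
  for (const op of ops) {
    if (op.kind === "read") {
      readSoFar.add(op.path);
    } else if (readSoFar.has(op.path)) {
      violations.add(op.path);
    }
  }
  return Array.from(violations);
}
```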
The welcome screen lines stopped short on wide terminals because
termWidth was capped at 200 columns. Remove the cap so separator
lines extend to the full terminal width.
- Use `git reset --hard <sha>` for rollback instead of `git branch -f`
which fails on checked-out branches and worktrees
- Clear pendingProviderRegistrations after preflush to prevent duplicate
registration when bindCore() runs
- Process Ollama stream content on terminal `done:true` chunks to avoid
truncating trailing assistant text
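
The Ollama fix in the last bullet comes down to reading a chunk's content before acting on its done flag. A minimal sketch, with the chunk type reduced to the two fields involved and a hypothetical function name:

```typescript
// Reduced view of an /api/chat stream chunk.
interface OllamaChatChunk {
  message?: { content?: string };
  done: boolean;
}

// Accumulates assistant text. Content is appended first, so text carried
// on the terminal done:true chunk is not dropped.
function collectText(chunks: readonly OllamaChatChunk[]): string {
  let text = "";
  for (const chunk of chunks) {
    text += chunk.message?.content ?? "";
    if (chunk.done) break;
  }
  return text;
}
```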
The system prompt hardcoded ~/.gsd/agent/skills/ paths for bundled skills,
causing ENOENT loops when skills weren't installed at those locations. The
auto-mode loop treated ENOENT as transient and retried indefinitely.
- Replace hardcoded skill paths in system.md with {{bundledSkillsTable}} template
variable, resolved dynamically via resolveSkillReference() at runtime
- Replace hardcoded templates dir path with {{templatesDir}} variable
- Add buildBundledSkillsTable() to system-context.ts — only includes skills
that actually exist on disk
- Export getTemplatesDir() from prompt-loader.ts
- Add Rule 4 to detect-stuck.ts: same ENOENT path seen twice in the sliding
window triggers immediate stuck detection (missing files don't self-heal)
- Add 4 tests for Rule 4 coverage
Closes #3575
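
A sketch of what Rule 4 could look like. The UnitError shape and function name are hypothetical; the rule itself (same ENOENT path seen twice in the sliding window) is as described above:

```typescript
// Hypothetical record of one error observed in the sliding window.
interface UnitError {
  code: string;
  path: string;
}

// True when the same missing path shows up twice in the window.
function hasRepeatedEnoent(window: readonly UnitError[]): boolean {
  const seen = new Set<string>();
  for (const err of window) {
    if (err.code !== "ENOENT") continue;
    if (seen.has(err.path)) return true;
    seen.add(err.path);
  }
  return false;
}
```

Two different missing paths do not trigger the rule; only a repeat of the same path does, which matches the reasoning that a missing file will not appear on retry.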
- Move new coercion tests to standalone file using node:test +
node:assert/strict (per CONTRIBUTING testing standards)
- Remove tests from legacy complete-slice.test.ts to avoid mixing
test frameworks in the same file
LLMs sometimes pass plain strings instead of the expected object shape
for array fields like filesModified and requires, causing TypeBox
validation to reject the input before the execute function runs. This
adds Type.Union schemas to accept both formats and normalizes strings
to objects with sensible defaults in the execute functions.
Closes #3565
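
For the string-to-object normalization, something along these lines would work. The FileModified fields and the empty-string default are assumptions; the log only says strings become objects "with sensible defaults":

```typescript
// Hypothetical object shape for a filesModified entry.
interface FileModified {
  path: string;
  description: string;
}

// Accepts a mix of bare strings and full objects and returns objects only.
function normalizeFilesModified(
  input: ReadonlyArray<string | FileModified>,
): FileModified[] {
  return input.map((entry) =>
    typeof entry === "string" ? { path: entry, description: "" } : entry,
  );
}
```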
The flat-rate provider guard from #3552 can fail open in two scenarios:
1. Provider alias mismatch — isFlatRateProvider only matched the exact
string "github-copilot", but "copilot" appears as a provider alias
in the codebase. Case variations could also bypass the check.
Fix: add "copilot" alias and lowercase input before set membership.
2. Unresolved primary model — when resolveModelId returns undefined
(stale model ID, registry mismatch), the guard was skipped entirely,
allowing dynamic routing to downgrade models on a flat-rate backend.
Fix: fall back to autoModeStartModel.provider and ctx.model.provider
when primary resolution fails, disabling routing if either indicates
a flat-rate provider.
Ref: #3453
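
Both fixes can be sketched together as follows. isFlatRateProvider is named above; shouldDisableRouting and its parameter names are hypothetical stand-ins for the guard's call site:

```typescript
// Aliases named in the commit, stored lowercase.
const FLAT_RATE_PROVIDERS = new Set(["github-copilot", "copilot"]);

function isFlatRateProvider(provider: string | undefined): boolean {
  return provider !== undefined && FLAT_RATE_PROVIDERS.has(provider.toLowerCase());
}

// Hypothetical call site: if the primary model resolved, check its provider;
// otherwise fall back to the start model and the context model.
function shouldDisableRouting(
  primaryProvider: string | undefined,
  startModelProvider: string | undefined,
  ctxModelProvider: string | undefined,
): boolean {
  if (primaryProvider !== undefined) return isFlatRateProvider(primaryProvider);
  return isFlatRateProvider(startModelProvider) || isFlatRateProvider(ctxModelProvider);
}
```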
The prompt injection scan flags "You are now responsible" in
doctor-heal.md as role injection (matches "you are now [a-z]").
This is a pre-existing legitimate prompt instruction, not injection.
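
To show why the phrase trips the scan, and one possible way to exempt it: the pattern is the one quoted above, while the case-insensitive flag and the per-file allowlist are assumptions and may differ from how the scan actually handles the exception:

```typescript
// Pattern quoted in the commit message; the "i" flag is assumed.
const ROLE_INJECTION = /you are now [a-z]/i;

// Hypothetical allowlist of known-legitimate phrases, keyed by prompt file.
const ALLOWED_PHRASES: ReadonlyMap<string, readonly string[]> = new Map([
  ["doctor-heal.md", ["You are now responsible"]],
]);

// Removes allowlisted phrases for the given file, then applies the pattern.
function flagsRoleInjection(fileName: string, text: string): boolean {
  let scanned = text;
  for (const phrase of ALLOWED_PHRASES.get(fileName) ?? []) {
    scanned = scanned.split(phrase).join("");
  }
  return ROLE_INJECTION.test(scanned);
}
```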
promptGuidelines from every registered tool are injected into the system
prompt on every API call. The return shape details were redundant (the
JSON response is self-describing). Keep only the sqlite3 prohibition.
1. Replace ensureDbOpen() with isDbAvailable() in gsd_milestone_status
so the read-only tool cannot create/migrate the DB as a side effect
2. Wrap all reads in a BEGIN/COMMIT transaction for snapshot consistency
under concurrent WAL writes
3. Broaden negative regex in guardrail tests to catch sqlite3 with
flags, relative paths, absolute paths, and quoted paths
Add 4-layer defense-in-depth to enforce single-writer WAL discipline:
1. Global anti-pattern in system.md protecting all 35+ auto-mode units
2. DB access safety blocks in 5 high-risk prompts (validate-milestone,
complete-milestone, doctor-heal, forensics, reassess-roadmap)
3. New gsd_milestone_status read-only query tool giving the LLM a
sanctioned path to inspect milestone/slice/task state
4. 14 regression tests (8 prompt guardrails + 6 tool coverage)
Closes #3541
Replace the OpenAI-compat shim with a native Ollama /api/chat streaming
provider that exposes all commonly-used Ollama options and surfaces
inference performance metrics.
Key changes:
- Native NDJSON streaming from /api/chat (no more OpenAI shim)
- Known models send num_ctx from capability table; unknown models defer
to Ollama's default to avoid OOM on constrained hosts
- Exposes: temperature, top_p, top_k, repeat_penalty, seed, num_gpu,
keep_alive, num_predict via per-model providerOptions
- Extracts <think>...</think> blocks for reasoning models (deepseek-r1, qwq)
- Surfaces InferenceMetrics (tokens/sec, durations) on AssistantMessage
- Adds remove and show actions to ollama_manage LLM tool
- Adds "ollama-chat" to KnownApi, providerOptions to Model<TApi>
- NDJSON parser uses strict mode for chat (fails on malformed frames)
- Mixed content+tool_call chunks handled independently
Closes #3544
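
Two of the pieces listed above, strict NDJSON parsing and <think> extraction, can be illustrated in isolation. Both function names and return shapes are hypothetical and not taken from the provider's source:

```typescript
// Strict NDJSON: parse every complete line and let JSON.parse throw on a
// malformed frame. The trailing partial line is returned for the next read.
function parseNdjsonStrict(buffer: string): { frames: unknown[]; rest: string } {
  const lines = buffer.split("\n");
  const rest = lines.pop() ?? "";
  const frames: unknown[] = [];
  for (const line of lines) {
    if (line.trim() === "") continue;
    frames.push(JSON.parse(line));
  }
  return { frames, rest };
}

// Separates <think>...</think> blocks from the visible answer text.
function splitThinking(text: string): { thinking: string; content: string } {
  const thoughts: string[] = [];
  const content = text.replace(/<think>([\s\S]*?)<\/think>/g, (_match, inner: string) => {
    thoughts.push(inner.trim());
    return "";
  });
  return { thinking: thoughts.join("\n"), content: content.trim() };
}
```

This version of splitThinking assumes the whole message is available; a streaming implementation would also have to handle a tag split across chunks.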