Notifications from ctx.ui.notify() and workflow-logger now persist to
.gsd/notifications.jsonl instead of evaporating as transient toasts.
- notification-store: JSONL persistence with 500-entry rotation, atomic
temp+rename rewrites, ref-counted suppress API, disk-synced counters
- notify-interceptor: WeakSet-guarded monkey-patch on ctx.ui.notify
installed at session_start and session_switch
- notification-widget: always-on belowEditor strip showing unread count
- notification-overlay: scrollable Ctrl+Alt+N panel with severity filter
- /gsd notifications command: clear, tail, filter subcommands
- workflow-logger: warnings now also persist to notification store
- web API: GET/DELETE /api/notifications with ?countOnly support
- 16 unit tests covering store, suppress, project isolation, resync
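
The rotation and ref-counted suppress behaviour listed above can be sketched roughly as follows. This is an in-memory illustration only: `NotificationBuffer`, `NotificationEntry`, and the method names are hypothetical, and the real notification-store (file I/O, atomic temp+rename, disk-synced counters) is not shown.

```typescript
// Hypothetical sketch; names and shapes are assumptions, not the store's API.
type Severity = "info" | "warning" | "error";

interface NotificationEntry {
  ts: number;
  severity: Severity;
  message: string;
}

const MAX_ENTRIES = 500; // rotation limit stated in the commit message

class NotificationBuffer {
  private entries: NotificationEntry[] = [];
  private suppressDepth = 0; // ref count: above zero means persistence is paused

  // Each suppress() call returns its own release function.
  suppress(): () => void {
    this.suppressDepth += 1;
    let released = false;
    return () => {
      if (released) return; // a double release must not unbalance the count
      released = true;
      this.suppressDepth -= 1;
    };
  }

  // Returns false when the entry was dropped because of an active suppress.
  append(entry: NotificationEntry): boolean {
    if (this.suppressDepth > 0) return false;
    this.entries.push(entry);
    if (this.entries.length > MAX_ENTRIES) {
      // Keep only the newest MAX_ENTRIES entries.
      this.entries = this.entries.slice(-MAX_ENTRIES);
    }
    return true;
  }

  // One JSON object per line, the shape a JSONL file would hold.
  toJsonl(): string {
    return this.entries.map((e) => JSON.stringify(e)).join("\n");
  }

  get size(): number {
    return this.entries.length;
  }
}
```

Ref counting lets nested callers suppress independently: persistence resumes only after every caller has released.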
The welcome screen lines stopped short on wide terminals because
termWidth was capped at 200 columns. Remove the cap so separator
lines extend to the full terminal width.
- Use `git reset --hard <sha>` for rollback instead of `git branch -f`,
which fails on checked-out branches and worktrees
- Clear pendingProviderRegistrations after preflush to prevent duplicate
registration when bindCore() runs
- Process Ollama stream content on terminal `done:true` chunks to avoid
truncating trailing assistant text
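
The `done:true` fix can be illustrated with a minimal sketch, assuming chunks have already been parsed from the NDJSON stream. `ChatChunk` models only the fields used here, and `collectAssistantText` is a hypothetical helper, not the provider's actual function.

```typescript
// Hypothetical sketch of reading content from the terminal chunk as well.
interface ChatChunk {
  message?: { content?: string };
  done: boolean;
}

function collectAssistantText(chunks: readonly ChatChunk[]): string {
  let text = "";
  for (const chunk of chunks) {
    // Append content before checking `done`, so a terminal chunk that still
    // carries text is not silently dropped.
    text += chunk.message?.content ?? "";
    if (chunk.done) break;
  }
  return text;
}
```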
The system prompt hardcoded ~/.gsd/agent/skills/ paths for bundled skills,
causing ENOENT loops when skills weren't installed at those locations. The
auto-mode loop treated ENOENT as transient and retried indefinitely.
- Replace hardcoded skill paths in system.md with {{bundledSkillsTable}} template
variable, resolved dynamically via resolveSkillReference() at runtime
- Replace hardcoded templates dir path with {{templatesDir}} variable
- Add buildBundledSkillsTable() to system-context.ts — only includes skills
that actually exist on disk
- Export getTemplatesDir() from prompt-loader.ts
- Add Rule 4 to detect-stuck.ts: same ENOENT path seen twice in the sliding
window triggers immediate stuck detection (missing files don't self-heal)
- Add 4 tests for Rule 4 coverage
Closes #3575
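
Rule 4 as described above might look roughly like this. `LoopEvent` and `hasRepeatedEnoent` are hypothetical names, and the actual window representation in detect-stuck.ts may differ.

```typescript
// Hypothetical sketch of "same ENOENT path seen twice in the sliding window".
interface LoopEvent {
  errorCode?: string; // e.g. "ENOENT"
  path?: string; // path the failing operation tried to access
}

function hasRepeatedEnoent(window: readonly LoopEvent[]): boolean {
  const seenPaths = new Set<string>();
  for (const event of window) {
    if (event.errorCode !== "ENOENT" || event.path === undefined) continue;
    // A second miss on the same path means retrying will not help.
    if (seenPaths.has(event.path)) return true;
    seenPaths.add(event.path);
  }
  return false;
}
```

Two ENOENT errors on different paths do not trigger the rule; only a repeat of the same path does.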
- Move new coercion tests to standalone file using node:test +
node:assert/strict (per CONTRIBUTING testing standards)
- Remove tests from legacy complete-slice.test.ts to avoid mixing
test frameworks in the same file
LLMs sometimes pass plain strings instead of the expected object shape
for array fields like filesModified and requires, causing TypeBox
validation to reject the input before the execute function runs. This
adds Type.Union schemas to accept both formats and normalizes strings
to objects with sensible defaults in the execute functions.
Closes #3565
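
The string-to-object normalization can be sketched without TypeBox as below. The `FileModified` shape and the empty-string default are assumptions for illustration; the real schemas and defaults live in the tool definitions.

```typescript
// Hypothetical sketch: accept either plain strings or full objects.
interface FileModified {
  path: string;
  description: string;
}

function normalizeFilesModified(
  input: ReadonlyArray<string | FileModified>,
): FileModified[] {
  return input.map((item) =>
    // A bare string is treated as the path, with an assumed empty description.
    typeof item === "string" ? { path: item, description: "" } : item,
  );
}
```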
The flat-rate provider guard from #3552 can fail open in two scenarios:
1. Provider alias mismatch — isFlatRateProvider only matched the exact
string "github-copilot", but "copilot" appears as a provider alias
in the codebase. Case variations could also bypass the check.
Fix: add "copilot" alias and lowercase input before set membership.
2. Unresolved primary model — when resolveModelId returns undefined
(stale model ID, registry mismatch), the guard was skipped entirely,
allowing dynamic routing to downgrade models on a flat-rate backend.
Fix: fall back to autoModeStartModel.provider and ctx.model.provider
when primary resolution fails, disabling routing if either indicates
a flat-rate provider.
Ref: #3453
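
Both fixes can be sketched together as follows. `isFlatRateProvider` is named in the message above; `shouldDisableRouting` and its parameter names are hypothetical stand-ins for the guard's call site.

```typescript
// Hypothetical sketch of the hardened guard.
const FLAT_RATE_PROVIDERS = new Set(["github-copilot", "copilot"]);

function isFlatRateProvider(provider: string | undefined): boolean {
  // Lowercase before the set lookup so case variations cannot bypass it.
  return provider !== undefined && FLAT_RATE_PROVIDERS.has(provider.toLowerCase());
}

function shouldDisableRouting(
  resolvedProvider: string | undefined, // undefined when primary resolution failed
  startModelProvider: string | undefined,
  ctxModelProvider: string | undefined,
): boolean {
  if (resolvedProvider !== undefined) {
    return isFlatRateProvider(resolvedProvider);
  }
  // Fail closed: if either fallback source indicates flat-rate, disable routing.
  return isFlatRateProvider(startModelProvider) || isFlatRateProvider(ctxModelProvider);
}
```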
The prompt injection scan flags "You are now responsible" in
doctor-heal.md as role injection (matches "you are now [a-z]").
This is a pre-existing legitimate prompt instruction, not injection.
promptGuidelines from every registered tool are injected into the system
prompt on every API call. The return shape details were redundant (the
JSON response is self-describing). Keep only the sqlite3 prohibition.
1. Replace ensureDbOpen() with isDbAvailable() in gsd_milestone_status
so the read-only tool cannot create/migrate the DB as a side effect
2. Wrap all reads in a BEGIN/COMMIT transaction for snapshot consistency
under concurrent WAL writes
3. Broaden negative regex in guardrail tests to catch sqlite3 with
flags, relative paths, absolute paths, and quoted paths
Add 4-layer defense-in-depth to enforce single-writer WAL discipline:
1. Global anti-pattern in system.md protecting all 35+ auto-mode units
2. DB access safety blocks in 5 high-risk prompts (validate-milestone,
complete-milestone, doctor-heal, forensics, reassess-roadmap)
3. New gsd_milestone_status read-only query tool giving the LLM a
sanctioned path to inspect milestone/slice/task state
4. 14 regression tests (8 prompt guardrails + 6 tool coverage)
Closes #3541
Replace the OpenAI-compat shim with a native Ollama /api/chat streaming
provider that exposes all commonly-used Ollama options and surfaces
inference performance metrics.
Key changes:
- Native NDJSON streaming from /api/chat (no more OpenAI shim)
- Known models send num_ctx from capability table; unknown models defer
to Ollama's default to avoid OOM on constrained hosts
- Exposes: temperature, top_p, top_k, repeat_penalty, seed, num_gpu,
keep_alive, num_predict via per-model providerOptions
- Extracts <think>...</think> blocks for reasoning models (deepseek-r1, qwq)
- Surfaces InferenceMetrics (tokens/sec, durations) on AssistantMessage
- Adds remove and show actions to ollama_manage LLM tool
- Adds "ollama-chat" to KnownApi, providerOptions to Model<TApi>
- NDJSON parser uses strict mode for chat (fails on malformed frames)
- Mixed content+tool_call chunks handled independently
Closes #3544
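
The `<think>...</think>` extraction can be illustrated on a fully assembled response. This sketch assumes the complete text is available; the streaming provider has to handle tags split across chunks, which is not shown. `splitThinkBlocks` and `SplitResult` are hypothetical names.

```typescript
// Hypothetical sketch: separate reasoning blocks from the visible answer.
interface SplitResult {
  reasoning: string;
  answer: string;
}

function splitThinkBlocks(text: string): SplitResult {
  const pattern = /<think>([\s\S]*?)<\/think>/g;
  const reasoningParts: string[] = [];
  const answer = text.replace(pattern, (_match: string, inner: string) => {
    reasoningParts.push(inner.trim());
    return ""; // remove the block from the visible answer
  });
  return { reasoning: reasoningParts.join("\n"), answer: answer.trim() };
}
```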
When requirements are authored in REQUIREMENTS.md during the discussion
phase (the standard workflow), the DB requirements table stays empty.
gsd_requirement_update then fails with not_found for every requirement
at milestone completion, burning tokens on retries.
When updateRequirementInDb encounters a requirement ID not in the DB,
it now parses REQUIREMENTS.md via parseRequirementsSections() and seeds
all requirements into the DB before retrying the lookup. This preserves
the original content (class, description, why, source, validation)
instead of creating an empty skeleton.
The seeding is:
- Lazy: only runs on first miss, not on every update
- Collision-safe: skips IDs already in the DB
- Non-blocking: falls through to skeleton if REQUIREMENTS.md is
missing or unparseable
Adds 1 regression test verifying that updating R005 when the DB is
empty seeds all 3 requirements from REQUIREMENTS.md with their
original content preserved.
Closes #3346
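
A rough sketch of the seed-on-miss flow, using a `Map` in place of the DB. `Requirement`, `updateRequirement`, and the `loadFromMarkdown` callback are hypothetical; the callback stands in for parsing REQUIREMENTS.md and returns `null` when the file is missing or unparseable.

```typescript
// Hypothetical sketch: seed from the markdown source only on a lookup miss.
interface Requirement {
  id: string;
  description: string;
}

function updateRequirement(
  db: Map<string, Requirement>,
  id: string,
  patch: Partial<Requirement>,
  loadFromMarkdown: () => Requirement[] | null,
): Requirement {
  if (!db.has(id)) {
    const parsed = loadFromMarkdown();
    if (parsed !== null) {
      for (const req of parsed) {
        // Collision-safe: never overwrite a row that already exists.
        if (!db.has(req.id)) db.set(req.id, req);
      }
    }
  }
  // Non-blocking: fall through to an empty skeleton if seeding found nothing.
  const existing = db.get(id) ?? { id, description: "" };
  const updated: Requirement = { ...existing, ...patch, id };
  db.set(id, updated);
  return updated;
}
```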
S##-CONTEXT.md files produced by /gsd discuss (require_slice_discussion)
are never injected into downstream prompt builders. Discussed
requirements, acceptance criteria, and design decisions are silently
dropped — the researcher, planner, completer, replanner, and
reassessor never see them.
Add resolveSliceFile(base, mid, sid, "CONTEXT") + inlineFileOptional()
to all 5 affected builders:
1. buildResearchSlicePrompt
2. buildPlanSlicePrompt
3. buildCompleteSlicePrompt
4. buildReplanSlicePrompt
5. buildReassessRoadmapPrompt
The slice CONTEXT is placed immediately after the roadmap and before
other context (research, decisions, requirements) so the discussed
scope is visible before detailed planning artifacts.
Uses the existing inlineFileOptional() pattern — if no S##-CONTEXT.md
exists, nothing is injected (zero cost for projects not using slice
discussion).
Adds 5 regression tests verifying each builder resolves and inlines
the slice CONTEXT file.
Closes #3452
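
The optional-inline pattern and the placement of the slice CONTEXT can be sketched like this. `inlineOptional`, `buildPrompt`, and the `ReadFile` callback are hypothetical simplifications; the real builders use resolveSliceFile() and inlineFileOptional() with their own signatures.

```typescript
// Hypothetical sketch: a missing file contributes nothing to the prompt.
type ReadFile = (path: string) => string | null;

function inlineOptional(read: ReadFile, path: string, label: string): string {
  const body = read(path);
  return body === null ? "" : `## ${label}\n\n${body.trim()}\n`;
}

function buildPrompt(
  read: ReadFile,
  roadmapPath: string,
  contextPath: string,
  researchPath: string,
): string {
  return [
    inlineOptional(read, roadmapPath, "Roadmap"),
    // Slice CONTEXT goes right after the roadmap, before other context.
    inlineOptional(read, contextPath, "Slice Context"),
    inlineOptional(read, researchPath, "Research"),
  ]
    .filter((section) => section.length > 0)
    .join("\n");
}
```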
Update What's New section from v2.52 to v2.63, expand native engine
docs to cover all 20+ modules, add missing extensions and ADRs to
indexes, update version references and Node.js requirements.
Address Codex adversarial review findings:
1. Only re-apply the validated model when createAgentSession() signals
a fallback (modelFallbackMessage is truthy). This prevents silently
overriding the persisted model of resumed conversations.
2. Use modelRegistry.getAvailable() instead of find() to ensure the
model's provider is request-ready before calling setModel().
3. Await session.setModel() and wrap in try/catch so provider auth
failures don't surface as unhandled promise rejections at startup.
Applies to both print-mode and interactive-mode startup paths.
Extension-provided models (e.g. claude-code/*) were unavailable during
findInitialModel() because pendingProviderRegistrations had not been
flushed yet, causing the fallback chain to select Google Gemini even
when the user explicitly configured claude-code as their default.
Three compounding issues fixed:
(A) Flush pendingProviderRegistrations in createAgentSession() before
findInitialModel() runs, so extension models are in the registry
when initial model selection happens.
(B) Re-apply the validated model to the session after
validateConfiguredModel() in both print and interactive CLI paths.
Previously, validation updated settingsManager but never called
session.setModel(), leaving the session on the wrong model.
(C) Update defaultModelPerProvider.anthropic from "claude-opus-4-6[1m]"
to "claude-opus-4-6" — the [1m] variant was removed from the model
registry when the base model was upgraded to 1M context, causing the
Anthropic fallback to silently fail and skip to Google.
Closes #3534
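
The ordering problem in (A) can be shown with a minimal sketch. `ModelRegistry`, `queueRegistration`, `flushPending`, and the model IDs are hypothetical; only the idea of flushing queued registrations before initial model selection comes from the message above.

```typescript
// Hypothetical sketch: extension models are invisible until the queue is flushed.
interface ModelInfo {
  provider: string;
  id: string;
}

class ModelRegistry {
  private models: ModelInfo[] = [];
  private pending: Array<() => ModelInfo[]> = [];

  queueRegistration(register: () => ModelInfo[]): void {
    this.pending.push(register);
  }

  flushPending(): void {
    const queued = this.pending;
    this.pending = []; // clear first so a later flush cannot register twice
    for (const register of queued) {
      this.models.push(...register());
    }
  }

  find(provider: string, id: string): ModelInfo | undefined {
    return this.models.find((m) => m.provider === provider && m.id === id);
  }

  get count(): number {
    return this.models.length;
  }
}

function findInitialModel(
  registry: ModelRegistry,
  preferred: ModelInfo,
  fallback: ModelInfo,
): ModelInfo {
  return registry.find(preferred.provider, preferred.id) ?? fallback;
}
```

Calling `findInitialModel` before `flushPending` yields the fallback even though the preferred model is queued, which mirrors the reported symptom.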
* fix: detect Xcode bundles by suffix scan in worktree health check (#1882)
Xcode project directories have project-specific names (e.g. Sudokuxyz.xcodeproj)
that cannot be matched by the exact-filename PROJECT_FILES list. Add a
readdirSync suffix scan for *.xcodeproj and *.xcworkspace so iOS/macOS projects
are not incorrectly treated as greenfield when the health check runs.
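
The suffix scan amounts to something like the sketch below. `hasXcodeBundle` and `XCODE_SUFFIXES` are hypothetical names, and the function takes a list of directory entries (as readdirSync would return) rather than touching the filesystem itself.

```typescript
// Hypothetical sketch: match Xcode bundles by suffix, not by exact filename.
const XCODE_SUFFIXES = [".xcodeproj", ".xcworkspace"];

function hasXcodeBundle(entries: readonly string[]): boolean {
  return entries.some((name) =>
    XCODE_SUFFIXES.some((suffix) => name.endsWith(suffix)),
  );
}
```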
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: replace empty catch with debugLog in Xcode bundle scan
The silent-catch-diagnostics test (#3348) bans empty catch blocks in
migrated auto-mode files. Replace the bare `catch { /* best-effort */ }`
with a debugLog call to satisfy the workflow-logger requirement.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(perf): share jiti module cache across extension loads (#2108)
Each extension was creating a new jiti instance with moduleCache: false,
causing shared dependencies to be recompiled for every extension. Use a
shared singleton with moduleCache: true so shared modules are compiled once.
Export resetExtensionLoaderCache() for test teardown and explicit reload.
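
The singleton-with-reset shape can be sketched generically. This does not call jiti: `Loader`, `getSharedLoader`, and the `create` factory are hypothetical, and only `resetExtensionLoaderCache` is a name taken from the message above.

```typescript
// Hypothetical sketch: one lazily created loader shared by all extensions.
interface Loader {
  load(id: string): string;
}

let sharedLoader: Loader | undefined;

function getSharedLoader(create: () => Loader): Loader {
  if (sharedLoader === undefined) {
    // Created once; later callers reuse the instance and its module cache.
    sharedLoader = create();
  }
  return sharedLoader;
}

function resetExtensionLoaderCache(): void {
  // Drop the shared instance so the next call builds a fresh one.
  sharedLoader = undefined;
}
```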
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: correct loader path in extension-load-perf test (4 → 3 levels up)
The test file is at src/tests/ (2 levels deep from repo root), so
fileURLToPath(import.meta.url) + 3x'..' reaches the repo root.
Using 4 levels exits the repo into the GitHub Actions workspace parent,
causing ERR_MODULE_NOT_FOUND for loader.js in dist/.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: use process.cwd() for loader path in perf test (source/compiled portability)
import.meta.url resolves to different depths in source (src/tests/) vs compiled
(dist-test/src/tests/), so relative '../' navigation produces the wrong path in
the build phase. process.cwd() is always the repo root in CI regardless of
where the test file is compiled to.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
---------
Co-authored-by: trek-e <trek-e@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>