The provider manager let users navigate with arrow keys but pressing
Enter did nothing. Users had no way to set up authentication from within
the /provider command.
Adds selectConfirm (Enter) handler that routes to showLoginDialog for
the selected provider, with a hint in the status bar.
Closes#3579Closes#3567
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Backdrop was painting empty lines with dark gray background (48;5;233),
making the entire screen go black. Now uses dim + gray foreground only.
Message truncation now measures actual prefix width with visibleWidth()
instead of hardcoded 20-char estimate, and uses truncateToWidth() for
proper Unicode handling.
Use dark gray background + dim foreground for visible backdrop effect
instead of barely-perceptible SGR dim. Size overlay box to content
instead of padding to fill the entire viewport.
- Overlay layout: verify backdrop dims base lines, no dim without flag,
overlay composites on top of dimmed background
- Notification store: verify markAllRead and clearNotifications do not
delete a foreign process's lock file
The notification overlay was rendering too small with few entries, allowing
underlying content to bleed through. Added viewport padding to fill the
overlay box and a new `backdrop` option to OverlayOptions that dims the
background behind modal overlays.
newSession() only rebuilt the tool registry when cwd changed. When cwd
stayed the same (e.g., discuss → plan-slice in the same worktree), any
tool narrowing from setActiveTools() persisted — stripping gsd_plan_slice
and other DB tools from auto-mode subagent sessions.
Add an else-branch that calls _refreshToolRegistry with
includeAllExtensionTools:true on every session switch, regardless of cwd.
Also call resetExtensionLoaderCache() in DefaultResourceLoader.reload()
so hot-updated extension code on disk is re-compiled instead of served
from the stale jiti module cache.
Closes#3616
The schema overload detector counted ALL isError tool results toward the
consecutive-failure cap, including bash commands that returned non-zero exit
codes (e.g. rg/grep exit 1 = 'no matches'). Three consecutive exploratory
searches with no matches would trigger the cap and abort the session.
Root cause: the allToolsFailed check used toolResults.every(r => r.isError)
which conflates preparation-phase errors (schema validation, tool-not-found,
tool-blocked) with execution-phase errors (the tool ran successfully but
returned a non-zero exit code).
Fix: track preparationErrorCount alongside tool results. Only preparation
errors (schema/validation failures) increment the consecutive failure
counter. Tool execution errors — like bash exit code 1 — are valid usage
and do not count toward the cap.
Also fixes pre-existing StopReason type mismatches in agent-loop tests
(end_turn → stop, tool_use → toolUse).
Chat component cap: After 100 rendered components, oldest are removed
from the container (session transcript persists on disk via
SessionManager). Prevents unbounded memory growth in long sessions
where thousands of tool calls accumulate DOM-like component trees.
Orphan process prevention: On shutdown, listDescendants(process.pid)
finds ALL child processes (including those spawned by the Bash tool
that bg-shell doesn't track) and kills them with SIGTERM + 500ms
grace + SIGKILL. Prevents orphaned dev servers, build processes, etc.
from persisting after session exit.
Container.render() now returns a stable array reference when output is
unchanged — TUI.doRender() skips ALL post-processing (isImageLine scans,
applyLineResets, differential diffs) when the reference matches.
Loader decouples spinner frame rotation from Text content updates.
Previously every 80ms tick called setText() which invalidated Text's
wrapTextWithAnsi/visibleWidth caches. Now the frame is prepended in
render() while Text caches the message separately.
Text.setText() returns early when text is unchanged, avoiding cache
invalidation on redundant updates.
ToolExecutionComponent.dispose() clears heavy references (image maps,
diff previews, result data) so GC can reclaim memory when components
are removed from the chat history.
- Use `git reset --hard <sha>` for rollback instead of `git branch -f`
which fails on checked-out branches and worktrees
- Clear pendingProviderRegistrations after preflush to prevent duplicate
registration when bindCore() runs
- Process Ollama stream content on terminal `done:true` chunks to avoid
truncating trailing assistant text
Replace the OpenAI-compat shim with a native Ollama /api/chat streaming
provider that exposes all commonly-used Ollama options and surfaces
inference performance metrics.
Key changes:
- Native NDJSON streaming from /api/chat (no more OpenAI shim)
- Known models send num_ctx from capability table; unknown models defer
to Ollama's default to avoid OOM on constrained hosts
- Exposes: temperature, top_p, top_k, repeat_penalty, seed, num_gpu,
keep_alive, num_predict via per-model providerOptions
- Extracts <think>...</think> blocks for reasoning models (deepseek-r1, qwq)
- Surfaces InferenceMetrics (tokens/sec, durations) on AssistantMessage
- Adds remove and show actions to ollama_manage LLM tool
- Adds "ollama-chat" to KnownApi, providerOptions to Model<TApi>
- NDJSON parser uses strict mode for chat (fails on malformed frames)
- Mixed content+tool_call chunks handled independently
Closes#3544
Extension-provided models (e.g. claude-code/*) were unavailable during
findInitialModel() because pendingProviderRegistrations had not been
flushed yet, causing the fallback chain to select Google Gemini even
when the user explicitly configured claude-code as their default.
Three compounding issues fixed:
(A) Flush pendingProviderRegistrations in createAgentSession() before
findInitialModel() runs, so extension models are in the registry
when initial model selection happens.
(B) Re-apply the validated model to the session after
validateConfiguredModel() in both print and interactive CLI paths.
Previously, validation updated settingsManager but never called
session.setModel(), leaving the session on the wrong model.
(C) Update defaultModelPerProvider.anthropic from "claude-opus-4-6[1m]"
to "claude-opus-4-6" — the [1m] variant was removed from the model
registry when the base model was upgraded to 1M context, causing the
Anthropic fallback to silently fail and skip to Google.
Closes#3534
* fix(perf): share jiti module cache across extension loads (#2108)
Each extension was creating a new jiti instance with moduleCache: false,
causing shared dependencies to be recompiled for every extension. Use a
shared singleton with moduleCache: true so shared modules are compiled once.
Export resetExtensionLoaderCache() for test teardown and explicit reload.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: correct loader path in extension-load-perf test (4 → 3 levels up)
The test file is at src/tests/ (2 levels deep from repo root), so
fileURLToPath(import.meta.url) + 3x'..' reaches the repo root.
Using 4 levels exits the repo into the GitHub Actions workspace parent,
causing ERR_MODULE_NOT_FOUND for loader.js in dist/.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: use process.cwd() for loader path in perf test (source/compiled portability)
import.meta.url resolves to different depths in source (src/tests/) vs compiled
(dist-test/src/tests/), so relative '../' navigation produces the wrong path in
the build phase. process.cwd() is always the repo root in CI regardless of
where the test file is compiled to.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
---------
Co-authored-by: trek-e <trek-e@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: detect and block Gemini CLI OAuth tokens used as API keys
Users who install Google's standalone Gemini CLI may inadvertently set
GEMINI_API_KEY to an OAuth access token (ya29.*) instead of an AI Studio
API key (AIza*). These tokens fail at the Google API with a confusing
error. This adds early detection at three entry points:
- AuthStorage.set(): throws when storing ya29.* as api_key for "google"
- AuthStorage.getApiKey(): blocks ya29.* from runtime overrides (--api-key)
- AuthStorage.getApiKey(): blocks ya29.* from environment variables
Each path provides a clear error message explaining the issue and
directing users to either get an API key from aistudio.google.com or
use /login google-gemini-cli for OAuth-based access.
Fixes#2157
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: retrigger CI
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: trek-e <trek-e@users.noreply.github.com>
* fix: cap consecutive tool validation failures to prevent stuck-loop (#2783)
When the LLM repeatedly emits tool calls with arguments that fail schema
validation, the agent loop retries indefinitely — each failed validation
returns an error tool result, the LLM retries with the same broken args,
and the cycle burns budget with no progress.
Add a consecutive-failure counter in runLoop that tracks turns where ALL
tool calls fail. After MAX_CONSECUTIVE_VALIDATION_FAILURES (3) consecutive
all-error turns, the loop emits a diagnostic stop message and terminates
cleanly. The counter resets whenever any tool call in a turn succeeds, so
intermittent failures do not trigger early termination.
Closes#2783
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* chore: retrigger CI
* fix(test): repair agent-loop.test.ts — close unclosed blocks, merge imports
Two test suites were concatenated without closing the first suite's
it+describe blocks, placing the second suite's imports inside a function
body and triggering 'Unexpected "{" ' from esbuild. Merged into a single
well-structured file with consolidated imports and proper closings.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: trek-e <trek-e@users.noreply.github.com>
Fixes#2075
The isBuiltIn check in detectExtensionConflicts used a path-based
heuristic that excluded paths containing /.gsd/agent/extensions/.
Since initResources() syncs all bundled extensions into that same
directory, the heuristic could never return true for bundled extensions,
preventing the "supersedes" hint from being appended. This caused
downstream cli.ts to display "Extension load error" instead of the
intended "Extension conflict" for benign precedence collisions.
Replace the path heuristic with an explicit bundledExtensionKeys set
(already computed by buildResourceLoader) threaded through to
detectExtensionConflicts. The set is matched against the extension
directory name extracted from the owner path.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add gsd_progress, gsd_roadmap, gsd_history, gsd_doctor, gsd_captures,
and gsd_knowledge tools that parse .gsd/ on disk — no session needed.
Inline lightweight readers in src/readers/ keep the package standalone
(zero new dependencies). 33 new tests, 64 total passing.
Users with existing lsp.json overrides referencing the old
"kotlin-language-server" key would silently lose their Kotlin
LSP config after the rename to "kotlin-lsp". LEGACY_ALIASES
map remaps old keys during mergeServers() so overrides still
merge correctly.
- Add BeforeModelSelectEvent interface and BeforeModelSelectResult type to types.ts
- Add on('before_model_select') subscription overload to ExtensionAPI interface
- Add emitBeforeModelSelect() method to ExtensionAPI interface and ExtensionRuntimeState
- Implement emitBeforeModelSelect() on ExtensionRunner using invokeHandlers (first-override-wins)
- Bind runner's emitBeforeModelSelect into shared runtime at construction time
- Wire emitBeforeModelSelect delegation through createExtensionAPI in loader.ts
The existing repairToolJson only handles YAML bullet lists (#2660).
Two additional malformation patterns from smaller models now cause
tool call failures and stuck retry loops:
1. XML parameter tags mixed into JSON values (#3403):
LLMs (especially Haiku-class) sometimes emit hybrid XML/JSON
syntax like <parameter name="X">value</parameter> inside
JSON string values. Add stripXmlParameterTags() to remove
the tags while preserving content.
2. Truncated numeric values (#3464):
Smaller models emit incomplete numbers like "exitCode": -,
or "durationMs": , when values are cut off mid-generation.
Add repairTruncatedNumbers() to replace these with 0.
Both repairs run before the existing YAML bullet repair phase.
The AJV validation layer (coerceTypes: true) then handles any
remaining string-to-number coercion.
Adds 13 new tests covering detection and repair for both patterns.
Closes#3464, closes#3403, addresses #3369
Explicit model changes aborted the active backoff sleep but left already-queued retry callbacks alive. That let stale provider retries continue after a user or runtime model switch, which could keep surfacing the old provider's backoff errors in the new session state.
Cancel the full retry generation on explicit model switches by clearing queued continue callbacks in RetryHandler, then cover the behavior with regression tests for both the session switch ordering and the queued-retry cancellation path.
The TUI slash dispatcher started treating any unrecognized /command as handled before session.prompt() could resolve extension commands, prompt templates, or /skill:* inputs. That blocked valid non-builtin slash commands and also let /export swallow unrelated /export* prefixes.
Move unknown-command detection to the interactive entry points, allow only known builtins or session-resolved slash commands through, gate /skill:* on the skill-command setting, and tighten /export matching to exact command tokens.
Move regression tests and override tests from standalone files into
the existing test files introduced by PR #666:
- resolve-config-value.test.ts: add REGRESSION #666 describe block
and setAllowedCommandPrefixes override tests
- url-utils.test.ts: add REGRESSION #666 describe block and
setFetchAllowedUrls override tests
- Delete: regression-666.test.ts, resolve-config-value-override.test.ts,
url-utils-override.test.ts
Same 59 tests, fewer files, tests live next to the code they test.
PR #666 introduced hardcoded SAFE_COMMAND_PREFIXES and SSRF URL
blocklists with no override mechanism. Users with non-standard
credential tools (sops, doppler, age, infisical) or needing to fetch
from internal URLs (self-hosted docs, VPN services) were silently
blocked with no recourse.
Add two global-only settings (ignored in project-level settings.json
to preserve the security property against malicious repos):
- allowedCommandPrefixes: replaces the built-in command allowlist
- fetchAllowedUrls: exempts hostnames from SSRF blocking
Both also support env var overrides (GSD_ALLOWED_COMMAND_PREFIXES,
GSD_FETCH_ALLOWED_URLS) for CI/container environments. Env vars
take precedence over settings.json.
Security model: global-only keys are stripped from project settings
at load time via stripGlobalOnlyKeys(), applied at all three
assignment points for this.projectSettings. The merge function
stays untouched — no future caller can accidentally skip stripping.
15 new tests covering override behavior, cache invalidation,
allowlist exemptions, and global-only enforcement.