newSession() only rebuilt the tool registry when cwd changed. When cwd
stayed the same (e.g., discuss → plan-slice in the same worktree), any
tool narrowing from setActiveTools() persisted — stripping gsd_plan_slice
and other DB tools from auto-mode subagent sessions.
Add an else-branch that calls _refreshToolRegistry with
includeAllExtensionTools:true on every session switch, regardless of cwd.
Also call resetExtensionLoaderCache() in DefaultResourceLoader.reload()
so hot-updated extension code on disk is re-compiled instead of served
from the stale jiti module cache.
Closes#3616
- Use `git reset --hard <sha>` for rollback instead of `git branch -f`
which fails on checked-out branches and worktrees
- Clear pendingProviderRegistrations after preflush to prevent duplicate
registration when bindCore() runs
- Process Ollama stream content on terminal `done:true` chunks to avoid
truncating trailing assistant text
Replace the OpenAI-compat shim with a native Ollama /api/chat streaming
provider that exposes all commonly-used Ollama options and surfaces
inference performance metrics.
Key changes:
- Native NDJSON streaming from /api/chat (no more OpenAI shim)
- Known models send num_ctx from capability table; unknown models defer
to Ollama's default to avoid OOM on constrained hosts
- Exposes: temperature, top_p, top_k, repeat_penalty, seed, num_gpu,
keep_alive, num_predict via per-model providerOptions
- Extracts <think>...</think> blocks for reasoning models (deepseek-r1, qwq)
- Surfaces InferenceMetrics (tokens/sec, durations) on AssistantMessage
- Adds remove and show actions to ollama_manage LLM tool
- Adds "ollama-chat" to KnownApi, providerOptions to Model<TApi>
- NDJSON parser uses strict mode for chat (fails on malformed frames)
- Mixed content+tool_call chunks handled independently
Closes#3544
Extension-provided models (e.g. claude-code/*) were unavailable during
findInitialModel() because pendingProviderRegistrations had not been
flushed yet, causing the fallback chain to select Google Gemini even
when the user explicitly configured claude-code as their default.
Three compounding issues fixed:
(A) Flush pendingProviderRegistrations in createAgentSession() before
findInitialModel() runs, so extension models are in the registry
when initial model selection happens.
(B) Re-apply the validated model to the session after
validateConfiguredModel() in both print and interactive CLI paths.
Previously, validation updated settingsManager but never called
session.setModel(), leaving the session on the wrong model.
(C) Update defaultModelPerProvider.anthropic from "claude-opus-4-6[1m]"
to "claude-opus-4-6" — the [1m] variant was removed from the model
registry when the base model was upgraded to 1M context, causing the
Anthropic fallback to silently fail and skip to Google.
Closes#3534
* fix(perf): share jiti module cache across extension loads (#2108)
Each extension was creating a new jiti instance with moduleCache: false,
causing shared dependencies to be recompiled for every extension. Use a
shared singleton with moduleCache: true so shared modules are compiled once.
Export resetExtensionLoaderCache() for test teardown and explicit reload.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: correct loader path in extension-load-perf test (4 → 3 levels up)
The test file is at src/tests/ (2 levels deep from repo root), so
fileURLToPath(import.meta.url) + 3x'..' reaches the repo root.
Using 4 levels exits the repo into the GitHub Actions workspace parent,
causing ERR_MODULE_NOT_FOUND for loader.js in dist/.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: use process.cwd() for loader path in perf test (source/compiled portability)
import.meta.url resolves to different depths in source (src/tests/) vs compiled
(dist-test/src/tests/), so relative '../' navigation produces the wrong path in
the build phase. process.cwd() is always the repo root in CI regardless of
where the test file is compiled to.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
---------
Co-authored-by: trek-e <trek-e@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: detect and block Gemini CLI OAuth tokens used as API keys
Users who install Google's standalone Gemini CLI may inadvertently set
GEMINI_API_KEY to an OAuth access token (ya29.*) instead of an AI Studio
API key (AIza*). These tokens fail at the Google API with a confusing
error. This adds early detection at three entry points:
- AuthStorage.set(): throws when storing ya29.* as api_key for "google"
- AuthStorage.getApiKey(): blocks ya29.* from runtime overrides (--api-key)
- AuthStorage.getApiKey(): blocks ya29.* from environment variables
Each path provides a clear error message explaining the issue and
directing users to either get an API key from aistudio.google.com or
use /login google-gemini-cli for OAuth-based access.
Fixes#2157
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: retrigger CI
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: trek-e <trek-e@users.noreply.github.com>
Fixes#2075
The isBuiltIn check in detectExtensionConflicts used a path-based
heuristic that excluded paths containing /.gsd/agent/extensions/.
Since initResources() syncs all bundled extensions into that same
directory, the heuristic could never return true for bundled extensions,
preventing the "supersedes" hint from being appended. This caused
downstream cli.ts to display "Extension load error" instead of the
intended "Extension conflict" for benign precedence collisions.
Replace the path heuristic with an explicit bundledExtensionKeys set
(already computed by buildResourceLoader) threaded through to
detectExtensionConflicts. The set is matched against the extension
directory name extracted from the owner path.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Users with existing lsp.json overrides referencing the old
"kotlin-language-server" key would silently lose their Kotlin
LSP config after the rename to "kotlin-lsp". LEGACY_ALIASES
map remaps old keys during mergeServers() so overrides still
merge correctly.
- Add BeforeModelSelectEvent interface and BeforeModelSelectResult type to types.ts
- Add on('before_model_select') subscription overload to ExtensionAPI interface
- Add emitBeforeModelSelect() method to ExtensionAPI interface and ExtensionRuntimeState
- Implement emitBeforeModelSelect() on ExtensionRunner using invokeHandlers (first-override-wins)
- Bind runner's emitBeforeModelSelect into shared runtime at construction time
- Wire emitBeforeModelSelect delegation through createExtensionAPI in loader.ts
Explicit model changes aborted the active backoff sleep but left already-queued retry callbacks alive. That let stale provider retries continue after a user or runtime model switch, which could keep surfacing the old provider's backoff errors in the new session state.
Cancel the full retry generation on explicit model switches by clearing queued continue callbacks in RetryHandler, then cover the behavior with regression tests for both the session switch ordering and the queued-retry cancellation path.
Move regression tests and override tests from standalone files into
the existing test files introduced by PR #666:
- resolve-config-value.test.ts: add REGRESSION #666 describe block
and setAllowedCommandPrefixes override tests
- url-utils.test.ts: add REGRESSION #666 describe block and
setFetchAllowedUrls override tests
- Delete: regression-666.test.ts, resolve-config-value-override.test.ts,
url-utils-override.test.ts
Same 59 tests, fewer files, tests live next to the code they test.
PR #666 introduced hardcoded SAFE_COMMAND_PREFIXES and SSRF URL
blocklists with no override mechanism. Users with non-standard
credential tools (sops, doppler, age, infisical) or needing to fetch
from internal URLs (self-hosted docs, VPN services) were silently
blocked with no recourse.
Add two global-only settings (ignored in project-level settings.json
to preserve the security property against malicious repos):
- allowedCommandPrefixes: replaces the built-in command allowlist
- fetchAllowedUrls: exempts hostnames from SSRF blocking
Both also support env var overrides (GSD_ALLOWED_COMMAND_PREFIXES,
GSD_FETCH_ALLOWED_URLS) for CI/container environments. Env vars
take precedence over settings.json.
Security model: global-only keys are stripped from project settings
at load time via stripGlobalOnlyKeys(), applied at all three
assignment points for this.projectSettings. The merge function
stays untouched — no future caller can accidentally skip stripping.
15 new tests covering override behavior, cache invalidation,
allowlist exemptions, and global-only enforcement.
Self-contained extension at src/resources/extensions/ollama/ that
auto-detects a running Ollama instance, discovers locally pulled models,
and registers them as a first-class provider with zero configuration.
Features:
- Auto-discovery of local models via /api/tags on session_start
- Capability detection (vision, reasoning, context window) for 40+ model families
- /ollama slash command with status, list, pull, remove, ps subcommands
- ollama_manage LLM-callable tool for agent-driven model operations
- Onboarding flow with auto-detect (no API key required)
- Non-blocking async probe — doesn't delay TUI paint
- Respects OLLAMA_HOST env var for non-default endpoints
Core changes (minimal):
- Add "ollama" to KnownProvider in pi-ai types
- Add "ollama" key resolution in env-api-keys.ts
- Add "ollama" default model in model-resolver.ts
- Add "Ollama (Local)" to onboarding wizard with probe flow
- Add extension-manifest.ts and extension-sort.ts to pi-coding-agent
with manifest reading and Kahn's BFS topological sort algorithm
- Add extensionPathsTransform hook to DefaultResourceLoader that runs
between path merging and loadExtensions() — enables pre-load
filtering and reordering without modifying pi internals
- Wire GSD's buildResourceLoader() to provide a transform that:
1. Filters ALL extensions (including community) through the GSD registry
2. Sorts in topological dependency order via sortExtensionPaths()
- Mark discoverAndLoadExtensions() as @deprecated (dead code path)
- Add 16 tests covering manifest reading, dependency sorting, cycles,
missing deps, and non-array deps
Previously, dependencies.extensions in manifests was decorative (sort
existed but was never called), and gsd extensions disable only worked
for bundled extensions. Community extensions in ~/.gsd/agent/extensions/
bypassed the registry entirely.
When an agent requests read(file, offset: 30) on a 13-line file, the
read tool threw "Offset 30 is beyond end of file" which propagated as
invalid JSON downstream during milestone completion. Now clamps the
offset to the last line and prepends a notice, allowing the agent to
continue with valid content.
Fixes both read.ts and hashline-read.ts variants.
Closes#3007
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
When a session grows beyond the context window of available models,
generateSummary() now detects the overflow and falls back to chunked
summarization: split messages into context-fitting chunks, summarize
the first chunk, then iteratively merge subsequent chunks using the
existing UPDATE_SUMMARIZATION_PROMPT path.
Closes#2932
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Three spawn call sites were missing `shell: process.platform === "win32"`,
causing ENOENT/EINVAL errors on Windows where npm-installed tools are .cmd
batch scripts that require shell resolution:
- exec.ts: hardcoded `shell: false` -> platform-guarded
- lsp/index.ts: missing shell option on project-type command spawn
- lsp/lspmux.ts: missing shell option on lspmux binary spawn
Adds a structural regression test that scans all spawn sites invoking
user-facing binaries and asserts the Windows shell guard is present.
Closes#2854
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* fix: wrap custom messages with system notification prefix in LLM context
Background job completion notifications (delivered as custom messages via
sendMessage with deliverAs: "followUp") were converted to plain role: "user"
messages in convertToLlm(), making the LLM indistinguishable from actual
human input. This caused the agent to confuse background task output with
user messages, responding to job completions as if the user had typed them.
Wrap all custom messages with a clear system notification prefix that
includes the customType and an explicit instruction that the content is
an automated system event, not user input. This follows the same pattern
used by branchSummary and compactionSummary messages which already use
structured prefixes/suffixes.
Closes#3026
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: resolve TS import extension and type errors in messages test
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
When a session accumulates many images (screenshots, file reads), the
Anthropic API enforces a 2000px dimension limit for "many-image requests"
and returns a 400 error. Previously this error was not classified as
retryable, causing the session to get permanently stuck in an error loop
with no recovery path.
This adds automatic recovery: detect the specific "image dimensions exceed
max allowed size for many-image requests" error, strip older images from
the conversation history (keeping the 5 most recent), and auto-retry.
Also handles manual retry (continue/retry) by downsizing before retrying.
Closes#2874
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
The "Extra usage is required for long context requests" error from
Anthropic is a billing gate, not a transient rate limit. Classify it as
quota_exhausted so the handler enters the fallback path instead of an
infinite backoff loop. When no cross-provider fallback exists, attempt a
[1m] to base model downgrade before stopping cleanly.
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* refactor(pi-ai): replace model-ID pattern matching with capability metadata
Add ModelCapabilities to Model<TApi> and a CAPABILITY_PATCHES mechanism
so call sites read model.capabilities fields instead of parsing model IDs
or hardcoding provider names.
- types.ts: add ModelCapabilities interface (supportsXhigh, requiresToolCallId,
supportsServiceTier, charsPerToken) and capabilities?: ModelCapabilities to
Model<TApi>
- models.ts: add CAPABILITY_PATCHES table applied at registry init; patches
declare GPT-5.x and Opus 4.6 capabilities once instead of repeating ID
checks at every call site; supportsXhigh() now reads capabilities only
- service-tier.ts: extract SERVICE_TIER_MODEL_PREFIXES constant so the gating
list has a single named home; add path comment pointing to issue #2546 for
the full capability-driven follow-up
No behaviour change. New models and providers can declare capabilities in
their model definitions without touching function logic.
Closes#2546
* fix(pi-ai): apply capability patches to custom/discovered/extension models
Models constructed outside the static pi-ai registry (custom models
from models.json, extension-registered models, discovered models)
bypassed CAPABILITY_PATCHES — causing supportsXhigh() to silently
return false for GPT-5.x or Opus 4.6 variants registered through
those paths.
Export applyCapabilityPatches() from pi-ai and call it in ModelRegistry
after model assembly in all three construction paths: loadModels(),
applyProviderConfig(), and discoverModels().
Add regression tests covering patching, precedence, idempotency,
and synthetic models that mimic the custom/extension path.
Closes#2546
On Windows, `spawn()` with `detached: true` sets the
CREATE_NEW_PROCESS_GROUP flag in CreateProcess. In certain terminal
contexts — notably VSCode's integrated terminal (ConPTY), Windows
Terminal, and some MSYS2/Git Bash configurations — this flag conflicts
with the parent process group hierarchy and causes a synchronous EINVAL
from libuv, making *every* bash/async_bash/bg_shell command fail
immediately with `spawn EINVAL`.
The bg-shell extension already guards against this with
`detached: process.platform !== "win32"` (process-manager.ts:109),
but three other spawn sites were missed:
- `packages/pi-coding-agent/src/core/tools/bash.ts` (bash tool)
- `packages/pi-coding-agent/src/core/bash-executor.ts` (RPC executor)
- `src/resources/extensions/async-jobs/async-bash-tool.ts` (async_bash)
This commit aligns all spawn sites with the bg-shell pattern.
Additionally fixes two related issues:
1. `killProcessTree()` in shell.ts used `detached: true` on its own
`taskkill` spawn call — unnecessary and potentially problematic
in the same terminal contexts. Removed.
2. `killTree()` in async-bash-tool.ts used Unix-only
`process.kill(-pid)` with no Windows fallback. On Windows, negative
PIDs (process group kill) are not supported, so orphaned child
processes could survive timeout kills. Now uses `taskkill /F /T`
on Windows, matching the bg-shell and shell.ts implementations.
Includes a regression test that statically verifies no spawn site
uses unconditional `detached: true`, plus a smoke test confirming
the platform-guarded pattern works on all platforms.
Reproduction: Run GSD v2.42-v2.51 inside VSCode on Windows 11 with
Git Bash as the shell. Any bash tool call fails with `spawn EINVAL`.
The error is 100% reproducible and affects all shell operations
(bash, async_bash, bg_shell start).
Co-authored-by: Matt Haynes <matt@auroraventures.io>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Runs commands in the user's login shell ($SHELL -l -c) so PATH additions
and env vars from shell profiles (.zprofile/.profile) are available.
Shell aliases are intentionally not loaded (requires -i which causes
startup noise and job control side effects).
Implementation spawns $SHELL directly via a loginShell flag threaded
through the bash executor — no double-shell wrapping.
- Registered as builtin slash command with autocomplete
- Reuses existing bash execution pipeline (streaming, session recording)
- Output included in LLM context for agent reference
- Added loginShell option to executeBash and handleBashCommand
- Browser mode rejects /terminal (terminal-only command)
- Updated web-command-parity-contract tests
AI-assisted: This change was authored with Claude (AI pair programming).
* feat: integrate managed RTK across shell workflows
* fix(rtk): unify managed fallback and live savings wiring
* fix(rtk): improve TUI status visibility
* fix(tests): make portability tests independent of pi-coding-agent dist build
The CI portability test runs don't guarantee that
packages/pi-coding-agent has been compiled. Any test that
imported files pulling in @gsd/pi-coding-agent (resource-loader,
preferences-skills, async-bash-tool, etc.) crashed with
ERR_MODULE_NOT_FOUND pointing at dist/index.js.
Two changes to dist-redirect.mjs (the Node ESM loader hook used by
all unit tests):
- Redirect the bare @gsd/pi-coding-agent specifier to the workspace
source entrypoint (src/index.ts) so no dist/ artifact is needed.
- Extend the load() hook to transpile *.ts files under
packages/pi-coding-agent/src/ through TypeScript's transpileModule.
Node's --experimental-strip-types can't handle parameter properties
and similar syntax present in that package's source; full transpilation
avoids the ERR_UNSUPPORTED_TYPESCRIPT_SYNTAX crash.
Also fix the dashboard.tsx responsive grid:
- xl:grid-cols-5 → xl:grid-cols-4 2xl:grid-cols-5
(5 metric cards no longer fit at xl without overflow; test contract
expected xl:grid-cols-4)
- Keep loading-skeletons.tsx in sync with the same breakpoints.
Add src/tests/resolve-ts-loader.test.ts to guard the loader behaviour:
- bare @gsd/pi-coding-agent redirect points to workspace source
- direct source-entry rewrite (.js → .ts)
- transpilation removes TS parameter property syntax that strip-only
mode cannot parse
* fix(tests): redirect all workspace package imports to source in portability tests
The previous fix only redirected @gsd/pi-coding-agent to its
source entrypoint. In CI, pi-coding-agent/src itself imports
@gsd/pi-ai (and other workspace packages) which were still pointing
at dist/. Since no workspace dist is built during the portability
test run, any transitive resolution hit the same ERR_MODULE_NOT_FOUND.
Changes to dist-redirect.mjs:
- Redirect @gsd/pi-ai, @gsd/pi-ai/oauth, @gsd/pi-agent-core, and
@gsd/pi-tui bare imports to their workspace src/ entrypoints.
- Broaden the load() transpilation condition from
'/packages/pi-coding-agent/src/' to '/packages/*/src/' so that
all workspace source files are run through TypeScript's
transpileModule, handling parameter properties and other syntax
that Node's strip-only mode rejects.
Verified by hiding all four workspace dist/ directories locally and
running the failing test set — 96/96 pass.
* fix(tests): redirect @gsd/native sub-paths; fix Windows .cmd spawnSync
Two more portability failures after the previous fix:
1. @gsd/native sub-path imports (@gsd/native/fd, @gsd/native/text, etc.)
were not redirected — the loader only handled the bare specifier.
Added a prefix-match redirect for @gsd/native/* → packages/native/src/<sub>/index.ts.
2. Windows RTK tests failed because createFakeRtk produces a .cmd wrapper
on Windows, and spawnSync(binaryPath, [...]) without shell:true silently
returns non-zero when the binary is a .cmd file.
Added shell: /\.(cmd|bat)$/i.test(binaryPath) to the spawnSync calls in:
- src/resources/extensions/shared/rtk.ts (rewriteCommandWithRtk)
- src/resources/extensions/shared/rtk-session-stats.ts (readCurrentRtkGainSummary)
- packages/pi-coding-agent/src/utils/rtk.ts (rewriteCommandForGsd)
Production use of rtk.exe is unaffected; the shell flag is only true for
.cmd/.bat paths.
Verified: all 93 portability tests pass with all workspace dist/ directories
removed (simulating CI portability environment).
* fix(tests): Windows portability fixes — HOME env, managed RTK path, perf threshold
Four Windows-specific failures fixed:
1. app-smoke.test.ts: process.env.HOME is undefined on Windows (uses
USERPROFILE instead). Changed to homedir() from node:os which works
cross-platform.
2. Managed RTK path tests on Windows: tests placed a fake RTK as rtk.exe
(by copying a .cmd script into a .exe filename), which Windows cannot
execute. Two-part fix:
- resolveRtkBinaryPath() in both rtk.ts files now falls back to rtk.cmd
in the managed dir on Windows when rtk.exe is absent.
- withManagedFakeRtk and equivalent patterns in rtk.test.ts,
rtk-session-stats.test.ts, rtk-execution-seams.test.ts changed to
place the fake at rtk.cmd instead of rtk.exe on Windows.
3. bg_shell RTK test on Windows: requires bash (for shell sessions), which
is not available on the blacksmith-4vcpu-windows-2025 runner without
Git Bash installed. Test now skips on win32.
4. derive-state-db perf assertion: 10ms threshold was too tight for Windows
CI runners (measured 12ms under load). Raised to 25ms — still catches
real regressions (baseline is 3ms locally and ~12ms on stressed runners).
* fix(tests): fix managed RTK path fallback on Windows in src/rtk.ts + fix copyable fake
Two remaining Windows failures:
1. src/rtk.ts was never patched with the rtk.cmd managed-dir fallback
(only the shared/rtk.ts and pi-coding-agent/src/utils/rtk.ts were updated).
Added the same rtk.cmd fallback and shell:.cmd detection to src/rtk.ts,
which is what rtk.test.ts imports from.
2. createFakeRtk on Windows wrote '%~dp0\fake-rtk.js' in the .cmd content —
this resolves relative to the .cmd file's own directory. When the test
copies rtk.cmd to a different managed dir, %~dp0 resolves to the copy
destination where fake-rtk.js does not exist. Fixed by embedding the
absolute path to fake-rtk.js directly in the .cmd content so the fake
works correctly regardless of where the .cmd is copied.
* feat(experimental): add RTK opt-in preference with web UI toggle
- Add `experimental` category to GSDPreferences with `rtk: boolean` (default: false)
- RTK is now opt-in: disabled by default for all projects unless explicitly enabled
- Validate experimental.* keys; unknown experimental keys produce warnings
Web UI:
- Add ExperimentalPanel component with animated toggle switch per flag
- Add /api/experimental route (GET/PATCH) to read/write flags in preferences.md
- Add 'Experimental' tab to settings dialog sidebar nav (FlaskConical icon)
- Include ExperimentalPanel at bottom of gsd-prefs mega-scroll
- Fix toggle disabled state: trigger loadSettingsData for 'experimental' section
and self-fetch on mount when data is absent
Dashboard:
- Gate RTK Saved metric card on rtkEnabled from live auto state (web)
- Gate TUI dashboard RTK savings row on rtkEnabled
- Gate TUI footer RTK status updates on experimental.rtk preference
- Propagate rtkEnabled through AutoDashboardData → bridge-service → store
Build:
- Add scripts/build-if-stale.cjs: incremental build driver that skips each
step (packages, root tsc, copy-resources, web) when output is newer than
source; replaces full rebuild chain in gsd:web
- Add scripts/web-stop.cjs: robust stop with registry + legacy PID + orphan
sweep via pgrep; handles crash/restart orphaned next-server processes
- gsd:web now uses build-if-stale.cjs (fast cold starts, instant when unchanged)
- gsd:web:stop / gsd:web:stop:all use web-stop.cjs directly
Fix: correct import path in rtk-status.ts (./preferences.js not ../preferences.js)
* fix: restore em-dash encoding in package.json to match upstream
* refactor(rtk): move command rewrite out of pi-coding-agent into GSD extension
Per review feedback from igouss: pi-coding-agent should not be modified to add
GSD-specific logic. Instead, add a proper extension point and wire RTK through it.
Changes to packages/pi-coding-agent (extension API only — no RTK logic):
- Add BashTransformEvent + BashTransformEventResult types to extension API
- Add on('bash_transform') overload to ExtensionAPI interface
- Add emitBashTransform() to ExtensionRunner (chains all handlers in order)
- Call emitBashTransform() in wrapToolWithExtensions before bash tool execution
- Export new types from extensions/index.ts and package index.ts
- Revert all RTK-specific changes from bash-executor.ts, tools/bash.ts
- Remove packages/pi-coding-agent/src/utils/rtk.ts entirely
Changes to GSD extension:
- Register bash_transform handler in register-hooks.ts that calls
rewriteCommandWithRtk() from the existing shared/rtk.ts module
- Handler is a no-op when RTK is disabled or not installed
* fix: correct import path for shared/rtk.js in register-hooks
* fix(tests): remove deleted pi-coding-agent/utils/rtk imports from execution seams test
The RTK rewrite logic was moved out of pi-coding-agent into the GSD
extension (bash_transform hook). Tests that directly imported the
deleted utils/rtk.ts are removed; remaining tests verify the shared
RTK module and GSD-layer surfaces that still call rewriteCommandWithRtk.
Server errors (500/502/503/504) are server-side failures — rotating
credentials doesn't help. Only rate_limit and quota_exhausted are
meaningfully credential-scoped. This prevents the cascading backoff
where a single 500 backs off the sole API key for 20s, causing all
subsequent retries to fail with "All credentials temporarily backed off".
Closes#2588
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When a custom provider (e.g. claude-code-cli) registers a streamSimple
handler with the same api type as a built-in (e.g. 'anthropic-messages'),
the global API provider registry was overwritten, routing ALL models of
that api type through the custom handler.
This caused anthropic/claude-opus-4-6 requests to be dispatched through
the Claude Code SDK subprocess instead of the Anthropic API, resulting
in 'Tool not found' errors for Glob, Read, Edit, Bash (SDK tool names
not present in pi's tool registry).
Fix: wrap the registered handler with a model.provider guard so it only
fires for models from the registering provider, delegating to the
previous handler for all other providers.
Closes#2536
Adds `externalToolExecution` flag to AgentLoopConfig. When true, the
agent loop emits tool_execution_start/end events for TUI rendering but
skips local tool dispatch. Used by providers that handle tool execution
internally (e.g., Claude Code CLI via Agent SDK).
The flag is dynamically evaluated per-loop via a callback on
AgentOptions, so model switches mid-session are handled correctly.
Providers with authMode "externalCli" automatically use this mode.
Also updates the Claude Code CLI stream adapter to preserve tool call
blocks in the final message instead of stripping them.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The 13 test files in packages/pi-coding-agent/src/core/ were never executed
in CI or by `npm test`. The test:unit glob only covers src/resources/extensions/gsd/tests/
and src/tests/, leaving lifecycle-hooks, model-registry-auth-mode, auth-storage,
and 10 other suites with zero enforcement.
- Add `test:packages` script that runs compiled dist tests after build
- Wire into both the linux build job and windows-portability job in CI
- Fix two env-isolation bugs in auth-storage.test.ts: the "returns undefined"
and "falls through to fallback resolver" tests were not clearing
OPENROUTER_API_KEY before calling getApiKey, causing failures when the
env var is set in the caller's environment
Shows absolute timestamps (date + time) on user prompts (right-aligned
above the message) and assistant replies (below the response). Format
is configurable via /settings → Timestamp format:
- date-time-iso: 2026-03-24 10:34 (default)
- date-time-us: 03-24-2026 10:34 AM
Setting persists in settings.json as timestampFormat.
- Added formatTimestamp utility with ISO and US format support
- Updated UserMessageComponent and AssistantMessageComponent
- Added timestampFormat to SettingsManager with getter/setter
- Added to /settings UI for runtime switching
- Unit tests for all format variants including AM/PM edge cases
AI-assisted: This change was authored with Claude (AI pair programming).
* feat: complete offline mode support for local-only model setups
- Add isLocalModel() to detect localhost/127.0.0.1/0.0.0.0/::1/unix sockets
- Add isAllLocalChain() to verify all registry models are local
- Validate --offline flag rejects remote models with clear error
- Auto-enable PI_OFFLINE when all configured models are local
- Return dummy API key for local models to skip auth validation
- Filter web search results in offline mode (chat-controller + tool-execution)
- Add ECONNREFUSED/ENOTFOUND/ENETUNREACH to INFRA_ERROR_CODES for immediate
failure (no retry) when network is intentionally unavailable
- Add comprehensive test suite (17 tests)
Fixes#2341
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(test): update infra-error test for new offline-mode error codes
The offline mode feature added ECONNREFUSED, ENOTFOUND, and ENETUNREACH
to INFRA_ERROR_CODES but the test still asserted size === 6. Update the
count to 9 and add detection tests for the three new codes.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(core): add generic native post-install hooks for package install
* feat(core): add before/after install/remove lifecycle hooks
* refactor(core): remove postInstall alias from lifecycle hook fallback
* feat(core): complete authMode support for keyless providers
The initial authMode implementation fixed model-registry, sdk, and
fallback-resolver but missed agent-session.ts (6 callsites) and
compaction-orchestrator.ts (2 callsites) that block externalCli
providers at runtime.
Architecture: separate readiness gating from credential retrieval.
- isProviderRequestReady(): authMode-aware readiness check
- getApiKey()/getApiKeyForProvider(): return undefined for
externalCli/none providers instead of triggering auth errors
- All 8 callsites in agent-session and compaction-orchestrator
now gate on readiness, not key presence
- Downstream signatures (compaction, branch-summarization) accept
apiKey: string | undefined
- Replaced hardcoded ollama exception in discoverModels with
isProviderRequestReady
Zero behavioral change for classic apiKey/oauth providers.
* feat(core): add isReady callback for provider readiness verification
Extensions can now provide an isReady() callback when registering any
provider. isProviderRequestReady() calls it before default auth checks,
allowing providers to verify actual reachability (CLI authenticated,
API key valid, service online) rather than relying solely on credential
presence.
* test(core): expand authMode test coverage
Cover all four auth modes (apiKey, oauth, externalCli, none),
isReady callback behavior, getProviderAuthMode defaults,
isProviderRequestReady for each mode, getAvailable filtering,
and getApiKey early-return for keyless providers.
* chore: remove provider-api-bridge files from this branch
These files implement GSD core → provider-api wiring (deps + tool
registry) and belong in a separate PR. Reverts register-extension.ts
to upstream state.
When a user creates a .js extension file but writes TypeScript syntax in it,
the loader now detects common TS patterns (type annotations, interfaces, enums,
generics) and provides a clear error message suggesting to rename the file to
.ts, instead of the previous cryptic "Extension does not export a valid factory
function" or opaque jiti parse errors.
Fixes#2381
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(memory): fix memory and resource leaks across TUI, LSP, DB, and automation
Addresses all findings from a systematic memory leak audit across five
dimensions: event listeners, timers, file system handles, subscriptions/
closures, and GSD automation lifecycle.
Critical fixes:
rpc-client.ts: stderr .on("data") handler attached in start() was never
removed in stop(). Now stored as _stderrHandler and removed via
removeListener() on stop.
lsp/client.ts: Three process.on() handlers (beforeExit, SIGINT, SIGTERM)
registered at module load time with anonymous functions — impossible to
remove. Now stored as named references; new removeProcessHandlers() export
allows graceful teardown. stdout/stderr stream listeners in
startMessageReader/startStderrReader also stored per-client in
clientStreamHandlers map and removed in shutdownClient() and shutdownAll().
parallel-orchestrator.ts: spawnWorker() attached 5 listeners to child
process streams on every spawn with no removal on worker stop/respawn,
accumulating listeners indefinitely. Added cleanup() field to WorkerInfo;
called via removeAllListeners() on exit, graceful stop, stale detection,
and dead PID cleanup paths. Also: module-level state.workers Map was never
cleared between orchestration runs; startParallel() and resetOrchestrator()
now iterate and clean up all WorkerInfo entries before reassigning state.
scripts/watch-resources.js: fs.watch() return value was discarded (OS
watcher never closed) and the fallback setInterval handle was also
discarded (timer ran forever). Both now stored; process.on("exit") handler
closes/clears them.
gsd-db.ts: closeDatabase() did not checkpoint the WAL before closing —
.db-shm/.db-wal files accumulated on disk across crash-recovery cycles.
Now runs PRAGMA wal_checkpoint(TRUNCATE) before close. Also added a
one-time process.on("exit") handler in openDatabase() so the handle is
always closed even on unclean exits.
Medium fixes:
bg-shell/overlay.ts: 1-second refresh setInterval only cleared in
keyboard exit handler; abnormal teardown leaked the timer. Added dispose()
method that unconditionally clears it.
file-watcher.ts: pending debounce Map was scoped inside startFileWatcher()
making it inaccessible to stopFileWatcher(). Moved to module scope;
stopFileWatcher() now clears all pending timers and empties the map before
closing the watcher.
auto-supervisor.ts: registerSigtermHandler() could accumulate multiple
SIGTERM handlers if called without passing back the previous reference.
Added module-level _currentSigtermHandler; old handler is always removed
before registering the new one regardless of whether caller passes it.
Low-severity fixes:
print-mode.ts: session.subscribe() return value was discarded. Now stored
and called in a finally block to guarantee cleanup on both normal
completion and errors.
rpc-mode.ts: same — subscribe() unsubscribe now called in the shutdown
path before process.exit().
theme.ts: onThemeChangeCallback singleton silently overwrote any previous
subscriber. Converted to Set<() => void>; onThemeChange() now returns a
cleanup function. All four internal call sites updated to forEach().
Backward-compatible — existing callers that discard the return are unaffected.
* fix: ensure unsubscribe is called on error/abort in print-mode
The PR #2314 added unsubscribe storage but still called process.exit(1)
directly, bypassing the unsubscribe. Wrapped in try/finally to guarantee
cleanup runs before exit.
Adds opt-in per-prompt cost display to the interactive footer. Users
enable it by setting `show_token_cost: true` in their preferences.md.
Disabled by default — the footer behavior is unchanged unless opted in.
Fixes#1515
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Fixes#2083
When an OpenRouter API key is stored in auth.json as type:"oauth" (instead
of type:"api_key"), getApiKey() calls getOAuthProvider("openrouter") which
returns undefined — OpenRouter is not a registered OAuth provider. Previously,
resolveCredentialApiKey returned undefined and getApiKey returned that directly,
never reaching the env-var or fallback-resolver paths.
Now, when resolveCredentialApiKey returns undefined, getApiKey falls through
to OPENROUTER_API_KEY env var and the fallback resolver instead of silently
failing with "Authentication failed."
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Skip jiti JIT compilation for bundled extensions that have pre-compiled .js
siblings, enable V8 bytecode caching on Node 22+, and batch directory
discovery to reduce syscalls during resource loading.
Fixes#2108
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>