The two new sub-turn shrink regression tests created a pinned
DynamicBorder (via message_update with pinnable text + tool) but never
emitted message_end, so the spinner's setInterval kept the test process
alive until CI timed out after 15 minutes. Append a message_end to
each test so the module-level pinnedBorder is torn down.
Commit c8c416802 (#4144) introduced module-level renderedSegments state
to track interleaved text/tool components per assistant turn, but never
reset it when an adapter shrinks streamingMessage.content[] back to 0/1
at a provider sub-turn boundary within one assistant lifecycle (the
claude-code adapter does this). Consequence chain: the segment walker
finds the stale text-run entry at startIndex=0, calls updateContent on
it with the new (shrunk) message, and the in-place edit destroys the
prior sub-turn's visible text. New tool blocks at contentIndex=1 then
collide with stale registrations, causing visual ordering corruption.
hasToolsInTurn stays sticky-true and lastPinnedText never clears, so
the pinned "Working - Latest Output" mirror freezes on the pre-shrink
snapshot.
Track lastContentLength explicitly. On shrink, clear renderedSegments,
reset lastPinnedText, and reset lastProcessedContentIndex so the
walker treats the new sub-turn as fresh segments that append after
prior sub-turn children. Prior history stays rendered as frozen
components; pendingTools and the spinner border are untouched.
Adds two regression tests in chat-controller-ordering.test.ts: one
verifies prior sub-turn components are not overwritten and new tools
append in content[] order after a shrink, the other verifies the
pinned markdown updates from the first sub-turn's text to the second
sub-turn's text across a shrink boundary.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Previously the chat-controller created one AssistantMessageComponent per
assistant message and removed/re-appended it to the chat container's tail
on every tool block, forcing all narration after every tool execution
regardless of stream order. Users had to scroll up to read text that was
written before each tool call.
Replace the reorder hack with a stream-order segment walker that walks
content[] left-to-right, collapses contiguous text/thinking blocks into
text-run segments, emits one segment per tool block, and append-only adds
new segments to chatContainer. AssistantMessageComponent gains a
ContentRange API so a single message can spawn multiple text-run
components, plus a separate showMetadata flag so timestamp/error footers
render only on the trailing segment without duplicating earlier text.
Adds a regression test that streams [text, tool, text, tool, text] and
asserts both interleaved order and per-segment rendered text content.
Closes#4144
Restore the isProviderRequestReady() guard lost during the main merge.
Tests in model-resolver.test.ts and model-resolver-initial-model-auth.test.ts
require findInitialModel() to skip an unauth'd saved default and fall
through to the first available model.
Remove hard-coded Anthropic/Claude defaults and silent provider swaps so
the app honors whatever model/provider the user has configured.
- src/cli.ts: drop the anthropic->claude-code auto-migration blocks that
were rewriting the user's saved defaultProvider on every startup.
- packages/pi-coding-agent/src/core/model-resolver.ts: delete the
defaultModelPerProvider table, drop the "recommended variant" swap
that silently upgraded e.g. claude-opus-4-6 to -extended, and replace
the provider-iteration first-available fallback with provider-sticky
(user's saved provider first, then first registry entry).
- src/startup-model-validation.ts: replace the openai/anthropic-first
fallback chain with Pi-default -> same-provider -> first-available.
- src/help-text.ts: use a generic provider/model-id example for --model
instead of claude-opus-4-6.
- src/tests/startup-model-validation.test.ts: update the fallback test
to assert provider stickiness rather than a specific Claude model id.
https://claude.ai/code/session_01CvuUuzuVjRcQN25263nG6V
Custom OpenAI-compatible providers running on localhost (e.g. a local proxy)
with an explicit apiKey in models.json received 'local-no-key-needed' during
compaction instead of their configured key, causing 401 errors.
The localhost shortcut in AuthStorage.getApiKey() was unconditional. Normal
dispatch calls getApiKeyForProvider() which skips the baseUrl check entirely,
so the fallback resolver was reached and the real key was used. Compaction
calls getApiKey(model) which passes baseUrl, hitting the shortcut first.
Closes#4106
* feat(pi-ai): add Alibaba DashScope as standalone provider
Adds `alibaba-dashscope` for users with a regular DashScope API key,
separate from the existing `alibaba-coding-plan` free-tier provider.
- types.ts: register `alibaba-dashscope` as KnownProvider
- env-api-keys.ts: map to DASHSCOPE_API_KEY
- models.custom.ts: add qwen3-max, qwen3.5-plus, qwen3.5-flash,
qwen3-coder-plus with international endpoint and real pricing
- model-resolver.ts: default model qwen3.5-plus
- key-manager.ts: add alibaba-coding-plan and alibaba-dashscope
to PROVIDER_REGISTRY so /gsd keys add works for both
Co-Authored-By: Claude Code <noreply@anthropic.com>
* feat(pi-ai): add qwen3.6-plus to alibaba-dashscope provider
qwen3.6-plus is available on DashScope international endpoint.
Pricing: $0.5/M input, $3/M output (base tier, 0-256K tokens).
Supports thinking mode (reasoning: true).
Source: https://www.alibabacloud.com/help/en/model-studio/model-pricing
Co-Authored-By: Claude Code <noreply@anthropic.com>
* test(pi-ai): add tests for alibaba-dashscope provider and key-manager regression
- packages/pi-ai/src/models.test.ts: add describe block covering all 5
alibaba-dashscope models (presence, base URL, API, provider field,
context window, paid pricing, per-model reasoning/cost assertions,
independence from alibaba-coding-plan, failure path for unknown model)
- src/resources/extensions/gsd/tests/key-manager.test.ts: add regression
tests for #3891 — alibaba-coding-plan was missing from PROVIDER_REGISTRY,
causing /gsd keys add alibaba-coding-plan to fail silently; also covers
alibaba-dashscope registration, env var separation, and getAllKeyStatuses
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Code <noreply@anthropic.com>
Extension-based providers like pi-claude-cli register their models
during extension loading, but registrations were queued and not flushed
until after model resolution ran. This caused findInitialModel() and
the startup model validation to see extension models as nonexistent,
permanently overwriting the user's saved model selection on every launch.
- Flush pendingProviderRegistrations in createAgentSession() before
findInitialModel() so extension models are visible in the registry
- Move model validation to after createAgentSession() in both print
and interactive code paths
- Load extensions before --list-models so extension models appear
Filter models whose provider has no working API key or OAuth out of
every user-facing selection path. Previously, stale defaults and scoped
sets could leak unconfigured models into /model, /gsd model, and auto
run — the user could "pick" a model that immediately threw on use.
- model-selector: filter scopedModels via isProviderRequestReady;
default to "all" scope when no scoped model is ready.
- model-controller: same filter for getModelCandidates, so exact-match
resolution from /model <term> can't return an unauth'd scoped model.
- model-resolver: gate findInitialModel step 3 on provider readiness so
a stale saved default falls through to the available-models path.
- startup-model-validation: check configuredExists against getAvailable
instead of getAll, so a configured-but-unauth default triggers the
fallback picker and thinking-level reset.
- auto-start: validate resolveDefaultSessionModel against the live
registry + auth before snapshotting, and warn when PREFERENCES.md
names an unconfigured model.
https://claude.ai/code/session_015q6b23ap9Pyqdogzz2FXGh
- auth-storage.test.ts: 8 tests for getEarliestBackoffExpiry()
- sdk.test.ts: 12 tests for CredentialCooldownError class
- infra-errors-cooldown.test.ts: 35 tests for isTransientCooldownError(),
getCooldownRetryAfterMs(), and exported constants
Required by CI lint (require-tests.sh) per CONTRIBUTING.md.
Closes#4052
getApiKey() retry loop (3 attempts, ~12s) couldn't outlast the 30s
rate-limit backoff window, causing cooldown errors to cascade through
the auto-loop and trigger a hard stop after 3 consecutive failures.
- Add AuthStorage.getEarliestBackoffExpiry() to expose when the next
credential becomes available
- Update getApiKey() to sleep until backoff expiry (up to 60s) instead
of fixed 2s/4s/6s delays
- Add isTransientCooldownError() detector in infra-errors.ts
- Auto-loop now waits 35s on cooldown errors without incrementing the
consecutive error counter
Closes#4052
The pinned "Latest Output" zone was only cleared at agent_end, but during
flows with form elicitation (e.g. discuss-phase), there is a gap between
message_end and agent_end where the agent waits for user input. During this
gap, the same content was visible in both the chat history and the pinned
zone. Clear the pinned zone at message_end when the assistant message is
finalized in the chat container.
Restores the pinned assistant output zone that shows the latest narration
during tool execution. Adds markdown rendering, animated spinner, height
capping to prevent render flashing, and session rebuild support.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Two bugs prevented subscription users from routing through Claude Code CLI:
1. Retry handler regex only matched "third-party" errors but actual error is
"You're out of extra usage" — fallback never triggered
2. auto-model-selection actively rerouted bare model IDs back to anthropic
even after startup migration set claude-code as the session provider
Verify claude-code fallback only fires for anthropic provider and
does not reroute non-anthropic providers on similar error text.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Prevent _tryClaudeCodeFallback from firing for non-Anthropic providers
that may produce similar error text, avoiding unintended provider drift.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Anthropic now blocks third-party apps from using Pro/Max subscription
quotas via direct API calls. This change makes the claude-code provider
(which delegates to the local claude CLI binary) the default path for
Anthropic subscription users — TOS-compliant because requests flow
through Anthropic's own infrastructure.
Changes:
- Enhanced readiness check to verify CLI auth status (not just binary)
- Startup migration: auto-switch anthropic → claude-code when CLI ready
- Error recovery: auto-switch on third-party 400 block error
- Onboarding: removed Anthropic from OAuth, added Claude CLI option
- Added claude-code to flat-rate providers (no dynamic routing benefit)
Closes#3772
newSession() only rebuilt the tool registry when cwd changed. When cwd
stayed the same (e.g., discuss → plan-slice in the same worktree), any
tool narrowing from setActiveTools() persisted — stripping gsd_plan_slice
and other DB tools from auto-mode subagent sessions.
Add an else-branch that calls _refreshToolRegistry with
includeAllExtensionTools:true on every session switch, regardless of cwd.
Also call resetExtensionLoaderCache() in DefaultResourceLoader.reload()
so hot-updated extension code on disk is re-compiled instead of served
from the stale jiti module cache.
Closes#3616
- Use `git reset --hard <sha>` for rollback instead of `git branch -f`
which fails on checked-out branches and worktrees
- Clear pendingProviderRegistrations after preflush to prevent duplicate
registration when bindCore() runs
- Process Ollama stream content on terminal `done:true` chunks to avoid
truncating trailing assistant text
Replace the OpenAI-compat shim with a native Ollama /api/chat streaming
provider that exposes all commonly-used Ollama options and surfaces
inference performance metrics.
Key changes:
- Native NDJSON streaming from /api/chat (no more OpenAI shim)
- Known models send num_ctx from capability table; unknown models defer
to Ollama's default to avoid OOM on constrained hosts
- Exposes: temperature, top_p, top_k, repeat_penalty, seed, num_gpu,
keep_alive, num_predict via per-model providerOptions
- Extracts <think>...</think> blocks for reasoning models (deepseek-r1, qwq)
- Surfaces InferenceMetrics (tokens/sec, durations) on AssistantMessage
- Adds remove and show actions to ollama_manage LLM tool
- Adds "ollama-chat" to KnownApi, providerOptions to Model<TApi>
- NDJSON parser uses strict mode for chat (fails on malformed frames)
- Mixed content+tool_call chunks handled independently
Closes#3544
Extension-provided models (e.g. claude-code/*) were unavailable during
findInitialModel() because pendingProviderRegistrations had not been
flushed yet, causing the fallback chain to select Google Gemini even
when the user explicitly configured claude-code as their default.
Three compounding issues fixed:
(A) Flush pendingProviderRegistrations in createAgentSession() before
findInitialModel() runs, so extension models are in the registry
when initial model selection happens.
(B) Re-apply the validated model to the session after
validateConfiguredModel() in both print and interactive CLI paths.
Previously, validation updated settingsManager but never called
session.setModel(), leaving the session on the wrong model.
(C) Update defaultModelPerProvider.anthropic from "claude-opus-4-6[1m]"
to "claude-opus-4-6" — the [1m] variant was removed from the model
registry when the base model was upgraded to 1M context, causing the
Anthropic fallback to silently fail and skip to Google.
Closes#3534
* fix(perf): share jiti module cache across extension loads (#2108)
Each extension was creating a new jiti instance with moduleCache: false,
causing shared dependencies to be recompiled for every extension. Use a
shared singleton with moduleCache: true so shared modules are compiled once.
Export resetExtensionLoaderCache() for test teardown and explicit reload.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: correct loader path in extension-load-perf test (4 → 3 levels up)
The test file is at src/tests/ (2 levels deep from repo root), so
fileURLToPath(import.meta.url) + 3x'..' reaches the repo root.
Using 4 levels exits the repo into the GitHub Actions workspace parent,
causing ERR_MODULE_NOT_FOUND for loader.js in dist/.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: use process.cwd() for loader path in perf test (source/compiled portability)
import.meta.url resolves to different depths in source (src/tests/) vs compiled
(dist-test/src/tests/), so relative '../' navigation produces the wrong path in
the build phase. process.cwd() is always the repo root in CI regardless of
where the test file is compiled to.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
---------
Co-authored-by: trek-e <trek-e@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: detect and block Gemini CLI OAuth tokens used as API keys
Users who install Google's standalone Gemini CLI may inadvertently set
GEMINI_API_KEY to an OAuth access token (ya29.*) instead of an AI Studio
API key (AIza*). These tokens fail at the Google API with a confusing
error. This adds early detection at three entry points:
- AuthStorage.set(): throws when storing ya29.* as api_key for "google"
- AuthStorage.getApiKey(): blocks ya29.* from runtime overrides (--api-key)
- AuthStorage.getApiKey(): blocks ya29.* from environment variables
Each path provides a clear error message explaining the issue and
directing users to either get an API key from aistudio.google.com or
use /login google-gemini-cli for OAuth-based access.
Fixes#2157
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: retrigger CI
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: trek-e <trek-e@users.noreply.github.com>
Fixes#2075
The isBuiltIn check in detectExtensionConflicts used a path-based
heuristic that excluded paths containing /.gsd/agent/extensions/.
Since initResources() syncs all bundled extensions into that same
directory, the heuristic could never return true for bundled extensions,
preventing the "supersedes" hint from being appended. This caused
downstream cli.ts to display "Extension load error" instead of the
intended "Extension conflict" for benign precedence collisions.
Replace the path heuristic with an explicit bundledExtensionKeys set
(already computed by buildResourceLoader) threaded through to
detectExtensionConflicts. The set is matched against the extension
directory name extracted from the owner path.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>