Four related improvements that landed in the working tree after the
auto-hardening merge but hadn't been committed:
1. auth_error as a distinct error type (auth-storage + retry-handler).
Previously invalid/expired API keys would retry the same failing
credential until the retry budget exhausted. Now:
- classifyErrorType() recognizes 401s, "invalid api key",
"authentication error", "unauthorized" etc as "auth_error"
- RetryHandler triggers cross-provider fallback on auth_error just
like it does for rate_limit and quota_exhausted — switch
providers rather than burning retries on a broken key
Outcome: a stale OPENCODE_API_KEY in sops now fails over to kimi or
minimax immediately instead of stalling the unit.
2. Multi-provider search-key detection (native-search.ts).
The "Web search: Set BRAVE_API_KEY" warning fired whenever a
non-Anthropic model lacked BRAVE_API_KEY, even when the user had
TAVILY_API_KEY or OLLAMA_API_KEY available. Now: the warning
suppresses if any of BRAVE/TAVILY/OLLAMA keys is present, and the
warning text lists all three options. Matches the preferences-
validation allow-list for search_provider.
3. MiniMax-M2.7-highspeed benchmark entry (model-benchmarks.json).
Routes the fast-tier variant of M2.7 through the Bayesian blender
with inherited RULER scores. Lets dynamic routing consider the
highspeed model when speed matters more than peak quality.
No regressions: the 41 pre-existing test failures in pi-coding-agent
(FallbackResolver chain-membership + LSP integration) are unchanged
relative to the prior commit.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- compaction: fix repeated compaction dropping kept messages (#2608)
Re-summarize from previous compaction's firstKeptEntryId instead of
prevCompactionIndex+1; use buildSessionContext for accurate tokensBefore
- edit: add multi-edit support via edits[] array
Single call can update multiple disjoint regions in one file;
applyEditsToNormalizedContent matches all edits against original content
and applies in reverse order for stable offsets
- bash: persist full output when line-count truncation occurs (#2852)
ensureTempFile now called on any truncation, not only byte overflow;
prevents data loss when output exceeds line limit before byte threshold
- bash-executor: same fix for remote/operations-based execution
ensureTempFile includes SF cleanup registration (registerTempCleanup,
bashTempFiles tracking)
- grep: include lineText from rg JSON events to avoid per-match file reads
Eliminates stall when context=0 on broad searches (#3148)
- agent-session: forward isError override from afterToolCall extension hook
Allows extensions to change error status of tool results (#3051)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Anthropic OAuth was removed in v2.74.0 for TOS compliance (#3952). Users
who upgraded through that version still have type:"oauth" entries under
`anthropic` in auth.json which cannot resolve to a valid API key.
stale entry, so hasAuth("anthropic") kept reporting true and masked the
claude-code fallback path. Users had to hand-edit auth.json to recover.
Self-heal instead:
- AuthStorage.removeLegacyOAuthCredential(provider) strips only
type:"oauth" entries and preserves any api_key credentials.
- sdk.ts getApiKey() calls it when the legacy-OAuth branch triggers,
logs a one-line warning, and throws a message pointing the user at
the "claude-code" provider when the `claude` binary is in PATH, or
at ANTHROPIC_API_KEY otherwise.
Closes#4399
(cherry picked from commit b8ef6604617fda239a037cf5d5e6020b168d2e62)
ChatGPT-backed Codex sign-in no longer exposes the removed 5.1/5.2 Codex variants. Filter those models from openai-codex OAuth so GSD stops surfacing selections that immediately fail while leaving API-key-backed OpenAI models available.
(cherry picked from commit 1aedba583916826fc5c6129037f61e9802010e46)
- 15 tests for ModelRegistry.getModelsForProxy covering family priority
ordering, auth-ready promotion, overrides, and edge cases
- Fix StreamOptions cast in proxy-server.ts (lost during rebase conflict)
- Fix .ts import extension in args.test.ts (pre-existing)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- model-registry: export PROXY_FAMILY_PRIORITY and GLOBAL_PROVIDER_FALLBACK
constants; add getModelsForProxy() returning candidates ordered by family
priority then global fallback (opencode → opencode-go → openrouter →
ollama-cloud); add getModel() convenience wrapper
- proxy-server: add priorityOverrides option; handleChat iterates all
candidates in priority order and falls through to the next on 429
- settings-manager: add ProxySettings type with providerPriority override
map; add getProxyProviderPriority() / setProxyFamilyProvider() accessors
- settings-selector: add ProxyPrioritySubmenu — a two-level TUI submenu
(family → provider) that dynamically generates entries from
PROXY_FAMILY_PRIORITY; wired in interactive-mode with full callback
Family defaults: MiniMax→minimax, GLM→zai, Kimi→kimi-coding,
MiMo→global-fallback, Gemini/Gemma→google-gemini-cli, Claude→anthropic,
GPT/o-series→openai
https://claude.ai/code/session_013BwmqG3NuwwZY3vsUb4Y9Y
Co-authored-by: Claude <noreply@anthropic.com>
Updates workflow tool names, documentation references, and internal naming
conventions across MCP server, CLI, tests, and web components to complete
the singularity-forge rebrand from gsd to sf.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Updates channel prefixes, log messages, comments, and configuration values
across daemon, mcp-server, and related packages to complete the rebrand from
gsd to sf-run naming.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Minimax and other Anthropic-protocol providers return HTTP 529 with
`overloaded_error` bodies under heavy load. The retryable regex (line 119)
matched `overloaded` so the error was retried, but the rate-limit
classifier (line 423) only matched `429`, so the error never triggered
credential rotation or cross-provider fallback — the handler looped on
the same provider forever.
Adds `529|overloaded` to the rate-limit classifier so 529 responses
route through the same backoff + fallback path as real rate limits.
Followup to 828c5edf6. Swarm review flagged default=true as a latent
footgun: any SDK consumer of createAgentSession() that forgets to pass
persistModelChanges would silently mutate ~/.gsd/agent/settings.json.
Flip the default to false so persistence is opt-in. Interactive CLI
entry points now explicitly pass persistModelChanges: true:
- src/cli.ts interactive createAgentSession call
- packages/pi-coding-agent/src/main.ts: persistModelChanges = isInteractive
Print/rpc/mcp stay at the safe default. Tests updated (9/9 green).
`gsd -p --model X "msg"` was silently overwriting defaultProvider/
defaultModel in settings.json. One-shot verification runs must use the
model for that invocation only.
Adds an AgentSessionConfig.persistModelChanges flag (default true so
interactive behavior is unchanged), forwards it through createAgentSession,
and sets it false in main.ts when !isInteractive and in src/cli.ts print
mode. The gsd wrapper also skips validateConfiguredModel when --model is
explicitly passed, so a CLI-provided model can't trigger a fallback repair
that writes the wrong default back.
Three settings.json write sinks audited: agent-session._applyModelChange
(gated on flag), model-selector.ts (interactive only, unreachable in
print), startup-model-validation (gated by !cliFlags.model in print).
Regression: 8 source-assertion tests in
agent-session-print-mode-persist.test.ts.
When message_end does not fire before agent_end (e.g. abort path), the
agent_end case was calling chatContainer.removeChild(streamingComponent),
which silently erased the last assistant message from the TUI chat history.
Fix: follow the message_end finalization pattern — call setShowMetadata(true)
and updateContent() before clearing the reference. Never call removeChild on
a component that was added to the persistent chat history.
Closes#4197
- DynamicBorder: verify lastExternalRender tracking suppresses redundant
renders during streaming, and standalone renders fire when idle
- TUI clearOnShrink: verify debounce flag lifecycle — deferred shrink
preserves maxLinesRendered, flag resets when content grows back
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
rebuildChatFromMessages() called populatePinnedFromMessages() which
re-populated the pinned zone with text already present in the chat
history, causing visible duplication during session state changes.
Additionally, the spinner interval at 80ms generated ~12.5 renders/s
for a purely cosmetic animation, and clearOnShrink triggered
unnecessary full redraws during pinned-zone transitions.
- Remove populatePinnedFromMessages() from rebuildChatFromMessages()
and add pinnedMessageContainer.clear() instead — the streaming
lifecycle in chat-controller manages pinned content during active work
- Reduce spinner interval 80ms→200ms with render-batching that skips
redundant renders when streaming already triggers requestRender()
- Debounce clearOnShrink: defer full redraw by one render tick so
pinned-clear→new-streaming transitions avoid a wasted full redraw
- Increase notification widget safety-net timer 5s→30s since the
store subscription already handles push-based updates
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The two new sub-turn shrink regression tests created a pinned
DynamicBorder (via message_update with pinnable text + tool) but never
emitted message_end, so the spinner's setInterval kept the test process
alive until CI timed out after 15 minutes. Append a message_end to
each test so the module-level pinnedBorder is torn down.
Commit c8c416802 (#4144) introduced module-level renderedSegments state
to track interleaved text/tool components per assistant turn, but never
reset it when an adapter shrinks streamingMessage.content[] back to 0/1
at a provider sub-turn boundary within one assistant lifecycle (the
claude-code adapter does this). Consequence chain: the segment walker
finds the stale text-run entry at startIndex=0, calls updateContent on
it with the new (shrunk) message, and the in-place edit destroys the
prior sub-turn's visible text. New tool blocks at contentIndex=1 then
collide with stale registrations, causing visual ordering corruption.
hasToolsInTurn stays sticky-true and lastPinnedText never clears, so
the pinned "Working - Latest Output" mirror freezes on the pre-shrink
snapshot.
Track lastContentLength explicitly. On shrink, clear renderedSegments,
reset lastPinnedText, and reset lastProcessedContentIndex so the
walker treats the new sub-turn as fresh segments that append after
prior sub-turn children. Prior history stays rendered as frozen
components; pendingTools and the spinner border are untouched.
Adds two regression tests in chat-controller-ordering.test.ts: one
verifies prior sub-turn components are not overwritten and new tools
append in content[] order after a shrink, the other verifies the
pinned markdown updates from the first sub-turn's text to the second
sub-turn's text across a shrink boundary.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
InteractiveMode.renderWidgets() called Container.clear() on the
widgetContainerAbove/Below render mounts, which disposed every mounted
extension widget and then re-added the now-dead components. In AUTO mode
updateProgressWidget re-registers gsd-progress on every unit dispatch,
so gsd-notifications and gsd-health had their refresh timers and store
subscriptions killed after the first dispatch. Renders kept returning
the widgets' frozen cachedLines, making them look alive but never update
(/gsd notifications clear appeared to do nothing, belowEditor last-commit
went stale while the top-of-screen dashboard stayed correct).
Split detach from dispose: add Container.detachChildren() and use it from
the two widget-mount call sites. clear() still disposes for every other
caller (chat, editor, status, pinned-message containers). The
extensionWidgets* maps remain the single owner of widget disposal via
removeExisting() and clearExtensionWidgets().
While in AUTO, gsd-progress duplicates gsd-health on last commit, cost/
budget, and the health signal. Make gsd-progress the single source of
truth: hide gsd-health from auto-start and re-register it from every
exit point in auto.ts (lock-lost stop, cleanupAfterLoopExit !paused
guard, stopAuto, pauseAuto). gsd-notifications stays visible — it is
independent state and, with the detach fix, its subscription + 5s
refresh actually work again.
Tests: Container.detachChildren()/clear() contract guards added to
packages/pi-tui/src/__tests__/tui.test.ts. health-widget,
notification-{store,widget,overlay}, notifications-handler, notifications,
and auto-paused-ui-cleanup suites all pass.
Previously the chat-controller created one AssistantMessageComponent per
assistant message and removed/re-appended it to the chat container's tail
on every tool block, forcing all narration after every tool execution
regardless of stream order. Users had to scroll up to read text that was
written before each tool call.
Replace the reorder hack with a stream-order segment walker that walks
content[] left-to-right, collapses contiguous text/thinking blocks into
text-run segments, emits one segment per tool block, and append-only adds
new segments to chatContainer. AssistantMessageComponent gains a
ContentRange API so a single message can spawn multiple text-run
components, plus a separate showMetadata flag so timestamp/error footers
render only on the trailing segment without duplicating earlier text.
Adds a regression test that streams [text, tool, text, tool, text] and
asserts both interleaved order and per-segment rendered text content.
Closes#4144
Restore the isProviderRequestReady() guard lost during the main merge.
Tests in model-resolver.test.ts and model-resolver-initial-model-auth.test.ts
require findInitialModel() to skip an unauth'd saved default and fall
through to the first available model.
Remove hard-coded Anthropic/Claude defaults and silent provider swaps so
the app honors whatever model/provider the user has configured.
- src/cli.ts: drop the anthropic->claude-code auto-migration blocks that
were rewriting the user's saved defaultProvider on every startup.
- packages/pi-coding-agent/src/core/model-resolver.ts: delete the
defaultModelPerProvider table, drop the "recommended variant" swap
that silently upgraded e.g. claude-opus-4-6 to -extended, and replace
the provider-iteration first-available fallback with provider-sticky
(user's saved provider first, then first registry entry).
- src/startup-model-validation.ts: replace the openai/anthropic-first
fallback chain with Pi-default -> same-provider -> first-available.
- src/help-text.ts: use a generic provider/model-id example for --model
instead of claude-opus-4-6.
- src/tests/startup-model-validation.test.ts: update the fallback test
to assert provider stickiness rather than a specific Claude model id.
https://claude.ai/code/session_01CvuUuzuVjRcQN25263nG6V
Extract the post-tool text-block selection logic into a small pure
helper (`findLatestPinnableText`) so the regression scenario can be
covered without standing up the full interactive controller harness.
The new test pins the bug from #4120: when content blocks are
`[text1, tool1, text2_streaming]`, the helper must return `text1`
(not `text2`), because `text2` is still streaming live into the chat
container and mirroring it would render the same tokens twice.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>