Add vitest.config.ts with forks pool, v8 coverage, and package aliases.
Run migrate-to-vitest.mjs to replace `from "node:test"` imports with
`from 'vitest'` across 749 test files, converting mock.fn→vi.fn and
mock.timers→vi fake timers where needed.
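The core rewrite the migration script performs can be sketched as a set of source transforms. The regexes and the timer mapping below are assumptions about the script's internals, not a copy of it; `vi.fn` and `vi.useFakeTimers`/`vi.useRealTimers` are the real vitest APIs the node:test equivalents map onto.

```typescript
// Hypothetical model of the per-file rewrite migrate-to-vitest.mjs applies.
export function migrateTestSource(source: string): string {
  return source
    // Point test imports at vitest instead of node:test.
    .replace(/from\s+["']node:test["']/g, "from 'vitest'")
    // node:test's mock.fn() becomes vitest's vi.fn().
    .replace(/\bmock\.fn\(/g, "vi.fn(")
    // node:test's timer mock maps onto vitest's fake-timer calls
    // (an approximation; the real script handles this "where needed").
    .replace(/\bmock\.timers\.enable\s*\([^)]*\)/g, "vi.useFakeTimers()")
    .replace(/\bmock\.timers\.reset\s*\(\)/g, "vi.useRealTimers()");
}
```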
💘 Generated with Crush
Assisted-by: GLM-5.1 via Crush <crush@charm.land>
- Move guards phase after dispatch in dev path so unitType/unitId are
available for plan-gate validation
- Relocate UOK plan-gate from runDispatch into runGuards with
getSliceTaskCounts first-task-of-slice check
- Rename runLegacyAutoLoop → autoLoop in startAuto call sites
- Add plan quality gate in _deriveStateImpl via getSlicePlanBlockingIssue
- Clear path cache in invalidateStateCache
- Deprioritise minimax in search provider fallback ordering
- Fix native-search Anthropic heuristic to exclude copilot/minimax/kimi
clones while still matching claude-* models
- Add releaseIfIdle to CodexAppServerClient for clean short-lived process
exit
- Fix nested codex error message parsing
- Update search provider tests to clear minimax env vars
- Add native parser zero-task fallback in parsePlan
💘 Generated with Crush
Assisted-by: GLM-5.1 via Crush <crush@charm.land>
- Add codex-app-server-client for Codex app server communication
- Update openai-codex-responses provider integration
- Fix auto.ts to use runLegacyAutoLoop post-UOK-refactor
- Add advisor_allowed_providers preference support
- Fix slice plan blocking issue check in auto-recovery
- run-unit.ts: do NOT clear isSessionSwitchInFlight on timeout; let the
dangling newSession .finally() clear it via generation check. This fixes
'runUnit keeps the session-switch guard across a late newSession settlement'.
- auto.ts: use `runLegacyLoop: autoLoop` (not runLegacyAutoLoop) — autoLoop
already defaults to legacy-direct dispatch contract. Fixes source-inspection
test that expects the literal text 'runLegacyLoop: autoLoop'.
- state.ts: remove over-strict plan quality check from state derivation so
minimal plans (no review sections) don't block task dispatch.
- auto-recovery.ts, auto-timers.ts: minor cleanup from agent sweep.
- packages/pi-ai: github-copilot.ts OAuth helper + index.ts export wiring.
- openai-codex.ts: drop stale PKCE residuals after simplification.
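The generation-check contract from the run-unit.ts bullet can be sketched as follows; the variable names follow the commit's description, but the surrounding machinery is invented for illustration.

```typescript
// Sketch of the pattern: the timeout path never clears the guard; only
// the .finally() of the matching newSession attempt does, and a late
// settlement from a superseded attempt is a no-op.
let isSessionSwitchInFlight = false;
let switchGeneration = 0;

function startSessionSwitch(newSession: () => Promise<void>): void {
  isSessionSwitchInFlight = true;
  const myGeneration = ++switchGeneration;
  newSession().finally(() => {
    // Only the most recent attempt may clear the guard.
    if (myGeneration === switchGeneration) {
      isSessionSwitchInFlight = false;
    }
  });
}
```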
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- openai-codex.ts: replace hand-rolled PKCE flow with simple read of
~/.codex/auth.json written by the real codex CLI after user authentication.
Removes ~250 lines of local callback server + browser dance code.
- openai-codex-responses.ts: minor residual cleanup
- openai-completions.ts: drop remaining `as any` stream_options cast
- anthropic-shared.ts: use `unknown` cast on thinkingNoBudget path
- pi-coding-agent/extensions/types.ts: minor type addition
- db-tools.ts: explicit AgentToolResult return type on execute handlers
- requesting-code-review/SKILL.md: prompt wording cleanup
- subagent/index.ts: capability registration wiring
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- anthropic-shared.ts: replace `as any` cast on thinkingNoBudget path with
`as unknown as Record<string, unknown>` for auditability; remove `as any`
on server_tool_use block (SDK type is now correct)
- openai-completions.ts: drop residual `as any` casts after SDK type update
- db-tools.ts: add explicit AgentToolResult return type annotation on execute
handlers to resolve implicit-any lint
- requesting-code-review/SKILL.md: update review skill prompt
- subagent/index.ts: wire subagent capability registration
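The auditable-cast convention from the anthropic-shared.ts bullet can be sketched like this; `ThinkingConfig` is a stand-in type, not the real SDK type.

```typescript
// Instead of a bare `as any`, widen through `unknown` so the target shape
// is explicit and greppable at the cast site.
interface ThinkingConfig { type: "enabled"; budget_tokens?: number }

function buildThinkingParam(noBudget: boolean): Record<string, unknown> {
  const thinking: ThinkingConfig = { type: "enabled" };
  if (!noBudget) thinking.budget_tokens = 4096;
  // `as unknown as Record<string, unknown>` states the resulting shape,
  // which a reviewer can audit; `as any` states nothing.
  return thinking as unknown as Record<string, unknown>;
}
```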
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- package.json: add 'typecheck' script (build:pi + tsc --noEmit) so pi-ai
and pi-coding-agent typecheck under the same command surface SF uses.
- anthropic-shared.ts: replace 'as any' casts with proper Anthropic SDK
types (ServerToolUseBlockParam, WebSearchToolResultBlockParam,
CacheControlEphemeral). The cache_control variant is documented inline
so the cast is auditable.
- openai-completions.ts: drop the 'as any' on stream_options — the type
system can verify the assignment now.
- openai-codex-responses.ts, package-manager.ts, skills.ts: annotate the
three remaining empty catches with one-line WHY comments (best-effort
cleanup, malformed ignore files, partial directory traversal). Empty
catch with no rationale is an SF012 anti-pattern; with rationale it is
a deliberate fallback.
- oauth/github-copilot.ts, oauth/openai-codex.ts: add UPSTREAM AUDIT
blocks documenting why these hand-rolled OAuth flows stay hand-rolled
rather than delegating to @octokit/auth-oauth-device or @openai/codex.
AbortSignal coverage and provider-specific surface area are the gating
concerns; re-audit triggers are named.
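The annotated-empty-catch convention described above can be sketched as follows; the function and file semantics are illustrative, not the actual skills.ts code.

```typescript
// An empty catch is only acceptable with a one-line WHY comment; without
// one it is the SF012 anti-pattern, with one it is a deliberate fallback.
import * as fs from "node:fs";

function readIgnorePatterns(path: string): string[] {
  try {
    return fs.readFileSync(path, "utf8").split("\n").filter(Boolean);
  } catch {
    // Best-effort: a missing or malformed ignore file just means no extra
    // patterns; failing the whole directory traversal would be worse.
    return [];
  }
}
```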
Replace ~700 LOC of hand-rolled OAuth and onboarding with cli-core's own
getOauthClient + setupUser. The provider now reads ~/.gemini/oauth_creds.json
itself (via cli-core), refreshes tokens, and discovers the Code Assist
project + tier server-side — exactly like the real gemini CLI does.
- provider/google-gemini-cli.ts: drop apiKey={token,projectId} JSON
plumbing; getCodeAssistServer() uses cli-core for everything
- delete utils/oauth/google-gemini-cli.ts (457 LOC: hand-rolled login,
PKCE, callback server, discoverProject, onboardUser, tier handling)
- delete utils/oauth/google-oauth-utils.ts (201 LOC: only consumed by
the deleted gemini-cli helper)
- oauth/index.ts: remove gemini-cli from BUILT_IN_OAUTH_PROVIDERS
registry; google-gemini-cli is no longer SF-managed
- auth-storage.ts: update 3 error messages to direct users to the real
gemini CLI for authentication instead of the removed /login command
Login UX: users authenticate with the real gemini CLI; we just consume
~/.gemini/oauth_creds.json. Whole-provider disable goes through manual
settings.json edit (per-model toggle still works in interactive UI).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
84 files spanning provider capabilities, model routing, headless
runtime, sf auto subsystems, gitbook docs, and test coverage. Snapshotted
so headless auto can resume M004 (Production Readiness) S03
(Verification Gate Validation) on a clean tree.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pi-mono Tier 0 #2 — sf-driven port of PR #3650.
Some LLM providers reject API calls when `tools: []` is sent (an empty
array), but accept the call when the tools field is omitted entirely.
This guards each provider's request-body builder to omit `tools` when
the tool list is empty, instead of serialising the empty array.
Files (5 provider builders):
- packages/pi-ai/src/providers/openai-completions.ts
- packages/pi-ai/src/providers/openai-responses.ts
- packages/pi-ai/src/providers/openai-codex-responses.ts
- packages/pi-ai/src/providers/azure-openai-responses.ts
- packages/pi-ai/src/providers/anthropic-shared.ts (covers anthropic
and anthropic-vertex which both import buildParams from it)
Pattern: `if (context.tools)` → `if (context.tools && context.tools.length > 0)`.
Preserved: the `else if (hasToolHistory(context.messages))` branch in
openai-completions.ts that intentionally emits `tools: []` for
LiteLLM/Anthropic-proxy compatibility is unchanged.
Type-check passes.
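A minimal model of the guard, with simplified stand-in types for the real provider context:

```typescript
// Omit `tools` when the list is empty instead of serialising `tools: []`,
// which some providers reject.
interface Tool { name: string }
interface Context { tools?: Tool[] }

function buildBody(context: Context): Record<string, unknown> {
  const body: Record<string, unknown> = { model: "example" };
  // Before: `if (context.tools)` — truthy for an empty array, so `tools: []`
  // was still serialised. The length check closes that gap.
  if (context.tools && context.tools.length > 0) {
    body.tools = context.tools;
  }
  return body;
}
```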
Co-Authored-By: sf v2.75.1 (session 38ed0a48)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Pi-mono Tier 0 #5 — first sf-driven port. sf-from-source dispatched the
task in print mode and produced this fix autonomously.
Adds getModelMatchCandidates(modelId, modelName?) helper that normalizes
both inputs to lowercase and dash-separated form
(s.replace(/[\s_.:]+/g, "-")). Inference profile ARNs don't embed the
model name; the helper lets capability checks match against either the
inference profile ARN or the underlying model name.
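A sketch of the helper as described, using the normalisation from the commit; the capability-check wrapper is illustrative:

```typescript
// Normalise to lowercase, dash-separated form so dot/underscore/space/colon
// variants of the same model name compare equal.
function normalize(s: string): string {
  return s.toLowerCase().replace(/[\s_.:]+/g, "-");
}

function getModelMatchCandidates(modelId: string, modelName?: string): string[] {
  const candidates = [normalize(modelId)];
  if (modelName) candidates.push(normalize(modelName));
  return candidates;
}

// A capability check matches if any candidate contains the marker, so an
// inference-profile ARN (which does not embed the model name) still
// matches via the passed-through model name.
function supportsFeature(marker: string, modelId: string, modelName?: string): boolean {
  return getModelMatchCandidates(modelId, modelName).some((c) => c.includes(marker));
}
```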
Updated:
- supportsAdaptiveThinking — uses the helper; consolidates the
opus-4.6/opus-4-6 dot-vs-dash variants.
- mapThinkingLevelToEffort — same pattern.
- supportsPromptCaching — same pattern (also from pi-mono PR #3527).
- streamSimpleBedrock and buildAdditionalModelRequestFields — pass
model.name through to capability checks.
Type-check passes (cd packages/pi-ai && npx tsc --noEmit).
Co-Authored-By: sf v2.75.1 (session 911dd2de)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds direct xiaomi token-plan API access alongside the existing
OpenRouter-routed xiaomi entries. ADDITIVE only — OpenRouter cleanup is
a separate follow-up.
Three new region providers:
- xiaomi-token-plan-ams (Amsterdam, default for plain `xiaomi`)
- xiaomi-token-plan-sgp (Singapore)
- xiaomi-token-plan-cn (China)
All use Anthropic Messages API. Env-var resolution: XIAOMI_API_KEY →
XIAOMI_TOKEN_PLAN_API_KEY → MIMO_API_KEY (in that fallback order).
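The fallback order can be sketched as a first-match lookup; reading from an injected env record rather than process.env keeps the sketch testable.

```typescript
// First defined value wins, in the order the commit specifies.
const XIAOMI_ENV_FALLBACKS = [
  "XIAOMI_API_KEY",
  "XIAOMI_TOKEN_PLAN_API_KEY",
  "MIMO_API_KEY",
];

function resolveXiaomiApiKey(env: Record<string, string | undefined>): string | undefined {
  for (const name of XIAOMI_ENV_FALLBACKS) {
    const value = env[name];
    if (value) return value;
  }
  return undefined;
}
```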
Three xiaomi MiMo models registered under each direct provider:
- mimo-v2-flash (256k ctx, 64k output, text-only, reasoning)
- mimo-v2-omni (256k ctx, 128k output, text+image, reasoning)
- mimo-v2-pro (1M ctx, 128k output, text-only, reasoning)
Same model literals × 4 provider keys, different baseUrls per region.
Test count assertion bumped 22 → 26 providers.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Two codex-rescue tasks landed together:
1. Auto-coerce JSON-schema validator: when a tool field declares
{type:"array", items:{type:"string"}} and the model sends a single
string, wrap it in [string] before validation instead of hard-rejecting.
Fixes the recurring "keyDecisions: must be array" rejection on
sf_complete_task that wasted retries.
2. Provider_model_allow filter (proper implementation with helpers):
- resolveProviderModelAllowList / isProviderModelAllowed /
filterModelsByProviderModelAllow helpers in preferences-models
- Wired into model-registry and auto-model-selection
- New tests/provider-model-allow.test.ts
Tools coerced: sf_complete_task, sf_complete_milestone, sf_plan_milestone,
sf_plan_slice, sf_replan_slice, sf_reassess_roadmap (key list fields).
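The coercion in task 1 can be sketched as a pre-validation step; the schema shape is simplified and the real validator is not shown.

```typescript
// When the schema declares an array of strings and the model sent a bare
// string, wrap it before validation instead of hard-rejecting.
interface ArraySchema { type: string; items?: { type?: string } }

function coerceSingleString(value: unknown, schema: ArraySchema): unknown {
  if (
    schema.type === "array" &&
    schema.items?.type === "string" &&
    typeof value === "string"
  ) {
    // e.g. keyDecisions: "ship it" becomes ["ship it"] rather than a
    // "must be array" rejection that wastes a retry.
    return [value];
  }
  return value;
}
```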
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: OpenAI Codex <noreply@openai.com>
Real bugs from 2nd-pass scan:
1. extension-registry.ts: discoverAllManifests skipped symlinked extension
dirs because Dirent.isDirectory() returns false for symlinks. Dev-workflow
symlinks under ~/.sf/agent/extensions/ were invisible to list/enable/
disable/info. Matches the regression documented in
symlink-extension-discovery.test.ts — the test inlines the correct logic,
but this callsite still had the buggy form. Now accepts isDirectory() ||
isSymbolicLink().
2. headless.ts SIGINT handler: client.stop() failures were double-silenced
(inner .catch(()=>{}), outer try{}catch{}). Interactive mode logs stop
errors to stderr. Restored head/headless parity — still fire-and-forget
(exit code is forced via process.exit) but failures are observable.
3. openai-codex-responses.ts SSE parser: malformed data frames were silently
dropped so broken streams looked identical to clean ones. Now debug-logs
the parse error with the chunk context so broken streams are
distinguishable in logs. Stream continues on bad chunk (one bad frame
shouldn't kill the whole generation).
4. web/cleanup-service.ts generated script: bare 'catch {}' around four native
git calls (nativeBranchList, nativeDetectMainBranch, nativeBranchListMerged,
nativeForEachRef). A failed main-branch detection silently left mainBranch
undefined-shaped, then the next native call operated on garbage. Now emits
console.warn so failures surface in the subprocess log.
5. web/undo-service.ts generated script: git revert failure was silenced;
when --no-commit failed, user saw commitsReverted=0 with no reason. Now
logs the revert error before attempting --abort (abort itself remains
best-effort silent).
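The acceptance check from fix 1 can be modelled on a minimal Dirent-like shape, so both real `fs.Dirent` values and test doubles satisfy it:

```typescript
// Dirent.isDirectory() returns false for a symlink even when its target
// is a directory, so symlinked extension dirs need the second check.
interface DirentLike {
  isDirectory(): boolean;
  isSymbolicLink(): boolean;
}

function isExtensionDirEntry(entry: DirentLike): boolean {
  return entry.isDirectory() || entry.isSymbolicLink();
}
```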
False positives from the same scan (investigated and dismissed):
- auto-worktree.ts #2505: code uses ':(exclude).sf/milestones' pathspec +
shelter-and-restore, which is a better fix than the 'drop --include-untracked'
approach the test comment describes. Test comment is stale; source is correct.
- Lifecycle handler unhandled rejections across 5 extensions: extensions/runner.ts
already try/catches handler invocations and routes to emitError. Wrapping the
individual handlers would be redundant.
RequestedThinkingLevel adds "auto" to the reasoning option. Each provider
handles it natively:
- Claude 4.x (anthropic/bedrock): adaptive thinking, no effort constraint
- Gemini 2.5 Pro/Flash (google/vertex/gemini-cli): THINKING_LEVEL_UNSPECIFIED
- GPT-5+ (openai-responses/azure): reasoning.effort omitted, model decides
- Kimi (kimi-coding): {"type":"enabled"} without budget_tokens via new
capabilities.thinkingNoBudget flag — model manages reasoning depth
- GLM (zai, thinkingFormat:zai): enable_thinking:true already correct
- MiniMax (anthropic API): explicit budget_tokens required, resolves to medium
ModelCapabilities.thinkingNoBudget: new flag for Anthropic-compatible providers
that accept {"type":"enabled"} without a budget (Kimi API).
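The Anthropic-shaped branch of that handling can be sketched like this; the budget value and function shape are illustrative stand-ins, not the real pi-ai code.

```typescript
// How "auto" resolves for Anthropic-compatible providers, per the commit:
// thinkingNoBudget providers (Kimi) get enabled-without-budget; providers
// that require an explicit budget (MiniMax) resolve "auto" to a fixed one.
interface Caps { thinkingNoBudget?: boolean }

function buildAnthropicThinking(level: string, caps: Caps): Record<string, unknown> {
  if (level === "auto" && caps.thinkingNoBudget) {
    // {"type":"enabled"} with no budget_tokens; the model manages depth.
    return { type: "enabled" };
  }
  // Explicit-budget path ("auto" resolves to a medium budget here).
  return { type: "enabled", budget_tokens: 8192 };
}
```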
models.generated.ts: add Kimi K2.6 (id: kimi-for-coding, beta API); add
thinkingNoBudget capability to all kimi-coding models.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
generate-models.ts now imports @google/gemini-cli-core's
VALID_GEMINI_MODELS set and iterates it to produce SF's google-gemini-cli
provider entries. Single source of truth: when Google ships a new Gemini
model, it lands in cli-core first, then flows into SF on
`npm update @google/gemini-cli-core` + `generate-models.ts` re-run —
no more hand-editing the generate script.
Before: 6 hardcoded entries (gemini-2.0/2.5/3 flash + pro preview, etc.)
After: 7 entries sourced dynamically, filtered to drop `-customtools`
variants which require a different tool protocol:
gemini-2.5-pro, gemini-2.5-flash, gemini-2.5-flash-lite,
gemini-3-pro-preview, gemini-3-flash-preview,
gemini-3.1-pro-preview, gemini-3.1-flash-lite-preview
Capability tagging uses cli-core's isProModel / isPreviewModel so
reasoning=true for pro + 3.x preview variants (excluding flash-lite).
Context-window / max-output-tokens kept in an SF-local override table
since cli-core doesn't publish those per-model.
Pre-existing 4 test failures (zai glm-5.1 x3, anthropic resolveBaseUrl
#4140) unchanged.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replaces the handwritten fetch() + SSE-parsing + custom retry loop in
packages/pi-ai/src/providers/google-gemini-cli.ts with direct calls into
`CodeAssistServer.generateContentStream()` from @google/gemini-cli-core.
Requests to cloudcode-pa.googleapis.com are now byte-identical to what
the real `gemini` CLI sends — same User-Agent, same Client-Metadata,
same retry semantics — which preserves Google's subsidised free-OAuth
quota treatment and eliminates third-party-bot ban risk.
File size: 798 → 511 lines (~290 lines deleted net).
What went away:
- DEFAULT_ENDPOINT, GEMINI_CLI_HEADERS (cli-core sets these itself)
- MAX_RETRIES, BASE_DELAY_MS, MAX_EMPTY_STREAM_RETRIES, EMPTY_STREAM_BASE_DELAY_MS
- CLAUDE_THINKING_BETA_HEADER (was antigravity-only)
- extractRetryDelay(), isRetryableError(), extractErrorMessage(),
sleep() — cli-core handles 429/5xx retry with Retry-After honoured
- needsClaudeThinkingBetaHeader() — antigravity-only stub
- CloudCodeAssistRequest + CloudCodeAssistResponseChunk interfaces
(replaced by @google/genai's GenerateContentParameters +
GenerateContentResponse — already unwrapped by cli-core)
- ~200-line SSE body-reader block (response.body.getReader() + decoder
+ 'data:' line parsing) — cli-core yields parsed objects directly
- Empty-stream retry workaround — handled upstream now
What stayed (pure SF adapter code):
- convertMessages() → @google/genai Content[]
- convertTools() → functionDeclarations
- AssistantMessageEventStream — our event shape
- Part-by-part processing: text vs thinking blocks, function-call
translation to ToolCall, thoughtSignature retention, usage token
extraction
New helper:
- buildCodeAssistServer(token, projectId) constructs an OAuth2Client
(google-auth-library) seeded with the SF-cached access token and
wraps it in a CodeAssistServer instance. Ready for future promotion
to cli-core's getOauthClient() for full auto-refresh; today we
still pass the token through from SF's auth storage (Strategy A
from the plan doc).
Live verified end-to-end against gemini-2.5-flash using the user's
cached ~/.gemini/oauth_creds.json — got real streaming response,
correct stopReason, usage tokens accounted.
Models registry test updated from 23 → 22 providers (antigravity gone).
Remaining 4 pi-ai test failures are pre-existing and unrelated
(custom-zai glm-5.1, resolveAnthropicBaseUrl #4140).
Type note: cli-core bundles its own nested copy of @google/genai, so
TypeScript sees two structurally-identical Content types. Runtime is
fine; a single `as any` cast at the generateContentStream call site
handles the nominal split.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Continues the antigravity rip-out (previous commit covered SF + pi-coding-
agent UI layer). This commit removes the code from pi-ai:
- Delete packages/pi-ai/src/utils/oauth/google-antigravity.ts (313 lines)
- Update oauth/index.ts: drop antigravityOAuthProvider, refreshAntigravityToken,
loginAntigravity exports + registry entry. Add comment explaining why
(no vendor core lib + Google ban risk).
- google-gemini-cli.ts: strip ANTIGRAVITY_* constants, ANTIGRAVITY_ENDPOINT_FALLBACKS,
getAntigravityHeaders(), ANTIGRAVITY_SYSTEM_INSTRUCTION, and all
isAntigravity branching from streamGoogleGeminiCli + buildRequest.
File header rewritten. needsClaudeThinkingBetaHeader() collapses to
always-false (antigravity was the only path that needed it).
- google-shared.ts: strip stale Antigravity comments (file still shared
between google, google-gemini-cli, google-vertex).
- types.ts: drop "google-antigravity" from Api / KnownProvider union.
- models.generated.ts: remove google-antigravity provider block (~170 lines,
4 claude-* models that were only served via Antigravity).
- models.generated.test.ts: drop from expected-providers snapshot.
- scripts/generate-models.ts: remove antigravity model emission + context-
window override so future regenerations don't re-add it.
Reasoning (same as previous commit): Antigravity has no vendor-published
core library we can embed. Hand-rolled OAuth against
daily-cloudcode-pa.sandbox.googleapis.com was exactly the pattern
Google is banning for third-party tools. Removing it eliminates the
risk surface.
Breaking change: users with google-antigravity configured in their
models.* block will need to migrate to google-gemini-cli (OAuth via
the real `gemini` CLI), google (API key), or google-vertex (GCP auth).
Build passes. Next commit wires the google-gemini-cli provider to
@google/gemini-cli-core per the plan.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Installs Google's official core library that powers the `gemini` CLI
binary. This is the first step of re-platforming pi-ai's
`google-gemini-cli` provider to use cli-core's transport instead of
handwritten fetch() calls against cloudcode-pa.googleapis.com.
Why:
- cli-core requests are byte-for-byte identical to the official
gemini CLI — preserves Google's subsidised free-OAuth quota and
eliminates bot-detection drift risk from our reverse-engineered
User-Agent / Client-Metadata headers.
- Auto-inherit upstream improvements (new tool formats, grounding,
session caching, quota displays) on `npm update`.
- The `genai-proxy` extension (localhost proxy for gemini-cli-format
clients) becomes "the CLI, but programmable" — same upstream
behavior, hookable SF routing underneath.
Auth model (unchanged for users):
- User runs the real `gemini` CLI once to OAuth; credentials land
in ~/.gemini/oauth_creds.json (or keychain on newer installs).
- SF reads those credentials via cli-core's own storage helpers;
no SF-side OAuth flow, no separate login.
Scope for this commit: dependency only. The transport refactor
(replacing the fetch() calls in google-gemini-cli.ts with
CodeAssistServer.generateContentStream()) is queued as the next
task and documented in google-gemini-cli-core-plan.md with a
detailed API map, two integration strategies (transport-only vs
full cli-core auth), and a step-by-step implementation checklist.
Note: this commit adds 66 transitive deps to pi-ai (ajv, zod,
glob, mime, open, etc.). google-antigravity provider stays on
handwritten code — different sandbox endpoints, different auth
contract, not in cli-core's scope.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
ChatGPT-backed Codex sign-in no longer exposes the removed 5.1/5.2 Codex variants. Filter those models out of the openai-codex OAuth provider so GSD stops surfacing selections that immediately fail; API-key-backed OpenAI models remain available.
(cherry picked from commit 1aedba583916826fc5c6129037f61e9802010e46)
LongCat (Meituan) ships an Anthropic-compatible endpoint at
https://api.longcat.chat/anthropic that authenticates via
`Authorization: Bearer $KEY` instead of Anthropic's native `x-api-key`
header. Without this change, pi sends x-api-key and LongCat replies
with 401 invalid_api_key / missing_api_key.
Same topology as the existing alibaba-coding-plan / minimax /
minimax-cn entries (#3783).
- Add "longcat" to usesAnthropicBearerAuth() so createClient routes
the key through authToken.
- Add "longcat": "LONGCAT_API_KEY" to env-api-keys.ts envMap so
getEnvApiKey() can resolve it when options.apiKey is absent.
- Add "longcat" to KnownProvider so the === literal check type-checks.
- Extend anthropic-auth.test.ts to assert usesAnthropicBearerAuth
returns true for longcat.
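The two lookups this commit touches can be sketched as follows; the provider lists are trimmed to the entries named above, and the function shapes are illustrative.

```typescript
// Anthropic-compatible endpoints that authenticate via
// `Authorization: Bearer $KEY` rather than the native x-api-key header.
const ANTHROPIC_BEARER_AUTH_PROVIDERS = new Set([
  "alibaba-coding-plan",
  "minimax",
  "minimax-cn",
  "longcat",
]);

function usesAnthropicBearerAuth(provider: string): boolean {
  return ANTHROPIC_BEARER_AUTH_PROVIDERS.has(provider);
}

// Env-var resolution used when options.apiKey is absent.
const envMap: Record<string, string> = { longcat: "LONGCAT_API_KEY" };

function getEnvApiKey(
  provider: string,
  env: Record<string, string | undefined>,
): string | undefined {
  const name = envMap[provider];
  return name ? env[name] : undefined;
}
```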