Replaces the handwritten fetch() + SSE-parsing + custom retry loop in
packages/pi-ai/src/providers/google-gemini-cli.ts with direct calls into
`CodeAssistServer.generateContentStream()` from @google/gemini-cli-core.
Requests to cloudcode-pa.googleapis.com are now byte-identical to what
the real `gemini` CLI sends — same User-Agent, same Client-Metadata,
same retry semantics — which preserves Google's subsidised free-OAuth
quota treatment and eliminates third-party-bot ban risk.
File size: 798 → 511 lines (~290 lines deleted net).
What went away:
- DEFAULT_ENDPOINT, GEMINI_CLI_HEADERS (cli-core sets these itself)
- MAX_RETRIES, BASE_DELAY_MS, MAX_EMPTY_STREAM_RETRIES, EMPTY_STREAM_BASE_DELAY_MS
- CLAUDE_THINKING_BETA_HEADER (was antigravity-only)
- extractRetryDelay(), isRetryableError(), extractErrorMessage(),
sleep() — cli-core handles 429/5xx retry with Retry-After honoured
- needsClaudeThinkingBetaHeader() — antigravity-only stub
- CloudCodeAssistRequest + CloudCodeAssistResponseChunk interfaces
(replaced by @google/genai's GenerateContentParameters +
GenerateContentResponse — already unwrapped by cli-core)
- ~200-line SSE body-reader block (response.body.getReader() + decoder
+ 'data:' line parsing) — cli-core yields parsed objects directly
- Empty-stream retry workaround — handled upstream now
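For context, the deleted SSE block did roughly this kind of `data:` line parsing by hand (an illustrative sketch only — names and the simplified buffering are hypothetical, not the old code, which also handled retries and usage extraction inline):

```typescript
// Minimal sketch of hand-rolled SSE 'data:' parsing of the sort cli-core
// now does for us. Takes the text accumulated so far, returns parsed JSON
// events plus the incomplete trailing line to carry into the next read.
function parseSseChunk(buffer: string): { events: unknown[]; rest: string } {
  const events: unknown[] = [];
  const lines = buffer.split("\n");
  const rest = lines.pop() ?? ""; // keep partial trailing line for next chunk
  for (const line of lines) {
    if (!line.startsWith("data:")) continue; // ignore comments/blank keep-alives
    const payload = line.slice(5).trim();
    if (payload === "[DONE]") continue; // stream terminator sentinel
    events.push(JSON.parse(payload));
  }
  return { events, rest };
}
```

With `CodeAssistServer.generateContentStream()` this entire layer disappears: the async iterator yields already-parsed `GenerateContentResponse` objects.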
What stayed (pure SF adapter code):
- convertMessages() → @google/genai Content[]
- convertTools() → functionDeclarations
- AssistantMessageEventStream — our event shape
- Part-by-part processing: text vs thinking blocks, function-call
translation to ToolCall, thoughtSignature retention, usage token
extraction
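The kept translation layer boils down to role/part mapping of this shape (an illustrative sketch — the simplified types below are hypothetical stand-ins, and the real convertMessages() also handles tool results, thinking blocks, and thoughtSignature):

```typescript
// Sketch of the adapter's message translation. Gemini's Content role is
// "model", not "assistant", which is the one mandatory rename.
type SfMessage = { role: "user" | "assistant"; text: string };
type Content = { role: "user" | "model"; parts: { text: string }[] };

function convertMessagesSketch(messages: SfMessage[]): Content[] {
  return messages.map((m) => ({
    role: m.role === "assistant" ? "model" : "user",
    parts: [{ text: m.text }],
  }));
}
```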
New helper:
- buildCodeAssistServer(token, projectId) constructs an OAuth2Client
(google-auth-library) seeded with the SF-cached access token and
wraps it in a CodeAssistServer instance. Ready for future promotion
to cli-core's getOauthClient() for full auto-refresh; today we
still pass the token through from SF's auth storage (Strategy A
from the plan doc).
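A sketch of the helper, assuming the CodeAssistServer constructor takes an AuthClient plus project id (verify against the installed 0.38.2 — the signature has shifted between cli-core versions). OAuth2Client.setCredentials() is the real google-auth-library call for seeding a pre-fetched access token:

```typescript
import { OAuth2Client } from "google-auth-library";
import { CodeAssistServer } from "@google/gemini-cli-core";

function buildCodeAssistServer(token: string, projectId: string): CodeAssistServer {
  const client = new OAuth2Client();
  // Strategy A: seed only the SF-cached access token. No refresh_token is
  // set, so token refresh stays SF's responsibility until we promote this
  // to cli-core's getOauthClient().
  client.setCredentials({ access_token: token });
  return new CodeAssistServer(client, projectId);
}
```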
Live-verified end-to-end against gemini-2.5-flash using the user's
cached ~/.gemini/oauth_creds.json — got a real streaming response,
correct stopReason, and usage tokens accounted for.
Models registry test updated from 23 → 22 providers (antigravity gone).
Remaining 4 pi-ai test failures are pre-existing and unrelated
(custom-zai glm-5.1, resolveAnthropicBaseUrl #4140).
Type note: cli-core bundles its own nested copy of @google/genai, so
TypeScript sees two structurally-identical Content types. Runtime is
fine; a single `as any` cast at the generateContentStream call site
handles the nominal split.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
45 lines
1.1 KiB
JSON
{
  "name": "@singularity-forge/pi-ai",
  "version": "2.75.0",
  "description": "Unified LLM API (vendored from pi-mono)",
  "type": "module",
  "main": "./dist/index.js",
  "types": "./dist/index.d.ts",
  "exports": {
    ".": {
      "types": "./dist/index.d.ts",
      "import": "./dist/index.js"
    },
    "./oauth": {
      "types": "./dist/oauth.d.ts",
      "import": "./dist/oauth.js"
    },
    "./bedrock-provider": {
      "types": "./bedrock-provider.d.ts",
      "import": "./bedrock-provider.js"
    }
  },
  "scripts": {
    "build": "tsc -p tsconfig.json"
  },
  "dependencies": {
    "@anthropic-ai/sdk": "^0.73.0",
    "@anthropic-ai/vertex-sdk": "^0.14.4",
    "@aws-sdk/client-bedrock-runtime": "^3.983.0",
    "@google/gemini-cli-core": "0.38.2",
    "@google/genai": "^1.40.0",
    "@mistralai/mistralai": "^1.14.1",
    "@sinclair/typebox": "^0.34.41",
    "ajv": "^8.17.1",
    "ajv-formats": "^3.0.1",
    "chalk": "^5.6.2",
    "gaxios": "^6",
    "openai": "^6.26.0",
    "proxy-agent": "^6.5.0",
    "undici": "^7.24.2",
    "zod-to-json-schema": "^3.24.6"
  },
  "devDependencies": {
    "@smithy/node-http-handler": "^4.5.0"
  }
}