pi-ai: add @google/gemini-cli-core@0.38.2 dependency + refactor plan

Installs Google's official core library that powers the `gemini` CLI
binary. This is the first step of re-platforming pi-ai's
`google-gemini-cli` provider to use cli-core's transport instead of
handwritten fetch() calls against cloudcode-pa.googleapis.com.

Why:
  - cli-core requests are byte-for-byte identical to the official
    gemini CLI — preserves Google's subsidised free-OAuth quota and
    eliminates bot-detection drift risk from our reverse-engineered
    User-Agent / Client-Metadata headers.
  - Auto-inherit upstream improvements (new tool formats, grounding,
    session caching, quota displays) on `npm update`.
  - The `genai-proxy` extension (localhost proxy for gemini-cli-format
    clients) becomes "the CLI, but programmable" — same upstream
    behavior, hookable SF routing underneath.

Auth model (unchanged for users):
  - User runs the real `gemini` CLI once to OAuth; credentials land
    in ~/.gemini/oauth_creds.json (or keychain on newer installs).
  - SF reads those credentials via cli-core's own storage helpers;
    no SF-side OAuth flow, no separate login.

Scope for this commit: dependency only. The transport refactor
(replacing the fetch() calls in google-gemini-cli.ts with
CodeAssistServer.generateContentStream()) is queued as the next
task and documented in google-gemini-cli-core-plan.md with a
detailed API map, two integration strategies (transport-only vs
full cli-core auth), and a step-by-step implementation checklist.

Note: this commit adds 66 transitive deps to pi-ai (ajv, zod,
glob, mime, open, etc.). google-antigravity provider stays on
handwritten code — different sandbox endpoints, different auth
contract, not in cli-core's scope.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
Mikael Hugo 2026-04-19 10:33:22 +02:00
parent ffe86284d2
commit eed84a2624
2 changed files with 134 additions and 0 deletions

View file

@ -26,6 +26,7 @@
"@anthropic-ai/sdk": "^0.73.0",
"@anthropic-ai/vertex-sdk": "^0.14.4",
"@aws-sdk/client-bedrock-runtime": "^3.983.0",
"@google/gemini-cli-core": "0.38.2",
"@google/genai": "^1.40.0",
"@mistralai/mistralai": "^1.14.1",
"@sinclair/typebox": "^0.34.41",

View file

@ -0,0 +1,133 @@
# Re-platforming `google-gemini-cli` onto `@google/gemini-cli-core`
**Status:** Dependency installed (2026-04-19). Refactor pending next iteration.
## Goal
Replace the handwritten `fetch()` transport in `google-gemini-cli.ts` with calls
into `@google/gemini-cli-core`'s `CodeAssistServer` so requests to
`cloudcode-pa.googleapis.com` are byte-for-byte indistinguishable from the
official `gemini` CLI. Upside: free OAuth quota treatment, automatic inheritance
of upstream improvements, no reverse-engineered User-Agent / Client-Metadata
drift.
## Scope
**In-scope**
- `provider: "google-gemini-cli"` stream paths in `google-gemini-cli.ts`
(functions `streamGoogleGeminiCli` and `streamSimpleGoogleGeminiCli`).
**Out-of-scope (keep handwritten)**
- `provider: "google-antigravity"` — different sandbox endpoints
(`daily-cloudcode-pa.sandbox.googleapis.com`), different auth contract
(Antigravity IDE-scoped), different User-Agent requirements. cli-core
does not target these endpoints.
- `provider: "google"` (API key) and `provider: "google-vertex"` — unrelated
transports, stay on `@google/genai` directly.
## API mapping (cli-core 0.38.2)
| Today (handwritten) | After (cli-core) |
|------------------------------------------------------------|------------------------------------------------------------------------|
| `fetch(CLOUD_CODE_ASSIST_ENDPOINT + ":streamGenerateContent?alt=sse", …)` | `await server.generateContentStream(req, promptId, role)` returns `AsyncGenerator<GenerateContentResponse>` |
| Manual SSE body parsing (`response.body.getReader()` + `TextDecoder`) | cli-core yields already-parsed `GenerateContentResponse` chunks |
| Custom retry loop (429/5xx with backoff, endpoint cascade) | cli-core has internal retry in `requestStreamingPost` |
| Header assembly (`User-Agent`, `X-Goog-Api-Client`, `Client-Metadata`) | cli-core sets its own correct headers; just pass `httpOptions.headers` for extras |
| OAuth token carried in SF `apiKey` as `{ token, projectId }` JSON | Either keep (build `OAuth2Client` + set credentials) OR let cli-core load from `~/.gemini/oauth_creds.json` via `getOauthClient()` |
Relevant cli-core exports:
```ts
import { CodeAssistServer, CODE_ASSIST_ENDPOINT, type HttpOptions } from "@google/gemini-cli-core";
import { getOauthClient } from "@google/gemini-cli-core/dist/src/code_assist/oauth2.js";
import { AuthType } from "@google/gemini-cli-core";
import type { GenerateContentParameters, GenerateContentResponse } from "@google/genai";
```
## Two integration strategies
### Strategy A: Transport-only (incremental, lower risk)
Keep SF's existing auth storage (`apiKey` JSON blob with `{ token, projectId }`).
At each request:
```ts
import { OAuth2Client } from "google-auth-library";
import { CodeAssistServer } from "@google/gemini-cli-core";
const authClient = new OAuth2Client();
authClient.setCredentials({ access_token: token });
const server = new CodeAssistServer(authClient, projectId, {
headers: { /* extras if any */ },
});
for await (const chunk of await server.generateContentStream(req, promptId, "USER")) {
// feed chunk into existing AssistantMessageEventStream adapter
}
```
Pros: no SF auth-layer changes, minimal blast radius.
Cons: SF still does OAuth refresh manually; cli-core's auto-refresh benefit lost.
### Strategy B: Full cli-core auth (target state)
Drop the `apiKey` unpacking for `google-gemini-cli`. At provider init:
```ts
const authClient = await getOauthClient(AuthType.LOGIN_WITH_GOOGLE, config);
const server = new CodeAssistServer(authClient, projectId);
```
cli-core reads `~/.gemini/oauth_creds.json` (migrated to keychain on newer
installs), refreshes tokens, writes back. SF's `/login` flow for this provider
becomes "run the real `gemini` binary first" — exactly what the user asked for.
Pros: full integration benefit, SF drops ~80 lines of auth management.
Cons: breaks existing SF auth storage path for this provider; users must
re-authenticate via `gemini` CLI at least once.
Recommendation: **A first** (one commit, verifiable), **B second** as a
follow-up once A is stable.
## Implementation checklist (Strategy A)
1. Add factory helper `buildCodeAssistServer(token, projectId)` that constructs
`OAuth2Client` + `CodeAssistServer`. Put it alongside the existing helpers
near the top of `google-gemini-cli.ts`.
2. In `streamGoogleGeminiCli` (line 320): branch on `model.provider`. When
`"google-gemini-cli"`, use the new helper and replace the `fetch()` block
(lines ~392-450) with `server.generateContentStream()` consumption. When
`"google-antigravity"`, keep the existing codepath unchanged.
3. Convert cli-core's `GenerateContentResponse` chunks to SSE-equivalent
processing via the existing `processStreamChunk` helper (or inline the
minimal equivalent — cli-core already parses the JSON).
4. Remove `GEMINI_CLI_HEADERS` constant (cli-core sets its own).
5. Keep `ANTIGRAVITY_*` constants for the antigravity path.
6. Update `streamSimpleGoogleGeminiCli` similarly.
7. Tests:
- Replace `global.fetch` mocks targeting `cloudcode-pa.googleapis.com` with
`CodeAssistServer` prototype mocks (`generateContentStream` returns a
mocked AsyncGenerator).
- Keep antigravity tests unchanged (still fetch-based).
8. Live smoke test against a `gemini-*` model in dr-repo or a scratch project,
confirm OAuth flow works, streaming response arrives, cost is reported.
## Retry semantics
cli-core's internal retry on `requestStreamingPost` handles 429/5xx with
exponential backoff and consults `Retry-After` headers. That subsumes the
existing `MAX_RETRIES` / `BASE_DELAY_MS` loop in SF for this provider.
Keep the loop for antigravity (different endpoint, different quirks).
`extractRetryDelay` and `isRetryableError` helpers become antigravity-only.
## Why this matters (recap)
- **Free OAuth quota**: Google subsidises the official CLI's free tier. Our
requests blending in byte-for-byte preserves access.
- **Bot-detection resilience**: User-Agent / Client-Metadata drift risk goes
to zero — cli-core is the authoritative client.
- **Upstream improvements**: new tool formats, grounding, session caching,
quota displays ship via `npm update @google/gemini-cli-core`.
- **Our proxy becomes "the CLI, programmable"**: identical upstream behavior,
hookable local endpoint for any OpenAI-compatible tool.