# ADR-007: Model Catalog Split and Provider API Encapsulation **Status:** Proposed **Date:** 2026-04-03 **Deciders:** Jeremy McSpadden **Related:** ADR-004 (capability-aware model routing), [ADR-005](https://github.com/gsd-build/gsd-2/issues/2790), [ADR-006](https://github.com/gsd-build/gsd-2/issues/2995), `packages/pi-ai/src/providers/`, `packages/pi-ai/src/models.ts` ## Context The model/provider system in `pi-ai` has two structural problems worth fixing — but the system is **not fundamentally broken**. The heavy lifting (lazy SDK imports, registry-based dispatch, extension-based registration) is already well-designed. This ADR targets the two areas where the current design creates real friction without proposing unnecessary runtime changes. ### Current Architecture ``` stream.ts └─ import "./providers/register-builtins.js" ← side-effect import at load time ├─ import anthropic.ts (6.8 KB) ├─ import anthropic-vertex.ts (3.9 KB) ├─ import openai-completions.ts (26 KB) ├─ import openai-responses.ts (6.4 KB) ├─ import openai-codex-responses.ts (29 KB) ├─ import azure-openai-responses.ts (7.8 KB) ├─ import google.ts (13.6 KB) ├─ import google-vertex.ts (14.5 KB) ├─ import google-gemini-cli.ts (30 KB) ├─ import mistral.ts (18.9 KB) └─ amazon-bedrock.ts (24 KB) ← only lazy-loaded provider models.ts └─ import models.generated.ts ← 13,848 lines, ALL providers, loaded at init └─ import models.custom.ts ← 197 lines, additional providers ``` ### What Already Works Well 1. **SDK lazy loading.** Every provider file uses `async function getXxxClass()` with a cached dynamic `import()`. The heavy npm packages (`@anthropic-ai/sdk`, `openai`, `@google/genai`, `@aws-sdk/*`, `@mistralai/*`) are only loaded on first API call. This is where the real startup cost would be — and it's already handled. 2. **Registry-based dispatch.** `api-registry.ts` cleanly maps API types to stream functions. Callers use `stream(model, context)` and the registry routes to the right provider. This pattern is sound. 3. **Extension registration.** Ollama and Claude Code CLI register via `registerApiProvider()` at runtime. This extensibility point works correctly. 4. **Provider implementation code loading (~200KB total).** While all providers load eagerly, V8 parses local `.js` files in single-digit milliseconds each. The total parse cost for all provider files is ~10-30ms — not a user-visible bottleneck on a CLI that's about to make a multi-second API call anyway. ### What's Actually Worth Fixing #### Problem 1: Monolithic model catalog — developer experience, not runtime `models.generated.ts` is **13,848 lines in a single file**. This creates real friction: - **PR reviews are painful.** When the generation script runs, the diff is a wall of changes across unrelated providers. Reviewers can't tell what actually changed for a specific provider. - **Navigation is slow.** Finding a specific model requires scrolling or searching through thousands of lines of static object literals. - **Merge conflicts are frequent.** Any two PRs that touch model generation will conflict on the same monolithic file. - **Git blame is useless.** Every line was "last changed" by the generation script, obscuring the history of individual provider additions. The runtime cost of loading all model definitions is negligible — a Map of ~200 model objects is maybe 50-100KB of heap. The problem is purely about code organization and developer workflow. #### Problem 2: Barrel export leaks provider internals — API design `packages/pi-ai/src/index.ts` re-exports every provider module's internals: ```typescript export * from "./providers/anthropic.js"; export * from "./providers/google.js"; export * from "./providers/google-gemini-cli.js"; export * from "./providers/google-vertex.js"; export * from "./providers/mistral.js"; export * from "./providers/openai-completions.js"; export * from "./providers/openai-responses.js"; // ... etc ``` This is a public API problem: - **Consumers can bypass the registry.** Any code that `import { streamAnthropic } from "pi-ai"` has a direct dependency on an implementation detail that should be internal. - **Refactoring is blocked.** Renaming a function inside a provider file is a breaking change because it's re-exported from the package root. - **API surface is unnecessarily large.** The public API should be `stream()`, `streamSimple()`, `registerApiProvider()`, model utilities, and types. Provider-specific stream functions are implementation details. ### What Is NOT Worth Changing **Lazy provider loading (converting `register-builtins.ts` to async on-demand loading).** This was considered and rejected because: 1. **The SDKs are already lazy.** The heavy cost is handled. Provider implementation code (~200KB of local `.js`) parses in ~10-30ms total. 2. **Async resolution adds complexity to the hot path.** `stream.ts` currently does a synchronous `Map.get()`. Making `resolveApiProvider` async adds a microtask hop to every API call — not just the first. Small but measurable, and for no user-visible gain. 3. **High blast radius, low payoff.** Touching `stream.ts`, `api-registry.ts`, and the registration lifecycle simultaneously risks regressions in the core streaming path for an optimization that wouldn't show up in profiling. 4. **Bedrock's lazy loading is a special case, not a template.** It exists because `@aws-sdk/client-bedrock-runtime` is uniquely massive. Generalizing this pattern to providers where the SDK is already lazy-imported doesn't compound the benefit. ## Decision **Make two targeted improvements to code organization and API hygiene. Do not change runtime loading behavior.** ### Change 1: Split `models.generated.ts` into per-provider files Replace the monolithic 13,848-line generated file with per-provider files: ``` packages/pi-ai/src/models/ ├── index.ts ← re-exports combined registry, same public API ├── generated/ │ ├── anthropic.ts ← Anthropic model definitions │ ├── openai.ts ← OpenAI model definitions │ ├── google.ts ← Google model definitions │ ├── mistral.ts ← Mistral model definitions │ ├── amazon-bedrock.ts ← Bedrock model definitions │ ├── groq.ts ← Groq model definitions │ ├── xai.ts ← xAI model definitions │ ├── cerebras.ts ← Cerebras model definitions │ ├── openrouter.ts ← OpenRouter model definitions │ └── ... ← one file per provider in the catalog ├── custom.ts ← replaces models.custom.ts (unchanged content) └── capability-patches.ts ← CAPABILITY_PATCHES extracted for clarity ``` **`models/index.ts` keeps the exact same synchronous public API:** ```typescript // models/index.ts // GSD-2 — Model registry (split by provider for maintainability) import { ANTHROPIC_MODELS } from "./generated/anthropic.js"; import { OPENAI_MODELS } from "./generated/openai.js"; import { GOOGLE_MODELS } from "./generated/google.js"; // ... one import per provider import { CUSTOM_MODELS } from "./custom.js"; import { CAPABILITY_PATCHES, applyCapabilityPatches } from "./capability-patches.js"; import type { Api, KnownProvider, Model, Usage } from "../types.js"; // Combine all generated models into single registry — same as today const MODELS = { ...ANTHROPIC_MODELS, ...OPENAI_MODELS, ...GOOGLE_MODELS, // ... }; // Rest of the file is identical to current models.ts: // modelRegistry Map construction, capability patch application, // getModel(), getProviders(), getModels(), calculateCost(), // supportsXhigh(), modelsAreEqual() ``` **Key constraint: loading stays synchronous and eager.** All model files are statically imported. The Map is built at module init exactly as today. No async, no lazy loading, no runtime behavior change. This is purely a file organization change. **Update `generate-models.ts`** to emit one file per provider instead of a single `models.generated.ts`. The script already groups models by provider internally — it just needs to write separate files instead of one. #### Why this matters | Before | After | |--------|-------| | PR diffs show 13K-line file changes | PR diffs scoped to the provider that changed | | Merge conflicts on any concurrent model update | Conflicts only when same provider is touched | | `git blame` shows "regenerate models" for every line | `git blame` shows per-provider history | | Finding a model = search through 13K lines | Finding a model = open the provider file | | One reviewer must understand all providers | Reviewers only need context for affected provider | ### Change 2: Stop barrel-exporting provider internals **Update `packages/pi-ai/src/index.ts`:** ```typescript // Before (current — 17 re-exports including all providers): export * from "./providers/anthropic.js"; export * from "./providers/azure-openai-responses.js"; export * from "./providers/google.js"; export * from "./providers/google-gemini-cli.js"; export * from "./providers/google-vertex.js"; export * from "./providers/mistral.js"; export * from "./providers/openai-completions.js"; export * from "./providers/openai-responses.js"; export * from "./providers/register-builtins.js"; // ... // After (clean public API): export * from "./api-registry.js"; export * from "./env-api-keys.js"; export * from "./models/index.js"; export * from "./providers/register-builtins.js"; // resetApiProviders() is public export * from "./stream.js"; export * from "./types.js"; export * from "./utils/event-stream.js"; export * from "./utils/json-parse.js"; export type { OAuthAuthInfo, OAuthCredentials, /* ... */ } from "./utils/oauth/types.js"; export * from "./utils/overflow.js"; export * from "./utils/typebox-helpers.js"; export * from "./utils/repair-tool-json.js"; export * from "./utils/validation.js"; ``` Provider-specific exports (`streamAnthropic`, `streamGoogle`, etc.) are removed from the public API. Any external consumer that imported them directly should use the registry-based `stream()` / `streamSimple()` functions instead — which is how all internal callers already work. #### Why this matters - **Enforces the registry pattern.** The correct way to call a provider is `stream(model, context)`. Direct provider function imports create fragile coupling. - **Enables future refactoring.** Provider internal function signatures can change without breaking the package API. Today, renaming `streamAnthropic` would be a semver-breaking change. - **Reduces API surface.** Consumers see only what they need: `stream`, `streamSimple`, `registerApiProvider`, model utilities, and types. ### What Does NOT Change - **Runtime behavior** — all providers still load eagerly, same as today - **The `Model` type system** — all types, interfaces, and generics stay the same - **The `ApiProvider` interface** — providers still implement `{ api, stream, streamSimple }` - **The `api-registry.ts` registry** — synchronous `Map.get()` dispatch, unchanged - **`stream.ts`** — no changes to the streaming entry point - **`register-builtins.ts`** — still eagerly imports and registers all providers (only `resetApiProviders` remains in barrel export) - **The extension system** — `registerApiProvider()` continues to work for Ollama, Claude Code CLI, etc. - **`models.json` user config** — custom models, overrides, provider settings are unaffected - **Model discovery** — discovery adapters are already lazy and independent - **Model routing** — ADR-004's capability-aware routing is orthogonal ## Consequences ### Positive 1. **Cleaner PRs.** Model catalog changes are scoped to the provider that changed. Reviewers see a 200-line diff in `models/generated/openai.ts` instead of a 13K-line diff in `models.generated.ts`. 2. **Fewer merge conflicts.** Two PRs that update different providers no longer conflict on the same file. 3. **Better navigability.** Developers can jump directly to `models/generated/anthropic.ts` to see Anthropic's model definitions instead of searching through a monolith. 4. **Cleaner package API.** `pi-ai` exports only what consumers need. Provider internals are properly encapsulated. 5. **Future-proofs refactoring.** Provider implementation details can evolve without breaking the public API contract. 6. **Zero runtime risk.** No changes to loading, registration, streaming, or dispatch. The refactor is purely structural. ### Negative 1. **More files.** Instead of 1 generated file + 1 custom file, we'll have ~15-20 generated files. Marginal complexity increase, but each file is focused and small. 2. **Generation script update.** `generate-models.ts` needs to write per-provider files. The script already groups by provider, so this is straightforward but requires testing. 3. **Import audit for barrel export change.** Any code that directly imports `streamAnthropic` (etc.) from `pi-ai` needs to be updated. Based on research, the main consumer is `register-builtins.ts` itself, which imports providers directly (not through the barrel). External usage should be minimal. ## Alternatives Considered ### 1. Full lazy provider loading (original ADR-005 proposal) Make all providers load on-demand via async dynamic imports, generalizing the Bedrock pattern. **Rejected** because: - SDK imports are already lazy — the heavy cost is handled - Provider implementation parsing is ~10-30ms total — not a bottleneck - Adds async complexity to the synchronous stream dispatch hot path - High migration effort and regression risk for unmeasurable performance gain ### 2. Plugin architecture with separate npm packages Move each provider to its own package (`@gsd/provider-anthropic`, etc.). Maximum isolation but dramatically more complex build/release/versioning. Overkill for a monorepo where all providers ship together. ### 3. Do nothing The current architecture works. This is a valid choice. The split is justified by the developer experience friction (13K-line file, merge conflicts, unusable git blame) and the API hygiene issue (leaking provider internals), not by a runtime problem. If the team is not experiencing these friction points, deferring is reasonable. ## Implementation Plan ### Wave 1: Split Model Catalog (Low-Medium Risk) 1. Update `generate-models.ts` to emit per-provider files into `models/generated/` 2. Create `models/index.ts` that imports all per-provider files and builds the same registry 3. Extract `CAPABILITY_PATCHES` into `models/capability-patches.ts` 4. Move `models.custom.ts` to `models/custom.ts` 5. Update imports in `models.ts` (or replace it with the new `models/index.ts`) 6. Verify `npm run build` and `npm run test` pass 7. Delete `models.generated.ts` and `models.custom.ts` ### Wave 2: Clean Up Barrel Export (Low Risk) 1. Remove provider re-exports from `index.ts` 2. Grep for direct provider imports from `"pi-ai"` across the codebase 3. Migrate any found usages to use `stream()` / `streamSimple()` through the registry 4. Verify build and tests ### Wave 3: Validate 1. Run full test suite 2. Verify extension registration (Ollama, Claude Code CLI) still works 3. Verify `resetApiProviders()` test helper still works 4. Spot-check a few providers end-to-end ## References - Current model catalog: `packages/pi-ai/src/models.generated.ts` (13,848 lines) - Current barrel export: `packages/pi-ai/src/index.ts` - Model registry: `packages/pi-ai/src/models.ts` - API provider registry: `packages/pi-ai/src/api-registry.ts` - Eager registration: `packages/pi-ai/src/providers/register-builtins.ts` - Stream dispatch: `packages/pi-ai/src/stream.ts` - Generation script: `packages/pi-ai/scripts/generate-models.ts` - Extension registration: `packages/pi-coding-agent/src/core/model-registry.ts` - ADR-004: `docs/ADR-004-capability-aware-model-routing.md`