285 lines
16 KiB
Markdown
285 lines
16 KiB
Markdown
# ADR-007: Model Catalog Split and Provider API Encapsulation
|
|
|
|
**Status:** Proposed
|
|
**Date:** 2026-04-03
|
|
**Deciders:** Jeremy McSpadden
|
|
**Related:** ADR-004 (capability-aware model routing), [ADR-005](https://github.com/singularity-forge/sf-run/issues/2790), [ADR-006](https://github.com/singularity-forge/sf-run/issues/2995), `packages/pi-ai/src/providers/`, `packages/pi-ai/src/models.ts`
|
|
|
|
## Context
|
|
|
|
The model/provider system in `pi-ai` has two structural problems worth fixing — but the system is **not fundamentally broken**. The heavy lifting (lazy SDK imports, registry-based dispatch, extension-based registration) is already well-designed. This ADR targets the two areas where the current design creates real friction without proposing unnecessary runtime changes.
|
|
|
|
### Current Architecture
|
|
|
|
```
|
|
stream.ts
|
|
└─ import "./providers/register-builtins.js" ← side-effect import at load time
|
|
├─ import anthropic.ts (6.8 KB)
|
|
├─ import anthropic-vertex.ts (3.9 KB)
|
|
├─ import openai-completions.ts (26 KB)
|
|
├─ import openai-responses.ts (6.4 KB)
|
|
├─ import openai-codex-responses.ts (29 KB)
|
|
├─ import azure-openai-responses.ts (7.8 KB)
|
|
├─ import google.ts (13.6 KB)
|
|
├─ import google-vertex.ts (14.5 KB)
|
|
├─ import google-gemini-cli.ts (30 KB)
|
|
├─ import mistral.ts (18.9 KB)
|
|
└─ amazon-bedrock.ts (24 KB) ← only lazy-loaded provider
|
|
|
|
models.ts
|
|
└─ import models.generated.ts ← 13,848 lines, ALL providers, loaded at init
|
|
└─ import models.custom.ts ← 197 lines, additional providers
|
|
```
|
|
|
|
### What Already Works Well
|
|
|
|
1. **SDK lazy loading.** Every provider file uses `async function getXxxClass()` with a cached dynamic `import()`. The heavy npm packages (`@anthropic-ai/sdk`, `openai`, `@google/genai`, `@aws-sdk/*`, `@mistralai/*`) are only loaded on first API call. This is where the real startup cost would be — and it's already handled.
|
|
|
|
2. **Registry-based dispatch.** `api-registry.ts` cleanly maps API types to stream functions. Callers use `stream(model, context)` and the registry routes to the right provider. This pattern is sound.
|
|
|
|
3. **Extension registration.** Ollama and Claude Code CLI register via `registerApiProvider()` at runtime. This extensibility point works correctly.
|
|
|
|
4. **Provider implementation code loading (~200KB total).** While all providers load eagerly, V8 parses local `.js` files in single-digit milliseconds each. The total parse cost for all provider files is ~10-30ms — not a user-visible bottleneck on a CLI that's about to make a multi-second API call anyway.
|
|
|
|
### What's Actually Worth Fixing
|
|
|
|
#### Problem 1: Monolithic model catalog — developer experience, not runtime
|
|
|
|
`models.generated.ts` is **13,848 lines in a single file**. This creates real friction:
|
|
|
|
- **PR reviews are painful.** When the generation script runs, the diff is a wall of changes across unrelated providers. Reviewers can't tell what actually changed for a specific provider.
|
|
- **Navigation is slow.** Finding a specific model requires scrolling or searching through thousands of lines of static object literals.
|
|
- **Merge conflicts are frequent.** Any two PRs that touch model generation will conflict on the same monolithic file.
|
|
- **Git blame is useless.** Every line was "last changed" by the generation script, obscuring the history of individual provider additions.
|
|
|
|
The runtime cost of loading all model definitions is negligible — a Map of ~200 model objects is maybe 50-100KB of heap. The problem is purely about code organization and developer workflow.
|
|
|
|
#### Problem 2: Barrel export leaks provider internals — API design
|
|
|
|
`packages/pi-ai/src/index.ts` re-exports every provider module's internals:
|
|
|
|
```typescript
|
|
export * from "./providers/anthropic.js";
|
|
export * from "./providers/google.js";
|
|
export * from "./providers/google-gemini-cli.js";
|
|
export * from "./providers/google-vertex.js";
|
|
export * from "./providers/mistral.js";
|
|
export * from "./providers/openai-completions.js";
|
|
export * from "./providers/openai-responses.js";
|
|
// ... etc
|
|
```
|
|
|
|
This is a public API problem:
|
|
|
|
- **Consumers can bypass the registry.** Any code that `import { streamAnthropic } from "pi-ai"` has a direct dependency on an implementation detail that should be internal.
|
|
- **Refactoring is blocked.** Renaming a function inside a provider file is a breaking change because it's re-exported from the package root.
|
|
- **API surface is unnecessarily large.** The public API should be `stream()`, `streamSimple()`, `registerApiProvider()`, model utilities, and types. Provider-specific stream functions are implementation details.
|
|
|
|
### What Is NOT Worth Changing
|
|
|
|
**Lazy provider loading (converting `register-builtins.ts` to async on-demand loading).** This was considered and rejected because:
|
|
|
|
1. **The SDKs are already lazy.** The heavy cost is handled. Provider implementation code (~200KB of local `.js`) parses in ~10-30ms total.
|
|
2. **Async resolution adds complexity to the hot path.** `stream.ts` currently does a synchronous `Map.get()`. Making `resolveApiProvider` async adds a microtask hop to every API call — not just the first. Small but measurable, and for no user-visible gain.
|
|
3. **High blast radius, low payoff.** Touching `stream.ts`, `api-registry.ts`, and the registration lifecycle simultaneously risks regressions in the core streaming path for an optimization that wouldn't show up in profiling.
|
|
4. **Bedrock's lazy loading is a special case, not a template.** It exists because `@aws-sdk/client-bedrock-runtime` is uniquely massive. Generalizing this pattern to providers where the SDK is already lazy-imported doesn't compound the benefit.
|
|
|
|
## Decision
|
|
|
|
**Make two targeted improvements to code organization and API hygiene. Do not change runtime loading behavior.**
|
|
|
|
### Change 1: Split `models.generated.ts` into per-provider files
|
|
|
|
Replace the monolithic 13,848-line generated file with per-provider files:
|
|
|
|
```
|
|
packages/pi-ai/src/models/
|
|
├── index.ts ← re-exports combined registry, same public API
|
|
├── generated/
|
|
│ ├── anthropic.ts ← Anthropic model definitions
|
|
│ ├── openai.ts ← OpenAI model definitions
|
|
│ ├── google.ts ← Google model definitions
|
|
│ ├── mistral.ts ← Mistral model definitions
|
|
│ ├── amazon-bedrock.ts ← Bedrock model definitions
|
|
│ ├── groq.ts ← Groq model definitions
|
|
│ ├── xai.ts ← xAI model definitions
|
|
│ ├── cerebras.ts ← Cerebras model definitions
|
|
│ ├── openrouter.ts ← OpenRouter model definitions
|
|
│ └── ... ← one file per provider in the catalog
|
|
├── custom.ts ← replaces models.custom.ts (unchanged content)
|
|
└── capability-patches.ts ← CAPABILITY_PATCHES extracted for clarity
|
|
```
|
|
|
|
**`models/index.ts` keeps the exact same synchronous public API:**
|
|
|
|
```typescript
|
|
// models/index.ts
|
|
// GSD-2 — Model registry (split by provider for maintainability)
|
|
|
|
import { ANTHROPIC_MODELS } from "./generated/anthropic.js";
|
|
import { OPENAI_MODELS } from "./generated/openai.js";
|
|
import { GOOGLE_MODELS } from "./generated/google.js";
|
|
// ... one import per provider
|
|
|
|
import { CUSTOM_MODELS } from "./custom.js";
|
|
import { CAPABILITY_PATCHES, applyCapabilityPatches } from "./capability-patches.js";
|
|
import type { Api, KnownProvider, Model, Usage } from "../types.js";
|
|
|
|
// Combine all generated models into single registry — same as today
|
|
const MODELS = {
|
|
...ANTHROPIC_MODELS,
|
|
...OPENAI_MODELS,
|
|
...GOOGLE_MODELS,
|
|
// ...
|
|
};
|
|
|
|
// Rest of the file is identical to current models.ts:
|
|
// modelRegistry Map construction, capability patch application,
|
|
// getModel(), getProviders(), getModels(), calculateCost(),
|
|
// supportsXhigh(), modelsAreEqual()
|
|
```
|
|
|
|
**Key constraint: loading stays synchronous and eager.** All model files are statically imported. The Map is built at module init exactly as today. No async, no lazy loading, no runtime behavior change. This is purely a file organization change.
|
|
|
|
**Update `generate-models.ts`** to emit one file per provider instead of a single `models.generated.ts`. The script already groups models by provider internally — it just needs to write separate files instead of one.
|
|
|
|
#### Why this matters
|
|
|
|
| Before | After |
|
|
|--------|-------|
|
|
| PR diffs show 13K-line file changes | PR diffs scoped to the provider that changed |
|
|
| Merge conflicts on any concurrent model update | Conflicts only when same provider is touched |
|
|
| `git blame` shows "regenerate models" for every line | `git blame` shows per-provider history |
|
|
| Finding a model = search through 13K lines | Finding a model = open the provider file |
|
|
| One reviewer must understand all providers | Reviewers only need context for affected provider |
|
|
|
|
### Change 2: Stop barrel-exporting provider internals
|
|
|
|
**Update `packages/pi-ai/src/index.ts`:**
|
|
|
|
```typescript
|
|
// Before (current — 17 re-exports including all providers):
|
|
export * from "./providers/anthropic.js";
|
|
export * from "./providers/azure-openai-responses.js";
|
|
export * from "./providers/google.js";
|
|
export * from "./providers/google-gemini-cli.js";
|
|
export * from "./providers/google-vertex.js";
|
|
export * from "./providers/mistral.js";
|
|
export * from "./providers/openai-completions.js";
|
|
export * from "./providers/openai-responses.js";
|
|
export * from "./providers/register-builtins.js";
|
|
// ...
|
|
|
|
// After (clean public API):
|
|
export * from "./api-registry.js";
|
|
export * from "./env-api-keys.js";
|
|
export * from "./models/index.js";
|
|
export * from "./providers/register-builtins.js"; // resetApiProviders() is public
|
|
export * from "./stream.js";
|
|
export * from "./types.js";
|
|
export * from "./utils/event-stream.js";
|
|
export * from "./utils/json-parse.js";
|
|
export type { OAuthAuthInfo, OAuthCredentials, /* ... */ } from "./utils/oauth/types.js";
|
|
export * from "./utils/overflow.js";
|
|
export * from "./utils/typebox-helpers.js";
|
|
export * from "./utils/repair-tool-json.js";
|
|
export * from "./utils/validation.js";
|
|
```
|
|
|
|
Provider-specific exports (`streamAnthropic`, `streamGoogle`, etc.) are removed from the public API. Any external consumer that imported them directly should use the registry-based `stream()` / `streamSimple()` functions instead — which is how all internal callers already work.
|
|
|
|
#### Why this matters
|
|
|
|
- **Enforces the registry pattern.** The correct way to call a provider is `stream(model, context)`. Direct provider function imports create fragile coupling.
|
|
- **Enables future refactoring.** Provider internal function signatures can change without breaking the package API. Today, renaming `streamAnthropic` would be a semver-breaking change.
|
|
- **Reduces API surface.** Consumers see only what they need: `stream`, `streamSimple`, `registerApiProvider`, model utilities, and types.
|
|
|
|
### What Does NOT Change
|
|
|
|
- **Runtime behavior** — all providers still load eagerly, same as today
|
|
- **The `Model<TApi>` type system** — all types, interfaces, and generics stay the same
|
|
- **The `ApiProvider` interface** — providers still implement `{ api, stream, streamSimple }`
|
|
- **The `api-registry.ts` registry** — synchronous `Map.get()` dispatch, unchanged
|
|
- **`stream.ts`** — no changes to the streaming entry point
|
|
- **`register-builtins.ts`** — still eagerly imports and registers all providers (only `resetApiProviders` remains in barrel export)
|
|
- **The extension system** — `registerApiProvider()` continues to work for Ollama, Claude Code CLI, etc.
|
|
- **`models.json` user config** — custom models, overrides, provider settings are unaffected
|
|
- **Model discovery** — discovery adapters are already lazy and independent
|
|
- **Model routing** — ADR-004's capability-aware routing is orthogonal
|
|
|
|
## Consequences
|
|
|
|
### Positive
|
|
|
|
1. **Cleaner PRs.** Model catalog changes are scoped to the provider that changed. Reviewers see a 200-line diff in `models/generated/openai.ts` instead of a 13K-line diff in `models.generated.ts`.
|
|
|
|
2. **Fewer merge conflicts.** Two PRs that update different providers no longer conflict on the same file.
|
|
|
|
3. **Better navigability.** Developers can jump directly to `models/generated/anthropic.ts` to see Anthropic's model definitions instead of searching through a monolith.
|
|
|
|
4. **Cleaner package API.** `pi-ai` exports only what consumers need. Provider internals are properly encapsulated.
|
|
|
|
5. **Future-proofs refactoring.** Provider implementation details can evolve without breaking the public API contract.
|
|
|
|
6. **Zero runtime risk.** No changes to loading, registration, streaming, or dispatch. The refactor is purely structural.
|
|
|
|
### Negative
|
|
|
|
1. **More files.** Instead of 1 generated file + 1 custom file, we'll have ~15-20 generated files. Marginal complexity increase, but each file is focused and small.
|
|
|
|
2. **Generation script update.** `generate-models.ts` needs to write per-provider files. The script already groups by provider, so this is straightforward but requires testing.
|
|
|
|
3. **Import audit for barrel export change.** Any code that directly imports `streamAnthropic` (etc.) from `pi-ai` needs to be updated. Based on research, the main consumer is `register-builtins.ts` itself, which imports providers directly (not through the barrel). External usage should be minimal.
|
|
|
|
## Alternatives Considered
|
|
|
|
### 1. Full lazy provider loading (original ADR-005 proposal)
|
|
|
|
Make all providers load on-demand via async dynamic imports, generalizing the Bedrock pattern. **Rejected** because:
|
|
- SDK imports are already lazy — the heavy cost is handled
|
|
- Provider implementation parsing is ~10-30ms total — not a bottleneck
|
|
- Adds async complexity to the synchronous stream dispatch hot path
|
|
- High migration effort and regression risk for unmeasurable performance gain
|
|
|
|
### 2. Plugin architecture with separate npm packages
|
|
|
|
Move each provider to its own package (`@gsd/provider-anthropic`, etc.). Maximum isolation but dramatically more complex build/release/versioning. Overkill for a monorepo where all providers ship together.
|
|
|
|
### 3. Do nothing
|
|
|
|
The current architecture works. This is a valid choice. The split is justified by the developer experience friction (13K-line file, merge conflicts, unusable git blame) and the API hygiene issue (leaking provider internals), not by a runtime problem. If the team is not experiencing these friction points, deferring is reasonable.
|
|
|
|
## Implementation Plan
|
|
|
|
### Wave 1: Split Model Catalog (Low-Medium Risk)
|
|
1. Update `generate-models.ts` to emit per-provider files into `models/generated/`
|
|
2. Create `models/index.ts` that imports all per-provider files and builds the same registry
|
|
3. Extract `CAPABILITY_PATCHES` into `models/capability-patches.ts`
|
|
4. Move `models.custom.ts` to `models/custom.ts`
|
|
5. Update imports in `models.ts` (or replace it with the new `models/index.ts`)
|
|
6. Verify `npm run build` and `npm run test` pass
|
|
7. Delete `models.generated.ts` and `models.custom.ts`
|
|
|
|
### Wave 2: Clean Up Barrel Export (Low Risk)
|
|
1. Remove provider re-exports from `index.ts`
|
|
2. Grep for direct provider imports from `"pi-ai"` across the codebase
|
|
3. Migrate any found usages to use `stream()` / `streamSimple()` through the registry
|
|
4. Verify build and tests
|
|
|
|
### Wave 3: Validate
|
|
1. Run full test suite
|
|
2. Verify extension registration (Ollama, Claude Code CLI) still works
|
|
3. Verify `resetApiProviders()` test helper still works
|
|
4. Spot-check a few providers end-to-end
|
|
|
|
## References
|
|
|
|
- Current model catalog: `packages/pi-ai/src/models.generated.ts` (13,848 lines)
|
|
- Current barrel export: `packages/pi-ai/src/index.ts`
|
|
- Model registry: `packages/pi-ai/src/models.ts`
|
|
- API provider registry: `packages/pi-ai/src/api-registry.ts`
|
|
- Eager registration: `packages/pi-ai/src/providers/register-builtins.ts`
|
|
- Stream dispatch: `packages/pi-ai/src/stream.ts`
|
|
- Generation script: `packages/pi-ai/scripts/generate-models.ts`
|
|
- Extension registration: `packages/pi-coding-agent/src/core/model-registry.ts`
|
|
- ADR-004: `docs/ADR-004-capability-aware-model-routing.md`
|