diff --git a/README.md b/README.md
index 33e29d038..4dced8410 100644
--- a/README.md
+++ b/README.md
@@ -53,6 +53,7 @@ Full documentation is available in the [`docs/`](./docs/) directory:
 - **[Getting Started](./docs/getting-started.md)** — install, first run, basic usage
 - **[Auto Mode](./docs/auto-mode.md)** — autonomous execution deep-dive
 - **[Configuration](./docs/configuration.md)** — all preferences, models, git, and hooks
+- **[Custom Models](./docs/custom-models.md)** — add custom providers (Ollama, vLLM, LM Studio, proxies)
 - **[Token Optimization](./docs/token-optimization.md)** — profiles, context compression, complexity routing
 - **[Cost Management](./docs/cost-management.md)** — budgets, tracking, projections
 - **[Git Strategy](./docs/git-strategy.md)** — worktree isolation, branching, merge behavior
diff --git a/docs/README.md b/docs/README.md
index 080a5eaf7..290201e79 100644
--- a/docs/README.md
+++ b/docs/README.md
@@ -11,6 +11,7 @@ Welcome to the GSD documentation. This covers everything from getting started to
 | [Commands Reference](./commands.md) | All commands, keyboard shortcuts, and CLI flags |
 | [Remote Questions](./remote-questions.md) | Discord and Slack integration for headless auto-mode |
 | [Configuration](./configuration.md) | Preferences, model selection, git settings, and token profiles |
+| [Custom Models](./custom-models.md) | Add custom providers (Ollama, vLLM, LM Studio, proxies) via models.json |
 | [Token Optimization](./token-optimization.md) | Token profiles, context compression, complexity routing, and adaptive learning (v2.17) |
 | [Dynamic Model Routing](./dynamic-model-routing.md) | Complexity-based model selection, cost tables, escalation, and budget pressure (v2.19) |
 | [Captures & Triage](./captures-triage.md) | Fire-and-forget thought capture during auto-mode with automated triage (v2.19) |
diff --git a/docs/configuration.md b/docs/configuration.md
index d5c9a3a7a..d8e1111e6 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -187,13 +187,35 @@ models:
 ### Custom Model Definitions (`models.json`)
 
-Define custom models in `~/.gsd/agent/models.json`. This lets you add models not included in the default registry — useful for self-hosted endpoints, fine-tuned models, or new releases.
+Define custom models and providers in `~/.gsd/agent/models.json`. This lets you add models not included in the default registry — useful for self-hosted endpoints (Ollama, vLLM, LM Studio), fine-tuned models, proxies, or new provider releases.
 
 GSD resolves models.json with fallback logic:
 
 1. `~/.gsd/agent/models.json` — primary (GSD)
 2. `~/.pi/agent/models.json` — fallback (Pi)
 3. If neither exists, creates `~/.gsd/agent/models.json`
 
+**Quick example for local models (Ollama):**
+
+```json
+{
+  "providers": {
+    "ollama": {
+      "baseUrl": "http://localhost:11434/v1",
+      "api": "openai-completions",
+      "apiKey": "ollama",
+      "models": [
+        { "id": "llama3.1:8b" },
+        { "id": "qwen2.5-coder:7b" }
+      ]
+    }
+  }
+}
+```
+
+The file reloads each time you open `/model` — no restart needed.
+
+For full documentation including provider configuration, model overrides, OpenAI compatibility settings, and advanced examples, see the [Custom Models Guide](./custom-models.md).
+
 **With fallbacks:**
 
 ```yaml
diff --git a/docs/custom-models.md b/docs/custom-models.md
new file mode 100644
index 000000000..943d213bf
--- /dev/null
+++ b/docs/custom-models.md
@@ -0,0 +1,335 @@
+# Custom Models
+
+Add custom providers and models (Ollama, vLLM, LM Studio, proxies) via `~/.gsd/agent/models.json`.
+
+## Table of Contents
+
+- [Minimal Example](#minimal-example)
+- [Full Example](#full-example)
+- [Supported APIs](#supported-apis)
+- [Provider Configuration](#provider-configuration)
+- [Model Configuration](#model-configuration)
+- [Overriding Built-in Providers](#overriding-built-in-providers)
+- [Per-model Overrides](#per-model-overrides)
+- [OpenAI Compatibility](#openai-compatibility)
+
+## Minimal Example
+
+For local models (Ollama, LM Studio, vLLM), only `id` is required per model:
+
+```json
+{
+  "providers": {
+    "ollama": {
+      "baseUrl": "http://localhost:11434/v1",
+      "api": "openai-completions",
+      "apiKey": "ollama",
+      "models": [
+        { "id": "llama3.1:8b" },
+        { "id": "qwen2.5-coder:7b" }
+      ]
+    }
+  }
+}
+```
+
+The `apiKey` is required but Ollama ignores it, so any value works.
+
+Some OpenAI-compatible servers do not understand the `developer` role used for reasoning-capable models. For those providers, set `compat.supportsDeveloperRole` to `false` so GSD sends the system prompt as a `system` message instead. If the server also does not support `reasoning_effort`, set `compat.supportsReasoningEffort` to `false` too.
+
+You can set `compat` at the provider level to apply to all models, or at the model level to override a specific model. This commonly applies to Ollama, vLLM, SGLang, and similar OpenAI-compatible servers.
+
+```json
+{
+  "providers": {
+    "ollama": {
+      "baseUrl": "http://localhost:11434/v1",
+      "api": "openai-completions",
+      "apiKey": "ollama",
+      "compat": {
+        "supportsDeveloperRole": false,
+        "supportsReasoningEffort": false
+      },
+      "models": [
+        {
+          "id": "gpt-oss:20b",
+          "reasoning": true
+        }
+      ]
+    }
+  }
+}
+```
+
+## Full Example
+
+Override defaults when you need specific values:
+
+```json
+{
+  "providers": {
+    "ollama": {
+      "baseUrl": "http://localhost:11434/v1",
+      "api": "openai-completions",
+      "apiKey": "ollama",
+      "models": [
+        {
+          "id": "llama3.1:8b",
+          "name": "Llama 3.1 8B (Local)",
+          "reasoning": false,
+          "input": ["text"],
+          "contextWindow": 128000,
+          "maxTokens": 32000,
+          "cost": { "input": 0, "output": 0, "cacheRead": 0, "cacheWrite": 0 }
+        }
+      ]
+    }
+  }
+}
+```
+
+The file reloads each time you open `/model`. You can edit it mid-session; no restart is needed.
+
+## Supported APIs
+
+| API | Description |
+|-----|-------------|
+| `openai-completions` | OpenAI Chat Completions (most compatible) |
+| `openai-responses` | OpenAI Responses API |
+| `anthropic-messages` | Anthropic Messages API |
+| `google-generative-ai` | Google Generative AI |
+
+Set `api` at the provider level (the default for all models) or at the model level (an override per model).
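+For instance, a provider that defaults to `openai-completions` can route a single model through a different API. This is a hypothetical sketch; the provider name, endpoint, and model IDs are illustrative:
+
+```json
+{
+  "providers": {
+    "mixed-proxy": {
+      "baseUrl": "https://proxy.example.com/v1",
+      "apiKey": "PROXY_API_KEY",
+      "api": "openai-completions",
+      "models": [
+        { "id": "some-openai-compatible-model" },
+        { "id": "some-anthropic-model", "api": "anthropic-messages" }
+      ]
+    }
+  }
+}
+```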
+
+## Provider Configuration
+
+| Field | Description |
+|-------|-------------|
+| `baseUrl` | API endpoint URL |
+| `api` | API type (see above) |
+| `apiKey` | API key (see value resolution below) |
+| `headers` | Custom headers (see value resolution below) |
+| `authHeader` | Set `true` to add an `Authorization: Bearer <apiKey>` header automatically |
+| `models` | Array of model configurations |
+| `modelOverrides` | Per-model overrides for built-in models on this provider |
+
+### Value Resolution
+
+The `apiKey` and `headers` fields support three formats:
+
+- **Shell command:** `"!command"` executes the command and uses its stdout
+  ```json
+  "apiKey": "!security find-generic-password -ws 'anthropic'"
+  "apiKey": "!op read 'op://vault/item/credential'"
+  ```
+- **Environment variable:** Uses the value of the named variable
+  ```json
+  "apiKey": "MY_API_KEY"
+  ```
+- **Literal value:** Used directly
+  ```json
+  "apiKey": "sk-..."
+  ```
+
+### Custom Headers
+
+```json
+{
+  "providers": {
+    "custom-proxy": {
+      "baseUrl": "https://proxy.example.com/v1",
+      "apiKey": "MY_API_KEY",
+      "api": "anthropic-messages",
+      "headers": {
+        "x-portkey-api-key": "PORTKEY_API_KEY",
+        "x-secret": "!op read 'op://vault/item/secret'"
+      },
+      "models": [...]
+    }
+  }
+}
+```
+
+## Model Configuration
+
+| Field | Required | Default | Description |
+|-------|----------|---------|-------------|
+| `id` | Yes | — | Model identifier (passed to the API) |
+| `name` | No | `id` | Human-readable model label. Used for matching (`--model` patterns) and shown in model details/status text. |
+| `api` | No | provider's `api` | Override the provider's API for this model |
+| `reasoning` | No | `false` | Supports extended thinking |
+| `input` | No | `["text"]` | Input types: `["text"]` or `["text", "image"]` |
+| `contextWindow` | No | `128000` | Context window size in tokens |
+| `maxTokens` | No | `16384` | Maximum output tokens |
+| `cost` | No | all zeros | `{"input": 0, "output": 0, "cacheRead": 0, "cacheWrite": 0}` (per million tokens) |
+| `compat` | No | provider `compat` | OpenAI compatibility overrides. Merged with provider-level `compat` when both are set. |
+
+Current behavior:
+
+- `/model` and `--list-models` list entries by model `id`.
+- The configured `name` is used for model matching and detail/status text.
+
+## Overriding Built-in Providers
+
+Route a built-in provider through a proxy without redefining its models:
+
+```json
+{
+  "providers": {
+    "anthropic": {
+      "baseUrl": "https://my-proxy.example.com/v1"
+    }
+  }
+}
+```
+
+All built-in Anthropic models remain available. Existing OAuth or API key auth continues to work.
+
+To merge custom models into a built-in provider, include the `models` array:
+
+```json
+{
+  "providers": {
+    "anthropic": {
+      "baseUrl": "https://my-proxy.example.com/v1",
+      "apiKey": "ANTHROPIC_API_KEY",
+      "api": "anthropic-messages",
+      "models": [...]
+    }
+  }
+}
+```
+
+Merge semantics:
+
+- Built-in models are kept.
+- Custom models are upserted by `id` within the provider.
+- If a custom model `id` matches a built-in model `id`, the custom model replaces that built-in model.
+- If a custom model `id` is new, it is added alongside built-in models.
+
+## Per-model Overrides
+
+Use `modelOverrides` to customize specific built-in models without replacing the provider's full model list.
+
+```json
+{
+  "providers": {
+    "openrouter": {
+      "modelOverrides": {
+        "anthropic/claude-sonnet-4": {
+          "name": "Claude Sonnet 4 (Bedrock Route)",
+          "compat": {
+            "openRouterRouting": {
+              "only": ["amazon-bedrock"]
+            }
+          }
+        }
+      }
+    }
+  }
+}
+```
+
+`modelOverrides` supports these fields per model: `name`, `reasoning`, `input`, `cost` (partial), `contextWindow`, `maxTokens`, `headers`, `compat`.
+
+Behavior notes:
+
+- `modelOverrides` are applied to built-in provider models.
+- Unknown model IDs are ignored.
+- You can combine provider-level `baseUrl`/`headers` with `modelOverrides`.
+- If `models` is also defined for a provider, custom models are merged after built-in overrides. A custom model with the same `id` replaces the overridden built-in model entry.
+
+## OpenAI Compatibility
+
+For providers with partial OpenAI compatibility, use the `compat` field.
+
+- Provider-level `compat` applies defaults to all models under that provider.
+- Model-level `compat` overrides provider-level values for that model.
+
+```json
+{
+  "providers": {
+    "local-llm": {
+      "baseUrl": "http://localhost:8080/v1",
+      "api": "openai-completions",
+      "compat": {
+        "supportsUsageInStreaming": false,
+        "maxTokensField": "max_tokens"
+      },
+      "models": [...]
+    }
+  }
+}
+```
+
+| Field | Description |
+|-------|-------------|
+| `supportsStore` | Provider supports the `store` field |
+| `supportsDeveloperRole` | Use the `developer` role instead of `system` |
+| `supportsReasoningEffort` | Support for the `reasoning_effort` parameter |
+| `reasoningEffortMap` | Map GSD thinking levels to provider-specific `reasoning_effort` values |
+| `supportsUsageInStreaming` | Supports `stream_options: { include_usage: true }` (default: `true`) |
+| `maxTokensField` | Use `max_completion_tokens` or `max_tokens` |
+| `requiresToolResultName` | Include `name` on tool result messages |
+| `requiresAssistantAfterToolResult` | Insert an assistant message before a user message after tool results |
+| `requiresThinkingAsText` | Convert thinking blocks to plain text |
+| `thinkingFormat` | Use `reasoning_effort`, `zai`, `qwen`, or `qwen-chat-template` thinking parameters |
+| `supportsStrictMode` | Include the `strict` field in tool definitions |
+| `openRouterRouting` | OpenRouter routing config passed to OpenRouter for model/provider selection |
+| `vercelGatewayRouting` | Vercel AI Gateway routing config for provider selection (`only`, `order`) |
+
+`qwen` uses top-level `enable_thinking`. Use `qwen-chat-template` for local Qwen-compatible servers that require `chat_template_kwargs.enable_thinking`.
+
+Example:
+
+```json
+{
+  "providers": {
+    "openrouter": {
+      "baseUrl": "https://openrouter.ai/api/v1",
+      "apiKey": "OPENROUTER_API_KEY",
+      "api": "openai-completions",
+      "models": [
+        {
+          "id": "openrouter/anthropic/claude-3.5-sonnet",
+          "name": "OpenRouter Claude 3.5 Sonnet",
+          "compat": {
+            "openRouterRouting": {
+              "order": ["anthropic"],
+              "fallbacks": ["openai"]
+            }
+          }
+        }
+      ]
+    }
+  }
+}
+```
+
+Vercel AI Gateway example:
+
+```json
+{
+  "providers": {
+    "vercel-ai-gateway": {
+      "baseUrl": "https://ai-gateway.vercel.sh/v1",
+      "apiKey": "AI_GATEWAY_API_KEY",
+      "api": "openai-completions",
+      "models": [
+        {
+          "id": "moonshotai/kimi-k2.5",
+          "name": "Kimi K2.5 (Fireworks via Vercel)",
+          "reasoning": true,
+          "input": ["text", "image"],
+          "cost": { "input": 0.6, "output": 3, "cacheRead": 0, "cacheWrite": 0 },
+          "contextWindow": 262144,
+          "maxTokens": 262144,
+          "compat": {
+            "vercelGatewayRouting": {
+              "only": ["fireworks", "novita"],
+              "order": ["fireworks", "novita"]
+            }
+          }
+        }
+      ]
+    }
+  }
+}
+```