Some OpenAI-compatible servers do not understand the `developer` role used for reasoning-capable models. For those providers, set `compat.supportsDeveloperRole` to `false` so SF sends the system prompt as a `system` message instead. If the server also does not support `reasoning_effort`, set `compat.supportsReasoningEffort` to `false` too.
You can set `compat` at the provider level to apply to all models, or at the model level to override a specific model. This commonly applies to Ollama, vLLM, SGLang, and similar OpenAI-compatible servers.
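For example, a local Ollama endpoint might be configured like this (a minimal sketch; the provider name, port, and model list are placeholders for your own setup):

```json
{
  "providers": {
    "ollama": {
      "baseUrl": "http://localhost:11434/v1",
      "api": "openai-completions",
      "compat": {
        "supportsDeveloperRole": false,
        "supportsReasoningEffort": false
      },
      "models": [...]
    }
  }
}
```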
> **Note:** This setting is global-only. Project-level settings.json (`<project>/.sf/settings.json`) cannot override the command allowlist — this prevents a cloned repo from escalating command execution privileges.
| Field | Required | Default | Description |
|-------|----------|---------|-------------|
| `input` | No | `["text"]` | Input types: `["text"]` or `["text", "image"]` |
| `contextWindow` | No | `128000` | Context window size in tokens |
| `maxTokens` | No | `16384` | Maximum output tokens |
| `cost` | No | all zeros | `{"input": 0, "output": 0, "cacheRead": 0, "cacheWrite": 0}` (per million tokens) |
| `compat` | No | provider `compat` | OpenAI compatibility overrides. Merged with provider-level `compat` when both are set. |
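Putting these fields together, a custom model entry might look like the following (a sketch; the `id`, `name`, and prices are illustrative, not a real catalog entry):

```json
{
  "id": "my-model",
  "name": "My Model",
  "reasoning": false,
  "input": ["text", "image"],
  "contextWindow": 128000,
  "maxTokens": 16384,
  "cost": {"input": 0.5, "output": 1.5, "cacheRead": 0.05, "cacheWrite": 0.6}
}
```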
Current behavior:
- `/model` and `--list-models` list entries by model `id`.
- The configured `name` is used for model matching and detail/status text.
## Overriding Built-in Providers
Route a built-in provider through a proxy without redefining models:
```json
{
"providers": {
"anthropic": {
"baseUrl": "https://my-proxy.example.com/v1"
}
}
}
```
All built-in Anthropic models remain available. Existing OAuth or API key auth continues to work.
To merge custom models into a built-in provider, include the `models` array:
```json
{
"providers": {
"anthropic": {
"baseUrl": "https://my-proxy.example.com/v1",
"apiKey": "ANTHROPIC_API_KEY",
"api": "anthropic-messages",
"models": [...]
}
}
}
```
Merge semantics:
- Built-in models are kept.
- Custom models are upserted by `id` within the provider.
- If a custom model `id` matches a built-in model `id`, the custom model replaces that built-in model.
- If a custom model `id` is new, it is added alongside built-in models.
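For example, to replace a single built-in model while keeping the rest (a sketch; the `id` shown is hypothetical and must match a real built-in model id to trigger replacement):

```json
{
  "providers": {
    "anthropic": {
      "models": [
        {
          "id": "claude-sonnet-4",
          "name": "Claude Sonnet 4 (custom limits)",
          "maxTokens": 32768
        }
      ]
    }
  }
}
```

Any built-in model whose `id` is not listed here is left untouched.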
## Per-model Overrides
Use `modelOverrides` to customize specific built-in models without replacing the provider's full model list.
```json
{
"providers": {
"openrouter": {
"modelOverrides": {
"anthropic/claude-sonnet-4": {
"name": "Claude Sonnet 4 (Bedrock Route)",
"compat": {
"openRouterRouting": {
"only": ["amazon-bedrock"]
}
}
}
}
}
}
}
```
`modelOverrides` supports these fields per model: `name`, `reasoning`, `input`, `cost` (partial), `contextWindow`, `maxTokens`, `headers`, `compat`.
Behavior notes:
- `modelOverrides` are applied to built-in provider models.
- Unknown model IDs are ignored.
- You can combine provider-level `baseUrl`/`headers` with `modelOverrides`, as sketched after this list.
- If `models` is also defined for a provider, custom models are merged after built-in overrides. A custom model with the same `id` replaces the overridden built-in model entry.
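A combined sketch (the proxy URL and header are assumptions; the model id mirrors the earlier example): route the provider through a proxy and patch one built-in model's display name and partial cost in place:

```json
{
  "providers": {
    "openrouter": {
      "baseUrl": "https://my-proxy.example.com/v1",
      "headers": {"X-Team": "research"},
      "modelOverrides": {
        "anthropic/claude-sonnet-4": {
          "name": "Claude Sonnet 4 (Proxied)",
          "cost": {"input": 3, "output": 15}
        }
      }
    }
  }
}
```

Because `cost` accepts a partial object, the omitted `cacheRead`/`cacheWrite` values should keep their built-in defaults.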
## OpenAI Compatibility
For providers with partial OpenAI compatibility, use the `compat` field.
- Provider-level `compat` applies defaults to all models under that provider.
- Model-level `compat` overrides provider-level values for that model.
```json
{
"providers": {
"local-llm": {
"baseUrl": "http://localhost:8080/v1",
"api": "openai-completions",
"compat": {
"supportsUsageInStreaming": false,
"maxTokensField": "max_tokens"
},
"models": [...]
}
}
}
```
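To override a provider-level default for a single model, set `compat` on that model as well (a sketch; the model id is a placeholder):

```json
{
  "providers": {
    "local-llm": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "compat": {
        "maxTokensField": "max_tokens"
      },
      "models": [
        {
          "id": "my-model",
          "compat": {
            "maxTokensField": "max_completion_tokens"
          }
        }
      ]
    }
  }
}
```

Here every model defaults to `max_tokens`, while `my-model` alone sends `max_completion_tokens`.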
| Field | Description |
|-------|-------------|
| `supportsStore` | Provider supports `store` field |
| `supportsDeveloperRole` | Use `developer` vs `system` role |
| `supportsReasoningEffort` | Support for `reasoning_effort` parameter |
| `maxTokensField` | Use `max_completion_tokens` or `max_tokens` |
| `requiresToolResultName` | Include `name` on tool result messages |
| `requiresAssistantAfterToolResult` | Insert an assistant message before a user message after tool results |
| `requiresThinkingAsText` | Convert thinking blocks to plain text |
| `thinkingFormat` | Thinking parameter style: `reasoning_effort`, `zai`, `qwen`, or `qwen-chat-template` |
| `supportsStrictMode` | Include the `strict` field in tool definitions |
| `openRouterRouting` | Routing config passed to OpenRouter for model/provider selection |
| `vercelGatewayRouting` | Vercel AI Gateway routing config for provider selection (`only`, `order`) |
`qwen` uses top-level `enable_thinking`. Use `qwen-chat-template` for local Qwen-compatible servers that require `chat_template_kwargs.enable_thinking`.
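A sketch for such a server (the provider name, port, and model list are placeholders for your own deployment):

```json
{
  "providers": {
    "local-qwen": {
      "baseUrl": "http://localhost:8000/v1",
      "api": "openai-completions",
      "compat": {
        "thinkingFormat": "qwen-chat-template"
      },
      "models": [...]
    }
  }
}
```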