# Custom Models
Define custom models and providers in `~/.sf/agent/models.json`. This lets you add models not in the default registry — self-hosted endpoints, fine-tuned models, proxies, or new provider releases.
## File Location
SF looks for `models.json` at:
1. `~/.sf/agent/models.json` (primary)
2. `~/.pi/agent/models.json` (fallback)
The file reloads each time you open `/model` — no restart needed.
## Basic Structure
```json
{
  "providers": {
    "my-provider": {
      "baseUrl": "https://my-endpoint.example.com/v1",
      "apiKey": "MY_PROVIDER_API_KEY",
      "api": "openai-completions",
      "models": [
        {
          "id": "model-id-here",
          "name": "Friendly Model Name",
          "reasoning": false,
          "input": ["text"],
          "contextWindow": 128000,
          "maxTokens": 16384,
          "cost": { "input": 0.15, "output": 0.60, "cacheRead": 0.015, "cacheWrite": 0.19 }
        }
      ]
    }
  }
}
```
## API Key Resolution
The `apiKey` field can be:
- **An environment variable name**: `"OPENROUTER_API_KEY"` — SF resolves it automatically
- **A literal value**: `"sk-abc123..."` — used directly
- **A dummy value**: `"not-needed"` — for local servers that don't require auth
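As a sketch, the three forms look like this inside provider entries (the provider names are illustrative, and the other required fields — `baseUrl`, `api`, `models` — are omitted here for brevity):

```json
{
  "providers": {
    "env-var-provider": { "apiKey": "OPENROUTER_API_KEY" },
    "literal-provider": { "apiKey": "sk-abc123..." },
    "local-provider":   { "apiKey": "not-needed" }
  }
}
```

In the first form, SF resolves the value from the `OPENROUTER_API_KEY` environment variable at request time rather than storing the key in the file.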
## Compatibility Flags
Local and non-standard servers often need compatibility adjustments:
```json
{
  "compat": {
    "supportsDeveloperRole": false,
    "supportsReasoningEffort": false,
    "supportsUsageInStreaming": false,
    "thinkingFormat": "qwen"
  }
}
```
| Flag | Default | Purpose |
|------|---------|---------|
| `supportsDeveloperRole` | `true` | Set `false` if the server doesn't support the `developer` message role |
| `supportsReasoningEffort` | `true` | Set `false` if the server doesn't support reasoning effort parameters |
| `supportsUsageInStreaming` | `true` | Set `false` if streaming responses don't include token usage |
| `thinkingFormat` | — | Set `"qwen"` for Qwen thinking mode, `"qwen-chat-template"` for chat template variant |
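Putting the flags together, a full provider entry for a local OpenAI-compatible server might look like the following sketch. The endpoint, model id, and context sizes are placeholders for illustration, not values from the SF registry:

```json
{
  "providers": {
    "local-llama": {
      "baseUrl": "http://localhost:8080/v1",
      "apiKey": "not-needed",
      "api": "openai-completions",
      "compat": {
        "supportsDeveloperRole": false,
        "supportsUsageInStreaming": false
      },
      "models": [
        {
          "id": "qwen3-32b",
          "name": "Qwen3 32B (local)",
          "reasoning": false,
          "input": ["text"],
          "contextWindow": 32768,
          "maxTokens": 4096
        }
      ]
    }
  }
}
```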
## Custom Headers
For proxies that need extra headers:
```json
{
  "providers": {
    "litellm-proxy": {
      "baseUrl": "https://litellm.example.com/v1",
      "apiKey": "MY_API_KEY",
      "api": "openai-completions",
      "headers": {
        "x-custom-header": "value"
      },
      "models": [...]
    }
  }
}
```
## Model Overrides
Override specific model settings without redefining the entire model:
```json
{
  "providers": {
    "openrouter": {
      "modelOverrides": {
        "anthropic/claude-sonnet-4": {
          "compat": {
            "openRouterRouting": {
              "only": ["amazon-bedrock"]
            }
          }
        }
      }
    }
  }
}
```
## Cost Tracking
For accurate cost tracking with custom models, add the `cost` field (per million tokens):
```json
"cost": {
  "input": 0.15,
  "output": 0.60,
  "cacheRead": 0.015,
  "cacheWrite": 0.19
}
```
Without a `cost` field, usage displays as $0.00, which is the expected default for custom models.
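To see how the per-million-token rates translate into a dollar figure, here is a small Python sketch. The `estimate_cost` helper is hypothetical, written only to illustrate the arithmetic; it is not part of SF:

```python
def estimate_cost(usage: dict, cost: dict) -> float:
    """Estimate request cost in USD.

    usage: token counts per category, e.g. {"input": ..., "output": ...}
    cost:  USD per million tokens per category, as in the config above.
    """
    return sum(usage.get(kind, 0) * rate / 1_000_000 for kind, rate in cost.items())

# Rates from the example config above (USD per million tokens).
rates = {"input": 0.15, "output": 0.60, "cacheRead": 0.015, "cacheWrite": 0.19}

# A request with 2M input tokens and 500K output tokens:
# 2,000,000 * 0.15 / 1M = $0.30, plus 500,000 * 0.60 / 1M = $0.30.
print(estimate_cost({"input": 2_000_000, "output": 500_000}, rates))  # 0.6
```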
## Community Extensions
For providers not built into SF, community extensions add full provider support:
| Extension | Provider | Install |
|-----------|----------|---------|
| `pi-dashscope` | Alibaba DashScope (Qwen3, GLM-5, etc.) | `sf install npm:pi-dashscope` |