Custom Models
Define custom models and providers in `~/.sf/agent/models.json`. This lets you add models not in the default registry — self-hosted endpoints, fine-tuned models, proxies, or new provider releases.
File Location
SF looks for `models.json` at:

- `~/.sf/agent/models.json` (primary)
- `~/.pi/agent/models.json` (fallback)
The file reloads each time you open `/model` — no restart needed.
Basic Structure
```json
{
  "providers": {
    "my-provider": {
      "baseUrl": "https://my-endpoint.example.com/v1",
      "apiKey": "MY_PROVIDER_API_KEY",
      "api": "openai-completions",
      "models": [
        {
          "id": "model-id-here",
          "name": "Friendly Model Name",
          "reasoning": false,
          "input": ["text"],
          "contextWindow": 128000,
          "maxTokens": 16384,
          "cost": { "input": 0.15, "output": 0.60, "cacheRead": 0.015, "cacheWrite": 0.19 }
        }
      ]
    }
  }
}
```
API Key Resolution
The `apiKey` field can be:

- An environment variable name: `"OPENROUTER_API_KEY"` — SF resolves it automatically
- A literal value: `"sk-abc123..."` — used directly
- A dummy value: `"not-needed"` — for local servers that don't require auth
Compatibility Flags
Local and non-standard servers often need compatibility adjustments:
```json
{
  "compat": {
    "supportsDeveloperRole": false,
    "supportsReasoningEffort": false,
    "supportsUsageInStreaming": false,
    "thinkingFormat": "qwen"
  }
}
```
| Flag | Default | Purpose |
|---|---|---|
| `supportsDeveloperRole` | `true` | Set `false` if the server doesn't support the developer message role |
| `supportsReasoningEffort` | `true` | Set `false` if the server doesn't support reasoning effort parameters |
| `supportsUsageInStreaming` | `true` | Set `false` if streaming responses don't include token usage |
| `thinkingFormat` | — | Set `"qwen"` for Qwen thinking mode, `"qwen-chat-template"` for the chat template variant |
Custom Headers
For proxies that need extra headers:
```json
{
  "providers": {
    "litellm-proxy": {
      "baseUrl": "https://litellm.example.com/v1",
      "apiKey": "MY_API_KEY",
      "api": "openai-completions",
      "headers": {
        "x-custom-header": "value"
      },
      "models": [...]
    }
  }
}
```
Model Overrides
Override specific model settings without redefining the entire model:
```json
{
  "providers": {
    "openrouter": {
      "modelOverrides": {
        "anthropic/claude-sonnet-4": {
          "compat": {
            "openRouterRouting": {
              "only": ["amazon-bedrock"]
            }
          }
        }
      }
    }
  }
}
```
Cost Tracking
For accurate cost tracking with custom models, add the `cost` field (prices per million tokens):

```json
"cost": {
  "input": 0.15,
  "output": 0.60,
  "cacheRead": 0.015,
  "cacheWrite": 0.19
}
```
Without this field, cost displays as $0.00, which is the expected default for custom models.
Community Extensions
For providers not built into SF, community extensions add full provider support:
| Extension | Provider | Install |
|---|---|---|
| `pi-dashscope` | Alibaba DashScope (Qwen3, GLM-5, etc.) | `sf install npm:pi-dashscope` |