
Custom Models

Define custom models and providers in ~/.sf/agent/models.json. This lets you add models not in the default registry — self-hosted endpoints, fine-tuned models, proxies, or new provider releases.

File Location

SF looks for models.json at:

  1. ~/.sf/agent/models.json (primary)
  2. ~/.pi/agent/models.json (fallback)

The file reloads each time you open /model — no restart needed.

Basic Structure

{
  "providers": {
    "my-provider": {
      "baseUrl": "https://my-endpoint.example.com/v1",
      "apiKey": "MY_PROVIDER_API_KEY",
      "api": "openai-completions",
      "models": [
        {
          "id": "model-id-here",
          "name": "Friendly Model Name",
          "reasoning": false,
          "input": ["text"],
          "contextWindow": 128000,
          "maxTokens": 16384,
          "cost": { "input": 0.15, "output": 0.60, "cacheRead": 0.015, "cacheWrite": 0.19 }
        }
      ]
    }
  }
}

API Key Resolution

The apiKey field can be:

  • An environment variable name: "OPENROUTER_API_KEY" — SF resolves it automatically
  • A literal value: "sk-abc123..." — used directly
  • A dummy value: "not-needed" — for local servers that don't require auth
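
For example, a self-hosted server that skips authentication entirely can use a dummy key. The endpoint, model id, and limits below are illustrative placeholders, not real defaults:

```json
{
  "providers": {
    "local-llama": {
      "baseUrl": "http://localhost:8080/v1",
      "apiKey": "not-needed",
      "api": "openai-completions",
      "models": [
        {
          "id": "llama-3.1-8b-instruct",
          "name": "Llama 3.1 8B (local)",
          "reasoning": false,
          "input": ["text"],
          "contextWindow": 131072,
          "maxTokens": 8192
        }
      ]
    }
  }
}
```

Since the key is never sent anywhere meaningful, any non-empty string works here.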

Compatibility Flags

Local and non-standard servers often need compatibility adjustments:

{
  "compat": {
    "supportsDeveloperRole": false,
    "supportsReasoningEffort": false,
    "supportsUsageInStreaming": false,
    "thinkingFormat": "qwen"
  }
}
| Flag | Default | Purpose |
| --- | --- | --- |
| supportsDeveloperRole | true | Set false if the server doesn't support the developer message role |
| supportsReasoningEffort | true | Set false if the server doesn't support reasoning effort parameters |
| supportsUsageInStreaming | true | Set false if streaming responses don't include token usage |
| thinkingFormat | (unset) | Set "qwen" for Qwen thinking mode, "qwen-chat-template" for the chat template variant |
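
Putting the flags in context, a provider entry for a local server that lacks these features might look like the sketch below. The provider name and endpoint are placeholders, and the compat block is shown at the provider level as an assumption; the Model Overrides section shows it applied per-model as well:

```json
{
  "providers": {
    "llama-cpp": {
      "baseUrl": "http://127.0.0.1:8080/v1",
      "apiKey": "not-needed",
      "api": "openai-completions",
      "compat": {
        "supportsDeveloperRole": false,
        "supportsUsageInStreaming": false
      },
      "models": [...]
    }
  }
}
```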

Custom Headers

For proxies that need extra headers:

{
  "providers": {
    "litellm-proxy": {
      "baseUrl": "https://litellm.example.com/v1",
      "apiKey": "MY_API_KEY",
      "api": "openai-completions",
      "headers": {
        "x-custom-header": "value"
      },
      "models": [...]
    }
  }
}

Model Overrides

Override specific model settings without redefining the entire model:

{
  "providers": {
    "openrouter": {
      "modelOverrides": {
        "anthropic/claude-sonnet-4": {
          "compat": {
            "openRouterRouting": {
              "only": ["amazon-bedrock"]
            }
          }
        }
      }
    }
  }
}
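
Since overrides merge onto the existing model definition, other model fields can presumably be adjusted the same way. A hedged sketch that caps output tokens and repricing for one registry model, assuming field placement mirrors the model definition shown earlier:

```json
{
  "providers": {
    "openrouter": {
      "modelOverrides": {
        "anthropic/claude-sonnet-4": {
          "maxTokens": 8192,
          "cost": { "input": 3.00, "output": 15.00 }
        }
      }
    }
  }
}
```

The cost figures here are illustrative, not actual pricing.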

Cost Tracking

For accurate cost tracking with custom models, add the cost field (per million tokens):

"cost": {
  "input": 0.15,
  "output": 0.60,
  "cacheRead": 0.015,
  "cacheWrite": 0.19
}

Without this field, cost displays as $0.00 — the default for custom models with no pricing defined.

Community Extensions

For providers not built into SF, community extensions add full provider support:

| Extension | Provider | Install |
| --- | --- | --- |
| pi-dashscope | Alibaba DashScope (Qwen3, GLM-5, etc.) | sf install npm:pi-dashscope |