chore(merge): resolve conflict with upstream/main for PR #3204

Keep catch-all STREAM_RE from PR; upstream's 5-variant whack-a-mole is
superseded by the /in JSON at position \d+/ pattern. Also drop the now-
stale comment about checking stream before server/connection (no longer
needed since catch-all avoids those false-positive overlaps).
This commit is contained in:
Jeremy 2026-04-01 14:05:28 -05:00
commit f7cb3ec07b
316 changed files with 24307 additions and 1333 deletions

View file

@ -0,0 +1,138 @@
# Extension Loading: Dependency Sort + Unified Enable/Disable
## Context
GSD-2 has a well-structured extension system with three discovery paths (bundled, global/community, project-local) that are **already wired up** through pi's `DefaultPackageManager.addAutoDiscoveredResources()`. However, two critical gaps remain:
1. `sortExtensionPaths()` (topological dependency sort) is implemented but **never called**; `dependencies.extensions` in manifests is decorative
2. The GSD extension registry (enable/disable) only applies to **bundled** extensions — community extensions bypass it entirely
### Architecture (Current Flow)
```
GSD loader.ts
→ discoverExtensionEntryPaths(bundledExtDir)
→ filter by GSD registry (isExtensionEnabled)
→ set GSD_BUNDLED_EXTENSION_PATHS env var
DefaultResourceLoader.reload()
→ packageManager.resolve()
→ addAutoDiscoveredResources()
→ project: cwd/.gsd/extensions/ (CONFIG_DIR_NAME = ".gsd")
→ global: ~/.gsd/agent/extensions/ (includes synced bundled)
→ loadExtensions(mergedPaths) ← NO sort, NO registry check on community
```
### Key Files
| File | Role |
|------|------|
| `src/loader.ts` (lines 146-161) | GSD startup — bundled discovery + registry filter |
| `src/extension-sort.ts` | Topological sort (Kahn's BFS) — EXISTS but NEVER CALLED |
| `src/extension-registry.ts` | Registry I/O, enable/disable, tier checks |
| `src/resource-loader.ts` (lines 589-607) | `buildResourceLoader()` — constructs DefaultResourceLoader |
| `packages/pi-coding-agent/src/core/resource-loader.ts` (lines 311-395) | `reload()` — merges paths, calls `loadExtensions()` |
| `packages/pi-coding-agent/src/core/package-manager.ts` (lines 1585-1700) | `addAutoDiscoveredResources()` — auto-discovers from .gsd/ dirs |
| `packages/pi-coding-agent/src/core/extensions/loader.ts` (lines 945-1002) | `discoverAndLoadExtensions()` — DEAD CODE, never invoked |
---
## Plan
### Task 1: Wire topological sort into extension loading
**What:** Call `sortExtensionPaths()` on the merged extension paths before passing them to `loadExtensions()`.
**Where:** `packages/pi-coding-agent/src/core/resource-loader.ts` ~lines 381-385
**Before:**
```typescript
const extensionsResult = await loadExtensions(extensionPaths, this.cwd, this.eventBus);
```
**After:**
```typescript
// import added at the top of resource-loader.ts:
import { sortExtensionPaths } from '../../../src/extension-sort.js';

// at the former loadExtensions() call site:
const { sortedPaths, warnings } = sortExtensionPaths(extensionPaths);
for (const w of warnings) {
// surface each warning as a diagnostic, not a hard error (see Task 3)
}
const extensionsResult = await loadExtensions(sortedPaths, this.cwd, this.eventBus);
```
**Consideration:** `sortExtensionPaths` lives in `src/` (GSD side), not in `packages/pi-coding-agent/`. Need to either:
- (a) Move it into pi-coding-agent as a shared utility, OR
- (b) Import it cross-package (already done for other GSD→pi imports), OR
- (c) Call it on the GSD side before paths reach pi — harder since auto-discovered paths are added inside pi's package manager
Option (a) is cleanest — the sort logic only depends on `readManifestFromEntryPath` which is also in `src/extension-registry.ts` but could be duplicated or shared.
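Wherever it ends up living, the shape of the sort is simple. A minimal sketch of a Kahn's-BFS dependency sort over extension manifests, purely illustrative — the real implementation is `src/extension-sort.ts`, and the `ManifestInfo` shape and warning strings here are assumptions:

```typescript
interface ManifestInfo {
  id: string;
  path: string;
  dependencies: string[]; // from the manifest's dependencies.extensions
}

function sortExtensionPathsSketch(
  entries: ManifestInfo[],
): { sortedPaths: string[]; warnings: string[] } {
  const warnings: string[] = [];
  const byId = new Map<string, ManifestInfo>();
  const inDegree = new Map<string, number>();
  const dependents = new Map<string, string[]>();
  for (const e of entries) {
    byId.set(e.id, e);
    inDegree.set(e.id, 0);
  }
  for (const e of entries) {
    for (const dep of e.dependencies) {
      if (!byId.has(dep)) {
        // missing dep is a warning, not a hard error (see Task 3)
        warnings.push(`'${e.id}' depends on '${dep}' which is not installed`);
        continue;
      }
      inDegree.set(e.id, (inDegree.get(e.id) ?? 0) + 1);
      dependents.set(dep, [...(dependents.get(dep) ?? []), e.id]);
    }
  }
  // Kahn's BFS: start from nodes with no unmet dependencies;
  // sort queues alphabetically so the order is deterministic.
  const queue = entries
    .filter((e) => inDegree.get(e.id) === 0)
    .map((e) => e.id)
    .sort();
  const order: string[] = [];
  while (queue.length > 0) {
    const id = queue.shift()!;
    order.push(id);
    for (const next of (dependents.get(id) ?? []).sort()) {
      const remaining = (inDegree.get(next) ?? 0) - 1;
      inDegree.set(next, remaining);
      if (remaining === 0) queue.push(next);
    }
  }
  if (order.length < entries.length) {
    // anything left participates in a cycle — fall back to alphabetical
    const leftover = entries
      .map((e) => e.id)
      .filter((id) => !order.includes(id))
      .sort();
    warnings.push(`cycle detected among: ${leftover.join(", ")}`);
    order.push(...leftover);
  }
  return { sortedPaths: order.map((id) => byId.get(id)!.path), warnings };
}
```

Note the two failure modes map directly onto the warnings Task 3 wants to surface: missing deps load anyway, cycles fall back to alphabetical order.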
### Task 2: Apply GSD registry to community extensions
**What:** When `buildResourceLoader()` in `src/resource-loader.ts` constructs the DefaultResourceLoader, also discover and filter community extensions from `~/.gsd/agent/extensions/` through the GSD registry — same as it already does for `~/.pi/agent/extensions/` paths.
**Where:** `src/resource-loader.ts` → `buildResourceLoader()` (lines 589-607)
**Current code already filters pi extensions:**
```typescript
const piExtensionPaths = discoverExtensionEntryPaths(piExtensionsDir)
.filter((entryPath) => !bundledKeys.has(getExtensionKey(entryPath, piExtensionsDir)))
.filter((entryPath) => {
const manifest = readManifestFromEntryPath(entryPath)
if (!manifest) return true
return isExtensionEnabled(registry, manifest.id)
})
```
**Add similar filtering for community extensions in agentDir:**
- Discover extensions in `~/.gsd/agent/extensions/` that are NOT bundled
- Filter through `isExtensionEnabled(registry, manifest.id)`
- Pass the disabled set to the resource loader (via override patterns) or pre-filter the paths before they are merged
**Alternative approach:** Hook into `addAutoDiscoveredResources` or the `addResource` call to check the GSD registry. This might be cleaner since the auto-discovery already happens inside pi's package manager.
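Either way, the filter itself mirrors the pi-extension filtering shown above. A sketch with simplified stand-in signatures — the real `readManifestFromEntryPath` / `isExtensionEnabled` live in `src/extension-registry.ts` and the `Registry` shape here is an assumption:

```typescript
// Simplified stand-in for the GSD registry — the real type is richer.
type Registry = { disabled: Set<string> };

function isExtensionEnabledSketch(registry: Registry, id: string): boolean {
  return !registry.disabled.has(id);
}

function filterCommunityPaths(
  entryPaths: string[],
  bundledKeys: Set<string>,
  registry: Registry,
  readManifest: (p: string) => { id: string } | undefined,
): string[] {
  return (
    entryPaths
      // skip anything already provided as a bundled extension
      .filter((p) => !bundledKeys.has(p))
      .filter((p) => {
        const manifest = readManifest(p);
        // no manifest → nothing to key the registry on; keep it
        // (matches the existing pi-extension filter's behavior)
        if (!manifest) return true;
        return isExtensionEnabledSketch(registry, manifest.id);
      })
  );
}
```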
### Task 3: Emit sort warnings as diagnostics
**What:** Surface dependency warnings (missing deps, cycles) through GSD's diagnostic system so users see them.
**Where:** Wherever the sort is invoked from Task 1.
**Format:**
```
⚠ Extension 'gsd-watch' declares dependency 'gsd' which is not installed — loading anyway
⚠ Extensions 'foo' and 'bar' form a dependency cycle — loading in alphabetical order
```
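A small formatter can produce exactly those lines from structured warnings. The `SortWarning` shape below is an assumption — the real `sortExtensionPaths` may already return formatted strings, in which case this step collapses to a pass-through:

```typescript
// Assumed structured warning shape; illustrative only.
type SortWarning =
  | { kind: "missing-dep"; extension: string; dependency: string }
  | { kind: "cycle"; extensions: string[] };

function formatSortWarning(w: SortWarning): string {
  switch (w.kind) {
    case "missing-dep":
      return `⚠ Extension '${w.extension}' declares dependency '${w.dependency}' which is not installed — loading anyway`;
    case "cycle":
      return `⚠ Extensions ${w.extensions.map((e) => `'${e}'`).join(" and ")} form a dependency cycle — loading in alphabetical order`;
  }
}
```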
### Task 4: Clean up dead code
**What:** The `discoverAndLoadExtensions()` function in `packages/pi-coding-agent/src/core/extensions/loader.ts` (lines 945-1002) is exported but never invoked. The project-local trust model inside it (`getUntrustedExtensionPaths`) also never runs.
**Options:**
- (a) Remove it entirely — it's dead
- (b) Mark deprecated — in case upstream pi uses it
- (c) Leave it — lowest risk
Recommend (b) for now — add `@deprecated` JSDoc so it doesn't grow new callers.
### Task 5: Tests
- **Sort integration test:** Create two extensions where A depends on B. Verify B loads before A after sort.
- **Registry community test:** Drop a community extension in `~/.gsd/agent/extensions/`, run `gsd extensions disable <id>`, verify it doesn't load.
- **Conflict test:** Same extension ID in project-local and global — verify project-local wins.
- **Missing dep test:** Extension declares dependency on non-existent extension — verify warning emitted, extension still loads.
- **Cycle test:** Two extensions that depend on each other — verify warning, both load.
---
## Follow-up PR (separate)
**Subagent extension forwarding:** Update `src/resources/extensions/subagent/index.ts` to forward ALL extension paths (not just bundled) to child processes. May need a second env var like `GSD_COMMUNITY_EXTENSION_PATHS` or consolidate into `GSD_EXTENSION_PATHS`.
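If the consolidation route is taken, the env var handling is just a join/split with the platform path delimiter. A sketch — `GSD_EXTENSION_PATHS` is the name floated above, not an existing variable:

```typescript
import { delimiter } from "node:path";

// Parent side: serialize all extension paths into one env var.
function encodeExtensionPaths(paths: string[]): string {
  return paths.join(delimiter);
}

// Child side: recover the list, tolerating an unset variable.
function decodeExtensionPaths(value: string | undefined): string[] {
  return value ? value.split(delimiter).filter(Boolean) : [];
}
```

Using `path.delimiter` (`:` on POSIX, `;` on Windows) keeps the variable safe for paths containing either character on the other platform.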
---
## Open Questions
1. **Where should `sortExtensionPaths` live?** Currently in `src/` (GSD side). Needs to be callable from pi's resource-loader. Options: move to pi, keep and import cross-package, or duplicate.
2. **Should community extensions respect the same registry as bundled?** Or should they have their own enable/disable mechanism? Current plan unifies them.
3. **Project-local trust:** The TOFU model in the dead `discoverAndLoadExtensions()` never runs. Should `addAutoDiscoveredResources` also gate project-local extensions behind trust? Or is `.gsd/extensions/` in your own project always trusted?

View file

@ -0,0 +1,241 @@
# Ollama Extension — First-Class Local LLM Support
## Status: DRAFT — Awaiting approval
## Problem
Ollama support in GSD2 currently requires manual `models.json` configuration. Users must:
1. Know the OpenAI-compatibility endpoint (`localhost:11434/v1`)
2. Manually list every model they want to use
3. Set compat flags (`supportsDeveloperRole: false`, etc.)
4. Use a dummy API key
There's an `ollama-cloud` provider for hosted Ollama, and a discovery adapter that can list models, but no first-class **local Ollama** extension that "just works."
## Goal
Make Ollama the easiest way to use GSD2 — zero config when Ollama is running locally. All Ollama functionality lives in a single extension: `src/resources/extensions/ollama/`.
## Architecture
Everything is a self-contained extension under `src/resources/extensions/ollama/`. The extension:
- Auto-detects Ollama on startup via health check
- Discovers and registers local models with the model registry
- Provides native Ollama API streaming (not OpenAI shim)
- Exposes `/ollama` slash commands for model management
- Registers an LLM-callable tool for model pull/status
Minimal core changes — only `KnownProvider` and `KnownApi` type additions in `pi-ai`, and `env-api-keys.ts` for key resolution. Everything else is in the extension.
## File Structure
```
src/resources/extensions/ollama/
├── index.ts # Extension entry — wires everything on session_start
├── ollama-client.ts # HTTP client for Ollama REST API (/api/*)
├── ollama-discovery.ts # Model discovery + capability detection
├── ollama-provider.ts # Native /api/chat streaming provider (registers with pi-ai)
├── ollama-commands.ts # /ollama slash commands (status, pull, list, remove, ps)
├── ollama-tool.ts # LLM-callable tool for model management
├── model-capabilities.ts # Known model capability table (context window, vision, reasoning)
└── types.ts # Shared types for Ollama API responses
```
## Scope
### Phase 1: Auto-Discovery + OpenAI-Compat Routing
**What:** Extension that auto-detects Ollama, discovers models, registers them using the existing `openai-completions` API provider. Zero config needed.
**Extension files:**
- `ollama/index.ts` — Main entry. On `session_start`:
1. Probe `localhost:11434` (or `OLLAMA_HOST`) with 1.5s timeout
2. If reachable, discover models via `/api/tags`
3. Register discovered models with `ctx.modelRegistry` using correct defaults
4. Show status widget if Ollama is detected
- `ollama/ollama-client.ts` — Low-level HTTP client:
- `isRunning()` → `GET /` health check
- `getVersion()` → `GET /api/version`
- `listModels()` → `GET /api/tags`
- `showModel(name)` → `POST /api/show` (details, template, parameters, size)
- `getRunningModels()` → `GET /api/ps` (loaded models, VRAM usage)
- `pullModel(name, onProgress)` → `POST /api/pull` (streaming progress)
- `deleteModel(name)` → `DELETE /api/delete`
- `copyModel(source, dest)` → `POST /api/copy`
- Respects `OLLAMA_HOST` env var for non-default endpoints
- `ollama/ollama-discovery.ts` — Enhanced model discovery:
- Calls `/api/tags` to get model list
- Calls `/api/show` per model (batch, cached) to get:
- `details.parameter_size` → estimate context window
- `details.families` → detect vision (clip), reasoning (deepseek-r1)
- `modelfile` → extract default parameters
- Returns enriched `DiscoveredModel[]` with proper capabilities
- `ollama/model-capabilities.ts` — Known model lookup table:
- Maps well-known model families to capabilities
- e.g., `llama3.1` → `{ contextWindow: 131072, input: ["text"] }`
- e.g., `llava` → `{ contextWindow: 4096, input: ["text", "image"] }`
- e.g., `deepseek-r1` → `{ reasoning: true, contextWindow: 131072 }`
- e.g., `qwen2.5-coder` → `{ contextWindow: 131072, input: ["text"] }`
- Fallback: estimate from parameter count if not in table
- `ollama/types.ts` — Ollama API response types
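The probe-and-discover path above can be sketched as a thin client. The `OLLAMA_HOST` fallback and 1.5s timeout match the plan; the class shape and method names are assumptions about what `ollama-client.ts` will look like:

```typescript
interface OllamaTag {
  name: string;
  size: number;
}

class OllamaClientSketch {
  constructor(
    private baseUrl: string = process.env.OLLAMA_HOST ?? "http://localhost:11434",
  ) {}

  /** GET / with a short timeout — false on any failure, never throws. */
  async isRunning(timeoutMs = 1500): Promise<boolean> {
    try {
      const res = await fetch(this.baseUrl, {
        signal: AbortSignal.timeout(timeoutMs),
      });
      return res.ok;
    } catch {
      // unreachable, timed out, refused — all mean "not running"
      return false;
    }
  }

  /** GET /api/tags — list of locally pulled models. */
  async listModels(): Promise<OllamaTag[]> {
    const res = await fetch(`${this.baseUrl}/api/tags`);
    if (!res.ok) throw new Error(`Ollama /api/tags failed: ${res.status}`);
    const body = (await res.json()) as { models: OllamaTag[] };
    return body.models;
  }
}
```

Swallowing every probe error is deliberate: the "Ollama isn't running → extension is silent" behavior falls out of `isRunning()` never throwing.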
**Core changes (minimal):**
- `packages/pi-ai/src/types.ts` — Add `"ollama"` to `KnownProvider`
- `packages/pi-ai/src/env-api-keys.ts` — Add `"ollama"` key resolution (returns `"ollama"` placeholder — no real key needed)
- `src/onboarding.ts` — Add `"ollama"` to provider selection list
- `src/wizard.ts` — Add `ollama` entry (no key required)
**Model registration details:**
Each discovered model registers as:
```typescript
{
id: "llama3.1:8b", // from /api/tags
name: "Llama 3.1 8B", // humanized
api: "openai-completions", // uses existing provider
provider: "ollama",
baseUrl: "http://localhost:11434/v1",
cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
reasoning: false, // from capabilities table
input: ["text"], // from capabilities table
contextWindow: 131072, // from capabilities table or /api/show
maxTokens: 16384, // conservative default
compat: {
supportsDeveloperRole: false,
supportsReasoningEffort: false,
supportsUsageInStreaming: false,
maxTokensField: "max_tokens",
},
}
```
**Behavior:**
- `gsd --list-models` shows all locally-pulled Ollama models automatically
- `/model ollama/llama3.1:8b` works without any config file
- If Ollama isn't running, extension is silent — no errors, no models listed
- `models.json` overrides still work (user config wins over auto-discovery)
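The capability lookup with its parameter-count fallback can be sketched as below. The table entries mirror the examples given earlier; the estimation thresholds in the fallback are assumptions, not measured values:

```typescript
interface ModelCaps {
  contextWindow: number;
  input: string[];
  reasoning?: boolean;
}

// Known model families, keyed by the part before the tag colon.
const KNOWN_CAPS: Record<string, ModelCaps> = {
  "llama3.1": { contextWindow: 131072, input: ["text"] },
  "llava": { contextWindow: 4096, input: ["text", "image"] },
  "deepseek-r1": { contextWindow: 131072, input: ["text"], reasoning: true },
  "qwen2.5-coder": { contextWindow: 131072, input: ["text"] },
};

function lookupCaps(modelId: string, parameterSize?: string): ModelCaps {
  // "llama3.1:8b" → family "llama3.1"
  const family = modelId.split(":")[0];
  const known = KNOWN_CAPS[family];
  if (known) return known;
  // Fallback: rough estimate from /api/show's details.parameter_size,
  // e.g. "7B" → 7. Thresholds here are illustrative guesses.
  const params = parameterSize ? parseFloat(parameterSize) : 0;
  return { contextWindow: params >= 7 ? 32768 : 8192, input: ["text"] };
}
```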
### Phase 2: Native Ollama API Provider (`/api/chat`)
**What:** A dedicated streaming provider that talks Ollama's native protocol instead of the OpenAI compatibility shim.
**Extension files:**
- `ollama/ollama-provider.ts` — Native `/api/chat` streaming:
- Registers `"ollama-chat"` API with `registerApiProvider()`
- Implements `stream()` and `streamSimple()`:
- Maps GSD `Context` → Ollama messages format
- Maps GSD `Tool[]` → Ollama tool format
- Streams NDJSON responses, maps back to `AssistantMessage` events
- Extracts `<think>` blocks for reasoning models (deepseek-r1, qwq)
- Ollama-specific options:
- `keep_alive` — control model memory retention (default: "5m")
- `num_ctx` — pass through model's context window
- `num_predict` — max output tokens
- Temperature, top_p, top_k
- Response metadata:
- `eval_count` / `eval_duration` → tokens/sec in usage stats
- `total_duration`, `load_duration` → performance visibility
- Vision support: converts image content to base64 for multimodal models
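The two protocol-level pieces above — NDJSON chunk handling and `<think>` extraction — can be sketched in isolation. The chunk shape is a trimmed-down `/api/chat` response; real streaming would consume a `ReadableStream` incrementally rather than a complete string:

```typescript
// Trimmed-down /api/chat NDJSON chunk — the real response has more fields.
interface ChatChunk {
  message?: { content: string };
  done: boolean;
  eval_count?: number;
}

// Each NDJSON line is an independent JSON document.
function parseNdjson(raw: string): ChatChunk[] {
  return raw
    .split("\n")
    .filter((line) => line.trim().length > 0)
    .map((line) => JSON.parse(line) as ChatChunk);
}

/** Split accumulated output into reasoning (<think>…</think>) and answer text. */
function splitThinking(text: string): { thinking: string; answer: string } {
  const m = text.match(/<think>([\s\S]*?)<\/think>/);
  if (!m) return { thinking: "", answer: text };
  return { thinking: m[1].trim(), answer: text.replace(m[0], "").trim() };
}
```

`eval_count` / `eval_duration` from the final `done: true` chunk are what feed the tokens/sec usage stats mentioned above.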
**Core changes:**
- `packages/pi-ai/src/types.ts` — Add `"ollama-chat"` to `KnownApi`
**Phase 1 models switch to `api: "ollama-chat"` by default.** Users can force OpenAI-compat via `models.json` override if needed.
**Why native over OpenAI-compat:**
- Full `keep_alive` / `num_ctx` control
- Better error messages (Ollama-native vs generic OpenAI)
- More reliable tool calling on Ollama's native format
- Performance metrics in response (tokens/sec)
- Foundation for model management commands
### Phase 3: Local LLM Management UX
**What:** `/ollama` slash commands and an LLM tool for model management.
**Extension files:**
- `ollama/ollama-commands.ts` — Slash commands registered via `pi.registerCommand()`:
- `/ollama` — Status overview:
```
Ollama v0.5.7 — running (localhost:11434)
Loaded:
llama3.1:8b 4.7 GB VRAM idle 3m
Available:
llama3.1:8b (4.7 GB)
qwen2.5-coder:7b (4.4 GB)
deepseek-r1:8b (4.9 GB)
```
- `/ollama pull <model>` — Pull with streaming progress via `ctx.ui.setWidget()`
- `/ollama list` — List all local models with sizes and families
- `/ollama remove <model>` — Delete a model (with confirmation)
- `/ollama ps` — Running models + VRAM usage
- `ollama/ollama-tool.ts` — LLM-callable tool registered via `pi.registerTool()`:
- `ollama_manage` tool — lets the agent pull/list/check models
- Parameters: `{ action: "list" | "pull" | "status" | "ps", model?: string }`
- Use case: agent detects it needs a model, pulls it automatically
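The `ollama_manage` contract above amounts to a small parameter check. The real registration would go through `pi.registerTool()` with a proper schema; this hand-rolled validator just illustrates the contract and is an assumption about its shape:

```typescript
type ManageAction = "list" | "pull" | "status" | "ps";

interface ManageParams {
  action: ManageAction;
  model?: string;
}

function validateManageParams(input: unknown): ManageParams {
  const p = input as Partial<ManageParams> | null;
  const actions: ManageAction[] = ["list", "pull", "status", "ps"];
  if (!p || !actions.includes(p.action as ManageAction)) {
    throw new Error(`action must be one of: ${actions.join(", ")}`);
  }
  // pull is the only action that requires a model name
  if (p.action === "pull" && typeof p.model !== "string") {
    throw new Error("pull requires a model name");
  }
  return { action: p.action as ManageAction, model: p.model };
}
```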
**UX Flow:**
```
$ gsd
> /ollama
Ollama v0.5.7 — running (localhost:11434)
Loaded:
llama3.1:8b — 4.7 GB VRAM, idle 3m
Available:
llama3.1:8b (4.7 GB)
qwen2.5-coder:7b (4.4 GB)
deepseek-r1:8b (4.9 GB)
> /ollama pull codestral:22b
Pulling codestral:22b...
████████████████████████████░░░░ 78% (14.2 GB / 18.1 GB)
✓ codestral:22b ready
> /model ollama/codestral:22b
Switched to codestral:22b (local, Ollama)
```
## Implementation Order
1. **Phase 1** — Auto-discovery with OpenAI-compat routing. Biggest user impact, smallest risk.
2. **Phase 3** — Management UX (`/ollama` commands). Valuable even before native API.
3. **Phase 2** — Native `/api/chat` provider. Optimization over OpenAI-compat; do last.
## Core Changes Summary (minimal)
| File | Change |
|------|--------|
| `packages/pi-ai/src/types.ts` | Add `"ollama"` to `KnownProvider`, `"ollama-chat"` to `KnownApi` (Phase 2) |
| `packages/pi-ai/src/env-api-keys.ts` | Add `"ollama"` → always returns `"ollama"` placeholder |
| `src/onboarding.ts` | Add `"ollama"` to provider picker |
| `src/wizard.ts` | Add `"ollama"` key mapping (no key required) |
Everything else lives in `src/resources/extensions/ollama/`.
## Risks & Mitigations
| Risk | Mitigation |
|------|------------|
| Ollama not running — startup probe latency | 1.5s timeout; cache result; probe async so it doesn't block TUI paint |
| Model capabilities unknown | Known-model table + `/api/show` fallback + parameter_size estimation |
| Tool calling unreliable on small models | Detect param count; warn on <7B models |
| Ollama API changes between versions | Version detect via `/api/version`; stable endpoints only |
| Conflicts with `models.json` Ollama config | User config always wins; auto-discovered models merge beneath manual config |
| Extension disabled — no impact on core | Extension is additive; disabling removes all Ollama features cleanly |
## Testing Strategy
- Unit tests: `ollama-client.ts` with mocked fetch responses
- Unit tests: `ollama-discovery.ts` model capability parsing
- Unit tests: `ollama-provider.ts` message format mapping + NDJSON stream parsing
- Unit tests: `model-capabilities.ts` known model lookups
- Integration test: mock HTTP server simulating Ollama `/api/tags`, `/api/chat`, `/api/pull`
- Manual test: real Ollama instance with llama3.1, qwen2.5-coder, deepseek-r1
## Open Questions
1. **Startup probe** — Probe Ollama on `session_start` (adds ~1.5s if not running) or lazy on first `/model`? **Recommendation: async probe on session_start (non-blocking), eager if `OLLAMA_HOST` is set.**
2. **Auto-start** — Try to launch Ollama if installed but not running? **Recommendation: no — too invasive. Show helpful message in `/ollama` status.**
3. **Vision support** — Support multimodal models (llava, etc.) in Phase 2 native API? **Recommendation: yes, detected via capabilities table.**
4. **Model refresh** — How often to re-probe Ollama for new models? **Recommendation: on `/ollama list`, on `/model` command, and every 5 min (existing TTL).**

View file

@ -7,7 +7,7 @@
[![npm version](https://img.shields.io/npm/v/gsd-pi?style=for-the-badge&logo=npm&logoColor=white&color=CB3837)](https://www.npmjs.com/package/gsd-pi)
[![npm downloads](https://img.shields.io/npm/dm/gsd-pi?style=for-the-badge&logo=npm&logoColor=white&color=CB3837)](https://www.npmjs.com/package/gsd-pi)
[![GitHub stars](https://img.shields.io/github/stars/gsd-build/GSD-2?style=for-the-badge&logo=github&color=181717)](https://github.com/gsd-build/GSD-2)
[![Discord](https://img.shields.io/badge/Discord-Join%20us-5865F2?style=for-the-badge&logo=discord&logoColor=white)](https://discord.gg/gsd)
[![Discord](https://img.shields.io/badge/Discord-Join%20us-5865F2?style=for-the-badge&logo=discord&logoColor=white)](https://discord.com/invite/nKXTsAcmbT)
[![License](https://img.shields.io/badge/license-MIT-blue?style=for-the-badge)](LICENSE)
[![$GSD Token](https://img.shields.io/badge/$GSD-Dexscreener-1C1C1C?style=for-the-badge&logo=data:image/svg+xml;base64,PHN2ZyB3aWR0aD0iMjQiIGhlaWdodD0iMjQiIHZpZXdCb3g9IjAgMCAyNCAyNCIgZmlsbD0ibm9uZSIgeG1sbnM9Imh0dHA6Ly93d3cudzMub3JnLzIwMDAvc3ZnIj48Y2lyY2xlIGN4PSIxMiIgY3k9IjEyIiByPSIxMCIgZmlsbD0iIzAwRkYwMCIvPjwvc3ZnPg==&logoColor=00FF00)](https://dexscreener.com/solana/dwudwjvan7bzkw9zwlbyv6kspdlvhwzrqy6ebk8xzxkv)

View file

@ -38,6 +38,6 @@ Or just use conventional directory names (`extensions/`, `skills/`, `prompts/`,
- [Package gallery](https://shittycodingagent.ai/packages)
- [npm search](https://www.npmjs.com/search?q=keywords%3Api-package)
- [Discord community](https://discord.com/invite/3cU7Bz4UPx)
- [Discord community](https://discord.com/invite/nKXTsAcmbT)
---

View file

@ -54,7 +54,7 @@
"copy-themes": "node scripts/copy-themes.cjs",
"copy-export-html": "node scripts/copy-export-html.cjs",
"test:compile": "node scripts/compile-tests.mjs",
"test:unit": "npm run test:compile && node --import ./scripts/dist-test-resolve.mjs --experimental-test-isolation=process --test-reporter=./scripts/test-reporter-compact.mjs --test 'dist-test/src/tests/*.test.js' 'dist-test/src/resources/extensions/gsd/tests/*.test.js' 'dist-test/src/resources/extensions/gsd/tests/*.test.mjs' 'dist-test/src/resources/extensions/shared/tests/*.test.js' 'dist-test/src/resources/extensions/claude-code-cli/tests/*.test.js' 'dist-test/src/resources/extensions/github-sync/tests/*.test.js' 'dist-test/src/resources/extensions/universal-config/tests/*.test.js' 'dist-test/src/resources/extensions/voice/tests/*.test.js'",
"test:unit": "npm run test:compile && node --import ./scripts/dist-test-resolve.mjs --experimental-test-isolation=process --test-reporter=./scripts/test-reporter-compact.mjs --test 'dist-test/src/tests/*.test.js' 'dist-test/src/resources/extensions/gsd/tests/*.test.js' 'dist-test/src/resources/extensions/gsd/tests/*.test.mjs' 'dist-test/src/resources/extensions/shared/tests/*.test.js' 'dist-test/src/resources/extensions/claude-code-cli/tests/*.test.js' 'dist-test/src/resources/extensions/github-sync/tests/*.test.js' 'dist-test/src/resources/extensions/universal-config/tests/*.test.js' 'dist-test/src/resources/extensions/voice/tests/*.test.js' 'dist-test/src/resources/extensions/mcp-client/tests/*.test.js'",
"test:packages": "node --test packages/pi-coding-agent/dist/core/*.test.js",
"test:marketplace": "GSD_TEST_CLONE_MARKETPLACES=1 node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/claude-import-tui.test.ts src/resources/extensions/gsd/tests/plugin-importer-live.test.ts src/tests/marketplace-discovery.test.ts",
"test:coverage": "c8 --reporter=text --reporter=lcov --exclude='src/resources/extensions/gsd/tests/**' --exclude='src/tests/**' --exclude='scripts/**' --exclude='native/**' --exclude='node_modules/**' --check-coverage --statements=40 --lines=40 --branches=20 --functions=20 node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --experimental-test-isolation=process --test src/resources/extensions/gsd/tests/*.test.ts src/resources/extensions/gsd/tests/*.test.mjs src/tests/*.test.ts src/resources/extensions/shared/tests/*.test.ts",

View file

@ -2,7 +2,7 @@
"name": "@gsd/native",
"version": "0.1.0",
"description": "Native Rust bindings for GSD \u2014 high-performance native modules via N-API",
"type": "module",
"type": "commonjs",
"main": "./dist/index.js",
"types": "./dist/index.d.ts",
"scripts": {
@ -14,75 +14,75 @@
"exports": {
".": {
"types": "./dist/index.d.ts",
"import": "./dist/index.js"
"default": "./dist/index.js"
},
"./grep": {
"types": "./dist/grep/index.d.ts",
"import": "./dist/grep/index.js"
"default": "./dist/grep/index.js"
},
"./ps": {
"types": "./dist/ps/index.d.ts",
"import": "./dist/ps/index.js"
"default": "./dist/ps/index.js"
},
"./glob": {
"types": "./dist/glob/index.d.ts",
"import": "./dist/glob/index.js"
"default": "./dist/glob/index.js"
},
"./clipboard": {
"types": "./dist/clipboard/index.d.ts",
"import": "./dist/clipboard/index.js"
"default": "./dist/clipboard/index.js"
},
"./ast": {
"types": "./dist/ast/index.d.ts",
"import": "./dist/ast/index.js"
"default": "./dist/ast/index.js"
},
"./html": {
"types": "./dist/html/index.d.ts",
"import": "./dist/html/index.js"
"default": "./dist/html/index.js"
},
"./text": {
"types": "./dist/text/index.d.ts",
"import": "./dist/text/index.js"
"default": "./dist/text/index.js"
},
"./fd": {
"types": "./dist/fd/index.d.ts",
"import": "./dist/fd/index.js"
"default": "./dist/fd/index.js"
},
"./image": {
"types": "./dist/image/index.d.ts",
"import": "./dist/image/index.js"
"default": "./dist/image/index.js"
},
"./xxhash": {
"types": "./dist/xxhash/index.d.ts",
"import": "./dist/xxhash/index.js"
"default": "./dist/xxhash/index.js"
},
"./diff": {
"types": "./dist/diff/index.d.ts",
"import": "./dist/diff/index.js"
"default": "./dist/diff/index.js"
},
"./gsd-parser": {
"types": "./dist/gsd-parser/index.d.ts",
"import": "./dist/gsd-parser/index.js"
"default": "./dist/gsd-parser/index.js"
},
"./highlight": {
"types": "./dist/highlight/index.d.ts",
"import": "./dist/highlight/index.js"
"default": "./dist/highlight/index.js"
},
"./json-parse": {
"types": "./dist/json-parse/index.d.ts",
"import": "./dist/json-parse/index.js"
"default": "./dist/json-parse/index.js"
},
"./stream-process": {
"types": "./dist/stream-process/index.d.ts",
"import": "./dist/stream-process/index.js"
"default": "./dist/stream-process/index.js"
},
"./truncate": {
"types": "./dist/truncate/index.d.ts",
"import": "./dist/truncate/index.js"
"default": "./dist/truncate/index.js"
},
"./ttsr": {
"types": "./dist/ttsr/index.d.ts",
"import": "./dist/ttsr/index.js"
"default": "./dist/ttsr/index.js"
}
},
"files": [

View file

@ -0,0 +1,91 @@
/**
* Tests that the @gsd/native package.json is correctly configured
* for Node.js module resolution (ESM/CJS compatibility).
*
* Regression test for #2861: "type": "module" + "import"-only export
* conditions caused crashes on Node.js v24 when the parent package also
* declared "type": "module" and strict ESM resolution was enforced.
*/
import { test, describe } from "node:test";
import assert from "node:assert/strict";
import { readFileSync } from "node:fs";
import * as path from "node:path";
import { fileURLToPath } from "node:url";
const __dirname = path.dirname(fileURLToPath(import.meta.url));
const pkgPath = path.resolve(__dirname, "..", "..", "package.json");
const pkg = JSON.parse(readFileSync(pkgPath, "utf8"));
describe("@gsd/native module compatibility (#2861)", () => {
test("package.json must not declare type: module (compiled output is CJS-compatible)", () => {
// The compiled output uses createRequire() to load .node addons.
// Declaring "type": "module" forces Node.js to treat .js files as ESM,
// but the package needs "type": "commonjs" to override the parent
// package's "type": "module" and ensure correct CJS semantics.
assert.notEqual(
pkg.type,
"module",
'package.json must not set "type": "module" — this causes crashes on Node.js v24 ' +
"when the parent package also declares ESM (see #2861)",
);
});
test("package.json should explicitly declare type: commonjs", () => {
// When installed as a dependency under a parent with "type": "module"
// (e.g. gsd-pi), an absent "type" field would inherit the parent's
// ESM setting. Explicit "commonjs" overrides this.
assert.equal(
pkg.type,
"commonjs",
'package.json must explicitly set "type": "commonjs" to override ' +
"the parent package's ESM declaration",
);
});
test("all export conditions must use 'default' (not 'import'-only)", () => {
// The "import" condition key restricts resolution to ESM import
// statements only. Using "default" ensures the export works for both
// require() and import, which is essential for a CJS package that may
// be consumed from ESM code via Node's CJS interop.
const exportsMap = pkg.exports;
assert.ok(exportsMap, "package.json must have an exports map");
for (const [subpath, conditions] of Object.entries(exportsMap)) {
assert.ok(
!conditions.import || conditions.default,
`exports["${subpath}"] uses "import" condition without "default" — ` +
`this breaks CJS consumers and Node.js v24 strict resolution`,
);
}
});
test("native.ts source must not use bare import.meta.url (parse-time error in CJS)", () => {
// When compiled to CJS, import.meta is a *parse-time* syntax error --
// typeof guards don't help because Node rejects the syntax before
// executing any code. The source must wrap import.meta access in
// an indirect eval so the CJS parser never sees the bare syntax.
const nativeSrc = readFileSync(
path.resolve(__dirname, "..", "native.ts"),
"utf8",
);
// Bare import.meta.url (NOT wrapped) would crash at parse time in CJS.
// These regexes match direct usage like fileURLToPath(import.meta.url)
// and createRequire(import.meta.url), but NOT indirect patterns that
// hide import.meta from the CJS parser.
const hasBareImportMetaDirname = /path\.dirname\(.*fileURLToPath\(import\.meta\.url\)\)/.test(nativeSrc);
const hasBareImportMetaRequire = /createRequire\(import\.meta\.url\)/.test(nativeSrc);
assert.ok(
!hasBareImportMetaDirname,
"native.ts must not use bare import.meta.url in fileURLToPath() -- " +
"this is a parse-time syntax error in CJS; use indirect eval",
);
assert.ok(
!hasBareImportMetaRequire,
"native.ts must not use bare import.meta.url in createRequire() -- " +
"this is a parse-time syntax error in CJS; use indirect eval",
);
});
});

View file

@ -8,14 +8,15 @@
* 3. native/addon/gsd_engine.dev.node (local debug build)
*/
import { createRequire } from "node:module";
import * as path from "node:path";
import { fileURLToPath } from "node:url";
const __dirname = path.dirname(fileURLToPath(import.meta.url));
const require = createRequire(import.meta.url);
// __dirname and require are available in both execution contexts:
// - CJS (production build via tsc): provided natively by Node
// - ESM (CI test loader): injected by the dist-redirect.mjs preamble
const _dirname = __dirname;
const _require = require;
const addonDir = path.resolve(__dirname, "..", "..", "..", "native", "addon");
const addonDir = path.resolve(_dirname, "..", "..", "..", "native", "addon");
const platformTag = `${process.platform}-${process.arch}`;
/** Map Node.js platform/arch to the npm package suffix */
@ -36,7 +37,7 @@ function loadNative(): Record<string, unknown> {
const packageSuffix = platformPackageMap[platformTag];
if (packageSuffix) {
try {
_loadedSuccessfully = true; return require(`@gsd-build/engine-${packageSuffix}`) as Record<string, unknown>;
_loadedSuccessfully = true; return _require(`@gsd-build/engine-${packageSuffix}`) as Record<string, unknown>;
} catch (err) {
const message = err instanceof Error ? err.message : String(err);
errors.push(`@gsd-build/engine-${packageSuffix}: ${message}`);
@ -46,7 +47,7 @@ function loadNative(): Record<string, unknown> {
// 2. Try local release build (native/addon/gsd_engine.{platform}.node)
const releasePath = path.join(addonDir, `gsd_engine.${platformTag}.node`);
try {
_loadedSuccessfully = true; return require(releasePath) as Record<string, unknown>;
_loadedSuccessfully = true; return _require(releasePath) as Record<string, unknown>;
} catch (err) {
const message = err instanceof Error ? err.message : String(err);
errors.push(`${releasePath}: ${message}`);
@ -55,7 +56,7 @@ function loadNative(): Record<string, unknown> {
// 3. Try local dev build (native/addon/gsd_engine.dev.node)
const devPath = path.join(addonDir, "gsd_engine.dev.node");
try {
_loadedSuccessfully = true; return require(devPath) as Record<string, unknown>;
_loadedSuccessfully = true; return _require(devPath) as Record<string, unknown>;
} catch (err) {
const message = err instanceof Error ? err.message : String(err);
errors.push(`${devPath}: ${message}`);

View file

@ -0,0 +1,45 @@
// agent-loop pauseTurn handling tests
// Verifies that pause_turn / pauseTurn stop reason causes the inner loop
// to continue (re-invoke the LLM) instead of exiting.
// Regression test for https://github.com/gsd-build/gsd-2/issues/2869
import { describe, it } from "node:test";
import assert from "node:assert/strict";
import { readFileSync } from "node:fs";
import { join, dirname } from "node:path";
import { fileURLToPath } from "node:url";
const __dirname = dirname(fileURLToPath(import.meta.url));
describe("agent-loop — pauseTurn handling (#2869)", () => {
it("sets hasMoreToolCalls when stopReason is pauseTurn", () => {
const source = readFileSync(join(__dirname, "agent-loop.ts"), "utf-8");
// The agent loop must treat pauseTurn as a reason to continue the inner
// loop, just like toolUse. This prevents incomplete server_tool_use blocks
// from being saved to history, which would cause a 400 on the next request.
assert.match(
source,
/pauseTurn/,
"agent-loop.ts must handle the pauseTurn stop reason",
);
// Verify it sets hasMoreToolCalls = true for pauseTurn
assert.match(
source,
/stopReason\s*===?\s*["']pauseTurn["']/,
'agent-loop.ts must check for stopReason === "pauseTurn"',
);
});
it("pauseTurn is in the StopReason union type", () => {
// Read the pi-ai types to ensure pauseTurn is a valid StopReason
const typesPath = join(__dirname, "..", "..", "pi-ai", "src", "types.ts");
const typesSource = readFileSync(typesPath, "utf-8");
assert.match(
typesSource,
/["']pauseTurn["']/,
'StopReason type must include "pauseTurn"',
);
});
});


@@ -231,9 +231,10 @@ async function runLoop(
return;
}
- // Check for tool calls
+ // Check for tool calls or paused server turn
const toolCalls = message.content.filter((c) => c.type === "toolCall");
- hasMoreToolCalls = toolCalls.length > 0;
+ hasMoreToolCalls =
+ toolCalls.length > 0 || message.stopReason === "pauseTurn";
const toolResults: ToolResultMessage[] = [];
if (hasMoreToolCalls && config.externalToolExecution) {

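The two-line change above folds `pauseTurn` into the loop's continue condition. As a standalone sketch (the helper name and the simplified content-block shape here are illustrative, not part of the diff):

```typescript
// Mirror of the continue-on-pauseTurn decision in runLoop: keep looping
// when the assistant message contains tool calls OR the server paused
// a long-running turn.
type ContentBlock = { type: string };

function shouldContinueTurn(content: ContentBlock[], stopReason: string): boolean {
  const toolCalls = content.filter((c) => c.type === "toolCall");
  return toolCalls.length > 0 || stopReason === "pauseTurn";
}

console.log(shouldContinueTurn([{ type: "toolCall" }], "toolUse")); // true
console.log(shouldContinueTurn([], "pauseTurn")); // true
console.log(shouldContinueTurn([], "stop")); // false
```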

@@ -47,7 +47,7 @@ export type ProxyAssistantMessageEvent =
| { type: "toolcall_end"; contentIndex: number }
| {
type: "done";
- reason: Extract<StopReason, "stop" | "length" | "toolUse">;
+ reason: Extract<StopReason, "stop" | "length" | "toolUse" | "pauseTurn">;
usage: AssistantMessage["usage"];
}
| {


@@ -137,6 +137,7 @@ export function getEnvApiKey(provider: any): string | undefined {
"opencode-go": "OPENCODE_API_KEY",
"kimi-coding": "KIMI_API_KEY",
"alibaba-coding-plan": "ALIBABA_API_KEY",
+ ollama: "OLLAMA_API_KEY",
"ollama-cloud": "OLLAMA_API_KEY",
"custom-openai": "CUSTOM_OPENAI_API_KEY",
};


@@ -27,4 +27,5 @@ export type {
} from "./utils/oauth/types.js";
export * from "./utils/overflow.js";
export * from "./utils/typebox-helpers.js";
+ export * from "./utils/repair-tool-json.js";
export * from "./utils/validation.js";


@@ -0,0 +1,29 @@
import { describe, it } from "node:test";
import assert from "node:assert/strict";
import { mapStopReason } from "./anthropic-shared.js";
describe("mapStopReason", () => {
it("maps end_turn to stop", () => {
assert.equal(mapStopReason("end_turn"), "stop");
});
it("maps max_tokens to length", () => {
assert.equal(mapStopReason("max_tokens"), "length");
});
it("maps tool_use to toolUse", () => {
assert.equal(mapStopReason("tool_use"), "toolUse");
});
it("maps pause_turn to pauseTurn (not stop)", () => {
// pause_turn means the server paused a long-running turn (e.g. native
// web search hit its iteration limit). Mapping it to "stop" causes the
// agent loop to exit, leaving an incomplete server_tool_use block in
// history which triggers a 400 on the next request.
assert.equal(mapStopReason("pause_turn"), "pauseTurn");
});
it("throws on unknown stop reason", () => {
assert.throws(() => mapStopReason("bogus"), /Unhandled stop reason/);
});
});


@@ -31,6 +31,7 @@ import type {
export type AnthropicApi = "anthropic-messages" | "anthropic-vertex";
import type { AssistantMessageEventStream } from "../utils/event-stream.js";
import { parseStreamingJson } from "../utils/json-parse.js";
+ import { repairToolJson } from "../utils/repair-tool-json.js";
import { sanitizeSurrogates } from "../utils/sanitize-unicode.js";
import { transformMessages } from "./transform-messages.js";
@@ -502,7 +503,7 @@ export function mapStopReason(reason: string): StopReason {
case "refusal":
return "error";
case "pause_turn":
- return "stop";
+ return "pauseTurn";
case "stop_sequence":
return "stop";
case "sensitive":
@@ -696,7 +697,21 @@ export function processAnthropicStream(
partial: output,
});
} else if (block.type === "toolCall") {
- block.arguments = parseStreamingJson(block.partialJson);
+ // Try strict parse first; if it fails, attempt YAML bullet
+ // repair (#2660) before falling back to the lenient streaming
+ // parser which silently swallows errors.
+ const raw = block.partialJson ?? "";
+ let parsed: Record<string, any> | undefined;
+ try {
+ parsed = JSON.parse(raw);
+ } catch {
+ try {
+ parsed = JSON.parse(repairToolJson(raw));
+ } catch {
+ // Fall through to streaming parser
+ }
+ }
+ block.arguments = parsed ?? parseStreamingJson(block.partialJson);
delete (block as any).partialJson;
stream.push({
type: "toolcall_end",

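The `toolCall` branch above implements a three-stage parse: strict `JSON.parse`, then repair-and-reparse, then the lenient streaming parser. The pattern in isolation (the function name and injected-callback shape are illustrative, not part of the diff):

```typescript
// Strict parse first; on failure, repair and re-parse; as a last resort,
// hand off to a lenient parser that never throws.
function parseWithFallback(
  raw: string,
  repair: (s: string) => string,
  lenient: (s: string) => Record<string, unknown>,
): Record<string, unknown> {
  try {
    return JSON.parse(raw) as Record<string, unknown>;
  } catch {
    try {
      return JSON.parse(repair(raw)) as Record<string, unknown>;
    } catch {
      return lenient(raw);
    }
  }
}

// Valid JSON takes the fast path; broken input falls through.
console.log(parseWithFallback('{"a":1}', (s) => s, () => ({})));             // { a: 1 }
console.log(parseWithFallback("oops", () => '{"fixed":true}', () => ({})));  // { fixed: true }
console.log(parseWithFallback("oops", (s) => s, () => ({ partial: true }))); // { partial: true }
```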

@@ -43,6 +43,7 @@ export type KnownProvider =
| "opencode-go"
| "kimi-coding"
| "alibaba-coding-plan"
+ | "ollama"
| "ollama-cloud";
export type Provider = KnownProvider | string;
@@ -192,7 +193,7 @@ export interface Usage {
};
}
- export type StopReason = "stop" | "length" | "toolUse" | "error" | "aborted";
+ export type StopReason = "stop" | "length" | "toolUse" | "pauseTurn" | "error" | "aborted";
export interface UserMessage {
role: "user";
@@ -253,7 +254,7 @@ export type AssistantMessageEvent =
| { type: "toolcall_end"; contentIndex: number; toolCall: ToolCall; partial: AssistantMessage; malformedArguments?: boolean }
| { type: "server_tool_use"; contentIndex: number; partial: AssistantMessage }
| { type: "web_search_result"; contentIndex: number; partial: AssistantMessage }
- | { type: "done"; reason: Extract<StopReason, "stop" | "length" | "toolUse">; message: AssistantMessage }
+ | { type: "done"; reason: Extract<StopReason, "stop" | "length" | "toolUse" | "pauseTurn">; message: AssistantMessage }
| { type: "error"; reason: Extract<StopReason, "aborted" | "error">; error: AssistantMessage };
/**


@@ -1,14 +1,41 @@
import { parseStreamingJson as nativeParseStreamingJson } from "@gsd/native";
+ import { hasYamlBulletLists, repairToolJson } from "./repair-tool-json.js";
/**
* Attempts to parse potentially incomplete JSON during streaming.
* Always returns a valid object, even if the JSON is incomplete.
*
* Uses the native Rust streaming JSON parser for performance.
* Falls back to YAML bullet-list repair when the native parser
* returns an empty object from input that contains YAML-style
* bullet lists copied from template formatting (#2660).
*
* @param partialJson The partial JSON string from streaming
* @returns Parsed object or empty object if parsing fails
*/
export function parseStreamingJson<T = any>(partialJson: string | undefined): T {
- return nativeParseStreamingJson<T>(partialJson);
+ if (!partialJson || partialJson.trim() === "") {
+ return {} as T;
+ }
+ // Fast path: try native streaming parser first
+ const result = nativeParseStreamingJson<T>(partialJson);
+ // If the native parser returned a non-empty result, use it.
+ // Only attempt repair when the result is empty AND the input
+ // contains YAML bullet patterns (avoids unnecessary work).
+ if (
+ result &&
+ typeof result === "object" &&
+ Object.keys(result as object).length === 0 &&
+ hasYamlBulletLists(partialJson)
+ ) {
+ try {
+ return JSON.parse(repairToolJson(partialJson)) as T;
+ } catch {
+ // Repair failed — return the empty object from native parser
+ }
+ }
+ return result;
}


@@ -0,0 +1,88 @@
/**
* Repair malformed JSON in LLM tool-call arguments.
*
* LLMs sometimes copy YAML template formatting into JSON tool arguments,
* producing patterns like:
*
* "keyDecisions": - Used Web Notification API...,
* "keyFiles": - src-tauri/src/lib.rs Extended...
*
* instead of valid JSON arrays:
*
* "keyDecisions": ["Used Web Notification API..."],
* "keyFiles": ["src-tauri/src/lib.rs — Extended..."]
*
* This module detects and repairs such patterns before JSON.parse is called.
*
* @see https://github.com/gsd-build/gsd-2/issues/2660
*/
/**
* Detect whether a JSON string contains YAML-style bullet-list values
* (i.e. `"key": - item` instead of `"key": ["item"]`).
*/
export function hasYamlBulletLists(json: string): boolean {
// Match: "key": followed by whitespace then a dash-space pattern (YAML bullet)
// The negative lookahead excludes negative numbers (e.g. "key": -1)
return /"\s*:\s*-\s+(?!\d)/.test(json);
}
/**
* Attempt to repair YAML-style bullet lists embedded in a JSON string.
*
* Converts patterns like:
* "keyDecisions": - Used Web Notification API..., "keyFiles": - file1
*
* Into:
* "keyDecisions": ["Used Web Notification API..."], "keyFiles": ["file1"]
*
* Returns the original string unchanged if no YAML patterns are detected
* or if the repair itself would produce invalid JSON.
*/
export function repairToolJson(json: string): string {
if (!hasYamlBulletLists(json)) {
return json;
}
// Strategy: find each `"key": - item1\n - item2\n - item3` region and
// wrap items in a JSON array.
//
// We work on the raw string because the JSON is not parseable yet.
// The pattern we target:
// "someKey":\s*- item text (possibly multiline)
// optionally followed by more `- item` lines
// terminated by the next `"key":` or `}` or end of string.
let repaired = json;
// Match a key followed by YAML-style bullet list.
// Capture: (1) the key portion including colon, (2) the bullet-list body,
// (3) the separator (comma or empty) before the next key/bracket.
// The bullet list body ends at the next `"key":` or `}` or `]` or end of string.
const keyBulletPattern =
/("(?:[^"\\]|\\.)*"\s*:\s*)(- .+?)(,?\s*)(?="(?:[^"\\]|\\.)*"\s*:|[}\]]|$)/gs;
repaired = repaired.replace(
keyBulletPattern,
(_match, keyPart: string, bulletBody: string, separator: string) => {
// Split the bullet body into individual items on `- ` boundaries.
// Items may contain embedded newlines for multi-line values.
const items = bulletBody
.split(/\n?\s*- /)
.filter((s) => s.trim().length > 0)
.map((s) => s.replace(/,\s*$/, "").trim());
// JSON-encode each item as a string, then wrap in an array.
const jsonArray = "[" + items.map((item) => JSON.stringify(item)).join(", ") + "]";
// Re-emit the separator (comma) so the next key is properly delimited
const sep = separator.trim() ? separator : (/^\s*"/.test(separator + "x") ? ", " : "");
return keyPart + jsonArray + sep;
},
);
// Strip trailing commas before } or ] (common in repaired JSON)
repaired = repaired.replace(/,(\s*[}\]])/g, "$1");
return repaired;
}

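The detection regex in `hasYamlBulletLists` can be exercised on its own. The regex below is copied verbatim from the module above; the standalone wrapper name is illustrative:

```typescript
// Standalone copy of the detection regex from hasYamlBulletLists.
// It matches `"key": - item` (a YAML bullet after a key) but not
// negative numbers: `-1` has no whitespace after the dash, so the
// `-\s+` portion fails, and `- 1` is rejected by the (?!\d) lookahead.
const looksLikeYamlBullet = (json: string): boolean =>
  /"\s*:\s*-\s+(?!\d)/.test(json);

console.log(looksLikeYamlBullet('{"keyDecisions": - Used Web Notification API}')); // true
console.log(looksLikeYamlBullet('{"offset": -1}'));                                // false
console.log(looksLikeYamlBullet('{"keyDecisions": ["a", "b"]}'));                  // false
```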

@@ -0,0 +1,102 @@
import { describe, test } from "node:test";
import assert from "node:assert/strict";
import { repairToolJson, hasYamlBulletLists } from "../repair-tool-json.js";
describe("repairToolJson — YAML bullet list repair (#2660)", () => {
// ── Detection ──────────────────────────────────────────────────────────
test("hasYamlBulletLists detects YAML-style bullets", () => {
assert.equal(
hasYamlBulletLists('"keyDecisions": - Used Web Notification API'),
true,
);
});
test("hasYamlBulletLists ignores negative numbers", () => {
assert.equal(
hasYamlBulletLists('"offset": -1'),
false,
"negative number should not be detected as YAML bullet",
);
});
test("hasYamlBulletLists returns false for valid JSON", () => {
assert.equal(
hasYamlBulletLists('{"keyDecisions": ["item1", "item2"]}'),
false,
);
});
// ── Single bullet item ────────────────────────────────────────────────
test("repairs single YAML bullet to JSON array", () => {
const malformed = '{"keyDecisions": - Used Web Notification API}';
const repaired = repairToolJson(malformed);
const parsed = JSON.parse(repaired);
assert.deepEqual(parsed.keyDecisions, ["Used Web Notification API"]);
});
// ── Multiple bullet items (newline-separated) ─────────────────────────
test("repairs multiple YAML bullets separated by newlines", () => {
const malformed =
'{"keyDecisions": - Used Web Notification API\n - Chose Tauri over Electron\n - Adopted SQLite for storage, "title": "M005"}';
const repaired = repairToolJson(malformed);
const parsed = JSON.parse(repaired);
assert.deepEqual(parsed.keyDecisions, [
"Used Web Notification API",
"Chose Tauri over Electron",
"Adopted SQLite for storage",
]);
assert.equal(parsed.title, "M005");
});
// ── Multiple fields with YAML bullets ─────────────────────────────────
test("repairs multiple fields each with YAML bullet lists", () => {
const malformed =
'{"keyDecisions": - decision one\n - decision two, "keyFiles": - src/lib.rs — Extended menu\n - src/main.ts — Entry point, "title": "done"}';
const repaired = repairToolJson(malformed);
const parsed = JSON.parse(repaired);
assert.deepEqual(parsed.keyDecisions, ["decision one", "decision two"]);
assert.deepEqual(parsed.keyFiles, [
"src/lib.rs \u2014 Extended menu",
"src/main.ts \u2014 Entry point",
]);
assert.equal(parsed.title, "done");
});
// ── Exact reproduction from issue #2660 ───────────────────────────────
test("repairs the exact malformed JSON from issue #2660", () => {
const malformed = `{"milestoneId": "M005", "title": "Native Desktop Polish", "oneLiner": "summary", "narrative": "details", "successCriteriaResults": "all pass", "definitionOfDoneResults": "all done", "requirementOutcomes": "met", "keyDecisions": - Used Web Notification API (new window.Notification()) instead of Tauri sendNotification wrapper, "keyFiles": - src-tauri/src/lib.rs \u2014 Extended menu builder with notification toggle, "lessonsLearned": - Always test notification permissions before sending, "followUps": "none", "deviations": "none", "verificationPassed": true}`;
const repaired = repairToolJson(malformed);
const parsed = JSON.parse(repaired);
assert.equal(parsed.milestoneId, "M005");
assert.equal(parsed.title, "Native Desktop Polish");
assert.ok(Array.isArray(parsed.keyDecisions), "keyDecisions should be an array");
assert.ok(parsed.keyDecisions[0].includes("Web Notification API"));
assert.ok(Array.isArray(parsed.keyFiles), "keyFiles should be an array");
assert.ok(parsed.keyFiles[0].includes("src-tauri/src/lib.rs"));
assert.ok(Array.isArray(parsed.lessonsLearned), "lessonsLearned should be an array");
assert.equal(parsed.verificationPassed, true);
});
// ── Passthrough for valid JSON ────────────────────────────────────────
test("returns valid JSON unchanged", () => {
const valid = '{"keyDecisions": ["item1", "item2"], "count": -5}';
const result = repairToolJson(valid);
assert.equal(result, valid, "valid JSON should be returned unchanged");
});
// ── Negative numbers are preserved ────────────────────────────────────
test("does not mangle negative numbers", () => {
const valid = '{"offset": -1, "limit": -100}';
const result = repairToolJson(valid);
assert.equal(result, valid);
});
});


@@ -72,6 +72,7 @@ import type { ModelRegistry } from "./model-registry.js";
import { expandPromptTemplate, type PromptTemplate } from "./prompt-templates.js";
import type { ResourceExtensionPaths, ResourceLoader } from "./resource-loader.js";
import { RetryHandler } from "./retry-handler.js";
+ import { isImageDimensionError, downsizeConversationImages } from "./image-overflow-recovery.js";
import type { BranchSummaryEntry, SessionManager } from "./session-manager.js";
import { getLatestCompactionEntry } from "./session-manager.js";
import type { SettingsManager } from "./settings-manager.js";
@@ -136,7 +137,8 @@ export type AgentSessionEvent =
| { type: "auto_retry_end"; success: boolean; attempt: number; finalError?: string }
| { type: "fallback_provider_switch"; from: string; to: string; reason: string }
| { type: "fallback_provider_restored"; provider: string; reason: string }
- | { type: "fallback_chain_exhausted"; reason: string };
+ | { type: "fallback_chain_exhausted"; reason: string }
+ | { type: "image_overflow_recovery"; strippedCount: number; imageCount: number };
/** Listener function for agent session events */
export type AgentSessionEventListener = (event: AgentSessionEvent) => void;
@@ -487,6 +489,36 @@ export class AgentSession {
if (didRetry) return; // Retry was initiated, don't proceed to compaction
}
// Check for image dimension overflow (many-image 400 error).
// When a session accumulates many images, the API rejects requests
// whose images exceed the many-image dimension limit. Strip older
// images from the conversation and auto-retry. (#2874)
if (
msg.stopReason === "error" &&
isImageDimensionError(msg.errorMessage)
) {
const messages = this.agent.state.messages;
const result = downsizeConversationImages(messages as Message[]);
if (result.processed) {
// Remove the trailing error assistant message, then replace
if (messages.length > 0 && messages[messages.length - 1].role === "assistant") {
this.agent.replaceMessages(messages.slice(0, -1));
}
this._emit({
type: "image_overflow_recovery",
strippedCount: result.strippedCount,
imageCount: result.imageCount,
});
// Auto-retry after downsizing
setTimeout(() => {
this.agent.continue().catch(() => {});
}, 0);
return;
}
}
await this._compactionOrchestrator.checkCompaction(msg);
}
}
@@ -1986,6 +2018,11 @@ export class AgentSession {
const messages = this.agent.state.messages;
const last = messages[messages.length - 1];
if (last?.role === "assistant" && (last as AssistantMessage).stopReason === "error") {
+ // If the error was an image dimension overflow, downsize images
+ // before retrying so the retry doesn't hit the same error (#2874)
+ if (isImageDimensionError((last as AssistantMessage).errorMessage)) {
+ downsizeConversationImages(messages as Message[]);
+ }
this.agent.replaceMessages(messages.slice(0, -1));
this.agent.continue().catch((err) => {
runner.emitError({


@@ -0,0 +1,236 @@
/**
* Tests for chunked compaction fallback when messages exceed model context window.
* Regression test for #2932.
*/
import assert from "node:assert/strict";
import { describe, it, mock } from "node:test";
import type { AgentMessage } from "@gsd/pi-agent-core";
import type { Model, AssistantMessage } from "@gsd/pi-ai";
import { generateSummary, estimateTokens, chunkMessages } from "./compaction.js";
// ---------------------------------------------------------------------------
// Helpers
// ---------------------------------------------------------------------------
/** Create a user message with approximately `tokenCount` tokens (chars = tokens * 4). */
function makeUserMessage(tokenCount: number): AgentMessage {
const text = "x".repeat(tokenCount * 4);
return { role: "user", content: text } as unknown as AgentMessage;
}
/** Create a mock model with a given context window. */
function makeModel(contextWindow: number): Model<any> {
return {
id: "test-model",
name: "Test Model",
api: "anthropic-messages",
provider: "anthropic",
baseUrl: "https://api.test",
reasoning: false,
input: ["text"],
cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
contextWindow,
maxTokens: 4096,
} as Model<any>;
}
function makeFakeResponse(text: string): AssistantMessage {
return {
content: [{ type: "text", text }],
stopReason: "end_turn",
} as unknown as AssistantMessage;
}
// ---------------------------------------------------------------------------
// chunkMessages tests
// ---------------------------------------------------------------------------
describe("chunkMessages", () => {
it("returns a single chunk when messages fit in budget", () => {
const messages: AgentMessage[] = [
makeUserMessage(1_000),
makeUserMessage(1_000),
];
const chunks = chunkMessages(messages, 100_000);
assert.equal(chunks.length, 1);
assert.equal(chunks[0].length, 2);
});
it("splits messages into multiple chunks when they exceed budget", () => {
const messages: AgentMessage[] = [
makeUserMessage(50_000),
makeUserMessage(50_000),
makeUserMessage(50_000),
];
// Budget of 80k tokens means each 50k message gets its own chunk
// (or two fit together if budget allows)
const chunks = chunkMessages(messages, 80_000);
assert.ok(chunks.length > 1, `Expected multiple chunks, got ${chunks.length}`);
// All messages should be present across chunks
const totalMessages = chunks.reduce((sum, c) => sum + c.length, 0);
assert.equal(totalMessages, 3);
});
it("puts a single oversized message in its own chunk", () => {
const messages: AgentMessage[] = [
makeUserMessage(200_000), // Way over any reasonable budget
];
const chunks = chunkMessages(messages, 80_000);
assert.equal(chunks.length, 1);
assert.equal(chunks[0].length, 1);
});
it("preserves message order across chunks", () => {
// Create messages with identifiable sizes
const messages: AgentMessage[] = [
makeUserMessage(30_000), // ~30k tokens
makeUserMessage(30_000),
makeUserMessage(30_000),
makeUserMessage(30_000),
];
const chunks = chunkMessages(messages, 50_000);
// Reconstruct original order
const flat = chunks.flat();
assert.equal(flat.length, 4);
for (let i = 0; i < flat.length; i++) {
assert.strictEqual(flat[i], messages[i], `Message ${i} should be in order`);
}
});
});
// ---------------------------------------------------------------------------
// generateSummary chunked fallback tests
// ---------------------------------------------------------------------------
describe("generateSummary — chunked fallback (#2932)", () => {
it("calls _completeFn multiple times when messages exceed model context window", async () => {
// Arrange: 3 messages of ~80k tokens each = ~240k total, model has 200k window
const messages: AgentMessage[] = [
makeUserMessage(80_000),
makeUserMessage(80_000),
makeUserMessage(80_000),
];
const model = makeModel(200_000);
const reserveTokens = 16_384;
// Verify our test setup: messages really do exceed the model window
let totalTokens = 0;
for (const m of messages) totalTokens += estimateTokens(m);
assert.ok(
totalTokens > model.contextWindow,
`Test setup: ${totalTokens} tokens should exceed ${model.contextWindow} context window`,
);
// Track calls
const calls: string[] = [];
const mockComplete = mock.fn(async (_model: any, context: any, _options: any) => {
const userMsg = context.messages?.[0];
const text =
typeof userMsg?.content === "string"
? userMsg.content
: userMsg?.content?.[0]?.text ?? "";
if (text.includes("<previous-summary>")) {
calls.push("update");
} else {
calls.push("initial");
}
return makeFakeResponse("Summary of chunk");
});
const summary = await generateSummary(
messages,
model,
reserveTokens,
undefined, // apiKey
undefined, // signal
undefined, // customInstructions
undefined, // previousSummary
mockComplete, // _completeFn override for testing
);
// Assert: should have called completeSimple more than once (chunked)
assert.ok(
mockComplete.mock.callCount() > 1,
`Expected multiple calls for chunked summarization, got ${mockComplete.mock.callCount()}`,
);
// First call should be an initial summary, subsequent should be updates
assert.equal(calls[0], "initial", "First chunk should use initial summarization prompt");
for (let i = 1; i < calls.length; i++) {
assert.equal(calls[i], "update", `Chunk ${i + 1} should use update summarization prompt`);
}
// Should return a non-empty summary
assert.ok(summary.length > 0, "Summary should not be empty");
});
it("uses single-pass when messages fit within model context window", async () => {
const messages: AgentMessage[] = [
makeUserMessage(10_000),
makeUserMessage(10_000),
];
const model = makeModel(200_000);
const reserveTokens = 16_384;
// Verify test setup
let totalTokens = 0;
for (const m of messages) totalTokens += estimateTokens(m);
assert.ok(
totalTokens < model.contextWindow,
`Test setup: ${totalTokens} tokens should fit in ${model.contextWindow} context window`,
);
const mockComplete = mock.fn(async () => makeFakeResponse("Single pass summary"));
await generateSummary(messages, model, reserveTokens, undefined, undefined, undefined, undefined, mockComplete);
assert.equal(
mockComplete.mock.callCount(),
1,
"Should use single-pass summarization when messages fit in context window",
);
});
it("passes previousSummary through chunked summarization", async () => {
const messages: AgentMessage[] = [
makeUserMessage(80_000),
makeUserMessage(80_000),
makeUserMessage(80_000),
];
const model = makeModel(200_000);
const reserveTokens = 16_384;
const previousSummary = "Previous session summary content";
const prompts: string[] = [];
const mockComplete = mock.fn(async (_model: any, context: any) => {
const userMsg = context.messages?.[0];
const text =
typeof userMsg?.content === "string"
? userMsg.content
: userMsg?.content?.[0]?.text ?? "";
prompts.push(text);
return makeFakeResponse("Chunk summary");
});
await generateSummary(
messages,
model,
reserveTokens,
undefined,
undefined,
undefined,
previousSummary,
mockComplete,
);
// First chunk should include the previousSummary
assert.ok(
prompts[0].includes(previousSummary),
"First chunk should incorporate the previousSummary",
);
});
});


@@ -489,9 +489,49 @@ Use this EXACT format:
Keep each section concise. Preserve exact file paths, function names, and error messages.`;
/**
* Split messages into chunks where each chunk's estimated token count
* stays within `maxTokensPerChunk`. A single message that exceeds the
* budget is placed alone in its own chunk (never dropped).
*/
export function chunkMessages(messages: AgentMessage[], maxTokensPerChunk: number): AgentMessage[][] {
const chunks: AgentMessage[][] = [];
let currentChunk: AgentMessage[] = [];
let currentTokens = 0;
for (const msg of messages) {
const msgTokens = estimateTokens(msg);
if (currentChunk.length > 0 && currentTokens + msgTokens > maxTokensPerChunk) {
// Current chunk is full — start a new one
chunks.push(currentChunk);
currentChunk = [msg];
currentTokens = msgTokens;
} else {
currentChunk.push(msg);
currentTokens += msgTokens;
}
}
if (currentChunk.length > 0) {
chunks.push(currentChunk);
}
return chunks;
}
/** Type for the completion function, allowing injection for tests. */
type CompleteFn = typeof completeSimple;
/**
* Generate a summary of the conversation using the LLM.
* If previousSummary is provided, uses the update prompt to merge.
*
* When the messages exceed the model's context window, automatically
* falls back to chunked summarization: summarize the first chunk,
* then iteratively merge subsequent chunks using the update prompt.
*
* @param _completeFn - Internal override for testing; defaults to completeSimple.
*/
export async function generateSummary(
currentMessages: AgentMessage[],
@@ -501,6 +541,59 @@ export async function generateSummary(
signal?: AbortSignal,
customInstructions?: string,
previousSummary?: string,
_completeFn?: CompleteFn,
): Promise<string> {
const complete = _completeFn ?? completeSimple;
// Estimate total tokens for the messages to summarize
let totalTokens = 0;
for (const msg of currentMessages) {
totalTokens += estimateTokens(msg);
}
// Overhead for the prompt framing, system prompt, and response budget
const promptOverhead = 4_000;
const maxTokens = Math.floor(0.8 * reserveTokens);
const maxInputTokens = (model.contextWindow || 200_000) - reserveTokens - promptOverhead;
// If messages fit in the context window, use single-pass summarization
if (totalTokens <= maxInputTokens) {
return singlePassSummary(currentMessages, model, reserveTokens, apiKey, signal, customInstructions, previousSummary, complete);
}
// Chunked fallback: split messages and iteratively summarize
const chunks = chunkMessages(currentMessages, maxInputTokens);
let runningSummary = previousSummary;
for (let i = 0; i < chunks.length; i++) {
runningSummary = await singlePassSummary(
chunks[i],
model,
reserveTokens,
apiKey,
signal,
customInstructions,
runningSummary,
complete,
);
}
return runningSummary!;
}
/**
* Single-pass summarization of messages using the LLM.
* If previousSummary is provided, uses the update prompt to merge.
*/
async function singlePassSummary(
currentMessages: AgentMessage[],
model: Model<any>,
reserveTokens: number,
apiKey: string | undefined,
signal?: AbortSignal,
customInstructions?: string,
previousSummary?: string,
complete: CompleteFn = completeSimple,
): Promise<string> {
const maxTokens = Math.floor(0.8 * reserveTokens);
@@ -526,7 +619,7 @@ export async function generateSummary(
? { maxTokens, signal, apiKey, reasoning: "high" as const }
: { maxTokens, signal, apiKey };
- const response = await completeSimple(
+ const response = await complete(
model,
{ systemPrompt: SUMMARIZATION_SYSTEM_PROMPT, messages: createSummarizationMessage(promptText) },
completionOptions,

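The greedy strategy in `chunkMessages` above is independent of the `AgentMessage` shape; the same algorithm over plain token counts (the function name and number-based interface are illustrative, not part of the diff):

```typescript
// Greedy chunking: accumulate items until adding the next one would
// exceed the budget; a single oversized item gets its own chunk
// rather than being dropped.
function chunkByBudget(tokenCounts: number[], budget: number): number[][] {
  const chunks: number[][] = [];
  let current: number[] = [];
  let total = 0;
  for (const t of tokenCounts) {
    if (current.length > 0 && total + t > budget) {
      chunks.push(current);
      current = [t];
      total = t;
    } else {
      current.push(t);
      total += t;
    }
  }
  if (current.length > 0) chunks.push(current);
  return chunks;
}

console.log(chunkByBudget([10, 10, 10], 25)); // [[10, 10], [10]]
console.log(chunkByBudget([200], 80));        // [[200]] (oversized item kept alone)
```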

@@ -39,7 +39,9 @@ export async function execCommand(
return new Promise((resolve) => {
const proc = spawn(command, args, {
cwd,
- shell: false,
+ // On Windows, npm/npx/tsc etc. are .cmd scripts that require shell
+ // resolution. Without this, spawn fails with ENOENT or EINVAL (#2854).
+ shell: process.platform === "win32",
stdio: ["ignore", "pipe", "pipe"],
});


@@ -0,0 +1,77 @@
// GSD-2 — Extension Manifest Tests
// Copyright (c) 2026 Jeremy McSpadden <jeremy@fluxlabs.net>
import { describe, it } from "node:test";
import assert from "node:assert/strict";
import { mkdtempSync, mkdirSync, writeFileSync } from "node:fs";
import { join } from "node:path";
import { tmpdir } from "node:os";
import { readManifest, readManifestFromEntryPath } from "./extension-manifest.js";
describe("readManifest", () => {
it("returns null for missing directory", () => {
assert.equal(readManifest("/nonexistent/path"), null);
});
it("returns null for directory without manifest", () => {
const dir = mkdtempSync(join(tmpdir(), "ext-manifest-"));
assert.equal(readManifest(dir), null);
});
it("returns null for invalid JSON", () => {
const dir = mkdtempSync(join(tmpdir(), "ext-manifest-"));
writeFileSync(join(dir, "extension-manifest.json"), "not json{{{", "utf-8");
assert.equal(readManifest(dir), null);
});
it("returns null for manifest missing required fields", () => {
const dir = mkdtempSync(join(tmpdir(), "ext-manifest-"));
writeFileSync(
join(dir, "extension-manifest.json"),
JSON.stringify({ id: "test", name: "test" }),
);
assert.equal(readManifest(dir), null);
});
it("returns valid manifest", () => {
const dir = mkdtempSync(join(tmpdir(), "ext-manifest-"));
const manifest = {
id: "test-ext",
name: "Test Extension",
version: "1.0.0",
tier: "bundled",
requires: { platform: ">=2.29.0" },
};
writeFileSync(join(dir, "extension-manifest.json"), JSON.stringify(manifest));
const result = readManifest(dir);
assert.equal(result?.id, "test-ext");
assert.equal(result?.tier, "bundled");
});
});
describe("readManifestFromEntryPath", () => {
it("reads manifest from parent of entry path", () => {
const dir = mkdtempSync(join(tmpdir(), "ext-manifest-"));
const extDir = join(dir, "my-ext");
mkdirSync(extDir);
writeFileSync(
join(extDir, "extension-manifest.json"),
JSON.stringify({
id: "my-ext",
name: "My Extension",
version: "1.0.0",
tier: "community",
}),
);
writeFileSync(join(extDir, "index.ts"), "");
const result = readManifestFromEntryPath(join(extDir, "index.ts"));
assert.equal(result?.id, "my-ext");
assert.equal(result?.tier, "community");
});
it("returns null when entry path parent has no manifest", () => {
const dir = mkdtempSync(join(tmpdir(), "ext-manifest-"));
assert.equal(readManifestFromEntryPath(join(dir, "index.ts")), null);
});
});


@@ -0,0 +1,62 @@
// GSD-2 — Extension Manifest: Types and reading for extension-manifest.json
// Copyright (c) 2026 Jeremy McSpadden <jeremy@fluxlabs.net>
import { existsSync, readFileSync } from "node:fs";
import { dirname, join } from "node:path";
// ─── Types ──────────────────────────────────────────────────────────────────
export interface ExtensionManifest {
id: string;
name: string;
version: string;
description: string;
tier: "core" | "bundled" | "community";
requires: { platform: string };
provides?: {
tools?: string[];
commands?: string[];
hooks?: string[];
shortcuts?: string[];
};
dependencies?: {
extensions?: string[];
runtime?: string[];
};
}
// ─── Validation ─────────────────────────────────────────────────────────────
function isManifest(data: unknown): data is ExtensionManifest {
if (typeof data !== "object" || data === null) return false;
const obj = data as Record<string, unknown>;
return (
typeof obj.id === "string" &&
typeof obj.name === "string" &&
typeof obj.version === "string" &&
typeof obj.tier === "string"
);
}
// ─── Reading ────────────────────────────────────────────────────────────────
/** Read extension-manifest.json from a directory. Returns null if missing or invalid. */
export function readManifest(extensionDir: string): ExtensionManifest | null {
const manifestPath = join(extensionDir, "extension-manifest.json");
if (!existsSync(manifestPath)) return null;
try {
const raw = JSON.parse(readFileSync(manifestPath, "utf-8"));
return isManifest(raw) ? raw : null;
} catch {
return null;
}
}
/**
* Given an entry path (e.g. `.../extensions/browser-tools/index.ts`),
* resolve the parent directory and read its manifest.
*/
export function readManifestFromEntryPath(entryPath: string): ExtensionManifest | null {
const dir = dirname(entryPath);
return readManifest(dir);
}


@@ -0,0 +1,134 @@
// GSD-2 — Extension Sort Tests
// Copyright (c) 2026 Jeremy McSpadden <jeremy@fluxlabs.net>
import { describe, it } from "node:test";
import assert from "node:assert/strict";
import { mkdtempSync, mkdirSync, writeFileSync } from "node:fs";
import { join } from "node:path";
import { tmpdir } from "node:os";
import { sortExtensionPaths } from "./extension-sort.js";
function createExtDir(base: string, id: string, deps?: string[]): string {
const dir = join(base, id);
mkdirSync(dir, { recursive: true });
writeFileSync(
join(dir, "extension-manifest.json"),
JSON.stringify({
id,
name: id,
version: "1.0.0",
tier: "bundled",
requires: { platform: ">=2.29.0" },
...(deps ? { dependencies: { extensions: deps } } : {}),
}),
);
writeFileSync(join(dir, "index.ts"), `export default function() {}`);
return join(dir, "index.ts");
}
describe("sortExtensionPaths", () => {
it("returns empty for empty input", () => {
const result = sortExtensionPaths([]);
assert.deepEqual(result.sortedPaths, []);
assert.deepEqual(result.warnings, []);
});
it("sorts independent extensions alphabetically", () => {
const base = mkdtempSync(join(tmpdir(), "ext-sort-"));
const pathC = createExtDir(base, "charlie");
const pathA = createExtDir(base, "alpha");
const pathB = createExtDir(base, "bravo");
const result = sortExtensionPaths([pathC, pathA, pathB]);
assert.deepEqual(result.sortedPaths, [pathA, pathB, pathC]);
assert.equal(result.warnings.length, 0);
});
it("sorts dependencies before dependents", () => {
const base = mkdtempSync(join(tmpdir(), "ext-sort-"));
const pathBase = createExtDir(base, "base-ext");
const pathDependent = createExtDir(base, "dependent-ext", ["base-ext"]);
// Pass dependent first — sort should reorder
const result = sortExtensionPaths([pathDependent, pathBase]);
assert.deepEqual(result.sortedPaths, [pathBase, pathDependent]);
assert.equal(result.warnings.length, 0);
});
it("handles deep dependency chains", () => {
const base = mkdtempSync(join(tmpdir(), "ext-sort-"));
const pathA = createExtDir(base, "a");
const pathB = createExtDir(base, "b", ["a"]);
const pathC = createExtDir(base, "c", ["b"]);
const result = sortExtensionPaths([pathC, pathB, pathA]);
assert.deepEqual(result.sortedPaths, [pathA, pathB, pathC]);
assert.equal(result.warnings.length, 0);
});
it("warns about missing dependencies but still loads", () => {
const base = mkdtempSync(join(tmpdir(), "ext-sort-"));
const pathExt = createExtDir(base, "my-ext", ["nonexistent"]);
const result = sortExtensionPaths([pathExt]);
assert.equal(result.sortedPaths.length, 1);
assert.equal(result.sortedPaths[0], pathExt);
assert.equal(result.warnings.length, 1);
assert.match(result.warnings[0].message, /nonexistent.*not installed/);
});
it("warns about cycles but still loads both", () => {
const base = mkdtempSync(join(tmpdir(), "ext-sort-"));
const pathA = createExtDir(base, "cycle-a", ["cycle-b"]);
const pathB = createExtDir(base, "cycle-b", ["cycle-a"]);
const result = sortExtensionPaths([pathA, pathB]);
assert.equal(result.sortedPaths.length, 2);
assert.ok(result.warnings.length > 0);
assert.ok(result.warnings.some((w) => w.message.includes("cycle")));
});
it("silently ignores self-dependencies", () => {
const base = mkdtempSync(join(tmpdir(), "ext-sort-"));
const pathExt = createExtDir(base, "self-dep", ["self-dep"]);
const result = sortExtensionPaths([pathExt]);
assert.deepEqual(result.sortedPaths, [pathExt]);
assert.equal(result.warnings.length, 0);
});
it("prepends extensions without manifests", () => {
const base = mkdtempSync(join(tmpdir(), "ext-sort-"));
const noManifestDir = join(base, "no-manifest");
mkdirSync(noManifestDir, { recursive: true });
writeFileSync(join(noManifestDir, "index.ts"), `export default function() {}`);
const noManifestPath = join(noManifestDir, "index.ts");
const pathWithManifest = createExtDir(base, "with-manifest");
const result = sortExtensionPaths([pathWithManifest, noManifestPath]);
assert.equal(result.sortedPaths[0], noManifestPath);
assert.equal(result.sortedPaths[1], pathWithManifest);
});
it("handles non-array dependencies gracefully", () => {
const base = mkdtempSync(join(tmpdir(), "ext-sort-"));
const dir = join(base, "bad-deps");
mkdirSync(dir, { recursive: true });
writeFileSync(
join(dir, "extension-manifest.json"),
JSON.stringify({
id: "bad-deps",
name: "bad-deps",
version: "1.0.0",
tier: "bundled",
dependencies: { extensions: "not-an-array" },
}),
);
writeFileSync(join(dir, "index.ts"), `export default function() {}`);
const result = sortExtensionPaths([join(dir, "index.ts")]);
assert.equal(result.sortedPaths.length, 1);
assert.equal(result.warnings.length, 0);
});
});

View file

@ -0,0 +1,137 @@
// GSD-2 — Extension Sort: Topological dependency ordering
// Copyright (c) 2026 Jeremy McSpadden <jeremy@fluxlabs.net>
import { readManifestFromEntryPath } from "./extension-manifest.js";
export interface SortWarning {
declaringId: string;
missingId: string;
message: string;
}
export interface SortResult {
sortedPaths: string[];
warnings: SortWarning[];
}
/**
* Sort extension entry paths in topological dependency-first order using Kahn's BFS algorithm.
*
* - Extensions without manifests are prepended in input order.
* - Missing dependencies produce a structured warning but do not block loading.
* - Cycles produce warnings; cycle participants are appended alphabetically.
* - Self-dependencies are silently ignored.
*/
export function sortExtensionPaths(paths: string[]): SortResult {
const warnings: SortWarning[] = [];
const pathsWithoutId: string[] = [];
const idToPath = new Map<string, string>();
// Step 1: Build ID map
for (const p of paths) {
const manifest = readManifestFromEntryPath(p);
if (!manifest) {
pathsWithoutId.push(p);
} else {
idToPath.set(manifest.id, p);
}
}
// Step 2: Build graph — inDegree and dependents adjacency
const inDegree = new Map<string, number>();
const dependents = new Map<string, string[]>(); // dep → [ids that depend on dep]
for (const id of idToPath.keys()) {
if (!inDegree.has(id)) inDegree.set(id, 0);
if (!dependents.has(id)) dependents.set(id, []);
}
for (const [id, entryPath] of idToPath) {
const manifest = readManifestFromEntryPath(entryPath);
const rawDeps = manifest?.dependencies?.extensions ?? [];
const deps = Array.isArray(rawDeps) ? rawDeps : [];
for (const depId of deps) {
// Silently ignore self-deps
if (depId === id) continue;
if (!idToPath.has(depId)) {
// Missing dependency — warn and skip edge
warnings.push({
declaringId: id,
missingId: depId,
message: `Extension '${id}' declares dependency '${depId}' which is not installed — loading anyway`,
});
continue;
}
// Valid edge: id depends on depId → increment inDegree[id], add id to dependents[depId]
inDegree.set(id, (inDegree.get(id) ?? 0) + 1);
const depDependents = dependents.get(depId) ?? [];
depDependents.push(id);
dependents.set(depId, depDependents);
}
}
// Step 3: Kahn's algorithm — start with nodes that have inDegree 0
const sorted: string[] = [];
// Ready queue: IDs with inDegree 0, maintained in alphabetical order
const ready: string[] = [...idToPath.keys()]
.filter((id) => inDegree.get(id) === 0)
.sort();
while (ready.length > 0) {
const id = ready.shift()!;
sorted.push(idToPath.get(id)!);
const deps = dependents.get(id) ?? [];
for (const depId of deps) {
const newDegree = (inDegree.get(depId) ?? 0) - 1;
inDegree.set(depId, newDegree);
if (newDegree === 0) {
// Insert into ready queue maintaining alphabetical order
const insertIdx = ready.findIndex((r) => r > depId);
if (insertIdx === -1) {
ready.push(depId);
} else {
ready.splice(insertIdx, 0, depId);
}
}
}
}
// Step 4: Cycle handling — any remaining IDs with inDegree > 0
const cycleIds = [...idToPath.keys()]
.filter((id) => (inDegree.get(id) ?? 0) > 0)
.sort();
if (cycleIds.length > 0) {
const cycleSet = new Set(cycleIds);
for (const id of cycleIds) {
const entryPath = idToPath.get(id)!;
const manifest = readManifestFromEntryPath(entryPath);
const rawDeps = manifest?.dependencies?.extensions ?? [];
const deps = Array.isArray(rawDeps) ? rawDeps : [];
for (const depId of deps) {
if (depId === id) continue;
if (!cycleSet.has(depId)) continue;
// Both id and depId are in cycle — emit warning
warnings.push({
declaringId: id,
missingId: depId,
message: `Extension '${id}' and '${depId}' form a dependency cycle — loading both anyway (alphabetical order)`,
});
}
sorted.push(entryPath);
}
}
return {
sortedPaths: [...pathsWithoutId, ...sorted],
warnings,
};
}

View file

@ -2,6 +2,10 @@
* Extension system for lifecycle events and custom tools.
*/
export type { ExtensionManifest } from "./extension-manifest.js";
export { readManifest, readManifestFromEntryPath } from "./extension-manifest.js";
export type { SortResult, SortWarning } from "./extension-sort.js";
export { sortExtensionPaths } from "./extension-sort.js";
export type { SlashCommandInfo, SlashCommandLocation, SlashCommandSource } from "../slash-commands.js";
export {
createExtensionRuntime,

View file

@ -941,6 +941,11 @@ function discoverExtensionsInDir(dir: string): string[] {
/**
* Discover and load extensions from standard locations.
*
* @deprecated Use DefaultResourceLoader.reload() instead; this function is
* not called in the GSD loading flow. Extension discovery happens through
* DefaultPackageManager.resolve() → addAutoDiscoveredResources(). Kept for
* backwards compatibility with direct pi-coding-agent consumers.
*/
export async function discoverAndLoadExtensions(
configuredPaths: string[],

View file

@ -0,0 +1,228 @@
import assert from "node:assert/strict";
import { describe, it } from "node:test";
import {
isImageDimensionError,
MANY_IMAGE_MAX_DIMENSION,
downsizeConversationImages,
} from "./image-overflow-recovery.js";
import type { Message } from "@gsd/pi-ai";
// ─── isImageDimensionError ────────────────────────────────────────────────────
describe("isImageDimensionError", () => {
it("returns true for Anthropic many-image dimension error", () => {
const errorMessage =
'Error: 400 {"type":"error","error":{"type":"invalid_request_error","message":"messages.125.content.38.image.source.base64.data: At least one of the image dimensions exceed max allowed size for many-image requests: 2000 pixels"}}';
assert.equal(isImageDimensionError(errorMessage), true);
});
it("returns true for bare dimension exceed message", () => {
const errorMessage =
"image dimensions exceed max allowed size for many-image requests: 2000 pixels";
assert.equal(isImageDimensionError(errorMessage), true);
});
it("returns false for unrelated 400 error", () => {
const errorMessage =
'Error: 400 {"type":"error","error":{"type":"invalid_request_error","message":"max_tokens: 4096 > 2048"}}';
assert.equal(isImageDimensionError(errorMessage), false);
});
it("returns false for rate limit error", () => {
assert.equal(isImageDimensionError("429 rate limit exceeded"), false);
});
it("returns false for empty string", () => {
assert.equal(isImageDimensionError(""), false);
});
it("returns false for undefined", () => {
assert.equal(isImageDimensionError(undefined), false);
});
});
// ─── MANY_IMAGE_MAX_DIMENSION ─────────────────────────────────────────────────
describe("MANY_IMAGE_MAX_DIMENSION", () => {
it("is less than 2000 (the API-enforced limit)", () => {
assert.ok(MANY_IMAGE_MAX_DIMENSION < 2000);
});
it("is a positive integer", () => {
assert.ok(MANY_IMAGE_MAX_DIMENSION > 0);
assert.equal(MANY_IMAGE_MAX_DIMENSION, Math.floor(MANY_IMAGE_MAX_DIMENSION));
});
});
// ─── helpers ──────────────────────────────────────────────────────────────────
function makeUserMsg(content: any): Message {
return { role: "user", content, timestamp: Date.now() } as Message;
}
function makeAssistantMsg(text: string): Message {
return {
role: "assistant",
content: [{ type: "text", text }],
api: "anthropic-messages",
provider: "anthropic",
model: "claude-opus-4-6",
usage: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0, totalTokens: 0, cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0, total: 0 } },
stopReason: "stop",
timestamp: Date.now(),
} as Message;
}
function makeToolResultMsg(images: number): Message {
const content: any[] = [];
for (let i = 0; i < images; i++) {
content.push({ type: "image", data: `img${i}`, mimeType: "image/png" });
}
return {
role: "toolResult",
toolCallId: `tc${Math.random()}`,
toolName: "screenshot",
content,
isError: false,
timestamp: Date.now(),
} as Message;
}
// ─── downsizeConversationImages ───────────────────────────────────────────────
describe("downsizeConversationImages", () => {
it("counts images in user and toolResult messages", () => {
const messages: Message[] = [
makeUserMsg([
{ type: "image", data: "img1", mimeType: "image/png" },
{ type: "image", data: "img2", mimeType: "image/png" },
]),
makeAssistantMsg("I see them"),
makeToolResultMsg(1),
];
const result = downsizeConversationImages(messages);
assert.equal(result.imageCount, 3);
});
it("returns processed=false when no images present", () => {
const messages: Message[] = [
makeUserMsg("just text"),
makeAssistantMsg("reply"),
];
const result = downsizeConversationImages(messages);
assert.equal(result.imageCount, 0);
assert.equal(result.processed, false);
});
it("returns processed=false when image count <= RECENT_IMAGES_TO_KEEP", () => {
const messages: Message[] = [
makeUserMsg([
{ type: "image", data: "img1", mimeType: "image/png" },
]),
makeAssistantMsg("got it"),
];
const result = downsizeConversationImages(messages);
assert.equal(result.imageCount, 1);
assert.equal(result.processed, false);
});
it("strips older images when many images present, preserves recent ones", () => {
const messages: Message[] = [];
for (let i = 0; i < 25; i++) {
messages.push(
makeUserMsg([
{ type: "text", text: `message ${i}` },
{ type: "image", data: `img${i}`, mimeType: "image/png" },
]),
);
messages.push(makeAssistantMsg(`reply ${i}`));
}
const result = downsizeConversationImages(messages);
assert.ok(result.processed);
assert.equal(result.imageCount, 25);
assert.equal(result.strippedCount, 20); // 25 - 5 recent
// Count remaining images
let remainingImages = 0;
for (const msg of messages) {
if (msg.role === "assistant") continue;
if (typeof msg.content === "string") continue;
const arr = msg.content as any[];
for (const block of arr) {
if (block.type === "image") remainingImages++;
}
}
assert.equal(remainingImages, 5, "Should keep exactly 5 most recent images");
// The 5 most recent user messages (indices 40,42,44,46,48) should have images
for (let i = 20; i < 25; i++) {
const userMsg = messages[i * 2]; // user messages at even indices
const arr = userMsg.content as any[];
const hasImage = arr.some((c: any) => c.type === "image");
assert.ok(hasImage, `Recent message ${i} should retain its image`);
}
});
it("adds text placeholder when stripping an image", () => {
const messages: Message[] = [];
for (let i = 0; i < 10; i++) {
messages.push(
makeUserMsg([
{ type: "image", data: `img${i}`, mimeType: "image/jpeg" },
]),
);
messages.push(makeAssistantMsg(`reply ${i}`));
}
downsizeConversationImages(messages);
// First message's image should have been replaced with text
const firstMsg = messages[0];
const arr = firstMsg.content as any[];
const placeholder = arr.find(
(c: any) => c.type === "text" && c.text.includes("[image removed"),
);
assert.ok(placeholder, "Stripped image should be replaced with text placeholder");
assert.ok(
placeholder.text.includes("image/jpeg"),
"Placeholder should mention original mime type",
);
});
it("handles toolResult messages with images", () => {
const messages: Message[] = [];
for (let i = 0; i < 10; i++) {
messages.push(makeToolResultMsg(1));
messages.push(makeAssistantMsg(`reply ${i}`));
}
const result = downsizeConversationImages(messages);
assert.equal(result.imageCount, 10);
assert.equal(result.strippedCount, 5);
assert.ok(result.processed);
});
it("handles mixed user and toolResult images", () => {
const messages: Message[] = [];
for (let i = 0; i < 8; i++) {
messages.push(
makeUserMsg([
{ type: "text", text: `check ${i}` },
{ type: "image", data: `uimg${i}`, mimeType: "image/png" },
]),
);
messages.push(makeAssistantMsg(`processing ${i}`));
messages.push(makeToolResultMsg(1));
messages.push(makeAssistantMsg(`done ${i}`));
}
const result = downsizeConversationImages(messages);
// 8 user images + 8 tool result images = 16 total
assert.equal(result.imageCount, 16);
assert.equal(result.strippedCount, 11); // 16 - 5 recent
});
});

View file

@ -0,0 +1,118 @@
/**
* Image overflow recovery for many-image sessions.
*
* When a conversation accumulates many images (screenshots, file reads, etc.),
* the Anthropic API enforces a stricter per-image dimension limit (2000px) for
* "many-image requests." This module detects the resulting 400 error and
* recovers by stripping older images from the conversation history, preserving
* the most recent ones to maintain session continuity.
*
* @see https://github.com/gsd-build/gsd-2/issues/2874
*/
import type { Message, ImageContent, TextContent } from "@gsd/pi-ai";
/**
* Maximum image dimension (px) that the Anthropic API allows in many-image
* requests. Images at or above this size in a large conversation will be
* rejected with a 400 error. We use 1568 as the safe ceiling (Anthropic's
* recommended max for multi-image requests).
*/
export const MANY_IMAGE_MAX_DIMENSION = 1568;
/**
* Number of recent images to preserve when stripping old images.
* Keeps the most recent screenshots/images so the model retains visual context
* for the current task.
*/
const RECENT_IMAGES_TO_KEEP = 5;
/**
* Regex matching the Anthropic API error for oversized images in many-image requests.
*/
const IMAGE_DIMENSION_ERROR_RE =
/image.dimensions?.exceed.*max.*allowed.*size.*many.image/i;
/**
* Detect whether an error message is the Anthropic "image dimensions exceed max
* allowed size for many-image requests" 400 error.
*/
export function isImageDimensionError(errorMessage: string | undefined | null): boolean {
if (!errorMessage) return false;
return IMAGE_DIMENSION_ERROR_RE.test(errorMessage);
}
export interface DownsizeResult {
/** Total number of images found in the conversation */
imageCount: number;
/** Whether any images were stripped */
processed: boolean;
/** Number of images that were stripped */
strippedCount: number;
}
/**
* Strip older images from conversation messages to recover from many-image
* dimension errors. Preserves the N most recent images and replaces older ones
* with a text placeholder.
*
* Mutates messages in place (same pattern as replaceMessages/compaction).
*
* Accepts Message[] (the LLM message union) so it works with both
* agent.state.messages and session entries.
*/
export function downsizeConversationImages(messages: Message[]): DownsizeResult {
// First pass: collect all image locations (message index + content index)
const imageLocations: Array<{ msgIdx: number; contentIdx: number }> = [];
for (let msgIdx = 0; msgIdx < messages.length; msgIdx++) {
const msg = messages[msgIdx];
if (msg.role === "assistant") continue;
// UserMessage can have string content; ToolResultMessage always has array
if (msg.role === "user" && typeof msg.content === "string") continue;
const contentArr = msg.content as (TextContent | ImageContent)[];
if (!Array.isArray(contentArr)) continue;
for (let contentIdx = 0; contentIdx < contentArr.length; contentIdx++) {
if (contentArr[contentIdx].type === "image") {
imageLocations.push({ msgIdx, contentIdx });
}
}
}
const imageCount = imageLocations.length;
if (imageCount === 0) {
return { imageCount: 0, processed: false, strippedCount: 0 };
}
// Determine which images to strip (all except the N most recent)
const stripCount = Math.max(0, imageCount - RECENT_IMAGES_TO_KEEP);
if (stripCount === 0) {
return { imageCount, processed: false, strippedCount: 0 };
}
const toStrip = imageLocations.slice(0, stripCount);
// Second pass: replace stripped images with text placeholder.
// Process in reverse order to maintain content indices.
for (let i = toStrip.length - 1; i >= 0; i--) {
const { msgIdx, contentIdx } = toStrip[i];
const msg = messages[msgIdx];
if (msg.role === "assistant") continue;
if (msg.role === "user" && typeof msg.content === "string") continue;
const contentArr = msg.content as (TextContent | ImageContent)[];
const imageBlock = contentArr[contentIdx] as ImageContent;
const mimeType = imageBlock.mimeType || "image/unknown";
// Replace the image block with a text placeholder
(contentArr as any[])[contentIdx] = {
type: "text",
text: `[image removed to reduce context size — was ${mimeType}]`,
} as TextContent;
}
return { imageCount, processed: true, strippedCount: stripCount };
}

View file

@ -29,6 +29,7 @@ export {
type ExecResult,
type Extension,
type ExtensionAPI,
type ExtensionManifest,
type ExtensionCommandContext,
type ExtensionContext,
type ExtensionError,
@ -53,6 +54,11 @@ export {
type SessionSwitchEvent,
type SessionTreeEvent,
type ToolCallEvent,
readManifest,
readManifestFromEntryPath,
type SortResult,
type SortWarning,
sortExtensionPaths,
type ToolDefinition,
type ToolRenderResultOptions,
type ToolResultEvent,

View file

@ -340,6 +340,9 @@ async function runWorkspaceDiagnostics(
const proc = spawn(cmd, cmdArgs, {
cwd,
stdio: ["ignore", "pipe", "pipe"],
// On Windows, project-type commands (tsc, cargo, etc.) may be .cmd
// wrappers that need shell resolution to avoid ENOENT/EINVAL (#2854).
shell: process.platform === "win32",
});
const abortHandler = () => {
proc.kill();

View file

@ -90,6 +90,9 @@ async function checkServerRunning(binaryPath: string): Promise<boolean> {
try {
const proc = spawn(binaryPath, ["status"], {
stdio: ["ignore", "pipe", "pipe"],
// On Windows, the binary may be a .cmd wrapper requiring shell
// resolution to avoid ENOENT/EINVAL (#2854).
shell: process.platform === "win32",
});
const exited = await Promise.race([

View file

@ -0,0 +1,114 @@
/**
* messages.test.ts: Tests for convertToLlm custom message handling.
*
* Reproduction test for #3026: background job completion notifications
* delivered as custom messages must be clearly distinguishable from
* user-typed input when converted to LLM messages.
*/
import test from "node:test";
import assert from "node:assert/strict";
import { convertToLlm, type CustomMessage } from "./messages.js";
/** Extract the first content block from a message, asserting array content. */
function firstTextBlock(msg: ReturnType<typeof convertToLlm>[number]) {
const { content } = msg;
assert.ok(Array.isArray(content), "Expected content to be an array");
const block = content[0];
assert.ok(typeof block === "object" && block !== null, "Expected first block to be an object");
return block;
}
test("convertToLlm wraps custom messages with system notification prefix", () => {
const customMsg: CustomMessage = {
role: "custom",
customType: "async_job_result",
content: "**Background job done: bg_abc123** (sleep 2, 2.1s)\n\ndone",
display: true,
timestamp: Date.now(),
};
const result = convertToLlm([customMsg]);
assert.equal(result.length, 1);
assert.equal(result[0].role, "user");
// The content must include a system notification wrapper so the LLM
// does not confuse it with user input (#3026).
const text = firstTextBlock(result[0]);
assert.equal(text.type, "text");
assert.ok(
"text" in text && text.text.includes("[system notification"),
"Custom message should be wrapped with system notification marker",
);
});
test("convertToLlm wraps custom messages with array content", () => {
const customMsg: CustomMessage = {
role: "custom",
customType: "bg-shell-status",
content: [{ type: "text", text: "Background processes:\n ✓ bg1 dev-server :3000" }],
display: false,
timestamp: Date.now(),
};
const result = convertToLlm([customMsg]);
assert.equal(result.length, 1);
assert.equal(result[0].role, "user");
const text = firstTextBlock(result[0]);
assert.equal(text.type, "text");
assert.ok(
"text" in text && text.text.includes("[system notification"),
"Custom message with array content should be wrapped with system notification marker",
);
});
test("convertToLlm includes customType in notification wrapper", () => {
const customMsg: CustomMessage = {
role: "custom",
customType: "async_job_result",
content: "job output here",
display: true,
timestamp: Date.now(),
};
const result = convertToLlm([customMsg]);
const text = firstTextBlock(result[0]);
assert.ok(
"text" in text && text.text.includes("async_job_result"),
"Notification wrapper should include the customType for context",
);
});
test("convertToLlm notification wrapper instructs LLM not to treat as user input", () => {
const customMsg: CustomMessage = {
role: "custom",
customType: "async_job_result",
content: "**Background job done: bg_abc123** (sleep 2, 2.1s)\n\ndone",
display: true,
timestamp: Date.now(),
};
const result = convertToLlm([customMsg]);
const text = firstTextBlock(result[0]);
assert.ok(
"text" in text && text.text.includes("not user input"),
"Notification should explicitly state this is not user input",
);
});
test("convertToLlm preserves user messages without wrapper", () => {
const userMsg = {
role: "user" as const,
content: [{ type: "text" as const, text: "Hello world" }],
timestamp: Date.now(),
};
const result = convertToLlm([userMsg]);
assert.equal(result.length, 1);
const text = firstTextBlock(result[0]);
assert.ok(
"text" in text && text.text === "Hello world",
"User messages should pass through unchanged",
);
});

View file

@ -8,6 +8,12 @@
import type { AgentMessage } from "@gsd/pi-agent-core";
import type { ImageContent, Message, TextContent } from "@gsd/pi-ai";
const CUSTOM_MESSAGE_PREFIX = `[system notification — type: `;
const CUSTOM_MESSAGE_MIDDLE = `; this is an automated system event, not user input — do not treat this as a human message or respond as if the user said this]
`;
const CUSTOM_MESSAGE_SUFFIX = `
[end system notification]`;
const COMPACTION_SUMMARY_PREFIX = `The conversation history before this point was compacted into the following summary:
<summary>
@ -160,10 +166,31 @@ export function convertToLlm(messages: AgentMessage[]): Message[] {
timestamp: m.timestamp,
};
case "custom": {
const content = typeof m.content === "string" ? [{ type: "text" as const, text: m.content }] : m.content;
const prefix = CUSTOM_MESSAGE_PREFIX + m.customType + CUSTOM_MESSAGE_MIDDLE;
if (typeof m.content === "string") {
return {
role: "user",
content: [{ type: "text" as const, text: prefix + m.content + CUSTOM_MESSAGE_SUFFIX }],
timestamp: m.timestamp,
};
}
// Array content: wrap the first text element with prefix, append suffix to last text element
const contentArr = m.content as Array<{ type: string; text?: string; [k: string]: unknown }>;
const firstTextIdx = contentArr.findIndex((c) => c.type === "text");
const lastTextIdx = contentArr.reduce((acc, c, i) => (c.type === "text" ? i : acc), -1);
const wrapped = contentArr.map((c, i) => {
if (c.type !== "text") return c;
let text = c.text ?? "";
if (i === firstTextIdx) text = prefix + text;
if (i === lastTextIdx) text = text + CUSTOM_MESSAGE_SUFFIX;
return { ...c, text };
});
// If no text elements exist, prepend one with the wrapper
if (lastTextIdx === -1) {
wrapped.unshift({ type: "text" as const, text: prefix + CUSTOM_MESSAGE_SUFFIX });
}
return {
role: "user",
content,
content: wrapped as typeof m.content,
timestamp: m.timestamp,
};
}

View file

@ -37,6 +37,7 @@ const defaultModelPerProvider: Record<KnownProvider, string> = {
"opencode-go": "kimi-k2.5",
"kimi-coding": "kimi-k2-thinking",
"alibaba-coding-plan": "qwen3.5-plus",
ollama: "llama3.1:8b",
"ollama-cloud": "qwen3:32b",
};

View file

@ -129,6 +129,12 @@ export interface DefaultResourceLoaderOptions {
appendSystemPrompt?: string;
/** Names of bundled extensions (used to identify built-in extensions in conflict detection). */
bundledExtensionNames?: Set<string>;
/**
* Transform extension paths before loading. Receives the merged list of all
* discovered extension paths and returns a (possibly reordered/filtered) list.
* Use this to apply dependency sorting or registry-based filtering.
*/
extensionPathsTransform?: (paths: string[]) => { paths: string[]; diagnostics?: string[] };
extensionsOverride?: (base: LoadExtensionsResult) => LoadExtensionsResult;
skillsOverride?: (base: { skills: Skill[]; diagnostics: ResourceDiagnostic[] }) => {
skills: Skill[];
@ -167,6 +173,7 @@ export class DefaultResourceLoader implements ResourceLoader {
private systemPromptSource?: string;
private appendSystemPromptSource?: string;
private bundledExtensionNames: Set<string>;
private extensionPathsTransform?: (paths: string[]) => { paths: string[]; diagnostics?: string[] };
private extensionsOverride?: (base: LoadExtensionsResult) => LoadExtensionsResult;
private skillsOverride?: (base: { skills: Skill[]; diagnostics: ResourceDiagnostic[] }) => {
skills: Skill[];
@ -223,6 +230,7 @@ export class DefaultResourceLoader implements ResourceLoader {
this.systemPromptSource = options.systemPrompt;
this.appendSystemPromptSource = options.appendSystemPrompt;
this.bundledExtensionNames = options.bundledExtensionNames ?? new Set();
this.extensionPathsTransform = options.extensionPathsTransform;
this.extensionsOverride = options.extensionsOverride;
this.skillsOverride = options.skillsOverride;
this.promptsOverride = options.promptsOverride;
@ -378,10 +386,21 @@ export class DefaultResourceLoader implements ResourceLoader {
const cliEnabledPrompts = getEnabledPaths(cliExtensionPaths.prompts);
const cliEnabledThemes = getEnabledPaths(cliExtensionPaths.themes);
const extensionPaths = this.noExtensions
let extensionPaths = this.noExtensions
? cliEnabledExtensions
: this.mergePaths(cliEnabledExtensions, enabledExtensions);
// Apply path transform (dependency sorting, registry filtering) if provided
if (this.extensionPathsTransform) {
const transformed = this.extensionPathsTransform(extensionPaths);
extensionPaths = transformed.paths;
if (transformed.diagnostics?.length) {
for (const msg of transformed.diagnostics) {
process.stderr.write(`[extensions] ${msg}\n`);
}
}
}
const extensionsResult = await loadExtensions(extensionPaths, this.cwd, this.eventBus);
const inlineExtensions = await this.loadExtensionFactories(extensionsResult.runtime);
extensionsResult.extensions.push(...inlineExtensions.extensions);

View file

@ -0,0 +1,255 @@
/**
* RetryHandler tests: long-context entitlement 429 error handling (#2803)
*
* Verifies that "Extra usage is required for long context requests" errors
* are classified as quota_exhausted (not rate_limit) and trigger a model
* downgrade from [1m] to base when no cross-provider fallback exists.
*/
import { describe, it, beforeEach, mock, type Mock } from "node:test";
import assert from "node:assert/strict";
import { RetryHandler, type RetryHandlerDeps } from "./retry-handler.js";
import type { Api, AssistantMessage, Model } from "@gsd/pi-ai";
import type { FallbackResolver } from "./fallback-resolver.js";
import type { ModelRegistry } from "./model-registry.js";
import type { SettingsManager } from "./settings-manager.js";
// ─── Helpers ────────────────────────────────────────────────────────────────
function createMockModel(provider: string, id: string): Model<Api> {
return {
id,
name: id,
api: "anthropic" as Api,
provider,
baseUrl: "https://api.anthropic.com",
reasoning: false,
input: ["text"],
cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
contextWindow: 1_000_000,
maxTokens: 16384,
} as Model<Api>;
}
function errorMessage(msg: string): AssistantMessage {
return {
role: "assistant",
content: [],
api: "anthropic-messages",
provider: "anthropic",
model: "claude-opus-4-6[1m]",
usage: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0, totalTokens: 0, cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0, total: 0 } },
stopReason: "error",
errorMessage: msg,
timestamp: Date.now(),
} as AssistantMessage;
}
interface MockDeps {
deps: RetryHandlerDeps;
emittedEvents: Array<Record<string, any>>;
continueFn: Mock<() => Promise<void>>;
onModelChangeFn: Mock<(model: Model<any>) => void>;
markUsageLimitReached: Mock<(...args: any[]) => boolean>;
findFallback: Mock<(...args: any[]) => Promise<any>>;
findModel: Mock<(provider: string, modelId: string) => Model<Api> | undefined>;
}
function createMockDeps(overrides?: {
model?: Model<Api>;
retryEnabled?: boolean;
markUsageLimitReachedResult?: boolean;
fallbackResult?: any;
findModelResult?: (provider: string, modelId: string) => Model<Api> | undefined;
}): MockDeps {
const model = overrides?.model ?? createMockModel("anthropic", "claude-opus-4-6[1m]");
const emittedEvents: Array<Record<string, any>> = [];
const continueFn = mock.fn(async () => {});
const onModelChangeFn = mock.fn((_model: Model<any>) => {});
const markUsageLimitReached = mock.fn(
() => overrides?.markUsageLimitReachedResult ?? false,
);
const findFallback = mock.fn(async () => overrides?.fallbackResult ?? null);
const findModel = mock.fn(
overrides?.findModelResult ?? ((_provider: string, _modelId: string) => undefined),
);
const messages: Array<{ role: string } & Record<string, any>> = [];
const deps: RetryHandlerDeps = {
agent: {
continue: continueFn,
state: { messages },
setModel: mock.fn(),
replaceMessages: mock.fn((newMessages: any[]) => {
messages.length = 0;
messages.push(...newMessages);
}),
} as any,
settingsManager: {
getRetryEnabled: () => overrides?.retryEnabled ?? true,
getRetrySettings: () => ({
enabled: overrides?.retryEnabled ?? true,
maxRetries: 5,
baseDelayMs: 1000,
maxDelayMs: 30000,
}),
} as unknown as SettingsManager,
modelRegistry: {
authStorage: {
markUsageLimitReached,
},
find: findModel,
} as unknown as ModelRegistry,
fallbackResolver: {
findFallback,
} as unknown as FallbackResolver,
getModel: () => model,
getSessionId: () => "test-session",
emit: (event: any) => emittedEvents.push(event),
onModelChange: onModelChangeFn,
};
return { deps, emittedEvents, continueFn, onModelChangeFn, markUsageLimitReached, findFallback, findModel };
}
// ─── _classifyErrorType (tested via handleRetryableError behavior) ──────────
describe("RetryHandler — long-context entitlement 429 (#2803)", () => {
describe("error classification", () => {
it("classifies 'Extra usage is required for long context requests' as quota_exhausted, not rate_limit", async () => {
// When the error is classified as quota_exhausted AND no alternate credentials
// AND no fallback, the handler should emit fallback_chain_exhausted and stop.
// If misclassified as rate_limit, it would enter the backoff loop instead.
const { deps, emittedEvents, findModel } = createMockDeps({
model: createMockModel("anthropic", "claude-opus-4-6[1m]"),
markUsageLimitReachedResult: false, // no alternate credentials
fallbackResult: null, // no cross-provider fallback
findModelResult: () => undefined, // no base model either
});
const handler = new RetryHandler(deps);
const msg = errorMessage(
'429 {"type":"error","error":{"type":"rate_limit_error","message":"Extra usage is required for long context requests."}}'
);
const result = await handler.handleRetryableError(msg);
// Should NOT retry (would be true if misclassified as rate_limit entering backoff)
assert.equal(result, false);
// Should emit fallback_chain_exhausted (quota_exhausted path), NOT auto_retry_start (backoff path)
const chainExhausted = emittedEvents.find((e) => e.type === "fallback_chain_exhausted");
assert.ok(chainExhausted, "Expected fallback_chain_exhausted event for entitlement error");
const retryStart = emittedEvents.find((e) => e.type === "auto_retry_start");
assert.equal(retryStart, undefined, "Should NOT emit auto_retry_start for entitlement error");
});
it("still classifies regular 429 rate limits as rate_limit", async () => {
// A normal "rate limit" 429 should still be classified as rate_limit
const { deps, emittedEvents } = createMockDeps({
model: createMockModel("anthropic", "claude-opus-4-6"),
markUsageLimitReachedResult: false,
fallbackResult: null,
});
const handler = new RetryHandler(deps);
const msg = errorMessage("429 Too Many Requests");
const result = await handler.handleRetryableError(msg);
// Should enter the backoff loop (rate_limit path, not quota_exhausted)
assert.equal(result, true);
const retryStart = emittedEvents.find((e) => e.type === "auto_retry_start");
assert.ok(retryStart, "Regular 429 should enter backoff retry");
});
});
describe("long-context model downgrade", () => {
it("downgrades from [1m] to base model when entitlement error and no fallback", async () => {
const baseModel = createMockModel("anthropic", "claude-opus-4-6");
const { deps, emittedEvents, onModelChangeFn, continueFn } = createMockDeps({
model: createMockModel("anthropic", "claude-opus-4-6[1m]"),
markUsageLimitReachedResult: false,
fallbackResult: null,
findModelResult: (provider: string, modelId: string) => {
if (provider === "anthropic" && modelId === "claude-opus-4-6") return baseModel;
return undefined;
},
});
const handler = new RetryHandler(deps);
const msg = errorMessage("Extra usage is required for long context requests.");
const result = await handler.handleRetryableError(msg);
assert.equal(result, true, "Should retry after downgrade");
// Should have called setModel with the base model
const setModelCalls = (deps.agent.setModel as any).mock.calls;
assert.equal(setModelCalls.length, 1);
assert.equal(setModelCalls[0].arguments[0].id, "claude-opus-4-6");
// Should have notified about model change
assert.equal(onModelChangeFn.mock.calls.length, 1);
// Should emit a fallback_provider_switch event indicating downgrade
const switchEvent = emittedEvents.find((e) => e.type === "fallback_provider_switch");
assert.ok(switchEvent, "Expected fallback_provider_switch event for downgrade");
assert.ok(switchEvent!.reason.includes("long context downgrade"), `reason should mention downgrade: ${switchEvent!.reason}`);
});
it("emits fallback_chain_exhausted when base model is also unavailable", async () => {
const { deps, emittedEvents } = createMockDeps({
model: createMockModel("anthropic", "claude-opus-4-6[1m]"),
markUsageLimitReachedResult: false,
fallbackResult: null,
findModelResult: () => undefined, // base model not found
});
const handler = new RetryHandler(deps);
const msg = errorMessage("Extra usage is required for long context requests.");
const result = await handler.handleRetryableError(msg);
assert.equal(result, false);
const chainExhausted = emittedEvents.find((e) => e.type === "fallback_chain_exhausted");
assert.ok(chainExhausted, "Expected fallback_chain_exhausted when base model unavailable");
});
it("does not attempt downgrade for non-[1m] models", async () => {
// When a regular model (no [1m] suffix) gets a quota_exhausted error
// with no fallback, it should just stop — no downgrade attempt.
const { deps, emittedEvents } = createMockDeps({
model: createMockModel("anthropic", "claude-opus-4-6"),
markUsageLimitReachedResult: false,
fallbackResult: null,
});
const handler = new RetryHandler(deps);
const msg = errorMessage("Extra usage is required for long context requests.");
const result = await handler.handleRetryableError(msg);
assert.equal(result, false);
const chainExhausted = emittedEvents.find((e) => e.type === "fallback_chain_exhausted");
assert.ok(chainExhausted);
// No downgrade switch should occur
const switchEvent = emittedEvents.find((e) => e.type === "fallback_provider_switch");
assert.equal(switchEvent, undefined, "Should not switch for non-[1m] models");
});
});
describe("isRetryableError", () => {
it("considers long-context entitlement error as retryable", () => {
const { deps } = createMockDeps();
const handler = new RetryHandler(deps);
const msg = errorMessage("Extra usage is required for long context requests.");
assert.equal(handler.isRetryableError(msg), true);
});
});
});


@@ -107,7 +107,7 @@ export class RetryHandler {
if (isContextOverflow(message, contextWindow)) return false;
const err = message.errorMessage;
return /overloaded|rate.?limit|too many requests|429|500|502|503|504|service.?unavailable|server.?error|internal.?error|connection.?error|connection.?refused|other side closed|fetch failed|upstream.?connect|reset before headers|terminated|retry delay|network.?(?:is\s+)?unavailable|credentials.*expired|temporarily backed off/i.test(
return /overloaded|rate.?limit|too many requests|429|500|502|503|504|service.?unavailable|server.?error|internal.?error|connection.?error|connection.?refused|other side closed|fetch failed|upstream.?connect|reset before headers|terminated|retry delay|network.?(?:is\s+)?unavailable|credentials.*expired|temporarily backed off|extra usage is required/i.test(
err,
);
}
@@ -202,6 +202,10 @@ export class RetryHandler {
// No fallback available either
if (errorType === "quota_exhausted") {
// Try long-context model downgrade ([1m] → base) before giving up
const downgraded = this._tryLongContextDowngrade(message);
if (downgraded) return true;
this._deps.emit({
type: "fallback_chain_exhausted",
reason: `All providers exhausted for ${this._deps.getModel()!.provider}/${this._deps.getModel()!.id}`,
@@ -343,12 +347,59 @@
*/
private _classifyErrorType(errorMessage: string): UsageLimitErrorType {
const err = errorMessage.toLowerCase();
// Long-context entitlement errors are billing gates, not transient rate limits.
// Must be checked before the generic 429/rate_limit regex.
if (/extra usage is required|long context required/i.test(err)) return "quota_exhausted";
if (/quota|billing|exceeded.*limit|usage.*limit/i.test(err)) return "quota_exhausted";
if (/rate.?limit|too many requests|429/i.test(err)) return "rate_limit";
if (/500|502|503|504|server.?error|internal.?error|service.?unavailable/i.test(err)) return "server_error";
return "unknown";
}
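The ordering matters: the entitlement check must precede the generic 429 branch, or the API's `rate_limit_error` envelope would win. A standalone sketch of the same ordering (illustrative names, not the real class; regexes copied from the diff):

```typescript
// Standalone sketch of the classification order in _classifyErrorType.
// ErrorKind mirrors UsageLimitErrorType.
type ErrorKind = "quota_exhausted" | "rate_limit" | "server_error" | "unknown";

function classify(errorMessage: string): ErrorKind {
  const err = errorMessage.toLowerCase();
  // Entitlement gate: checked before the generic 429 branch, because the
  // API wraps it in a rate_limit_error envelope.
  if (/extra usage is required|long context required/.test(err)) return "quota_exhausted";
  if (/quota|billing|exceeded.*limit|usage.*limit/.test(err)) return "quota_exhausted";
  if (/rate.?limit|too many requests|429/.test(err)) return "rate_limit";
  if (/500|502|503|504|server.?error|internal.?error|service.?unavailable/.test(err)) return "server_error";
  return "unknown";
}
```

With this ordering, a 429 carrying "Extra usage is required for long context requests" classifies as `quota_exhausted`, while a bare `429 Too Many Requests` still lands in `rate_limit`.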
/**
* Attempt to downgrade a long-context model (e.g. claude-opus-4-6[1m]) to its
* base model (claude-opus-4-6) when the account lacks the long-context billing
* entitlement. Returns true if the downgrade was initiated.
*/
private _tryLongContextDowngrade(message: AssistantMessage): boolean {
const currentModel = this._deps.getModel();
if (!currentModel) return false;
// Only attempt downgrade for [1m] (or similar long-context) model IDs
const match = currentModel.id.match(/^(.+)\[\d+m\]$/);
if (!match) return false;
const baseModelId = match[1];
const baseModel = this._deps.modelRegistry.find(currentModel.provider, baseModelId);
if (!baseModel) return false;
const previousId = currentModel.id;
this._deps.agent.setModel(baseModel);
this._deps.onModelChange(baseModel);
this._removeLastAssistantError();
this._deps.emit({
type: "fallback_provider_switch",
from: `${currentModel.provider}/${previousId}`,
to: `${baseModel.provider}/${baseModel.id}`,
reason: `long context downgrade: ${previousId} → ${baseModel.id}`,
});
this._deps.emit({
type: "auto_retry_start",
attempt: this._retryAttempt + 1,
maxAttempts: this._deps.settingsManager.getRetrySettings().maxRetries,
delayMs: 0,
errorMessage: `${message.errorMessage} (long context downgrade)`,
});
setTimeout(() => {
this._deps.agent.continue().catch(() => {});
}, 0);
return true;
}
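The `[1m]` suffix handling above can be exercised in isolation. `baseIdFor` is an illustrative helper, not part of the codebase; the regex is the one from the method:

```typescript
// Derive the base model ID from a long-context ID using the same regex
// as _tryLongContextDowngrade. Returns null when there is no [Nm] suffix.
function baseIdFor(modelId: string): string | null {
  const match = modelId.match(/^(.+)\[\d+m\]$/);
  return match ? match[1]! : null;
}
```

`claude-opus-4-6[1m]` maps to `claude-opus-4-6`; IDs without a `[Nm]` suffix return `null`, so the downgrade path is skipped for base models.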
/** Remove the last assistant error message from agent state */
private _removeLastAssistantError(): void {
const messages = this._deps.agent.state.messages;


@@ -123,12 +123,15 @@ export function createHashlineReadTool(cwd: string, options?: HashlineReadToolOp
const allLines = textContent.split("\n");
const totalFileLines = allLines.length;
const startLine = offset ? Math.max(0, offset - 1) : 0;
const startLineDisplay = startLine + 1;
let startLine = offset ? Math.max(0, offset - 1) : 0;
// Clamp offset to file bounds instead of throwing (#3007)
let offsetClamped = false;
if (startLine >= allLines.length) {
throw new Error(`Offset ${offset} is beyond end of file (${allLines.length} lines total)`);
startLine = Math.max(0, allLines.length - 1);
offsetClamped = true;
}
const startLineDisplay = startLine + 1;
let selectedContent: string;
let userLimitedLines: number | undefined;
@@ -172,6 +175,11 @@ export function createHashlineReadTool(cwd: string, options?: HashlineReadToolOp
outputText = formatHashLines(truncation.content, startLineDisplay);
}
// Prepend clamp notice so the agent knows offset was adjusted
if (offsetClamped) {
outputText = `[Offset ${offset} beyond end of file (${totalFileLines} lines). Clamped to line ${startLineDisplay}.]\n\n${outputText}`;
}
content = [{ type: "text", text: outputText }];
}


@@ -133,13 +133,18 @@ export function createReadTool(cwd: string, options?: ReadToolOptions): AgentToo
const totalFileLines = allLines.length;
// Apply offset if specified (1-indexed to 0-indexed)
const startLine = offset ? Math.max(0, offset - 1) : 0;
const startLineDisplay = startLine + 1; // For display (1-indexed)
let startLine = offset ? Math.max(0, offset - 1) : 0;
// Check if offset is out of bounds
// Clamp offset to file bounds instead of throwing (#3007).
// When an agent requests offset:30 on a 13-line file, return
// the last line with a notice rather than an error that
// propagates as invalid JSON downstream.
let offsetClamped = false;
if (startLine >= allLines.length) {
throw new Error(`Offset ${offset} is beyond end of file (${allLines.length} lines total)`);
startLine = Math.max(0, allLines.length - 1);
offsetClamped = true;
}
const startLineDisplay = startLine + 1; // For display (1-indexed)
// If limit is specified by user, use it; otherwise we'll let truncateHead decide
let selectedContent: string;
@@ -187,6 +192,11 @@ export function createReadTool(cwd: string, options?: ReadToolOptions): AgentToo
outputText = truncation.content;
}
// Prepend clamp notice so the agent knows offset was adjusted
if (offsetClamped) {
outputText = `[Offset ${offset} beyond end of file (${totalFileLines} lines). Clamped to line ${startLineDisplay}.]\n\n${outputText}`;
}
content = [{ type: "text", text: outputText }];
}
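The clamp logic shared by both read tools reduces to a small pure function. A sketch under the same 1-indexed offset convention (`clampOffset` is an illustrative name, not part of the tool):

```typescript
// Pure-function sketch of the offset clamp (#3007).
function clampOffset(offset: number, totalLines: number): { startLine: number; clamped: boolean } {
  let startLine = Math.max(0, offset - 1); // 1-indexed offset → 0-indexed line
  let clamped = false;
  if (startLine >= totalLines) {
    // Point at the last line instead of throwing, and flag the adjustment
    // so a notice can be prepended to the tool output.
    startLine = Math.max(0, totalLines - 1);
    clamped = true;
  }
  return { startLine, clamped };
}
```

So `offset: 30` on a 13-line file yields `startLine: 12` with `clamped: true`, matching the notice both tools prepend.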


@@ -0,0 +1,92 @@
/**
 * spawn-shell-windows.test.ts — regression test for Windows spawn ENOENT/EINVAL.
*
* On Windows, npm/npx/tsc and other tools are installed as .cmd batch scripts.
* Node's `spawn()` without `shell: true` cannot execute .cmd files, resulting
* in ENOENT or EINVAL errors. Every spawn site that may invoke a user-installed
* binary (not `node` or a shell like `sh`/`bash`/`cmd`) must include
* `shell: process.platform === "win32"` so the call is resolved through cmd.exe
* on Windows while remaining a direct exec on POSIX.
*
* This test structurally scans all spawn sites and verifies the guard is present.
*
* Fixes: gsd-build/gsd-2#2854
*/
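A spawn site satisfying the guard this test enforces might look like the following sketch. It spawns Node itself so it runs anywhere; real call sites pass npm/npx/tsc/gsd, which are `.cmd` wrappers on Windows:

```typescript
import { spawnSync } from "node:child_process";

// Illustrative guarded spawn site. The guard routes through cmd.exe only on
// Windows (where .cmd wrappers need shell resolution) and remains a direct
// exec on POSIX.
const result = spawnSync(process.execPath, ["--version"], {
  encoding: "utf-8",
  shell: process.platform === "win32",
});
console.log(result.stdout.trim());
```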
import test from "node:test";
import assert from "node:assert/strict";
import { readFileSync } from "node:fs";
import { join, dirname, relative } from "node:path";
import { fileURLToPath } from "node:url";
const __dirname = dirname(fileURLToPath(import.meta.url));
const coreDir = join(__dirname, "..");
/**
* Files that call `spawn()` with a user-facing binary (not `node`, `sh`, `bash`,
* or `cmd`) and therefore need the Windows shell guard.
*
* If a file spawns only hardcoded system binaries (like `node` in rpc-client.ts),
* it does not need the guard and should NOT appear here.
*/
const SPAWN_FILES_NEEDING_SHELL_GUARD = [
// Extension's GSD client — spawns the `gsd` binary which is a .cmd on Windows
join(coreDir, "..", "..", "..", "vscode-extension", "src", "gsd-client.ts"),
// exec.ts — used by extensions to run arbitrary commands
join(coreDir, "exec.ts"),
// LSP index — spawns project-type commands (tsc, cargo, etc.)
join(coreDir, "lsp", "index.ts"),
// LSP client — spawns LSP server binaries (npx, etc.)
join(coreDir, "lsp", "client.ts"),
// LSP mux — spawns lspmux binary
join(coreDir, "lsp", "lspmux.ts"),
// Package manager — spawns npm/yarn/pnpm
join(coreDir, "package-manager.ts"),
];
test("all spawn sites that invoke user-facing binaries include shell: process.platform === 'win32'", () => {
const failures: string[] = [];
for (const file of SPAWN_FILES_NEEDING_SHELL_GUARD) {
let content: string;
try {
content = readFileSync(file, "utf-8");
} catch {
// File may not exist in this checkout — skip
continue;
}
const lines = content.split("\n");
// Find all spawn(..., { ... }) call sites and check each one
// for the presence of `shell: process.platform === "win32"` within
// 8 lines after the spawn call.
for (let i = 0; i < lines.length; i++) {
const line = lines[i]!;
// Skip comments
if (line.trim().startsWith("//") || line.trim().startsWith("*")) continue;
// Detect a spawn() call
if (/\bspawn\(/.test(line)) {
// Look ahead up to 8 lines for the shell guard
const lookahead = lines.slice(i, i + 8).join("\n");
const hasShellGuard =
/shell:\s*process\.platform\s*===\s*["']win32["']/.test(lookahead);
if (!hasShellGuard) {
const relPath = relative(join(coreDir, "..", ".."), file);
failures.push(`${relPath}:${i + 1}`);
}
}
}
}
assert.deepEqual(
failures,
[],
`The following spawn sites are missing 'shell: process.platform === "win32"':\n` +
failures.map(f => ` - ${f}`).join("\n") +
`\nOn Windows, .cmd wrapper scripts (npm, npx, tsc, gsd) require shell ` +
`resolution. Without this guard, spawn fails with ENOENT or EINVAL.`,
);
});


@@ -68,6 +68,7 @@ export type {
Extension,
ExtensionActions,
ExtensionAPI,
ExtensionManifest,
ExtensionCommandContext,
ExtensionCommandContextActions,
ExtensionContext,
@@ -119,6 +120,8 @@ export type {
ToolCallEvent,
ToolDefinition,
ToolInfo,
SortResult,
SortWarning,
ToolRenderResultOptions,
ToolResultEvent,
TurnEndEvent,
@@ -137,6 +140,9 @@ export {
importExtensionModule,
isToolCallEventType,
isToolResultEventType,
readManifest,
readManifestFromEntryPath,
sortExtensionPaths,
wrapRegisteredTool,
wrapRegisteredTools,
wrapToolsWithExtensions,


@@ -337,5 +337,12 @@ export async function handleAgentEvent(host: InteractiveModeStateHost & {
host.showError(event.reason);
host.ui.requestRender();
break;
case "image_overflow_recovery":
host.showStatus(
`Removed ${event.strippedCount} older image(s) to comply with API limits. Retrying...`,
);
host.ui.requestRender();
break;
}
}


@@ -49,6 +49,12 @@ export class RemoteTerminal implements Terminal {
return this._rows;
}
get isTTY(): boolean {
// RemoteTerminal renders to a browser-based terminal emulator via
// the RPC bridge — it behaves like a real TTY for rendering purposes.
return true;
}
get kittyProtocolActive(): boolean {
return false;
}


@@ -9,6 +9,9 @@ const cjsRequire = createRequire(import.meta.url);
* Minimal terminal interface for TUI
*/
export interface Terminal {
// Whether stdout is a real TTY (false for pipes, e.g. RPC bridge processes)
readonly isTTY: boolean;
// Start the terminal with input and resize handlers
start(onInput: (data: string) => void, onResize: () => void): void;
@@ -63,11 +66,22 @@ export class ProcessTerminal implements Terminal {
private stdinDataHandler?: (data: string) => void;
private writeLogPath = process.env.PI_TUI_WRITE_LOG || "";
get isTTY(): boolean {
return !!process.stdout.isTTY;
}
get kittyProtocolActive(): boolean {
return this._kittyProtocolActive;
}
start(onInput: (data: string) => void, onResize: () => void): void {
// Non-TTY stdout (pipe) — skip TUI initialization entirely.
// RPC bridge processes communicate via JSON, not terminal escape codes.
// Without this guard, the render loop burns 500%+ CPU. (issue #3095)
if (!this.isTTY) {
return;
}
this.inputHandler = onInput;
this.resizeHandler = onResize;


@@ -399,6 +399,12 @@ export class TUI extends Container {
start(): void {
this.stopped = false;
// Non-TTY stdout (pipe) — skip TUI entirely to avoid burning CPU.
// RPC bridge processes have piped stdio; rendering ANSI escape codes
// to a pipe is pure waste and causes a runaway render loop. (issue #3095)
if (!this.terminal.isTTY) {
return;
}
this.terminal.start(
(data) => this.handleInput(data),
() => this.requestRender(),
@@ -458,6 +464,8 @@ export class TUI extends Container {
}
requestRender(force = false): void {
// Skip rendering on non-TTY stdout to prevent CPU burn (issue #3095)
if (!this.terminal.isTTY) return;
if (force) {
this.previousLines = [];
this.previousWidth = -1; // -1 triggers widthChanged, forcing a full clear


@@ -37,6 +37,48 @@ function newestSrcMtime(dir) {
return newest
}
/**
* Detects workspace packages whose dist/ is missing or stale.
*
* Missing dist/index.js is always reported (the package won't work at all).
*
* Staleness (src/ newer than dist/) is ONLY checked when a .git directory
* exists at root indicating a development clone. In npm tarball installs,
* file timestamps are unreliable (npm sets all files to a canonical date,
* but extraction ordering can cause src/ to appear 1-2 seconds newer than
* dist/). Attempting to rebuild in that scenario is dangerous: devDependencies
* (including TypeScript) are not installed, and any globally-installed tsc
* may produce broken output that overwrites the known-good dist/.
*
* @param {string} root Project root directory
* @param {string[]} packages Package directory names to check
* @returns {string[]} Package names that need rebuilding
*/
function detectStalePackages(root, packages) {
const packagesDir = join(root, 'packages')
const isDevClone = existsSync(join(root, '.git'))
const stale = []
for (const pkg of packages) {
const distIndex = join(packagesDir, pkg, 'dist', 'index.js')
if (!existsSync(distIndex)) {
stale.push(pkg)
continue
}
// Only check src vs dist timestamps in development clones.
// In npm tarball installs, timestamps are unreliable and rebuilding
// without devDependencies can corrupt the pre-built dist/ (#2877).
if (isDevClone) {
const distMtime = statSync(distIndex).mtimeMs
const srcMtime = newestSrcMtime(join(packagesDir, pkg, 'src'))
if (srcMtime > distMtime) {
stale.push(pkg)
}
}
}
return stale
}
if (require.main === module) {
const root = resolve(__dirname, '..')
const packagesDir = join(root, 'packages')
@@ -57,19 +99,7 @@ if (require.main === module) {
'pi-coding-agent',
]
const stale = []
for (const pkg of WORKSPACE_PACKAGES) {
const distIndex = join(packagesDir, pkg, 'dist', 'index.js')
if (!existsSync(distIndex)) {
stale.push(pkg)
continue
}
const distMtime = statSync(distIndex).mtimeMs
const srcMtime = newestSrcMtime(join(packagesDir, pkg, 'src'))
if (srcMtime > distMtime) {
stale.push(pkg)
}
}
const stale = detectStalePackages(root, WORKSPACE_PACKAGES)
if (stale.length === 0) process.exit(0)
@@ -78,6 +108,7 @@ if (require.main === module) {
for (const pkg of stale) {
const pkgDir = join(packagesDir, pkg)
try {
// execSync is safe here: the command is a hardcoded string, not user input
execSync('npm run build', { cwd: pkgDir, stdio: 'pipe' })
process.stderr.write(`${pkg}\n`)
} catch (err) {
@@ -87,4 +118,4 @@ if (require.main === module) {
}
}
module.exports = { newestSrcMtime }
module.exports = { newestSrcMtime, detectStalePackages }


@@ -16,7 +16,8 @@ import { agentDir, sessionsDir, authFilePath } from './app-paths.js'
import { initResources, buildResourceLoader, getNewerManagedResourceVersion } from './resource-loader.js'
import { ensureManagedTools } from './tool-bootstrap.js'
import { loadStoredEnvKeys } from './wizard.js'
import { getPiDefaultModelAndProvider, migratePiCredentials } from './pi-migration.js'
import { migratePiCredentials } from './pi-migration.js'
import { validateConfiguredModel } from './startup-model-validation.js'
import { shouldRunOnboarding, runOnboarding } from './onboarding.js'
import chalk from 'chalk'
import { checkForUpdates } from './update-check.js'
@@ -170,6 +171,7 @@ const hasSubcommand = cliFlags.messages.length > 0
if (!process.stdin.isTTY && !isPrintMode && !hasSubcommand && !cliFlags.listModels && !cliFlags.web) {
process.stderr.write('[gsd] Error: Interactive mode requires a terminal (TTY).\n')
process.stderr.write('[gsd] Non-interactive alternatives:\n')
process.stderr.write('[gsd] gsd auto Auto-mode (pipeable, no TUI)\n')
process.stderr.write('[gsd] gsd --print "your message" Single-shot prompt\n')
process.stderr.write('[gsd] gsd --mode rpc JSON-RPC over stdin/stdout\n')
process.stderr.write('[gsd] gsd --mode mcp MCP server over stdin/stdout\n')
@@ -300,6 +302,23 @@ if (cliFlags.messages[0] === 'headless') {
process.exit(0)
}
// `gsd auto [args...]` — shorthand for `gsd headless auto [args...]` (#2732)
// Without this, `gsd auto` falls through to the interactive TUI which hangs
// when stdin/stdout are piped (non-TTY environments).
if (cliFlags.messages[0] === 'auto') {
await ensureRtkBootstrap()
const { runHeadless, parseHeadlessArgs } = await import('./headless.js')
// Rewrite argv so parseHeadlessArgs sees: [node, gsd, headless, auto, ...rest]
const rewrittenArgv = [
process.argv[0],
process.argv[1],
'headless',
...cliFlags.messages, // ['auto', ...extra args]
]
await runHeadless(parseHeadlessArgs(rewrittenArgv))
process.exit(0)
}
// Pi's tool bootstrap can mis-detect already-installed fd/rg on some systems
// because spawnSync(..., ["--version"]) returns EPERM despite a zero exit code.
// Provision local managed binaries first so Pi sees them without probing PATH.
@@ -391,42 +410,6 @@ if (cliFlags.listModels !== undefined) {
process.exit(0)
}
// Validate configured model on startup — catches stale settings from prior installs
// (e.g. grok-2 which no longer exists) and fresh installs with no settings.
// Only resets the default when the configured model no longer exists in the registry;
// never overwrites a valid user choice.
const configuredProvider = settingsManager.getDefaultProvider()
const configuredModel = settingsManager.getDefaultModel()
const allModels = modelRegistry.getAll()
const availableModels = modelRegistry.getAvailable()
const configuredExists = configuredProvider && configuredModel &&
allModels.some((m) => m.provider === configuredProvider && m.id === configuredModel)
const configuredAvailable = configuredProvider && configuredModel &&
availableModels.some((m) => m.provider === configuredProvider && m.id === configuredModel)
if (!configuredModel || !configuredExists) {
// Model not configured at all, or removed from registry — pick a fallback.
// Only fires when the model is genuinely unknown (not just temporarily unavailable).
const piDefault = getPiDefaultModelAndProvider()
const preferred =
(piDefault
? availableModels.find((m) => m.provider === piDefault.provider && m.id === piDefault.model)
: undefined) ||
availableModels.find((m) => m.provider === 'openai' && m.id === 'gpt-5.4') ||
availableModels.find((m) => m.provider === 'openai') ||
availableModels.find((m) => m.provider === 'anthropic' && m.id === 'claude-opus-4-6') ||
availableModels.find((m) => m.provider === 'anthropic' && m.id.includes('opus')) ||
availableModels.find((m) => m.provider === 'anthropic') ||
availableModels[0]
if (preferred) {
settingsManager.setDefaultModelAndProvider(preferred.provider, preferred.id)
}
}
if (settingsManager.getDefaultThinkingLevel() !== 'off' && !configuredExists) {
settingsManager.setDefaultThinkingLevel('off')
}
// GSD always uses quiet startup — the gsd extension renders its own branded header
if (!settingsManager.getQuietStartup()) {
settingsManager.setQuietStartup(true)
@@ -477,6 +460,11 @@ if (isPrintMode) {
})
markStartup('createAgentSession')
// Validate configured model AFTER extensions have registered their models (#2626).
// Before this, extension-provided models (e.g. claude-code/*) were not yet in the
// registry, causing the user's valid choice to be silently overwritten.
validateConfiguredModel(modelRegistry, settingsManager)
if (extensionsResult.errors.length > 0) {
for (const err of extensionsResult.errors) {
// Downgrade conflicts with built-in tools to warnings (#1347)
@@ -565,6 +553,20 @@ if (!cliFlags.worktree && !isPrintMode) {
} catch { /* non-fatal */ }
}
// ---------------------------------------------------------------------------
// Auto-redirect: `gsd auto` with piped stdout → headless mode (#2732)
// When stdout is not a TTY (e.g. `gsd auto | cat`, `gsd auto > file`),
// the TUI cannot render and the process hangs. Redirect to headless mode
// which handles non-interactive output gracefully.
// ---------------------------------------------------------------------------
if (cliFlags.messages[0] === 'auto' && !process.stdout.isTTY) {
await ensureRtkBootstrap()
const { runHeadless, parseHeadlessArgs } = await import('./headless.js')
process.stderr.write('[gsd] stdout is not a terminal — running auto-mode in headless mode.\n')
await runHeadless(parseHeadlessArgs(['node', 'gsd', 'headless', ...cliFlags.messages.slice(1)]))
process.exit(0)
}
// ---------------------------------------------------------------------------
// Interactive mode — normal TTY session
// ---------------------------------------------------------------------------
@@ -611,6 +613,11 @@ const { session, extensionsResult } = await createAgentSession({
})
markStartup('createAgentSession')
// Validate configured model AFTER extensions have registered their models (#2626).
// Before this, extension-provided models (e.g. claude-code/*) were not yet in the
// registry, causing the user's valid choice to be silently overwritten.
validateConfiguredModel(modelRegistry, settingsManager)
if (extensionsResult.errors.length > 0) {
for (const err of extensionsResult.errors) {
const isSuperseded = err.error.includes("supersedes");
@@ -662,14 +669,21 @@ if (enabledModelPatterns && enabledModelPatterns.length > 0) {
}
}
if (!process.stdin.isTTY) {
process.stderr.write('[gsd] Error: Interactive mode requires a terminal (TTY).\n')
if (!process.stdin.isTTY || !process.stdout.isTTY) {
const missing = !process.stdin.isTTY && !process.stdout.isTTY
? 'stdin and stdout are'
: !process.stdin.isTTY
? 'stdin is'
: 'stdout is'
process.stderr.write(`[gsd] Error: Interactive mode requires a terminal (TTY) but ${missing} not a TTY.\n`)
process.stderr.write('[gsd] Non-interactive alternatives:\n')
process.stderr.write('[gsd] gsd auto Auto-mode (pipeable, no TUI)\n')
process.stderr.write('[gsd] gsd --print "your message" Single-shot prompt\n')
process.stderr.write('[gsd] gsd --web [path] Browser-only web mode\n')
process.stderr.write('[gsd] gsd --mode rpc JSON-RPC over stdin/stdout\n')
process.stderr.write('[gsd] gsd --mode mcp MCP server over stdin/stdout\n')
process.stderr.write('[gsd] gsd --mode text "message" Text output mode\n')
process.stderr.write('[gsd] gsd headless Auto-mode without TUI\n')
process.exit(1)
}


@@ -169,6 +169,7 @@ export function printHelp(version: string): void {
process.stdout.write(' update Update GSD to the latest version\n')
process.stdout.write(' sessions List and resume a past session\n')
process.stdout.write(' worktree <cmd> Manage worktrees (list, merge, clean, remove)\n')
process.stdout.write(' auto [args] Run auto-mode without TUI (pipeable)\n')
process.stdout.write(' headless [cmd] [args] Run /gsd commands without TUI (default: auto)\n')
process.stdout.write('\nRun gsd <subcommand> --help for subcommand-specific help.\n')
}


@@ -74,6 +74,7 @@ const LLM_PROVIDER_IDS = [
'xai',
'openrouter',
'mistral',
'ollama',
'ollama-cloud',
'custom-openai',
]
@@ -90,6 +91,7 @@ const OTHER_PROVIDERS = [
{ value: 'xai', label: 'xAI (Grok)' },
{ value: 'openrouter', label: 'OpenRouter' },
{ value: 'mistral', label: 'Mistral' },
{ value: 'ollama', label: 'Ollama (Local)' },
{ value: 'ollama-cloud', label: 'Ollama Cloud' },
{ value: 'custom-openai', label: 'Custom (OpenAI-compatible)' },
]
@@ -335,6 +337,9 @@ async function runLlmStep(p: ClackModule, pc: PicoModule, authStorage: AuthStora
if (provider === 'custom-openai') {
return await runCustomOpenAIFlow(p, pc, authStorage)
}
if (provider === 'ollama') {
return await runOllamaLocalFlow(p, pc, authStorage)
}
const label = provider === 'anthropic' ? 'Anthropic'
: provider === 'openai' ? 'OpenAI'
: OTHER_PROVIDERS.find(op => op.value === provider)?.label ?? String(provider)
@@ -444,6 +449,54 @@ async function runApiKeyFlow(
return true
}
// ─── Ollama Local Flow ───────────────────────────────────────────────────────
async function runOllamaLocalFlow(
p: ClackModule,
pc: PicoModule,
authStorage: AuthStorage,
): Promise<boolean> {
const host = process.env.OLLAMA_HOST || 'http://localhost:11434'
const s = p.spinner()
s.start(`Checking Ollama at ${host}...`)
try {
const controller = new AbortController()
const timeout = setTimeout(() => controller.abort(), 3000)
const response = await fetch(host, { signal: controller.signal })
clearTimeout(timeout)
if (response.ok) {
s.stop(`Ollama is running at ${pc.green(host)}`)
// Store a placeholder so the provider is recognized as authenticated
authStorage.set('ollama', { type: 'api_key', key: 'ollama' })
p.log.success(`${pc.green('Ollama (Local)')} configured — no API key needed`)
p.log.info(pc.dim('Models are discovered automatically from your local Ollama instance.'))
return true
} else {
s.stop('Ollama check failed')
p.log.warn(`Ollama responded with status ${response.status} at ${host}`)
}
} catch {
s.stop('Ollama not detected')
p.log.warn(`Could not reach Ollama at ${host}`)
p.log.info(pc.dim('Install Ollama from https://ollama.com and run "ollama serve"'))
p.log.info(pc.dim('Set OLLAMA_HOST if using a non-default address.'))
}
// Even if not reachable now, save the config — the extension will detect it at runtime
const proceed = await p.confirm({
message: 'Save Ollama as your provider anyway? (it will auto-detect when running)',
})
if (p.isCancel(proceed) || !proceed) return false
authStorage.set('ollama', { type: 'api_key', key: 'ollama' })
p.log.success(`${pc.green('Ollama (Local)')} saved — models will appear when Ollama is running`)
return true
}
// ─── Custom OpenAI-compatible Flow ────────────────────────────────────────────
async function runCustomOpenAIFlow(

View file

@@ -1,4 +1,4 @@
import { DefaultResourceLoader } from '@gsd/pi-coding-agent'
import { DefaultResourceLoader, sortExtensionPaths } from '@gsd/pi-coding-agent'
import { createHash } from 'node:crypto'
import { homedir } from 'node:os'
import { chmodSync, copyFileSync, cpSync, existsSync, lstatSync, mkdirSync, openSync, closeSync, readFileSync, readlinkSync, readdirSync, rmSync, statSync, symlinkSync, unlinkSync, writeFileSync } from 'node:fs'
@@ -603,5 +603,21 @@ export function buildResourceLoader(agentDir: string): DefaultResourceLoader {
agentDir,
additionalExtensionPaths: piExtensionPaths,
bundledExtensionNames: bundledKeys,
extensionPathsTransform: (paths: string[]) => {
// 1. Filter community extensions through the GSD registry
const filteredPaths = paths.filter((entryPath) => {
const manifest = readManifestFromEntryPath(entryPath)
if (!manifest) return true // no manifest = always load
return isExtensionEnabled(registry, manifest.id)
})
// 2. Sort in topological dependency order
const { sortedPaths, warnings } = sortExtensionPaths(filteredPaths)
return {
paths: sortedPaths,
diagnostics: warnings.map((w) => w.message),
}
},
} as ConstructorParameters<typeof DefaultResourceLoader>[0])
}
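The real `sortExtensionPaths` lives in `@gsd/pi-coding-agent` and is not shown in this diff; a minimal sketch of the Kahn-style BFS it performs, assuming each entry path resolves to a manifest with an `id` and an optional `dependencies.extensions` list (the `ManifestLite` shape and `sortByDependencies` name below are illustrative, not the real GSD types):

```typescript
interface ManifestLite {
  id: string;
  dependencies?: { extensions?: string[] };
}

// Kahn's BFS: emit a load order where every extension's dependencies come
// first. Unknown dependency ids contribute no edge; members of a cycle are
// appended at the end with a warning so loading still proceeds.
function sortByDependencies(
  manifests: ManifestLite[],
): { sorted: ManifestLite[]; warnings: string[] } {
  const byId = new Map(manifests.map((m) => [m.id, m]));
  const indegree = new Map(manifests.map((m) => [m.id, 0]));
  const dependents = new Map<string, string[]>();

  for (const m of manifests) {
    for (const dep of m.dependencies?.extensions ?? []) {
      if (!byId.has(dep)) continue; // unknown dep: not an ordering edge
      indegree.set(m.id, (indegree.get(m.id) ?? 0) + 1);
      dependents.set(dep, [...(dependents.get(dep) ?? []), m.id]);
    }
  }

  const queue = manifests.filter((m) => indegree.get(m.id) === 0).map((m) => m.id);
  const sorted: ManifestLite[] = [];
  while (queue.length > 0) {
    const id = queue.shift()!;
    sorted.push(byId.get(id)!);
    for (const next of dependents.get(id) ?? []) {
      indegree.set(next, indegree.get(next)! - 1);
      if (indegree.get(next) === 0) queue.push(next);
    }
  }

  // Anything not emitted sits on a cycle; keep it loadable but warn.
  const warnings: string[] = [];
  for (const m of manifests) {
    if (!sorted.includes(m)) {
      warnings.push(`dependency cycle involving "${m.id}"`);
      sorted.push(m);
    }
  }
  return { sorted, warnings };
}
```

Because filtering happens before the sort in `extensionPathsTransform`, a disabled extension also drops its outgoing edges, so dependents of a disabled extension still load (just without the ordering guarantee).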

View file

@@ -1,7 +1,7 @@
---
name: researcher
description: Web researcher that finds and synthesizes current information using Brave Search
tools: web_search, bash
tools: search-the-web, bash
---
You are a web researcher. You find current, accurate information using web search and synthesize it into a clear, well-structured report.

View file

@@ -162,9 +162,27 @@ export default function AskUserQuestions(pi: ExtensionAPI) {
if (selected === undefined) {
return errorResult("ask_user_questions was cancelled", params.questions);
}
answers[q.id] = {
answers: Array.isArray(selected) ? selected : [selected],
};
// When the user picks "None of the above" on a single-select
// question, prompt for a free-text explanation so they are not
// trapped in a re-asking loop (bug #2715).
let freeTextNote = "";
const selectedStr = Array.isArray(selected) ? selected[0] : selected;
if (!q.allowMultiple && selectedStr === OTHER_OPTION_LABEL) {
const note = await ctx.ui.input(
`${q.header}: Please explain in your own words`,
"Type your answer here…",
);
if (note) {
freeTextNote = note;
}
}
const answerList = Array.isArray(selected) ? selected : [selected];
if (freeTextNote) {
answerList.push(`user_note: ${freeTextNote}`);
}
answers[q.id] = { answers: answerList };
}
const roundResult: RoundResult = {
endInterview: false,

View file

@@ -8,6 +8,6 @@
"provides": {
"tools": ["async_bash", "await_job", "cancel_job"],
"commands": ["jobs"],
"hooks": ["session_start"]
"hooks": ["session_start", "session_before_switch", "session_shutdown"]
}
}

View file

@@ -8,7 +8,7 @@
"provides": {
"tools": ["bg_shell"],
"commands": ["bg"],
"hooks": ["session_shutdown"],
"hooks": ["session_shutdown", "session_compact", "session_tree", "session_switch", "before_agent_start", "session_start", "turn_end", "agent_end", "tool_execution_end"],
"shortcuts": ["Ctrl+Alt+B"]
}
}

View file

@@ -29,7 +29,7 @@
"browser_visual_diff", "browser_zoom_region",
"browser_generate_test", "browser_action_cache", "browser_check_injection"
],
"hooks": ["session_shutdown"]
"hooks": ["session_start", "session_shutdown"]
},
"dependencies": {
"runtime": ["playwright"]

View file

@@ -16,6 +16,7 @@ import type {
Usage,
WebSearchResultContent,
} from "@gsd/pi-ai";
import { repairToolJson } from "@gsd/pi-ai";
import type { BetaContentBlock, BetaRawMessageStreamEvent, NonNullableUsage } from "./sdk-types.js";
// ---------------------------------------------------------------------------
@@ -244,12 +245,18 @@ export class PartialMessageBuilder {
try {
block.arguments = JSON.parse(jsonStr);
} catch {
// Stream was truncated mid-tool-call — JSON is garbage.
// Preserve the raw string for diagnostics but signal the
// malformation explicitly so downstream consumers can
// distinguish this from a healthy tool completion (#2574).
block.arguments = { _raw: jsonStr };
return { type: "toolcall_end", contentIndex, toolCall: block, partial: this.partial, malformedArguments: true };
// JSON.parse failed — attempt repair for YAML-style bullet
// lists that LLMs copy from template formatting (#2660).
try {
block.arguments = JSON.parse(repairToolJson(jsonStr));
} catch {
// Repair also failed — stream was truncated or garbage.
// Preserve the raw string for diagnostics but signal the
// malformation explicitly so downstream consumers can
// distinguish this from a healthy tool completion (#2574).
block.arguments = { _raw: jsonStr };
return { type: "toolcall_end", contentIndex, toolCall: block, partial: this.partial, malformedArguments: true };
}
}
return { type: "toolcall_end", contentIndex, toolCall: block, partial: this.partial };
}
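`repairToolJson` is imported from `@gsd/pi-ai` and its real implementation is not part of this diff; a rough sketch of the bullet-list repair idea it targets (#2660), rewriting bare `- item` values into JSON string arrays, could look like this (`repairYamlBullets` is a hypothetical name, and the real function may handle more cases):

```typescript
// Hypothetical sketch only: rewrites `"key": - item` runs (YAML-style
// bullets that LLMs sometimes emit inside tool-call JSON) into
// `"key": ["item"]` so a second JSON.parse attempt can succeed.
function repairYamlBullets(json: string): string {
  return json.replace(
    /:\s*((?:-\s[^,}\n]*)(?:\s*\n\s*-\s[^,}\n]*)*)/g,
    (_match: string, run: string) => {
      const items = run
        .split(/\n?\s*-\s+/)      // split the bullet run into items
        .map((s) => s.trim())
        .filter(Boolean)
        .map((s) => JSON.stringify(s));
      return `: [${items.join(", ")}]`;
    },
  );
}
```

The pattern only fires when a value starts with `- `, so quoted values like `"title": "done"` pass through untouched; it does not guard against `: - ` sequences occurring inside quoted strings, which is one reason the real repair stays behind a second try/catch.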

View file

@@ -23,9 +23,6 @@ import type {
SDKMessage,
SDKPartialAssistantMessage,
SDKResultMessage,
SDKSystemMessage,
SDKStatusMessage,
SDKUserMessage,
} from "./sdk-types.js";
// ---------------------------------------------------------------------------
@@ -71,30 +68,49 @@ function getClaudePath(): string {
}
// ---------------------------------------------------------------------------
// Prompt extraction
// Prompt construction
// ---------------------------------------------------------------------------
/**
* Extract the last user prompt text from GSD's context messages.
* The SDK manages its own conversation history; we only send
* the latest user message as the prompt.
* Extract text content from a single message regardless of content shape.
*/
function extractLastUserPrompt(context: Context): string {
for (let i = context.messages.length - 1; i >= 0; i--) {
const msg = context.messages[i];
if (msg.role === "user") {
if (typeof msg.content === "string") return msg.content;
if (Array.isArray(msg.content)) {
const textParts = msg.content
.filter((part: any) => part.type === "text")
.map((part: any) => part.text);
if (textParts.length > 0) return textParts.join("\n");
}
}
function extractMessageText(msg: { role: string; content: unknown }): string {
if (typeof msg.content === "string") return msg.content;
if (Array.isArray(msg.content)) {
const textParts = msg.content
.filter((part: any) => part.type === "text")
.map((part: any) => part.text ?? part.thinking ?? "");
if (textParts.length > 0) return textParts.join("\n");
}
return "";
}
/**
* Build a full conversational prompt from GSD's context messages.
*
* Previous behaviour sent only the last user message, making every SDK
* call effectively stateless. This version serialises the complete
* conversation history (system prompt + all user/assistant turns) so
* Claude Code has full context for multi-turn continuity.
*/
export function buildPromptFromContext(context: Context): string {
const parts: string[] = [];
if (context.systemPrompt) {
parts.push(`[System]\n${context.systemPrompt}`);
}
for (const msg of context.messages) {
const text = extractMessageText(msg);
if (!text) continue;
const label = msg.role === "user" ? "User" : msg.role === "assistant" ? "Assistant" : "System";
parts.push(`[${label}]\n${text}`);
}
return parts.join("\n\n");
}
// ---------------------------------------------------------------------------
// Error helper
// ---------------------------------------------------------------------------
@@ -127,6 +143,31 @@ export function makeStreamExhaustedErrorMessage(model: string, lastTextContent:
return message;
}
// ---------------------------------------------------------------------------
// SDK options builder
// ---------------------------------------------------------------------------
/**
* Build the options object passed to the Claude Agent SDK's `query()` call.
*
* Extracted for testability: callers can verify session persistence,
* beta flags, and other configuration without mocking the full SDK.
*/
export function buildSdkOptions(modelId: string, prompt: string): Record<string, unknown> {
return {
pathToClaudeCodeExecutable: getClaudePath(),
model: modelId,
includePartialMessages: true,
persistSession: true,
cwd: process.cwd(),
permissionMode: "bypassPermissions",
allowDangerouslySkipPermissions: true,
settingSources: ["project"],
systemPrompt: { type: "preset", preset: "claude_code" },
betas: modelId.includes("sonnet") ? ["context-1m-2025-08-07"] : [],
};
}
// ---------------------------------------------------------------------------
// streamSimple implementation
// ---------------------------------------------------------------------------
@@ -180,22 +221,14 @@ async function pumpSdkMessages(
options.signal.addEventListener("abort", () => controller.abort(), { once: true });
}
const prompt = extractLastUserPrompt(context);
const prompt = buildPromptFromContext(context);
const sdkOpts = buildSdkOptions(modelId, prompt);
const queryResult = sdk.query({
prompt,
options: {
pathToClaudeCodeExecutable: getClaudePath(),
model: modelId,
includePartialMessages: true,
persistSession: false,
...sdkOpts,
abortController: controller,
cwd: process.cwd(),
permissionMode: "bypassPermissions",
allowDangerouslySkipPermissions: true,
settingSources: ["project"],
systemPrompt: { type: "preset", preset: "claude_code" },
betas: modelId.includes("sonnet") ? ["context-1m-2025-08-07"] : [],
},
});
@@ -225,7 +258,6 @@ async function pumpSdkMessages(
// -- Streaming partial messages --
case "stream_event": {
const partial = msg as SDKPartialAssistantMessage;
if (partial.parent_tool_use_id !== null) break; // skip subagent
const event = partial.event;
@@ -256,7 +288,6 @@ async function pumpSdkMessages(
// -- Complete assistant message (non-streaming fallback) --
case "assistant": {
const sdkAssistant = msg as SDKAssistantMessage;
if (sdkAssistant.parent_tool_use_id !== null) break;
// Capture text content from complete messages
for (const block of sdkAssistant.message.content) {
@@ -271,9 +302,6 @@ async function pumpSdkMessages(
// -- User message (synthetic tool result — signals turn boundary) --
case "user": {
const userMsg = msg as SDKUserMessage;
if (userMsg.parent_tool_use_id !== null) break;
// Capture content from the completed turn before resetting
if (builder) {
for (const block of builder.message.content) {

View file

@@ -102,4 +102,32 @@ describe("PartialMessageBuilder — malformed tool arguments (#2574)", () => {
"non-JSON content should set malformedArguments: true",
);
});
test("YAML bullet lists repaired to JSON arrays (#2660)", () => {
const builder = new PartialMessageBuilder("claude-sonnet-4-20250514");
const malformedJson =
'{"milestoneId": "M005", "keyDecisions": - Used Web Notification API, "keyFiles": - src/lib.rs, "title": "done"}';
const event = feedToolCall(builder, [malformedJson]);
assert.ok(event, "event should not be null");
assert.equal(event!.type, "toolcall_end");
// Repaired YAML bullets should NOT set malformedArguments
assert.equal(
(event as any).malformedArguments,
undefined,
"repaired YAML bullets should not set malformedArguments",
);
if (event!.type === "toolcall_end") {
assert.equal(event!.toolCall.arguments.milestoneId, "M005");
assert.ok(
Array.isArray(event!.toolCall.arguments.keyDecisions),
"keyDecisions should be repaired to an array",
);
assert.ok(
Array.isArray(event!.toolCall.arguments.keyFiles),
"keyFiles should be repaired to an array",
);
assert.equal(event!.toolCall.arguments.title, "done");
}
});
});

View file

@@ -1,6 +1,15 @@
import { describe, test } from "node:test";
import assert from "node:assert/strict";
import { makeStreamExhaustedErrorMessage } from "../stream-adapter.ts";
import {
makeStreamExhaustedErrorMessage,
buildPromptFromContext,
buildSdkOptions,
} from "../stream-adapter.ts";
import type { Context, Message } from "@gsd/pi-ai";
// ---------------------------------------------------------------------------
// Existing tests — exhausted stream fallback (#2575)
// ---------------------------------------------------------------------------
describe("stream-adapter — exhausted stream fallback (#2575)", () => {
test("generator exhaustion becomes an error message instead of clean completion", () => {
@@ -19,3 +28,101 @@ describe("stream-adapter — exhausted stream fallback (#2575)", () => {
assert.match(String((message.content[0] as any)?.text ?? ""), /Claude Code error: stream_exhausted_without_result/);
});
});
// ---------------------------------------------------------------------------
// Bug #2859 — stateless provider regression tests
// ---------------------------------------------------------------------------
describe("stream-adapter — full context prompt (#2859)", () => {
test("buildPromptFromContext includes all user and assistant messages, not just the last user message", () => {
const context: Context = {
systemPrompt: "You are a helpful assistant.",
messages: [
{ role: "user", content: "What is 2+2?" } as Message,
{
role: "assistant",
content: [{ type: "text", text: "4" }],
api: "anthropic-messages",
provider: "claude-code",
model: "claude-sonnet-4-20250514",
usage: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0, totalTokens: 0, cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0, total: 0 } },
stopReason: "stop",
timestamp: Date.now(),
} as Message,
{ role: "user", content: "Now multiply that by 3" } as Message,
],
};
const prompt = buildPromptFromContext(context);
// Must contain content from BOTH user messages, not just the last
assert.ok(prompt.includes("2+2"), "prompt must include first user message");
assert.ok(prompt.includes("multiply"), "prompt must include second user message");
// Must contain assistant response for continuity
assert.ok(prompt.includes("4"), "prompt must include assistant reply for context");
});
test("buildPromptFromContext includes system prompt when present", () => {
const context: Context = {
systemPrompt: "You are a coding assistant.",
messages: [
{ role: "user", content: "Write a function" } as Message,
],
};
const prompt = buildPromptFromContext(context);
assert.ok(prompt.includes("coding assistant"), "prompt must include system prompt");
});
test("buildPromptFromContext handles array content parts in user messages", () => {
const context: Context = {
messages: [
{
role: "user",
content: [
{ type: "text", text: "First part" },
{ type: "text", text: "Second part" },
],
} as Message,
{ role: "user", content: "Follow-up" } as Message,
],
};
const prompt = buildPromptFromContext(context);
assert.ok(prompt.includes("First part"), "prompt must include array content parts");
assert.ok(prompt.includes("Second part"), "prompt must include all text parts");
assert.ok(prompt.includes("Follow-up"), "prompt must include follow-up message");
});
test("buildPromptFromContext returns empty string for empty messages", () => {
const context: Context = { messages: [] };
const prompt = buildPromptFromContext(context);
assert.equal(prompt, "");
});
});
describe("stream-adapter — session persistence (#2859)", () => {
test("buildSdkOptions enables persistSession by default", () => {
const options = buildSdkOptions("claude-sonnet-4-20250514", "test prompt");
assert.equal(options.persistSession, true, "persistSession must default to true");
});
test("buildSdkOptions sets the model correctly", () => {
const options = buildSdkOptions("claude-sonnet-4-20250514", "hello world");
assert.equal(options.model, "claude-sonnet-4-20250514");
});
test("buildSdkOptions enables betas for sonnet models", () => {
const sonnetOpts = buildSdkOptions("claude-sonnet-4-20250514", "test");
assert.ok(
Array.isArray(sonnetOpts.betas) && sonnetOpts.betas.length > 0,
"sonnet models should have betas enabled",
);
const opusOpts = buildSdkOptions("claude-opus-4-20250514", "test");
assert.ok(
Array.isArray(opusOpts.betas) && opusOpts.betas.length === 0,
"non-sonnet models should have empty betas",
);
});
});

View file

@@ -7,6 +7,6 @@
"requires": { "platform": ">=2.29.0" },
"provides": {
"tools": ["resolve_library", "get_library_docs"],
"hooks": ["session_start"]
"hooks": ["session_start", "session_shutdown"]
}
}

View file

@@ -54,6 +54,7 @@ function hydrateProcessEnv(key: string, value: string): void {
}
async function writeEnvKey(filePath: string, key: string, value: string): Promise<void> {
if (typeof value !== "string") {
throw new TypeError(`writeEnvKey expects a string value for key "${key}", got ${typeof value}`);
}
let content = "";
try {
content = await readFile(filePath, "utf8");
@@ -419,7 +422,7 @@ export async function collectSecretsFromManifest(
for (const { key, value } of collected) {
const entry = manifest.entries.find((e) => e.key === key);
if (entry) {
entry.status = value !== null ? "collected" : "skipped";
entry.status = value != null ? "collected" : "skipped";
}
}
@@ -427,14 +430,14 @@
await writeFile(manifestPath, formatSecretsManifest(manifest), "utf8");
// (j) Apply collected values to destination
const provided = collected.filter((c) => c.value !== null) as Array<{ key: string; value: string }>;
const provided = collected.filter((c) => c.value != null) as Array<{ key: string; value: string }>;
const { applied } = await applySecrets(provided, destination, {
envFilePath: resolve(ctx.cwd, ".env"),
});
const skipped = [
...alreadySkipped,
...collected.filter((c) => c.value === null).map((c) => c.key),
...collected.filter((c) => c.value == null).map((c) => c.key),
];
return { applied, skipped, existingSkipped };
@@ -505,8 +508,8 @@ export default function secureEnv(pi: ExtensionAPI) {
collected.push({ key: item.key, value });
}
const provided = collected.filter((c) => c.value !== null) as Array<{ key: string; value: string }>;
const skipped = collected.filter((c) => c.value === null).map((c) => c.key);
const provided = collected.filter((c) => c.value != null) as Array<{ key: string; value: string }>;
const skipped = collected.filter((c) => c.value == null).map((c) => c.key);
// Apply to destination via shared helper
const { applied, errors } = await applySecrets(provided, destination, {
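The `!==` to `!=` changes above lean on JS loose equality: `value != null` is true exactly when `value` is neither `null` nor `undefined`, whereas `value !== null` lets `undefined` slip through into the "provided" set. A quick self-contained check (the `collected` data is illustrative):

```typescript
// `!= null` filters out both nullish values; `!== null` keeps undefined.
const collected: Array<{ key: string; value: string | null | undefined }> = [
  { key: "API_KEY", value: "abc123" },
  { key: "SKIPPED", value: null },
  { key: "UNANSWERED", value: undefined },
];

const strictProvided = collected.filter((c) => c.value !== null);
const looseProvided = collected.filter((c) => c.value != null);

console.log(strictProvided.length); // 2 (undefined wrongly survives)
console.log(looseProvided.length);  // 1 (only the real value)
```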

View file

@@ -7,6 +7,6 @@
"requires": { "platform": ">=2.29.0" },
"provides": {
"tools": ["google_search"],
"hooks": ["session_start"]
"hooks": ["session_start", "session_shutdown"]
}
}

View file

@@ -79,7 +79,7 @@ async function searchWithOAuth(
signal?: AbortSignal,
): Promise<SearchResult> {
const model = process.env.GEMINI_SEARCH_MODEL || "gemini-2.5-flash";
const url = `https://cloudcode-pa.googleapis.com/v1internal:streamGenerateContent`;
const url = `https://cloudcode-pa.googleapis.com/v1internal:streamGenerateContent?alt=sse`;
const GEMINI_CLI_HEADERS = {
ideType: "IDE_UNSPECIFIED",
@@ -104,6 +104,7 @@ async function searchWithOAuth(
contents: [{ parts: [{ text: query }] }],
tools: [{ googleSearch: {} }],
},
userAgent: "pi-coding-agent",
}),
signal,
});

View file

@@ -56,7 +56,7 @@ export function resolveExpectedArtifactPath(
}
case "run-uat": {
const dir = resolveSlicePath(base, mid, sid!);
return dir ? join(dir, buildSliceFileName(sid!, "UAT")) : null;
return dir ? join(dir, buildSliceFileName(sid!, "ASSESSMENT")) : null;
}
case "execute-task": {
const dir = resolveSlicePath(base, mid, sid!);
@@ -124,7 +124,7 @@ export function diagnoseExpectedArtifact(
case "reassess-roadmap":
return `${relSliceFile(base, mid, sid!, "ASSESSMENT")} (roadmap reassessment)`;
case "run-uat":
return `${relSliceFile(base, mid, sid!, "UAT")} (UAT result)`;
return `${relSliceFile(base, mid, sid!, "ASSESSMENT")} (UAT assessment result)`;
case "validate-milestone":
return `${relMilestoneFile(base, mid, "VALIDATION")} (milestone validation report)`;
case "complete-milestone":

View file

@@ -569,6 +569,13 @@ export function updateProgressWidget(
: "";
lines.push(rightAlign(headerLeft, headerRight, width));
// Worktree/branch right-aligned below header
if (worktreeName && cachedBranch) {
lines.push(rightAlign("", theme.fg("dim", `${worktreeName} (${cachedBranch})`), width));
} else if (cachedBranch) {
lines.push(rightAlign("", theme.fg("dim", cachedBranch), width));
}
// Show health signal details when degraded (yellow/red)
if (score.level !== "green" && score.signals.length > 0 && widgetMode !== "min") {
// Show up to 3 most relevant signals in compact form
@@ -682,12 +689,12 @@
const hasContext = !!(mid || (slice && unitType !== "research-milestone" && unitType !== "plan-milestone"));
if (mid) {
const modelTag = modelDisplay ? theme.fg("muted", ` ${modelDisplay}`) : "";
lines.push(truncateToWidth(`${pad}${theme.fg("dim", mid.title)}${modelTag}`, width));
lines.push(truncateToWidth(`${pad}${theme.fg("dim", mid.title)}${modelTag}`, width, "…"));
}
if (slice && unitType !== "research-milestone" && unitType !== "plan-milestone") {
lines.push(truncateToWidth(
`${pad}${theme.fg("text", theme.bold(`${slice.id}: ${slice.title}`))}`,
width,
width, "…",
));
}
if (hasContext) lines.push("");
@@ -733,6 +740,12 @@
const rightLines: string[] = [];
const maxVisibleTasks = 8;
// Max visible chars for task title text (before ANSI theming)
const maxTaskTitleLen = 45;
function truncTitle(s: string): string {
return s.length > maxTaskTitleLen ? s.slice(0, maxTaskTitleLen - 1) + "…" : s;
}
function formatTaskLine(t: { id: string; title: string; done: boolean }, isCurrent: boolean): string {
const glyph = t.done
? theme.fg("success", "*")
@ -744,11 +757,12 @@ export function updateProgressWidget(
: t.done
? theme.fg("muted", t.id)
: theme.fg("dim", t.id);
const short = truncTitle(t.title);
const title = isCurrent
? theme.fg("text", t.title)
? theme.fg("text", short)
: t.done
? theme.fg("muted", t.title)
: theme.fg("text", t.title);
? theme.fg("muted", short)
: theme.fg("text", short);
return `${glyph} ${id}: ${title}`;
}
@@ -771,7 +785,7 @@
if (maxRows > 0) {
lines.push("");
for (let i = 0; i < maxRows; i++) {
const left = padToWidth(truncateToWidth(leftLines[i] ?? "", leftColWidth), leftColWidth);
const left = padToWidth(truncateToWidth(leftLines[i] ?? "", leftColWidth, "…"), leftColWidth);
const right = rightLines[i] ?? "";
lines.push(`${left}${right}`);
}
@@ -779,7 +793,7 @@
} else {
if (leftLines.length > 0) {
lines.push("");
for (const l of leftLines) lines.push(truncateToWidth(l, width));
for (const l of leftLines) lines.push(truncateToWidth(l, width, "…"));
}
}
@@ -808,23 +822,27 @@
lines.push(rightAlign("", theme.fg("dim", cachedRtkLabel), width));
}
}
// PWD line with last commit info right-aligned
// Last commit info
const lastCommit = getLastCommit(accessors.getBasePath());
const commitStr = lastCommit
? theme.fg("dim", `${lastCommit.timeAgo} ago: ${lastCommit.message}`)
const maxCommitLen = 65;
const commitMsg = lastCommit
? lastCommit.message.length > maxCommitLen
? lastCommit.message.slice(0, maxCommitLen - 1) + "…"
: lastCommit.message
: "";
const pwdStr = theme.fg("dim", widgetPwd);
if (commitStr) {
lines.push(rightAlign(`${pad}${pwdStr}`, truncateToWidth(commitStr, Math.floor(width * 0.45)), width));
} else {
lines.push(`${pad}${pwdStr}`);
}
// Hints line
const hintParts: string[] = [];
hintParts.push("esc pause");
hintParts.push(process.platform === "darwin" ? "⌃⌥G dashboard" : "Ctrl+Alt+G dashboard");
const hintStr = theme.fg("dim", hintParts.join(" | "));
lines.push(rightAlign("", hintStr, width));
const commitStr = lastCommit
? theme.fg("dim", `${lastCommit.timeAgo} ago: ${commitMsg}`)
: "";
if (commitStr) {
lines.push(rightAlign(`${pad}${commitStr}`, hintStr, width));
} else {
lines.push(rightAlign("", hintStr, width));
}
lines.push(...ui.bar());
@@ -851,12 +869,12 @@ function rightAlign(left: string, right: string, width: number): string {
const leftVis = visibleWidth(left);
const rightVis = visibleWidth(right);
const gap = Math.max(1, width - leftVis - rightVis);
return truncateToWidth(left + " ".repeat(gap) + right, width);
return truncateToWidth(left + " ".repeat(gap) + right, width, "…");
}
/** Pad a string with trailing spaces to fill exactly `colWidth` (ANSI-aware). */
function padToWidth(s: string, colWidth: number): string {
const vis = visibleWidth(s);
if (vis >= colWidth) return truncateToWidth(s, colWidth);
if (vis >= colWidth) return truncateToWidth(s, colWidth, "…");
return s + " ".repeat(colWidth - vis);
}

View file

@@ -28,6 +28,7 @@ import {
buildSliceFileName,
} from "./paths.js";
import { existsSync, mkdirSync, readFileSync, writeFileSync } from "node:fs";
import { logError } from "./workflow-logger.js";
import { join } from "node:path";
import { hasImplementationArtifacts } from "./auto-recovery.js";
import {
@@ -129,6 +130,21 @@ export function setRewriteCount(basePath: string, count: number): void {
writeFileSync(filePath, JSON.stringify({ count, updatedAt: new Date().toISOString() }) + "\n");
}
// ─── Helpers ─────────────────────────────────────────────────────────────
/**
* Returns true when the verification_operational value indicates that no
* operational verification is needed. Covers common phrasings the planning
* agent may use: "None", "None required", "N/A", "Not applicable", etc.
*
* @see https://github.com/gsd-build/gsd-2/issues/2931
*/
export function isVerificationNotApplicable(value: string): boolean {
const v = (value ?? "").toLowerCase().trim();
if (!v || v === "none") return true;
return /^(?:none[\s._-]*(?:required|needed|planned)?|n\/?a|not[\s._-]+(?:applicable|required|needed)|no[\s._-]+operational[\s\S]*)$/i.test(v);
}
// ─── Rules ────────────────────────────────────────────────────────────────
export const DISPATCH_RULES: DispatchRule[] = [
@@ -511,7 +527,7 @@
};
} catch (err) {
// Non-fatal — fall through to sequential execution
process.stderr.write(`gsd-reactive: graph derivation failed: ${(err as Error).message}\n`);
logError("dispatch", "reactive graph derivation failed", { error: (err as Error).message });
return null;
}
},
@@ -672,7 +688,7 @@
if (isDbAvailable()) {
const milestone = getMilestone(mid);
if (milestone?.verification_operational &&
milestone.verification_operational.toLowerCase() !== "none") {
!isVerificationNotApplicable(milestone.verification_operational)) {
const validationPath = resolveMilestoneFile(basePath, mid, "VALIDATION");
if (validationPath) {
const validationContent = await loadFile(validationPath);
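A few concrete inputs against `isVerificationNotApplicable` from the helper hunk above (function body copied verbatim so the example runs standalone):

```typescript
// Copied from the diff above for a self-contained demonstration.
function isVerificationNotApplicable(value: string): boolean {
  const v = (value ?? "").toLowerCase().trim();
  if (!v || v === "none") return true;
  return /^(?:none[\s._-]*(?:required|needed|planned)?|n\/?a|not[\s._-]+(?:applicable|required|needed)|no[\s._-]+operational[\s\S]*)$/i.test(v);
}

// Phrasings that should skip operational verification...
console.log(isVerificationNotApplicable("None required")); // true
console.log(isVerificationNotApplicable("N/A"));           // true
console.log(isVerificationNotApplicable("no operational verification planned")); // true

// ...and a real verification instruction that should not.
console.log(isVerificationNotApplicable("Run the server and check /health")); // false
```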

View file

@@ -222,9 +222,30 @@ export function resolveModelId<T extends { id: string; provider: string }>(
);
}
// Bare ID — prefer current provider, then first available
const exactProviderMatch = availableModels.find(
m => m.id === modelId && m.provider === currentProvider,
);
return exactProviderMatch ?? availableModels.find(m => m.id === modelId);
// Bare ID — resolve with provider precedence to avoid silent misrouting.
// Extension providers (e.g. claude-code) expose the same model IDs as their
// upstream API providers but route through a subprocess with different
// context, tool visibility, and cost characteristics (#2905). Bare IDs in
// PREFERENCES.md must resolve to the canonical API provider, not to an
// extension wrapper that happens to be the current session provider.
const candidates = availableModels.filter(m => m.id === modelId);
if (candidates.length === 0) return undefined;
if (candidates.length === 1) return candidates[0];
// Extension / CLI-wrapper providers that should never win bare-ID resolution
// when a first-class API provider also offers the same model.
const EXTENSION_PROVIDERS = new Set(["claude-code"]);
// Prefer currentProvider only when it is a first-class API provider
if (currentProvider && !EXTENSION_PROVIDERS.has(currentProvider)) {
const providerMatch = candidates.find(m => m.provider === currentProvider);
if (providerMatch) return providerMatch;
}
// Prefer "anthropic" as the canonical provider for Anthropic models
const anthropicMatch = candidates.find(m => m.provider === "anthropic");
if (anthropicMatch) return anthropicMatch;
// Fall back to first non-extension candidate, or any candidate
return candidates.find(m => !EXTENSION_PROVIDERS.has(m.provider)) ?? candidates[0];
}

View file

@@ -13,6 +13,7 @@
import type { ExtensionContext, ExtensionAPI } from "@gsd/pi-coding-agent";
import { deriveState } from "./state.js";
import { logWarning, logError } from "./workflow-logger.js";
import { loadFile, parseSummary, resolveAllOverrides } from "./files.js";
import { loadPrompt } from "./prompt-loader.js";
import {
@@ -412,10 +413,10 @@ export async function postUnitPreVerification(pctx: PostUnitContext, opts?: PreV
);
}
for (const action of triageResult.actions) {
process.stderr.write(`gsd-triage: ${action}\n`);
logWarning("engine", `triage resolution: ${action}`);
}
} catch (err) {
process.stderr.write(`gsd-triage: resolution execution failed: ${(err as Error).message}\n`);
logError("engine", "triage resolution failed", { error: (err as Error).message });
}
}
@@ -423,7 +424,7 @@
try {
const rogueFiles = detectRogueFileWrites(s.currentUnit.type, s.currentUnit.id, s.basePath);
for (const rogue of rogueFiles) {
process.stderr.write(`gsd-rogue: detected rogue file write: ${rogue.path} (unit: ${rogue.unitId})\n`);
logWarning("engine", "rogue file write detected", { path: rogue.path, unitId: rogue.unitId });
ctx.ui.notify(`Rogue file write detected: ${rogue.path}`, "warning");
}
} catch (e) {
@@ -465,7 +466,20 @@
// When artifact verification fails for a unit type that has a known expected
// artifact, return "retry" so the caller re-dispatches with failure context
// instead of blindly re-dispatching the same unit (#1571).
if (!triggerArtifactVerified) {
//
// HOWEVER, if the DB is unavailable (db_unavailable), the artifact was never
// written because the completion tool failed at the infra level. Retrying
// can never succeed and produces a costly re-dispatch loop (#2517).
if (!triggerArtifactVerified && !isDbAvailable()) {
// DB infra failure — do NOT retry; the completion tool returned
// db_unavailable so the artifact was never written. Retrying would
// produce an infinite re-dispatch loop (#2517).
debugLog("postUnit", { phase: "artifact-verify-skip-db-unavailable", unitType: s.currentUnit.type, unitId: s.currentUnit.id });
ctx.ui.notify(
`Artifact missing for ${s.currentUnit.type} ${s.currentUnit.id} but DB is unavailable — skipping retry to avoid loop (#2517)`,
"error",
);
} else if (!triggerArtifactVerified) {
const hasExpectedArtifact = resolveExpectedArtifactPath(s.currentUnit.type, s.currentUnit.id, s.basePath) !== null;
if (hasExpectedArtifact) {
const retryKey = `${s.currentUnit.type}:${s.currentUnit.id}`;


@ -1568,7 +1568,7 @@ export async function buildRunUatPrompt(
const inlinedContext = capPreamble(`## Inlined Context (preloaded — do not re-read these files)\n\n${inlined.join("\n\n---\n\n")}`);
const uatResultPath = join(base, relSliceFile(base, mid, sliceId, "UAT"));
const uatResultPath = join(base, relSliceFile(base, mid, sliceId, "ASSESSMENT"));
const uatType = getUatType(uatContent);
return loadPrompt("run-uat", {


@ -14,6 +14,7 @@ import { clearParseCache } from "./files.js";
import { parseRoadmap as parseLegacyRoadmap, parsePlan as parseLegacyPlan } from "./parsers-legacy.js";
import { isDbAvailable, getTask, getSlice, getSliceTasks, updateTaskStatus } from "./gsd-db.js";
import { isValidationTerminal } from "./state.js";
import { getErrorMessage } from "./error-utils.js";
import {
nativeConflictFiles,
nativeCommit,
@ -476,11 +477,17 @@ export function reconcileMergeState(
if (conflictedFiles.length === 0) {
// All conflicts resolved — finalize the merge/squash commit
try {
nativeCommit(basePath, ""); // --no-edit equivalent: use empty message placeholder
const mode = hasMergeHead ? "merge" : "squash commit";
ctx.ui.notify(`Finalized leftover ${mode} from prior session.`, "info");
} catch {
// Commit may already exist; non-fatal
const commitSha = nativeCommit(basePath, ""); // --no-edit equivalent: use empty message placeholder
if (commitSha) {
const mode = hasMergeHead ? "merge" : "squash commit";
ctx.ui.notify(`Finalized leftover ${mode} from prior session.`, "info");
} else {
ctx.ui.notify("No new commit needed for leftover merge/squash state — already committed.", "info");
}
} catch (err) {
const errorMessage = getErrorMessage(err);
ctx.ui.notify(`Failed to finalize leftover merge/squash commit: ${errorMessage}`, "error");
return false;
}
} else {
// Still conflicted — try auto-resolving .gsd/ state file conflicts (#530)


@ -58,9 +58,8 @@ import { initRoutingHistory } from "./routing-history.js";
import { restoreHookState, resetHookState } from "./post-unit-hooks.js";
import { resetProactiveHealing, setLevelChangeCallback } from "./doctor-proactive.js";
import { snapshotSkills } from "./skill-discovery.js";
import { isDbAvailable, getMilestone, openDatabase } from "./gsd-db.js";
import { isDbAvailable, getMilestone } from "./gsd-db.js";
import { hideFooter } from "./auto-dashboard.js";
import { resolveProjectRootDbPath } from "./bootstrap/dynamic-tools.js";
import {
debugLog,
enableDebug,
@ -68,7 +67,6 @@ import {
getDebugLogPath,
} from "./debug-logger.js";
import { parseUnitId } from "./unit-id.js";
import { setLogBasePath } from "./workflow-logger.js";
import type { AutoSession } from "./auto/session.js";
import {
existsSync,
@ -80,6 +78,7 @@ import {
import { join } from "node:path";
import { sep as pathSep } from "node:path";
import { resolveProjectRootDbPath } from "./bootstrap/dynamic-tools.js";
import type { WorktreeResolver } from "./worktree-resolver.js";
export interface BootstrapDeps {
@ -98,26 +97,32 @@ export interface BootstrapDeps {
* concurrent session detected). Returns true when ready to dispatch.
*/
/**
* Open the project-root DB before the first deriveState call (#2841).
* When auto-mode starts cold (no prior DB handle), state derivation that
* touches DB-backed helpers (queue-order, task status) silently falls back
* to markdown-only data, producing stale or incomplete state. Opening the
* DB first ensures deriveState sees the full picture on its very first run.
*/
async function openProjectDbIfPresent(basePath: string): Promise<void> {
const gsdDbPath = resolveProjectRootDbPath(basePath);
if (!existsSync(gsdDbPath)) return;
if (isDbAvailable()) return;
try {
const { openDatabase } = await import("./gsd-db.js");
openDatabase(gsdDbPath);
} catch {
/* non-fatal — DB lifecycle block below will retry */
}
}
/** Guard: tracks consecutive bootstrap attempts that found phase === "complete".
* Prevents the recursive dialog loop described in #1348 where
* bootstrapAutoSession showSmartEntry checkAutoStartAfterDiscuss startAuto
* cycles indefinitely when the discuss workflow doesn't produce a milestone. */
let _consecutiveCompleteBootstraps = 0;
const MAX_CONSECUTIVE_COMPLETE_BOOTSTRAPS = 2;
async function openProjectDbIfPresent(basePath: string): Promise<void> {
const gsdDbPath = resolveProjectRootDbPath(basePath);
if (!existsSync(gsdDbPath) || isDbAvailable()) return;
try {
openDatabase(gsdDbPath);
} catch (err) {
process.stderr.write(
`gsd-db: failed to open existing database: ${(err as Error).message}\n`,
);
}
}
export async function bootstrapAutoSession(
s: AutoSession,
ctx: ExtensionCommandContext,
@ -198,10 +203,13 @@ export async function bootstrapAutoSession(
ensureGitignore(base, { manageGitignore });
if (manageGitignore !== false) untrackRuntimeFiles(base);
// Bootstrap .gsd/ if it doesn't exist
// Bootstrap milestones/ if it doesn't exist.
// Check milestones/ directly — ensureGsdSymlink above already created .gsd/,
// so checking .gsd/ existence would be dead code (#2942).
const gsdDir = join(base, ".gsd");
if (!existsSync(gsdDir)) {
mkdirSync(join(gsdDir, "milestones"), { recursive: true });
const milestonesPath = join(gsdDir, "milestones");
if (!existsSync(milestonesPath)) {
mkdirSync(milestonesPath, { recursive: true });
try {
nativeAddAll(base);
nativeCommit(base, "chore: init gsd");
@ -280,10 +288,6 @@ export async function bootstrapAutoSession(
ctx.ui.notify(`Debug logging enabled → ${getDebugLogPath()}`, "info");
}
// Open the project DB before the first derive so resume uses DB truth
// immediately on cold starts instead of falling back to markdown (#2841).
await openProjectDbIfPresent(base);
// Invalidate caches before initial state derivation
invalidateAllCaches();
@ -293,6 +297,10 @@ export async function bootstrapAutoSession(
(mid) => !!resolveMilestoneFile(base, mid, "SUMMARY"),
);
// Open the project-root DB before deriveState so DB-backed state
// derivation (queue-order, task status) works on a cold start (#2841).
await openProjectDbIfPresent(base);
let state = await deriveState(base);
// Stale worktree state recovery (#654)
@ -490,7 +498,6 @@ export async function bootstrapAutoSession(
s.verbose = verboseMode;
s.cmdCtx = ctx;
s.basePath = base;
setLogBasePath(base);
s.unitDispatchCount.clear();
s.unitRecoveryCount.clear();
s.lastBudgetAlertLevel = 0;
@ -554,14 +561,15 @@ export async function bootstrapAutoSession(
}
// ── DB lifecycle ──
const gsdDbPath = resolveProjectRootDbPath(s.basePath);
const gsdDbPath = join(s.basePath, ".gsd", "gsd.db");
const gsdDirPath = join(s.basePath, ".gsd");
if (existsSync(gsdDirPath) && !existsSync(gsdDbPath)) {
const hasDecisions = existsSync(join(gsdDirPath, "DECISIONS.md"));
const hasRequirements = existsSync(join(gsdDirPath, "REQUIREMENTS.md"));
const hasMilestones = existsSync(join(gsdDirPath, "milestones"));
try {
openDatabase(gsdDbPath);
const { openDatabase: openDb } = await import("./gsd-db.js");
openDb(gsdDbPath);
if (hasDecisions || hasRequirements || hasMilestones) {
const { migrateFromMarkdown } = await import("./md-importer.js");
migrateFromMarkdown(s.basePath);
@ -574,7 +582,8 @@ export async function bootstrapAutoSession(
}
if (existsSync(gsdDbPath) && !isDbAvailable()) {
try {
openDatabase(gsdDbPath);
const { openDatabase: openDb } = await import("./gsd-db.js");
openDb(gsdDbPath);
} catch (err) {
process.stderr.write(
`gsd-db: failed to open existing database: ${(err as Error).message}\n`,


@ -15,6 +15,7 @@ import {
realpathSync,
rmSync,
unlinkSync,
statSync,
lstatSync as lstatSyncFn,
} from "node:fs";
import { isAbsolute, join, sep as pathSep } from "node:path";
@ -62,6 +63,7 @@ import {
nativeDiffNumstat,
nativeUpdateRef,
nativeIsAncestor,
nativeMergeAbort,
} from "./native-git-bridge.js";
const gsdHome = process.env.GSD_HOME || join(homedir(), ".gsd");
@ -84,6 +86,7 @@ const ROOT_STATE_FILES = [
"QUEUE.md",
"completed-units.json",
"metrics.json",
"mcp.json",
// NOTE: project preferences are intentionally NOT in ROOT_STATE_FILES.
// Forward-sync (main → worktree) is handled explicitly in syncGsdStateToWorktree().
// Back-sync (worktree → main) must NEVER overwrite the project root's copy
@ -102,6 +105,67 @@ function isSamePath(a: string, b: string): boolean {
}
}
// ─── ASSESSMENT Force-Sync Helper (#2821) ─────────────────────────────────
/** Regex matching YAML frontmatter `verdict:` field. */
const VERDICT_RE = /verdict:\s*[\w-]+/i;
/**
* Walk a milestone directory and force-overwrite ASSESSMENT files in the
* destination when the source copy contains a `verdict:` field.
*
* This is the targeted fix for the UAT stuck-loop (#2821): the main
* safeCopyRecursive uses force:false to protect worktree-authoritative
* files (#1886), but ASSESSMENT files written by run-uat must be
* forward-synced when the project root has a verdict. Without this,
* the worktree retains a stale FAIL or missing ASSESSMENT and
* checkNeedsRunUat re-dispatches run-uat indefinitely.
*
 * Only overwrites when the source has a verdict; it never clobbers a
 * worktree ASSESSMENT with a verdictless project-root copy.
*/
function forceOverwriteAssessmentsWithVerdict(
srcMilestoneDir: string,
dstMilestoneDir: string,
): void {
if (!existsSync(srcMilestoneDir)) return;
// Walk slices/<SID>/ looking for *-ASSESSMENT.md files
const slicesDir = join(srcMilestoneDir, "slices");
if (!existsSync(slicesDir)) return;
try {
for (const sliceEntry of readdirSync(slicesDir, { withFileTypes: true })) {
if (!sliceEntry.isDirectory()) continue;
const srcSliceDir = join(slicesDir, sliceEntry.name);
const dstSliceDir = join(dstMilestoneDir, "slices", sliceEntry.name);
try {
for (const fileEntry of readdirSync(srcSliceDir, { withFileTypes: true })) {
if (!fileEntry.isFile()) continue;
if (!fileEntry.name.endsWith("-ASSESSMENT.md")) continue;
const srcFile = join(srcSliceDir, fileEntry.name);
try {
const srcContent = readFileSync(srcFile, "utf-8");
if (!VERDICT_RE.test(srcContent)) continue; // no verdict in source — skip
// Source has a verdict — force-copy into worktree
mkdirSync(dstSliceDir, { recursive: true });
safeCopy(srcFile, join(dstSliceDir, fileEntry.name), { force: true });
} catch {
/* non-fatal per file */
}
}
} catch {
/* non-fatal per slice */
}
}
} catch {
/* non-fatal */
}
}
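The force-sync decision above hinges entirely on `VERDICT_RE`. A quick sketch of what that pattern accepts and rejects (the regex is copied from the helper; the sample frontmatter lines are illustrative):

```typescript
// Same pattern as the helper above: a frontmatter `verdict:` field
// followed by a word-ish value, case-insensitive.
const VERDICT_RE = /verdict:\s*[\w-]+/i;

// Accepted: any verdict value, regardless of case or spacing.
console.log(VERDICT_RE.test("verdict: pass"));       // true
console.log(VERDICT_RE.test("Verdict:FAIL"));        // true
console.log(VERDICT_RE.test("verdict: needs-work")); // true

// Rejected: field present but no value yet, so the source has no
// verdict and the worktree copy must not be overwritten.
console.log(VERDICT_RE.test("verdict:"));            // false
console.log(VERDICT_RE.test("status: done"));        // false
```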
// ─── Module State ──────────────────────────────────────────────────────────
/** Original project root before chdir into auto-worktree. */
@ -214,6 +278,19 @@ export function syncProjectRootToWorktree(
{ force: false },
);
// Force-sync ASSESSMENT files that have a verdict from project root (#2821).
// The additive-only copy above preserves worktree-authoritative files, but
// ASSESSMENT files are special: after run-uat writes a verdict and post-unit
// syncs it to the project root, the worktree may retain a stale copy (e.g.
// verdict:fail while the project root has verdict:pass from a retry). On
// session resume the DB is rebuilt from disk, and if the stale ASSESSMENT
// persists, checkNeedsRunUat finds no passing verdict → re-dispatches
// run-uat indefinitely (stuck-loop ×9).
forceOverwriteAssessmentsWithVerdict(
join(prGsd, "milestones", milestoneId),
join(wtGsd, "milestones", milestoneId),
);
// Forward-sync completed-units.json from project root to worktree.
// Project root is authoritative for completion state after crash recovery;
// without this, the worktree re-dispatches already-completed units (#1886).
@ -223,12 +300,18 @@ export function syncProjectRootToWorktree(
{ force: true },
);
// Delete worktree gsd.db so it rebuilds from the freshly synced files.
// Stale DB rows are the root cause of the infinite skip loop (#853).
// Delete worktree gsd.db ONLY if it is empty (0 bytes).
// An empty DB is stale/corrupt and should be rebuilt (#853).
// A non-empty DB was populated by gsd-migrate on respawn and must be
// preserved: deleting it means openDatabase later re-creates an empty
// file, causing "no such table" failures (#2815).
try {
const wtDb = join(wtGsd, "gsd.db");
if (existsSync(wtDb)) {
unlinkSync(wtDb);
const size = statSync(wtDb).size;
if (size === 0) {
unlinkSync(wtDb);
}
}
} catch {
/* non-fatal */
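The size gate above reduces to a tiny predicate; a sketch of the decision, factored out for clarity (the function names are ours, not the codebase's):

```typescript
import { existsSync, statSync, unlinkSync } from "node:fs";

// Decide whether a worktree gsd.db should be rebuilt from scratch.
// Only a zero-byte file counts as stale/corrupt (#853); a non-empty
// DB was populated by migration and must survive (#2815).
function shouldDeleteWorktreeDb(dbPath: string): boolean {
  if (!existsSync(dbPath)) return false; // nothing to delete
  return statSync(dbPath).size === 0;    // empty file: safe to rebuild
}

// Caller sketch: delete only when the predicate says so.
function maybeResetWorktreeDb(dbPath: string): void {
  if (shouldDeleteWorktreeDb(dbPath)) unlinkSync(dbPath);
}
```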
@ -1004,6 +1087,7 @@ function copyPlanningArtifacts(srcBase: string, wtPath: string): void {
"STATE.md",
"KNOWLEDGE.md",
"OVERRIDES.md",
"mcp.json",
]) {
safeCopy(join(srcGsd, file), join(dstGsd, file), { force: true });
}
@ -1414,9 +1498,19 @@ export function mergeMilestoneToMain(
encoding: "utf-8",
}).trim();
if (status) {
// Use --include-untracked to stash untracked files that would block
// the squash merge, but EXCLUDE .gsd/milestones/ (#2505).
// --include-untracked without exclusion sweeps queued milestone
// CONTEXT files into the stash. If stash pop later fails, those files
// are permanently trapped in the stash entry and lost on the next
// stash push or drop.
execFileSync(
"git",
["stash", "push", "--include-untracked", "-m", `gsd: pre-merge stash for ${milestoneId}`],
[
"stash", "push", "--include-untracked",
"-m", `gsd: pre-merge stash for ${milestoneId}`,
"--", ":(exclude).gsd/milestones",
],
{ cwd: originalBasePath_, stdio: ["ignore", "pipe", "pipe"], encoding: "utf-8" },
);
stashed = true;
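The exclusion pathspec is the load-bearing part of that invocation. A sketch that isolates the argv construction (the helper name is ours; the codebase builds the array inline):

```typescript
// Build the `git stash push` argv used before a squash merge (#2505).
// `--include-untracked` sweeps untracked files out of the way, while
// the trailing pathspec `:(exclude).gsd/milestones` keeps queued
// milestone CONTEXT files out of the stash so they cannot be lost if
// the later stash pop fails.
function buildPreMergeStashArgs(milestoneId: string): string[] {
  return [
    "stash", "push", "--include-untracked",
    "-m", `gsd: pre-merge stash for ${milestoneId}`,
    "--", ":(exclude).gsd/milestones",
  ];
}

// Usage sketch: execFileSync("git", buildPreMergeStashArgs("M001"), { cwd });
```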
@ -1426,6 +1520,65 @@ export function mergeMilestoneToMain(
// report the dirty tree if it fails.
}
// 7a. Shelter queued milestone directories before the squash merge (#2505).
// The milestone branch may contain copies of queued milestone dirs (via
// copyPlanningArtifacts), so `git merge --squash` rejects when those same
// files exist as untracked in the working tree. Temporarily move them to
// a backup location, then restore after the merge+commit.
const milestonesDir = join(gsdRoot(originalBasePath_), "milestones");
const shelterDir = join(gsdRoot(originalBasePath_), ".milestone-shelter");
const shelteredDirs: string[] = [];
// Helper: restore sheltered milestone directories (#2505).
// Called on both success and error paths to ensure queued CONTEXT files
// are never permanently lost.
const restoreShelter = (): void => {
if (shelteredDirs.length === 0) return;
for (const dirName of shelteredDirs) {
try {
mkdirSync(milestonesDir, { recursive: true });
cpSync(join(shelterDir, dirName), join(milestonesDir, dirName), { recursive: true, force: true });
} catch { /* best-effort */ }
}
try { rmSync(shelterDir, { recursive: true, force: true }); } catch { /* best-effort */ }
};
try {
if (existsSync(milestonesDir)) {
const entries = readdirSync(milestonesDir, { withFileTypes: true });
for (const entry of entries) {
if (!entry.isDirectory()) continue;
// Only shelter directories that do NOT belong to the milestone being merged
if (entry.name === milestoneId) continue;
const srcDir = join(milestonesDir, entry.name);
const dstDir = join(shelterDir, entry.name);
try {
mkdirSync(shelterDir, { recursive: true });
cpSync(srcDir, dstDir, { recursive: true, force: true });
rmSync(srcDir, { recursive: true, force: true });
shelteredDirs.push(entry.name);
} catch {
// Non-fatal — if shelter fails, the merge may still succeed
}
}
}
} catch {
// Non-fatal — proceed with merge; untracked files may block it
}
// 7b. Clean up stale merge state before attempting squash merge (#2912).
// A leftover MERGE_HEAD (from a previous failed merge, libgit2 native path,
// or interrupted operation) causes `git merge --squash` to refuse with
// "fatal: You have not concluded your merge (MERGE_HEAD exists)".
// Defensively remove merge artifacts before starting.
try {
const gitDir_ = resolveGitDir(originalBasePath_);
for (const f of ["SQUASH_MSG", "MERGE_MSG", "MERGE_HEAD"]) {
const p = join(gitDir_, f);
if (existsSync(p)) unlinkSync(p);
}
} catch { /* best-effort */ }
// 8. Squash merge — auto-resolve .gsd/ state file conflicts (#530)
const mergeResult = nativeMergeSquash(originalBasePath_, milestoneBranch);
@ -1434,6 +1587,16 @@ export function mergeMilestoneToMain(
// untracked .gsd/ files left by syncStateToProjectRoot). Preserve the
// milestone branch so commits are not lost.
if (mergeResult.conflicts.includes("__dirty_working_tree__")) {
// Defensively clean merge state — the native path may leave MERGE_HEAD
// even when the merge is rejected (#2912).
try {
const gitDir_ = resolveGitDir(originalBasePath_);
for (const f of ["SQUASH_MSG", "MERGE_MSG", "MERGE_HEAD"]) {
const p = join(gitDir_, f);
if (existsSync(p)) unlinkSync(p);
}
} catch { /* best-effort */ }
// Pop stash before throwing so local work is not lost.
if (stashed) {
try {
@ -1444,6 +1607,7 @@ export function mergeMilestoneToMain(
});
} catch { /* stash pop conflict is non-fatal */ }
}
restoreShelter();
// Restore cwd so the caller is not stranded on the integration branch
process.chdir(previousCwd);
// Surface the actual dirty filenames from git stderr instead of
@ -1490,6 +1654,18 @@ export function mergeMilestoneToMain(
// If there are still real code conflicts, escalate
if (codeConflicts.length > 0) {
// Abort merge state so MERGE_HEAD is not left on disk (#2912).
// libgit2's merge creates MERGE_HEAD even for squash merges; if left
// dangling, subsequent merges fail and doctor reports corrupt state.
try { nativeMergeAbort(originalBasePath_); } catch { /* best-effort */ }
try {
const gitDir_ = resolveGitDir(originalBasePath_);
for (const f of ["SQUASH_MSG", "MERGE_MSG", "MERGE_HEAD"]) {
const p = join(gitDir_, f);
if (existsSync(p)) unlinkSync(p);
}
} catch { /* best-effort */ }
// Pop stash before throwing so local work is not lost (#2151).
if (stashed) {
try {
@ -1500,6 +1676,7 @@ export function mergeMilestoneToMain(
});
} catch { /* stash pop conflict is non-fatal */ }
}
restoreShelter();
throw new MergeConflictError(
codeConflicts,
"squash",
@ -1515,14 +1692,18 @@ export function mergeMilestoneToMain(
const commitResult = nativeCommit(originalBasePath_, commitMessage);
const nothingToCommit = commitResult === null;
// 9a. Clean up SQUASH_MSG left by git merge --squash (#1853).
// 9a. Clean up merge state files left by git merge --squash (#1853, #2912).
// git only removes SQUASH_MSG when the commit reads it directly (plain
// `git commit`). nativeCommit uses `-F -` (stdin) or libgit2, neither
// of which trigger git's SQUASH_MSG cleanup. If left on disk, doctor
// reports `corrupt_merge_state` on every subsequent run.
// of which triggers git's SQUASH_MSG cleanup. MERGE_HEAD is created by
// libgit2's merge even in squash mode and is not removed by nativeCommit.
// If left on disk, doctor reports `corrupt_merge_state` on every subsequent run.
try {
const squashMsgPath = join(resolveGitDir(originalBasePath_), "SQUASH_MSG");
if (existsSync(squashMsgPath)) unlinkSync(squashMsgPath);
const gitDir_ = resolveGitDir(originalBasePath_);
for (const f of ["SQUASH_MSG", "MERGE_MSG", "MERGE_HEAD"]) {
const p = join(gitDir_, f);
if (existsSync(p)) unlinkSync(p);
}
} catch { /* best-effort */ }
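Since the same three files are now scrubbed on several paths, the cleanup can be factored into a single helper; a sketch under that assumption (the diff inlines the loop at each site instead):

```typescript
import { existsSync, unlinkSync } from "node:fs";
import { join } from "node:path";

// Remove merge-state files that git/libgit2 leave behind around a
// squash merge (#1853, #2912). Best-effort: failure to delete one
// file must not block the others or the caller.
function cleanupMergeState(gitDir: string): void {
  for (const f of ["SQUASH_MSG", "MERGE_MSG", "MERGE_HEAD"]) {
    const p = join(gitDir, f);
    try {
      if (existsSync(p)) unlinkSync(p);
    } catch {
      /* best-effort */
    }
  }
}
```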
// 9a-ii. Restore stashed files now that the merge+commit is complete (#2151).
@ -1581,6 +1762,9 @@ export function mergeMilestoneToMain(
}
}
// 9a-iii. Restore sheltered queued milestone directories (#2505).
restoreShelter();
// 9b. Safety check (#1792): if nothing was committed, verify the milestone
// work is already on the integration branch before allowing teardown.
// Compare only non-.gsd/ paths — .gsd/ state files diverge normally and


@ -93,6 +93,7 @@ export interface LoopDeps {
body: string,
kind: string,
category: string,
projectName?: string,
) => void;
setActiveMilestoneId: (basePath: string, mid: string) => void;
pruneQueueOrder: (basePath: string, pendingIds: string[]) => void;


@ -26,7 +26,7 @@ import { runUnit } from "./run-unit.js";
import { debugLog } from "../debug-logger.js";
import { PROJECT_FILES } from "../detection.js";
import { MergeConflictError } from "../git-service.js";
import { join } from "node:path";
import { join, basename } from "node:path";
import { existsSync, cpSync } from "node:fs";
import { logWarning, logError } from "../workflow-logger.js";
import { gsdRoot } from "../paths.js";
@ -230,6 +230,7 @@ export async function runPreDispatch(
`Milestone ${s.currentMilestoneId} complete!`,
"success",
"milestone",
basename(s.originalBasePath || s.basePath),
);
deps.logCmuxEvent(
prefs,
@ -388,6 +389,7 @@ export async function runPreDispatch(
"All milestones complete!",
"success",
"milestone",
basename(s.originalBasePath || s.basePath),
);
deps.logCmuxEvent(
prefs,
@ -411,7 +413,7 @@ export async function runPreDispatch(
const blockerMsg = `Blocked: ${state.blockers.join(", ")}`;
await deps.stopAuto(ctx, pi, blockerMsg);
ctx.ui.notify(`${blockerMsg}. Fix and run /gsd auto.`, "warning");
deps.sendDesktopNotification("GSD", blockerMsg, "error", "attention");
deps.sendDesktopNotification("GSD", blockerMsg, "error", "attention", basename(s.originalBasePath || s.basePath));
deps.logCmuxEvent(prefs, blockerMsg, "error");
} else {
const ids = incomplete.map((m: { id: string }) => m.id).join(", ");
@ -492,6 +494,7 @@ export async function runPreDispatch(
`Milestone ${mid} complete!`,
"success",
"milestone",
basename(s.originalBasePath || s.basePath),
);
deps.logCmuxEvent(
prefs,
@ -509,7 +512,7 @@ export async function runPreDispatch(
const blockerMsg = `Blocked: ${state.blockers.join(", ")}`;
await closeoutAndStop(ctx, pi, s, deps, blockerMsg);
ctx.ui.notify(`${blockerMsg}. Fix and run /gsd auto.`, "warning");
deps.sendDesktopNotification("GSD", blockerMsg, "error", "attention");
deps.sendDesktopNotification("GSD", blockerMsg, "error", "attention", basename(s.originalBasePath || s.basePath));
deps.logCmuxEvent(prefs, blockerMsg, "error");
debugLog("autoLoop", { phase: "exit", reason: "blocked" });
deps.emitJournalEvent({ ts: new Date().toISOString(), flowId: ic.flowId, seq: ic.nextSeq(), eventType: "terminal", data: { reason: "blocked", blockers: state.blockers } });
@ -755,7 +758,7 @@ export async function runGuards(
// 100% — special enforcement logic (halt/pause/warn)
const msg = `Budget ceiling ${deps.formatCost(budgetCeiling)} reached (spent ${deps.formatCost(totalCost)}).`;
if (budgetEnforcementAction === "halt") {
deps.sendDesktopNotification("GSD", msg, "error", "budget");
deps.sendDesktopNotification("GSD", msg, "error", "budget", basename(s.originalBasePath || s.basePath));
await deps.stopAuto(ctx, pi, "Budget ceiling reached");
debugLog("autoLoop", { phase: "exit", reason: "budget-halt" });
return { action: "break", reason: "budget-halt" };
@ -765,14 +768,14 @@ export async function runGuards(
`${msg} Pausing auto-mode — /gsd auto to override and continue.`,
"warning",
);
deps.sendDesktopNotification("GSD", msg, "warning", "budget");
deps.sendDesktopNotification("GSD", msg, "warning", "budget", basename(s.originalBasePath || s.basePath));
deps.logCmuxEvent(prefs, msg, "warning");
await deps.pauseAuto(ctx, pi);
debugLog("autoLoop", { phase: "exit", reason: "budget-pause" });
return { action: "break", reason: "budget-pause" };
}
ctx.ui.notify(`${msg} Continuing (enforcement: warn).`, "warning");
deps.sendDesktopNotification("GSD", msg, "warning", "budget");
deps.sendDesktopNotification("GSD", msg, "warning", "budget", basename(s.originalBasePath || s.basePath));
deps.logCmuxEvent(prefs, msg, "warning");
} else if (threshold.pct < 100) {
// Sub-100% — simple notification
@ -783,6 +786,7 @@ export async function runGuards(
msg,
threshold.notifyLevel,
"budget",
basename(s.originalBasePath || s.basePath),
);
deps.logCmuxEvent(prefs, msg, threshold.cmuxLevel);
}
@ -812,6 +816,7 @@ export async function runGuards(
`Context ${contextUsage.percent}% — paused`,
"warning",
"attention",
basename(s.originalBasePath || s.basePath),
);
await deps.pauseAuto(ctx, pi);
debugLog("autoLoop", { phase: "exit", reason: "context-window" });
@ -929,6 +934,23 @@ export async function runUnitPhase(
},
);
// Select and apply model (with tier escalation on retry — normal units only)
const modelResult = await deps.selectAndApplyModel(
ctx,
pi,
unitType,
unitId,
s.basePath,
prefs,
s.verbose,
s.autoModeStartModel,
sidecarItem ? undefined : { isRetry, previousTier },
);
s.currentUnitRouting =
modelResult.routing as AutoSession["currentUnitRouting"];
s.currentUnitModel =
modelResult.appliedModel as AutoSession["currentUnitModel"];
// Status bar + progress widget
ctx.ui.setStatus("gsd-auto", "auto");
if (mid)
@ -1001,23 +1023,6 @@ export async function runUnitPhase(
logWarning("engine", "Prompt reorder failed", { error: msg });
}
// Select and apply model (with tier escalation on retry — normal units only)
const modelResult = await deps.selectAndApplyModel(
ctx,
pi,
unitType,
unitId,
s.basePath,
prefs,
s.verbose,
s.autoModeStartModel,
sidecarItem ? undefined : { isRetry, previousTier },
);
s.currentUnitRouting =
modelResult.routing as AutoSession["currentUnitRouting"];
s.currentUnitModel =
modelResult.appliedModel as AutoSession["currentUnitModel"];
// Apply sidecar/pre-dispatch hook model override (takes priority over standard model selection)
const hookModelOverride = sidecarItem?.model ?? iterData.hookModelOverride;
if (hookModelOverride) {
@ -1142,14 +1147,18 @@ export async function runUnitPhase(
// ── Immediate unit closeout (metrics, activity log, memory) ────────
// Run right after runUnit() returns so telemetry is never lost to a
// crash between iterations.
await deps.closeoutUnit(
ctx,
s.basePath,
unitType,
unitId,
s.currentUnit.startedAt,
deps.buildSnapshotOpts(unitType, unitId),
);
// Guard: stopAuto() may have nulled s.currentUnit via s.reset() while
// this coroutine was suspended at `await runUnit(...)` (#2939).
if (s.currentUnit) {
await deps.closeoutUnit(
ctx,
s.basePath,
unitType,
unitId,
s.currentUnit.startedAt,
deps.buildSnapshotOpts(unitType, unitId),
);
}
// ── Zero tool-call guard (#1833) ──────────────────────────────────
// An execute-task agent that completes with 0 tool calls made no
@ -1159,7 +1168,7 @@ export async function runUnitPhase(
const currentLedger = deps.getLedger() as { units: Array<{ type: string; id: string; startedAt: number; toolCalls: number }> } | null;
if (currentLedger?.units) {
const lastUnit = [...currentLedger.units].reverse().find(
(u: { type: string; id: string; startedAt: number; toolCalls: number }) => u.type === unitType && u.id === unitId && u.startedAt === s.currentUnit!.startedAt,
(u: { type: string; id: string; startedAt: number; toolCalls: number }) => u.type === unitType && u.id === unitId && u.startedAt === s.currentUnit?.startedAt,
);
if (lastUnit && lastUnit.toolCalls === 0) {
debugLog("runUnitPhase", {
@ -1174,7 +1183,7 @@ export async function runUnitPhase(
);
// Fall through to next iteration where dispatch will re-derive
// and re-dispatch this task.
return { action: "next", data: { unitStartedAt: s.currentUnit.startedAt } };
return { action: "next", data: { unitStartedAt: s.currentUnit?.startedAt } };
}
}
}
@ -1198,7 +1207,7 @@ export async function runUnitPhase(
deps.emitJournalEvent({ ts: new Date().toISOString(), flowId: ic.flowId, seq: ic.nextSeq(), eventType: "unit-end", data: { unitType, unitId, status: unitResult.status, artifactVerified, ...(unitResult.errorContext ? { errorContext: unitResult.errorContext } : {}) }, causedBy: { flowId: ic.flowId, seq: unitStartSeq } });
return { action: "next", data: { unitStartedAt: s.currentUnit.startedAt } };
return { action: "next", data: { unitStartedAt: s.currentUnit?.startedAt } };
}
// ─── runFinalize ──────────────────────────────────────────────────────────────


@ -68,6 +68,28 @@ export async function handleAgentEnd(
const lastMsg = event.messages[event.messages.length - 1];
if (lastMsg && "stopReason" in lastMsg && lastMsg.stopReason === "aborted") {
// Empty content with aborted stopReason is a non-fatal agent stop (the LLM
// chose to end without producing output). Only pause on genuine fatal aborts
// that carry error context — e.g. errorMessage field or non-empty content
// indicating a mid-stream failure. (#2695)
const content = "content" in lastMsg ? lastMsg.content : undefined;
const hasEmptyContent = Array.isArray(content) && content.length === 0;
const hasErrorMessage = "errorMessage" in lastMsg && !!lastMsg.errorMessage;
if (hasEmptyContent && !hasErrorMessage) {
// Non-fatal: treat as a normal agent end so the loop can continue
// instead of entering a stuck re-dispatch cycle.
try {
resetRetryState(retryState);
resolveAgentEnd(event);
} catch (err) {
const message = err instanceof Error ? err.message : String(err);
ctx.ui.notify(`Auto-mode error after empty-content abort: ${message}. Stopping auto-mode.`, "error");
try { await pauseAuto(ctx, pi); } catch { /* best-effort */ }
}
return;
}
await pauseAuto(ctx, pi);
return;
}
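The empty-content check above can be read as a small classifier over the last message. A sketch with assumed message shapes (the real type comes from the agent framework; the interface and function name here are ours):

```typescript
// Minimal shape of the fields the handler inspects (#2695).
interface LastMessageLike {
  stopReason?: string;
  content?: unknown;
  errorMessage?: string;
}

// True when an aborted stop is benign: the LLM ended without producing
// output and no error context was attached, so auto-mode should
// continue rather than pause.
function isNonFatalAbort(msg: LastMessageLike): boolean {
  if (msg.stopReason !== "aborted") return false;
  const hasEmptyContent = Array.isArray(msg.content) && msg.content.length === 0;
  const hasErrorMessage = !!msg.errorMessage;
  return hasEmptyContent && !hasErrorMessage;
}
```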
@ -79,6 +101,15 @@ export async function handleAgentEnd(
// ── 1. Classify ──────────────────────────────────────────────────────
const cls = classifyError(errorMsg, explicitRetryAfterMs);
// Cap rate-limit backoff for CLI-style providers (openai-codex, google-gemini-cli)
// which use per-user quotas with shorter windows (#2922).
if (cls.kind === "rate-limit") {
const currentProvider = ctx.model?.provider;
if (currentProvider === "openai-codex" || currentProvider === "google-gemini-cli") {
cls.retryAfterMs = Math.min(cls.retryAfterMs, 30_000);
}
}
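The provider-specific cap is a one-line `Math.min`; a sketch that isolates it (provider names are taken from the diff, the 30s ceiling matches the code above, and the helper name is ours):

```typescript
// Cap rate-limit backoff for CLI-style providers whose per-user quotas
// reset on short windows (#2922); other providers keep the delay
// suggested by the error classifier.
const CLI_PROVIDER_BACKOFF_CAP_MS = 30_000;
const CLI_PROVIDERS = new Set(["openai-codex", "google-gemini-cli"]);

function capRetryAfter(provider: string | undefined, retryAfterMs: number): number {
  if (provider && CLI_PROVIDERS.has(provider)) {
    return Math.min(retryAfterMs, CLI_PROVIDER_BACKOFF_CAP_MS);
  }
  return retryAfterMs;
}
```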
// ── 2. Decide & Act ──────────────────────────────────────────────────
// --- Network errors: same-model retry with backoff ---


@ -121,14 +121,6 @@ export function registerDbTools(pi: ExtensionAPI): void {
};
}
try {
const db = await import("../gsd-db.js");
const existing = db.getRequirementById(params.id);
if (!existing) {
return {
content: [{ type: "text" as const, text: `Error: Requirement ${params.id} not found.` }],
details: { operation: "update_requirement", id: params.id, error: "not_found" } as any,
};
}
const { updateRequirementInDb } = await import("../db-writer.js");
const updates: Record<string, string | undefined> = {};
if (params.status !== undefined) updates.status = params.status;
@ -196,6 +188,91 @@ export function registerDbTools(pi: ExtensionAPI): void {
pi.registerTool(requirementUpdateTool);
registerAlias(pi, requirementUpdateTool, "gsd_update_requirement", "gsd_requirement_update");
// ─── gsd_requirement_save ─────────────────────────────────────────────
const requirementSaveExecute = async (_toolCallId: string, params: any, _signal: AbortSignal | undefined, _onUpdate: unknown, _ctx: unknown) => {
const dbAvailable = await ensureDbOpen();
if (!dbAvailable) {
return {
content: [{ type: "text" as const, text: "Error: GSD database is not available. Cannot save requirement." }],
details: { operation: "save_requirement", error: "db_unavailable" } as any,
};
}
try {
const { saveRequirementToDb } = await import("../db-writer.js");
const result = await saveRequirementToDb(
{
class: params.class,
status: params.status,
description: params.description,
why: params.why,
source: params.source,
primary_owner: params.primary_owner,
supporting_slices: params.supporting_slices,
validation: params.validation,
notes: params.notes,
},
process.cwd(),
);
return {
content: [{ type: "text" as const, text: `Saved requirement ${result.id}` }],
details: { operation: "save_requirement", id: result.id } as any,
};
} catch (err) {
const msg = err instanceof Error ? err.message : String(err);
logError("tool", `gsd_requirement_save tool failed: ${msg}`, { tool: "gsd_requirement_save", error: String(err) });
return {
content: [{ type: "text" as const, text: `Error saving requirement: ${msg}` }],
details: { operation: "save_requirement", error: msg } as any,
};
}
};
const requirementSaveTool = {
name: "gsd_requirement_save",
label: "Save Requirement",
description:
"Record a new requirement to the GSD database and regenerate REQUIREMENTS.md. " +
"Requirement IDs are auto-assigned — never provide an ID manually.",
promptSnippet: "Record a new GSD requirement to the database (auto-assigns ID, regenerates REQUIREMENTS.md)",
promptGuidelines: [
"Use gsd_requirement_save when recording a new functional, non-functional, or operational requirement.",
"Requirement IDs are auto-assigned (R001, R002, ...) — never guess or provide an ID.",
"class, description, why, and source are required. All other fields are optional.",
"The tool writes to the DB and regenerates .gsd/REQUIREMENTS.md automatically.",
],
parameters: Type.Object({
class: Type.String({ description: "Requirement class (e.g. 'functional', 'non-functional', 'operational')" }),
description: Type.String({ description: "Short description of the requirement" }),
why: Type.String({ description: "Why this requirement matters" }),
source: Type.String({ description: "Origin of the requirement (e.g. 'user-research', 'design', 'M001')" }),
status: Type.Optional(Type.String({ description: "Status (default: 'active')" })),
primary_owner: Type.Optional(Type.String({ description: "Primary owning slice" })),
supporting_slices: Type.Optional(Type.String({ description: "Supporting slices" })),
validation: Type.Optional(Type.String({ description: "Validation criteria" })),
notes: Type.Optional(Type.String({ description: "Additional notes" })),
}),
execute: requirementSaveExecute,
renderCall(args: any, theme: any) {
let text = theme.fg("toolTitle", theme.bold("requirement_save "));
if (args.class) text += theme.fg("accent", `[${args.class}] `);
if (args.description) text += theme.fg("muted", args.description);
return new Text(text, 0, 0);
},
renderResult(result: any, _options: any, theme: any) {
const d = result.details;
if (result.isError || d?.error) {
return new Text(theme.fg("error", `Error: ${d?.error ?? "unknown"}`), 0, 0);
}
let text = theme.fg("success", `Requirement ${d?.id ?? ""} saved`);
text += theme.fg("dim", ` → REQUIREMENTS.md`);
return new Text(text, 0, 0);
},
};
pi.registerTool(requirementSaveTool);
registerAlias(pi, requirementSaveTool, "gsd_save_requirement", "gsd_requirement_save");
// ─── gsd_summary_save (formerly gsd_save_summary) ──────────────────────
const summarySaveExecute = async (_toolCallId: string, params: any, _signal: AbortSignal | undefined, _onUpdate: unknown, _ctx: unknown) => {

View file

@@ -32,6 +32,31 @@ export function resolveProjectRootDbPath(basePath: string): string {
return join(projectRoot, ".gsd", "gsd.db");
}
// Symlink-resolved layout: /.gsd/projects/<hash>/worktrees/M001/...
// The project root is everything before /.gsd/projects/ (#2517)
const symlinkMarker = `${sep}.gsd${sep}projects${sep}`;
const symlinkIdx = basePath.indexOf(symlinkMarker);
if (symlinkIdx !== -1) {
const afterProjects = basePath.slice(symlinkIdx + symlinkMarker.length);
// Expect: <hash>/worktrees/...
const worktreeSeg = `${sep}worktrees${sep}`;
if (afterProjects.includes(worktreeSeg)) {
const projectRoot = basePath.slice(0, symlinkIdx);
return join(projectRoot, ".gsd", "gsd.db");
}
}
// Forward-slash variant for symlink-resolved layout
const fwdSymlinkMarker = "/.gsd/projects/";
const fwdSymlinkIdx = basePath.indexOf(fwdSymlinkMarker);
if (fwdSymlinkIdx !== -1) {
const afterProjects = basePath.slice(fwdSymlinkIdx + fwdSymlinkMarker.length);
if (afterProjects.includes("/worktrees/")) {
const projectRoot = basePath.slice(0, fwdSymlinkIdx);
return join(projectRoot, ".gsd", "gsd.db");
}
}
return join(basePath, ".gsd", "gsd.db");
}
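The marker-slicing above can be sketched as a standalone helper; `projectRootFromSymlinkPath` is an illustrative name, not a module export, and it hard-codes `/` where the real code also checks the platform separator:

```typescript
// Illustrative sketch: recover the project root from the symlink-resolved
// layout /.gsd/projects/<hash>/worktrees/..., or null if the path does not
// match that layout. Hard-codes "/" for simplicity.
function projectRootFromSymlinkPath(basePath: string): string | null {
  const marker = "/.gsd/projects/";
  const idx = basePath.indexOf(marker);
  if (idx === -1) return null;
  // Only treat it as the symlink layout if <hash>/worktrees/ follows.
  const afterProjects = basePath.slice(idx + marker.length);
  if (!afterProjects.includes("/worktrees/")) return null;
  return basePath.slice(0, idx);
}
```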
@@ -81,8 +106,20 @@ export async function ensureDbOpen(): Promise<boolean> {
return opened;
}
process.stderr.write(
`gsd-db: ensureDbOpen failed — no .gsd directory found (resolvedPath=${resolveProjectRootDbPath(basePath)}, cwd=${basePath})\n`,
);
return false;
} catch {
} catch (err) {
const basePath = process.cwd();
const diagnostic = {
resolvedPath: resolveProjectRootDbPath(basePath),
cwd: basePath,
error: (err as Error).message ?? String(err),
};
process.stderr.write(
`gsd-db: ensureDbOpen failed — ${JSON.stringify(diagnostic)}\n`,
);
return false;
}
}

View file

@@ -6,8 +6,9 @@ import { isToolCallEventType } from "@gsd/pi-coding-agent";
import { buildMilestoneFileName, resolveMilestonePath, resolveSliceFile, resolveSlicePath } from "../paths.js";
import { buildBeforeAgentStartResult } from "./system-context.js";
import { handleAgentEnd } from "./agent-end-recovery.js";
import { clearDiscussionFlowState, isDepthVerified, isQueuePhaseActive, markDepthVerified, resetWriteGateState, shouldBlockContextWrite } from "./write-gate.js";
import { clearDiscussionFlowState, isDepthVerified, isQueuePhaseActive, markDepthVerified, resetWriteGateState, shouldBlockContextWrite, shouldBlockQueueExecution } from "./write-gate.js";
import { isBlockedStateFile, isBashWriteToStateFile, BLOCKED_WRITE_ERROR } from "../write-intercept.js";
import { cleanupQuickBranch } from "../quick.js";
import { getDiscussionMilestoneId } from "../guided-flow.js";
import { loadToolApiKeys } from "../commands-config.js";
import { loadFile, saveFile, formatContinue } from "../files.js";
@@ -16,8 +17,6 @@ import { getAutoDashboardData, isAutoActive, isAutoPaused, markToolEnd, markTool
import { isParallelActive, shutdownParallel } from "../parallel-orchestrator.js";
import { checkToolCallLoop, resetToolCallLoopGuard } from "./tool-call-loop-guard.js";
import { saveActivityLog } from "../activity-log.js";
import { startRtkStatusUpdates, stopRtkStatusUpdates } from "../rtk-status.js";
import { rewriteCommandWithRtk } from "../../shared/rtk.js";
// Skip the welcome screen on the very first session_start — cli.ts already
// printed it before the TUI launched. Only re-print on /clear (subsequent sessions).
@@ -29,19 +28,10 @@ async function syncServiceTierStatus(ctx: ExtensionContext): Promise<void> {
}
export function registerHooks(pi: ExtensionAPI): void {
// Route all agent bash tool commands through RTK rewrite when opted in.
// This is a no-op when RTK is disabled or not installed.
pi.on("bash_transform", async (event) => {
const rewritten = rewriteCommandWithRtk(event.command);
if (rewritten === event.command) return undefined;
return { command: rewritten };
});
pi.on("session_start", async (_event, ctx) => {
resetWriteGateState();
resetToolCallLoopGuard();
await syncServiceTierStatus(ctx);
startRtkStatusUpdates(ctx);
// Apply show_token_cost preference (#1515)
try {
@@ -86,11 +76,6 @@ export function registerHooks(pi: ExtensionAPI): void {
clearDiscussionFlowState();
await syncServiceTierStatus(ctx);
loadToolApiKeys();
startRtkStatusUpdates(ctx);
});
pi.on("session_fork", async (_event, ctx) => {
startRtkStatusUpdates(ctx);
});
pi.on("before_agent_start", async (event, ctx: ExtensionContext) => {
@@ -102,6 +87,17 @@ export function registerHooks(pi: ExtensionAPI): void {
await handleAgentEnd(pi, event, ctx);
});
// Squash-merge quick-task branch back to the original branch after the
// agent turn completes (#2668). cleanupQuickBranch is a no-op when no
// quick-return state is pending, so this is safe to call on every turn.
pi.on("turn_end", async () => {
try {
cleanupQuickBranch();
} catch {
// Best-effort: don't break the turn lifecycle if cleanup fails.
}
});
pi.on("session_before_compact", async () => {
if (isAutoActive() || isAutoPaused()) {
return { cancel: true };
@@ -139,7 +135,6 @@ export function registerHooks(pi: ExtensionAPI): void {
});
pi.on("session_shutdown", async (_event, ctx: ExtensionContext) => {
stopRtkStatusUpdates(ctx);
if (isParallelActive()) {
try {
await shutdownParallel(process.cwd());
@@ -161,6 +156,23 @@ export function registerHooks(pi: ExtensionAPI): void {
return { block: true, reason: loopCheck.reason };
}
// ── Queue-mode execution guard (#2545): block source-code mutations ──
// When /gsd queue is active, the agent should only create milestones,
// not execute work. Block write/edit to non-.gsd/ paths and bash commands
// that would modify files.
if (isQueuePhaseActive()) {
let queueInput = "";
if (isToolCallEventType("write", event)) {
queueInput = event.input.path;
} else if (isToolCallEventType("edit", event)) {
queueInput = event.input.path;
} else if (isToolCallEventType("bash", event)) {
queueInput = event.input.command;
}
const queueGuard = shouldBlockQueueExecution(event.toolName, queueInput, true);
if (queueGuard.block) return queueGuard;
}
// ── Single-writer engine: block direct writes to STATE.md ──────────
// Covers write, edit, and bash tools to prevent bypass vectors.
if (isToolCallEventType("write", event)) {
@@ -245,7 +257,7 @@ export function registerHooks(pi: ExtensionAPI): void {
pi.on("tool_execution_start", async (event) => {
if (!isAutoActive()) return;
markToolStart(event.toolCallId, event.toolName);
markToolStart(event.toolCallId);
});
pi.on("tool_execution_end", async (event) => {

View file

@@ -1,4 +1,4 @@
import { existsSync, readFileSync } from "node:fs";
import { existsSync, readFileSync, unlinkSync } from "node:fs";
import { homedir } from "node:os";
import { join } from "node:path";
@@ -6,6 +6,7 @@ import type { ExtensionContext } from "@gsd/pi-coding-agent";
import { debugTime } from "../debug-logger.js";
import { loadPrompt } from "../prompt-loader.js";
import { readForensicsMarker } from "../forensics.js";
import { resolveAllSkillReferences, renderPreferencesForSystemPrompt, loadEffectiveGSDPreferences } from "../preferences.js";
import { resolveGsdRootFile, resolveSliceFile, resolveSlicePath, resolveTaskFile, resolveTaskFiles, resolveTasksDir, relSliceFile, relSlicePath, relTaskFile } from "../paths.js";
import { hasSkillSnapshot, detectNewSkills, formatSkillsXml } from "../skill-discovery.js";
@@ -94,30 +95,54 @@ export async function buildBeforeAgentStartResult(
}
}
let codebaseBlock = "";
const codebasePath = resolveGsdRootFile(process.cwd(), "CODEBASE");
if (existsSync(codebasePath)) {
try {
const rawContent = readFileSync(codebasePath, "utf-8").trim();
if (rawContent) {
// Cap injection size to ~2,000 tokens to avoid bloating every request.
// Full map is always available at .gsd/CODEBASE.md.
const MAX_CODEBASE_CHARS = 8_000;
const generatedMatch = rawContent.match(/Generated: (\S+)/);
const generatedAt = generatedMatch?.[1] ?? "unknown";
const content = rawContent.length > MAX_CODEBASE_CHARS
? rawContent.slice(0, MAX_CODEBASE_CHARS) + "\n\n*(truncated — see .gsd/CODEBASE.md for full map)*"
: rawContent;
codebaseBlock = `\n\n[PROJECT CODEBASE — File structure and descriptions (generated ${generatedAt}, may be stale — run /gsd codebase update to refresh)]\n\n${content}`;
}
} catch {
// skip
}
}
warnDeprecatedAgentInstructions();
const injection = await buildGuidedExecuteContextInjection(event.prompt, process.cwd());
// Re-inject forensics context on follow-up turns (#2941)
const forensicsInjection = !injection ? buildForensicsContextInjection(process.cwd()) : null;
const worktreeBlock = buildWorktreeContextBlock();
const fullSystem = `${event.systemPrompt}\n\n[SYSTEM CONTEXT — GSD]\n\n${systemContent}${preferenceBlock}${knowledgeBlock}${memoryBlock}${newSkillsBlock}${worktreeBlock}`;
const fullSystem = `${event.systemPrompt}\n\n[SYSTEM CONTEXT — GSD]\n\n${systemContent}${preferenceBlock}${knowledgeBlock}${codebaseBlock}${memoryBlock}${newSkillsBlock}${worktreeBlock}`;
stopContextTimer({
systemPromptSize: fullSystem.length,
injectionSize: injection?.length ?? 0,
injectionSize: injection?.length ?? forensicsInjection?.length ?? 0,
hasPreferences: preferenceBlock.length > 0,
hasNewSkills: newSkillsBlock.length > 0,
});
// Determine which context message to inject (guided execute takes priority)
const contextMessage = injection
? { customType: "gsd-guided-context", content: injection, display: false as const }
: forensicsInjection
? { customType: "gsd-forensics", content: forensicsInjection, display: false as const }
: null;
return {
systemPrompt: fullSystem,
...(injection
? {
message: {
customType: "gsd-guided-context",
content: injection,
display: false as const,
},
}
: {}),
...(contextMessage ? { message: contextMessage } : {}),
};
}
@@ -375,3 +400,38 @@ function oneLine(text: string): string {
return text.replace(/\s+/g, " ").trim();
}
// ─── Forensics Context Re-injection (#2941) ──────────────────────────────────
/**
* Check for an active forensics session and return the prompt content
* so it can be re-injected on follow-up turns.
*/
function buildForensicsContextInjection(basePath: string): string | null {
const marker = readForensicsMarker(basePath);
if (!marker) return null;
// Expire markers older than 2 hours to avoid stale context
const age = Date.now() - new Date(marker.createdAt).getTime();
if (age > 2 * 60 * 60 * 1000) {
clearForensicsMarker(basePath);
return null;
}
return marker.promptContent;
}
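The expiry arithmetic above can be isolated into a pure helper for clarity; `isMarkerExpired` and `MARKER_TTL_MS` are illustrative names, not module exports:

```typescript
// Sketch of the two-hour forensics-marker expiry check, taking "now" as a
// parameter so the behavior is deterministic.
const MARKER_TTL_MS = 2 * 60 * 60 * 1000;

function isMarkerExpired(createdAt: string, now: number = Date.now()): boolean {
  return now - new Date(createdAt).getTime() > MARKER_TTL_MS;
}
```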
/**
* Remove the active forensics marker file, e.g. when the investigation
* is complete or the session expires.
*/
export function clearForensicsMarker(basePath: string): void {
const markerPath = join(basePath, ".gsd", "runtime", "active-forensics.json");
if (existsSync(markerPath)) {
try {
unlinkSync(markerPath);
} catch {
// non-fatal
}
}
}

View file

@@ -1,5 +1,31 @@
const MILESTONE_CONTEXT_RE = /M\d+(?:-[a-z0-9]{6})?-CONTEXT\.md$/;
/**
* Path segment that identifies .gsd/ planning artifacts.
* Writes to these paths are allowed during queue mode.
*/
const GSD_DIR_RE = /(^|[/\\])\.gsd([/\\]|$)/;
/**
* Read-only tool names that are always safe during queue mode.
*/
const QUEUE_SAFE_TOOLS = new Set([
"read", "grep", "find", "ls", "glob",
// Discussion & planning tools
"ask_user_questions",
"gsd_milestone_generate_id",
"gsd_summary_save",
// Web research tools used during queue discussion
"search-the-web", "resolve_library", "get_library_docs", "fetch_page",
"search_and_read",
]);
/**
* Bash commands that are read-only / investigative safe during queue mode.
* Matches the leading command in a bash invocation.
*/
const BASH_READ_ONLY_RE = /^\s*(cat|head|tail|less|more|wc|file|stat|du|df|which|type|echo|printf|ls|find|grep|rg|awk|sed\b(?!.*-i)|sort|uniq|diff|comm|tr|cut|tee\s+-a\s+\/dev\/null|git\s+(log|show|diff|status|branch|tag|remote|rev-parse|ls-files|blame|shortlog|describe|stash\s+list|config\s+--get|cat-file)|gh\s+(issue|pr|api|repo|release)\s+(view|list|diff|status|checks)|mkdir\s+-p\s+\.gsd|rtk\s)/;
let depthVerificationDone = false;
let activeQueuePhase = false;
@@ -49,3 +75,52 @@ export function shouldBlockContextWrite(
};
}
/**
* Queue-mode execution guard (#2545).
*
* When the queue phase is active, the agent should only create planning
 * artifacts (milestones, CONTEXT.md, QUEUE.md, etc.), never execute work.
* This function blocks write/edit/bash tool calls that would modify source
* code outside of .gsd/.
*
* @param toolName The tool being called (write, edit, bash, etc.)
* @param input For write/edit: the file path. For bash: the command string.
* @param queuePhaseActive Whether the queue phase is currently active.
* @returns { block, reason } block=true if the call should be rejected.
*/
export function shouldBlockQueueExecution(
toolName: string,
input: string,
queuePhaseActive: boolean,
): { block: boolean; reason?: string } {
if (!queuePhaseActive) return { block: false };
// Always-safe tools (read-only, discussion, planning)
if (QUEUE_SAFE_TOOLS.has(toolName)) return { block: false };
// write/edit — allow if targeting .gsd/ planning artifacts
if (toolName === "write" || toolName === "edit") {
if (GSD_DIR_RE.test(input)) return { block: false };
return {
block: true,
reason: `Blocked: /gsd queue is a planning tool — it creates milestones, not executes work. ` +
`Cannot ${toolName} to "${input}" during queue mode. ` +
`Write CONTEXT.md files and update PROJECT.md/QUEUE.md instead.`,
};
}
// bash — allow read-only/investigative commands, block everything else
if (toolName === "bash") {
if (BASH_READ_ONLY_RE.test(input)) return { block: false };
return {
block: true,
reason: `Blocked: /gsd queue is a planning tool — it creates milestones, not executes work. ` +
`Cannot run "${input.slice(0, 80)}${input.length > 80 ? "…" : ""}" during queue mode. ` +
`Use read-only commands (cat, grep, git log, etc.) to investigate, then write planning artifacts.`,
};
}
// Unknown tools — allow by default (custom extension tools, etc.)
return { block: false };
}
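The decision logic above reduces to two regex checks plus a pass-through default. This standalone sketch uses the same `GSD_DIR_RE` and a trimmed-down read-only command pattern; `guardBlocks` is a hypothetical helper, not the exported `shouldBlockQueueExecution`:

```typescript
// Illustrative queue-mode guard: true means the tool call would be blocked.
const SKETCH_GSD_DIR_RE = /(^|[/\\])\.gsd([/\\]|$)/;
const SKETCH_READ_ONLY_RE = /^\s*(cat|grep|rg|ls|git\s+(log|show|diff|status))\b/;

function guardBlocks(toolName: string, input: string): boolean {
  if (toolName === "write" || toolName === "edit") {
    return !SKETCH_GSD_DIR_RE.test(input); // .gsd/ planning writes stay allowed
  }
  if (toolName === "bash") {
    return !SKETCH_READ_ONLY_RE.test(input); // investigative commands stay allowed
  }
  return false; // unknown tools pass through, matching the real guard's default
}
```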

View file

@@ -26,6 +26,7 @@ export interface CaptureEntry {
resolution?: string;
rationale?: string;
resolvedAt?: string;
resolvedInMilestone?: string;
executed?: boolean;
}
@@ -176,6 +177,7 @@ export function markCaptureResolved(
classification: Classification,
resolution: string,
rationale: string,
milestoneId?: string,
): void {
const filePath = resolveCapturesPath(basePath);
if (!existsSync(filePath)) return;
@@ -206,13 +208,17 @@ export function markCaptureResolved(
`**Rationale:** ${rationale}`,
`**Resolved:** ${resolvedAt}`,
];
if (milestoneId) {
newFields.push(`**Milestone:** ${milestoneId}`);
}
// Remove any existing classification/resolution/rationale/resolved fields
// Remove any existing classification/resolution/rationale/resolved/milestone fields
// (in case of re-triage)
section = section.replace(/\*\*Classification:\*\*\s*.+\n?/g, "");
section = section.replace(/\*\*Resolution:\*\*\s*.+\n?/g, "");
section = section.replace(/\*\*Rationale:\*\*\s*.+\n?/g, "");
section = section.replace(/\*\*Resolved:\*\*\s*.+\n?/g, "");
section = section.replace(/\*\*Milestone:\*\*\s*.+\n?/g, "");
// Add new fields after Status line
section = section.trimEnd() + "\n" + newFields.join("\n") + "\n";
@@ -255,18 +261,70 @@ export function markCaptureExecuted(basePath: string, captureId: string): void {
* Load resolved captures that have actionable classifications (inject, replan,
* quick-task) but have NOT yet been executed.
* These are captures whose resolutions need to be carried out.
*
* When `currentMilestoneId` is provided, captures resolved in a *different*
* milestone are treated as stale and excluded. This prevents quick-task
* captures from a prior milestone re-executing after the underlying issues
* were already fixed by planned milestone work (#2872).
*
* Captures that have no `resolvedInMilestone` (legacy captures resolved before
* this field was introduced) are always included for backward compatibility.
*/
export function loadActionableCaptures(basePath: string): CaptureEntry[] {
export function loadActionableCaptures(basePath: string, currentMilestoneId?: string): CaptureEntry[] {
return loadAllCaptures(basePath).filter(
c =>
c.status === "resolved" &&
!c.executed &&
(c.classification === "inject" ||
c.classification === "replan" ||
c.classification === "quick-task"),
c.classification === "quick-task") &&
// Staleness gate: exclude captures resolved in a different milestone (#2872)
(!currentMilestoneId ||
!c.resolvedInMilestone ||
c.resolvedInMilestone === currentMilestoneId),
);
}
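The #2872 staleness predicate embedded in that filter can be read on its own, over a pared-down capture shape; `isStaleCapture` is illustrative, not a module export:

```typescript
// Sketch of the staleness gate: a capture is stale only when it was resolved
// in a *different* milestone than the current one. Legacy captures (no
// resolvedInMilestone field) and calls without milestone context are included.
interface CaptureSketch {
  resolvedInMilestone?: string;
}

function isStaleCapture(c: CaptureSketch, currentMilestoneId?: string): boolean {
  if (!currentMilestoneId) return false; // no milestone context: include everything
  if (!c.resolvedInMilestone) return false; // legacy capture: include for compat
  return c.resolvedInMilestone !== currentMilestoneId;
}
```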
/**
* Retroactively stamp a capture with a milestone ID.
*
* Used by executeTriageResolutions() as a safety net when the triage LLM
* resolves a capture without writing the **Milestone:** field. This ensures
* the staleness gate in loadActionableCaptures() works correctly even for
* captures resolved before the prompt was updated (#2872).
*/
export function stampCaptureMilestone(basePath: string, captureId: string, milestoneId: string): void {
const filePath = resolveCapturesPath(basePath);
if (!existsSync(filePath)) return;
const content = readFileSync(filePath, "utf-8");
const sectionRegex = new RegExp(
`(### ${escapeRegex(captureId)}\\n(?:(?!### ).)*?)(?=### |$)`,
"s",
);
const match = sectionRegex.exec(content);
if (!match) return;
let section = match[1];
// Only stamp if not already present
if (/\*\*Milestone:\*\*/.test(section)) return;
// Insert after the Resolved field (or at end of section)
const resolvedFieldEnd = section.search(/\*\*Resolved:\*\*\s*.+\n?/);
if (resolvedFieldEnd !== -1) {
const resolvedMatch = section.match(/\*\*Resolved:\*\*\s*.+\n?/);
const insertPos = resolvedFieldEnd + (resolvedMatch?.[0]?.length ?? 0);
section = section.slice(0, insertPos) + `**Milestone:** ${milestoneId}\n` + section.slice(insertPos);
} else {
section = section.trimEnd() + "\n" + `**Milestone:** ${milestoneId}` + "\n";
}
const updated = content.replace(sectionRegex, section);
writeFileSync(filePath, updated, "utf-8");
}
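The splice performed above can be condensed into a small pure function over the section text; `insertMilestoneField` is an illustrative helper, not part of the module:

```typescript
// Sketch of the stamp: insert "**Milestone:** ..." right after the
// "**Resolved:** ..." line, append at the end otherwise, and no-op if a
// Milestone field is already present (idempotent).
function insertMilestoneField(section: string, milestoneId: string): string {
  if (/\*\*Milestone:\*\*/.test(section)) return section;
  const resolved = section.match(/\*\*Resolved:\*\*\s*.+\n?/);
  if (resolved && resolved.index !== undefined) {
    const pos = resolved.index + resolved[0].length;
    return section.slice(0, pos) + `**Milestone:** ${milestoneId}\n` + section.slice(pos);
  }
  return section.trimEnd() + `\n**Milestone:** ${milestoneId}\n`;
}
```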
// ─── Parser ───────────────────────────────────────────────────────────────────
/**
@@ -291,6 +349,7 @@ function parseCapturesContent(content: string): CaptureEntry[] {
const resolution = extractBoldField(body, "Resolution");
const rationale = extractBoldField(body, "Rationale");
const resolvedAt = extractBoldField(body, "Resolved");
const milestoneId = extractBoldField(body, "Milestone");
const executedAt = extractBoldField(body, "Executed");
if (!text || !timestamp) continue;
@@ -308,6 +367,7 @@ function parseCapturesContent(content: string): CaptureEntry[] {
...(resolution ? { resolution } : {}),
...(rationale ? { rationale } : {}),
...(resolvedAt ? { resolvedAt } : {}),
...(milestoneId ? { resolvedInMilestone: milestoneId } : {}),
...(executedAt ? { executed: true } : {}),
});
}

View file

@@ -0,0 +1,351 @@
/**
* GSD Codebase Map Generator
*
 * Produces .gsd/CODEBASE.md, a structural table of contents for the project.
* Gives fresh agent contexts instant orientation without filesystem exploration.
*
* Generation: walk `git ls-files`, group by directory, output with descriptions.
* Maintenance: agent updates descriptions as it works; incremental update preserves them.
*/
import { existsSync, readFileSync, writeFileSync, mkdirSync } from "node:fs";
import { join, dirname, extname } from "node:path";
import { execSync } from "node:child_process";
import { gsdRoot } from "./paths.js";
// ─── Types ───────────────────────────────────────────────────────────────────
export interface CodebaseMapOptions {
excludePatterns?: string[];
maxFiles?: number;
collapseThreshold?: number;
}
interface FileEntry {
path: string;
description: string;
}
interface DirectoryGroup {
path: string;
files: FileEntry[];
collapsed: boolean;
}
// ─── Defaults ────────────────────────────────────────────────────────────────
const DEFAULT_EXCLUDES = [
".gsd/",
".planning/",
".git/",
"node_modules/",
"dist/",
"build/",
".next/",
"coverage/",
"__pycache__/",
".venv/",
"vendor/",
];
const DEFAULT_MAX_FILES = 500;
const DEFAULT_COLLAPSE_THRESHOLD = 20;
// ─── Parsing ─────────────────────────────────────────────────────────────────
/**
* Parse an existing CODEBASE.md to extract file description mappings.
* Also scans <!-- gsd:collapsed-descriptions --> comment blocks to preserve
* descriptions for files in collapsed directories across incremental updates.
*/
export function parseCodebaseMap(content: string): Map<string, string> {
const descriptions = new Map<string, string>();
let inCollapsedBlock = false;
for (const line of content.split("\n")) {
// Track collapsed-description comment blocks
if (line.trimStart().startsWith("<!-- gsd:collapsed-descriptions")) {
inCollapsedBlock = true;
continue;
}
if (inCollapsedBlock && line.trimStart().startsWith("-->")) {
inCollapsedBlock = false;
continue;
}
// Match: - `path/to/file.ts` — Description here
const match = line.match(/^- `(.+?)` — (.+)$/);
if (match) {
descriptions.set(match[1], match[2]);
continue;
}
// Match: - `path/to/file.ts` (no description) — only outside collapsed blocks
if (!inCollapsedBlock) {
const bareMatch = line.match(/^- `(.+?)`\s*$/);
if (bareMatch) {
descriptions.set(bareMatch[1], "");
}
}
}
return descriptions;
}
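The per-line format this parser recovers can be shown as a tiny round trip; `renderLine` and `parseLine` are illustrative helpers, not module exports:

```typescript
// Illustrative round trip for the "- `path` — description" line format
// recovered by parseCodebaseMap. Described lines carry an em-dash separator;
// bare lines are just the backticked path.
function renderLine(path: string, description: string): string {
  return description ? `- \`${path}\` — ${description}` : `- \`${path}\``;
}

function parseLine(line: string): [path: string, description: string] | null {
  const described = line.match(/^- `(.+?)` — (.+)$/);
  if (described) return [described[1], described[2]];
  const bare = line.match(/^- `(.+?)`\s*$/);
  return bare ? [bare[1], ""] : null;
}
```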
// ─── File Enumeration ────────────────────────────────────────────────────────
function shouldExclude(filePath: string, excludes: string[]): boolean {
for (const pattern of excludes) {
if (pattern.endsWith("/")) {
if (filePath.startsWith(pattern) || filePath.includes(`/${pattern}`)) return true;
} else if (filePath === pattern || filePath.endsWith(`/${pattern}`)) {
return true;
}
}
// Skip binary/lock files
const ext = extname(filePath).toLowerCase();
if ([".lock", ".png", ".jpg", ".jpeg", ".gif", ".ico", ".woff", ".woff2", ".ttf", ".eot", ".svg"].includes(ext)) {
return true;
}
return false;
}
function lsFiles(basePath: string): string[] {
try {
const result = execSync("git ls-files", { cwd: basePath, encoding: "utf-8", timeout: 10000 });
return result.split("\n").filter(Boolean);
} catch {
return [];
}
}
/**
* Enumerate tracked files, applying exclusions and the maxFiles cap.
* Returns both the file list and whether truncation occurred.
*/
function enumerateFiles(basePath: string, excludes: string[], maxFiles: number): { files: string[]; truncated: boolean } {
const allFiles = lsFiles(basePath);
const filtered = allFiles.filter((f) => !shouldExclude(f, excludes));
const truncated = filtered.length > maxFiles;
return { files: truncated ? filtered.slice(0, maxFiles) : filtered, truncated };
}
// ─── Grouping ────────────────────────────────────────────────────────────────
function groupByDirectory(
files: string[],
descriptions: Map<string, string>,
collapseThreshold: number,
): DirectoryGroup[] {
const dirMap = new Map<string, FileEntry[]>();
for (const file of files) {
const dir = dirname(file);
const dirKey = dir === "." ? "" : dir;
if (!dirMap.has(dirKey)) {
dirMap.set(dirKey, []);
}
dirMap.get(dirKey)!.push({
path: file,
description: descriptions.get(file) ?? "",
});
}
const groups: DirectoryGroup[] = [];
const sortedDirs = [...dirMap.keys()].sort();
for (const dir of sortedDirs) {
const dirFiles = dirMap.get(dir)!;
dirFiles.sort((a, b) => a.path.localeCompare(b.path));
groups.push({
path: dir,
files: dirFiles,
collapsed: dirFiles.length > collapseThreshold,
});
}
return groups;
}
// ─── Rendering ───────────────────────────────────────────────────────────────
function renderCodebaseMap(groups: DirectoryGroup[], totalFiles: number, truncated: boolean): string {
const lines: string[] = [];
const now = new Date().toISOString().split(".")[0] + "Z";
const described = groups.reduce((sum, g) => sum + g.files.filter((f) => f.description).length, 0);
lines.push("# Codebase Map");
lines.push("");
lines.push(`Generated: ${now} | Files: ${totalFiles} | Described: ${described}/${totalFiles}`);
if (truncated) {
lines.push(`Note: Truncated to first ${totalFiles} files. Run with higher --max-files to include all.`);
}
lines.push("");
for (const group of groups) {
const heading = group.path || "(root)";
lines.push(`### ${heading}/`);
if (group.collapsed) {
// Summarize collapsed directories
const extensions = new Map<string, number>();
for (const f of group.files) {
const ext = extname(f.path) || "(no ext)";
extensions.set(ext, (extensions.get(ext) ?? 0) + 1);
}
const extSummary = [...extensions.entries()]
.sort((a, b) => b[1] - a[1])
.map(([ext, count]) => `${count} ${ext}`)
.join(", ");
lines.push(`- *(${group.files.length} files: ${extSummary})*`);
// Preserve any existing descriptions in a hidden comment block so
// incremental updates can recover them via parseCodebaseMap.
const descLines = group.files
.filter((f) => f.description)
.map((f) => `- \`${f.path}\` — ${f.description}`);
if (descLines.length > 0) {
lines.push("<!-- gsd:collapsed-descriptions");
lines.push(...descLines);
lines.push("-->");
}
} else {
for (const file of group.files) {
if (file.description) {
lines.push(`- \`${file.path}\` — ${file.description}`);
} else {
lines.push(`- \`${file.path}\``);
}
}
}
lines.push("");
}
return lines.join("\n");
}
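The summary line for a collapsed directory boils down to an extension histogram rendered in descending count order. This standalone sketch mirrors that ordering; `summarizeExtensions` is illustrative and operates on basenames rather than using `extname`:

```typescript
// Count files per extension and render "N .ext" entries, most common first.
function summarizeExtensions(fileNames: string[]): string {
  const counts = new Map<string, number>();
  for (const name of fileNames) {
    const dot = name.lastIndexOf(".");
    const ext = dot > 0 ? name.slice(dot) : "(no ext)";
    counts.set(ext, (counts.get(ext) ?? 0) + 1);
  }
  return [...counts.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([ext, count]) => `${count} ${ext}`)
    .join(", ");
}
```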
// ─── Public API ──────────────────────────────────────────────────────────────
/**
* Generate a fresh CODEBASE.md from scratch.
* Preserves existing descriptions if `existingDescriptions` is provided.
*/
export function generateCodebaseMap(
basePath: string,
options?: CodebaseMapOptions,
existingDescriptions?: Map<string, string>,
): { content: string; fileCount: number; truncated: boolean; files: string[] } {
const excludes = [...DEFAULT_EXCLUDES, ...(options?.excludePatterns ?? [])];
const maxFiles = options?.maxFiles ?? DEFAULT_MAX_FILES;
const collapseThreshold = options?.collapseThreshold ?? DEFAULT_COLLAPSE_THRESHOLD;
const { files, truncated } = enumerateFiles(basePath, excludes, maxFiles);
const descriptions = existingDescriptions ?? new Map<string, string>();
const groups = groupByDirectory(files, descriptions, collapseThreshold);
const content = renderCodebaseMap(groups, files.length, truncated);
return { content, fileCount: files.length, truncated, files };
}
/**
* Incremental update: re-scan files, preserve existing descriptions,
* add new files, remove deleted files.
*/
export function updateCodebaseMap(
basePath: string,
options?: CodebaseMapOptions,
): { content: string; added: number; removed: number; unchanged: number; fileCount: number; truncated: boolean } {
const codebasePath = join(gsdRoot(basePath), "CODEBASE.md");
// Load existing descriptions
let existingDescriptions = new Map<string, string>();
if (existsSync(codebasePath)) {
const existing = readFileSync(codebasePath, "utf-8");
existingDescriptions = parseCodebaseMap(existing);
}
const existingFiles = new Set(existingDescriptions.keys());
// Generate new map preserving descriptions — reuse the returned file list
// to avoid a second enumeration (prevents race between content and stats).
const result = generateCodebaseMap(basePath, options, existingDescriptions);
const currentSet = new Set(result.files);
// Count changes
let added = 0;
let removed = 0;
for (const f of result.files) {
if (!existingFiles.has(f)) added++;
}
for (const f of existingFiles) {
if (!currentSet.has(f)) removed++;
}
return {
content: result.content,
added,
removed,
unchanged: result.files.length - added,
fileCount: result.fileCount,
truncated: result.truncated,
};
}
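The added/removed bookkeeping above is a plain set diff, shown here in isolation (`diffCounts` is an illustrative helper, not a module export):

```typescript
// Count additions, removals, and unchanged entries between two file sets.
// "unchanged" is everything in the new set that was also in the old one.
function diffCounts(before: Set<string>, after: Set<string>): {
  added: number;
  removed: number;
  unchanged: number;
} {
  let added = 0;
  let removed = 0;
  for (const f of after) if (!before.has(f)) added++;
  for (const f of before) if (!after.has(f)) removed++;
  return { added, removed, unchanged: after.size - added };
}
```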
/**
* Write CODEBASE.md to .gsd/ directory.
*/
export function writeCodebaseMap(basePath: string, content: string): string {
const root = gsdRoot(basePath);
mkdirSync(root, { recursive: true });
const outPath = join(root, "CODEBASE.md");
writeFileSync(outPath, content, "utf-8");
return outPath;
}
/**
* Read existing CODEBASE.md, or return null if it doesn't exist.
*/
export function readCodebaseMap(basePath: string): string | null {
const codebasePath = join(gsdRoot(basePath), "CODEBASE.md");
if (!existsSync(codebasePath)) return null;
try {
return readFileSync(codebasePath, "utf-8");
} catch {
return null;
}
}
/**
* Get stats about the codebase map.
*/
export function getCodebaseMapStats(basePath: string): {
exists: boolean;
fileCount: number;
describedCount: number;
undescribedCount: number;
generatedAt: string | null;
} {
const content = readCodebaseMap(basePath);
if (!content) {
return { exists: false, fileCount: 0, describedCount: 0, undescribedCount: 0, generatedAt: null };
}
// Parse total file count from the header line (accurate even for collapsed dirs)
const fileCountMatch = content.match(/Files:\s*(\d+)/);
const totalFiles = fileCountMatch ? parseInt(fileCountMatch[1], 10) : 0;
// Use parseCodebaseMap to count described files (includes collapsed-description blocks)
const descriptions = parseCodebaseMap(content);
const described = [...descriptions.values()].filter((d) => d.length > 0).length;
const dateMatch = content.match(/Generated: (\S+)/);
return {
exists: true,
fileCount: totalFiles,
describedCount: described,
undescribedCount: totalFiles - described,
generatedAt: dateMatch?.[1] ?? null,
};
}

View file

@@ -0,0 +1,164 @@
/**
* GSD Command /gsd codebase
*
* Generate and manage the codebase map (.gsd/CODEBASE.md).
* Subcommands: generate, update, stats, help
*/
import type { ExtensionAPI, ExtensionCommandContext } from "@gsd/pi-coding-agent";
import {
generateCodebaseMap,
updateCodebaseMap,
writeCodebaseMap,
getCodebaseMapStats,
readCodebaseMap,
} from "./codebase-generator.js";
const USAGE =
"Usage: /gsd codebase [generate|update|stats]\n\n" +
" generate [--max-files N] — Generate or regenerate CODEBASE.md\n" +
" update — Incremental update (preserves descriptions)\n" +
" stats — Show file count, coverage, and generation time\n" +
" help — Show this help\n\n" +
"With no subcommand, shows stats if a map exists or help if not.";
export async function handleCodebase(
args: string,
ctx: ExtensionCommandContext,
_pi: ExtensionAPI,
): Promise<void> {
const basePath = process.cwd();
const parts = args.trim().split(/\s+/);
const sub = parts[0] ?? "";
switch (sub) {
case "generate": {
const maxFiles = parseMaxFiles(args, ctx);
if (maxFiles === false) return; // validation failed, message already shown
const existing = readCodebaseMap(basePath);
const existingDescriptions = existing
? (await import("./codebase-generator.js")).parseCodebaseMap(existing)
: undefined;
const result = generateCodebaseMap(basePath, { maxFiles: maxFiles ?? undefined }, existingDescriptions);
if (result.fileCount === 0) {
ctx.ui.notify(
"Codebase map generated with 0 files.\n" +
"Is this a git repository? Run 'git ls-files' to verify.",
"warning",
);
return;
}
const outPath = writeCodebaseMap(basePath, result.content);
ctx.ui.notify(
`Codebase map generated: ${result.fileCount} files\n` +
`Written to: ${outPath}` +
(result.truncated ? `\n⚠ Truncated — increase --max-files to include all files` : ""),
"success",
);
return;
}
case "update": {
const existing = readCodebaseMap(basePath);
if (!existing) {
ctx.ui.notify(
"No codebase map found. Run /gsd codebase generate to create one.",
"warning",
);
return;
}
const maxFiles = parseMaxFiles(args, ctx);
if (maxFiles === false) return;
const result = updateCodebaseMap(basePath, { maxFiles: maxFiles ?? undefined });
writeCodebaseMap(basePath, result.content);
ctx.ui.notify(
`Codebase map updated: ${result.fileCount} files\n` +
` Added: ${result.added} | Removed: ${result.removed} | Unchanged: ${result.unchanged}` +
(result.truncated ? `\n⚠ Truncated — increase --max-files to include all files` : ""),
"success",
);
return;
}
case "stats": {
showStats(basePath, ctx);
return;
}
case "help":
ctx.ui.notify(USAGE, "info");
return;
case "": {
// Safe default: show stats if map exists, help if not
const existing = readCodebaseMap(basePath);
if (existing) {
showStats(basePath, ctx);
} else {
ctx.ui.notify(USAGE, "info");
}
return;
}
default:
ctx.ui.notify(
`Unknown subcommand "${sub}".\n\n${USAGE}`,
"warning",
);
}
}
function showStats(basePath: string, ctx: ExtensionCommandContext): void {
const stats = getCodebaseMapStats(basePath);
if (!stats.exists) {
ctx.ui.notify("No codebase map found. Run /gsd codebase generate to create one.", "info");
return;
}
const coverage = stats.fileCount > 0
? Math.round((stats.describedCount / stats.fileCount) * 100)
: 0;
ctx.ui.notify(
`Codebase Map Stats:\n` +
` Files: ${stats.fileCount}\n` +
` Described: ${stats.describedCount} (${coverage}%)\n` +
` Undescribed: ${stats.undescribedCount}\n` +
` Generated: ${stats.generatedAt ?? "unknown"}\n\n` +
(stats.undescribedCount > 0
? `Tip: Run /gsd codebase update to refresh after file changes.`
: `Coverage is complete.`),
"info",
);
}
/**
* Parse and validate --max-files flag.
* Returns the parsed number, undefined if flag not present, or false if invalid.
*/
function parseMaxFiles(args: string, ctx: ExtensionCommandContext): number | undefined | false {
const maxFilesStr = extractFlag(args, "--max-files");
if (!maxFilesStr) return undefined;
const maxFiles = parseInt(maxFilesStr, 10);
if (isNaN(maxFiles) || maxFiles < 1) {
ctx.ui.notify("--max-files must be a positive integer (e.g. --max-files 200).", "warning");
return false;
}
return maxFiles;
}
function extractFlag(args: string, flag: string): string | undefined {
const escaped = flag.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
const regex = new RegExp(`${escaped}[=\\s]+(\\S+)`);
const match = args.match(regex);
return match?.[1];
}
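The flag grammar accepted above can be sketched standalone (the helper name is illustrative; it mirrors `extractFlag()`):

```typescript
// Standalone sketch mirroring extractFlag() above (name is illustrative).
function extractFlagSketch(args: string, flag: string): string | undefined {
  // Escape regex metacharacters in the flag name.
  const escaped = flag.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  // Accepts both "--max-files 200" and "--max-files=200".
  return args.match(new RegExp(`${escaped}[=\\s]+(\\S+)`))?.[1];
}

extractFlagSketch("generate --max-files 200", "--max-files"); // "200"
extractFlagSketch("generate --max-files=200", "--max-files"); // "200"
extractFlagSketch("generate", "--max-files"); // undefined
```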

View file

@ -15,7 +15,7 @@ export interface GsdCommandDefinition {
type CompletionMap = Record<string, readonly GsdCommandDefinition[]>;
export const GSD_COMMAND_DESCRIPTION =
"GSD — Get Shit Done: /gsd help|start|templates|next|auto|stop|pause|status|widget|visualize|queue|quick|discuss|capture|triage|dispatch|history|undo|undo-task|reset-slice|rate|skip|export|cleanup|mode|prefs|config|keys|hooks|run-hook|skill-health|doctor|logs|forensics|changelog|migrate|remote|steer|knowledge|new-milestone|parallel|cmux|park|unpark|init|setup|inspect|extensions|update|fast|mcp|rethink";
"GSD — Get Shit Done: /gsd help|start|templates|next|auto|stop|pause|status|widget|visualize|queue|quick|discuss|capture|triage|dispatch|history|undo|undo-task|reset-slice|rate|skip|export|cleanup|mode|prefs|config|keys|hooks|run-hook|skill-health|doctor|logs|forensics|changelog|migrate|remote|steer|knowledge|new-milestone|parallel|cmux|park|unpark|init|setup|inspect|extensions|update|fast|mcp|rethink|codebase";
export const TOP_LEVEL_SUBCOMMANDS: readonly GsdCommandDefinition[] = [
{ cmd: "help", desc: "Categorized command reference with descriptions" },
@ -71,6 +71,7 @@ export const TOP_LEVEL_SUBCOMMANDS: readonly GsdCommandDefinition[] = [
{ cmd: "mcp", desc: "MCP server status and connectivity check (status, check <server>)" },
{ cmd: "rethink", desc: "Conversational project reorganization — reorder, park, discard, add milestones" },
{ cmd: "workflow", desc: "Custom workflow lifecycle (new, run, list, validate, pause, resume)" },
{ cmd: "codebase", desc: "Generate and manage codebase map (.gsd/CODEBASE.md)" },
];
const NESTED_COMPLETIONS: CompletionMap = {
@ -225,6 +226,14 @@ const NESTED_COMPLETIONS: CompletionMap = {
{ cmd: "pause", desc: "Pause custom workflow auto-mode" },
{ cmd: "resume", desc: "Resume paused custom workflow auto-mode" },
],
codebase: [
{ cmd: "generate", desc: "Generate or regenerate CODEBASE.md" },
{ cmd: "generate --max-files", desc: "Generate with custom file limit (default: 500)" },
{ cmd: "update", desc: "Incremental update (preserves descriptions)" },
{ cmd: "update --max-files", desc: "Update with custom file limit" },
{ cmd: "stats", desc: "Show file count, description coverage, and generation time" },
{ cmd: "help", desc: "Show usage and available subcommands" },
],
};
function filterOptions(

View file

@ -206,5 +206,10 @@ Examples:
await handleRethink(trimmed, ctx, pi);
return true;
}
if (trimmed === "codebase" || trimmed.startsWith("codebase ")) {
const { handleCodebase } = await import("../../commands-codebase.js");
await handleCodebase(trimmed.replace(/^codebase\s*/, "").trim(), ctx, pi);
return true;
}
return false;
}

View file

@ -35,15 +35,17 @@ const UNIT_TYPE_TIERS: Record<string, ComplexityTier> = {
"complete-slice": "light",
"run-uat": "light",
// Tier 2 — Standard: research, routine planning, discussion
// Tier 2 — Standard: research, routine discussion
"discuss-milestone": "standard",
"discuss-slice": "standard",
"research-milestone": "standard",
"research-slice": "standard",
"plan-milestone": "standard",
"plan-slice": "standard",
// Tier 3 — Heavy: execution, replanning (requires deep reasoning)
// Tier 3 — Heavy: planning, execution, replanning (requires deep reasoning)
// Planning is heavy so it uses the best configured model (e.g. Opus) and is
// not downgraded by dynamic routing when a capable model is configured.
"plan-milestone": "heavy",
"plan-slice": "heavy",
"execute-task": "standard", // default standard, upgraded by metadata
"replan-slice": "heavy",
"reassess-roadmap": "heavy",
@ -185,8 +187,8 @@ function analyzePlanComplexity(
// Check if this is a milestone-level plan (more complex) vs single slice
const { milestone: mid, slice: sid } = parseUnitId(unitId);
if (!sid) {
// Milestone-level planning is always at least standard
return { tier: "standard", reason: "milestone-level planning" };
// Milestone-level planning is always heavy — requires full context and best model
return { tier: "heavy", reason: "milestone-level planning" };
}
// For slice planning, try to read the context/research to gauge complexity

View file

@ -227,6 +227,122 @@ export async function nextDecisionId(): Promise<string> {
}
}
// ─── Next Requirement ID ─────────────────────────────────────────────────
/**
* Compute the next requirement ID from the current DB state.
* Queries MAX(CAST(SUBSTR(id, 2) AS INTEGER)) from requirements table.
* Returns R001 if no requirements exist. Zero-pads to 3 digits.
*/
export async function nextRequirementId(): Promise<string> {
try {
const db = await import('./gsd-db.js');
const adapter = db._getAdapter();
if (!adapter) return 'R001';
const row = adapter
.prepare('SELECT MAX(CAST(SUBSTR(id, 2) AS INTEGER)) as max_num FROM requirements')
.get();
const maxNum = row ? (row['max_num'] as number | null) : null;
if (maxNum == null || isNaN(maxNum)) return 'R001';
const next = maxNum + 1;
return `R${String(next).padStart(3, '0')}`;
} catch (err) {
logError('manifest', 'nextRequirementId failed', { fn: 'nextRequirementId', error: String((err as Error).message) });
return 'R001';
}
}
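The ID arithmetic reduces to a small pure function. A sketch (the helper name is illustrative):

```typescript
// Sketch of the ID computation in nextRequirementId() (name is illustrative):
// null max → R001, otherwise increment and zero-pad to 3 digits.
const nextReqId = (maxNum: number | null): string =>
  maxNum == null ? "R001" : `R${String(maxNum + 1).padStart(3, "0")}`;

nextReqId(null); // "R001"
nextReqId(7);    // "R008"
nextReqId(999);  // "R1000" (padStart does not truncate past 3 digits)
```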
// ─── Save Requirement to DB + Regenerate Markdown ────────────────────────
export interface SaveRequirementFields {
class: string;
status?: string;
description: string;
why: string;
source: string;
primary_owner?: string;
supporting_slices?: string;
validation?: string;
notes?: string;
}
/**
* Save a new requirement to DB and regenerate REQUIREMENTS.md.
* Auto-assigns the next ID via nextRequirementId().
* Returns the assigned ID.
*/
export async function saveRequirementToDb(
fields: SaveRequirementFields,
basePath: string,
): Promise<{ id: string }> {
try {
const db = await import('./gsd-db.js');
const id = await nextRequirementId();
const requirement: Requirement = {
id,
class: fields.class,
status: fields.status ?? 'active',
description: fields.description,
why: fields.why,
source: fields.source,
primary_owner: fields.primary_owner ?? '',
supporting_slices: fields.supporting_slices ?? '',
validation: fields.validation ?? '',
notes: fields.notes ?? '',
full_content: '',
superseded_by: null,
};
db.upsertRequirement(requirement);
// Fetch all requirements for full file regeneration
const adapter = db._getAdapter();
let allRequirements: Requirement[] = [];
if (adapter) {
const rows = adapter.prepare('SELECT * FROM requirements ORDER BY id').all();
allRequirements = rows.map(row => ({
id: row['id'] as string,
class: row['class'] as string,
status: row['status'] as string,
description: row['description'] as string,
why: row['why'] as string,
source: row['source'] as string,
primary_owner: row['primary_owner'] as string,
supporting_slices: row['supporting_slices'] as string,
validation: row['validation'] as string,
notes: row['notes'] as string,
full_content: row['full_content'] as string,
superseded_by: (row['superseded_by'] as string) ?? null,
}));
}
const nonSuperseded = allRequirements.filter(r => r.superseded_by == null);
const md = generateRequirementsMd(nonSuperseded);
const filePath = resolveGsdRootFile(basePath, 'REQUIREMENTS');
try {
await saveFile(filePath, md);
} catch (diskErr) {
logError('manifest', 'disk write failed, rolling back DB row', { fn: 'saveRequirementToDb', error: String((diskErr as Error).message) });
const rollbackAdapter = db._getAdapter();
rollbackAdapter?.prepare('DELETE FROM requirements WHERE id = :id').run({ ':id': id });
throw diskErr;
}
invalidateStateCache();
clearPathCache();
clearParseCache();
return { id };
} catch (err) {
logError('manifest', 'saveRequirementToDb failed', { fn: 'saveRequirementToDb', error: String((err as Error).message) });
throw err;
}
}
// ─── Save Decision to DB + Regenerate Markdown ────────────────────────────
export interface SaveDecisionFields {
@ -344,15 +460,30 @@ export async function updateRequirementInDb(
const db = await import('./gsd-db.js');
const existing = db.getRequirementById(id);
if (!existing) {
throw new GSDError(GSD_STALE_STATE, `Requirement ${id} not found`);
}
// Merge updates into existing
// If requirement doesn't exist in DB, create a skeleton and merge updates.
// This handles the case where requirements were written to REQUIREMENTS.md
// but never imported into the database (see #2919).
const base: Requirement = existing ?? {
id,
class: '',
status: 'active',
description: '',
why: '',
source: '',
primary_owner: '',
supporting_slices: '',
validation: '',
notes: '',
full_content: '',
superseded_by: null,
};
// Merge updates into existing (or skeleton)
const merged: Requirement = {
...existing,
...base,
...updates,
id: existing.id, // ID cannot be changed
id: base.id, // ID cannot be changed
};
db.upsertRequirement(merged);
@ -388,7 +519,9 @@ export async function updateRequirementInDb(
await saveFile(filePath, md);
} catch (diskErr) {
logError('manifest', 'disk write failed, reverting DB row', { fn: 'updateRequirementInDb', error: String((diskErr as Error).message) });
db.upsertRequirement(existing);
if (existing) {
db.upsertRequirement(existing);
}
throw diskErr;
}
// Invalidate file-read caches so deriveState() sees the updated markdown.

View file

@ -14,6 +14,28 @@ import { nativeIsRepo, nativeWorktreeList, nativeWorktreeRemove, nativeBranchLis
import { getAllWorktreeHealth } from "./worktree-health.js";
import { loadEffectiveGSDPreferences } from "./preferences.js";
/**
* Returns true if the directory contains only doctor artifacts
* (e.g. `.gsd/doctor-history.jsonl`). These dirs are created by
* appendDoctorHistory() writing to worktree-scoped paths during the audit
* and should not be flagged as orphaned worktrees (#3105).
*/
function isDoctorArtifactOnly(dirPath: string): boolean {
try {
const entries = readdirSync(dirPath);
// Empty dir — not a doctor artifact, still orphaned
if (entries.length === 0) return false;
// Only a .gsd subdirectory
if (entries.length === 1 && entries[0] === ".gsd") {
const gsdEntries = readdirSync(join(dirPath, ".gsd"));
return gsdEntries.length <= 1 && gsdEntries.every(e => e === "doctor-history.jsonl");
}
return false;
} catch {
return false;
}
}
export async function checkGitHealth(
basePath: string,
issues: DoctorIssue[],
@ -314,6 +336,10 @@ export async function checkGitHealth(
} catch { continue; }
const normalizedFullPath = normalizePath(fullPath);
if (!registeredPaths.has(normalizedFullPath)) {
// Skip directories that only contain doctor artifacts (.gsd/doctor-history.jsonl).
// appendDoctorHistory() can recreate these dirs during the audit itself,
// causing a circular false positive (#3105 Bug 1).
if (isDoctorArtifactOnly(fullPath)) continue;
issues.push({
severity: "warning",
code: "worktree_directory_orphaned",

View file

@ -181,7 +181,8 @@ function resolveKey(providerId: string): KeyLookup {
*/
const PROVIDER_ROUTES: Record<string, string[]> = {
anthropic: ["github-copilot"],
openai: ["github-copilot"],
openai: ["github-copilot", "openai-codex"],
google: ["google-gemini-cli"],
};
function checkLlmProviders(): ProviderCheckResult[] {

View file

@ -119,10 +119,11 @@ export async function checkRuntimeHealth(
for (const key of keys) {
// Key format: "unitType/unitId" e.g. "execute-task/M001/S01/T01"
const slashIdx = key.indexOf("/");
if (slashIdx === -1) continue;
const unitType = key.slice(0, slashIdx);
const unitId = key.slice(slashIdx + 1);
// Hook units have compound types: "hook/<hookName>/unitId"
const { splitCompletedKey } = await import("./forensics.js");
const parsed = splitCompletedKey(key);
if (!parsed) continue;
const { unitType, unitId } = parsed;
// Only validate artifact-producing unit types
const { verifyExpectedArtifact } = await import("./auto-recovery.js");

View file

@ -729,8 +729,10 @@ export async function runGSDDoctor(basePath: string, options?: { fix?: boolean;
}
// Blocker-without-replan detection
// Skip when all tasks are done — the blocker was implicitly resolved
// within the task and the slice is not stuck (#3105 Bug 2).
const replanPath = resolveSliceFile(basePath, milestoneId, slice.id, "REPLAN");
if (!replanPath) {
if (!replanPath && !allTasksDone) {
for (const task of plan.tasks) {
if (!task.done) continue;
const summaryPath = resolveTaskFile(basePath, milestoneId, slice.id, task.id, "SUMMARY");

View file

@ -60,9 +60,9 @@ const RESET_DELAY_RE = /reset in (\d+)s/i;
* 1. Permanent (auth/billing/quota) unless also rate-limited
* 2. Rate limit (429, rate.?limit, too many requests)
* 3. Network (ECONNRESET, ETIMEDOUT, socket hang up, fetch failed, dns)
* 4. Server (500/502/503, overloaded, server_error)
* 5. Connection (terminated, ECONNREFUSED, EPIPE, other side closed)
* 6. Stream truncation (malformed JSON from mid-stream cut)
* 4. Stream truncation (malformed JSON from mid-stream cut)
* 5. Server (500/502/503, overloaded, server_error)
* 6. Connection (terminated, ECONNREFUSED, EPIPE, other side closed)
* 7. Unknown
*/
export function classifyError(errorMsg: string, retryAfterMs?: number): ErrorClass {
@ -92,21 +92,21 @@ export function classifyError(errorMsg: string, retryAfterMs?: number): ErrorCla
return { kind: "network", retryAfterMs: retryAfterMs ?? 3_000 };
}
// 4. Server errors — try fallback model
// 4. Stream truncation — downstream symptom of connection drop
if (STREAM_RE.test(errorMsg)) {
return { kind: "stream", retryAfterMs: retryAfterMs ?? 15_000 };
}
// 5. Server errors — try fallback model
if (SERVER_RE.test(errorMsg)) {
return { kind: "server", retryAfterMs: retryAfterMs ?? 30_000 };
}
// 5. Connection errors — try fallback model
// 6. Connection errors — try fallback model
if (CONNECTION_RE.test(errorMsg)) {
return { kind: "connection", retryAfterMs: retryAfterMs ?? 15_000 };
}
// 6. Stream truncation — downstream symptom of connection drop
if (STREAM_RE.test(errorMsg)) {
return { kind: "stream", retryAfterMs: retryAfterMs ?? 15_000 };
}
// 7. Unknown
return { kind: "unknown" };
}
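The reorder matters because a truncated stream often carries 5xx text in the same error message. A precedence sketch with illustrative stand-in patterns (the real `STREAM_RE` and `SERVER_RE` are assumed to be broader):

```typescript
// Precedence sketch (patterns are illustrative stand-ins, not the real ones).
const STREAM_RE = /in JSON at position \d+/i;
const SERVER_RE = /\b50[023]\b|overloaded|server_error/i;

function classifyKindSketch(msg: string): "stream" | "server" | "unknown" {
  // Stream truncation is checked before server so a "502 ... in JSON at
  // position N" message takes the stream retry path, not model fallback.
  if (STREAM_RE.test(msg)) return "stream";
  if (SERVER_RE.test(msg)) return "server";
  return "unknown";
}

classifyKindSketch("502 upstream: Unexpected end of data in JSON at position 8192"); // "stream"
classifyKindSketch("503 Service Unavailable"); // "server"
```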

View file

@ -12,7 +12,22 @@
"gsd_requirement_update", "gsd_milestone_generate_id"
],
"commands": ["gsd", "kill", "worktree", "exit"],
"hooks": ["session_start", "session_switch"],
"hooks": [
"session_start",
"session_switch",
"bash_transform",
"session_fork",
"before_agent_start",
"agent_end",
"session_before_compact",
"session_shutdown",
"tool_call",
"tool_result",
"tool_execution_start",
"tool_execution_end",
"model_select",
"before_provider_request"
],
"shortcuts": ["Ctrl+Alt+G"]
}
}

View file

@ -28,6 +28,8 @@ import { deriveState } from "./state.js";
import { isAutoActive } from "./auto.js";
import { loadPrompt } from "./prompt-loader.js";
import { gsdRoot } from "./paths.js";
import { isDbAvailable, getAllMilestones, getMilestoneSlices, getSliceTasks } from "./gsd-db.js";
import { isClosedStatus } from "./status-guards.js";
import { formatDuration } from "../shared/format-utils.js";
import { getAutoWorktreePath } from "./auto-worktree.js";
import { loadEffectiveGSDPreferences, loadGlobalGSDPreferences, getGlobalGSDPreferencesPath } from "./preferences.js";
@ -85,6 +87,15 @@ interface JournalSummary {
fileCount: number;
}
interface DbCompletionCounts {
milestones: number;
milestonesTotal: number;
slices: number;
slicesTotal: number;
tasks: number;
tasksTotal: number;
}
interface ForensicReport {
gsdVersion: string;
timestamp: string;
@ -95,6 +106,7 @@ interface ForensicReport {
unitTraces: UnitTrace[];
metrics: MetricsLedger | null;
completedKeys: string[];
dbCompletionCounts: DbCompletionCounts | null;
crashLock: LockData | null;
doctorIssues: DoctorIssue[];
anomalies: ForensicAnomaly[];
@ -106,13 +118,15 @@ interface ForensicReport {
// ─── Duplicate Detection ──────────────────────────────────────────────────────
const DEDUP_PROMPT_SECTION = `
## Duplicate Detection (REQUIRED before issue creation)
## Pre-Investigation: Duplicate Check (REQUIRED)
Before offering to create a GitHub issue, you MUST search for existing issues and PRs that may already address this bug. This step uses the user's AI tokens for analysis.
Before reading GSD source code or performing deep analysis, you MUST search for existing issues and PRs that may already address this bug. This avoids wasting tokens on already-fixed bugs.
### Search Steps
1. **Search closed issues** for similar keywords from your diagnosis:
Use keywords from the user's problem description and the anomaly summaries in the forensic report above.
1. **Search closed issues** for similar keywords:
\`\`\`
gh issue list --repo gsd-build/gsd-2 --state closed --search "<keywords from root cause>" --limit 20
\`\`\`
@ -129,20 +143,16 @@ Before offering to create a GitHub issue, you MUST search for existing issues an
### Analysis
For each result, compare it against your root-cause diagnosis:
For each result, compare it against the user's reported symptoms and the forensic anomalies:
- Does the issue describe the same code path or file?
- Does the PR modify the same file:line you identified?
- Does the PR modify the area related to the reported symptoms?
- Is the symptom description semantically similar even if keywords differ?
### Present Findings
### Decision Gate
If you find potential matches, present them to the user:
1. **"Already fixed by PR #X — skip issue creation"** when a merged PR or closed issue clearly addresses the same root cause. Explain why you believe it matches.
2. **"Add my findings to existing issue #Y"** when an open issue exists for the same bug. Use \`gh issue comment #Y --repo gsd-build/gsd-2\` to add forensic evidence.
3. **"Create new issue anyway"** when existing results do not cover this specific failure.
Only proceed to issue creation if no matches were found OR the user explicitly chooses "Create new issue anyway".
- **Merged PR clearly fixes the described symptom** Report "Already fixed by PR #X" with brief explanation. Skip full investigation.
- **Open issue matches** Report "Existing issue #Y covers this." Offer to add forensic evidence. Skip full investigation unless user asks for deeper analysis.
- **No matches** Proceed to full investigation below.
`;
async function writeForensicsDedupPref(ctx: ExtensionCommandContext, enabled: boolean): Promise<void> {
@ -250,6 +260,9 @@ export async function handleForensics(
{ customType: "gsd-forensics", content, display: false },
{ triggerTurn: true },
);
// Persist forensics context so follow-up turns can re-inject it (#2941)
writeForensicsMarker(basePath, savedPath, content);
}
// ─── Report Builder ───────────────────────────────────────────────────────────
@ -275,8 +288,9 @@ export async function buildForensicReport(basePath: string): Promise<ForensicRep
// 3. Load metrics
const metrics = loadLedgerFromDisk(basePath);
// 4. Load completed keys
// 4. Load completed keys (legacy) and DB completion counts
const completedKeys = loadCompletedKeys(basePath);
const dbCompletionCounts = getDbCompletionCounts();
// 5. Check crash lock
const crashLock = readCrashLock(basePath);
@ -335,6 +349,7 @@ export async function buildForensicReport(basePath: string): Promise<ForensicRep
unitTraces,
metrics,
completedKeys,
dbCompletionCounts,
crashLock,
doctorIssues,
anomalies,
@ -585,6 +600,44 @@ function loadCompletedKeys(basePath: string): string[] {
return [];
}
// ─── DB Completion Counts ────────────────────────────────────────────────────
function getDbCompletionCounts(): DbCompletionCounts | null {
if (!isDbAvailable()) return null;
const milestones = getAllMilestones();
let completedMilestones = 0;
let totalSlices = 0;
let completedSlices = 0;
let totalTasks = 0;
let completedTasks = 0;
for (const m of milestones) {
if (isClosedStatus(m.status)) completedMilestones++;
const slices = getMilestoneSlices(m.id);
for (const s of slices) {
totalSlices++;
if (isClosedStatus(s.status)) completedSlices++;
const tasks = getSliceTasks(m.id, s.id);
for (const t of tasks) {
totalTasks++;
if (isClosedStatus(t.status)) completedTasks++;
}
}
}
return {
milestones: completedMilestones,
milestonesTotal: milestones.length,
slices: completedSlices,
slicesTotal: totalSlices,
tasks: completedTasks,
tasksTotal: totalTasks,
};
}
// ─── Anomaly Detectors ───────────────────────────────────────────────────────
function detectStuckLoops(units: UnitMetrics[], anomalies: ForensicAnomaly[]): void {
@ -649,15 +702,42 @@ function detectTimeouts(traces: UnitTrace[], anomalies: ForensicAnomaly[]): void
}
}
/**
* Parse a completed-unit key into its unitType and unitId.
*
* Hook units use a compound slash-delimited type ("hook/<hookName>"), so a
* naive `key.indexOf("/")` would split "hook/telegram-progress/M007/S01" into
* unitType="hook" (wrong) instead of "hook/telegram-progress".
*
* Returns `null` for malformed keys that cannot be split.
*/
export function splitCompletedKey(key: string): { unitType: string; unitId: string } | null {
if (key.startsWith("hook/")) {
// Hook unit types are two segments: "hook/<hookName>/<unitId...>"
const secondSlash = key.indexOf("/", 5); // skip past "hook/"
if (secondSlash === -1) return null; // malformed — no unitId after hook name
return {
unitType: key.slice(0, secondSlash),
unitId: key.slice(secondSlash + 1),
};
}
const slashIdx = key.indexOf("/");
if (slashIdx === -1) return null;
return {
unitType: key.slice(0, slashIdx),
unitId: key.slice(slashIdx + 1),
};
}
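Concretely, the split behaves as follows. A runnable copy of the function above, with illustrative keys:

```typescript
// Runnable copy of splitCompletedKey() for illustration; keys are examples.
function splitKey(key: string): { unitType: string; unitId: string } | null {
  if (key.startsWith("hook/")) {
    // Hook unit types are two segments: "hook/<hookName>/<unitId...>"
    const secondSlash = key.indexOf("/", 5); // skip past "hook/"
    if (secondSlash === -1) return null; // malformed — no unitId after hook name
    return { unitType: key.slice(0, secondSlash), unitId: key.slice(secondSlash + 1) };
  }
  const slashIdx = key.indexOf("/");
  if (slashIdx === -1) return null;
  return { unitType: key.slice(0, slashIdx), unitId: key.slice(slashIdx + 1) };
}

splitKey("execute-task/M001/S01/T01");
// { unitType: "execute-task", unitId: "M001/S01/T01" }
splitKey("hook/telegram-progress/M007/S01");
// { unitType: "hook/telegram-progress", unitId: "M007/S01" }
splitKey("hook/orphan"); // null — no unitId after the hook name
```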
function detectMissingArtifacts(completedKeys: string[], basePath: string, activeMilestone: string | null, anomalies: ForensicAnomaly[]): void {
// Also check the worktree path for artifacts — they may exist there but not at root
const wtBasePath = activeMilestone ? getAutoWorktreePath(basePath, activeMilestone) : null;
for (const key of completedKeys) {
const slashIdx = key.indexOf("/");
if (slashIdx === -1) continue;
const unitType = key.slice(0, slashIdx);
const unitId = key.slice(slashIdx + 1);
const parsed = splitCompletedKey(key);
if (!parsed) continue;
const { unitType, unitId } = parsed;
const rootHasArtifact = verifyExpectedArtifact(unitType, unitId, basePath);
const wtHasArtifact = wtBasePath ? verifyExpectedArtifact(unitType, unitId, wtBasePath) : false;
@ -896,6 +976,42 @@ function saveForensicReport(basePath: string, report: ForensicReport, problemDes
return filePath;
}
// ─── Forensics Session Marker ────────────────────────────────────────────────
export interface ForensicsMarker {
reportPath: string;
promptContent: string;
createdAt: string;
}
/**
* Write a marker file so that buildBeforeAgentStartResult() can re-inject
* the forensics prompt on follow-up turns. (#2941)
*/
export function writeForensicsMarker(basePath: string, reportPath: string, promptContent: string): void {
const dir = join(gsdRoot(basePath), "runtime");
mkdirSync(dir, { recursive: true });
const marker: ForensicsMarker = {
reportPath,
promptContent,
createdAt: new Date().toISOString(),
};
writeFileSync(join(dir, "active-forensics.json"), JSON.stringify(marker), "utf-8");
}
/**
* Read the active forensics marker, or null if none exists.
*/
export function readForensicsMarker(basePath: string): ForensicsMarker | null {
const markerPath = join(gsdRoot(basePath), "runtime", "active-forensics.json");
if (!existsSync(markerPath)) return null;
try {
return JSON.parse(readFileSync(markerPath, "utf-8")) as ForensicsMarker;
} catch {
return null;
}
}
// ─── Prompt Formatter ─────────────────────────────────────────────────────────
function formatReportForPrompt(report: ForensicReport): string {
@ -1008,8 +1124,16 @@ function formatReportForPrompt(report: ForensicReport): string {
sections.push("");
}
// Completed keys count
sections.push(`### Completed Keys: ${report.completedKeys.length}`);
// Completion status — prefer DB counts, fall back to legacy completed-units.json
if (report.dbCompletionCounts) {
const c = report.dbCompletionCounts;
sections.push(`### Completion Status (from DB)`);
sections.push(`- ${c.milestones}/${c.milestonesTotal} milestones complete`);
sections.push(`- ${c.slices}/${c.slicesTotal} slices complete`);
sections.push(`- ${c.tasks}/${c.tasksTotal} tasks complete`);
} else {
sections.push(`### Completed Keys: ${report.completedKeys.length}`);
}
sections.push(`### GSD Version: ${report.gsdVersion}`);
sections.push(`### Active Milestone: ${report.activeMilestone ?? "none"}`);
sections.push(`### Active Slice: ${report.activeSlice ?? "none"}`);

View file

@ -9,7 +9,7 @@
*/
import { execFileSync, execSync } from "node:child_process";
import { existsSync, mkdirSync, readFileSync, writeFileSync } from "node:fs";
import { existsSync, mkdirSync, readFileSync, readdirSync, writeFileSync } from "node:fs";
import { join } from "node:path";
import { gsdRoot } from "./paths.js";
import { GIT_NO_PROMPT_ENV } from "./git-constants.js";
@ -50,9 +50,9 @@ export interface GitPreferences {
main_branch?: string;
merge_strategy?: "squash" | "merge";
/** Controls auto-mode git isolation strategy.
* - "worktree": (default) creates a milestone worktree for isolated work
* - "worktree": creates a milestone worktree for isolated work
* - "branch": works directly in the project root (for submodule-heavy repos)
* - "none": no git isolation; commits land on the user's current branch directly
* - "none": (default) no git isolation; commits land on the user's current branch directly
*/
isolation?: "worktree" | "branch" | "none";
/** When false, GSD will not modify .gitignore at all; no baseline patterns
@ -488,6 +488,29 @@ export class GitServiceImpl {
// If .gsd/ IS in .gitignore (the default for external state projects),
// git add -A already skips it and the exclusions are harmless no-ops.
const allExclusions = [...RUNTIME_EXCLUSION_PATHS, ...extraExclusions];
// ── Parallel worker milestone scope (#1991) ──
// When GSD_MILESTONE_LOCK is set, this process is a parallel worker that
// must only commit files belonging to its own milestone. Exclude all other
// milestone directories from staging to prevent cross-milestone pollution
// (e.g., an M033 worker fabricating M032 artifacts in the same commit).
const milestoneLock = process.env.GSD_MILESTONE_LOCK;
if (milestoneLock) {
const msDir = join(gsdRoot(this.basePath), "milestones");
if (existsSync(msDir)) {
try {
const entries = readdirSync(msDir, { withFileTypes: true });
for (const entry of entries) {
if (entry.isDirectory() && entry.name !== milestoneLock) {
allExclusions.push(`.gsd/milestones/${entry.name}/`);
}
}
} catch {
// Best-effort — if we can't read the milestones dir, proceed without scoping
}
}
}
nativeAddAllWithExclusions(this.basePath, allExclusions);
}
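The scoping rule reduces to: every milestone directory except the locked one becomes a staging exclusion. A standalone sketch (names are illustrative):

```typescript
// Sketch of the exclusion list built under GSD_MILESTONE_LOCK (illustrative).
function milestoneExclusions(milestoneDirs: string[], lock: string): string[] {
  return milestoneDirs
    .filter((name) => name !== lock)   // keep only OTHER milestones
    .map((name) => `.gsd/milestones/${name}/`);
}

milestoneExclusions(["M032", "M033", "M034"], "M033");
// [".gsd/milestones/M032/", ".gsd/milestones/M034/"]
```

An M033 worker thus stages nothing under M032 or M034, which is what prevents the cross-milestone pollution described in the comment.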

View file

@ -41,6 +41,7 @@ const GSD_RUNTIME_PATTERNS = [
const BASELINE_PATTERNS = [
// ── GSD state directory (symlink to external storage) ──
".gsd",
".gsd-id",
// ── OS junk ──
".DS_Store",
@ -84,6 +85,38 @@ const BASELINE_PATTERNS = [
"tmp/",
];
/**
* Check whether `.gsd` is covered by the project's `.gitignore`.
*
 * Uses `git check-ignore` for accurate evaluation; this respects nested
* .gitignore files, global gitignore, and negation patterns. Returns true
* only when git would actually ignore `.gsd/`.
*
* Returns false (not ignored) if:
* - No `.gitignore` exists
* - `.gsd` is not listed in any active ignore rule
* - Not a git repo or git is unavailable
*/
export function isGsdGitignored(basePath: string): boolean {
// Check both `.gsd` and `.gsd/` because `.gsd/` in .gitignore (trailing
// slash = directory-only pattern) only matches the directory form. Using
// both paths covers all gitignore pattern variants.
for (const path of [".gsd", ".gsd/"]) {
try {
// git check-ignore exits 0 when the path IS ignored, 1 when it is NOT.
execFileSync("git", ["check-ignore", "-q", path], {
cwd: basePath,
stdio: "pipe",
env: GIT_NO_PROMPT_ENV,
});
return true; // exit 0 → .gsd is ignored
} catch {
// exit 1 → this form is NOT ignored, try the other
}
}
return false; // neither form is ignored (or git unavailable)
}
/**
* Check whether `.gsd/` contains files tracked by git.
* If so, the project intentionally keeps `.gsd/` in version control

View file

@ -10,6 +10,7 @@ import { existsSync, copyFileSync, mkdirSync, realpathSync } from "node:fs";
import { dirname } from "node:path";
import type { Decision, Requirement, GateRow, GateId, GateScope, GateStatus, GateVerdict } from "./types.js";
import { GSDError, GSD_STALE_STATE } from "./errors.js";
import { logError } from "./workflow-logger.js";
const _require = createRequire(import.meta.url);
@@ -778,8 +779,21 @@ export function openDatabase(path: string): boolean {
   try {
     initSchema(adapter, fileBacked);
   } catch (err) {
-    try { adapter.close(); } catch { /* swallow */ }
-    throw err;
+    // Corrupt freelist: DDL fails with "malformed" but VACUUM can rebuild.
+    // Attempt VACUUM recovery before giving up (see #2519).
+    if (fileBacked && err instanceof Error && err.message?.includes("malformed")) {
+      try {
+        adapter.exec("VACUUM");
+        initSchema(adapter, fileBacked);
+        process.stderr.write("gsd-db: recovered corrupt database via VACUUM\n");
+      } catch (retryErr) {
+        try { adapter.close(); } catch { /* swallow */ }
+        throw retryErr;
+      }
+    } else {
+      try { adapter.close(); } catch { /* swallow */ }
+      throw err;
+    }
   }
   currentDb = adapter;
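The recovery added in this hunk follows a retry-after-repair shape: run the operation, and if the failure looks recoverable, perform one repair (here, `VACUUM`) and retry before rethrowing. A generic sketch of that control flow — `withRepairRetry` is a hypothetical helper, not in the codebase:

```typescript
// Run `op`; on a recoverable error, run `repair` once and retry `op`.
// A second failure, or a non-recoverable one, propagates unchanged.
function withRepairRetry<T>(
  op: () => T,
  isRecoverable: (err: unknown) => boolean,
  repair: () => void,
): T {
  try {
    return op();
  } catch (err) {
    if (!isRecoverable(err)) throw err;
    repair();
    return op();
  }
}
```

In `openDatabase` terms, `op` is `initSchema`, `isRecoverable` checks for SQLite's "malformed" message, and `repair` is the `VACUUM`.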
@@ -1124,10 +1138,11 @@ export function insertMilestone(m: {
   });
 }

-export function upsertMilestonePlanning(milestoneId: string, planning: Partial<MilestonePlanningRecord>): void {
+export function upsertMilestonePlanning(milestoneId: string, planning: Partial<MilestonePlanningRecord>, title?: string): void {
   if (!currentDb) throw new GSDError(GSD_STALE_STATE, "gsd-db: No database open");
   currentDb.prepare(
     `UPDATE milestones SET
+      title = COALESCE(:title, title),
       vision = COALESCE(:vision, vision),
       success_criteria = COALESCE(:success_criteria, success_criteria),
       key_risks = COALESCE(:key_risks, key_risks),
@@ -1142,6 +1157,7 @@ export function upsertMilestonePlanning(milestoneId: string, planning: Partial<M
     WHERE id = :id`,
   ).run({
     ":id": milestoneId,
+    ":title": title ?? null,
     ":vision": planning.vision ?? null,
     ":success_criteria": planning.successCriteria ? JSON.stringify(planning.successCriteria) : null,
     ":key_risks": planning.keyRisks ? JSON.stringify(planning.keyRisks) : null,
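The `COALESCE(:param, column)` binding pattern used throughout this statement makes every field optional: binding `null` leaves the stored value untouched, so callers can update any subset of fields. An in-memory illustration of the same merge rule — `coalescePatch` is a hypothetical helper, not the real DB layer:

```typescript
// Null/undefined patch values keep the existing field, mirroring
// `SET col = COALESCE(:param, col)` with a null bind.
function coalescePatch(
  row: Record<string, unknown>,
  patch: Record<string, unknown>,
): Record<string, unknown> {
  const next = { ...row };
  for (const key of Object.keys(row)) {
    const v = patch[key];
    if (v !== null && v !== undefined) next[key] = v;
  }
  return next;
}
```

This is why `upsertMilestonePlanning` binds `title ?? null` and `planning.vision ?? null`: an absent argument becomes a null bind, which the SQL treats as "keep the current value".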
@@ -1519,6 +1535,26 @@ export function insertVerificationEvidence(e: {
   });
 }

+export interface VerificationEvidenceRow {
+  id: number;
+  task_id: string;
+  slice_id: string;
+  milestone_id: string;
+  command: string;
+  exit_code: number;
+  verdict: string;
+  duration_ms: number;
+  created_at: string;
+}
+
+export function getVerificationEvidence(milestoneId: string, sliceId: string, taskId: string): VerificationEvidenceRow[] {
+  if (!currentDb) return [];
+  const rows = currentDb.prepare(
+    "SELECT * FROM verification_evidence WHERE milestone_id = :mid AND slice_id = :sid AND task_id = :tid ORDER BY id",
+  ).all({ ":mid": milestoneId, ":sid": sliceId, ":tid": taskId });
+  return rows as unknown as VerificationEvidenceRow[];
+}
+
 export interface MilestoneRow {
   id: string;
   title: string;
@@ -1738,7 +1774,7 @@ export function copyWorktreeDb(srcDbPath: string, destDbPath: string): boolean {
     copyFileSync(srcDbPath, destDbPath);
     return true;
   } catch (err) {
-    process.stderr.write(`gsd-db: failed to copy DB to worktree: ${(err as Error).message}\n`);
+    logError("db", "failed to copy DB to worktree", { error: (err as Error).message });
     return false;
   }
 }
@@ -1770,13 +1806,13 @@ export function reconcileWorktreeDb(
   // ATTACH DATABASE doesn't support parameterized paths in all providers,
   // so we use strict allowlist validation instead.
   if (/['";\x00]/.test(worktreeDbPath)) {
-    process.stderr.write("gsd-db: worktree DB reconciliation failed: path contains unsafe characters\n");
+    logError("db", "worktree DB reconciliation failed: path contains unsafe characters");
     return zero;
   }
   if (!currentDb) {
     const opened = openDatabase(mainDbPath);
     if (!opened) {
-      process.stderr.write("gsd-db: worktree DB reconciliation failed: cannot open main DB\n");
+      logError("db", "worktree DB reconciliation failed: cannot open main DB");
       return zero;
     }
   }
@@ -1910,7 +1946,7 @@ export function reconcileWorktreeDb(
       try { adapter.exec("DETACH DATABASE wt"); } catch { /* best effort */ }
     }
   } catch (err) {
-    process.stderr.write(`gsd-db: worktree DB reconciliation failed: ${(err as Error).message}\n`);
+    logError("db", "worktree DB reconciliation failed", { error: (err as Error).message });
     return { ...zero, conflicts };
   }
 }
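Because `ATTACH DATABASE` cannot take a bound parameter for the path in every provider, `reconcileWorktreeDb` instead rejects any path containing characters that could escape a quoted SQL literal. The check, isolated as a sketch — `isSafeAttachPath` is a hypothetical name; the real code inlines the regex:

```typescript
// Reject quote characters, semicolons, and NUL — anything that could break
// out of `ATTACH DATABASE '<path>' AS wt` when the path is interpolated.
function isSafeAttachPath(path: string): boolean {
  return !/['";\x00]/.test(path);
}
```

This is a denylist rather than parameter binding, so it is only as safe as the character set it rejects; the surrounding code keeps it viable by failing closed (returning `zero`) on any match.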

Some files were not shown because too many files have changed in this diff.