* chore(pi-ai): regenerate model registry from upstream APIs
Regenerated models.generated.ts by running generate-models.ts against
live provider APIs. Last generated: 2026-04-09.
+48 models added, 19 removed across all providers.
Notable additions: z-ai/glm-5.1 via OpenRouter (closes#4069,
supersedes custom entry in #4055), zai-org/GLM-5.1, z-ai/glm-5v-turbo.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* test(pi-ai): add structural and regression tests for models.generated.ts
- Regression #3582: pins qwen/qwen3.6-plus in openrouter
- Regression #4069: pins z-ai/glm-5.1 in openrouter
- Structural invariants across all 23 providers / all models
- Registry shape: exact provider list, model count lower bound
- Removed models guard: decommissioned models must stay absent
- Spot-checks for notable models added in this regeneration
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(pi-ai): add Alibaba DashScope as standalone provider
Adds `alibaba-dashscope` for users with a regular DashScope API key,
separate from the existing `alibaba-coding-plan` free-tier provider.
- types.ts: register `alibaba-dashscope` as KnownProvider
- env-api-keys.ts: map to DASHSCOPE_API_KEY
- models.custom.ts: add qwen3-max, qwen3.5-plus, qwen3.5-flash,
qwen3-coder-plus with international endpoint and real pricing
- model-resolver.ts: default model qwen3.5-plus
- key-manager.ts: add alibaba-coding-plan and alibaba-dashscope
to PROVIDER_REGISTRY so /gsd keys add works for both
Co-Authored-By: Claude Code <noreply@anthropic.com>
* feat(pi-ai): add qwen3.6-plus to alibaba-dashscope provider
qwen3.6-plus is available on DashScope international endpoint.
Pricing: $0.5/M input, $3/M output (base tier, 0-256K tokens).
Supports thinking mode (reasoning: true).
Source: https://www.alibabacloud.com/help/en/model-studio/model-pricing
Co-Authored-By: Claude Code <noreply@anthropic.com>
* test(pi-ai): add tests for alibaba-dashscope provider and key-manager regression
- packages/pi-ai/src/models.test.ts: add describe block covering all 5
alibaba-dashscope models (presence, base URL, API, provider field,
context window, paid pricing, per-model reasoning/cost assertions,
independence from alibaba-coding-plan, failure path for unknown model)
- src/resources/extensions/gsd/tests/key-manager.test.ts: add regression
tests for #3891 — alibaba-coding-plan was missing from PROVIDER_REGISTRY,
causing /gsd keys add alibaba-coding-plan to fail silently; also covers
alibaba-dashscope registration, env var separation, and getAllKeyStatuses
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Code <noreply@anthropic.com>
- Add OLLAMA_API_KEY Bearer token auth to all Ollama HTTP client requests
(fetchWithTimeout, pullModel, chat) via getAuthHeaders/withAuth helpers.
Local Ollama ignores the Authorization header; cloud endpoints require it.
- Fix isRunning() probe for cloud endpoints: use /api/tags instead of root /
since cloud hosts may not serve the root endpoint.
- Resolve real context window for unknown models via /api/show model_info
({arch}.context_length) instead of defaulting to 8192. Priority chain:
known table > /api/show > estimate from parameter_size > 8192.
- Use dependency injection for discoverModels() to allow test mocking
without ESM named export issues.
- Pick up OLLAMA_API_KEY in provider registration (apiKey field).
Closes#3544
Co-authored-by: luannevesb <luannevesb@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Adds a prominent "Linked issue" section at the top of the PR template
with a required acknowledgement checkbox. PRs that leave the issue
number blank or uncheck the box signal they haven't read the contribution
guidelines and will be closed.
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
The Next.js auth middleware (proxy.ts) was never wired in — it exported
`proxy` from a file named proxy.ts, but Next.js requires a `middleware`
export from middleware.ts. The middleware-manifest.json was empty,
leaving all 42 API routes accessible without authentication.
Fixes:
- Rename web/proxy.ts → web/middleware.ts, export `middleware` not `proxy`
- Add defense-in-depth auth-guard to /api/shutdown and /api/update routes
- Remove shell: true from update-service spawn (command injection surface)
- Update contract tests to verify middleware file name and export
Closes#4014
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Since 2.72.0 the interactive permission default is acceptEdits, which
auto-approves built-in Edit/Write/Bash but leaves the SDK permission
gate up for MCP tools. Without a canUseTool handler, every
mcp__gsd-workflow__* call surfaces as "This command requires approval"
and blocks GSD actions (#4099).
Add allowedTools entries (mcp__<server>__*) for each registered workflow
MCP server in buildSdkOptions so they run unattended while the rest of
the acceptEdits safety gate stays intact. Env-overridden server names
are handled by deriving the glob list from the built mcpServers keys.
Fixes#4099
Asserts that getPiDefaultModelAndProvider and migratePiCredentials remain
callable top-level exports from src/pi-migration.ts. If either is ever
renamed or unexported, this test fails before the root `tsc` build breaks
every CI job on main — the same class of regression introduced by
110c01b8c.
Commit 110c01b8c added an inline `validateConfiguredModel` function in
`src/cli.ts` while leaving the prior import from
`./startup-model-validation.js` in place, producing TS2440 (import
declaration conflicts with local declaration). The same commit added a
call to `getPiDefaultModelAndProvider()` without importing it, producing
TS2304 (cannot find name). Both errors block `npm run build` and every
CI job on main.
Drop the stale import and add `getPiDefaultModelAndProvider` to the
existing `./pi-migration.js` import where the symbol is actually
exported. The local `validateConfiguredModel` function (lines 139-174)
becomes the sole definition in scope. `./startup-model-validation.js`
is still consumed by its dedicated test files so the module stays.
Organize discussion question rounds into four layers (Scope →
Architecture → Error States → Quality Bar) with user-confirmed
gates between each. Prevents silent advancement and ensures
systematic depth coverage.
Each gate pauses for user confirmation. Users can skip forward
at any gate. Adjustments are reflected back before advancing.
Work-type adaptation shapes question depth per layer.
Prompt-only change — no TypeScript modifications.
Builds on #3977 (multi-round question structure).
* fix: update GSD runtime ignore patterns for team mode
Add missing runtime files to gitignore patterns across codebase and docs:
- .gsd/completed-units*.json (wildcard for archived per-milestone files)
- .gsd/state-manifest.json (workflow state manifest)
- .gsd/gsd.db* (SQLite database and WAL sidecars)
- .gsd/journal/ (daily-rotated event journal)
- .gsd/doctor-history.jsonl (diagnostic check history)
- .gsd/event-log.jsonl (workflow event log)
Updated files:
- gitignore.ts: GSD_RUNTIME_PATTERNS
- git-service.ts: RUNTIME_EXCLUSION_PATHS
- worktree-manager.ts: SKIP_PATHS, SKIP_EXACT, SKIP_PREFIXES
- doctor-runtime-checks.ts: criticalPatterns
- tests/git-service.test.ts: test expectations
- docs: README.md, working-in-teams.mdx
* docs: add comments noting gitignore.ts as canonical source of truth
Address code review feedback about maintenance risk of having multiple
sources of truth for ignore patterns. Add clear comments in all files
that reference GSD_RUNTIME_PATTERNS to indicate gitignore.ts is the
canonical source that must stay synchronized.
renderSummaryContent() in workflow-projections.ts wraps full_summary_md
(already a complete markdown doc with frontmatter) inside a second generated
frontmatter/heading envelope. This produces double frontmatter, double H1
headings, and duplicate Deviations/Known Issues sections.
The fix checks whether full_summary_md exists and starts with frontmatter
delimiters. If so, it is used as the entire output. The fallback synthesis
from individual DB columns only runs when full_summary_md is absent or
lacks frontmatter.
Adds 3 regression tests to projection-regression.test.ts.
Extension-based providers like pi-claude-cli register their models
during extension loading, but registrations were queued and not flushed
until after model resolution ran. This caused findInitialModel() and
the startup model validation to see extension models as nonexistent,
permanently overwriting the user's saved model selection on every launch.
- Flush pendingProviderRegistrations in createAgentSession() before
findInitialModel() so extension models are visible in the registry
- Move model validation to after createAgentSession() in both print
and interactive code paths
- Load extensions before --list-models so extension models appear
The claude-cli onboarding path stored the auth sentinel for claude-code
but did not update defaultProvider in settings.json. Users who had an
existing Anthropic API key were left on the "anthropic" provider because
the startup migration in cli.ts correctly skips direct-key holders.
Write defaultProvider = "claude-code" to settings.json in the claude-cli
branch so the provider switch takes effect immediately.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Closes#4102.
buildPromptFromContext previously serialized multi-turn history using
literal [User] / [Assistant] / [System] bracket labels. Those tokens
are the exact pattern the anti-fabrication rule in system.md and
discuss.md forbids — the model saw its own input framed as a bracket-
labeled transcript and mirrored the format in its output, inventing
both sides of the conversation during /gsd discuss turns.
Replace the bracket labels with XML-tag structure:
- <conversation_history> wraps the whole turn sequence
- <user_message> / <assistant_message> per turn
- <prior_system_context> for the system prompt (renamed from
<system_prompt> to avoid overlap with Claude Code's reserved
<system-reminder> convention)
Prepend a directive telling the model to respond only to the final
user message and not emit the XML tags in its own response. Keep
system.md and discuss.md in sync by documenting that prior context
is delivered in those tags.
Add regression tests asserting:
- no literal [User]/[Assistant]/[System] substrings in the prompt
- history wrapped in <conversation_history> with per-turn tags
- directive leads the prompt
- empty-history edge cases still render correctly
Strip the `mcp__<server>__` prefix from tool_use blocks emitted by the
Claude Agent SDK so registered GSD extension renderers (gsd_plan_milestone,
gsd_task_complete, etc.) match instead of falling through to the generic
JSON-dump fallback. The original server name is preserved on the toolCall
block under `mcpServer` for downstream rendering.
Tighten the generic ToolExecutionComponent fallback for any remaining
prefixed names (third-party MCP servers): show a muted `server·tool`
title, render primitive args as compact `key=value` pairs, and truncate
output to 10 lines when collapsed.
PR #4093 split build and integration-tests into parallel CI jobs but the new
integration-tests job only ran `npm ci`, leaving tests without the compiled
artifacts they spawn at runtime. That caused 8 failures on main (run 24325713845):
- e2e-headless, e2e-smoke, pack-install — throw "dist/loader.js not found"
- 4 web-session-parity / web-live-state tests — "session manager module not
found; checked=packages/pi-coding-agent/dist/core/session-manager.js"
- web-mode-onboarding — "sh: 1: next: not found" when the test shells
`npm run build:web-host` at runtime (web/node_modules/.bin/next absent)
Add `npm --prefix web ci` and `npm run build` to the integration-tests job
before `test:integration`, matching what the build job already does. Using
`needs: build` + artifact sharing would serialize the two jobs and undo the
parallelism PR #4093 was buying, so the build is duplicated intentionally.
Split integration tests into a separate job that runs concurrently with
the build+unit test job. Integration tests run from TypeScript source
(no build artifact dependency), so they can start immediately after
detect-changes without waiting for compilation.
Add useblacksmith/cache@v5 to persist web/.next/cache across CI runs,
eliminating redundant Webpack recompilation of unchanged pages. Applied
to ci.yml build job and pipeline.yml dev-publish/prod-release jobs.
When GSD auto-mode is running a planning phase, the planner subagent
could bypass GSD's state machine and artifact system. This adds a
shared state module and conflict check to block agents that overlap
with the active GSD phase.
- Add shared/gsd-phase-state.ts for cross-extension phase coordination
- Add conflicts_with frontmatter field to agent definitions
- Block conflicting agents with clear error directing to GSD workflow
- Tag planner agent with conflicts_with for plan/research phases
- 10 new tests for phase state and conflict parsing
The gate-registry test intentionally passes an invalid gate id "Q999"
to verify error handling, but the strict GateId union type rejects it
at compile time. Cast to GateId to fix the typecheck:extensions CI step.
Every workflow turn that needed a quality gate either let it drop
silently or bulk-stamped it at closeout. Q8 was the worst case: seeded
as scope:"slice" by plan-slice, treated as a blocker for the
evaluating-gates phase by state.ts, then filtered out of the
gate-evaluate prompt via `if (!meta) continue;` and never closed by
complete-slice — a guaranteed auto-loop stall once slice gates were
enabled.
Introduce gate-registry.ts as the single source of truth for which
turn owns which gate (Q3/Q4 → gate-evaluate, Q5/Q6/Q7 → execute-task,
Q8 → complete-slice, MV01–MV04 → validate-milestone). Every layer of
the prompt system now consults it:
- state.ts derives pending counts by owner turn, not scope, so Q8
never stalls evaluating-gates again.
- auto-prompts.ts builders call assertGateCoverage() and render a
"Gates to Close" block from the registry instead of a hand-rolled
GATE_QUESTIONS table.
- complete-slice and complete-task handlers saveGateResult for every
gate they own, mapping gate id → params field so empty sections
become `omitted` and populated sections become `pass`.
- milestone-validation-gates sources its MV id list from the registry.
- prompt-validation.ts adds validateSliceSummaryOutput /
validateTaskSummaryOutput / validateMilestoneValidationOutput
schema checks.
- gsd_save_gate_result accepts MV01–MV04 (via the registry keys) in
the MCP server and bootstrap tool registration.
Tests: new gate-registry + prompt-system-gate-coverage +
complete-slice-gate-closure suites, plus a Q8 regression case in
gate-dispatch.test.ts. 161 related tests pass end-to-end.
https://claude.ai/code/session_019PT3EmrkMxr4TsgGGLSYK3
Two related fixes for `gsd --mode mcp` that the audit missed on first pass:
1. Tool inventory — session.agent.state.tools was the *active* subset, not
the full registry. Before this change, MCP clients connected to GSD saw
63 tools and four built-ins were silently missing: `find`, `grep`, `ls`,
and `hashline_edit`. After: 67 tools, matching the full _toolRegistry.
Fix: call session.getAllTools() + session.setActiveToolsByName() before
starting the MCP transport so every registered tool is active for the
lifetime of the MCP session.
2. SDK subpath resolution — the #3603 createRequire workaround no longer
works with @modelcontextprotocol/sdk 1.27.x + current Node. The
wildcard export ./* → ./dist/cjs/* does NOT auto-append `.js`, and
_require.resolve fails with "Cannot find module .../server/stdio".
End-to-end handshake was actually broken in src/mcp-server.ts even
before my earlier F5 change. Fix: use explicit `.js` suffixes on
every subpath import (server/index.js, server/stdio.js, types.js),
matching the convention already in use by packages/mcp-server/.
The regression test is rewritten to enforce the `.js`-suffix convention
and reject any bare subpath or lingering createRequire resolution.
Verified end-to-end via raw JSON-RPC against `gsd --mode mcp --bare`:
BEFORE_COUNT=63
AFTER_COUNT=67
diff: +find +grep +hashline_edit +ls
Test sweep: 76 tests pass across mcp-createRequire, stream-adapter,
mcp-server, workflow-tools.
https://claude.ai/code/session_0174sYny3VvdwYTdCNTmY4Do
Rename intermediateToolCalls → intermediateToolBlocks to match upstream
rename, and pass onElicitation via extraOptions (4th arg) instead of
overrides (3rd arg) in buildSdkOptions test.
The lockfile was still pointing at 2.66.1 after the v2.68.0 release;
`npm install` during the audit work resynced it. No dependency graph
changes, just version-string alignment.
https://claude.ai/code/session_0174sYny3VvdwYTdCNTmY4Do
Audit-driven fixes across the two MCP server surfaces and the Claude Code
streaming adapter:
- src/mcp-server.ts: propagate `extra.signal` into `tool.execute` so MCP
clients can actually cancel long-running Bash/WebFetch/grep calls, and
route the remaining `/server` subpath import through `createRequire`
for #3603 consistency.
- src/tests/mcp-createRequire.test.ts: extend regression coverage to the
`/server` subpath.
- claude-code-cli/stream-adapter.ts: (a) classify aborts as `aborted`
instead of the retry-eligible `stream_exhausted_without_result`,
(b) merge final-turn toolCall blocks from the builder into the
AssistantMessage via the new `mergePendingToolCalls` helper so a turn
ending in `tool_use` stop_reason no longer drops its tool calls, and
(c) resolve the SDK permission mode via `resolveClaudePermissionMode`
(auto-mode → bypass, interactive → acceptEdits, env override).
- packages/mcp-server/src/server.ts: make `gsd_query` actually respect
its `query` argument with known categories + forward-compatible
fallback, and thread `extra.signal` into `gsd_execute` so an aborted
RPC request cancels the newly-created session instead of leaking a
background RpcClient process.
- stream-adapter test suite: add regression tests for abort
classification, final-turn tool-call merging, and permission mode
resolution.
Verified via: mcp-createRequire, stream-adapter (27), partial-builder,
mcp-server package (31), workflow-tools (13) — 83 tests green.
https://claude.ai/code/session_0174sYny3VvdwYTdCNTmY4Do