The #4162 refactor removed parseCliArgs' inline --help handler assuming
loader.ts's fast-path covered it, but loader.ts only intercepts --help/-h
as argv[1]. That broke:
- gsd update --help — fell through to runUpdate() (subcommand help
check sat dead-code below the update handler)
- gsd --unknown --help in non-TTY — tripped the TTY gate and exited 1
Move the subcommand-help check ahead of every subcommand handler and
fall back to general help when no subcommand matches, so --help wins
whenever it appears anywhere in argv.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Tracked output from 2022 (commit d93956ba4) that's missing the modern
GSD_HOME env support and webPreferencesPath export present in the .ts
source. No runtime path consumes it, but the test compile script's
copyAssets step overlays src/* onto esbuild output in dist-test, so the
stale .js was shadowing the compiled app-paths and breaking any unit
test transitively importing webPreferencesPath.
Cover the canonical parseCliArgs export in cli-web-branch.ts including
the new mcp mode, worktree flag (boolean and named forms), and existing
short flags, web mode flags, list flags, and positional message handling.
Also remove src/app-paths.js — a stale tracked output (last touched in
2022, missing GSD_HOME and webPreferencesPath exports). The test compile
script copies all of src/ over esbuild's output, so this stale .js was
shadowing the compiled app-paths in dist-test and breaking any test that
transitively imported it. No runtime path uses it (production loads from
dist/app-paths.js; jiti/tsx prefer the .ts source).
Satisfies require-tests.sh on PR #4162.
Pure deletion/deduplication pass on top-level src/*.ts. External behavior
unchanged; all targeted unit tests still pass.
cli.ts (−170 net lines)
- Adopt canonical validateConfiguredModel from startup-model-validation.ts;
delete the drifted local copy with hardcoded model fallbacks.
- Import CliFlags + parseCliArgs from cli-web-branch.ts instead of keeping
a second, 90%-identical parser; pass cliFlags directly into
runWebCliBranch instead of re-parsing process.argv.
- Extract 3 helpers for verbatim duplicates:
* printNonTtyErrorAndExit (TTY gate, 2 call sites)
* printExtensionErrors (extension load errors, 2 call sites)
* reapplyValidatedModelOnFallback (post-createAgentSession fix, 2 sites)
- Factor runHeadlessFromAuto helper shared by the `gsd auto` shorthand
and the auto-piped-stdout redirect.
- Collapse ensureRtkBootstrap from hand-rolled _done flag to a
promise-memoized doRtkBootstrap.
- Drop redundant validateConfiguredModel pre-createAgentSession calls
(the post-createAgentSession call is the correct one per #2626).
- Delete dead --version/-v and --help/-h fast paths (loader.ts already
handles these before cli.ts is imported).
cli-web-branch.ts
- Unify CliFlags with worktree, 'mcp' mode, and _selectedSessionPath.
- Drop unused help?/version? flags (loader.ts intercepts them).
onboarding.ts
- Add runStep<T>() helper with shared cancel/warn handling; collapse 4
near-identical try/catch blocks around runLlmStep, runWebSearchStep,
runRemoteQuestionsStep, runToolKeysStep.
- Delete trivial isCancelError helper (inlined as p.isCancel).
- Rewrite loadPico() adapter to build PicoModule from chalk so we can
drop the redundant picocolors dependency.
package.json / package-lock.json
- Remove picocolors direct dep (chalk remains the single color library).
When a user picks a custom-provider model via /gsd model (Ollama, vLLM,
LM Studio, OpenAI-compatible proxies — anything defined in
~/.gsd/agent/models.json) and then runs /gsd auto, the bootstrap silently
swaps it out for whichever model PREFERENCES.md happens to list. That
model is invariably a built-in provider (claude-code, anthropic) the user
isn't logged into, so auto-mode immediately fails with
"Not logged in · Please run /login", pauses, and resets the session to
claude-code/claude-sonnet-4-6.
Root cause: #3517 made resolveDefaultSessionModel() (PREFERENCES.md) take
priority over ctx.model (settings.json) in auto-start.ts. That fix was
correct for the scenario where settings.json had a stale built-in default
but PREFERENCES.md was freshly configured, but it has no awareness of
custom providers — PREFERENCES.md cannot reference them, so honoring it
when the session provider is custom always discards the user's explicit
choice.
Add isCustomProvider() to preferences-models.ts which checks whether a
provider is declared in ~/.gsd/agent/models.json (with ~/.pi/agent
fallback). Read the file directly with JSON.parse to avoid pulling in
the model-registry at this call site, and treat any read or parse error
as not-custom so a malformed models.json never breaks bootstrap.
In bootstrapAutoSession(), when the session provider is custom, use
ctx.model directly. Otherwise fall through to the existing #3517
behavior (preferredModel ?? ctx.model).
Tests:
- New behavioral regression in model-isolation.test.ts that mirrors
the auto-start.ts logic and verifies the four interesting cases:
custom session beats PREFERENCES.md, built-in session still defers
to PREFERENCES.md (#3517 preserved), custom session with no
PREFERENCES.md uses ctx.model, and null ctx.model falls through.
- New string-grep guard in auto-start-model-capture.test.ts that the
isCustomProvider() call is wired into the snapshot path.
- Updated #3517 grep to allow the new branching shape while still
asserting preferredModel remains a snapshot source for built-ins.
https://claude.ai/code/session_01QLYCeiXWjSFPEXFxjkSLni
* fix(ci): address 5 pipeline integrity issues from release audit
- version-stamp.mjs: regenerate package-lock.json after dev version stamp
(mirrors the same fix applied to bump-version.mjs in #4116)
- bump-version.mjs: regenerate root and web/package-lock.json after version
bump so both lockfiles are always in sync at release time
- pipeline.yml: add post-bump validation step that verifies all package.json
files parse as valid JSON before the release commit is made
- pipeline.yml: split "Commit, tag, and push" — commit+tag+rebase happen
before build, but git push is deferred until after build and npm publish
both succeed, preventing a broken tag from landing on main
- pipeline.yml: emit a :⚠️: annotation when live LLM tests fail so
failures are visible in the Actions UI instead of silently swallowed
Closes#4118
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(gsd): address 3 silent-crash secondary issues from #3348 post-#3696
Three gaps that remained after the double-fault fix in #3696:
1. unhandledRejection not wired — installEpipeGuard only registered
uncaughtException; promise rejections that escaped without a catch
were not handled by the GSD error path. Added _gsdRejectionGuard
alongside _gsdEpipeGuard.
2. Non-fatal overcorrection — the #3696 fix replaced re-throwing with
log-and-continue, leaving the process running in an indeterminate
state after any non-EPIPE/non-ENOENT exception. Replaced with
writeCrashLog + process.exit(1). writeCrashLog is extracted into
bootstrap/crash-log.ts (zero deps) so tests can import it without
pulling in the full extension graph.
3. unit-end not emitted after crash-with-side-effects — hameltomor
observed that complete-milestone M001 wrote SUMMARY.md and updated
the DB but never emitted unit-end (#3348 comment-4237533440). Added
emitCrashRecoveredUnitEnd() in crash-recovery.ts: on the next
auto-mode startup, if a stale lock references a unit whose
unit-start has no matching unit-end in the journal, a synthetic
unit-end with status "crash-recovered" is emitted before the lock
is cleared. This closes the causal chain for downstream tooling
and forensics without requiring changes to the lock file schema.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Remove hard-coded Anthropic/Claude defaults and silent provider swaps so
the app honors whatever model/provider the user has configured.
- src/cli.ts: drop the anthropic->claude-code auto-migration blocks that
were rewriting the user's saved defaultProvider on every startup.
- packages/pi-coding-agent/src/core/model-resolver.ts: delete the
defaultModelPerProvider table, drop the "recommended variant" swap
that silently upgraded e.g. claude-opus-4-6 to -extended, and replace
the provider-iteration first-available fallback with provider-sticky
(user's saved provider first, then first registry entry).
- src/startup-model-validation.ts: replace the openai/anthropic-first
fallback chain with Pi-default -> same-provider -> first-available.
- src/help-text.ts: use a generic provider/model-id example for --model
instead of claude-opus-4-6.
- src/tests/startup-model-validation.test.ts: update the fallback test
to assert provider stickiness rather than a specific Claude model id.
https://claude.ai/code/session_01CvuUuzuVjRcQN25263nG6V
reactive_execution.subagent_model was validated and stored but never
passed to the prompt builders that generate subagent dispatch instructions.
The executing agent therefore autonomously chose its default model instead
of the configured preference.
- buildReactiveExecutePrompt: add subagentModel? param, inject into
instruction string; auto-dispatch passes reactiveConfig.subagent_model
with fallback to resolveModelWithFallbacksForUnit("subagent")
- buildParallelResearchSlicesPrompt: same pattern, resolves from
models.subagent preference
- buildGateEvaluatePrompt: same pattern
- system-context: inject configured subagent model into system prompt
so the executing agent always knows which model to use for subagents
Closes#4078
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Phase 0 of #3631 — remove dead code before screaming architecture reorg.
- auto-observability.ts (72 LOC): zero imports anywhere in codebase
- rtk-status.ts (53 LOC): zero imports anywhere in codebase
- file-watcher.ts (100 LOC): zero imports anywhere in codebase
- file-watcher.test.ts: test for dead file-watcher.ts
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(gsd): reconcile stale slice rows and rebuild STATE.md before DB close
Two coupled defects caused auto-mode split-brain where dispatch falsely
reported "No slice eligible" while STATE.md showed executable work:
1. deriveStateFromDb() reconciled missing slice rows but not stale
existing ones. A slice with status "pending" in the DB but a SUMMARY
file on disk was never repaired, permanently blocking downstream
slices. Added slice-level stale reconciliation matching the existing
task-level pattern.
2. stopAuto() closed the DB before rebuilding STATE.md, forcing
deriveState() into filesystem fallback mode. Moved rebuildState()
before closeDatabase() so stop-time STATE.md uses the same
authoritative DB backend as dispatch.
Fixes#3599
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* test: add regression test for stale slice row reconciliation
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(gsd): block direct writes to gsd.db via hooks to prevent corruption
When gsd_complete_task tool was unavailable, agents fell back to shell-
based sqlite3/sql.js writes to .gsd/gsd.db, corrupting the WAL-backed
database.
Extend write-intercept to block:
- File writes to gsd.db, gsd.db-wal, gsd.db-shm
- Bash commands using sqlite3/sql.js/better-sqlite3 targeting gsd.db
- Shell redirects/cp/mv targeting gsd.db
Closes#3625
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* test: add regression test for blocking direct gsd.db writes
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Phase 1 of #3631 — eliminate circular imports before screaming arch reorg.
Cycle 1 (auto.ts ↔ auto-direct-dispatch.ts):
Remove redundant re-export of dispatchDirectPhase from auto.ts.
No consumer imported it through auto.ts.
Cycle 2 (context-injector.ts ↔ custom-workflow-engine.ts):
Extract readFrozenDefinition to new definition-io.ts.
context-injector now imports from definition-io directly.
Cycle 3 (preferences.ts ↔ preferences-skills.ts):
Move formatSkillRef to preferences-types.ts (pure fn, depends only on
SkillResolution which is already there).
Move resolveSkillDiscoveryMode + resolveSkillStalenessDays into
preferences.ts (trivial wrappers over loadEffectiveGSDPreferences).
Tests: new definition-io.test.ts (3 tests), preferences-formatting.test.ts
(6 tests covering all formatSkillRef branches).
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The first pass at #4099 only pre-authorized `mcp__<server>__*` tools, but
in `acceptEdits` mode the SDK still gates Read, Write, Glob/Grep, and
basic shell inspection commands like `ls`. GSD subagents need the full
workflow toolset and were still hitting "This command requires approval"
prompts on every tool call.
Two changes:
1. `resolveClaudePermissionMode` now returns `bypassPermissions` for all
GSD subagent runs (auto + interactive), dropping the `acceptEdits`
branch and the `isAutoActive` dynamic import. The host Claude Code
session's permission model is the user-visible gate; the inner SDK
process re-prompting on every tool was approval fatigue with no net
safety benefit. `GSD_CLAUDE_CODE_PERMISSION_MODE` env override stays
so security-conscious users can opt back into a stricter mode.
2. Expanded the pre-authorized `allowedTools` list to include Read,
Write, Edit, Glob, Grep, `Bash(ls:*)`, and `Bash(pwd)` alongside the
MCP server globs. Acts as a belt-and-suspenders safety net for users
who set the env override to `acceptEdits`.
Tradeoff documented inline: bypass means a prompt-injection payload read
from an untrusted file could trigger tool calls without a second gate.
Accepted because the workflow is explicit user intent and the
alternative is continuous approval fatigue that blocks real work.
Tests updated for the new allowedTools shape; permission-mode tests
already accepted bypass as the default.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(gsd): add memory pressure watchdog and persist stuck detection state
Two architectural improvements to auto-mode resilience:
1. Memory pressure monitoring (#3331): checks heap usage every 5
iterations and triggers graceful shutdown at 85% of V8 heap limit,
preventing OOM SIGKILL after long-running sessions.
2. Stuck detection persistence (#3704): saves loopState (recentUnits,
stuckRecoveryAttempts) to .gsd/runtime/stuck-state.json so counters
survive session restarts. Previously, restarting auto-mode reset all
stuck detection, allowing the same blocked unit to burn a full retry
budget each session.
Closes#3331Closes#3704
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: use valid LogComponent 'dispatch' instead of 'autoLoop'
* fix: replace empty catch blocks with debug logging in auto/loop.ts
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(state): prevent false degraded-mode warning when DB not yet initialized
deriveState() is called during before_agent_start context injection,
before any tool invocation has had a chance to open the DB. Previously,
isDbAvailable() returning false in this path triggered a misleading
"DB unavailable — using filesystem state derivation (degraded mode)"
warning, even though the DB was simply not yet initialized (not failed).
Add a _dbOpenAttempted flag in gsd-db.ts that tracks whether
openDatabase() has been called at least once. The degraded-mode warning
now only fires when the DB was actually attempted and failed to open,
not when it hasn't been initialized yet.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* test(state): add regression test for false degraded-mode warning
Adds the test file that was missing from the previous commit,
fixing the CI require-tests gate.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
The queueMicrotask() deferral in deliverResult() only prevented duplicate
follow-ups when a job completed *while* await_job was blocked in Promise.race().
For jobs that completed before await_job was called (common in multi-turn
interactive sessions), the microtask had already fired and queued the follow-up
message before suppressFollowUp could run.
Fix: replace queueMicrotask with setTimeout(0), storing the timer handle on
the job object. suppressFollowUp() (new method on AsyncJobManager) cancels
that timer and marks awaited = true atomically, handling both the within-turn
and cross-turn cases.
await-tool.ts now calls manager.suppressFollowUp(id) instead of directly
setting j.awaited = true, which gives it the cancellable timer path.
Adds a regression test specifically for the cross-turn case.
* feat(pi-ai): add Alibaba DashScope as standalone provider
Adds `alibaba-dashscope` for users with a regular DashScope API key,
separate from the existing `alibaba-coding-plan` free-tier provider.
- types.ts: register `alibaba-dashscope` as KnownProvider
- env-api-keys.ts: map to DASHSCOPE_API_KEY
- models.custom.ts: add qwen3-max, qwen3.5-plus, qwen3.5-flash,
qwen3-coder-plus with international endpoint and real pricing
- model-resolver.ts: default model qwen3.5-plus
- key-manager.ts: add alibaba-coding-plan and alibaba-dashscope
to PROVIDER_REGISTRY so /gsd keys add works for both
Co-Authored-By: Claude Code <noreply@anthropic.com>
* feat(pi-ai): add qwen3.6-plus to alibaba-dashscope provider
qwen3.6-plus is available on DashScope international endpoint.
Pricing: $0.5/M input, $3/M output (base tier, 0-256K tokens).
Supports thinking mode (reasoning: true).
Source: https://www.alibabacloud.com/help/en/model-studio/model-pricing
Co-Authored-By: Claude Code <noreply@anthropic.com>
* test(pi-ai): add tests for alibaba-dashscope provider and key-manager regression
- packages/pi-ai/src/models.test.ts: add describe block covering all 5
alibaba-dashscope models (presence, base URL, API, provider field,
context window, paid pricing, per-model reasoning/cost assertions,
independence from alibaba-coding-plan, failure path for unknown model)
- src/resources/extensions/gsd/tests/key-manager.test.ts: add regression
tests for #3891 — alibaba-coding-plan was missing from PROVIDER_REGISTRY,
causing /gsd keys add alibaba-coding-plan to fail silently; also covers
alibaba-dashscope registration, env var separation, and getAllKeyStatuses
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Code <noreply@anthropic.com>
- Add OLLAMA_API_KEY Bearer token auth to all Ollama HTTP client requests
(fetchWithTimeout, pullModel, chat) via getAuthHeaders/withAuth helpers.
Local Ollama ignores the Authorization header; cloud endpoints require it.
- Fix isRunning() probe for cloud endpoints: use /api/tags instead of root /
since cloud hosts may not serve the root endpoint.
- Resolve real context window for unknown models via /api/show model_info
({arch}.context_length) instead of defaulting to 8192. Priority chain:
known table > /api/show > estimate from parameter_size > 8192.
- Use dependency injection for discoverModels() to allow test mocking
without ESM named export issues.
- Pick up OLLAMA_API_KEY in provider registration (apiKey field).
Closes#3544
Co-authored-by: luannevesb <luannevesb@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
The Next.js auth middleware (proxy.ts) was never wired in — it exported
`proxy` from a file named proxy.ts, but Next.js requires a `middleware`
export from middleware.ts. The middleware-manifest.json was empty,
leaving all 42 API routes accessible without authentication.
Fixes:
- Rename web/proxy.ts → web/middleware.ts, export `middleware` not `proxy`
- Add defense-in-depth auth-guard to /api/shutdown and /api/update routes
- Remove shell: true from update-service spawn (command injection surface)
- Update contract tests to verify middleware file name and export
Closes#4014
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Since 2.72.0 the interactive permission default is acceptEdits, which
auto-approves built-in Edit/Write/Bash but leaves the SDK permission
gate up for MCP tools. Without a canUseTool handler, every
mcp__gsd-workflow__* call surfaces as "This command requires approval"
and blocks GSD actions (#4099).
Add allowedTools entries (mcp__<server>__*) for each registered workflow
MCP server in buildSdkOptions so they run unattended while the rest of
the acceptEdits safety gate stays intact. Env-overridden server names
are handled by deriving the glob list from the built mcpServers keys.
Fixes#4099
Asserts that getPiDefaultModelAndProvider and migratePiCredentials remain
callable top-level exports from src/pi-migration.ts. If either is ever
renamed or unexported, this test fails before the root `tsc` build breaks
every CI job on main — the same class of regression introduced by
110c01b8c.
Commit 110c01b8c added an inline `validateConfiguredModel` function in
`src/cli.ts` while leaving the prior import from
`./startup-model-validation.js` in place, producing TS2440 (import
declaration conflicts with local declaration). The same commit added a
call to `getPiDefaultModelAndProvider()` without importing it, producing
TS2304 (cannot find name). Both errors block `npm run build` and every
CI job on main.
Drop the stale import and add `getPiDefaultModelAndProvider` to the
existing `./pi-migration.js` import where the symbol is actually
exported. The local `validateConfiguredModel` function (lines 139-174)
becomes the sole definition in scope. `./startup-model-validation.js`
is still consumed by its dedicated test files so the module stays.
Organize discussion question rounds into four layers (Scope →
Architecture → Error States → Quality Bar) with user-confirmed
gates between each. Prevents silent advancement and ensures
systematic depth coverage.
Each gate pauses for user confirmation. Users can skip forward
at any gate. Adjustments are reflected back before advancing.
Work-type adaptation shapes question depth per layer.
Prompt-only change — no TypeScript modifications.
Builds on #3977 (multi-round question structure).
* fix: update GSD runtime ignore patterns for team mode
Add missing runtime files to gitignore patterns across codebase and docs:
- .gsd/completed-units*.json (wildcard for archived per-milestone files)
- .gsd/state-manifest.json (workflow state manifest)
- .gsd/gsd.db* (SQLite database and WAL sidecars)
- .gsd/journal/ (daily-rotated event journal)
- .gsd/doctor-history.jsonl (diagnostic check history)
- .gsd/event-log.jsonl (workflow event log)
Updated files:
- gitignore.ts: GSD_RUNTIME_PATTERNS
- git-service.ts: RUNTIME_EXCLUSION_PATHS
- worktree-manager.ts: SKIP_PATHS, SKIP_EXACT, SKIP_PREFIXES
- doctor-runtime-checks.ts: criticalPatterns
- tests/git-service.test.ts: test expectations
- docs: README.md, working-in-teams.mdx
* docs: add comments noting gitignore.ts as canonical source of truth
Address code review feedback about maintenance risk of having multiple
sources of truth for ignore patterns. Add clear comments in all files
that reference GSD_RUNTIME_PATTERNS to indicate gitignore.ts is the
canonical source that must stay synchronized.
renderSummaryContent() in workflow-projections.ts wraps full_summary_md
(already a complete markdown doc with frontmatter) inside a second generated
frontmatter/heading envelope. This produces double frontmatter, double H1
headings, and duplicate Deviations/Known Issues sections.
The fix checks whether full_summary_md exists and starts with frontmatter
delimiters. If so, it is used as the entire output. The fallback synthesis
from individual DB columns only runs when full_summary_md is absent or
lacks frontmatter.
Adds 3 regression tests to projection-regression.test.ts.
Extension-based providers like pi-claude-cli register their models
during extension loading, but registrations were queued and not flushed
until after model resolution ran. This caused findInitialModel() and
the startup model validation to see extension models as nonexistent,
permanently overwriting the user's saved model selection on every launch.
- Flush pendingProviderRegistrations in createAgentSession() before
findInitialModel() so extension models are visible in the registry
- Move model validation to after createAgentSession() in both print
and interactive code paths
- Load extensions before --list-models so extension models appear
The claude-cli onboarding path stored the auth sentinel for claude-code
but did not update defaultProvider in settings.json. Users who had an
existing Anthropic API key were left on the "anthropic" provider because
the startup migration in cli.ts correctly skips direct-key holders.
Write defaultProvider = "claude-code" to settings.json in the claude-cli
branch so the provider switch takes effect immediately.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Closes#4102.
buildPromptFromContext previously serialized multi-turn history using
literal [User] / [Assistant] / [System] bracket labels. Those tokens
are the exact pattern the anti-fabrication rule in system.md and
discuss.md forbids — the model saw its own input framed as a bracket-
labeled transcript and mirrored the format in its output, inventing
both sides of the conversation during /gsd discuss turns.
Replace the bracket labels with XML-tag structure:
- <conversation_history> wraps the whole turn sequence
- <user_message> / <assistant_message> per turn
- <prior_system_context> for the system prompt (renamed from
<system_prompt> to avoid overlap with Claude Code's reserved
<system-reminder> convention)
Prepend a directive telling the model to respond only to the final
user message and not emit the XML tags in its own response. Keep
system.md and discuss.md in sync by documenting that prior context
is delivered in those tags.
Add regression tests asserting:
- no literal [User]/[Assistant]/[System] substrings in the prompt
- history wrapped in <conversation_history> with per-turn tags
- directive leads the prompt
- empty-history edge cases still render correctly