* fix: track remote-questions extension in managed-resources manifest
writeManagedResourceManifest only checked for index.js/index.ts when
deciding if a subdirectory is an extension. remote-questions uses mod.ts
as its entry point and was missed, causing it to be pruned on upgrades.
Also check for extension-manifest.json which is the canonical marker for
bundled extensions.
Fixes#2367
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: retrigger CI
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: trek-e <trek-e@users.noreply.github.com>
* docs: add provider setup guide and improve onboarding hints
Fixes#2161
Add docs/providers.md with step-by-step setup instructions for every
supported LLM provider: OpenRouter, Ollama, LM Studio, vLLM, SGLang,
and all built-in providers. Includes env var names, example configs,
common pitfalls, and verification steps.
Improve onboarding wizard:
- Add URL hints to provider selection list
- Show common local endpoints when choosing Custom (OpenAI-compatible)
- Add post-setup guidance for OpenRouter and custom endpoints
- Reference docs/providers.md for compat troubleshooting
Update cross-references in getting-started.md, troubleshooting.md,
docs/README.md, and help-text.ts to link to the new guide.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* test: verify config help mentions OpenRouter, Ollama, and docs/providers.md
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: trek-e <trek-e@users.noreply.github.com>
Move regression tests and override tests from standalone files into
the existing test files introduced by PR #666:
- resolve-config-value.test.ts: add REGRESSION #666 describe block
and setAllowedCommandPrefixes override tests
- url-utils.test.ts: add REGRESSION #666 describe block and
setFetchAllowedUrls override tests
- Delete: regression-666.test.ts, resolve-config-value-override.test.ts,
url-utils-override.test.ts
Same 59 tests, fewer files, tests live next to the code they test.
Two regression tests that prove the bug introduced by PR #666:
1. Non-default credential tool (sops) is silently blocked by the
hardcoded SAFE_COMMAND_PREFIXES with no way to override.
2. Private IP URL is silently blocked by isBlockedUrl() with no
way to allowlist.
Both tests use dynamic import to check for the override functions,
so they run cleanly on both main (where they fail) and this branch
(where they pass). Verified in a git worktree of main.
PR #666 introduced hardcoded SAFE_COMMAND_PREFIXES and SSRF URL
blocklists with no override mechanism. Users with non-standard
credential tools (sops, doppler, age, infisical) or needing to fetch
from internal URLs (self-hosted docs, VPN services) were silently
blocked with no recourse.
Add two global-only settings (ignored in project-level settings.json
to preserve the security property against malicious repos):
- allowedCommandPrefixes: replaces the built-in command allowlist
- fetchAllowedUrls: exempts hostnames from SSRF blocking
Both also support env var overrides (GSD_ALLOWED_COMMAND_PREFIXES,
GSD_FETCH_ALLOWED_URLS) for CI/container environments. Env vars
take precedence over settings.json.
Security model: global-only keys are stripped from project settings
at load time via stripGlobalOnlyKeys(), applied at all three
assignment points for this.projectSettings. The merge function
stays untouched — no future caller can accidentally skip stripping.
15 new tests covering override behavior, cache invalidation,
allowlist exemptions, and global-only enforcement.
On Windows, child_process.spawn() and execFile() open a visible console
window by default. The web server spawn, RPC bridge, browser opener, and
all 15 web service subprocess calls were missing windowsHide: true,
causing constant console window flashing when running gsd --web.
Closes#2628
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
When an agent requests read(file, offset: 30) on a 13-line file, the
read tool threw "Offset 30 is beyond end of file" which propagated as
invalid JSON downstream during milestone completion. Now clamps the
offset to the last line and prepends a notice, allowing the agent to
continue with valid content.
Fixes both read.ts and hashline-read.ts variants.
Closes#3007
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Two root causes destroyed terminal state during normal navigation:
1. The pagehide handler fired a shutdown beacon unconditionally, but on
mobile/Safari tab switches pagehide fires with event.persisted=true
(bfcache entry). This killed the server and all PTY sessions when the
user merely switched browser tabs. Fix: check event.persisted and skip
the beacon when the page is being cached, not unloaded.
2. ShellTerminal used project-agnostic session IDs ("default"), so
switching projects and switching back either collided with the old
session or spawned a new one, losing terminal state. Fix: scope session
IDs by project path (e.g. "default:/path/to/project") so the server's
getOrCreateSession returns the existing live PTY on reconnect.
Closes#2701
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
`gsd auto` was not handled as a subcommand — it fell through to the
interactive TUI, which hangs indefinitely when stdin/stdout are piped
(non-TTY). Add `auto` as a recognized subcommand that rewrites argv
and delegates to `runHeadless(parseHeadlessArgs(...))`, matching the
existing `gsd headless auto` behavior.
Also adds `gsd auto` to TTY error hints and help text.
Closes#2732
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Closes#2711
Two changes fix the tab-reset-to-dashboard bug:
1. Remove the forced `gsd:navigate-view` dispatch to "dashboard" in
ProjectsPanel.handleSelectProject — this was unconditionally resetting
the view on every project switch.
2. Add a useEffect in WorkspaceChrome that resets `viewRestored` when
`projectPath` changes, so the per-project sessionStorage view restore
fires for the newly-selected project.
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* fix: defer model validation until after extensions register (#2626)
Extension-provided models (e.g. claude-code/claude-sonnet-4-6) were
silently overwritten on every startup because the model validation ran
before createAgentSession(), which is where extensions register their
models in the ModelRegistry. At validation time, extension models did
not exist in the registry, so the user's valid choice was replaced
with a built-in fallback.
Extract validation into validateConfiguredModel() and call it after
createAgentSession() in both print-mode and interactive-mode paths.
Closes#2626
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: align MinimalSettingsManager interface with SettingsManager
The MinimalSettingsManager interface used `string` for thinking level
types, but SettingsManager uses a specific union type and returns
`undefined`. This caused TS2345 at cli.ts lines 448 and 587.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Three root causes addressed:
1. PtyChatParser: user input echoed after a bare prompt line (e.g. "❯ \n"
followed by "hello\n") was misclassified as assistant content. Added
_awaitingInput flag that flips true on prompt boundary and classifies the
next content line as role=user.
2. Chat mode "looks stuck": when the session is idle (connected, not
streaming, has timeline content), no visual cue indicated GSD was waiting
for input. Added a "Ready for your input" indicator with a pulsing dot.
3. Transcript overflow misalignment: chatUserMessages was not trimmed when
liveTranscript/completedTurnSegments overflowed MAX_TRANSCRIPT_BLOCKS,
causing index-based interleaving to pair user messages with wrong
assistant responses.
Also exposed isAwaitingInput() on PtyChatParser so chat UIs can query
whether the session is waiting for user input, and widened the > and $
prompt marker regexes to match bare prompts after trimEnd strips trailing
whitespace.
Closes#2707
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
The ensure-workspace-builds.cjs postinstall script falsely detected
workspace packages as stale in npm tarball installs. npm sets all
tarball entries to a canonical timestamp (Oct 26 1985), but extraction
ordering causes src/ files to appear 1-2 seconds newer than dist/
files. This triggered a rebuild attempt that either failed silently
(no tsc available) or — when tsc was globally installed — could
produce broken dist/ output, corrupting the known-good pre-built
files and causing the DefaultResourceLoader export error on startup.
The fix gates the src-vs-dist staleness check behind a .git directory
check: only development clones (with .git/) perform the timestamp
comparison. npm tarball installs (no .git/) only check for missing
dist/index.js, which is the safe and correct behavior.
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: TÂCHES <afromanguy@me.com>
Community extensions must be placed in ~/.pi/agent/extensions/, not
~/.gsd/agent/extensions/ which is reserved for bundled extensions synced
from the gsd-pi package. Extensions placed in the wrong path are silently
ignored by the loader.
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
When gsd is spawned as an RPC bridge child process, stdout is a pipe
(process.stdout.isTTY === undefined). The TUI render loop would run at
~4,600 renders/sec writing ANSI escape codes to the pipe, consuming
500%+ CPU per process while idle.
Add isTTY guard to Terminal interface, ProcessTerminal.start(), TUI.start(),
and requestRender() so the entire render pipeline is skipped on non-TTY stdout.
RemoteTerminal (browser-backed) correctly reports isTTY=true.
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
The milestones list only refreshed on agent_end events, causing stale
milestone state during multi-turn agent execution. Add turn_end as a
workspace cache invalidation trigger so the UI reflects milestone
changes after each turn boundary.
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
The subprocess spawned by collectAuthoritativeAutoDashboardData always
starts with fresh module state (s.active === false), so the web UI
always showed "Start auto" even while auto mode was running. After
obtaining the subprocess result, reconcile active/paused state with
the on-disk session lock (.gsd/auto.lock) and paused-session metadata
(.gsd/runtime/paused-session.json).
Closes#2705
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
When `gsd auto` is run with piped stdout (e.g. `gsd auto | cat` or
`gsd auto > file`), the TUI cannot render on a non-terminal output
stream, causing the process to hang indefinitely.
This fix:
- Detects piped stdout before entering interactive mode and redirects
`gsd auto` to headless mode automatically
- Extends the interactive mode TTY gate to also check process.stdout.isTTY
(previously only checked stdin), with a descriptive error message
- Adds `gsd headless` to the non-interactive alternatives hint
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
The searchWithOAuth() function sent a request body that the Cloud Code
Assist API rejected with 400 INVALID_ARGUMENT. Two issues:
1. URL was missing ?alt=sse query parameter (endpoint returns SSE format)
2. Request body was missing the required userAgent field
Also adds regression tests that capture the fetch call and assert the
request URL and body match the Cloud Code Assist wire contract.
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Add configured remote channel (Discord/Slack/Telegram) as a checkmark
in the tools row alongside Brave/Answers/Jina. Remove verbose remote
status lines and duplicate display from header-renderer and register-hooks.
Previously, headless --verbose mode accumulated text_delta events into a
buffer and displayed a single truncated 120-char [thinking] line before
tool calls. The model's actual text responses between tool calls were
effectively invisible.
Changes:
- Stream text_delta and thinking_delta events directly to stderr in
verbose mode with [text] and [thinking] block markers
- No truncation — full model output is visible
- Fix non-verbose fallback: read from ame.delta (correct field) instead
of ame.text (always undefined for text_delta events)
- Track inTextBlock/inThinkingBlock state to properly close streaming
blocks before tool calls
- Expand summarizeToolArgs with support for async_bash, await_job,
cancel_job, find, ls, lsp, hashline_edit, subagent, browser_navigate,
and gsd_* tools
- Add streaming formatter functions: formatTextStart, formatTextEnd,
formatThinkingStart, formatThinkingEnd
- Update tests for new tool arg summarization and path field handling
Multi-turn commands (auto, next) have their own completion signals via
isTerminalNotification ("Auto-mode stopped..."/"Step-mode stopped...").
The execution_complete event fires after command setup before any real
work begins, causing these commands to exit immediately with zero work done.
Closes#2917
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
mapStatusToExitCode only handled "complete" but RPC v2 emits "completed",
causing all headless sessions to falsely timeout and restart.
Also emits milestone-ready notification in checkAutoStartAfterDiscuss so
headless parent can detect and chain into auto-mode.
Closes#2914
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
`gsd headless new-milestone --auto --verbose` now works — flags are
parsed regardless of position relative to the command word.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(test): wire src/resources/extensions/shared/tests/ into test:unit runner
The test:unit glob excluded src/resources/extensions/shared/tests/ entirely,
leaving format-utils.test.ts (and any future tests there) silently unfired.
- Add shared/tests/*.test.ts to the test:unit glob in package.json
- Export newestSrcMtime from ensure-workspace-builds.cjs (require.main guard
prevents side-effects on require) so the staleness logic can be tested
- Add src/tests/ensure-workspace-builds.test.ts covering newestSrcMtime:
non-existent dir, no .ts files, single file, max of multiple, recursion,
node_modules skip
Closes#2808
* perf(test): compile unit tests with esbuild and fix dist-test/node_modules
Replace per-file --experimental-strip-types with a single esbuild compilation
step (scripts/compile-tests.mjs) that compiles all src/ TypeScript to dist-test/
in ~3s, then runs the pre-compiled JS. Eliminates ~1.7s Node startup overhead
per test file.
- scripts/compile-tests.mjs: esbuild compilation, asset copy, .ts→.js rewrite,
stale file cleanup; creates dist-test/node_modules symlink so resource-loader.ts
resolves gsdNodeModules to a real path (fixes node-modules-symlink test failure)
- scripts/dist-test-resolve.mjs: ESM loader hook for @gsd/* bare specifiers and
.ts→.js fallback rewriting at runtime
- .gitignore: exclude dist-test/ from version control
- package.json: add test:compile script; update test:unit to compile-then-run;
update test:integration globs to cover new integration/ subdirectories
- worker-registry.ts: unref() cleanup timer so it does not keep the Node process
alive after tests complete
Closes#2858
* fix(test): update relative imports in tests/integration/ after directory move
When tests were moved from tests/ to tests/integration/ in the previous
commit, relative imports weren't updated. ../foo now resolves one level
too shallow.
Fix all 117 import paths across 43 test files:
- ../foo → ../../foo (source files at gsd/ level)
- ../../get-secrets-from-user.ts → ../../../ (at extensions/ level)
- ../../subagent/worker-registry.ts → ../../../ (at extensions/ level)
- ./marketplace-test-fixtures.js → ../marketplace-test-fixtures.ts
- ./test-helpers.ts → ../test-helpers.ts
typecheck:extensions now passes with zero errors.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* test(integration): set 10-minute timeout for integration test runner
build job takes ~7min on main. Without a global timeout, hanging tests
block the suite indefinitely. --test-timeout=600000 caps each test at 10min.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* Revert "test(integration): set 10-minute timeout for integration test runner"
This reverts commit be77ead77d369ad8569292ae6b69ba56435f5433.
* fix(test): correct formatDuration(0) edge case and docker test root path
- formatDuration(0) now returns '0s' instead of '0ms' by guarding the
sub-second branch with ms > 0
- docker-template.test.ts root path goes ../../.. from dist-test/src/tests/
to reach project root instead of landing in dist-test/
- replace require() calls in skill-health.ts and visualizer-overlay.ts
with proper ES module imports
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(test): correct relative import paths in integration tests
All affected tests were one directory level off — importing from ../web/
and ../resources/ when the correct paths are ../../web/ and ../../resources/.
Tests live at src/tests/integration/, not src/tests/.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(test): add esbuild to root devDeps and wire dist-test-resolve hook
P1: esbuild was only in web/package.json — compile-tests.mjs requires it
at the root node_modules path, so CI failed on clean installs.
P2: dist-test-resolve.mjs existed but was never loaded; @gsd/* imports in
compiled tests resolved to installed workspace packages instead of freshly
compiled dist-test output. Add --import to test:unit.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(deps): align esbuild version with lock file (0.25.12)
^0.27.4 didn't satisfy the existing lock file entry. Use the version
already present so npm ci passes without regenerating the lock file.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(test): correct all relative import depths in src/tests/integration/
Tests in src/tests/integration/ need 3 levels up (../../..) to reach
project-root dirs (web/, packages/) and 2 levels up (../..) to reach
src-level dirs (src/web/, src/cli-web-branch.ts).
Fixes:
- ../../web/lib/ → ../../../web/lib/ (Next.js app, not src/web/)
- ../../web/app/ → ../../../web/app/
- ../../packages/ → ../../../packages/
- ../cli-web-branch.ts → ../../cli-web-branch.ts
- ../web-mode.ts → ../../web-mode.ts
- ../resources/extensions/ → ../../resources/extensions/
- ci_monitor ROOT path: 2 levels up → 3 levels up
- web-responsive WEB_ROOT: 2 levels up → 3 levels up
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* chore(test): use dot reporter for test:unit to reduce noise
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* chore(test): switch test:unit reporter to tap
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* chore(test): compact test reporter — silent on pass, failures + summary only
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* chore(test): include shared/tests in test:coverage
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(test): correct path depths in tests moved to integration/
Tests moved from tests/ to tests/integration/ need one extra ../
to reach the same source files. Also fix web component paths — those
files live at web/ not src/web/.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(test): fix web component paths in web-session-parity-contract
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(test): use process.cwd() for project root in docker-template test
Resolving relative to __dirname breaks under test:coverage which runs
source files directly from src/tests/ — needs ../.. not ../../..
(the extra level only exists in the compiled dist-test/ output).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* ci: retrigger CI
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Unify the Power Mode xterm light palette behind a shared helper and replace low-contrast ANSI white/yellow entries with contrast-safe values.
Add a regression test that guards both the readable light-theme palette and the shared helper wiring so the duplicated terminal palettes do not drift again.
Closes#2810
When devRoot pointed at a monorepo, discoverProjects scanned one level
deep and listed each workspace/package as a separate project. Now it
checks for monorepo markers (pnpm-workspace.yaml, lerna.json, turbo.json,
nx.json, rush.json, package.json workspaces) before scanning children.
If the root is a monorepo, it returns it as a single project entry.
- Add detectMonorepo() to bridge-service with support for 6 monorepo formats
- Add isMonorepo signal to ProjectDetectionSignals
- Update discoverProjects to short-circuit when root is a monorepo
- Show 'Monorepo' tag in project list UI
- Add 24 tests covering all monorepo detection scenarios
* feat: Migrated headless orchestrator to use execution_complete events,…
- "src/headless.ts"
- "src/headless-ui.ts"
- "src/tests/headless-v2-migration.test.ts"
GSD-Task: S06/T02
* test: Wired pi-coding-agent to re-export JSONL utils from @gsd/rpc-clie…
- "packages/pi-coding-agent/src/modes/rpc/jsonl.ts"
- "packages/pi-coding-agent/package.json"
- "packages/rpc-client/src/index.ts"
- "packages/rpc-client/src/jsonl.ts"
- "packages/rpc-client/src/rpc-client.ts"
- "packages/rpc-client/src/rpc-types.ts"
- "packages/rpc-client/src/rpc-client.test.ts"
- "packages/rpc-client/package.json"
GSD-Task: S06/T03
* feat: Wire --resume flag to resolve session IDs via prefix matching and…
- "src/headless.ts"
- "dist/headless.js"
GSD-Task: S01/T01
* test: Added 5 e2e integration tests proving headless JSON batch, SIGINT…
- "src/tests/integration/e2e-headless.test.ts"
GSD-Task: S01/T02
* test: Updated @gsd/rpc-client and @gsd/mcp-server to 2.52.0 with publis…
- "packages/rpc-client/package.json"
- "packages/mcp-server/package.json"
- "packages/rpc-client/.npmignore"
- "packages/mcp-server/.npmignore"
GSD-Task: S02/T01
* chore: auto-commit after complete-milestone
GSD-Unit: M002-gzq23a
* fix: revert jsonl.ts to inline implementation — @gsd-build/rpc-client not available at source-level test time in CI
The re-export from @gsd-build/rpc-client fails in CI because tests run against
TypeScript source (--experimental-strip-types) before any build step. The npm
dependency resolves to node_modules/ which requires dist/ to exist. Reverting
to the original inline implementation eliminates the cross-package dependency
for source-level imports.
* fix: use localStorage for auth token to enable multi-tab usage
sessionStorage is tab-scoped, so manually opened second tabs cannot
access the auth token delivered via URL fragment to the first tab.
localStorage is shared across all tabs on the same origin, and since
each GSD instance binds to a unique random port the origin already
scopes the token to that instance.
Also adds a `storage` event listener so already-open tabs pick up
token changes immediately.
Closes#2714
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* test: update web-auth-token test for localStorage migration
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* test: harden web runtime auth token and lock retry tests
Teach the packaged web runtime harness to recover the auth token from the launcher stderr when the browser-open stub log is absent. Also widen the transient session-lock retry tests so they stay stable under full-suite CPU contention.
* test: harden suite-level RTK and worktree stability
Stabilize the RTK seam tests under full-suite load by using a faster fake RTK binary on Unix and allowing the tests to raise the rewrite timeout without changing the production default. Also widen the transient session-lock retry budget and give the heavy auto-worktree milestone merge suite an explicit timeout so it can complete under CI-level contention.
* test: harden git-service repo bootstrap under suite load
Switch repo bootstrap steps in git-service.test.ts to runGit(...) where the setup only needs direct git invocations.
This removes avoidable shell wrappers from the highest-churn repo setup paths, which makes the full unit suite less prone to child-process flake under load while keeping the test behavior unchanged.
Split fake multi-stage Dockerfile into independent CI builder and
runtime images. Add proper entrypoint with UID/GID remapping via
PUID/PGID, sentinel-based first-boot bootstrap, pre-creation of
critical file targets, and signal-forwarding privilege drop via gosu.
Standardize on Node 24, split compose into minimal + full reference.
Closes#9
Runs commands in the user's login shell ($SHELL -l -c) so PATH additions
and env vars from shell profiles (.zprofile/.profile) are available.
Shell aliases are intentionally not loaded (requires -i which causes
startup noise and job control side effects).
Implementation spawns $SHELL directly via a loginShell flag threaded
through the bash executor — no double-shell wrapping.
- Registered as builtin slash command with autocomplete
- Reuses existing bash execution pipeline (streaming, session recording)
- Output included in LLM context for agent reference
- Added loginShell option to executeBash and handleBashCommand
- Browser mode rejects /terminal (terminal-only command)
- Updated web-command-parity-contract tests
AI-assisted: This change was authored with Claude (AI pair programming).
* feat: integrate managed RTK across shell workflows
* fix(rtk): unify managed fallback and live savings wiring
* fix(rtk): improve TUI status visibility
* fix(tests): make portability tests independent of pi-coding-agent dist build
The CI portability test runs don't guarantee that
packages/pi-coding-agent has been compiled. Any test that
imported files pulling in @gsd/pi-coding-agent (resource-loader,
preferences-skills, async-bash-tool, etc.) crashed with
ERR_MODULE_NOT_FOUND pointing at dist/index.js.
Two changes to dist-redirect.mjs (the Node ESM loader hook used by
all unit tests):
- Redirect the bare @gsd/pi-coding-agent specifier to the workspace
source entrypoint (src/index.ts) so no dist/ artifact is needed.
- Extend the load() hook to transpile *.ts files under
packages/pi-coding-agent/src/ through TypeScript's transpileModule.
Node's --experimental-strip-types can't handle parameter properties
and similar syntax present in that package's source; full transpilation
avoids the ERR_UNSUPPORTED_TYPESCRIPT_SYNTAX crash.
Also fix the dashboard.tsx responsive grid:
- xl:grid-cols-5 → xl:grid-cols-4 2xl:grid-cols-5
(5 metric cards no longer fit at xl without overflow; test contract
expected xl:grid-cols-4)
- Keep loading-skeletons.tsx in sync with the same breakpoints.
Add src/tests/resolve-ts-loader.test.ts to guard the loader behaviour:
- bare @gsd/pi-coding-agent redirect points to workspace source
- direct source-entry rewrite (.js → .ts)
- transpilation removes TS parameter property syntax that strip-only
mode cannot parse
* fix(tests): redirect all workspace package imports to source in portability tests
The previous fix only redirected @gsd/pi-coding-agent to its
source entrypoint. In CI, pi-coding-agent/src itself imports
@gsd/pi-ai (and other workspace packages) which were still pointing
at dist/. Since no workspace dist is built during the portability
test run, any transitive resolution hit the same ERR_MODULE_NOT_FOUND.
Changes to dist-redirect.mjs:
- Redirect @gsd/pi-ai, @gsd/pi-ai/oauth, @gsd/pi-agent-core, and
@gsd/pi-tui bare imports to their workspace src/ entrypoints.
- Broaden the load() transpilation condition from
'/packages/pi-coding-agent/src/' to '/packages/*/src/' so that
all workspace source files are run through TypeScript's
transpileModule, handling parameter properties and other syntax
that Node's strip-only mode rejects.
Verified by hiding all four workspace dist/ directories locally and
running the failing test set — 96/96 pass.
* fix(tests): redirect @gsd/native sub-paths; fix Windows .cmd spawnSync
Two more portability failures after the previous fix:
1. @gsd/native sub-path imports (@gsd/native/fd, @gsd/native/text, etc.)
were not redirected — the loader only handled the bare specifier.
Added a prefix-match redirect for @gsd/native/* → packages/native/src/<sub>/index.ts.
2. Windows RTK tests failed because createFakeRtk produces a .cmd wrapper
on Windows, and spawnSync(binaryPath, [...]) without shell:true silently
returns non-zero when the binary is a .cmd file.
Added shell: /\.(cmd|bat)$/i.test(binaryPath) to the spawnSync calls in:
- src/resources/extensions/shared/rtk.ts (rewriteCommandWithRtk)
- src/resources/extensions/shared/rtk-session-stats.ts (readCurrentRtkGainSummary)
- packages/pi-coding-agent/src/utils/rtk.ts (rewriteCommandForGsd)
Production use of rtk.exe is unaffected; the shell flag is only true for
.cmd/.bat paths.
Verified: all 93 portability tests pass with all workspace dist/ directories
removed (simulating CI portability environment).
* fix(tests): Windows portability fixes — HOME env, managed RTK path, perf threshold
Four Windows-specific failures fixed:
1. app-smoke.test.ts: process.env.HOME is undefined on Windows (uses
USERPROFILE instead). Changed to homedir() from node:os which works
cross-platform.
2. Managed RTK path tests on Windows: tests placed a fake RTK as rtk.exe
(by copying a .cmd script into a .exe filename), which Windows cannot
execute. Two-part fix:
- resolveRtkBinaryPath() in both rtk.ts files now falls back to rtk.cmd
in the managed dir on Windows when rtk.exe is absent.
- withManagedFakeRtk and equivalent patterns in rtk.test.ts,
rtk-session-stats.test.ts, rtk-execution-seams.test.ts changed to
place the fake at rtk.cmd instead of rtk.exe on Windows.
3. bg_shell RTK test on Windows: requires bash (for shell sessions), which
is not available on the blacksmith-4vcpu-windows-2025 runner without
Git Bash installed. Test now skips on win32.
4. derive-state-db perf assertion: 10ms threshold was too tight for Windows
CI runners (measured 12ms under load). Raised to 25ms — still catches
real regressions (baseline is 3ms locally and ~12ms on stressed runners).
* fix(tests): fix managed RTK path fallback on Windows in src/rtk.ts + fix copyable fake
Two remaining Windows failures:
1. src/rtk.ts was never patched with the rtk.cmd managed-dir fallback
(only the shared/rtk.ts and pi-coding-agent/src/utils/rtk.ts were updated).
Added the same rtk.cmd fallback and shell:.cmd detection to src/rtk.ts,
which is what rtk.test.ts imports from.
2. createFakeRtk on Windows wrote '%~dp0\fake-rtk.js' in the .cmd content —
this resolves relative to the .cmd file's own directory. When the test
copies rtk.cmd to a different managed dir, %~dp0 resolves to the copy
destination where fake-rtk.js does not exist. Fixed by embedding the
absolute path to fake-rtk.js directly in the .cmd content so the fake
works correctly regardless of where the .cmd is copied.
* feat(experimental): add RTK opt-in preference with web UI toggle
- Add `experimental` category to GSDPreferences with `rtk: boolean` (default: false)
- RTK is now opt-in: disabled by default for all projects unless explicitly enabled
- Validate experimental.* keys; unknown experimental keys produce warnings
Web UI:
- Add ExperimentalPanel component with animated toggle switch per flag
- Add /api/experimental route (GET/PATCH) to read/write flags in preferences.md
- Add 'Experimental' tab to settings dialog sidebar nav (FlaskConical icon)
- Include ExperimentalPanel at bottom of gsd-prefs mega-scroll
- Fix toggle disabled state: trigger loadSettingsData for 'experimental' section
and self-fetch on mount when data is absent
Dashboard:
- Gate RTK Saved metric card on rtkEnabled from live auto state (web)
- Gate TUI dashboard RTK savings row on rtkEnabled
- Gate TUI footer RTK status updates on experimental.rtk preference
- Propagate rtkEnabled through AutoDashboardData → bridge-service → store
Build:
- Add scripts/build-if-stale.cjs: incremental build driver that skips each
step (packages, root tsc, copy-resources, web) when output is newer than
source; replaces full rebuild chain in gsd:web
- Add scripts/web-stop.cjs: robust stop with registry + legacy PID + orphan
sweep via pgrep; handles crash/restart orphaned next-server processes
- gsd:web now uses build-if-stale.cjs (fast cold starts, instant when unchanged)
- gsd:web:stop / gsd:web:stop:all use web-stop.cjs directly
Fix: correct import path in rtk-status.ts (./preferences.js not ../preferences.js)
* fix: restore em-dash encoding in package.json to match upstream
* refactor(rtk): move command rewrite out of pi-coding-agent into GSD extension
Per review feedback from igouss: pi-coding-agent should not be modified to add
GSD-specific logic. Instead, add a proper extension point and wire RTK through it.
Changes to packages/pi-coding-agent (extension API only — no RTK logic):
- Add BashTransformEvent + BashTransformEventResult types to extension API
- Add on('bash_transform') overload to ExtensionAPI interface
- Add emitBashTransform() to ExtensionRunner (chains all handlers in order)
- Call emitBashTransform() in wrapToolWithExtensions before bash tool execution
- Export new types from extensions/index.ts and package index.ts
- Revert all RTK-specific changes from bash-executor.ts, tools/bash.ts
- Remove packages/pi-coding-agent/src/utils/rtk.ts entirely
Changes to GSD extension:
- Register bash_transform handler in register-hooks.ts that calls
rewriteCommandWithRtk() from the existing shared/rtk.ts module
- Handler is a no-op when RTK is disabled or not installed
* fix: correct import path for shared/rtk.js in register-hooks
* fix(tests): remove deleted pi-coding-agent/utils/rtk imports from execution seams test
The RTK rewrite logic was moved out of pi-coding-agent into the GSD
extension (bash_transform hook). Tests that directly imported the
deleted utils/rtk.ts are removed; remaining tests verify the shared
RTK module and GSD-layer surfaces that still call rewriteCommandWithRtk.
* test: add cross-platform filesystem safety static analysis guard
Scan all production .ts files for patterns that break on Windows,
Linux, or macOS:
1. Hardcoded /tmp paths (FAIL) — use os.tmpdir()
2. String concatenation path separators (WARN) — use path.join()
3. rmSync without force: true (FAIL) — Windows read-only files
4. Shell command path interpolation (FAIL) — injection/spaces risk
5. existsSync + delete TOCTOU races (WARN) — informational
6. Recursive rmSync without containment check (WARN) — safety audit
Includes allowlists for known-safe patterns (e.g. cmux Unix socket,
npm package name constants). Reports violations with file path and
line number context.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: normalize path separators in allowlist matching for Windows CI
The isAllowlisted function compared relative paths using forward slashes,
but path.relative() produces backslashes on Windows, causing allowlist
entries to never match on the Windows CI runner.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>