When auto-mode merges a completed slice and hits code conflicts in
non-.gsd files, dispatch a fix-merge session to resolve them instead
of hard-resetting and stopping. This eliminates the #1 cause of
unnecessary auto-mode stops.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
writeIntegrationBranch was unconditionally skipping if any integration
branch was already recorded, even if the user started auto-mode from a
different branch. Now it only skips when the recorded branch matches —
if it differs, the record is updated so slices merge to the correct target.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
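The skip-or-update rule described above can be sketched as a pure function. This is a minimal illustration, not the actual writeIntegrationBranch code: the IntegrationRecord shape, the WriteResult labels, and the function name recordIntegrationBranch are all assumptions made for the example.

```typescript
// Hypothetical shape of the persisted record; the real storage format is not shown in the commit.
interface IntegrationRecord {
  branch: string;
}

type WriteResult = "written" | "unchanged" | "updated";

// Decide what to persist given the existing record and the branch auto-mode started from.
function recordIntegrationBranch(
  existing: IntegrationRecord | null,
  currentBranch: string,
): { record: IntegrationRecord; result: WriteResult } {
  if (existing === null) {
    // Nothing recorded yet: write the current branch.
    return { record: { branch: currentBranch }, result: "written" };
  }
  if (existing.branch === currentBranch) {
    // Same branch: skipping is safe.
    return { record: existing, result: "unchanged" };
  }
  // Different branch: update so slices merge to the correct target.
  return { record: { branch: currentBranch }, result: "updated" };
}
```

The old behaviour corresponds to returning "unchanged" whenever existing is non-null, which is what left slices merging into a stale target.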
The merge flow checked out the slice branch mid-merge to untrack runtime
files, which failed when .gsd/STATE.md had uncommitted working tree changes.
Instead, strip runtime files from the staged merge result post-merge — no
branch switching needed.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: suppress git credential prompts that freeze TUI (#280)
Set GIT_TERMINAL_PROMPT=0 and GIT_ASKPASS="" on all git subprocess calls
so git fails immediately instead of prompting for credentials when tokens
expire, which deadlocks the TUI's stdin.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
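As a rough sketch of the approach in the commit above, the two variables can be layered over whatever environment the subprocess would otherwise inherit. The helper name nonInteractiveGitEnv and the Env type are assumptions for illustration; only the two variable names and values come from the commit message.

```typescript
type Env = Record<string, string | undefined>;

// Return a copy of the base environment with git's interactive prompts disabled.
// GIT_TERMINAL_PROMPT=0 stops git from prompting on the terminal;
// GIT_ASKPASS="" is the askpass setting named in the commit message.
function nonInteractiveGitEnv(base: Env): Env {
  return { ...base, GIT_TERMINAL_PROMPT: "0", GIT_ASKPASS: "" };
}
```

The result would typically be passed as the env option of a child_process call, for example execFile("git", args, { env: nonInteractiveGitEnv(process.env) }).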
* ci: add CI workflow and fix publish to prevent broken releases
Add ci.yml that runs build + test + smoke test on every push/PR to main.
Fix build-native.yml publish job to explicitly build before publishing,
verify dist/loader.js exists, check tarball contents, and smoke test the
published package.
Closes #293
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Background job completions were delivered via an infinite retry loop with
exponential backoff. Since delivery is an in-process function call (not a
network operation), retries served no purpose, and each one triggered a
full LLM turn, burning tokens indefinitely until the 5-minute eviction
timer fired.
Delivery is now fire-once. The acknowledgeDeliveries API is retained as a
no-op for compatibility with the await_job tool.
Closes #301
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: add GitHub Workflows skill with CI workflow and ci_monitor tool
- Runs on push to main and feature branches
- Runs on pull requests to main
- Build + test pipeline using Node 22
Cross-platform CI monitoring tool for debugging GitHub Actions:
- `runs` - List recent workflow runs
- `watch` - Monitor running workflow
- `fail-fast` - Exit 1 on first failure (for scripts)
- `log-failed` - Show failed job logs
- `test-summary` - Extract test pass/fail counts
- `check-actions` - GraphQL query for action versions
- `grep` - Search logs with context
- `wait-for` - Block until deployment keyword appears
Pure Node.js - no shell interpolation, works on macOS/Windows/Linux.
Drift-immune skill that:
- Routes all CI operations through ci_monitor.cjs
- Fetches live docs from docs.github.com (no stale training data)
- Provides validation constraints (BEFORE/AFTER/EVIDENCE)
- Split tests into test:unit (141 tests, ~12s) and test:integration (5 tests)
- Fixed idle-recovery.test.ts for current implementation
- Removed AGENTS.md dead code from resource-loader.ts
- Moved npm run build out of tests (fixes ENOBUFS)
When CI fails, you need observable diagnostics:
- `gh run` output is not script-friendly
- ci_monitor.cjs provides structured output for automation
- The skill ensures AI uses the tool, not stale training data
* fix: resolve imports and path for current upstream version
- Updated imports from @mariozechner/pi-coding-agent to @gsd/pi-coding-agent
- Fixed integration test path calculation to use process.cwd()
- Kept test:unit and test:integration scripts
* fix: replace search provider preference instead of accumulating
AuthStorage.set() for api_key credentials appends to the existing list
rather than replacing. When setSearchProviderPreference was called twice
with different values, the second call appended the new value, leaving
the first value at index 0, which get() returned.
Fix: call auth.remove() before auth.set() to ensure only the latest
preference is stored.
https://claude.ai/code/session_01Qx7HRSDb117KzDZzdKk1KB
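The append-versus-replace behaviour described in the commit above can be reproduced with a small stand-in store. AppendingStore is a hypothetical model of the described semantics (set() appends, get() returns index 0), not the real AuthStorage class, and the key name is invented for the example.

```typescript
// Hypothetical stand-in for a credential store whose set() appends to a list.
class AppendingStore {
  private readonly entries = new Map<string, string[]>();

  set(key: string, value: string): void {
    const list = this.entries.get(key) ?? [];
    list.push(value);
    this.entries.set(key, list);
  }

  // Returns the first stored value, mirroring the "index 0" behaviour in the commit.
  get(key: string): string | undefined {
    return this.entries.get(key)?.[0];
  }

  remove(key: string): void {
    this.entries.delete(key);
  }
}

const SEARCH_PROVIDER_KEY = "search_provider"; // hypothetical key name

// Remove first, then set, so only the latest preference is stored.
function setSearchProviderPreference(store: AppendingStore, provider: string): void {
  store.remove(SEARCH_PROVIDER_KEY);
  store.set(SEARCH_PROVIDER_KEY, provider);
}
```

Calling store.set() twice directly leaves the first value visible through get(), which is the bug; going through setSearchProviderPreference() makes the second call win.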
* fix: address all 10 open PR review comments
- package.json: run build before test:integration so a fresh checkout works
- pack-install.test.ts: replace execSync+shell redirects with execFileSync
argument arrays (portable, no shell parsing, paths with spaces safe)
- ci_monitor.test.ts: remove unconditional passed++ after assert; move
success message after the failed > 0 check so it only prints on success
- setup_gh.cjs: replace unzip/tar shell-outs with platform-specific
execFileSync calls (unzip on macOS, PowerShell Expand-Archive on Windows);
add compareVersions() for correct element-by-element semver comparison
- ci_monitor.cjs: add --repo/-R global option so repo is overrideable;
fix getLogs() to use gh run view --log --job instead of binary REST endpoint
https://claude.ai/code/session_01AT6CgcAB62kWcDsTJg9HZM
* fix: make all changed files fully cross-platform (Windows/macOS/Linux)
- pack-install.test.ts: use tar npm package instead of tar CLI; resolve
gsd binary as gsd.cmd on Windows; skip shebang check on Windows
- setup_gh.cjs: use execFileSync for all binary invocations; replace
which with where on Windows; add Windows PATH guidance; filter preferred
install dirs by platform; unify ZIP extraction to use process.platform
consistently; escape single quotes in PowerShell Expand-Archive args
- ci_monitor.cjs: use path.join for .github/workflows paths; replace
all split('\n') with split(/\r?\n/) to handle Windows CRLF output
https://claude.ai/code/session_01AT6CgcAB62kWcDsTJg9HZM
* refactor: simplify and deduplicate changed files
- ci_monitor.cjs: memoize getRepo() so gh repo view subprocess runs at
most once per invocation instead of once per command call in watch loops
- pack-install.test.ts: extract packTarball() helper to eliminate
duplicate npm pack logic across two tests; remove unused contents variable
https://claude.ai/code/session_01AT6CgcAB62kWcDsTJg9HZM
* refactor: remove redundant existsSync before canWrite() in findInstallDir
canWrite() already returns false for non-existent directories, so the
pre-check was a TOCTOU-style redundancy with no behavioral value.
https://claude.ai/code/session_01AT6CgcAB62kWcDsTJg9HZM
* fix: replace tar npm package with Node built-ins (zlib + manual tar parsing)
tar is not in the dependency tree. listTarEntries() decompresses via
createGunzip() and parses the 512-byte tar block format directly,
reading name/prefix/type/size fields per POSIX ustar spec. No external
dependency required. Also fixes the broken tarball variable reference
left over from the packTarball() refactor.
https://claude.ai/code/session_01AT6CgcAB62kWcDsTJg9HZM
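For readers unfamiliar with the 512-byte block format mentioned in the commit above, here is a minimal sketch of reading one ustar header. It is not the listTarEntries() implementation; the function and type names are assumptions. The field offsets (name at 0, size at 124, typeflag at 156, prefix at 345) follow the POSIX ustar layout, and the sketch assumes the decompressed bytes are already in memory.

```typescript
interface TarEntry {
  name: string;
  size: number;
  type: string;
}

// Read a NUL-terminated ASCII field from a header block.
function readField(block: Uint8Array, offset: number, length: number): string {
  let out = "";
  for (let i = offset; i < offset + length; i++) {
    const byte = block[i] ?? 0;
    if (byte === 0) break;
    out += String.fromCharCode(byte);
  }
  return out;
}

// Parse one 512-byte header. Returns null for the all-zero end-of-archive block.
function parseTarHeader(block: Uint8Array): TarEntry | null {
  if (block.length < 512) throw new Error("tar header must be 512 bytes");
  if (block.subarray(0, 512).every((b) => b === 0)) return null;
  const name = readField(block, 0, 100);
  // Size is stored as an octal ASCII string.
  const size = parseInt(readField(block, 124, 12).trim(), 8) || 0;
  const type = readField(block, 156, 1) || "0";
  const prefix = readField(block, 345, 155);
  return { name: prefix ? `${prefix}/${name}` : name, size, type };
}

// File data follows the header and is padded to a multiple of 512 bytes.
function nextHeaderOffset(headerOffset: number, size: number): number {
  return headerOffset + 512 + Math.ceil(size / 512) * 512;
}
```

Walking an archive is then a loop: parse the header at the current offset, record the entry, and jump to nextHeaderOffset() until a null header is returned.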
* remove: drop setup_gh scripts in favour of ci_monitor
setup_gh.cjs and setup_gh.py were one-shot gh CLI installers.
ci_monitor.cjs covers the day-to-day CI use case and is the tool
the skill routes through. Environments that need gh installed can
use brew/winget/distro packages directly.
https://claude.ai/code/session_01AT6CgcAB62kWcDsTJg9HZM
* fix: run only unit tests in CI — integration tests cause ENOBUFS
The integration tests (npm pack → npm install → spawn node) exceed
the buffer limits of the CI runner environment. They are documented
as requiring a manual build+run step. CI now runs test:unit only.
https://claude.ai/code/session_01AT6CgcAB62kWcDsTJg9HZM
* fix: run all tests in CI without ENOBUFS
- ci.yml: run unit and integration as separate steps; build is already
its own step so test:integration doesn't need to rebuild
- package.json: remove npm run build from test:integration script
- pack-install.test.ts: npm install uses stdio:'ignore' to avoid
piping large output through Node buffers (root cause of ENOBUFS);
add early dist/ check with clear error message instead of rebuilding
https://claude.ai/code/session_01AT6CgcAB62kWcDsTJg9HZM
* fix: resolve ENOBUFS and clean up setup_gh references
- pack-install.test.ts: derive tarball filename from package.json
instead of piping npm pack --json stdout; use stdio:ignore throughout
to avoid exhausting OS pipe buffers on CI runners
- SKILL.md: remove setup_gh install instructions; assume gh is
pre-installed via system package manager; point to ci_monitor.cjs
- github_project_setup.py: remove setup_gh.py reference from error message
https://claude.ai/code/session_01AT6CgcAB62kWcDsTJg9HZM
* fix: address Copilot review comments on pack-install.test.ts
- listTarEntries: collect chunks in array, Buffer.concat once on end
instead of O(n²) repeated concat in data handler
- listTarEntries: attach error handler to createReadStream so read
errors reject the Promise instead of crashing the process
- npm pack: use stdio:['ignore','ignore','pipe'] to preserve stderr
for diagnostics while still avoiding ENOBUFS on stdout
- npm install: same — pipe stderr so failures include error output
https://claude.ai/code/session_01AT6CgcAB62kWcDsTJg9HZM
---------
Co-authored-by: Claude <noreply@anthropic.com>
Hashline prefixes (e.g. "1#BQ:") were leaking into the TUI display
for file reads, showing as weird characters to users. Strip them
before rendering since they're only meant for model consumption.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
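A minimal sketch of the stripping step, assuming a prefix has the shape shown in the example above ("1#BQ:"), that is, a line number, a "#", a short alphanumeric hash, and a colon. The exact prefix grammar is not given in the commit, so the regular expression and the function name are assumptions.

```typescript
// Assumed prefix shape: <line number>#<alphanumeric hash>: at the start of a line.
const HASHLINE_PREFIX = /^\d+#[A-Za-z0-9]+:/;

// Remove the prefix from every line before the text is rendered for display.
function stripHashlinePrefixes(text: string): string {
  return text
    .split("\n")
    .map((line) => line.replace(HASHLINE_PREFIX, ""))
    .join("\n");
}
```

Lines without a prefix pass through unchanged, so the function is safe to apply to output that was never hashline-annotated.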
Anthropic's 429 responses include retry-after and x-ratelimit-reset-*
headers that tell us exactly when to retry. Previously we ignored these
and used exponential backoff (2s, 4s, 8s), which retried at the wrong
time and showed a misleading countdown in the UI.
- Add retryAfterMs to AssistantMessage as the structured carrier
- Extract retry-after / x-ratelimit-reset-requests / x-ratelimit-reset-tokens
from Anthropic SDK APIError.headers in the provider catch block
- Session uses retryAfterMs when present (capped by maxDelayMs=60s),
falls back to exponential backoff for errors with no timing hint
The UI countdown now shows the actual Anthropic reset time. No UI changes needed.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
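The delay selection described above can be sketched as follows. This is an illustration under stated assumptions, not the session code: the function names are invented, only the standard retry-after header is parsed (as delay-seconds or an HTTP date, the two forms HTTP defines), and the x-ratelimit-reset-* headers are left out because their format is not stated in the commit. The sketch also applies the 60s cap to the backoff path, which is an assumption.

```typescript
const MAX_DELAY_MS = 60_000; // maxDelayMs from the commit message
const BASE_DELAY_MS = 2_000; // first step of the 2s, 4s, 8s backoff

// Convert a retry-after header into milliseconds, or undefined if absent/unparseable.
function parseRetryAfterMs(
  headers: Record<string, string | undefined>,
  nowMs: number,
): number | undefined {
  const raw = headers["retry-after"];
  if (raw === undefined || raw.trim() === "") return undefined;
  const seconds = Number(raw);
  if (Number.isFinite(seconds)) return Math.max(0, seconds * 1000);
  const dateMs = Date.parse(raw); // HTTP-date form
  if (!Number.isNaN(dateMs)) return Math.max(0, dateMs - nowMs);
  return undefined;
}

// Prefer the server's hint; otherwise fall back to exponential backoff.
function nextDelayMs(attempt: number, retryAfterMs?: number): number {
  const delay = retryAfterMs ?? BASE_DELAY_MS * 2 ** attempt;
  return Math.min(delay, MAX_DELAY_MS);
}
```

With this shape, a UI countdown can display nextDelayMs() directly and will show the server-provided wait whenever one was sent.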
Lines exceeding terminal width are now silently truncated at the render
boundary rather than throwing a fatal error that kills the session.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: add native Rust output truncation module
Line-boundary-aware truncation for tool outputs (bash, grep, file reads),
replacing JS byte-counting with native Rust via napi-rs. Supports head,
tail, and both modes. Counts by UTF-8 bytes, respects line boundaries,
uses memchr for fast newline scanning.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: remove unsafe blocks and fix truncation message byte counts
Replace unsafe from_utf8_unchecked with safe from_utf8().expect() —
the invariant (splitting at newline boundaries) is sound but the perf
difference is negligible, so no reason to use unsafe.
Fix truncateOutput messages that reported the byte budget as "bytes
truncated" instead of the actual number of bytes removed.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat: replace pure-JS xxHash32 with native Rust implementation via napi
The hashline edit tool calls xxHash32 on every line of every file read/edit.
Moving this to a native Rust implementation (xxhash-rust crate) eliminates
JS overhead for this hot path. Hash output is identical -- verified by tests
comparing native vs JS reference across 11 input vectors including empty
strings, short/long inputs, unicode, and seeded variants.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: use typed native interface and remove version-drag comment in xxhash wrapper
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Replaces the multi-pass JS pipeline (TextDecoder → stripAnsi → sanitizeBinaryOutput)
in bash-executor.ts with a single native Rust call that handles UTF-8 decoding,
ANSI stripping, binary sanitization, and CR removal in one pass.
Key features:
- StreamState tracks incomplete UTF-8 and ANSI sequences across chunk boundaries
- Standalone stripAnsiNative() and sanitizeBinaryOutputNative() for use elsewhere
- Comprehensive test coverage for split multibyte, split ANSI, binary data
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
- idle-recovery.test.ts: Use 'unknown-type' instead of 'execute-task'
for null-path test (execute-task now has artifact paths for task summaries)
- app-smoke.test.ts: Remove AGENTS.md assertions (merged into system.md
in commit 0b6d88f). Add ENOBUFS skip handling for tarball tests
(system buffer exhaustion is not a code issue).
Add the 1M context variant of Claude Opus 4.6 to the model registry
and fix model resolver to try exact match before glob detection, so
model IDs containing bracket characters (like [1m]) are not
misinterpreted as glob patterns.
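The resolution order described above can be sketched like this. It is a simplified illustration rather than the actual resolver: resolveModel and globToRegExp are hypothetical names, the glob support covers only "*", "?" and bracket classes, and the model IDs in the usage note are made up.

```typescript
const GLOB_CHARS = /[*?[\]]/;

// Very small glob-to-RegExp translation for illustration only.
// Unbalanced brackets are not handled and would make RegExp construction throw.
function globToRegExp(glob: string): RegExp {
  let source = "";
  for (const ch of glob) {
    if (ch === "*") source += ".*";
    else if (ch === "?") source += ".";
    else if (ch === "[" || ch === "]") source += ch; // keep bracket classes as-is
    else source += ch.replace(/[.+^${}()|\\]/g, "\\$&");
  }
  return new RegExp(`^${source}$`);
}

// Try an exact ID match first; only then treat the input as a glob.
function resolveModel(pattern: string, modelIds: string[]): string[] {
  if (modelIds.includes(pattern)) return [pattern];
  if (!GLOB_CHARS.test(pattern)) return [];
  const regex = globToRegExp(pattern);
  return modelIds.filter((id) => regex.test(id));
}
```

Without the exact-match step, an ID such as "opus[1m]" would be read as "opus" followed by one character from the class [1m], so it would match "opus1" or "opusm" but never the literal ID.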
Prevents publishing gsd-pi to npm if any @gsd-build/engine-* optional
dependency is missing at the expected version. Avoids the failure mode
seen in 2.10.5 where the main package shipped before platform packages
were available, causing 'Cannot find module' errors on fresh installs.
* feat: add native Rust streaming JSON parser for LLM tool call argument parsing
Replaces the JS partial-json library with a Rust implementation exposed via napi-rs.
The parser handles incomplete JSON from streaming deltas by closing unclosed strings,
objects, arrays, removing trailing commas, and completing truncated literals.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: handle truncated numbers and remove dead partial-json dependency
Adds truncated number recovery (e.g. `{"key": 12`, `{"key": 3.`, `{"key": 1e`)
to the Rust streaming JSON parser, and removes the now-unused `partial-json`
npm dependency from pi-ai.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
The v2.10.5 release broke on darwin-arm64 because the main package
was published before the native CI built and published the platform
packages. With exact version pinning, npm silently skips the
optional dep when the version doesn't exist, causing a fatal crash.
Change to >=2.10.2 range so npm installs the latest available
binary. The native API is stable across patch versions.
Also stop sync-platform-versions.cjs from overwriting the ranges
back to exact versions during CI.
Adds a git diff check after sync-platform-versions so npm publish fails
if the sync had to make changes. Prevents a repeat of #276 where
optionalDependencies were out of sync with the published version.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat: add task isolation for subagent filesystem safety
Subagents can run in isolated git worktrees (or FUSE overlays on Linux)
so concurrent tasks don't stomp on each other's files. Changes are
captured as unified diffs and merged back via git apply.
- New isolation.ts module with worktree and FUSE overlay backends
- TaskIsolationSettings in settings-manager (mode + merge strategy)
- isolated parameter on the subagent tool schema
- Baseline capture/apply mirrors the parent repo's dirty state
- Process exit handler for best-effort cleanup of stale worktrees
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: correct delta capture to exclude parent baseline state
The worktree backend now commits a baseline snapshot after applying the
parent's dirty state, so captureDeltaPatch diffs only the subagent's
actual changes against the post-baseline HEAD (not the original HEAD).
The FUSE overlay backend tracks the parent's dirty file set at mount
time and filters the upper dir during delta capture to exclude inherited
dirty files.
Also removes dead code: findGitRoot (unused), readIsolationMergeStrategy
(exported but never called).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The async-jobs PR (#260) accidentally dropped `bashInterceptor` from the
Settings interface and the getBashInterceptorEnabled/getBashInterceptorRules
methods from SettingsManager, breaking the TypeScript build on main.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
When running `gsd config` with existing LLM auth or web search
configured, show a 'Keep current (provider)' option at the top
so users don't have to re-authenticate.
Co-authored-by: Juan Francisco Lebrero <fran@Juans-MacBook-Air.local>
Fix optionalDependencies version sync: 2.10.4 shipped with engine packages pinned to 2.10.2 (the broken version), so users never got the fixed binaries. Closes #276.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
After LLM provider login, ask how to search the web:
- Anthropic built-in (no key needed, shown when using Claude)
- Brave Search (API key)
- Tavily (API key)
- Skip
Moves Brave/Brave Answers out of the generic tool keys step into
the dedicated web search step for better discoverability.
Replace the flat 9-option provider list with a two-step flow:
1. How to sign in? (Browser login / API key / Skip)
2. Which provider? (filtered by auth method)
This reduces cognitive load on first launch — users pick their
auth method first, then see only the relevant providers.
Adds a CLI subcommand that checks npm for the latest version and
runs `npm install -g gsd-pi@latest` if an update is available.
Prints current/latest version and clear success/failure messages.
- Fix cat rule to exclude heredoc syntax (cat <<EOF) via negative lookahead
- Fix write rule: exclude >> append and digit-prefixed fd redirects (2>)
using lookbehind (?<![|>\d])>(?!>)
- Add compileInterceptor() — pre-compiles rules once at construction time
instead of on every bash call; export CompiledInterceptor type
- Update createBashTool to use pre-compiled interceptor instance
- Add 33 unit tests covering all rules, edge cases, and pass-throughs
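To make the rule changes above concrete, here is a small sketch of pre-compiled rules. The write pattern is the one quoted in the commit; the cat pattern, the rule names, and the compileRules/firstMatch helpers are assumptions for illustration and are not the real compileInterceptor() API.

```typescript
interface Rule {
  name: string;
  pattern: string;
}

interface CompiledRule {
  name: string;
  regex: RegExp;
}

const RULES: Rule[] = [
  // Hypothetical cat rule: match "cat <file>" but not heredocs such as "cat <<EOF".
  { name: "cat", pattern: "^\\s*cat\\s+(?!\\s*<<)" },
  // Write rule from the commit: a single ">" not preceded by "|", ">" or a digit,
  // and not followed by ">" (so ">>" appends and "2>" redirects pass through).
  { name: "write", pattern: "(?<![|>\\d])>(?!>)" },
];

// Compile once at construction time instead of on every bash call.
function compileRules(rules: Rule[]): CompiledRule[] {
  return rules.map((rule) => ({ name: rule.name, regex: new RegExp(rule.pattern) }));
}

// Return the name of the first rule that matches, or null for a pass-through.
function firstMatch(compiled: CompiledRule[], command: string): string | null {
  for (const rule of compiled) {
    if (rule.regex.test(command)) return rule.name;
  }
  return null;
}
```

Because the compiled regexes carry no "g" flag, test() is stateless and the same compiled list can be reused across calls.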