Commit graph

9 commits

Author SHA1 Message Date
Jeremy McSpadden
0e0f47ef9f fix: failure recovery & resume safeguards (all 4 waves) (#956)
* fix: prevent data loss on crash with atomic writes, file locking, and error handling

Wave 1 of failure recovery safeguards:

1. Atomic session file rewrites (tmp+rename) — _rewriteFile() and forkFrom()
   now use atomicWriteFileSync to prevent session file corruption on crash
2. Atomic auto.lock writes — crash-recovery.ts writeLock() uses tmp+rename
   so the crash detection system itself can't be corrupted
3. unhandledRejection handler — catches silent process death from unhandled
   promise rejections in OAuth, extensions, LSP, or MCP connections
4. try/catch in emitToolCall — matches pattern used by emitUserBash,
   emitContext, and emitToolResult to prevent extension handler crashes
   from killing the entire agent turn
5. File locking on session appends — prevents concurrent pi instances from
   interleaving partial JSON lines in session JSONL files using the same
   proper-lockfile pattern established in auth-storage.ts and settings-manager.ts

* fix: add OAuth timeouts, RPC exit detection, and command context guards

Wave 2 of failure recovery safeguards:

1. OAuth fetch timeouts — all fetch() calls across all OAuth providers
   (Anthropic, OpenAI Codex, Google Antigravity, Google Gemini CLI,
   GitHub Copilot) now have 30-second AbortSignal.timeout() to prevent
   indefinite hangs when OAuth servers are unresponsive
2. RPC subprocess exit detection — pending requests are now rejected
   when the agent subprocess exits unexpectedly, preventing indefinite
   hangs in the RPC client
3. Extension command context guards — default handlers for newSession,
   fork, navigateTree, switchSession, and reload now throw explicit
   errors instead of silently returning success when called before
   bindCommandContext()
4. OAuth error detail preservation — token refresh errors now preserve
   the original error as `cause` for better diagnostics

* fix: resource cleanup, LSP retry, and crash detection on session resume

Wave 3 of failure recovery safeguards:

1. Atomic completed-units.json cleanup — milestone completion writes
   now use tmp+rename pattern for consistency with auto-recovery.ts
2. Bash temp file cleanup — track temp files created for large output
   and register a process exit handler to clean them up
3. Settings write queue flush on shutdown — call settingsManager.flush()
   during interactive mode shutdown so queued writes aren't lost
4. LSP initialization retry — wrap getOrCreateClient with up to 2 retries
   with exponential backoff (1s, 2s) for transient spawn failures
5. Crash detection on session resume — wasInterrupted() checks if last
   assistant turn had tool calls without results, shows warning on resume

* fix: blob garbage collection and LSP debug logging

Wave 4 of failure recovery safeguards:

1. Blob garbage collection — BlobStore.gc(referencedHashes) removes
   orphaned blobs not referenced by any session file, plus totalSize()
   for monitoring blob directory growth
2. LSP JSON parse error logging — malformed LSP messages are now logged
   at debug level (when DEBUG env is set) instead of being silently dropped
2026-03-17 16:03:49 -06:00
Jeremy McSpadden
2c926c12e3 fix: Phase 1 quick wins — bug fixes, security hardening, and performance
- Fix loadStoredEnvKeys divergent provider lists: add telegram_bot and
  custom-openai to wizard.ts (the canonical copy used by CLI), remove
  dead duplicate from onboarding.ts
- Security: add SAFE_COMMAND_PREFIXES allowlist to resolveConfigValue
  to prevent arbitrary RCE via settings.json shell commands
- Security: add TOFU (Trust On First Use) model for project-local
  extensions — skip untrusted .pi/extensions/ with stderr warning
- Performance: debounce sql.js MemoryStorage persistence (500ms window)
  so rapid mutations coalesce into a single db.export()+writeFileSync
- Fix double lstatSync call in tool-bootstrap.ts isRegularFile
- Add 26 new tests covering all changes
2026-03-16 13:18:02 -05:00
Gary Trakhman
1ea9163dea feat: add yaml support, run-hook command, and path sanitization (#637)
* feat: allow extensions to use 'yaml' and rework frontmatter parsing

* feat: add run-hook command for manual hook execution

* fix: sanitize slashes in unitType for runtime file paths
2026-03-16 09:22:23 -06:00
TÂCHES
7578292b6b fix: resolve TypeScript errors in GSD extension files (#571)
Add "success" to notify type union across ExtensionUIContext, interactive
mode, and RPC mode implementations. Fix null safety for readFileSync and
contextUsage.percent in auto.ts. Add discriminated union narrowing for
dispatch results. Add string type guards for select() return values in
commands.ts. Align ProviderErrorPauseUI notify signature. Simplify
AuthStorage return type.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-15 19:12:23 -06:00
Flux Labs
e6d55f8aaf Perf/gsd startup speed (#497)
* docs: add startup performance analysis and optimization plan

Profiled GSD CLI startup finding 2.2s for --version and ~3.8s for
interactive mode. Identified 5 root causes with measured timings and
created a phased optimization plan targeting <0.2s for --version
and ~0.8s for interactive startup.

* perf: speed up GSD startup with lazy loading and fast paths

- Fast-path --version/-v and --help/-h in loader.ts before importing
  any heavy dependencies (2.2s → 0.15s, 14x faster)
- Lazy-load undici (~200ms) only when HTTP_PROXY env vars are set
- Skip initResources cpSync when managed-resources.json version
  matches current GSD version (~128ms saved per launch)
- Lazy-load Mistral SDK (~369ms) on first API call instead of startup
- Lazy-load Google GenAI SDK (~186ms) on first API call instead of
  startup
- Parallelize extension loading with Promise.all() instead of
  sequential for-loop

---------

Co-authored-by: TÂCHES <afromanguy@me.com>
2026-03-15 13:33:43 -06:00
Flux Labs
16364c7dba fix: prevent web_search tool injection for non-Anthropic providers serving Claude models (#444) (#446)
GitHub Copilot users with Claude models got 400 errors because the native
Anthropic web_search_20250305 tool was injected into requests to Copilot's
API proxy, which doesn't support it. The root cause was that model_select
never fires before the first API request on new sessions, so the fallback
heuristic (model name starts with "claude-") couldn't distinguish direct
Anthropic from proxied providers.

Fix: pass the resolved Model object through to the before_provider_request
event so extensions can check model.provider directly instead of relying on
model name heuristics.
2026-03-14 22:15:00 -06:00
TÂCHES
3084366d87 fix: handle undefined result from custom() in RPC mode for ask_user_questions (#156, #165, #171) (#199)
In RPC mode, `ctx.ui.custom()` returns `undefined as never`, causing
`showInterviewRound` to return undefined and `Object.keys(result.answers)`
to throw TypeError.

When `showInterviewRound` returns undefined (RPC mode), fall back to
sequential `ctx.ui.select()` calls for each question, forwarding the
abort signal (#171) and supporting `allowMultiple` (#165).

- Add `allowMultiple` to `ExtensionUIDialogOptions`
- Widen `select()` return type to `string | string[] | undefined`
- Add `allowMultiple` to RPC select request and `values` array to response
- Update RPC `select()` to forward `allowMultiple` and parse array responses
- Guard existing `ctx.ui.select()` callers against the widened return type

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 10:24:57 -06:00
Lex Christopherson
4e82688de6 fix: alias @mariozechner/* imports to @gsd/* for external PI ecosystem packages
External packages (pi-rtk, pi-context, pi-agent-browser, etc.) import from
the original @mariozechner/* scope which GSD forked to @gsd/*. Add aliases
in both jiti resolution paths (virtualModules for Bun, getAliases for Node)
so these packages resolve correctly without manual workarounds.

Closes #161

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 07:27:24 -06:00
Lex Christopherson
c80d640d35 feat: vendor Pi source into workspace monorepo
Vendor all 4 Pi packages (tui, ai, agent-core, coding-agent) from
pi-mono v0.57.1 as @gsd/* workspace packages under packages/. This
replaces the compiled npm dependency (@mariozechner/pi-coding-agent)
and patch-package workflow, giving direct source access for
modifications.

- Copy Pi source from pi-mono v0.57.1 into packages/
- Create workspace package.json + tsconfig.json for each package
- Rename ~240 imports from @mariozechner/pi-* to @gsd/pi-*
- Apply existing patches as source edits (setModel persist, VT input)
- Remove @mariozechner/pi-coding-agent dep and patch-package
- Update build pipeline to build packages in dependency order
- Add pi-upstream git remote for future selective syncing

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 21:55:17 -06:00