* fix: prevent data loss on crash with atomic writes, file locking, and error handling

  Wave 1 of failure recovery safeguards:

  1. Atomic session file rewrites (tmp+rename): _rewriteFile() and forkFrom() now use atomicWriteFileSync to prevent session file corruption on crash
  2. Atomic auto.lock writes: crash-recovery.ts writeLock() uses tmp+rename so the crash detection system itself can't be corrupted
  3. unhandledRejection handler: catches silent process death from unhandled promise rejections in OAuth, extensions, LSP, or MCP connections
  4. try/catch in emitToolCall: matches the pattern used by emitUserBash, emitContext, and emitToolResult to prevent extension handler crashes from killing the entire agent turn
  5. File locking on session appends: prevents concurrent pi instances from interleaving partial JSON lines in session JSONL files, using the same proper-lockfile pattern established in auth-storage.ts and settings-manager.ts

* fix: add OAuth timeouts, RPC exit detection, and command context guards

  Wave 2 of failure recovery safeguards:

  1. OAuth fetch timeouts: all fetch() calls across all OAuth providers (Anthropic, OpenAI Codex, Google Antigravity, Google Gemini CLI, GitHub Copilot) now have a 30-second AbortSignal.timeout() to prevent indefinite hangs when OAuth servers are unresponsive
  2. RPC subprocess exit detection: pending requests are now rejected when the agent subprocess exits unexpectedly, preventing indefinite hangs in the RPC client
  3. Extension command context guards: default handlers for newSession, fork, navigateTree, switchSession, and reload now throw explicit errors instead of silently returning success when called before bindCommandContext()
  4. OAuth error detail preservation: token refresh errors now preserve the original error as `cause` for better diagnostics

* fix: resource cleanup, LSP retry, and crash detection on session resume

  Wave 3 of failure recovery safeguards:

  1. Atomic completed-units.json cleanup: milestone completion writes now use the tmp+rename pattern for consistency with auto-recovery.ts
  2. Bash temp file cleanup: track temp files created for large output and register a process exit handler to clean them up
  3. Settings write queue flush on shutdown: call settingsManager.flush() during interactive mode shutdown so queued writes aren't lost
  4. LSP initialization retry: wrap getOrCreateClient with up to 2 retries with exponential backoff (1s, 2s) for transient spawn failures
  5. Crash detection on session resume: wasInterrupted() checks if the last assistant turn had tool calls without results and shows a warning on resume

* fix: blob garbage collection and LSP debug logging

  Wave 4 of failure recovery safeguards:

  1. Blob garbage collection: BlobStore.gc(referencedHashes) removes orphaned blobs not referenced by any session file, plus totalSize() for monitoring blob directory growth
  2. LSP JSON parse error logging: malformed LSP messages are now logged at debug level (when the DEBUG env is set) instead of being silently dropped
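The crash-detection check from Wave 3 can be sketched as follows. This is a minimal illustration only: the `SessionMessage` shape and the name `wasInterruptedSketch` are hypothetical, since the real session entry types and the actual wasInterrupted() implementation are not shown in this commit.

```typescript
// Hypothetical, simplified message shapes (assumption: the real session
// entry types in the repository differ from these).
type ToolCall = { id: string; name: string };
type SessionMessage =
	| { role: "user"; text: string }
	| { role: "assistant"; text: string; toolCalls: ToolCall[] }
	| { role: "toolResult"; toolCallId: string };

/**
 * Returns true when the last assistant turn issued tool calls that never
 * received a matching result, which suggests the process died mid-turn.
 */
function wasInterruptedSketch(messages: SessionMessage[]): boolean {
	let pending = new Set<string>();
	for (const m of messages) {
		if (m.role === "assistant") {
			// A new assistant turn replaces the set of calls we are waiting on.
			pending = new Set(m.toolCalls.map((c) => c.id));
		} else if (m.role === "toolResult") {
			pending.delete(m.toolCallId);
		}
	}
	return pending.size > 0;
}

// A session whose tool call completed, followed by a final assistant reply.
const cleanSession: SessionMessage[] = [
	{ role: "user", text: "list files" },
	{ role: "assistant", text: "", toolCalls: [{ id: "call_1", name: "bash" }] },
	{ role: "toolResult", toolCallId: "call_1" },
	{ role: "assistant", text: "Done.", toolCalls: [] },
];

// A session that ends right after the assistant requested a tool call.
const crashedSession: SessionMessage[] = [
	{ role: "user", text: "list files" },
	{ role: "assistant", text: "", toolCalls: [{ id: "call_1", name: "bash" }] },
];

const cleanResult = wasInterruptedSketch(cleanSession);
const crashedResult = wasInterruptedSketch(crashedSession);
```

The single forward pass keeps only the tool calls of the most recent assistant turn, so results from earlier turns cannot mask a missing result at the end of the session.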
12 lines
503 B
TypeScript
import { renameSync, writeFileSync } from "node:fs";

/**
 * Atomically write a file by writing to a temporary path then renaming.
 * This prevents data loss if the process crashes mid-write: either the
 * old file remains intact or the new content is fully written.
 */
export function atomicWriteFileSync(filePath: string, content: string | Buffer, encoding?: BufferEncoding): void {
	const tmpPath = filePath + ".tmp";
	writeFileSync(tmpPath, content, encoding);
	renameSync(tmpPath, filePath);
}
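The helper above uses a fixed `.tmp` suffix and does not flush the temporary file before renaming, so two writers targeting the same path would share one temp file, and a power loss could leave unflushed data behind the rename. A hedged sketch of a stricter variant follows; the name `atomicWriteDurableSync` and the temp-file naming scheme are assumptions for illustration, not part of this commit.

```typescript
import { randomBytes } from "node:crypto";
import {
	closeSync,
	fsyncSync,
	mkdtempSync,
	openSync,
	readdirSync,
	readFileSync,
	renameSync,
	unlinkSync,
	writeFileSync,
} from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

/**
 * Hypothetical stricter variant of atomicWriteFileSync:
 * unique temp name, fsync before rename, temp file removed on failure.
 */
export function atomicWriteDurableSync(
	filePath: string,
	content: string | Buffer,
	encoding: BufferEncoding = "utf8",
): void {
	// Unique per process and per call, so concurrent writers never share a temp file.
	const tmpPath = `${filePath}.${process.pid}.${randomBytes(4).toString("hex")}.tmp`;
	try {
		const fd = openSync(tmpPath, "w");
		try {
			writeFileSync(fd, content, encoding);
			// Ask the OS to flush the data to disk before the rename makes it visible.
			fsyncSync(fd);
		} finally {
			closeSync(fd);
		}
		renameSync(tmpPath, filePath);
	} catch (err) {
		// Best-effort cleanup; the temp file may not exist if openSync failed.
		try {
			unlinkSync(tmpPath);
		} catch {
			// Nothing to clean up.
		}
		throw err;
	}
}

// Usage: overwrite a file twice in a scratch directory and read it back.
const demoDir = mkdtempSync(join(tmpdir(), "atomic-write-"));
const demoFile = join(demoDir, "session.jsonl");
atomicWriteDurableSync(demoFile, '{"type":"session"}\n');
atomicWriteDurableSync(demoFile, '{"type":"session","v":2}\n');
const demoContent = readFileSync(demoFile, "utf8");
const leftoverTmpFiles = readdirSync(demoDir).filter((name) => name.endsWith(".tmp"));
```

The extra fsync costs one additional syscall per write, which may or may not be acceptable for hot paths such as frequent session rewrites; the simpler helper remains sufficient for protecting against process crashes alone.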