fix: handle ECOMPROMISED in uncaughtException guard and align retry onCompromised (#1322) (#1332)

When a GSD session crashes hard (SIGKILL, OOM, etc.) without running its
exit handler, the proper-lockfile OS lock directory (.gsd.lock/) is left
stranded. On the next /gsd auto resume, acquireSessionLock detects the dead
PID, cleans up the stale directory, and re-acquires via the retry path.

10 seconds later, proper-lockfile's update timer fires. Due to a subtle
interaction between the synchronous fs adapter (lockSync / toSyncOptions)
and the setTimeout boundary in Node.js v25+, the ECOMPROMISED error
propagates up through the synchronous callback chain and becomes an
uncaught exception — even though the onCompromised callback sets
_lockCompromised = true without throwing.

The _gsdEpipeGuard uncaughtException handler only handled EPIPE, so it
re-threw ECOMPROMISED, crashing the process. Each crash wrote a new
"interrupted session" record, causing an infinite crash loop on resume.

Two fixes:

1. index.ts: Handle ECOMPROMISED in _gsdEpipeGuard. Exit with code 1
   (non-zero to signal failure) so the process.once("exit") handler runs
   and removes the lock directory, allowing the next session to start clean.

2. session-lock.ts: The retry path's onCompromised was missing
   `_releaseFunction = null`, unlike the primary path. This left the
   release function pointer live after compromise, causing validateSessionLock
   to return true and preventing graceful stop detection. Now matches primary.
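
The two fixes can be sketched as a small self-contained model. This is illustrative only, not the code in index.ts or session-lock.ts: `GuardAction`, `guardAction`, `LockState`, and `acquired` are hypothetical names, and the body of `validateSessionLock` is an assumption based on the description above (a live release function is treated as proof that the lock is still held).

```typescript
// Illustrative model only. GuardAction, guardAction, LockState and acquired
// are hypothetical names; the real code may be structured differently.

type GuardAction = "exit(0)" | "exit(1)" | "rethrow";

// Fix 1: the guard's decision expressed as a pure function of the error code.
function guardAction(code: string | undefined): GuardAction {
  if (code === "EPIPE") return "exit(0)";        // pipe closed, nothing to write
  if (code === "ECOMPROMISED") return "exit(1)"; // lock compromised, exit non-zero
  return "rethrow";                              // real crashes still surface
}

interface LockState {
  _lockCompromised: boolean;
  _releaseFunction: (() => void) | null;
}

// Hypothetical helper: the state right after a successful acquire.
function acquired(): LockState {
  return { _lockCompromised: false, _releaseFunction: () => {} };
}

// Fix 2: the same handler body for the primary and retry paths.
function onCompromised(state: LockState): void {
  state._lockCompromised = true;
  state._releaseFunction = null; // the assignment the retry path was missing
}

// Assumption: a non-null release function counts as a valid lock.
function validateSessionLock(state: LockState): boolean {
  return state._releaseFunction !== null;
}

const state = acquired();
const validBefore = validateSessionLock(state);
onCompromised(state);
const validAfter = validateSessionLock(state);

console.log(guardAction("ECOMPROMISED"), validBefore, validAfter);
```

Under these assumptions, omitting the `_releaseFunction = null` assignment would leave `validateSessionLock` returning true after a compromise, which is the behavior item 2 describes.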
Jeremy McSpadden 2026-03-18 23:06:03 -05:00 committed by GitHub
parent d25c174f8b
commit fc56cdf93e

@@ -223,11 +223,22 @@ export default function (pi: ExtensionAPI) {
   // chance to persist state and pause instead of crashing (see issue #739).
   if (!process.listeners("uncaughtException").some(l => l.name === "_gsdEpipeGuard")) {
     const _gsdEpipeGuard = (err: Error): void => {
-      if ((err as NodeJS.ErrnoException).code === "EPIPE") {
+      const code = (err as NodeJS.ErrnoException).code;
+      if (code === "EPIPE") {
         // Pipe closed — nothing we can write; just exit cleanly
         process.exit(0);
       }
-      // Re-throw anything that isn't EPIPE so real crashes still surface
+      // ECOMPROMISED: proper-lockfile's update timer detected mtime drift (system
+      // sleep, heavy event loop stall, or filesystem precision mismatch on Node.js
+      // v25+). The onCompromised callback already set _lockCompromised = true, but
+      // due to a subtle interaction between the synchronous fs adapter and the
+      // setTimeout boundary, the error can still propagate here as an uncaught
+      // exception. Exit cleanly so the process.once("exit") handler removes the
+      // lock directory — allowing the next session to acquire cleanly (#1322).
+      if (code === "ECOMPROMISED") {
+        process.exit(1);
+      }
+      // Re-throw anything that isn't EPIPE or ECOMPROMISED so real crashes still surface
       throw err;
     };
     process.on("uncaughtException", _gsdEpipeGuard);