singularity-forge

History

Mikael Hugo f8e53840da fix(rpc, web): integrate drain into forceShutdown + healthz-503 on shutdown

Three fixes addressing codex's adversarial review of the earlier orphan-recovery / graceful-shutdown landing:

(1) Codex point B: single shutdown path. Removed the parallel installGracefulShutdown() handler in rpc-mode.ts that was adding a second SIGTERM listener and racing forceShutdown()'s teardown. The drain is now the FIRST step inside forceShutdown() (before killTrackedDetachedChildren / extension session_shutdown / etc.) so DB writes complete cleanly while child processes are still alive to flush. Race-free against the existing shutdown ordering.

(2) Codex point D: recovery-before-each-drain. Cloud-volume mtime visibility lag between containers can mean an orphan `.draining` file from a previous container isn't visible during the startup scan but appears moments later. drainQueuedSfFeedbackCommands() now runs recoverOrphanedFeedbackDrains() as its first step, so each dispatch's drain sees the latest filesystem state.

(3) Codex point E: healthz returns 503 during shutdown. New module src/web/shutdown-state.ts holds a per-process flag, auto-registers SIGTERM/SIGINT/SIGHUP handlers on first read, and exposes a snapshot (signal, startedAt, elapsedMs) for diagnostics. The healthz route imports isShuttingDown() and returns 503 when the flag is set, so k8s readinessProbe / Forgejo blue-green probes drain traffic BEFORE we actually stop responding.

Tests:
- rpc-mode-orphan-recovery.test.ts: 8/8 still green
- web-shutdown-state.test.ts: 5/5 new (default false, mark sets flag, idempotent, signal exposed via snapshot, null signal for manual mark)

Deferred to a follow-up commit (codex didn't flag, but noted for completeness): a SIGTERM-drain child-process integration test that spawns rpc-mode and sends a real signal. The 5 unit tests cover the flag logic; the integration test would cover the full process tree and is bulkier than the current commit warrants.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>		2026-05-17 22:35:50 +02:00
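The flag logic described in point (3) can be sketched roughly as follows. This is an illustrative sketch only, not the contents of src/web/shutdown-state.ts: apart from isShuttingDown(), every name here (markShuttingDown, shutdownSnapshot, healthz, the ShutdownSnapshot type) is a hypothetical stand-in, and the signal-handler auto-registration is left out so the sketch stays runtime-agnostic.

```typescript
// Hypothetical sketch of a per-process shutdown flag and a healthz check.
// Only isShuttingDown() is named in the commit message; the rest is assumed.

type ShutdownSnapshot = {
  shuttingDown: boolean;
  signal: string | null;    // null when the flag was set manually
  startedAt: number | null; // epoch ms at which shutdown began
  elapsedMs: number | null; // ms since shutdown began
};

let shutdownSignal: string | null = null;
let shutdownStartedAt: number | null = null;

// Idempotent: the first call wins; later calls do not overwrite the
// recorded signal or start time.
function markShuttingDown(signal: string | null = null, now: number = Date.now()): void {
  if (shutdownStartedAt !== null) return;
  shutdownSignal = signal;
  shutdownStartedAt = now;
}

function isShuttingDown(): boolean {
  return shutdownStartedAt !== null;
}

function shutdownSnapshot(now: number = Date.now()): ShutdownSnapshot {
  return {
    shuttingDown: shutdownStartedAt !== null,
    signal: shutdownSignal,
    startedAt: shutdownStartedAt,
    elapsedMs: shutdownStartedAt === null ? null : now - shutdownStartedAt,
  };
}

// Framework-neutral health check: 200 while serving, 503 once shutdown has
// begun, so a readiness probe can drain traffic before the process stops.
function healthz(): { status: number; body: string } {
  return isShuttingDown()
    ? { status: 503, body: "shutting down" }
    : { status: 200, body: "ok" };
}
```

In a real process, a signal handler would presumably call something like markShuttingDown("SIGTERM") before the drain starts; how the actual module wires that up is not shown in the commit message.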
scripts	feat(web): add error boundaries, expand test coverage, add README	2026-05-10 11:24:40 +02:00
src	fix(rpc, web): integrate drain into forceShutdown + healthz-503 on shutdown	2026-05-17 22:35:50 +02:00
package.json	chore(release): 2.75.3 → 2.75.4 + workspace dependency refresh	2026-05-16 23:59:14 +02:00
tsconfig.json	sf snapshot: uncommitted changes after 268m inactivity	2026-05-15 02:08:06 +02:00