Commit graph

2897 commits

Author SHA1 Message Date
Jeremy McSpadden
2d58b3fc1b Merge pull request #3739 from jeremymcs/fix/orphaned-worktree-audit
fix(gsd): add orphaned milestone branch audit at auto-mode bootstrap
2026-04-07 21:54:57 -05:00
Jeremy McSpadden
ff9ad68526 Merge pull request #3743 from mastertyko/fix/3742-pre-exec-annotated-paths
fix(gsd): parse annotated pre-exec file paths
2026-04-07 21:47:30 -05:00
Jeremy
b7d7c69b9e fix(gsd): add logWarning to empty catch block in orphaned worktree cleanup
The workflow-logger coverage test (#3348) requires all catch blocks in
migrated files to include logging. Add logWarning for the expected
failure case when nativeWorktreeRemove fails on orphaned directories.

Refs #3739
2026-04-07 21:41:08 -05:00
Jeremy McSpadden
f9a6cac958 Merge pull request #3744 from mastertyko/fix/3721-tui-autocomplete-ghost-lines
fix(pi-tui): clear autocomplete rows from content bottom
2026-04-07 21:32:14 -05:00
Jeremy McSpadden
ae5e4af61e Merge pull request #3745 from mastertyko/fix/3705-subagent-tools-frontmatter
fix(subagent): support list-style tools frontmatter
2026-04-07 21:31:38 -05:00
Jeremy McSpadden
0c62fada0e Merge pull request #3758 from jeremymcs/fix/pre-verification-timeout-guard
fix(gsd): add timeout guard around postUnitPreVerification
2026-04-07 21:28:21 -05:00
Jeremy
5972e4d809 fix(gsd): add consecutiveFinalizeTimeouts to LoopState in journal tests
Update all LoopState object literals in journal-integration.test.ts
to include the new consecutiveFinalizeTimeouts property.

Refs #3757
2026-04-07 21:15:37 -05:00
Jeremy
0ca76f6813 fix(gsd): add escalation and unit-detach guards to finalize timeout handlers
Address adversarial review findings:

1. Timed-out pre/post verification continues running in background and
   can mutate s.currentUnit for the wrong unit. Fix: null out
   s.currentUnit on timeout so late async completions are harmless
   (all side effects in postUnitPreVerification guard on s.currentUnit).

2. Finalize timeouts were treated as successful iterations, resetting
   consecutiveErrors and enabling silent infinite churn. Fix: add
   consecutiveFinalizeTimeouts counter to LoopState, increment on each
   timeout, hard-stop auto-mode after MAX_FINALIZE_TIMEOUTS (3)
   consecutive timeouts. Reset to 0 on successful finalize.

Both fixes apply symmetrically to pre and post verification timeouts.

Refs #3757
2026-04-07 21:10:55 -05:00
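The escalation logic described in 0ca76f6813 can be sketched roughly as follows. This is a minimal illustration, not the project's code: `handleFinalizeOutcome`, `Session`, and `FinalizeOutcome` are hypothetical names, and stopping on the third consecutive timeout is one reading of "after MAX_FINALIZE_TIMEOUTS (3) consecutive timeouts".

```typescript
// Hypothetical sketch: names mirror the commit message, the shapes are assumed.
const MAX_FINALIZE_TIMEOUTS = 3;

interface LoopState {
  consecutiveFinalizeTimeouts: number;
}

interface Session {
  currentUnit: string | null;
}

type FinalizeOutcome = "ok" | "timeout";

/** Returns "continue" to keep looping or "stop" to hard-stop auto-mode. */
function handleFinalizeOutcome(
  outcome: FinalizeOutcome,
  loop: LoopState,
  s: Session,
): "continue" | "stop" {
  if (outcome === "ok") {
    // A successful finalize resets the streak.
    loop.consecutiveFinalizeTimeouts = 0;
    return "continue";
  }
  // Detach the unit so a late completion of the timed-out call finds nothing to mutate.
  s.currentUnit = null;
  loop.consecutiveFinalizeTimeouts += 1;
  return loop.consecutiveFinalizeTimeouts >= MAX_FINALIZE_TIMEOUTS ? "stop" : "continue";
}
```

Counting timeouts separately from errors keeps a timeout from being recorded as a successful iteration, which is the "silent infinite churn" the commit message describes.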
Jeremy McSpadden
b628bfda4f Merge pull request #3754 from jeremymcs/fix/os-specific-keyboard-shortcuts
fix(gsd): OS-specific keyboard shortcut hints via formatShortcut helper
2026-04-07 21:08:26 -05:00
Jeremy
32a7af0513 fix(gsd): add timeout guard around postUnitPreVerification to prevent auto-loop hang
postUnitPostVerification already has a 60s timeout guard (#2344) but
postUnitPreVerification was called with bare await — if any async
operation inside it never resolves (browser teardown, worktree sync,
safety harness validation), the auto-loop freezes permanently with no
error, notification, or recovery.

Wrap postUnitPreVerification in the same withTimeout() pattern with a
dedicated FINALIZE_PRE_TIMEOUT_MS constant. On timeout, log a warning
and force-continue to the next iteration.

Closes #3757
2026-04-07 20:57:30 -05:00
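A timeout guard of the kind 32a7af0513 describes might look like the sketch below. The real `withTimeout()` signature is not shown in the log, so the result shape here is an assumption, and the 60 000 ms value is borrowed from the post-verification guard rather than stated for `FINALIZE_PRE_TIMEOUT_MS`.

```typescript
// Assumed value: the commit names FINALIZE_PRE_TIMEOUT_MS but does not state it.
const FINALIZE_PRE_TIMEOUT_MS = 60_000;

type TimeoutResult<T> = { timedOut: false; value: T } | { timedOut: true };

/** Races `work` against a timer. The work itself is not cancelled on timeout. */
async function withTimeout<T>(work: Promise<T>, ms: number): Promise<TimeoutResult<T>> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<TimeoutResult<T>>((resolve) => {
    timer = setTimeout(() => resolve({ timedOut: true }), ms);
  });
  try {
    return await Promise.race([
      work.then((value): TimeoutResult<T> => ({ timedOut: false, value })),
      timeout,
    ]);
  } finally {
    // Clear the timer so a fast completion does not keep the process alive.
    clearTimeout(timer);
  }
}
```

A caller would check `timedOut`, log a warning, and move on to the next iteration. Because the underlying promise keeps running after a timeout, late side effects still need a separate guard.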
Jeremy
e52be4fc09 fix(gsd): OS-specific keyboard shortcut hints via formatShortcut helper
Keyboard shortcut hints were hardcoded as Ctrl+Alt+X everywhere except
auto-dashboard.ts which had an inline platform check. On macOS these
should render as ⌃⌥X.

- Add formatShortcut() to files.ts — converts Ctrl/Alt/Shift/Cmd
  modifiers to macOS symbols (⌃/⌥/⇧/⌘) when process.platform is darwin
- Replace all inline platform checks and hardcoded hints with
  formatShortcut() calls
- Use template variables in system.md for shortcut hints
- Update comments in overlay files for consistency
- Add 7 tests covering all modifier conversions and passthrough

Closes #3753
2026-04-07 20:53:04 -05:00
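The modifier conversion in e52be4fc09 can be illustrated with a small sketch. The commit says the real `formatShortcut()` checks `process.platform`; passing the platform as an explicit parameter here is an assumption made to keep the example testable, and the `+`-separated input format is inferred from the `Ctrl+Alt+X` example.

```typescript
// Modifier names and symbols are the ones listed in the commit message.
const MAC_SYMBOLS: Record<string, string> = {
  Ctrl: "⌃",
  Alt: "⌥",
  Shift: "⇧",
  Cmd: "⌘",
};

/** Renders "Ctrl+Alt+X" as "⌃⌥X" on macOS and leaves it unchanged elsewhere. */
function formatShortcut(shortcut: string, platform: string): string {
  if (platform !== "darwin") return shortcut;
  return shortcut
    .split("+")
    .map((part) => MAC_SYMBOLS[part] ?? part) // non-modifier keys pass through
    .join("");
}
```

In a Node process the call site would pass `process.platform` as the second argument.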
Jeremy McSpadden
94f9b311d3 Merge pull request #3750 from jeremymcs/fix/depth-verification-answer-validation
fix(gsd): validate depth verification answer before unlocking write-gate
2026-04-07 19:56:49 -05:00
Jeremy
dd08f501e5 fix(gsd): validate depth verification answer before unlocking write-gate
The tool_result handler called markDepthVerified() whenever
ask_user_questions returned any response with a depth_verification
question ID — without checking what the user actually selected.
Selecting "Not quite", "None of the above", or garbage input all
unlocked the gate.

- Extract isDepthConfirmationAnswer() into write-gate.ts with structural
  validation: cross-references selected answer against the question's
  defined options, only accepting an exact match of the first option
  (confirmation by convention). Rejects free-form "Other" text and
  decouples from any specific label substring.
- Harden block message with explicit anti-bypass language
- Add anti-bypass instructions to all three discuss prompts
- Add 8 new tests covering: structural validation, free-form bypass
  rejection, label-drift resilience, fallback behavior, edge cases

Closes #3749
2026-04-07 19:45:00 -05:00
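The structural check from dd08f501e5 could be sketched like this. The `Question` and `QuestionOption` shapes are hypothetical, and the "fallback behavior" the commit mentions is not modeled because the log does not describe it.

```typescript
// Hypothetical shapes: the real question/answer types are not shown in the log.
interface QuestionOption {
  label: string;
}

interface Question {
  id: string;
  options: QuestionOption[];
}

/**
 * Accepts only an exact match of the first defined option, which the commit
 * treats as the confirmation by convention. Free-form text never matches.
 */
function isDepthConfirmationAnswer(question: Question, selected: string | undefined): boolean {
  const confirmation = question.options[0];
  if (confirmation === undefined || selected === undefined) return false;
  return selected === confirmation.label;
}
```

Comparing against the option's position rather than a label substring is what keeps the check working if the confirmation wording changes.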
mastertyko
fc38dd6f91 fix(subagent): support list-style tools frontmatter 2026-04-08 01:56:38 +02:00
Jeremy McSpadden
a3ea23bda1 Merge pull request #3727 from jeremymcs/fix/state-machine-wave5-consistency
fix(gsd): consistency and cleanup (wave 5/5)
2026-04-07 18:27:42 -05:00
mastertyko
9f4666be61 fix: clear autocomplete rows from content bottom
Use the rendered content row as the shrink diff baseline instead of
reusing the IME hardware cursor row. Add a focused TUI regression test
that reproduces the ghost-line cleanup path when autocomplete shrinks.

Closes #3721
2026-04-08 01:25:31 +02:00
mastertyko
12b6a01dae fix: parse annotated pre-exec file paths
Strip planner-style path annotations before pre-execution checks compare
inputs and expected outputs. This keeps existing files, prior outputs,
and ordering checks aligned even when task-plan entries include inline
descriptions.

Closes #3742
2026-04-08 01:25:24 +02:00
Jeremy
c038f68898 Merge upstream/main into fix/state-machine-wave5-consistency
Resolve conflict in workflow-events.ts: keep HEAD's v? schema version
field addition alongside the cmd field.
2026-04-07 18:06:40 -05:00
Jeremy McSpadden
2ff938596b Merge pull request #3726 from jeremymcs/fix/state-machine-wave4-writes
fix(gsd): write safety — atomic writes and randomized tmp (wave 4/5)
2026-04-07 18:04:43 -05:00
Jeremy McSpadden
00f401ab98 Merge pull request #3725 from jeremymcs/fix/state-machine-wave3-session
fix(gsd): session and recovery robustness (wave 3/5)
2026-04-07 18:04:27 -05:00
Jeremy McSpadden
1c415f3a83 Merge pull request #3724 from jeremymcs/fix/state-machine-wave2-events
fix(gsd): event log and reconciliation robustness (wave 2/5)
2026-04-07 18:04:10 -05:00
Jeremy
ae8d3b6983 Merge upstream/main into fix/state-machine-wave2-events
Resolve conflict in workflow-reconcile.ts: keep upstream's
complete_milestone invariant check (getMilestoneSlices guard) and
HEAD's plan_milestone/plan_slice/plan_task replay handlers.
2026-04-07 17:48:24 -05:00
Jeremy
4fdce5d179 test(gsd): align hasImplementationArtifacts tests with string return type
The function signature changed from boolean to "present" | "absent" |
"unknown" but three test assertions still compared against true/false.

Update assertions to match the new return type.
2026-04-07 17:20:37 -05:00
Jeremy McSpadden
c49b1f2cb9 Merge pull request #3722 from jeremymcs/fix/state-machine-wave1-critical
fix(gsd): critical state machine data integrity (wave 1/5)
2026-04-07 17:17:28 -05:00
Jeremy
fe8d67579e fix(gsd): add orphaned milestone branch audit at auto-mode bootstrap
When a milestone completes but the session ends before teardown runs,
the milestone branch and worktree directory are orphaned — the DB says
complete so auto-mode won't re-enter, and the teardown is never retried.

Adds auditOrphanedMilestoneBranches() that runs after DB open during
bootstrap. For each milestone/* branch where the DB status is complete:
- If already merged into main → deletes the branch + cleans worktree dir
- If NOT merged → preserves the branch and warns the user

Includes 9 regression tests covering merged/unmerged/active/none-mode
scenarios.
2026-04-07 16:33:03 -05:00
Jeremy
5cfc865040 fix(gsd): revert unknown artifact check to warn-and-proceed
Blocking on "unknown" from hasImplementationArtifacts broke real-world
auto-mode in projects without clean git merge-bases (single-branch,
fresh repos, detached HEAD). The auto-loop silently stopped at
completing-milestone with no visible error.

Reverted to warn-and-proceed for "unknown" — only "absent" (confirmed
no implementation files) blocks completion. This matches the original
fail-open behavior for inconclusive git checks.
2026-04-07 15:34:46 -05:00
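The policy that 5cfc865040 settles on (only "absent" blocks, "unknown" warns and proceeds) can be written as a small decision table. `decideCompletion` and its return shape are hypothetical; only the three string values come from the log.

```typescript
type ArtifactCheck = "present" | "absent" | "unknown";

interface CompletionDecision {
  proceed: boolean;
  warn: boolean;
}

/** Fail-open policy: an inconclusive git check warns but does not block. */
function decideCompletion(check: ArtifactCheck): CompletionDecision {
  switch (check) {
    case "present":
      return { proceed: true, warn: false };
    case "unknown":
      return { proceed: true, warn: true };
    case "absent":
      return { proceed: false, warn: false };
  }
}
```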
Jeremy
01f5557520 test(gsd): update audit tests for expanded SAFE_KEYS allowlist 2026-04-07 14:04:14 -05:00
Jeremy
9b998def03 fix(gsd): add missing cmd field to test base WorkflowEvent 2026-04-07 13:53:42 -05:00
Jeremy
a9c62adf22 fix(gsd): address remaining adversarial review findings for wave 3
1. hasImplementationArtifacts "unknown" now blocks completion instead of
   warn-and-proceed. Both auto-dispatch.ts and auto-recovery.ts updated
   to treat "unknown" as a stop condition, preventing milestone completion
   when git status cannot be verified.

2. Audit log SAFE_KEYS allowlist expanded to include "id", "error", and
   "count" fields. SPLIT BRAIN logError entries now persist the entity ID
   and rollback error details to audit-log.jsonl for triage/repair.
2026-04-07 13:50:49 -05:00
Jeremy
6d9b7f91b2 fix(gsd): detect concurrent event log growth during reconcile
Adds a pre-write guard in reconcileWorktreeLogs: re-reads the event
log before overwriting and retries if it grew since the initial read.
Prevents appendEvent calls between read and rewrite from being silently
dropped by the atomic overwrite.
2026-04-07 13:46:37 -05:00
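The pre-write guard from 6d9b7f91b2 follows a read, transform, re-read, then write pattern. The sketch below uses a hypothetical `LogStore` interface instead of the real file access in `reconcileWorktreeLogs`, and the retry limit is an assumption.

```typescript
// Hypothetical abstraction over the event log file.
interface LogStore {
  read(): string[];
  overwrite(lines: string[]): void;
}

/**
 * Rewrites the log, retrying when it grew between the initial read and the
 * write. Returns false if the log kept growing on every attempt.
 */
function rewriteLog(
  store: LogStore,
  transform: (lines: string[]) => string[],
  maxRetries = 3,
): boolean {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const snapshot = store.read();
    const next = transform(snapshot);
    // Pre-write guard: re-read and compare length before overwriting.
    if (store.read().length === snapshot.length) {
      store.overwrite(next);
      return true;
    }
  }
  return false;
}
```

A guard like this narrows the window in which an append can be lost but does not close it entirely, since an append can still land between the re-read and the overwrite.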
Jeremy
42141d8979 fix(gsd): address adversarial review findings for wave 3
1. Paused session file deletion deferred until after lock acquisition.
   Previously the file was deleted before acquireSessionLock — if the
   lock failed, the pause metadata was lost on disk and in memory,
   making the session unresumable. Now the file path is stored in
   s.pausedSessionFile and only deleted after successful lock.

2. Lock failure path preserves pause file for retry.
2026-04-07 13:43:10 -05:00
Jeremy
5f581c891c fix(gsd): address adversarial review findings for wave 2
1. plan_task and plan_slice replay now use strict INSERT OR IGNORE
   instead of calling insertTask/insertSlice which use ON CONFLICT
   DO UPDATE. Prevents replay of older plan events from downgrading
   progressed task/slice status back to pending.

2. Type guard on cmd normalization: non-string cmd values are skipped
   with a warning instead of throwing.

3. Type guard on extractEntityKey for consistency.
2026-04-07 13:40:50 -05:00
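The difference between the two SQL conflict strategies in 5f581c891c can be shown with an in-memory analogy. This uses a `Map` in place of the real SQLite tables, and the function names and `Status` values are hypothetical.

```typescript
type Status = "pending" | "in_progress" | "complete";

/** Upsert semantics (like ON CONFLICT DO UPDATE): always overwrites the status. */
function upsertTask(tasks: Map<string, Status>, id: string, status: Status): void {
  tasks.set(id, status);
}

/** Insert-or-ignore semantics: only creates the row when it is missing. */
function insertTaskIfAbsent(tasks: Map<string, Status>, id: string, status: Status): boolean {
  if (tasks.has(id)) return false;
  tasks.set(id, status);
  return true;
}
```

Replaying an old plan event through `upsertTask` would reset a completed task to pending, which is the downgrade the commit prevents by switching replay to insert-or-ignore.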
Jeremy
40a37125fe fix(gsd): address adversarial review findings for wave 1
1. Type guard on cmd normalization: non-string cmd values are now
   skipped with a warning instead of throwing, preventing replay
   from crashing on malformed event lines.

2. complete_milestone replay now validates all slices are closed
   before marking milestone complete. Prevents a reordered/partial
   event stream from closing a milestone with incomplete work.

3. Type guard on extractEntityKey cmd normalization for consistency.
2026-04-07 13:36:33 -05:00
Jeremy
49ede9e3bb fix(gsd): WAL-safe migration backup + stronger regression tests
Addresses Codex adversarial review findings:

1. Migration backup now flushes WAL via PRAGMA wal_checkpoint(TRUNCATE)
   before copyFileSync. Without this, the backup could miss committed
   data that only exists in the -wal file. Backup failure is now logged
   via logWarning instead of silently swallowed.

2. Wave 5 regression tests strengthened:
   - Added behavior-level test for skipped/blocked/pending status mapping
     to checkbox rendering (not just isClosedStatus helper)
   - Added extractEntityKey round-trip tests for underscored cmd formats
   - Added unknown cmd → null safety test
2026-04-07 13:27:30 -05:00
Jeremy
d7ce14a50a test(gsd): add regression tests for wave 5 consistency fixes
Tests isClosedStatus coverage for projections, upsertDecision seq
preservation (ON CONFLICT DO UPDATE vs INSERT OR REPLACE), and
event schema versioning (v:2 field in new events).
2026-04-07 12:48:11 -05:00
Jeremy
8d80cb1209 test(gsd): add regression tests for wave 4 write safety
Tests saveJsonFile atomic write correctness, no residual .tmp files,
concurrent write safety, and round-trip through loadJsonFile.
2026-04-07 12:47:30 -05:00
Jeremy
49080c90e2 test(gsd): add regression tests for wave 3 session fixes
Tests tri-state hasImplementationArtifacts return values and
AutoSession.consecutiveCompleteBootstraps per-session isolation
and reset() behavior.
2026-04-07 12:46:14 -05:00
Jeremy
ac2a832f67 test(gsd): add regression tests for wave 2 + fix empty catch blocks
Adds tests for plan event entity key extraction and unknown cmd handling.
Fixes empty catch blocks in auto-recovery.ts appendEvent calls that failed
the "no empty catch blocks" CI lint.
2026-04-07 12:45:45 -05:00
Jeremy
03dc62308d test(gsd): add regression tests for wave 1 critical fixes
Covers event log cmd format normalization (hyphens + underscores),
extractEntityKey for complete-milestone, and isClosedStatus
including skipped status.
2026-04-07 12:44:58 -05:00
Jeremy
5c9ee8f10d fix(gsd): consistency and cleanup (wave 5/5)
Five consistency fixes to eliminate divergence sources:

1. workflow-projections.ts: Direct string comparisons for task/slice status
   replaced with isClosedStatus() from status-guards.ts. Skipped tasks now
   correctly show checked checkboxes in PLAN.md and ROADMAP.md.

2. gsd-db.ts upsertDecision: INSERT OR REPLACE changed to INSERT ... ON
   CONFLICT(id) DO UPDATE SET. Preserves the seq column so decision ordering
   in DECISIONS.md is stable after reconcile replay.

3. state.ts: Duplicate private isStatusDone() removed, replaced with alias
   to isClosedStatus from status-guards.ts. Single source of truth for
   "what counts as closed."

4. gsd-db.ts migrateSchema: Database is now backed up to
   gsd.db.backup-v{currentVersion} before running migration steps. A mid-
   migration crash no longer leaves a partially-migrated DB with no recovery.

5. workflow-events.ts: WorkflowEvent interface now includes optional v field
   (schema version). New events are written with v:2. Legacy events (no v
   field) are still accepted. Prevents future cmd-format drift from requiring
   another dual-read fix.
2026-04-07 12:42:53 -05:00
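Item 1 of 5c9ee8f10d replaces scattered string comparisons with one status guard. A minimal sketch follows, assuming the closed set is complete, done, and skipped (the statuses these commits mention); `renderTaskLine` and the exact checkbox format are hypothetical.

```typescript
// Assumed set of closed statuses; the commits mention complete, done, and skipped.
const CLOSED_STATUSES: ReadonlySet<string> = new Set(["complete", "done", "skipped"]);

/** Single source of truth for "what counts as closed". */
function isClosedStatus(status: string): boolean {
  return CLOSED_STATUSES.has(status);
}

/** Renders a checkbox line; closed tasks are checked. */
function renderTaskLine(title: string, status: string): string {
  return `- [${isClosedStatus(status) ? "x" : " "}] ${title}`;
}
```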
Jeremy
c595eb2b22 fix(gsd): write safety — atomic writes and randomized tmp paths (wave 4/5)
Three write-safety fixes:

1. json-persistence.ts: Fixed .tmp suffix replaced with randomized suffix
   using crypto.randomBytes(4). Prevents concurrent-write data loss when two
   callers write the same JSON file simultaneously (metrics ledger at risk
   during parallel slice execution).

2. undo.ts: Raw writeFileSync on PLAN.md replaced with atomicWriteSync.
   Prevents crash mid-write from corrupting PLAN.md permanently.

3. triage-resolution.ts: All 6 writeFileSync calls replaced with
   atomicWriteSync. Covers PLAN.md inject, REPLAN-TRIGGER.md, REGRESSION.md,
   and CONTEXT-DRAFT.md writes.
2026-04-07 12:39:08 -05:00
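The write-safety pattern in c595eb2b22 combines a randomized temp name with a rename. The sketch below shows one way an `atomicWriteSync` helper could work; the real helper's signature and temp-name format are not shown in the log, so both are assumptions.

```typescript
import { randomBytes } from "node:crypto";
import * as fs from "node:fs";

/**
 * Writes to a uniquely named temp file next to the target, then renames it
 * into place. Readers see either the old file or the new one, never a
 * partial write, and concurrent writers do not share a temp path.
 */
function atomicWriteSync(filePath: string, data: string): void {
  const tmpPath = `${filePath}.${randomBytes(4).toString("hex")}.tmp`;
  try {
    fs.writeFileSync(tmpPath, data, "utf8");
    fs.renameSync(tmpPath, filePath);
  } catch (err) {
    fs.rmSync(tmpPath, { force: true }); // best-effort cleanup of the temp file
    throw err;
  }
}
```

Keeping the temp file in the same directory as the target matters because a rename is only atomic within one filesystem.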
Jeremy
dc9899c9d6 fix(gsd): session and recovery robustness (wave 3/5)
Five fixes for session lifecycle and recovery reliability:

1. hasImplementationArtifacts now returns tri-state ("present"|"absent"|"unknown")
   instead of boolean. "unknown" on git errors lets callers warn+proceed instead
   of either silently blocking or silently allowing. Both callers updated.

2. DB-ahead-of-disk split-brain: rollback DELETE in db-writer.ts saveDecisionToDb
   and saveRequirementToDb now wrapped in try/catch with logError. A failed
   rollback is explicitly logged as SPLIT BRAIN so the orphaned row is auditable.

3. _consecutiveCompleteBootstraps moved from module-level in auto-start.ts into
   AutoSession class. Now properly reset by s.reset(), preventing cross-session
   counter bleed in long-running processes (VS Code extension).

4. s.paused sticky on lock failure: when acquireSessionLock fails during resume,
   s.paused is now set back to false so isAutoPaused() doesn't return true
   permanently.

5. nativeCommit empty message replaced with "chore(gsd): reconcile merge state"
   to avoid rejection by strict git configurations.
2026-04-07 12:35:43 -05:00
Jeremy
40b893ea22 fix(gsd): event log and reconciliation robustness (wave 2/5)
Five fixes for event log integrity and worktree reconciliation:

1. writeBlockerPlaceholder now appends event log entries after DB writes
   so recovery-path completions are visible to worktree reconciliation.

2. appendEvent failure is no longer silently swallowed in completion tools.
   Each post-mutation step (projections, manifest, event log) now has its
   own try/catch so a projection failure cannot prevent the event log entry.
   Event log failures use logError (persisted to audit-log.jsonl) instead
   of logWarning.

3. verification_evidence dedup confirmed already in place — INSERT OR IGNORE
   with unique index on (task_id, slice_id, milestone_id, command, verdict).

4. New entity replay handlers added to replayEvents: plan_milestone (creates
   milestone via INSERT OR IGNORE), plan_slice (creates slice), plan_task
   (creates task), replan_slice (informational no-op). Also added to
   extractEntityKey for conflict detection.

5. Post-reconcile cache invalidation added — targeted invalidation
   (invalidateStateCache + clearPathCache + clearParseCache) at the end of
   reconcileWorktreeLogs so deriveState() sees post-reconcile DB state.
2026-04-07 12:14:46 -05:00
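Item 2 of 40b893ea22 gives each post-mutation step its own try/catch. That isolation can be sketched as below; `Step`, `runPostMutationSteps`, and the injected `logError` callback are hypothetical stand-ins for the real completion-tool code.

```typescript
interface Step {
  name: string;
  run: () => void;
}

/**
 * Runs every step even if an earlier one throws, and reports each failure
 * through `logError`. Returns the names of the steps that failed.
 */
function runPostMutationSteps(steps: Step[], logError: (message: string) => void): string[] {
  const failed: string[] = [];
  for (const step of steps) {
    try {
      step.run();
    } catch (err) {
      failed.push(step.name);
      logError(`${step.name} failed: ${err instanceof Error ? err.message : String(err)}`);
    }
  }
  return failed;
}
```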
Jeremy
b357411a0f fix(gsd): critical state machine data integrity fixes (wave 1/5)
Four critical fixes for the GSD state machine:

1. Event log cmd format mismatch — completion tools write hyphenated cmds
   ("complete-task") but replayEvents handled only underscored ("complete_task").
   Worktree reconciliation replay was completely broken for modern completions.
   Fix: normalize cmd via replace(/-/g, "_") in both replayEvents and
   extractEntityKey. Also adds complete_milestone replay handler and warns
   on unknown commands instead of silently skipping.

2. Dead if-block at state.ts:434-440 — empty block with misleading comments
   wasted getMilestoneSlices() + every() computation. Removed and replaced
   with clear comment explaining why all-slices-done milestones without
   SUMMARY are intentionally not added to completeMilestoneIds.

3. getActiveMilestoneId missing "skipped" status — checked complete/done/parked
   but not skipped. isStatusDone() includes skipped, creating divergence where
   a skipped milestone could become permanently "active". Fix: use
   isClosedStatus() || parked check.

4. executeReplan disk-file fallback — triage-resolution.ts writes replan
   trigger to disk and DB (best-effort). If DB write fails, deriveStateFromDb
   only checked the DB column, making the trigger invisible. Fix: fall back
   to checking the disk REPLAN-TRIGGER file when DB column is null.
2026-04-07 12:07:06 -05:00
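The cmd normalization in item 1 of b357411a0f, together with the type guard added later in 40a37125fe, can be sketched as follows. The `replace(/-/g, "_")` call is taken from the commit; `RawEvent`, the handler map, and the `warn` callback are hypothetical.

```typescript
// Hypothetical event shape: only the cmd field matters for this sketch.
interface RawEvent {
  cmd?: unknown;
  [key: string]: unknown;
}

/** Maps "complete-task" and "complete_task" to the same key; non-strings yield null. */
function normalizeCmd(cmd: unknown): string | null {
  return typeof cmd === "string" ? cmd.replace(/-/g, "_") : null;
}

/** Replays events through their handlers and warns on anything unrecognized. */
function replayEvents(
  events: RawEvent[],
  handlers: Map<string, (event: RawEvent) => void>,
  warn: (message: string) => void,
): number {
  let applied = 0;
  for (const event of events) {
    const cmd = normalizeCmd(event.cmd);
    const handler = cmd === null ? undefined : handlers.get(cmd);
    if (handler === undefined) {
      warn(`skipping event with unknown cmd: ${String(event.cmd)}`);
      continue;
    }
    handler(event);
    applied += 1;
  }
  return applied;
}
```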
Jeremy
9f7071ea6f fix(gsd): critical state machine data integrity fixes (wave 1/5)
Four critical fixes for the GSD state machine:

1. Event log cmd format mismatch — completion tools write hyphenated cmds
   ("complete-task") but replayEvents handled only underscored ("complete_task").
   Worktree reconciliation replay was completely broken for modern completions.
   Fix: normalize cmd via replace(/-/g, "_") in both replayEvents and
   extractEntityKey. Also adds complete_milestone replay handler and warns
   on unknown commands instead of silently skipping.

2. Dead if-block at state.ts:434-440 — empty block with misleading comments
   wasted getMilestoneSlices() + every() computation. Removed and replaced
   with clear comment explaining why all-slices-done milestones without
   SUMMARY are intentionally not added to completeMilestoneIds.

3. getActiveMilestoneId missing "skipped" status — checked complete/done/parked
   but not skipped. isStatusDone() includes skipped, creating divergence where
   a skipped milestone could become permanently "active". Fix: use
   isClosedStatus() || parked check.

4. executeReplan disk-file fallback — triage-resolution.ts writes replan
   trigger to disk and DB (best-effort). If DB write fails, deriveStateFromDb
   only checked the DB column, making the trigger invisible. Fix: fall back
   to checking the disk REPLAN-TRIGGER file when DB column is null.
2026-04-07 12:06:24 -05:00
Jeremy McSpadden
0dd7c31213 Merge pull request #3719 from jeremymcs/fix/suppress-model-notify-automode
fix(gsd): suppress model change notification in auto-mode unless verbose
2026-04-07 11:00:50 -05:00
Jeremy McSpadden
11f12e31a9 Merge pull request #3602 from OfficialDelta/feat/enhanced-discussion
feat(gsd): add deep evidence-backed discussion system with preparation engine
2026-04-07 11:00:36 -05:00
Jeremy
230939e558 test(gsd): add regression test for verbose-gated model change notification
Verifies that the Model [phase] [tier] notification in selectAndApplyModel
is gated behind the verbose flag to prevent auto-mode notification noise.
2026-04-07 10:48:41 -05:00
OfficialDelta
c4a1fc11a0 fix(gsd): remove ecosystem research stub and address adversarial review
Ecosystem research: executeSearchQuery() was a stub returning empty
results. Research now happens during the discussion (between Layer 1
and Layer 2) using whatever web search tools are available — native
Anthropic web search for Claude, search-the-web for other providers.
Preparation phase focuses on mechanical work only.

Adversarial review fixes:
- Clear lastPreparationResult on every discuss entry to prevent
  cross-session/project state leaks
- Replace invalid JS regex anchor \z with indexOf-based section
  extraction in prompt-validation.ts
- Document consecutive error counter finding as upstream behavior
  (agent-loop.ts is part of pi-agent-core, not gsd extension)
2026-04-07 11:39:01 -04:00
OfficialDelta
f8c148eed7 chore: auto-commit after quick-task
GSD-Unit: Q13
2026-04-07 11:38:49 -04:00