Covers event log cmd format normalization (hyphens + underscores),
extractEntityKey for complete-milestone, and isClosedStatus
including skipped status.
Four critical fixes for the GSD state machine:
1. Event log cmd format mismatch — completion tools write hyphenated cmds
("complete-task") but replayEvents handled only underscored ("complete_task").
Worktree reconciliation replay was completely broken for modern completions.
Fix: normalize cmd via replace(/-/g, "_") in both replayEvents and
extractEntityKey. Also adds complete_milestone replay handler and warns
on unknown commands instead of silently skipping.
2. Dead if-block at state.ts:434-440 — empty block with misleading comments
wasted getMilestoneSlices() + every() computation. Removed and replaced
with clear comment explaining why all-slices-done milestones without
SUMMARY are intentionally not added to completeMilestoneIds.
3. getActiveMilestoneId missing "skipped" status — checked complete/done/parked
but not skipped. isStatusDone() includes skipped, creating divergence where
a skipped milestone could become permanently "active". Fix: use
isClosedStatus() || parked check.
4. executeReplan disk-file fallback — triage-resolution.ts writes replan
trigger to disk and DB (best-effort). If DB write fails, deriveStateFromDb
only checked the DB column, making the trigger invisible. Fix: fall back
to checking the disk REPLAN-TRIGGER file when DB column is null.
Ecosystem research: executeSearchQuery() was a stub returning empty
results. Research now happens during the discussion (between Layer 1
and Layer 2) using whatever web search tools are available — native
Anthropic web search for Claude, search-the-web for other providers.
Preparation phase focuses on mechanical work only.
Adversarial review fixes:
- Clear lastPreparationResult on every discuss entry to prevent
cross-session/project state leaks
- Replace invalid JS regex anchor \z with indexOf-based section
extraction in prompt-validation.ts
- Document consecutive error counter finding as upstream behavior
(agent-loop.ts is part of pi-agent-core, not gsd extension)
The Model [phase] [tier] notification fired on every unit dispatch during
auto-mode, cluttering the notification widget. The dashboard header already
displays the active model, making this redundant. Gate behind verbose flag
consistent with all other model routing notifications in the same function.
task.files ("files likely touched") is a planning hint that includes files
a task will create, not a dependency contract. Including it in ordering
checks caused false "sequence violation" blocking errors when a task listed
files it would create. Only task.inputs (machine-parsed prerequisites)
should trigger ordering violations, matching checkFilePathConsistency (#3626).
Closes#3677
- Changed checkTaskOrdering to check [...task.inputs] instead of
[...task.files, ...task.inputs]
- Updated 4 existing tests to use inputs (were testing buggy behavior)
- Added 8 regression tests: 5 ordering false-positive cases,
3 consistency edge cases
The ghost milestone check (#3645) was eliminating queued shell
milestones before the deferred-shell logic (#3470) could handle them,
causing queued milestones to vanish from the registry entirely.
Fixes workflow-logger coverage test failures: empty catch blocks in
reopen-slice/reopen-task and raw process.stderr in reopen-milestone
now use logWarning from workflow-logger.
Fixes identified by comprehensive state machine validation:
- M12: reopen-task/slice now deletes SUMMARY.md from disk, preventing
the DB-filesystem reconciler from auto-correcting tasks back to
"complete" — reopen was previously a no-op when artifacts existed
- H4: add 30s hard timeout to unitPromise via Promise.race — prevents
permanent hang if supervision fails to resolve agent_end
- H5: add handleReopenMilestone — milestone completion was irrevocable
- H6: pass ID as title when auto-creating phantom parent entities
- H7: guard loadRegistry() against missing/corrupt registry.json
- M4: report_blocker replay now sets blocker_discovered flag via
new setTaskBlockerDiscovered() DB function
- M5: insertVerificationEvidence uses INSERT OR IGNORE with unique
index on (task_id, slice_id, milestone_id, command, verdict)
- M11: complete-slice rollback preserves original status instead of
hardcoding "pending"
- M14: deriveWorkflowAction shows contextual labels for blocked,
paused, validating-milestone, completing-milestone, needs-discussion,
and replanning-slice phases instead of generic "Continue"
Includes 86 regression tests (49 unit + 37 integration) validating
every phase transition, completion guard, and edge case.
Closes#3161