Addresses Codex adversarial review findings:
1. Migration backup now flushes WAL via PRAGMA wal_checkpoint(TRUNCATE)
before copyFileSync. Without this, the backup could miss committed
data that only exists in the -wal file. Backup failure is now logged
via logWarning instead of silently swallowed.
2. Wave 5 regression tests strengthened:
- Added behavior-level test for skipped/blocked/pending status mapping
to checkbox rendering (not just isClosedStatus helper)
- Added extractEntityKey round-trip tests for underscored cmd formats
- Added unknown cmd → null safety test
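The checkpoint-then-copy order from item 1 can be sketched as below. This is a minimal illustration, not the actual migrateSchema code: the CheckpointableDb interface, the backupBeforeMigration name, and the injected copyFile/logWarning parameters are hypothetical, and the pragma() method is assumed to behave like better-sqlite3's Database#pragma.

```typescript
// Hypothetical minimal view of the DB handle: only the pragma() call is needed.
interface CheckpointableDb {
  pragma(statement: string): unknown;
}

// Flush the WAL into the main database file, then copy that file.
// Returns the backup path, or null if the backup could not be written.
function backupBeforeMigration(
  db: CheckpointableDb,
  dbPath: string,
  currentVersion: number,
  copyFile: (src: string, dest: string) => void, // e.g. copyFileSync from node:fs
  logWarning: (message: string) => void,
): string | null {
  const backupPath = `${dbPath}.backup-v${currentVersion}`;
  try {
    // Without this, committed pages may still live only in the -wal file.
    db.pragma("wal_checkpoint(TRUNCATE)");
    copyFile(dbPath, backupPath);
    return backupPath;
  } catch (err) {
    // Report the failure instead of swallowing it.
    const reason = err instanceof Error ? err.message : String(err);
    logWarning(`schema backup failed for ${dbPath}: ${reason}`);
    return null;
  }
}
```

Injecting copyFile and logWarning keeps the sketch testable without touching disk; whether a failed backup should abort the migration is left open here.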
Adds tests covering isClosedStatus use in projections, upsertDecision seq
preservation (ON CONFLICT DO UPDATE vs INSERT OR REPLACE), and
event schema versioning (v:2 field in new events).
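A behavior-level view of the status-to-checkbox mapping might look like the sketch below. The exact set of closed statuses and the renderTaskLine helper are assumptions for illustration; only the idea that skipped counts as closed while blocked and pending do not comes from the description above.

```typescript
// Assumed closed set; the real list lives in status-guards.ts.
const CLOSED_STATUSES: ReadonlySet<string> = new Set(["complete", "skipped"]);

function isClosedStatus(status: string): boolean {
  return CLOSED_STATUSES.has(status);
}

// Hypothetical projection helper: renders one task as a checklist line.
function renderTaskLine(id: string, title: string, status: string): string {
  const box = isClosedStatus(status) ? "[x]" : "[ ]";
  return `- ${box} ${id}: ${title}`;
}
```

Routing every projection through the one helper is what keeps a skipped task from rendering as open in one file and closed in another.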
Five consistency fixes to eliminate divergence sources:
1. workflow-projections.ts: Direct string comparisons for task/slice status
replaced with isClosedStatus() from status-guards.ts. Skipped tasks now
correctly show checked checkboxes in PLAN.md and ROADMAP.md.
2. gsd-db.ts upsertDecision: INSERT OR REPLACE changed to INSERT ... ON
CONFLICT(id) DO UPDATE SET. Preserves the seq column so decision ordering
in DECISIONS.md is stable after reconcile replay.
3. state.ts: Duplicate private isStatusDone() removed, replaced with alias
to isClosedStatus from status-guards.ts. Single source of truth for
"what counts as closed."
4. gsd-db.ts migrateSchema: Database is now backed up to
gsd.db.backup-v{currentVersion} before running migration steps. A mid-
migration crash no longer leaves a partially-migrated DB with no recovery.
5. workflow-events.ts: WorkflowEvent interface now includes optional v field
(schema version). New events are written with v:2. Legacy events (no v
field) are still accepted. Prevents future cmd-format drift from requiring
another dual-read fix.
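The optional version field from item 5 can be read roughly as follows. This is a sketch under assumptions: the real WorkflowEvent has more fields than cmd and v, the helper names are hypothetical, and treating a missing v as version 1 is a convention chosen here, not one stated above.

```typescript
interface WorkflowEvent {
  cmd: string;
  v?: number; // schema version; absent on legacy events
}

const CURRENT_EVENT_VERSION = 2;

// New events always carry the current schema version.
function createEvent(cmd: string): WorkflowEvent {
  return { cmd, v: CURRENT_EVENT_VERSION };
}

// Assumption: legacy events without v are treated as version 1.
function eventVersion(event: WorkflowEvent): number {
  return event.v ?? 1;
}

// Accepts both legacy and versioned lines; returns null for malformed input.
function parseEventLine(line: string): WorkflowEvent | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(line);
  } catch {
    return null;
  }
  if (typeof parsed !== "object" || parsed === null) return null;
  const record = parsed as Record<string, unknown>;
  const cmd = record["cmd"];
  const v = record["v"];
  if (typeof cmd !== "string") return null;
  if (v !== undefined && typeof v !== "number") return null;
  const event: WorkflowEvent = { cmd };
  if (typeof v === "number") event.v = v;
  return event;
}
```

A reader can then branch on eventVersion(event) rather than guessing the format from the shape of cmd.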
Ecosystem research: executeSearchQuery() was a stub returning empty
results. Research now happens during the discussion (between Layer 1
and Layer 2) using whatever web search tools are available — native
Anthropic web search for Claude, search-the-web for other providers.
The preparation phase now handles mechanical work only.
Adversarial review fixes:
- Clear lastPreparationResult on every discuss entry to prevent
cross-session/project state leaks
- Replace invalid JS regex anchor \z with indexOf-based section
extraction in prompt-validation.ts
- Document consecutive error counter finding as upstream behavior
(agent-loop.ts is part of pi-agent-core, not gsd extension)
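JavaScript regular expressions have no \z end-of-input anchor, so an indexOf-based extraction avoids the problem entirely. The sketch below shows the general shape; extractSection and the "## " heading convention are assumptions, not the actual code in prompt-validation.ts.

```typescript
// Returns the body of the section that starts at `heading`, up to the next
// "## " heading or the end of the document. Returns null if not found.
// Note: matches the first occurrence of `heading` as a plain substring.
function extractSection(markdown: string, heading: string): string | null {
  const start = markdown.indexOf(heading);
  if (start === -1) return null;

  const bodyStart = start + heading.length;
  const nextHeading = markdown.indexOf("\n## ", bodyStart);
  const end = nextHeading === -1 ? markdown.length : nextHeading;

  return markdown.slice(bodyStart, end).trim();
}
```

Falling back to markdown.length when no further heading exists plays the role the \z anchor was meant to play.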
The Model [phase] [tier] notification fired on every unit dispatch during
auto-mode, cluttering the notification widget. The dashboard header already
displays the active model, making this redundant. Gate it behind the verbose
flag, consistent with all other model routing notifications in the same function.
task.files ("files likely touched") is a planning hint that includes files
a task will create, not a dependency contract. Including it in ordering
checks caused false "sequence violation" blocking errors when a task listed
files it would create. Only task.inputs (machine-parsed prerequisites)
should trigger ordering violations, matching checkFilePathConsistency (#3626).
Closes #3677
- Changed checkTaskOrdering to check [...task.inputs] instead of
[...task.files, ...task.inputs]
- Updated 4 existing tests to use inputs (were testing buggy behavior)
- Added 8 regression tests: 5 ordering false-positive cases,
3 consistency edge cases
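The inputs-only rule can be illustrated with the sketch below. The PlannedTask shape is hypothetical (in particular the outputs field used to find which task creates a file), and the real checkTaskOrdering may detect producers differently; the point is only that task.files never enters the check.

```typescript
// Hypothetical task shape for illustration.
interface PlannedTask {
  id: string;
  files: string[];   // planning hint only; ignored by the ordering check
  inputs: string[];  // machine-parsed prerequisites
  outputs: string[]; // assumed field: files this task creates
}

function checkTaskOrdering(tasks: PlannedTask[]): string[] {
  // Record the first task that produces each file.
  const producers = new Map<string, { index: number; id: string }>();
  tasks.forEach((task, index) => {
    for (const file of task.outputs) {
      if (!producers.has(file)) producers.set(file, { index, id: task.id });
    }
  });

  const violations: string[] = [];
  tasks.forEach((task, index) => {
    // Only inputs are checked; task.files is deliberately left out.
    for (const file of [...task.inputs]) {
      const producer = producers.get(file);
      if (producer !== undefined && producer.index > index) {
        violations.push(`${task.id} needs ${file} before ${producer.id} creates it`);
      }
    }
  });
  return violations;
}
```

A task that merely mentions a later task's file in files passes, while a task that declares the same file in inputs is still flagged.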
The ghost milestone check (#3645) was eliminating queued shell
milestones before the deferred-shell logic (#3470) could handle them,
causing queued milestones to vanish from the registry entirely.
Fixes workflow-logger coverage test failures: empty catch blocks in
reopen-slice/reopen-task and raw process.stderr writes in reopen-milestone
now use logWarning from workflow-logger.
Fixes identified by comprehensive state machine validation:
- M12: reopen-task/slice now deletes SUMMARY.md from disk, preventing
the DB-filesystem reconciler from auto-correcting tasks back to
"complete" — reopen was previously a no-op when artifacts existed
- H4: add 30s hard timeout to unitPromise via Promise.race — prevents
permanent hang if supervision fails to resolve agent_end
- H5: add handleReopenMilestone — milestone completion was irrevocable
- H6: pass ID as title when auto-creating phantom parent entities
- H7: guard loadRegistry() against missing/corrupt registry.json
- M4: report_blocker replay now sets blocker_discovered flag via
new setTaskBlockerDiscovered() DB function
- M5: insertVerificationEvidence uses INSERT OR IGNORE with unique
index on (task_id, slice_id, milestone_id, command, verdict)
- M11: complete-slice rollback preserves original status instead of
hardcoding "pending"
- M14: deriveWorkflowAction shows contextual labels for blocked,
paused, validating-milestone, completing-milestone, needs-discussion,
and replanning-slice phases instead of generic "Continue"
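The H4 hard timeout follows the usual Promise.race pattern, sketched below. withHardTimeout is a hypothetical helper, and rejecting on timeout is an assumption; the real code may resolve with a sentinel value instead.

```typescript
// Races `work` against a timer so the caller can never wait forever.
function withHardTimeout<T>(work: Promise<T>, ms: number, label: string): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;

  const timeout = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(() => {
      reject(new Error(`${label} timed out after ${ms}ms`));
    }, ms);
  });

  // Clear the timer either way so it does not keep the process alive.
  return Promise.race([work, timeout]).finally(() => {
    if (timer !== undefined) clearTimeout(timer);
  });
}
```

With the 30s value from H4, usage would look like withHardTimeout(unitPromise, 30_000, "unit"). Promise.race does not cancel the losing promise, so the underlying work still needs its own cleanup.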
Includes 86 regression tests (49 unit + 37 integration) validating
every phase transition, completion guard, and edge case.
Closes #3161