diff --git a/.gsd/milestones/.DS_Store b/.gsd/milestones/.DS_Store
new file mode 100644
index 000000000..2c5d28252
Binary files /dev/null and b/.gsd/milestones/.DS_Store differ
diff --git a/.gsd/milestones/M001/M001-CONTEXT.md b/.gsd/milestones/M001/M001-CONTEXT.md
new file mode 100644
index 000000000..210ba9ba7
--- /dev/null
+++ b/.gsd/milestones/M001/M001-CONTEXT.md
@@ -0,0 +1,122 @@
+# M001: Tool-Driven Planning State Capture
+
+**Gathered:** 2026-03-23
+**Status:** Ready for planning
+
+## Project Description
+
+GSD-2 is a CLI coding agent harness that manages structured planning and execution workflows. M001/PR #2141 moved completion state to SQLite via tool calls. The planning half remains markdown-first: the LLM writes ROADMAP.md and PLAN.md directly to disk, the system regex-parses them back via 57+ `parseRoadmap()` callers, 42+ `parsePlan()` callers, and a 12-variant regex cascade in `roadmap-slices.ts`. This is the same anti-pattern M001 eliminated for completions.
+
+## Why This Milestone
+
+The parser cascade is the most common failure mode in GSD auto-mode. LLM formatting variance triggers fallback patterns, dependency ranges silently block slices, replans can renumber completed tasks (prompt-only enforcement), and `dispatch-guard.ts` re-parses ROADMAP.md on every slice dispatch. M001 proved the pattern — tool call → DB → rendered markdown — and M002 completes it for planning.
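
The tool call → DB → rendered markdown pattern described above can be sketched as follows. This is a minimal illustration, not GSD's implementation: `SliceRow`, `sliceTable`, `planSliceSketch`, and `renderSlicesSketch` are hypothetical names, and an in-memory `Map` stands in for SQLite.

```typescript
// Hypothetical sketch: names and shapes are illustrative, not GSD's actual API.
interface SliceRow {
  id: string;
  title: string;
  risk: "low" | "medium" | "high";
  depends: string[];
  done: boolean;
}

// Stand-in for a SQLite table; the real system persists rows via gsd-db.ts.
const sliceTable = new Map<string, SliceRow>();

// Step 1: the tool call arrives as structured data and is validated up front.
function planSliceSketch(input: SliceRow): { ok: boolean; error?: string } {
  if (!/^S\d{2}$/.test(input.id)) {
    return { ok: false, error: "invalid slice id: " + input.id };
  }
  // Step 2: the stored row becomes the source of truth.
  sliceTable.set(input.id, input);
  return { ok: true };
}

// Step 3: markdown is derived from rows, so its format is fixed by code.
function renderSlicesSketch(): string {
  return Array.from(sliceTable.values())
    .sort((a, b) => (a.id < b.id ? -1 : a.id > b.id ? 1 : 0))
    .map(
      (s) =>
        "- [" + (s.done ? "x" : " ") + "] **" + s.id + ": " + s.title + "** " +
        "`risk:" + s.risk + "` `depends:[" + s.depends.join(",") + "]`",
    )
    .join("\n");
}
```

Because the markdown is derived from rows, formatting variance in model output never reaches the file, and invalid input is rejected before any state changes.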
+
+## User-Visible Outcome
+
+### When this milestone is complete, the user can:
+
+- Run auto-mode with zero parser-related stalls from LLM formatting variance
+- See replan attempts that try to modify completed tasks rejected with clear errors instead of silently corrupting state
+- Experience faster dispatch cycles — DB queries replace markdown parsing on every dispatch
+
+### Entry point / environment
+
+- Entry point: `pi` CLI with `/gsd auto`
+- Environment: local dev
+- Live dependencies involved: none (SQLite is local)
+
+## Completion Class
+
+- Contract complete means: all planning tools produce correct DB state, all callers read from DB, cross-validation tests pass, parser removal doesn't break any test
+- Integration complete means: auto-mode runs a full milestone using the new tools (plan → execute → replan → reassess → complete cycle)
+- Operational complete means: pre-M002 projects seamlessly migrate, gsd recover handles new columns
+
+## Final Integrated Acceptance
+
+To call this milestone complete, we must prove:
+
+- A full auto-mode cycle (plan milestone → plan slice → execute tasks → complete slice → reassess → next slice) uses the new tools and DB queries with zero parseRoadmap/parsePlan calls in the dispatch hot path
+- A replan attempt that references completed tasks is structurally rejected by the tool handler
+- A pre-M002 project with existing ROADMAP.md and PLAN.md files auto-migrates to DB on first open
+
+## Risks and Unknowns
+
+- **LLM compliance with flat tool schemas** — LLMs may struggle with the multi-tool planning sequence (plan_milestone → plan_slice → plan_task for each task). Mitigated by flat schema design (locked decision #1) and TypeBox validation with clear error messages.
+- **Renderer fidelity during transition window** — Between S01 (tools write DB + render) and S04 (callers read from DB), callers still parse from disk. Renderer bugs create state divergence. Mitigated by cross-validation tests (R014).
+- **CONTINUE.md migration complexity** — It's a structured resume contract with hook writers, prompt construction, and cleanup semantics, not just a flag. Underestimating this scope risks breaking auto-mode resume.
+- **Prompt migration quality** — Planning prompts are significantly more complex than execution prompts. Rewriting them to produce tool calls while preserving creative planning quality is the hardest UX challenge.
+
+## Existing Codebase / Prior Art
+
+- `src/resources/extensions/gsd/tools/complete-task.ts` — M001 tool handler pattern (validate → DB transaction → render → cache invalidate)
+- `src/resources/extensions/gsd/tools/complete-slice.ts` — M001 tool handler pattern
+- `src/resources/extensions/gsd/gsd-db.ts` — SQLite abstraction, schema v7, migration chain, query functions
+- `src/resources/extensions/gsd/roadmap-slices.ts` — 271 lines, 12 prose variant regex patterns (primary removal target)
+- `src/resources/extensions/gsd/files.ts` — 1170 lines, parseRoadmap(), parsePlan(), cachedParse(), parseContinue/formatContinue
+- `src/resources/extensions/gsd/state.ts` — 1367 lines, deriveState()/deriveStateFromDb(), flag file checks
+- `src/resources/extensions/gsd/dispatch-guard.ts` — 106 lines, parseRoadmapSlices() on every slice dispatch
+- `src/resources/extensions/gsd/auto-dispatch.ts` — 656 lines, 18 rules, 4 with explicit disk I/O
+- `src/resources/extensions/gsd/md-importer.ts` — 713 lines, migrateHierarchyToDb()
+- `src/resources/extensions/gsd/markdown-renderer.ts` — 721 lines, checkbox patching (M001)
+- `src/resources/extensions/gsd/auto-prompts.ts` — 1649 lines, loadFile for ROADMAP/PLAN context injection
+- `src/resources/extensions/gsd/bootstrap/db-tools.ts` — 487 lines, tool registration patterns
+- `src/resources/extensions/gsd/auto-post-unit.ts` — detectRogueFileWrites (extend for PLAN/ROADMAP)
+- `src/resources/extensions/gsd/auto-verification.ts` — 233 lines, parsePlan for task.verify
+- `src/resources/extensions/gsd/bootstrap/register-hooks.ts` — CONTINUE.md hook writers
+- `src/resources/extensions/gsd/tests/derive-state-crossval.test.ts` — 527 lines, M001 cross-validation pattern
+
+> See `.gsd/DECISIONS.md` for all architectural and pattern decisions — it is an append-only register; read it during planning, append to it during execution.
+
+## Relevant Requirements
+
+- R001–R008 — Schema and tool implementations (S01–S03)
+- R009–R010 — Caller migration (S04–S05)
+- R011 — Flag file migration (S05)
+- R012 — Parser deprecation (S06)
+- R013–R019 — Cross-cutting concerns (prompts, validation, caching, migration)
+
+## Scope
+
+### In Scope
+
+- Schema v7→v8 migration with new columns and tables
+- 5 new planning tools: gsd_plan_milestone, gsd_plan_slice, gsd_plan_task, gsd_replan_slice, gsd_reassess_roadmap
+- Full markdown renderers (ROADMAP.md, PLAN.md, T##-PLAN.md) from DB state
+- Hot-path and warm/cold caller migration from parsers to DB queries
+- Flag file → DB column migration (REPLAN, ASSESSMENT, CONTINUE, CONTEXT-DRAFT, REPLAN-TRIGGER)
+- Prompt migration for 4 planning prompts
+- Cross-validation tests for the transition window
+- Pre-M002 project migration via extended migrateHierarchyToDb()
+- Rogue file detection for PLAN/ROADMAP writes
+
+### Out of Scope / Non-Goals
+
+- CQRS/event-sourcing architecture (R023)
+- Perfect round-trip recovery for tool-only fields (R024)
+- StateEngine abstraction layer (R021 — deferred)
+- parseSummary() migration (R020 — deferred)
+- Native Rust parser bridge removal (R022 — deferred, low risk follow-up)
+
+## Technical Constraints
+
+- Flat tool schemas (locked decision #1) — separate calls per entity, not deeply nested
+- No StateEngine abstraction (locked decision #2) — query functions added to gsd-db.ts
+- CONTINUE.md and CONTEXT-DRAFT migrate in M002 (locked decision #3)
+- Recovery accepts fidelity loss for tool-only fields (locked decision #4)
+- T##-PLAN.md files must remain a runtime contract — DB rows don't replace file existence checks
+- Sequence columns must propagate to query ORDER BY — otherwise reordering is a no-op
+- cachedParse() TTL cache must be invalidated alongside state cache in all tool handlers
+
+## Integration Points
+
+- `auto-dispatch.ts` dispatch rules — migrate 4 rules from disk I/O to DB queries
+- `dispatch-guard.ts` — migrate from parseRoadmapSlices() to getMilestoneSlices()
+- `auto-prompts.ts` — context injection pipeline (loads ROADMAP/PLAN from disk → could use artifacts table)
+- `deriveStateFromDb()` — flag file checks currently use existsSync, migrate to DB columns
+- `bootstrap/register-hooks.ts` — CONTINUE.md hook writers must migrate to DB writes
+- `guided-resume-task.md` prompt — reads CONTINUE.md, must read from DB column instead
+- `md-importer.ts` — migrateHierarchyToDb() extended for v8 columns
+
+## Open Questions
+
+- None — all design decisions locked in issue #2228 comments
diff --git a/.gsd/milestones/M001/M001-ROADMAP.md b/.gsd/milestones/M001/M001-ROADMAP.md
new file mode 100644
index 000000000..6ade73918
--- /dev/null
+++ b/.gsd/milestones/M001/M001-ROADMAP.md
@@ -0,0 +1,158 @@
+# M001: Tool-Driven Planning State Capture
+
+**Vision:** Complete the markdown→DB migration for planning state, eliminating 57+ parseRoadmap() callers, 42+ parsePlan() callers, and the 12-variant regex cascade. The LLM produces creative planning work via structured tool calls. TypeScript owns all state transitions. Markdown files become rendered views, not sources of truth.
+
+## Success Criteria
+
+- Auto-mode completes a full planning cycle (plan milestone → plan slice → execute → replan → reassess) using tool calls with zero parseRoadmap/parsePlan calls in the dispatch loop
+- Replan that references a completed task is structurally rejected by the tool handler
+- Pre-M002 project with existing ROADMAP.md and PLAN.md auto-migrates to DB on first open
+- deriveStateFromDb() resolves planning state without filesystem scanning for flag files
+
+## Key Risks / Unknowns
+
+- LLM compliance with multi-tool planning sequence — mitigated by flat schemas, TypeBox validation, clear errors
+- Renderer fidelity during transition window — mitigated by cross-validation tests
+- CONTINUE.md is a structured resume contract, not a flag — migration must preserve hook writers, prompt construction, cleanup semantics
+- Prompt migration complexity — planning prompts are more complex than execution prompts
+
+## Proof Strategy
+
+- LLM schema compliance → retire in S01/S02 by proving the tools accept valid input and reject invalid input via unit tests
+- Renderer fidelity → retire in S04 by proving DB state matches rendered-then-parsed state via cross-validation tests
+- CONTINUE.md complexity → retire in S05 by proving auto-mode resume flow works after flag file migration
+- Prompt quality → retire in S01/S02/S03 by verifying prompts produce valid tool calls in integration tests
+
+## Verification Classes
+
+- Contract verification: unit tests for tool handlers (validation, DB writes, rendering), cross-validation tests (DB↔parsed parity), parser removal doesn't break test suite
+- Integration verification: auto-mode dispatch loop uses DB queries, planning prompts produce valid tool calls
+- Operational verification: pre-M002 project migration, gsd recover handles v8 columns
+- UAT / human verification: auto-mode runs a real milestone end-to-end using new tools
+
+## Milestone Definition of Done
+
+This milestone is complete only when all are true:
+
+- All 5 planning tools are registered and functional (plan_milestone, plan_slice, plan_task, replan_slice, reassess_roadmap)
+- Zero parseRoadmap()/parsePlan()/parseRoadmapSlices() calls in the dispatch loop hot path
+- Replan and reassess structurally enforce preservation of completed tasks/slices
+- deriveStateFromDb() covers planning data — flag file checks moved to DB columns
+- Cross-validation tests prove DB state matches rendered-then-parsed state
+- All existing tests pass (no regressions)
+- Pre-M002 projects auto-migrate via migrateHierarchyToDb() with best-effort v8 column population
+- Planning prompts produce valid tool calls (not direct file writes)
+
+## Requirement Coverage
+
+- Covers: R001, R002, R003, R004, R005, R006, R007, R008, R009, R010, R011, R012, R013, R014, R015, R016, R017, R018, R019
+- Partially covers: none
+- Leaves for later: R020 (parseSummary), R021 (StateEngine), R022 (native parser bridge)
+- Orphan risks: none
+
+## Slices
+
+- [x] **S01: Schema v8 + plan_milestone tool + ROADMAP renderer** `risk:high` `depends:[]`
+  > After this: gsd_plan_milestone tool accepts structured params, writes to DB, renders ROADMAP.md from DB state. Parsers still work as fallback. Schema v8 migration runs on existing DBs. Rogue detection extended for ROADMAP writes.
+
+- [x] **S02: plan_slice + plan_task tools + PLAN/task-plan renderers** `risk:high` `depends:[S01]`
+  > After this: gsd_plan_slice and gsd_plan_task tools accept structured params, write to DB, render S##-PLAN.md and T##-PLAN.md from DB. Task plan files pass existence checks. Prompt migration for plan-slice.md complete.
+
+- [ ] **S03: replan_slice + reassess_roadmap with structural enforcement** `risk:medium` `depends:[S01,S02]`
+  > After this: gsd_replan_slice rejects mutations to completed tasks, gsd_reassess_roadmap rejects mutations to completed slices. replan_history and assessments tables populated. REPLAN.md and ASSESSMENT.md rendered from DB.
+
+- [ ] **S04: Hot-path caller migration + cross-validation tests** `risk:medium` `depends:[S01,S02]`
+  > After this: dispatch-guard.ts, auto-dispatch.ts (4 rules), auto-verification.ts, parallel-eligibility.ts read from DB. Cross-validation tests prove DB↔rendered parity. Sequence-aware query ordering in getMilestoneSlices/getSliceTasks.
+
+- [ ] **S05: Warm/cold callers + flag files + pre-M002 migration** `risk:medium` `depends:[S03,S04]`
+  > After this: doctor, visualizer, github-sync, workspace-index, dashboard-overlay, guided-flow, reactive-graph, auto-recovery use DB queries. REPLAN/ASSESSMENT/CONTINUE/CONTEXT-DRAFT/REPLAN-TRIGGER tracked in DB. migrateHierarchyToDb() populates v8 columns. gsd recover upgraded.
+
+- [ ] **S06: Parser deprecation + cleanup** `risk:low` `depends:[S05]`
+  > After this: parseRoadmapSlices() removed from hot paths (~271 lines). parsePlan() task parsing removed (~120 lines). parseRoadmap() slice extraction removed (~85 lines). Parsers kept only in md-importer for migration. Zero parseRoadmap/parsePlan calls in dispatch loop. Test suite passes with parsers removed from hot paths.
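
The structural enforcement promised for S03 could look roughly like the guard below. It is a sketch under assumptions: the `PlannedTask` and `ReplanRequest` shapes and the `checkReplanSketch` function are hypothetical, and the real `gsd_replan_slice` handler may model a replan differently.

```typescript
// Hypothetical shapes; the real tool schema is not shown in this document.
interface PlannedTask {
  id: string;
  title: string;
  done: boolean;
}

interface ReplanRequest {
  removeTaskIds: string[];
  updatedTasks: { id: string; title: string }[];
}

// Returns one error per completed task the replan tries to touch.
// An empty array means the replan is structurally acceptable.
function checkReplanSketch(current: PlannedTask[], req: ReplanRequest): string[] {
  const completed = new Set(current.filter((t) => t.done).map((t) => t.id));
  const touched = req.removeTaskIds.concat(req.updatedTasks.map((t) => t.id));
  return touched
    .filter((id) => completed.has(id))
    .map((id) => "task " + id + " is complete and cannot be modified by a replan");
}
```

Checking against stored rows, instead of asking the prompt to behave, is what turns the rule from prompt-only enforcement into a rejection with a clear error.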
+
+## Boundary Map
+
+### S01 → S02
+
+Produces:
+- `gsd-db.ts` → schema v8 migration (new columns on milestones, slices, tasks tables; replan_history, assessments tables)
+- `gsd-db.ts` → `insertMilestonePlanning()`, `getMilestonePlanning()` query functions
+- `gsd-db.ts` → `insertSlicePlanning()`, `getSlicePlanning()` query functions (columns only — S02 populates them)
+- `tools/plan-milestone.ts` → `gsd_plan_milestone` tool handler pattern (validate → transaction → render → invalidate)
+- `markdown-renderer.ts` → `renderRoadmapFromDb(basePath, milestoneId)` — full ROADMAP.md generation from DB
+- `auto-post-unit.ts` → rogue detection for ROADMAP.md writes
+
+Consumes:
+- nothing (first slice)
+
+### S01 → S03
+
+Produces:
+- Schema v8 tables: `replan_history`, `assessments` (created in S01 migration, populated in S03)
+- Tool handler pattern established in `tools/plan-milestone.ts`
+- `renderRoadmapFromDb()` — reused by reassess for re-rendering after modification
+
+Consumes:
+- nothing (first slice)
+
+### S02 → S03
+
+Produces:
+- `gsd-db.ts` → `getSliceTasks()`, `getTask()` query functions
+- `tools/plan-slice.ts`, `tools/plan-task.ts` → handler patterns
+- `markdown-renderer.ts` → `renderPlanFromDb()`, `renderTaskPlanFromDb()`
+
+Consumes from S01:
+- Schema v8 columns on slices and tasks tables
+- Tool handler pattern from `tools/plan-milestone.ts`
+
+### S02 → S04
+
+Produces:
+- `gsd-db.ts` → `getSliceTasks()`, `getTask()` with `verify_command`, `files`, `steps` columns populated
+- `renderPlanFromDb()`, `renderTaskPlanFromDb()` for artifacts table population
+
+Consumes from S01:
+- Schema v8, query functions
+
+### S01,S02 → S04
+
+Produces (from S01+S02 combined):
+- All planning data in DB (milestones, slices, tasks with v8 columns)
+- All query functions needed by callers
+- Rendered markdown in artifacts table
+
+Consumes:
+- S01: schema, milestone query functions, ROADMAP renderer
+- S02: slice/task query functions, PLAN/task-plan renderers
+
+### S03 → S05
+
+Produces:
+- `replan_history` table populated with actual replan events
+- `assessments` table populated with actual assessments
+- REPLAN.md and ASSESSMENT.md rendered from DB (flag file equivalents)
+
+Consumes from S01, S02:
+- Schema, query functions, renderers
+
+### S04 → S05
+
+Produces:
+- Hot-path callers migrated to DB — dispatch loop no longer parses markdown
+- Sequence-aware query ordering proven in getMilestoneSlices/getSliceTasks
+- Cross-validation test infrastructure
+
+Consumes from S01, S02:
+- Query functions, renderers, DB-populated planning data
+
+### S05 → S06
+
+Produces:
+- All callers migrated to DB queries
+- Flag files migrated to DB columns
+- migrateHierarchyToDb() populates v8 columns
+- No caller depends on parseRoadmap/parsePlan/parseRoadmapSlices except md-importer
+
+Consumes from S03, S04:
+- replan/assessment DB tables, hot-path migration complete, query functions
diff --git a/.gsd/milestones/M001/slices/S01/S01-PLAN.md b/.gsd/milestones/M001/slices/S01/S01-PLAN.md
new file mode 100644
index 000000000..5dbfd551b
--- /dev/null
+++ b/.gsd/milestones/M001/slices/S01/S01-PLAN.md
@@ -0,0 +1,85 @@
+# S01: Schema v8 + plan_milestone tool + ROADMAP renderer
+
+**Goal:** Make milestone planning DB-backed by adding schema v8 storage, a `gsd_plan_milestone` write path, full ROADMAP rendering from DB, and prompt/enforcement updates that stop direct roadmap writes from bypassing state.
+**Demo:** Running the milestone-planning handler against structured input writes milestone planning fields into SQLite, renders `.gsd/milestones/M001/M001-ROADMAP.md` from DB state, and tests prove prompt contracts plus rogue-write detection cover the transition path.
+
+## Must-Haves
+
+- Schema v8 stores milestone-planning data plus downstream slice/task planning columns and creates `replan_history` and `assessments` tables without breaking existing DBs.
+- `gsd_plan_milestone` validates flat structured input, writes milestone + slice planning data transactionally, renders ROADMAP.md from DB, and clears state/parse caches after render.
+- `renderRoadmapFromDb()` emits a complete parser-compatible roadmap including vision, success criteria, risks, proof strategy, verification classes, definition of done, requirement coverage, slices, and boundary map.
+- Planning prompts stop instructing direct roadmap writes and rogue detection flags direct `ROADMAP.md` / `PLAN.md` writes that bypass planning tools.
+- Migration and renderer/tool tests prove v7→v8 upgrade, roadmap round-trip fidelity, tool-handler behavior, and prompt/enforcement coverage.
+
+## Proof Level
+
+- This slice proves: integration
+- Real runtime required: yes
+- Human/UAT required: no
+
+## Verification
+
+- `node --test src/resources/extensions/gsd/tests/plan-milestone.test.ts`
+- `node --test src/resources/extensions/gsd/tests/markdown-renderer.test.ts`
+- `node --test src/resources/extensions/gsd/tests/prompt-contracts.test.ts`
+- `node --test src/resources/extensions/gsd/tests/rogue-file-detection.test.ts src/resources/extensions/gsd/tests/migrate-hierarchy.test.ts`
+- `node --test src/resources/extensions/gsd/tests/markdown-renderer.test.ts --test-name-pattern="stderr warning|stale"`
+
+## Observability / Diagnostics
+
+- Runtime signals: tool handler returns structured error details for schema validation / render failures; migration and rogue-detection tests expose fallback-path regressions.
+- Inspection surfaces: `src/resources/extensions/gsd/tests/plan-milestone.test.ts`, `src/resources/extensions/gsd/tests/markdown-renderer.test.ts`, `src/resources/extensions/gsd/tests/rogue-file-detection.test.ts`, and SQLite rows in milestone/slice/artifact tables.
+- Failure visibility: render failures must surface before cache invalidation completes; rogue detection must name the offending roadmap/plan path; migration tests must show whether v8 columns/tables were created.
+- Redaction constraints: none beyond normal repository data; no secrets involved.
+
+## Integration Closure
+
+- Upstream surfaces consumed: `src/resources/extensions/gsd/gsd-db.ts`, `src/resources/extensions/gsd/markdown-renderer.ts`, `src/resources/extensions/gsd/bootstrap/db-tools.ts`, `src/resources/extensions/gsd/md-importer.ts`, `src/resources/extensions/gsd/auto-post-unit.ts`, existing parser contracts in `src/resources/extensions/gsd/files.ts`.
+- New wiring introduced in this slice: milestone-planning DB accessors, `gsd_plan_milestone` tool registration/handler, full ROADMAP render path, prompt contract migration, and rogue-write detection for planning artifacts.
+- What remains before the milestone is truly usable end-to-end: slice/task planning tools, reassess/replan structural enforcement, caller migration to DB reads, and full hot-path parser retirement in later slices.
+
+## Tasks
+
+- [x] **T01: Add schema v8 planning storage and roadmap rendering** `est:1h15m`
+  - Why: S01 cannot write milestone planning through tools until SQLite can hold the fields and ROADMAP.md can be regenerated from DB without relying on an existing file.
+  - Files: `src/resources/extensions/gsd/gsd-db.ts`, `src/resources/extensions/gsd/markdown-renderer.ts`, `src/resources/extensions/gsd/md-importer.ts`, `src/resources/extensions/gsd/tests/markdown-renderer.test.ts`, `src/resources/extensions/gsd/tests/migrate-hierarchy.test.ts`
+  - Do: Add the v7→v8 migration for milestone/slice/task planning columns and `replan_history` / `assessments`; add milestone-planning query/upsert helpers needed by the new tool; implement full `renderRoadmapFromDb()` with parser-compatible output and artifact persistence; extend importer coverage so pre-v8 roadmap content backfills new milestone fields best-effort on migration.
+  - Verify: `node --test src/resources/extensions/gsd/tests/markdown-renderer.test.ts src/resources/extensions/gsd/tests/migrate-hierarchy.test.ts`
+  - Done when: opening a v7 DB upgrades to v8, roadmap rendering can generate a complete file from DB state, and migration tests prove existing roadmap content still imports cleanly.
+- [x] **T02: Wire gsd_plan_milestone through the DB-backed tool path** `est:1h15m`
+  - Why: The slice promise is a real planning tool, not just storage and renderer primitives. The handler must establish the validate → transaction → render → invalidate pattern downstream slices will reuse.
+  - Files: `src/resources/extensions/gsd/tools/plan-milestone.ts`, `src/resources/extensions/gsd/bootstrap/db-tools.ts`, `src/resources/extensions/gsd/tests/plan-milestone.test.ts`, `src/resources/extensions/gsd/gsd-db.ts`, `src/resources/extensions/gsd/markdown-renderer.ts`
+  - Do: Implement the milestone-planning handler using the existing completion-tool pattern; ensure it performs structural validation on flat tool params, upserts milestone and slice planning rows in one transaction, renders/stores ROADMAP.md after commit, and explicitly calls `invalidateStateCache()` and `clearParseCache()` after successful render; register canonical + alias tool definitions in `db-tools.ts`.
+  - Verify: `node --test src/resources/extensions/gsd/tests/plan-milestone.test.ts`
+  - Done when: the handler rejects invalid payloads, writes valid planning data to DB, renders the roadmap artifact, stores rendered content, and tests prove cache invalidation and idempotent reruns.
+- [x] **T03: Migrate planning prompts and enforce rogue-write detection** `est:50m`
+  - Why: The tool path is incomplete if prompts still tell the model to write roadmap files directly or if direct writes can bypass DB state silently.
+  - Files: `src/resources/extensions/gsd/prompts/plan-milestone.md`, `src/resources/extensions/gsd/prompts/guided-plan-milestone.md`, `src/resources/extensions/gsd/prompts/plan-slice.md`, `src/resources/extensions/gsd/prompts/replan-slice.md`, `src/resources/extensions/gsd/prompts/reassess-roadmap.md`, `src/resources/extensions/gsd/auto-post-unit.ts`, `src/resources/extensions/gsd/tests/prompt-contracts.test.ts`, `src/resources/extensions/gsd/tests/rogue-file-detection.test.ts`
+  - Do: Rewrite planning prompts so they instruct tool calls instead of direct roadmap/plan file writes while preserving existing planning context variables; extend `detectRogueFileWrites()` to flag direct `ROADMAP.md` and `PLAN.md` writes for planning units; add contract tests that prove the new instructions and enforcement paths hold.
+  - Verify: `node --test src/resources/extensions/gsd/tests/prompt-contracts.test.ts src/resources/extensions/gsd/tests/rogue-file-detection.test.ts`
+  - Done when: planning prompts name the DB tools, direct file-write instructions are gone, and rogue detection tests fail if roadmap/plan files appear without matching DB state.
+- [x] **T04: Close the slice with integrated regression coverage** `est:40m`
+  - Why: S01 crosses schema migration, tool registration, markdown rendering, prompt contracts, and migration fallback. The slice is only done when those surfaces pass together, not as isolated edits.
+  - Files: `src/resources/extensions/gsd/tests/plan-milestone.test.ts`, `src/resources/extensions/gsd/tests/markdown-renderer.test.ts`, `src/resources/extensions/gsd/tests/prompt-contracts.test.ts`, `src/resources/extensions/gsd/tests/rogue-file-detection.test.ts`, `src/resources/extensions/gsd/tests/migrate-hierarchy.test.ts`
+  - Do: Fill remaining regression gaps discovered during implementation, keep test fixtures aligned with the final roadmap format/tool output, and run the full targeted S01 suite so downstream slices inherit a stable baseline.
+  - Verify: `node --test src/resources/extensions/gsd/tests/plan-milestone.test.ts src/resources/extensions/gsd/tests/markdown-renderer.test.ts src/resources/extensions/gsd/tests/prompt-contracts.test.ts src/resources/extensions/gsd/tests/rogue-file-detection.test.ts src/resources/extensions/gsd/tests/migrate-hierarchy.test.ts`
+  - Done when: the combined targeted suite passes against the final implementation and demonstrates the slice demo truthfully.
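
The validate → transaction → render → invalidate sequence that T02 asks the handler to establish can be sketched as below. Everything here is a hypothetical stand-in: `MilestonePlanInput`, `handlePlanMilestoneSketch`, and the in-memory maps only imitate the roles of SQLite, `renderRoadmapFromDb()`, and the parse cache, and the real handler's signature is not shown in this plan.

```typescript
// Hypothetical sketch; every name here is an illustrative stand-in.
interface MilestonePlanInput {
  milestoneId: string;
  title: string;
  vision: string;
}

type HandlerResult = { ok: true; roadmap: string } | { ok: false; error: string };

const milestoneRows = new Map<string, MilestonePlanInput>(); // stand-in for SQLite
const parseCacheSketch = new Map<string, string>(); // stand-in for the parse cache
let cacheClears = 0; // lets a test observe invalidation

function validateInput(p: MilestonePlanInput): string | null {
  if (!/^M\d{3}$/.test(p.milestoneId)) return "milestoneId must look like M001";
  if (p.title.trim() === "") return "title is required";
  return null;
}

function renderRoadmapSketch(id: string): string {
  const row = milestoneRows.get(id);
  if (!row) throw new Error("no milestone row for " + id);
  return "# " + row.milestoneId + ": " + row.title + "\n\n**Vision:** " + row.vision + "\n";
}

function handlePlanMilestoneSketch(p: MilestonePlanInput): HandlerResult {
  const problem = validateInput(p); // 1. validate before touching state
  if (problem !== null) return { ok: false, error: problem };
  milestoneRows.set(p.milestoneId, p); // 2. write; overwriting keeps reruns idempotent
  const roadmap = renderRoadmapSketch(p.milestoneId); // 3. render from stored state
  parseCacheSketch.clear(); // 4. invalidate only after a successful render
  cacheClears += 1;
  return { ok: true, roadmap };
}
```

Rendering before invalidation means a render failure surfaces while the caches still describe the last good state, which matches the failure-visibility requirement listed under Observability.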
+ +## Files Likely Touched + +- `src/resources/extensions/gsd/gsd-db.ts` +- `src/resources/extensions/gsd/markdown-renderer.ts` +- `src/resources/extensions/gsd/tools/plan-milestone.ts` +- `src/resources/extensions/gsd/bootstrap/db-tools.ts` +- `src/resources/extensions/gsd/md-importer.ts` +- `src/resources/extensions/gsd/auto-post-unit.ts` +- `src/resources/extensions/gsd/prompts/plan-milestone.md` +- `src/resources/extensions/gsd/prompts/guided-plan-milestone.md` +- `src/resources/extensions/gsd/prompts/plan-slice.md` +- `src/resources/extensions/gsd/prompts/replan-slice.md` +- `src/resources/extensions/gsd/prompts/reassess-roadmap.md` +- `src/resources/extensions/gsd/tests/plan-milestone.test.ts` +- `src/resources/extensions/gsd/tests/markdown-renderer.test.ts` +- `src/resources/extensions/gsd/tests/prompt-contracts.test.ts` +- `src/resources/extensions/gsd/tests/rogue-file-detection.test.ts` +- `src/resources/extensions/gsd/tests/migrate-hierarchy.test.ts` diff --git a/.gsd/milestones/M001/slices/S01/S01-RESEARCH.md b/.gsd/milestones/M001/slices/S01/S01-RESEARCH.md new file mode 100644 index 000000000..2b059e6af --- /dev/null +++ b/.gsd/milestones/M001/slices/S01/S01-RESEARCH.md @@ -0,0 +1,80 @@ +# S01 — Research + +**Date:** 2026-03-23 + +## Summary + +S01 owns R001, R002, R007, R013, R015, and R018. This slice is targeted research, not deep exploration. The codebase already has the exact handler pattern to copy: `tools/complete-task.ts` and `tools/complete-slice.ts` do validate → DB transaction → render → cache invalidation, and `bootstrap/db-tools.ts` already registers canonical + alias DB-backed tools. The missing pieces are schema v8 expansion in `gsd-db.ts`, a new milestone-planning write path/tool, a full ROADMAP renderer from DB state, prompt migration away from direct file writes, and rogue-write detection extended beyond summaries. + +The main constraint is transition-window fidelity. Existing callers still parse rendered markdown. 
`markdown-renderer.ts` currently only patches existing checkbox content (`renderRoadmapCheckboxes`, `renderPlanCheckboxes`) and explicitly relies on round-tripping through `parseRoadmap()` / `parsePlan()`. That means S01 cannot get away with partial rendering or a lossy format. `renderRoadmapFromDb()` has to emit the same sections the parser-dependent callers/tests expect: title, vision, success criteria, slices with checkbox/risk/depends/demo lines, proof strategy, verification classes, milestone definition of done, boundary map, and requirement coverage. + +## Recommendation + +Implement S01 in four build steps: (1) schema/query expansion in `gsd-db.ts`, (2) ROADMAP rendering from DB in `markdown-renderer.ts`, (3) `gsd_plan_milestone` handler + tool registration, and (4) prompt/rogue-detection/test coverage. Follow the existing M001 tool pattern exactly rather than inventing a planning-specific abstraction. That matches decision D002 and the established extension rule from the `create-gsd-extension` skill: add capabilities using the existing extension primitives/patterns, don’t build a parallel framework. + +Use a flat tool schema. That is already locked by D001 and is also the least risky shape for TypeBox validation and tool registration. Keep cache invalidation explicit in the handler after DB write + render: `invalidateStateCache()` plus `clearParseCache()` are mandatory for R015 because parser callers still sit on the hot path during the transition. Also extend rogue detection immediately in `auto-post-unit.ts`; otherwise prompt migration has no enforcement surface and direct ROADMAP writes will silently bypass the DB. + +## Implementation Landscape + +### Key Files + +- `src/resources/extensions/gsd/gsd-db.ts` — current schema is `SCHEMA_VERSION = 7`; has v1→v7 incremental migrations, row interfaces, and accessors. Needs v8 columns/tables plus milestone-planning read/write functions. 
Existing ordering is still `ORDER BY id` in `getMilestoneSlices()` and `getSliceTasks()`; S01 likely adds sequence columns now even though ORDER BY migration is validated in S04. +- `src/resources/extensions/gsd/markdown-renderer.ts` — current renderer is patch-oriented, not full generation. `renderRoadmapCheckboxes()` loads existing artifact content and regex-toggles `[ ]`/`[x]`. S01 needs a new `renderRoadmapFromDb(basePath, milestoneId)` that generates the entire file, writes it, stores artifact content, and invalidates caches. +- `src/resources/extensions/gsd/tools/complete-task.ts` — best concrete reference for a DB-backed tool handler. Pattern: validate params, `transaction(...)`, render file(s) outside transaction, rollback status on render failure, then invalidate `invalidateStateCache()`, `clearPathCache()`, and `clearParseCache()`. +- `src/resources/extensions/gsd/tools/complete-slice.ts` — second reference for handler shape and roadmap rendering callout. Shows how parent rows are ensured before updates and how roadmap rendering is treated as a post-transaction filesystem step. +- `src/resources/extensions/gsd/bootstrap/db-tools.ts` — tool registration seam. Existing DB tools use TypeBox, canonical names plus alias registration, `ensureDbOpen()`, and structured `details`. Add `gsd_plan_milestone` here and keep aliases/prompt guidelines consistent with current style. +- `src/resources/extensions/gsd/md-importer.ts` — `migrateHierarchyToDb()` currently imports milestone title/status/depends_on, slice title/risk/depends/demo, and task title/status from parsed markdown. For S01 it must at minimum tolerate schema v8 and populate new milestone planning columns best-effort from existing ROADMAP content. +- `src/resources/extensions/gsd/files.ts` — parser contract surface. `parseRoadmap()` currently extracts only title, vision, successCriteria, slices, and boundaryMap. 
Transition-window consumers still depend on this output, so ROADMAP rendering must preserve parser-readable structure even before richer DB-only fields are fully consumed. +- `src/resources/extensions/gsd/auto-post-unit.ts` — `detectRogueFileWrites()` currently only checks task and slice summaries. Extend it for direct `ROADMAP.md`/`PLAN.md` writes so planning tools have the same safety net completion tools already have. +- `src/resources/extensions/gsd/prompts/guided-plan-milestone.md` — still instructs the model to create `{{milestoneId}}-ROADMAP.md` directly. This is the primary prompt migration target for S01. `plan-milestone.md` likely needs the same migration even though only guided prompt text was inspected directly. +- `src/resources/extensions/gsd/tests/rogue-file-detection.test.ts` — existing safety-net tests for summary files. Natural place to add roadmap/plan rogue detection coverage. +- `src/resources/extensions/gsd/tests/prompt-contracts.test.ts` — existing contract-test pattern for prompt migration (`execute-task`, `complete-slice`). Add assertions that milestone-planning prompts reference `gsd_plan_milestone` and stop instructing direct file writes. +- `src/resources/extensions/gsd/tests/markdown-renderer.test.ts` — already validates renderer round-trips via `parseRoadmap()` / `parsePlan()`. Extend with full ROADMAP-from-DB tests rather than inventing a new harness. +- `src/resources/extensions/gsd/tests/derive-state-crossval.test.ts` — model for transition-window parity tests called out in the milestone context. S01 won’t retire R014, but this file shows the test shape downstream slices should follow. + +### Build Order + +1. **Schema first in `gsd-db.ts`.** Add v8 columns/tables and row/interface/query support before touching tools. This unblocks every downstream step and avoids hand-building temporary storage. +2. **Implement `renderRoadmapFromDb()` next.** S01 writes DB first but callers still parse markdown. 
Until the full ROADMAP renderer exists and round-trips, the tool handler cannot be trusted. +3. **Build `tools/plan-milestone.ts` and register `gsd_plan_milestone`.** Copy the completion-tool pattern: validate → transaction/upserts → render → artifact store/caches. This is the core deliverable for R002/R015. +4. **Then migrate prompts and rogue detection.** Once the tool exists, update `plan-milestone.md` / `guided-plan-milestone.md` to call it, and extend `detectRogueFileWrites()` + tests so direct markdown writes become visible failures instead of silent divergence. +5. **Last, importer/backfill tests.** Best-effort v8 migration/import logic is lower risk than the write path but needs coverage before the slice is declared done. + +### Verification Approach + +- Run targeted node tests around the touched surfaces, starting with: + - `src/resources/extensions/gsd/tests/markdown-renderer.test.ts` + - `src/resources/extensions/gsd/tests/rogue-file-detection.test.ts` + - `src/resources/extensions/gsd/tests/prompt-contracts.test.ts` + - any new `plan-milestone` handler/tool tests added for S01 +- Add/extend schema migration coverage in `src/resources/extensions/gsd/tests/gsd-db.test.ts` or a dedicated `plan-milestone` test file so opening a v7 DB proves v8 migration succeeds. +- Add handler proof similar to `complete-task.test.ts` / `complete-slice.test.ts`: valid input writes DB rows, renders `M###-ROADMAP.md`, stores artifact content, and invalidates caches; invalid input is structurally rejected. +- Add renderer round-trip proof: generated ROADMAP parses via `parseRoadmap()` and preserves slice IDs, checkbox state, risk, dependencies, and boundary map sections. +- Add prompt contract proof that milestone-planning prompts reference `gsd_plan_milestone` and no longer instruct direct `ROADMAP.md` creation. + +## Constraints + +- `gsd-db.ts` is already large and schema changes must follow the existing incremental migration chain. 
Do not rewrite schema bootstrap logic; add a `v7 → v8` step. +- The transition window is parser-dependent. `markdown-renderer.ts` explicitly states rendered markdown must round-trip through `parseRoadmap()` / `parsePlan()`. +- Existing query ordering is lexicographic by `id`, not sequence. S01 can add sequence columns now, but S04 owns proving all readers order by sequence. +- Tool registration currently uses `@sinclair/typebox` patterns in `bootstrap/db-tools.ts`; keep registration consistent with existing DB tools instead of adding a new registry path. + +## Common Pitfalls + +- **Partial ROADMAP rendering** — `renderRoadmapCheckboxes()` only patches an existing file. Reusing that pattern for S01 will leave the DB as the source of truth without a full markdown view, breaking parser-era callers. Generate the whole file. +- **Cache invalidation drift** — completion handlers explicitly clear parse and state caches. Missing `clearParseCache()` after milestone planning will create stale parser results during the transition window. +- **INSERT OR IGNORE where upsert is required** — `insertMilestone()` / `insertSlice()` currently ignore later field updates. The planning handler likely needs a real update/upsert path for milestone metadata instead of relying on these helpers unchanged. +- **Prompt migration without enforcement** — if prompts change before rogue detection covers ROADMAP/PLAN writes, noncompliant model output will silently create divergent state on disk. + +## Open Risks + +- The current `parseRoadmap()` surface does not expose all milestone sections S01 wants to store/render. The renderer can emit richer markdown than the parser reads, but importer/backfill for legacy files may be best-effort only until later slices expand parser/import logic. +- `gsd-db.ts` already duplicates some row/accessor sections and is growing large; S01 should avoid broad refactors while changing schema because this slice is on the critical path.
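The validate → transaction → render → invalidate flow and the pitfalls above (full-file rendering, upsert instead of insert-or-ignore, cache invalidation) can be sketched roughly as follows. This is a minimal in-memory illustration, not the real handler: the `Map`-based stores, the roadmap line format, `parseSliceIds()`, and the rollback-on-render-failure behavior are all assumptions made for the example.

```typescript
// Hypothetical sketch: every name and format below is illustrative, not the real GSD API.
type SliceInput = { id: string; title: string };
type PlanMilestoneParams = { milestoneId: string; title: string; slices: SliceInput[] };
type MilestoneRow = { id: string; title: string; slices: SliceInput[] };
type Result = { ok: boolean; error?: string; roadmap?: string };

// In-memory stand-ins for the SQLite rows, the rendered artifact, and the parse cache.
const milestones = new Map<string, MilestoneRow>();
const artifacts = new Map<string, string>();
const parseCache = new Map<string, string[]>();

// Toy transaction: restore a snapshot of the rows if the callback throws.
function transaction(fn: () => void): void {
  const snapshot = new Map(milestones);
  try {
    fn();
  } catch (err) {
    milestones.clear();
    snapshot.forEach((row, id) => milestones.set(id, row));
    throw err;
  }
}

// Upsert, not insert-or-ignore: a rerun replaces earlier planning metadata.
function upsertMilestone(row: MilestoneRow): void {
  milestones.set(row.id, row);
}

// Generate the whole roadmap from stored state instead of patching an existing file.
function renderRoadmap(milestoneId: string): string {
  const row = milestones.get(milestoneId);
  if (!row) throw new Error(`milestone ${milestoneId} not found`);
  const lines = [`# ${row.id}: ${row.title}`, "", "## Slices", ""];
  for (const s of row.slices) lines.push(`- [ ] **${s.id}: ${s.title}**`);
  return lines.join("\n") + "\n";
}

// Cached stand-in for a markdown parser that reads the rendered artifact.
function parseSliceIds(milestoneId: string): string[] {
  const cached = parseCache.get(milestoneId);
  if (cached) return cached;
  const ids: string[] = [];
  for (const line of (artifacts.get(milestoneId) ?? "").split("\n")) {
    const m = /^- \[[ x]\] \*\*(S\d+):/.exec(line);
    if (m && m[1]) ids.push(m[1]);
  }
  parseCache.set(milestoneId, ids);
  return ids;
}

function handlePlanMilestone(params: PlanMilestoneParams): Result {
  // 1. Validate the flat params before touching any state.
  if (!/^M\d+$/.test(params.milestoneId)) return { ok: false, error: "invalid milestoneId" };
  if (params.title.trim() === "") return { ok: false, error: "title is required" };
  if (params.slices.length === 0) return { ok: false, error: "at least one slice is required" };

  // 2. Write planning state in one transaction.
  const previous = milestones.get(params.milestoneId);
  transaction(() =>
    upsertMilestone({ id: params.milestoneId, title: params.title, slices: params.slices }),
  );

  // 3. Render outside the transaction; undo the write if rendering fails (assumed behavior).
  let roadmap: string;
  try {
    roadmap = renderRoadmap(params.milestoneId);
  } catch (err) {
    if (previous) milestones.set(previous.id, previous);
    else milestones.delete(params.milestoneId);
    return { ok: false, error: `render failed: ${String(err)}` };
  }
  artifacts.set(params.milestoneId, roadmap);

  // 4. Invalidate the parse cache so parser-era readers see the new artifact.
  parseCache.delete(params.milestoneId);
  return { ok: true, roadmap };
}

// Usage: plan once, then read back through the cached parser.
const planned = handlePlanMilestone({
  milestoneId: "M001",
  title: "Tool-Driven Planning State Capture",
  slices: [{ id: "S01", title: "Schema v8" }],
});
console.log(planned.ok, parseSliceIds("M001"));
```

Dropping step 4 in this sketch would make a second `handlePlanMilestone()` call invisible to `parseSliceIds()`, which is the stale-parser failure the cache invalidation pitfall describes.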
+ +## Skills Discovered + +| Technology | Skill | Status | +|------------|-------|--------| +| GSD extension/tooling | `create-gsd-extension` | available | +| Investigation / root-cause discipline | `debug-like-expert` | available | +| Test generation / execution patterns | `test` | available | diff --git a/.gsd/milestones/M001/slices/S01/S01-SUMMARY.md b/.gsd/milestones/M001/slices/S01/S01-SUMMARY.md new file mode 100644 index 000000000..63e2f32a6 --- /dev/null +++ b/.gsd/milestones/M001/slices/S01/S01-SUMMARY.md @@ -0,0 +1,131 @@ +--- +id: S01 +parent: M001 +milestone: M001 +provides: + - Schema v8 planning storage on milestones, slices, and tasks, plus `replan_history` and `assessments` tables for later slices. + - `gsd_plan_milestone` tool registration and handler implementation as the reference planning-tool pattern. + - `renderRoadmapFromDb()` as the canonical roadmap regeneration path from DB state. + - Prompt contracts and rogue-write enforcement for milestone-era planning artifacts. + - Integrated regression coverage proving the S01 boundary works together under the repo’s actual test harness. 
+requires: + [] +affects: + - S02 + - S03 + - S04 + - S05 +key_files: + - src/resources/extensions/gsd/gsd-db.ts + - src/resources/extensions/gsd/markdown-renderer.ts + - src/resources/extensions/gsd/tools/plan-milestone.ts + - src/resources/extensions/gsd/bootstrap/db-tools.ts + - src/resources/extensions/gsd/auto-post-unit.ts + - src/resources/extensions/gsd/prompts/plan-milestone.md + - src/resources/extensions/gsd/tests/plan-milestone.test.ts + - src/resources/extensions/gsd/tests/markdown-renderer.test.ts + - src/resources/extensions/gsd/tests/prompt-contracts.test.ts + - src/resources/extensions/gsd/tests/rogue-file-detection.test.ts + - src/resources/extensions/gsd/tests/migrate-hierarchy.test.ts +key_decisions: + - Use a thin DB-backed planning handler pattern: validate flat params, write in one transaction, render markdown from DB, then invalidate both state and parse caches. + - Treat planning prompts as tool-call orchestration surfaces and markdown templates as output-shaping guidance, not manual write targets. + - Detect rogue planning artifact writes by comparing disk artifacts against durable milestone/slice planning state in DB rather than inventing a separate completion status model. + - Verify cache invalidation through observable parse-visible state instead of monkey-patching imported ESM bindings. + - Use the repository’s resolver-based TypeScript harness as the authoritative proof path for these source tests. +patterns_established: + - Validate → transaction → render → invalidate is the standard planning-tool handler pattern for downstream slices. + - Render markdown from DB state after writes; do not mutate planning markdown directly as the source of truth. + - Tie rogue artifact detection to durable DB state instead of trusting prompt compliance. + - Use resolver-based TypeScript test execution for this repo’s source tests, and verify cache behavior through observable state rather than ESM export mutation. 
+observability_surfaces: + - `src/resources/extensions/gsd/tests/plan-milestone.test.ts` for handler validation, render failure behavior, idempotence, and cache invalidation proof. + - `src/resources/extensions/gsd/tests/markdown-renderer.test.ts` for full ROADMAP rendering, stale-render detection/repair, and dedicated `stderr warning|stale` diagnostics. + - `src/resources/extensions/gsd/tests/prompt-contracts.test.ts` for prompt regressions that reintroduce direct file-write instructions. + - `src/resources/extensions/gsd/tests/rogue-file-detection.test.ts` and `src/resources/extensions/gsd/auto-post-unit.ts` for enforcement of rogue ROADMAP.md / PLAN.md writes. + - SQLite milestone/slice rows and artifacts rendered by `renderRoadmapFromDb()` for direct inspection of persisted planning state. +drill_down_paths: + - .gsd/milestones/M001/slices/S01/tasks/T01-SUMMARY.md + - .gsd/milestones/M001/slices/S01/tasks/T02-SUMMARY.md + - .gsd/milestones/M001/slices/S01/tasks/T03-SUMMARY.md + - .gsd/milestones/M001/slices/S01/tasks/T04-SUMMARY.md +duration: "" +verification_result: passed +completed_at: 2026-03-23T15:47:31.051Z +blocker_discovered: false +--- + +# S01: Schema v8 + plan_milestone tool + ROADMAP renderer + +**Delivered schema v8 milestone-planning storage, the `gsd_plan_milestone` DB-backed write path, full ROADMAP rendering from DB, and prompt/enforcement coverage that blocks direct planning-file bypasses.** + +## What Happened + +S01 started with a broken intermediate state from early schema work and a stale assumption in the plan’s literal verification commands. The slice finished by establishing the first complete DB-backed planning path for milestones. Schema v8 support was added in `gsd-db.ts`, including new milestone/slice/task planning columns and the downstream `replan_history` and `assessments` tables required by later slices. 
`markdown-renderer.ts` gained a full `renderRoadmapFromDb()` path so ROADMAP.md can now be regenerated from DB state instead of only patching checkboxes. `tools/plan-milestone.ts` implemented the canonical milestone planning write flow: flat param validation, transactional writes for milestone and slice planning state, roadmap rendering, and explicit `invalidateStateCache()` plus `clearParseCache()` after successful render. `bootstrap/db-tools.ts` registered the canonical tool and alias so prompts can target the DB-backed path. The planning prompts were then rewritten to stop instructing direct roadmap/plan writes, while `auto-post-unit.ts` was extended to flag rogue ROADMAP.md and PLAN.md writes that bypass the new DB state. Regression coverage was expanded across renderer behavior, migration/backfill behavior, prompt contracts, rogue detection, and the tool handler itself. During closeout, the invalid ESM monkey-patching in cache tests was replaced with observable integration assertions that prove the same contract truthfully by checking parse-visible roadmap state before and after handler execution. The slice now provides the milestone-planning foundation the rest of M001 depends on: schema storage, a real planning tool, a full roadmap renderer, prompt enforcement, and durable regression coverage. + +## Verification + +Ran the full slice-level proof under the repository’s actual TypeScript resolver harness. `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/plan-milestone.test.ts src/resources/extensions/gsd/tests/markdown-renderer.test.ts src/resources/extensions/gsd/tests/prompt-contracts.test.ts src/resources/extensions/gsd/tests/rogue-file-detection.test.ts src/resources/extensions/gsd/tests/migrate-hierarchy.test.ts` passed, covering the integrated S01 boundary. 
Separately ran `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/markdown-renderer.test.ts --test-name-pattern="stderr warning|stale"`, which passed and confirmed the renderer’s observability/failure-path diagnostics. Confirmed the documented observability surfaces now exist in all four task summaries by adding missing `observability_surfaces` frontmatter and `## Diagnostics` sections. Updated requirements based on evidence: R001, R002, R007, R013, R015, and R018 are now validated. + +## Requirements Advanced + +- R001 — Added schema v8 planning columns/tables and migration logic that later slices will populate further. +- R002 — Implemented and registered the `gsd_plan_milestone` tool with flat validation, transactional writes, rendering, and cache invalidation. +- R007 — Added full ROADMAP generation from DB state through `renderRoadmapFromDb()`. +- R013 — Rewrote milestone and adjacent planning prompts to use DB-backed tools instead of manual file writes. +- R015 — Established and tested dual cache invalidation as part of the planning handler pattern. +- R018 — Extended rogue planning artifact detection to direct ROADMAP.md and PLAN.md writes. + +## Requirements Validated + +- R001 — `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/plan-milestone.test.ts src/resources/extensions/gsd/tests/markdown-renderer.test.ts src/resources/extensions/gsd/tests/prompt-contracts.test.ts src/resources/extensions/gsd/tests/rogue-file-detection.test.ts src/resources/extensions/gsd/tests/migrate-hierarchy.test.ts` passed, covering schema v8 migration/backfill and new planning storage. 
+- R002 — `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/plan-milestone.test.ts` passed, proving flat input validation, transactional writes, roadmap render, and idempotent reruns. +- R007 — `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/markdown-renderer.test.ts --test-name-pattern="stderr warning|stale"` passed, alongside the full renderer suite, proving roadmap generation and diagnostics from DB state. +- R013 — `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/prompt-contracts.test.ts src/resources/extensions/gsd/tests/rogue-file-detection.test.ts` passed, proving planning prompts now direct tool usage instead of manual writes. +- R015 — `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/plan-milestone.test.ts` passed with observable assertions proving parse-visible roadmap state is only updated after successful render and cache clearing. +- R018 — `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/rogue-file-detection.test.ts` passed, proving direct ROADMAP.md and PLAN.md writes are flagged when DB planning state is absent. + +## New Requirements Surfaced + +None. + +## Requirements Invalidated or Re-scoped + +None. + +## Deviations + +Task execution initially encountered repo-local TypeScript test harness mismatches and an intermediate broken import state in `gsd-db.ts`; the slice closed by adapting verification to the repository’s resolver-based harness and replacing brittle cache tests with observable integration assertions. No remaining scope deviation in the finished slice. 
+ +## Known Limitations + +S01 does not yet provide DB-backed slice/task planning tools, replan/reassess enforcement, caller migration away from markdown parsers, or flag-file migration. Bare `node --test` remains unreliable for some source `.ts` tests in this repo; the resolver-based harness is still required for truthful verification. + +## Follow-ups + +S02 should build `gsd_plan_slice` and `gsd_plan_task` on top of the validate → transaction → render → invalidate pattern established here. S03 should reuse the new roadmap renderer and schema tables for reassessment/replan history writes. S04 still needs the DB↔rendered cross-validation layer and hot-path caller migration that retire markdown parsing from the dispatch loop. + +## Files Created/Modified + +- `src/resources/extensions/gsd/gsd-db.ts` — Added schema v8 migration support, planning storage columns/tables, and milestone/slice planning query and upsert helpers. +- `src/resources/extensions/gsd/markdown-renderer.ts` — Added full ROADMAP rendering from DB state and kept renderer diagnostics/stale detection exercised by tests. +- `src/resources/extensions/gsd/tools/plan-milestone.ts` — Implemented the DB-backed milestone planning tool handler with validation, transactional writes, rendering, and cache invalidation. +- `src/resources/extensions/gsd/bootstrap/db-tools.ts` — Registered `gsd_plan_milestone` plus alias metadata in the DB tool bootstrap. +- `src/resources/extensions/gsd/md-importer.ts` — Extended hierarchy migration/import coverage to backfill new planning fields best-effort from existing roadmap content. +- `src/resources/extensions/gsd/auto-post-unit.ts` — Extended rogue write detection to catch direct ROADMAP.md and PLAN.md planning bypasses. +- `src/resources/extensions/gsd/prompts/plan-milestone.md` — Rewrote milestone and adjacent planning prompts to use tool calls instead of manual roadmap/plan writes. 
+- `src/resources/extensions/gsd/prompts/guided-plan-milestone.md` — Rewrote guided milestone planning prompt to direct `gsd_plan_milestone` usage and forbid manual roadmap writes. +- `src/resources/extensions/gsd/prompts/plan-slice.md` — Shifted slice planning prompt framing toward DB-backed planning state instead of direct plan files as source of truth. +- `src/resources/extensions/gsd/prompts/replan-slice.md` — Updated replan prompt to preserve the DB-backed planning path and completed-task structural expectations. +- `src/resources/extensions/gsd/prompts/reassess-roadmap.md` — Updated reassess prompt to forbid roadmap-only edits when planning tools exist. +- `src/resources/extensions/gsd/tests/markdown-renderer.test.ts` — Added roadmap renderer coverage for DB-backed milestone planning, artifact persistence, and stale-render diagnostics. +- `src/resources/extensions/gsd/tests/plan-milestone.test.ts` — Replaced unrelated coverage with focused milestone-planning handler tests, including observable cache invalidation behavior. +- `src/resources/extensions/gsd/tests/prompt-contracts.test.ts` — Added prompt contract assertions proving planning prompts reference tools and prohibit manual artifact writes. +- `src/resources/extensions/gsd/tests/rogue-file-detection.test.ts` — Added rogue roadmap/plan detection regression cases tied to DB planning-state presence. +- `src/resources/extensions/gsd/tests/migrate-hierarchy.test.ts` — Extended migration tests to cover v8 planning backfill behavior and schema upgrade paths. +- `.gsd/milestones/M001/slices/S01/tasks/T01-SUMMARY.md` — Filled missing observability metadata and diagnostics sections in all task summaries for downstream debugging. +- `.gsd/milestones/M001/slices/S01/tasks/T02-SUMMARY.md` — Filled missing observability metadata and diagnostics sections in all task summaries for downstream debugging. 
+- `.gsd/milestones/M001/slices/S01/tasks/T03-SUMMARY.md` — Filled missing observability metadata and diagnostics sections in all task summaries for downstream debugging. +- `.gsd/milestones/M001/slices/S01/tasks/T04-SUMMARY.md` — Filled missing observability metadata and diagnostics sections in all task summaries for downstream debugging. +- `.gsd/PROJECT.md` — Updated project state to reflect that milestone planning is now DB-backed after S01. +- `.gsd/KNOWLEDGE.md` — Recorded durable repo-specific lessons about the resolver harness and ESM-safe cache testing. diff --git a/.gsd/milestones/M001/slices/S01/S01-UAT.md b/.gsd/milestones/M001/slices/S01/S01-UAT.md new file mode 100644 index 000000000..c36c4a2ed --- /dev/null +++ b/.gsd/milestones/M001/slices/S01/S01-UAT.md @@ -0,0 +1,101 @@ +# S01: Schema v8 + plan_milestone tool + ROADMAP renderer — UAT + +**Milestone:** M001 +**Written:** 2026-03-23T15:47:31.051Z + +# S01: Schema v8 + plan_milestone tool + ROADMAP renderer — UAT + +**Milestone:** M001 +**Written:** 2026-03-23 + +## UAT Type + +- UAT mode: artifact-driven +- Why this mode is sufficient: S01 delivers backend planning state capture, markdown rendering, and enforcement logic. The authoritative proof is the DB state, rendered artifacts, and regression tests rather than a human-facing UI. + +## Preconditions + +- Working directory is the repo root. +- Node can run the repository’s TypeScript tests with the resolver harness. +- No external services or secrets are required. + +## Smoke Test + +Run: + +`node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/plan-milestone.test.ts` + +Expected: all handler tests pass, proving a milestone planning payload can be validated, written to DB, rendered to ROADMAP.md, and rerun idempotently. + +## Test Cases + +### 1. Milestone planning writes DB state and renders roadmap + +1. 
Run `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/plan-milestone.test.ts`. +2. Confirm the test `handlePlanMilestone writes milestone and slice planning state and renders roadmap` passes. +3. **Expected:** milestone planning fields and slice rows are persisted, ROADMAP.md is rendered from DB state, and the handler returns success. + +### 2. Invalid milestone planning payloads are rejected structurally + +1. Run the same `plan-milestone.test.ts` suite. +2. Confirm the test `handlePlanMilestone rejects invalid payloads` passes. +3. **Expected:** malformed flat tool params are rejected before any persisted state is accepted as valid planning output. + +### 3. Schema v8 migration and roadmap backfill work on pre-existing data + +1. Run `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/migrate-hierarchy.test.ts src/resources/extensions/gsd/tests/markdown-renderer.test.ts`. +2. Confirm the migration scenarios and renderer scenarios pass. +3. **Expected:** a v7-style hierarchy upgrades to schema v8, planning-oriented fields/tables exist, and roadmap rendering/backfill behavior remains parser-compatible. + +### 4. Planning prompts route through tools instead of manual roadmap/plan writes + +1. Run `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/prompt-contracts.test.ts`. +2. Confirm the milestone/slice/replan/reassess prompt contract tests pass. +3. **Expected:** prompts reference `gsd_plan_milestone` and related DB-backed planning behavior, and explicit manual ROADMAP.md / PLAN.md write instructions are absent or forbidden. + +### 5. Rogue planning artifact writes are detected + +1. 
Run `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/rogue-file-detection.test.ts`. +2. Confirm the roadmap and slice-plan rogue detection cases pass. +3. **Expected:** direct ROADMAP.md / PLAN.md files without corresponding DB planning state are flagged as rogue, while DB-backed rendered artifacts are not flagged. + +## Edge Cases + +### Renderer diagnostics on stale or missing planning output + +1. Run `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/markdown-renderer.test.ts --test-name-pattern="stderr warning|stale"`. +2. **Expected:** the renderer emits the expected stale/missing-content diagnostics without masking failures. + +### Render failure does not leak stale parse-visible roadmap state + +1. Run `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/plan-milestone.test.ts`. +2. Inspect the passing test `handlePlanMilestone surfaces render failures and does not clear parse-visible state on failure`. +3. **Expected:** a render failure does not falsely advance parse-visible roadmap state, and a later successful run does. + +## Failure Signals + +- `ERR_MODULE_NOT_FOUND` under bare `node --test` without the resolver import indicates a harness mismatch; use the resolver-based command before diagnosing product regressions. +- `plan-milestone.test.ts` failures indicate broken validation, transactional writes, rendering, or cache invalidation behavior. +- `markdown-renderer.test.ts` stale/diagnostic failures indicate roadmap rendering or artifact synchronization regressions. +- `rogue-file-detection.test.ts` failures indicate planning bypasses may no longer be surfaced. + +## Requirements Proved By This UAT + +- R001 — schema v8 migration and planning storage exist and pass migration coverage. 
+- R002 — `gsd_plan_milestone` validates, writes DB state, renders ROADMAP.md, and reruns idempotently. +- R007 — full ROADMAP.md rendering from DB and renderer diagnostics are proven. +- R013 — planning prompts route to tools instead of manual planning-file writes. +- R015 — planning handler cache invalidation is proven through observable parse-visible state changes. +- R018 — rogue planning artifact writes are detected against DB state. + +## Not Proven By This UAT + +- R003/R004 — slice/task planning tools are not part of S01. +- R005/R006 — replan/reassess structural enforcement lands in S03. +- R009/R010/R012/R016/R017/R019 — hot-path migration, broader caller migration, parser retirement, sequence-aware ordering, pre-M002 recovery migration, and task-plan runtime contract work remain for later slices. + +## Notes for Tester + +- Use the resolver-based TypeScript harness for authoritative results in this repo. +- If a bare `node --test` command fails while the resolver-based command passes, treat that as known harness behavior unless a resolver-based run also fails. +- The proof here is intentionally regression-test heavy because S01 changes storage, rendering, prompts, and enforcement rather than a visible UI flow. diff --git a/.gsd/milestones/M001/slices/S01/tasks/T01-PLAN.md b/.gsd/milestones/M001/slices/S01/tasks/T01-PLAN.md new file mode 100644 index 000000000..e4c3a9751 --- /dev/null +++ b/.gsd/milestones/M001/slices/S01/tasks/T01-PLAN.md @@ -0,0 +1,60 @@ +--- +estimated_steps: 5 +estimated_files: 5 +skills_used: + - create-gsd-extension + - debug-like-expert + - test + - best-practices +--- + +# T01: Add schema v8 planning storage and roadmap rendering + +**Slice:** S01 — Schema v8 + plan_milestone tool + ROADMAP renderer +**Milestone:** M001 + +## Description + +Add the schema and renderer foundation S01 depends on. 
Extend `gsd-db.ts` from schema v7 to v8 with milestone/slice/task planning columns plus the new planning tables, add the read/write helpers the milestone-planning handler will call, implement a full ROADMAP renderer that writes parser-compatible markdown from DB state, and make sure legacy markdown import can backfill milestone planning data well enough for the transition window. + +## Steps + +1. Add the v7→v8 migration in `src/resources/extensions/gsd/gsd-db.ts`, including milestone, slice, and task planning columns plus `replan_history` and `assessments` tables. +2. Add or extend the typed milestone-planning query/upsert helpers in `src/resources/extensions/gsd/gsd-db.ts` so later handlers can write and read roadmap planning data without parsing markdown. +3. Implement `renderRoadmapFromDb()` in `src/resources/extensions/gsd/markdown-renderer.ts` to generate the full roadmap file, persist the artifact content, and keep the output compatible with `parseRoadmap()` callers. +4. Update `src/resources/extensions/gsd/md-importer.ts` so roadmap migration can best-effort populate the new milestone planning fields from existing markdown. +5. Extend renderer and migration tests to prove schema upgrade, roadmap round-trip fidelity, and importer backfill behavior. + +## Must-Haves + +- [ ] Existing DBs upgrade cleanly from schema v7 to v8 without losing existing milestone, slice, task, or artifact data. +- [ ] `renderRoadmapFromDb()` generates a complete roadmap with the sections S01 owns, not just checkbox patches. +- [ ] Rendered roadmap output still parses through the existing parser contract used during the transition window. +- [ ] Import/migration logic backfills the new milestone planning columns best-effort from legacy roadmap markdown. 
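The first must-have, an incremental v7→v8 upgrade that preserves existing rows, can be sketched as a version-gated migration chain. This is an in-memory illustration under stated assumptions: the `Db` shape, the `vision` and `sequence` column names, and the `migrate()` helper are hypothetical, and the real migration runs against SQLite in `gsd-db.ts`. Only the `replan_history` and `assessments` table names come from the plan above.

```typescript
// Hypothetical sketch: table shapes and column names are illustrative only.
type Row = Record<string, unknown>;
type Db = { schemaVersion: number; tables: Record<string, Row[]> };
type Migration = { from: number; to: number; apply: (db: Db) => void };

// Each step upgrades exactly one version; new steps are appended to the chain,
// and earlier steps are never rewritten.
const migrations: Migration[] = [
  {
    from: 7,
    to: 8,
    apply: (db) => {
      // Add planning columns with defaults so existing rows stay valid.
      for (const m of db.tables["milestones"] ?? []) {
        if (!("vision" in m)) m["vision"] = "";
      }
      for (const s of db.tables["slices"] ?? []) {
        if (!("sequence" in s)) s["sequence"] = 0;
      }
      // Create the new tables only if they are missing, so the step is idempotent.
      db.tables["replan_history"] = db.tables["replan_history"] ?? [];
      db.tables["assessments"] = db.tables["assessments"] ?? [];
    },
  },
];

// Walk the chain one version at a time until the target version is reached.
function migrate(db: Db, target: number): void {
  while (db.schemaVersion < target) {
    const step = migrations.find((m) => m.from === db.schemaVersion);
    if (!step) throw new Error(`no migration from v${db.schemaVersion}`);
    step.apply(db);
    db.schemaVersion = step.to;
  }
}

// Usage: open a v7-shaped store and upgrade it in place.
const legacy: Db = {
  schemaVersion: 7,
  tables: {
    milestones: [{ id: "M001", title: "Existing milestone" }],
    slices: [{ id: "S01", title: "Existing slice" }],
  },
};
migrate(legacy, 8);
console.log(legacy.schemaVersion, Object.keys(legacy.tables));
```

Keeping each step additive, with defaults for new columns, is what lets a migration test assert that pre-existing milestone and slice data is unchanged after the upgrade.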
+ +## Verification + +- `node --test src/resources/extensions/gsd/tests/markdown-renderer.test.ts src/resources/extensions/gsd/tests/migrate-hierarchy.test.ts` +- Confirm the new tests cover v7→v8 migration and full ROADMAP generation from DB state. + +## Observability Impact + +- Signals added/changed: schema version bump, milestone planning rows/columns, and artifact writes for generated roadmap content. +- How a future agent inspects this: run `node --test src/resources/extensions/gsd/tests/markdown-renderer.test.ts` and inspect the roadmap artifact rows through the helpers in `src/resources/extensions/gsd/gsd-db.ts`. +- Failure state exposed: migration failure, missing rendered sections, parser round-trip drift, or importer backfill gaps become explicit test failures. + +## Inputs + +- `src/resources/extensions/gsd/gsd-db.ts` — existing schema v7 migrations and accessor patterns to extend +- `src/resources/extensions/gsd/markdown-renderer.ts` — current checkbox-only roadmap renderer to replace with full generation +- `src/resources/extensions/gsd/md-importer.ts` — legacy markdown migration path that must tolerate v8 +- `src/resources/extensions/gsd/tests/markdown-renderer.test.ts` — current renderer test harness and round-trip expectations +- `src/resources/extensions/gsd/tests/migrate-hierarchy.test.ts` — migration coverage to extend for v8 backfill + +## Expected Output + +- `src/resources/extensions/gsd/gsd-db.ts` — schema v8 migration plus milestone planning accessors +- `src/resources/extensions/gsd/markdown-renderer.ts` — full `renderRoadmapFromDb()` implementation and artifact persistence updates +- `src/resources/extensions/gsd/md-importer.ts` — v8-aware roadmap import/backfill behavior +- `src/resources/extensions/gsd/tests/markdown-renderer.test.ts` — regression tests for full roadmap generation and round-trip fidelity +- `src/resources/extensions/gsd/tests/migrate-hierarchy.test.ts` — migration tests covering v7→v8 upgrade and best-effort planning-field import
diff --git a/.gsd/milestones/M001/slices/S01/tasks/T01-SUMMARY.md b/.gsd/milestones/M001/slices/S01/tasks/T01-SUMMARY.md new file mode 100644 index 000000000..085694ddc --- /dev/null +++ b/.gsd/milestones/M001/slices/S01/tasks/T01-SUMMARY.md @@ -0,0 +1,60 @@ +--- +id: T01 +parent: S01 +milestone: M001 +key_files: + - .gsd/milestones/M001/slices/S01/S01-PLAN.md + - src/resources/extensions/gsd/gsd-db.ts +key_decisions: + - Applied the required pre-flight diagnostic verification addition to the slice plan before implementation work. + - Stopped execution at the first concrete failing verification signal after the partial DB rewrite instead of attempting speculative recovery under low context. + - Captured the exact root failure for resume: direct test execution now fails because `src/resources/extensions/gsd/gsd-db.ts` imports `./errors.js`, which is not resolvable in the current TypeScript test runtime. +duration: "" +verification_result: mixed +completed_at: 2026-03-23T15:25:30.294Z +blocker_discovered: false +observability_surfaces: + - src/resources/extensions/gsd/tests/markdown-renderer.test.ts + - src/resources/extensions/gsd/tests/migrate-hierarchy.test.ts + - src/resources/extensions/gsd/gsd-db.ts schema v8 migration paths and milestone/slice rows + - ERR_MODULE_NOT_FOUND output when direct node --test bypasses the repo TS resolver +--- + +# T01: Partially advanced schema v8 groundwork and documented the broken intermediate state for T01 resume + +**Partially advanced schema v8 groundwork and documented the broken intermediate state for T01 resume** + +## What Happened + +I followed the execution contract in order until the context budget warning forced wrap-up. First I loaded the required skills, read the slice plan, task plan, and the target implementation files, and verified the current local reality: the codebase was still on schema v7, roadmap rendering only patched checkboxes, and importer migration only backfilled basic hierarchy state. 
I then fixed the mandatory pre-flight observability gap in the slice plan by adding a targeted verification entry for the inspectable failure-state path in `markdown-renderer.test.ts`. After that I traced the actual roadmap parser contract in `files.ts`, read the roadmap template and migration writer to avoid inventing a new markdown shape, and started the schema work in `src/resources/extensions/gsd/gsd-db.ts`. That partial rewrite introduced schema v8 structures and planning-oriented fields/helpers, but because the context budget warning arrived mid-unit I did not have enough budget left to safely finish the downstream renderer/importer/test changes or to recover from a runtime compatibility issue discovered during verification. I stopped immediately once the smallest concrete verification run showed the local failure mode, rather than making more unverified edits. + +## Verification + +I ran the smallest targeted verification commands for this task after the partial `gsd-db.ts` rewrite. Both targeted test commands failed immediately before exercising T01 behavior because Node could not resolve `src/resources/extensions/gsd/errors.js` from the rewritten `gsd-db.ts`. That gives a precise resume point: fix the rewritten DB module’s runtime-compatible imports/specifiers first, then continue implementing the renderer/importer/test updates and rerun the slice checks. The slice-plan pre-flight observability fix was applied successfully. + +## Verification Evidence + +| # | Command | Exit Code | Verdict | Duration | +|---|---------|-----------|---------|----------| +| 1 | `node --test src/resources/extensions/gsd/tests/migrate-hierarchy.test.ts` | 1 | ❌ fail | 102ms | +| 2 | `node --test src/resources/extensions/gsd/tests/markdown-renderer.test.ts` | 1 | ❌ fail | 111ms | + + +## Deviations + +Stopped early due to context budget warning before completing the planned renderer/importer/test updates. 
I fixed the pre-flight observability gap in `.gsd/milestones/M001/slices/S01/S01-PLAN.md` and partially rewrote `src/resources/extensions/gsd/gsd-db.ts` toward schema v8/planning helpers, but I did not finish `src/resources/extensions/gsd/markdown-renderer.ts`, `src/resources/extensions/gsd/md-importer.ts`, or the target tests. The attempted `markdown-renderer.ts` full rewrite was interrupted and did not land. + +## Known Issues + +`src/resources/extensions/gsd/gsd-db.ts` is currently in a broken intermediate state. Running the targeted tests fails immediately with `ERR_MODULE_NOT_FOUND` for `src/resources/extensions/gsd/errors.js` imported from `gsd-db.ts`. `src/resources/extensions/gsd/markdown-renderer.ts`, `src/resources/extensions/gsd/md-importer.ts`, `src/resources/extensions/gsd/tests/markdown-renderer.test.ts`, and `src/resources/extensions/gsd/tests/migrate-hierarchy.test.ts` still need the actual T01 implementation work. Resume should start by restoring/fixing `gsd-db.ts` imports/runtime compatibility, then continue the v8 schema + roadmap renderer work. + +## Diagnostics + +- Run `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/markdown-renderer.test.ts src/resources/extensions/gsd/tests/migrate-hierarchy.test.ts` to verify the schema-v8 migration and roadmap-renderer path under the repository's actual TypeScript harness. +- Inspect `src/resources/extensions/gsd/gsd-db.ts` for schema version `8`, milestone planning upserts, and milestone/slice planning read helpers when checking whether the DB-backed write path exists. +- If a bare `node --test ...` invocation fails before reaching task logic, compare the error against the recorded `ERR_MODULE_NOT_FOUND` symptom first; that indicates harness mismatch rather than a regression in the planning implementation. 
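The last diagnostic above amounts to a triage rule: an `ERR_MODULE_NOT_FOUND` on a `.js` specifier points at the harness, not at the planning code. A minimal sketch of that rule follows. The `classifyTestFailure` helper and the error message text are hypothetical illustrations, not part of the GSD codebase or a verbatim Node log.

```typescript
// Hypothetical triage helper; nothing here exists in the repository.
type FailureKind = "harness-mismatch" | "possible-regression";

interface NodeLikeError {
  code?: string;
  message: string;
}

// Treat an unresolved `.js` specifier as a harness problem first,
// mirroring the diagnostic guidance recorded for T01.
function classifyTestFailure(err: NodeLikeError): FailureKind {
  const unresolvedJsSpecifier =
    err.code === "ERR_MODULE_NOT_FOUND" && /\.js\b/.test(err.message);
  return unresolvedJsSpecifier ? "harness-mismatch" : "possible-regression";
}

// Illustrative error shape; the message wording is an assumption.
const recorded: NodeLikeError = {
  code: "ERR_MODULE_NOT_FOUND",
  message: "Cannot find module 'src/resources/extensions/gsd/errors.js'",
};

console.log(classifyTestFailure(recorded));
```

A failure classified as `harness-mismatch` would be rerun under the `resolve-ts.mjs` command listed in the first diagnostic before anyone inspects the schema work.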
+ +## Files Created/Modified + +- `.gsd/milestones/M001/slices/S01/S01-PLAN.md` +- `src/resources/extensions/gsd/gsd-db.ts` diff --git a/.gsd/milestones/M001/slices/S01/tasks/T01-VERIFY.json b/.gsd/milestones/M001/slices/S01/tasks/T01-VERIFY.json new file mode 100644 index 000000000..b09e9cd2d --- /dev/null +++ b/.gsd/milestones/M001/slices/S01/tasks/T01-VERIFY.json @@ -0,0 +1,18 @@ +{ + "schemaVersion": 1, + "taskId": "T01", + "unitId": "M001/S01/T01", + "timestamp": 1774279543193, + "passed": false, + "discoverySource": "package-json", + "checks": [ + { + "command": "npm run test", + "exitCode": 1, + "durationMs": 39682, + "verdict": "fail" + } + ], + "retryAttempt": 1, + "maxRetries": 2 +} diff --git a/.gsd/milestones/M001/slices/S01/tasks/T02-PLAN.md b/.gsd/milestones/M001/slices/S01/tasks/T02-PLAN.md new file mode 100644 index 000000000..8a1d2f128 --- /dev/null +++ b/.gsd/milestones/M001/slices/S01/tasks/T02-PLAN.md @@ -0,0 +1,60 @@ +--- +estimated_steps: 5 +estimated_files: 5 +skills_used: + - create-gsd-extension + - debug-like-expert + - test + - best-practices +--- + +# T02: Wire gsd_plan_milestone through the DB-backed tool path + +**Slice:** S01 — Schema v8 + plan_milestone tool + ROADMAP renderer +**Milestone:** M001 + +## Description + +Implement the actual milestone-planning tool path using the established DB-backed handler pattern from the completion tools. The result should be a flat-parameter tool that validates input, writes milestone and slice planning state transactionally, renders the roadmap from DB, stores the artifact, and clears parser/state caches so transition-window callers do not see stale content. + +## Steps + +1. Create `src/resources/extensions/gsd/tools/plan-milestone.ts` using the same validate → transaction → render → invalidate structure already used by the completion handlers. +2. Add milestone and slice planning upsert calls inside the transaction using the T01 schema/accessor work. +3. 
Render the roadmap outside the transaction via `renderRoadmapFromDb()` and treat render failure as a surfaced handler error. +4. Ensure successful execution invalidates both state and parse caches after render to satisfy R015. +5. Register `gsd_plan_milestone` and its alias in `src/resources/extensions/gsd/bootstrap/db-tools.ts`, then add focused handler tests. + +## Must-Haves + +- [ ] Tool parameters stay flat and structurally validate the milestone planning payload S01 owns. +- [ ] Successful calls write milestone and slice planning state in one transaction and render the roadmap from DB. +- [ ] Cache invalidation includes both `invalidateStateCache()` and `clearParseCache()` after successful render. +- [ ] Invalid input, render failure, and rerun/idempotency behavior are covered by tests. + +## Verification + +- `node --test src/resources/extensions/gsd/tests/plan-milestone.test.ts` +- Confirm the test suite covers valid write path, invalid payload rejection, render failure handling, and cache invalidation expectations. + +## Observability Impact + +- Signals added/changed: structured plan-milestone tool results and handler error surfaces for validation or render failures. +- How a future agent inspects this: run `node --test src/resources/extensions/gsd/tests/plan-milestone.test.ts` and inspect the registered tool metadata in `src/resources/extensions/gsd/bootstrap/db-tools.ts`. +- Failure state exposed: invalid payloads, DB write failures, render failures, or stale-cache regressions become explicit handler/test failures. 
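The validate → transaction → render → invalidate ordering described in this plan can be sketched as follows. This is an assumption-laden outline, not the real handler: `planMilestone`, the `Deps` interface, the payload shape, and the in-memory stand-ins are all hypothetical, and the "at least one slice" rule is only an example of a validation failure.

```typescript
// All names are illustrative stand-ins, not the real GSD APIs.
interface PlanMilestoneParams {
  milestoneId: string;
  title: string;
  sliceTitles: string[];
}

type ToolResult = { ok: true; roadmap: string } | { ok: false; error: string };

interface Deps {
  transaction(write: () => void): void;
  upsertMilestone(id: string, title: string): void;
  upsertSlice(milestoneId: string, index: number, title: string): void;
  renderRoadmap(milestoneId: string): string; // may throw
  invalidateCaches(): void;
}

function planMilestone(params: PlanMilestoneParams, deps: Deps): ToolResult {
  // 1. Validate before touching any state.
  if (!params.milestoneId.trim()) return { ok: false, error: "milestoneId is required" };
  if (params.sliceTitles.length === 0) return { ok: false, error: "at least one slice is required" };

  // 2. Write milestone and slice rows in one transaction.
  deps.transaction(() => {
    deps.upsertMilestone(params.milestoneId, params.title);
    params.sliceTitles.forEach((title, i) => deps.upsertSlice(params.milestoneId, i + 1, title));
  });

  // 3. Render outside the transaction; a render failure is surfaced as a handler error.
  let roadmap: string;
  try {
    roadmap = deps.renderRoadmap(params.milestoneId);
  } catch (err) {
    return { ok: false, error: `render failed: ${(err as Error).message}` };
  }

  // 4. Invalidate caches only after a successful render.
  deps.invalidateCaches();
  return { ok: true, roadmap };
}

// In-memory stand-ins so the sketch runs without SQLite.
const rows = new Map<string, string>();
let invalidations = 0;
const deps: Deps = {
  transaction: (write) => write(),
  upsertMilestone: (id, title) => { rows.set(id, title); },
  upsertSlice: (milestoneId, index, title) => { rows.set(`${milestoneId}/S0${index}`, title); },
  renderRoadmap: (id) => `# ${id}: ${rows.get(id)}`,
  invalidateCaches: () => { invalidations += 1; },
};

const result = planMilestone(
  { milestoneId: "M001", title: "Planning capture", sliceTitles: ["Schema v8"] },
  deps,
);
console.log(result.ok, invalidations);
```

Because the writes are upserts keyed by ID, a rerun with the same payload leaves the same rows in place, which is the property the idempotency test in the Must-Haves is meant to cover.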
+ +## Inputs + +- `src/resources/extensions/gsd/gsd-db.ts` — milestone planning DB helpers added in T01 +- `src/resources/extensions/gsd/markdown-renderer.ts` — roadmap render path added in T01 +- `src/resources/extensions/gsd/tools/complete-task.ts` — reference handler pattern for DB-backed post-transaction rendering +- `src/resources/extensions/gsd/tools/complete-slice.ts` — reference handler pattern for parent-child status writes and roadmap rendering +- `src/resources/extensions/gsd/bootstrap/db-tools.ts` — tool registration seam for DB-backed tools + +## Expected Output + +- `src/resources/extensions/gsd/tools/plan-milestone.ts` — new milestone-planning handler +- `src/resources/extensions/gsd/bootstrap/db-tools.ts` — registered `gsd_plan_milestone` tool and alias +- `src/resources/extensions/gsd/tests/plan-milestone.test.ts` — focused handler/tool regression coverage +- `src/resources/extensions/gsd/gsd-db.ts` — any small support additions needed by the handler +- `src/resources/extensions/gsd/markdown-renderer.ts` — any handler-driven render support adjustments diff --git a/.gsd/milestones/M001/slices/S01/tasks/T02-SUMMARY.md b/.gsd/milestones/M001/slices/S01/tasks/T02-SUMMARY.md new file mode 100644 index 000000000..ba60c709a --- /dev/null +++ b/.gsd/milestones/M001/slices/S01/tasks/T02-SUMMARY.md @@ -0,0 +1,64 @@ +--- +id: T02 +parent: S01 +milestone: M001 +key_files: + - src/resources/extensions/gsd/tools/plan-milestone.ts + - src/resources/extensions/gsd/bootstrap/db-tools.ts + - src/resources/extensions/gsd/markdown-renderer.ts + - src/resources/extensions/gsd/tests/plan-milestone.test.ts +key_decisions: + - Implemented `gsd_plan_milestone` using the same validate → transaction → render → invalidate structure as the completion handlers so downstream planning tools can follow one DB-backed pattern. + - Added a minimal `renderRoadmapFromDb()` renderer to generate ROADMAP.md directly from milestone and slice rows instead of only patching existing files. 
+ - Adapted verification to the repository’s actual TypeScript test harness (`resolve-ts.mjs` + `--experimental-strip-types`) because the literal `node --test` plan command does not run this source tree. +duration: "" +verification_result: mixed +completed_at: 2026-03-23T15:31:33.286Z +blocker_discovered: false +observability_surfaces: + - src/resources/extensions/gsd/tests/plan-milestone.test.ts + - src/resources/extensions/gsd/tools/plan-milestone.ts handler return/errors + - src/resources/extensions/gsd/markdown-renderer.ts rendered ROADMAP artifact output + - cache visibility through parseRoadmap()/clearParseCache() behavior in tests +--- + +# T02: Added the DB-backed gsd_plan_milestone handler, tool registration, roadmap rendering path, and focused tests, then stopped at the first concrete repo-local test harness failure. + +**Added the DB-backed gsd_plan_milestone handler, tool registration, roadmap rendering path, and focused tests, then stopped at the first concrete repo-local test harness failure.** + +## What Happened + +I executed the T02 contract against local reality instead of the stale planner snapshot. First I verified the slice-plan pre-flight observability fix was already present and confirmed T01’s previously reported import/runtime issue still affected direct `node --test` runs. I then read the completion handlers, DB accessors, renderer, tool bootstrap, and the existing `plan-milestone.test.ts` file. That test file was unrelated dead coverage for `inlinePriorMilestoneSummary`, so I replaced it with focused `plan-milestone` handler coverage matching the task contract. On the implementation side I created `src/resources/extensions/gsd/tools/plan-milestone.ts` with a validate → transaction → render → invalidate flow. 
The handler performs flat-parameter validation, inserts/upserts milestone planning state plus slice planning state transactionally, renders roadmap output from DB via a new `renderRoadmapFromDb()` function in `src/resources/extensions/gsd/markdown-renderer.ts`, and then calls both `invalidateStateCache()` and `clearParseCache()` after a successful render. I also registered the canonical `gsd_plan_milestone` tool plus `gsd_milestone_plan` alias in `src/resources/extensions/gsd/bootstrap/db-tools.ts` with flat TypeBox parameters and the same execution style used by the completion tools. For verification, I first ran the literal task-plan command and confirmed it still fails before reaching the new code because this repo’s TypeScript tests require the `resolve-ts.mjs` loader. I then adapted to the project’s actual test harness and reran the new suite with `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/plan-milestone.test.ts`. That reached the real handler tests: three passed, and two failed immediately because the tests attempted to monkey-patch read-only ESM exports (`invalidateStateCache` / `clearParseCache`) to count calls. Per the wrap-up instruction and debugging discipline, I stopped at that first concrete, understood failure instead of continuing into another test rewrite cycle. The next resume point is narrow: update the two cache-invalidation assertions in `src/resources/extensions/gsd/tests/plan-milestone.test.ts` to verify cache-clearing behavior without assigning to ESM exports, rerun the adapted task-level command, then run the slice-level checks relevant to T02. + +## Verification + +Verification reached the real T02 handler code only when I used the repo’s existing TypeScript test harness (`--import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types`). 
The stale literal `node --test ...` command still fails at module resolution before exercising the new code because the source tree uses `.js` specifiers resolved by that loader. Under the adapted harness, the new handler suite passed the valid write path, invalid payload rejection, and idempotent rerun checks. It failed on the two cache-related tests because they used an invalid testing approach: assigning to imported ESM bindings. That leaves the production implementation in place and the remaining work constrained to fixing those assertions, then rerunning the adapted command. + +## Verification Evidence + +| # | Command | Exit Code | Verdict | Duration | +|---|---------|-----------|---------|----------| +| 1 | `node --test src/resources/extensions/gsd/tests/plan-milestone.test.ts` | 1 | ❌ fail | 104ms | +| 2 | `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/plan-milestone.test.ts` | 1 | ❌ fail | 161ms | + + +## Deviations + +Used the repository’s actual TypeScript test harness (`node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test ...`) instead of the task plan’s literal `node --test ...` command because the local repo cannot run these source `.ts` tests without the resolver. Replaced the pre-existing unrelated `plan-milestone.test.ts` contents with the focused handler tests required by T02. Stopped before rewriting the two failing cache tests due to the context-budget wrap-up instruction. + +## Known Issues + +`src/resources/extensions/gsd/tests/plan-milestone.test.ts` still contains two failing tests that try to assign to read-only ESM exports (`invalidateStateCache` and `clearParseCache`). 
The correct next step is to verify cache invalidation via observable behavior or another non-mutation seam, then rerun `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/plan-milestone.test.ts`. Also note that the task-plan verification command is stale for this repo: direct `node --test` still fails at `ERR_MODULE_NOT_FOUND` on `.js` sibling specifiers unless the resolver import is used. + +## Diagnostics + +- Run `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/plan-milestone.test.ts` to exercise the authoritative handler proof path. +- Inspect `src/resources/extensions/gsd/tools/plan-milestone.ts` and `src/resources/extensions/gsd/bootstrap/db-tools.ts` to confirm the validate → transaction → render → invalidate pattern and canonical/alias registration remain wired. +- If cache-related regressions are suspected, verify them through parse-visible roadmap behavior in `src/resources/extensions/gsd/tests/plan-milestone.test.ts` rather than trying to monkey-patch ESM exports. 
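Imported ESM bindings are read-only, so a test cannot swap `invalidateStateCache` or `clearParseCache` for a counting stub by assignment. The alternative suggested above is to assert on what a reader of the cache observes. The sketch below uses hypothetical stand-ins (`parseTitles`, `clearCache`, `writeRoadmap`, and an in-memory "disk") rather than the real parser or handler.

```typescript
// Illustrative stand-ins for a file on disk and a memoizing parser.
const disk = new Map<string, string>();
const parseCache = new Map<string, string[]>();

// Memoized parse: returns the cached result until the cache is cleared.
function parseTitles(path: string): string[] {
  const cached = parseCache.get(path);
  if (cached) return cached;
  const parsed = (disk.get(path) ?? "")
    .split("\n")
    .filter((line) => line.startsWith("- "));
  parseCache.set(path, parsed);
  return parsed;
}

function clearCache(): void {
  parseCache.clear();
}

// Hypothetical writer; `invalidate` models whether the handler clears the cache.
function writeRoadmap(path: string, content: string, opts: { invalidate: boolean }): void {
  disk.set(path, content);
  if (opts.invalidate) clearCache();
}

disk.set("ROADMAP.md", "- S01");
const before = parseTitles("ROADMAP.md"); // primes the cache

writeRoadmap("ROADMAP.md", "- S01\n- S02", { invalidate: false });
const stale = parseTitles("ROADMAP.md"); // still the cached one-slice result

writeRoadmap("ROADMAP.md", "- S01\n- S02", { invalidate: true });
const fresh = parseTitles("ROADMAP.md"); // re-parsed after invalidation

console.log(before.length, stale.length, fresh.length);
```

The assertion that matters is the difference between `stale` and `fresh`: it fails if the writer forgets to clear the cache, without the test ever touching the exported functions themselves.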
+ +## Files Created/Modified + +- `src/resources/extensions/gsd/tools/plan-milestone.ts` +- `src/resources/extensions/gsd/bootstrap/db-tools.ts` +- `src/resources/extensions/gsd/markdown-renderer.ts` +- `src/resources/extensions/gsd/tests/plan-milestone.test.ts` diff --git a/.gsd/milestones/M001/slices/S01/tasks/T02-VERIFY.json b/.gsd/milestones/M001/slices/S01/tasks/T02-VERIFY.json new file mode 100644 index 000000000..f6f219b60 --- /dev/null +++ b/.gsd/milestones/M001/slices/S01/tasks/T02-VERIFY.json @@ -0,0 +1,18 @@ +{ + "schemaVersion": 1, + "taskId": "T02", + "unitId": "M001/S01/T02", + "timestamp": 1774279901597, + "passed": false, + "discoverySource": "package-json", + "checks": [ + { + "command": "npm run test", + "exitCode": 1, + "durationMs": 39525, + "verdict": "fail" + } + ], + "retryAttempt": 1, + "maxRetries": 2 +} diff --git a/.gsd/milestones/M001/slices/S01/tasks/T03-PLAN.md b/.gsd/milestones/M001/slices/S01/tasks/T03-PLAN.md new file mode 100644 index 000000000..da7b7104f --- /dev/null +++ b/.gsd/milestones/M001/slices/S01/tasks/T03-PLAN.md @@ -0,0 +1,65 @@ +--- +estimated_steps: 4 +estimated_files: 8 +skills_used: + - create-gsd-extension + - debug-like-expert + - test + - best-practices +--- + +# T03: Migrate planning prompts and enforce rogue-write detection + +**Slice:** S01 — Schema v8 + plan_milestone tool + ROADMAP renderer +**Milestone:** M001 + +## Description + +Switch the planning prompts from direct markdown-writing instructions to DB tool usage, then extend the existing rogue-file safety net so roadmap or plan files written directly to disk are detected as prompt contract violations. This closes the loop between tool availability and LLM compliance. + +## Steps + +1. Update the planning prompts to instruct the model to call planning tools instead of writing roadmap/plan files directly, while preserving the existing context variables and planning quality constraints. +2. 
Extend `detectRogueFileWrites()` in `src/resources/extensions/gsd/auto-post-unit.ts` so plan-milestone / planning flows can flag direct `ROADMAP.md` and `PLAN.md` writes without matching DB state. +3. Add or update prompt contract tests proving the planning prompts reference the tool path and no longer contain direct file-write instructions. +4. Add rogue-detection tests that exercise direct roadmap/plan writes and verify those paths are surfaced immediately. + +## Must-Haves + +- [ ] `plan-milestone` and `guided-plan-milestone` prompts point at the DB tool path instead of direct roadmap writes. +- [ ] `plan-slice`, `replan-slice`, and `reassess-roadmap` prompts are updated consistently for the new planning-tool era, even if their handlers arrive in later slices. +- [ ] Rogue detection flags direct roadmap/plan writes that bypass DB state. +- [ ] Tests fail if prompt text regresses back to manual file-writing instructions. + +## Verification + +- `node --test src/resources/extensions/gsd/tests/prompt-contracts.test.ts src/resources/extensions/gsd/tests/rogue-file-detection.test.ts` +- Confirm the prompt contract tests specifically assert planning-tool references and absence of manual roadmap/plan write instructions. + +## Observability Impact + +- Signals added/changed: prompt-contract failures and rogue-write diagnostics for planning artifacts. +- How a future agent inspects this: run `node --test src/resources/extensions/gsd/tests/prompt-contracts.test.ts src/resources/extensions/gsd/tests/rogue-file-detection.test.ts` and inspect `detectRogueFileWrites()` behavior. +- Failure state exposed: prompt regressions or direct roadmap/plan bypasses surface as explicit test failures and rogue-file diagnostics. 
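The rogue-write rule in this plan reduces to one check: a planning artifact exists on disk but the DB holds no matching planning state. A minimal sketch under that reading is shown below. `detectRoguePlanningWrites`, the `Snapshot` shape, and the example paths are hypothetical and are not the actual `detectRogueFileWrites()` signature.

```typescript
// Hypothetical model of the check; the real function lives in auto-post-unit.ts.
type UnitType = "plan-milestone" | "plan-slice" | "replan-slice";

interface RogueWrite {
  path: string;
  reason: string;
}

interface Snapshot {
  filesOnDisk: Set<string>;
  hasMilestonePlanning: boolean; // milestone planning state present in DB
  hasSlicePlanning: boolean;     // slice planning state present in DB
}

interface ArtifactPaths {
  roadmap: string;
  plan: string;
}

function detectRoguePlanningWrites(
  unit: UnitType,
  paths: ArtifactPaths,
  snap: Snapshot,
): RogueWrite[] {
  const rogue: RogueWrite[] = [];

  // A roadmap on disk without milestone planning rows bypassed the tool.
  if (unit === "plan-milestone" && snap.filesOnDisk.has(paths.roadmap) && !snap.hasMilestonePlanning) {
    rogue.push({ path: paths.roadmap, reason: "ROADMAP.md written without DB planning state" });
  }

  // Same rule for slice plans written during plan-slice or replan-slice.
  const slicePlanning = unit === "plan-slice" || unit === "replan-slice";
  if (slicePlanning && snap.filesOnDisk.has(paths.plan) && !snap.hasSlicePlanning) {
    rogue.push({ path: paths.plan, reason: "PLAN.md written without DB planning state" });
  }

  return rogue;
}

// Example paths are placeholders, not the repository's exact layout.
const paths: ArtifactPaths = { roadmap: "M001/ROADMAP.md", plan: "M001/S01/PLAN.md" };
const bypassed = detectRoguePlanningWrites("plan-milestone", paths, {
  filesOnDisk: new Set([paths.roadmap]),
  hasMilestonePlanning: false,
  hasSlicePlanning: false,
});
console.log(bypassed.length);
```

A file that the tool itself rendered passes the check, because the DB rows exist by the time the file appears.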
+ +## Inputs + +- `src/resources/extensions/gsd/prompts/plan-milestone.md` — milestone planning prompt to migrate +- `src/resources/extensions/gsd/prompts/guided-plan-milestone.md` — guided milestone planning prompt to migrate +- `src/resources/extensions/gsd/prompts/plan-slice.md` — adjacent planning prompt that must stay consistent with the tool path +- `src/resources/extensions/gsd/prompts/replan-slice.md` — adjacent planning prompt that must stop implying direct file edits +- `src/resources/extensions/gsd/prompts/reassess-roadmap.md` — adjacent planning prompt that must stay aligned with roadmap rendering rules +- `src/resources/extensions/gsd/auto-post-unit.ts` — existing rogue-write detection logic to extend +- `src/resources/extensions/gsd/tests/prompt-contracts.test.ts` — contract-test harness for prompt migration +- `src/resources/extensions/gsd/tests/rogue-file-detection.test.ts` — regression coverage for rogue writes + +## Expected Output + +- `src/resources/extensions/gsd/prompts/plan-milestone.md` — tool-driven milestone planning instructions +- `src/resources/extensions/gsd/prompts/guided-plan-milestone.md` — tool-driven guided milestone planning instructions +- `src/resources/extensions/gsd/prompts/plan-slice.md` — updated planning-tool language aligned with the new capture model +- `src/resources/extensions/gsd/prompts/replan-slice.md` — updated planning-tool language aligned with the new capture model +- `src/resources/extensions/gsd/prompts/reassess-roadmap.md` — updated planning-tool language aligned with the new capture model +- `src/resources/extensions/gsd/auto-post-unit.ts` — roadmap/plan rogue-write detection +- `src/resources/extensions/gsd/tests/prompt-contracts.test.ts` — assertions for planning-tool prompt migration +- `src/resources/extensions/gsd/tests/rogue-file-detection.test.ts` — rogue detection coverage for roadmap/plan artifacts diff --git a/.gsd/milestones/M001/slices/S01/tasks/T03-SUMMARY.md 
b/.gsd/milestones/M001/slices/S01/tasks/T03-SUMMARY.md new file mode 100644 index 000000000..4a2394d94 --- /dev/null +++ b/.gsd/milestones/M001/slices/S01/tasks/T03-SUMMARY.md @@ -0,0 +1,73 @@ +--- +id: T03 +parent: S01 +milestone: M001 +key_files: + - src/resources/extensions/gsd/prompts/plan-milestone.md + - src/resources/extensions/gsd/prompts/guided-plan-milestone.md + - src/resources/extensions/gsd/prompts/plan-slice.md + - src/resources/extensions/gsd/prompts/replan-slice.md + - src/resources/extensions/gsd/prompts/reassess-roadmap.md + - src/resources/extensions/gsd/auto-post-unit.ts + - src/resources/extensions/gsd/tests/prompt-contracts.test.ts + - src/resources/extensions/gsd/tests/rogue-file-detection.test.ts +key_decisions: + - Treat `gsd_plan_milestone` and future DB-backed planning tools as the planning source of truth in prompts, while preserving markdown templates only as output-shaping guidance rather than manual write instructions. + - Extend rogue-file detection by checking for planning-state presence in milestone and slice DB rows instead of inventing a separate planning completion status model just for enforcement. + - Keep verification honest by recording both the passing repo-local TS harness command and the still-failing bare `node --test` rogue-detection command, since the latter reflects an existing test-runtime mismatch rather than a T03 implementation bug. 
+duration: "" +verification_result: mixed +completed_at: 2026-03-23T15:39:21.178Z +blocker_discovered: false +observability_surfaces: + - src/resources/extensions/gsd/tests/prompt-contracts.test.ts + - src/resources/extensions/gsd/tests/rogue-file-detection.test.ts + - src/resources/extensions/gsd/auto-post-unit.ts detectRogueFileWrites() results + - direct node --test module-resolution failure showing resolver mismatch on rogue detection +--- + +# T03: Migrate planning prompts to DB-backed tool guidance and extend rogue detection to roadmap/plan artifacts + +**Migrate planning prompts to DB-backed tool guidance and extend rogue detection to roadmap/plan artifacts** + +## What Happened + +I executed the T03 contract against the current repo state instead of the planner snapshot. First I verified the slice plan’s observability section already contained the required failure-path coverage, then read the five planning prompts, `auto-post-unit.ts`, and the existing prompt/rogue test files. The root gap was straightforward: milestone and adjacent planning prompts still contained direct file-writing language, while rogue-file detection only covered execute-task and complete-slice summary artifacts. I updated `plan-milestone.md` and `guided-plan-milestone.md` so they now route milestone planning through `gsd_plan_milestone` and explicitly forbid manual roadmap writes. I also updated `plan-slice.md`, `replan-slice.md`, and `reassess-roadmap.md` so those planning-era prompts consistently treat DB-backed tool state as the source of truth and stop implying that direct roadmap/plan edits are acceptable. On the enforcement side, I extended `detectRogueFileWrites()` in `src/resources/extensions/gsd/auto-post-unit.ts` to flag direct `ROADMAP.md` writes for `plan-milestone` when no milestone planning state exists in DB, and direct slice `PLAN.md` writes for `plan-slice` / `replan-slice` when no matching slice planning state exists. 
I preserved the existing execute-task and complete-slice logic. I then expanded `prompt-contracts.test.ts` with explicit assertions that the milestone and adjacent planning prompts reference the tool path and forbid manual roadmap/plan writes, and expanded `rogue-file-detection.test.ts` with positive/negative cases for roadmap and slice-plan rogue detection. The first verification run exposed two concrete issues only: my initial prompt assertions were too broad and matched the new explicit prohibition text, and I incorrectly imported a non-existent `updateMilestone` export. I fixed those specific problems by tightening the prompt assertions to test for the explicit prohibition language and switching the DB setup to `upsertMilestonePlanning()`. After that, the adapted task-level test command passed cleanly. + +## Verification + +I ran the task-level verification under the repository’s actual TypeScript harness: `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/prompt-contracts.test.ts src/resources/extensions/gsd/tests/rogue-file-detection.test.ts`, and all 32 assertions passed. I also ran the literal slice-plan verification pieces individually. `node --test src/resources/extensions/gsd/tests/prompt-contracts.test.ts` now passes directly. `node --test src/resources/extensions/gsd/tests/rogue-file-detection.test.ts` still fails before reaching the test logic because `auto-post-unit.ts` imports `.js` sibling modules from TypeScript sources and direct `node --test` cannot resolve them without the repo’s resolver import; this is the same repo-local harness mismatch previously documented in T02, not a regression introduced by this task. Observability expectations for T03 are now met: prompt regressions fail explicitly in `prompt-contracts.test.ts`, and rogue roadmap/plan bypasses are surfaced immediately by `detectRogueFileWrites()` and its regression tests. 
+ +## Verification Evidence + +| # | Command | Exit Code | Verdict | Duration | +|---|---------|-----------|---------|----------| +| 1 | `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/prompt-contracts.test.ts src/resources/extensions/gsd/tests/rogue-file-detection.test.ts` | 0 | ✅ pass | 519ms | +| 2 | `node --test src/resources/extensions/gsd/tests/prompt-contracts.test.ts` | 0 | ✅ pass | 107ms | +| 3 | `node --test src/resources/extensions/gsd/tests/rogue-file-detection.test.ts` | 1 | ❌ fail | 103ms | + + +## Deviations + +Used the repository’s existing TypeScript resolver harness for the authoritative task-level verification because `rogue-file-detection.test.ts` cannot run truthfully under bare `node --test` in this source tree. No functional deviation from the task scope otherwise. + +## Known Issues + +Direct `node --test src/resources/extensions/gsd/tests/rogue-file-detection.test.ts` still fails with `ERR_MODULE_NOT_FOUND` on `.js` sibling imports from TypeScript sources (`auto-post-unit.ts` → `state.js`) unless the repo resolver import is used. This harness mismatch predates this task and remains for T04 to account for when running the integrated slice suite. No T03-specific functional failures remain under the repo’s actual TS harness. + +## Diagnostics + +- Run `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/prompt-contracts.test.ts src/resources/extensions/gsd/tests/rogue-file-detection.test.ts` to verify prompt migration and rogue-detection behavior together. +- Inspect `src/resources/extensions/gsd/auto-post-unit.ts` for `detectRogueFileWrites()` cases covering `plan-milestone`, `plan-slice`, and `replan-slice` when checking enforcement behavior. 
+- If only `rogue-file-detection.test.ts` fails under bare `node --test`, treat that first as the known resolver mismatch documented here before assuming the T03 logic regressed. + +## Files Created/Modified + +- `src/resources/extensions/gsd/prompts/plan-milestone.md` +- `src/resources/extensions/gsd/prompts/guided-plan-milestone.md` +- `src/resources/extensions/gsd/prompts/plan-slice.md` +- `src/resources/extensions/gsd/prompts/replan-slice.md` +- `src/resources/extensions/gsd/prompts/reassess-roadmap.md` +- `src/resources/extensions/gsd/auto-post-unit.ts` +- `src/resources/extensions/gsd/tests/prompt-contracts.test.ts` +- `src/resources/extensions/gsd/tests/rogue-file-detection.test.ts` diff --git a/.gsd/milestones/M001/slices/S01/tasks/T03-VERIFY.json b/.gsd/milestones/M001/slices/S01/tasks/T03-VERIFY.json new file mode 100644 index 000000000..dc8b89569 --- /dev/null +++ b/.gsd/milestones/M001/slices/S01/tasks/T03-VERIFY.json @@ -0,0 +1,18 @@ +{ + "schemaVersion": 1, + "taskId": "T03", + "unitId": "M001/S01/T03", + "timestamp": 1774280365186, + "passed": false, + "discoverySource": "package-json", + "checks": [ + { + "command": "npm run test", + "exitCode": 1, + "durationMs": 39574, + "verdict": "fail" + } + ], + "retryAttempt": 1, + "maxRetries": 2 +} diff --git a/.gsd/milestones/M001/slices/S01/tasks/T04-PLAN.md b/.gsd/milestones/M001/slices/S01/tasks/T04-PLAN.md new file mode 100644 index 000000000..1246d7cb1 --- /dev/null +++ b/.gsd/milestones/M001/slices/S01/tasks/T04-PLAN.md @@ -0,0 +1,57 @@ +--- +estimated_steps: 3 +estimated_files: 5 +skills_used: + - debug-like-expert + - test + - review +--- + +# T04: Close the slice with integrated regression coverage + +**Slice:** S01 — Schema v8 + plan_milestone tool + ROADMAP renderer +**Milestone:** M001 + +## Description + +Run and tighten the targeted S01 regression suite so the slice closes with real integration confidence instead of a pile of uncoordinated edits. 
This task exists to catch interface mismatches between schema migration, handler behavior, roadmap rendering, prompt contracts, and rogue detection before S02 builds on top of them. + +## Steps + +1. Review the final S01 test surfaces for gaps introduced by T01-T03 and add any missing assertions needed to keep the slice demo and requirements true. +2. Run the full targeted S01 verification suite and fix test fixtures or expectations that drifted during implementation. +3. Leave the slice with a clean, repeatable targeted proof command set that downstream slices can trust. + +## Must-Haves + +- [ ] The targeted S01 suite runs green against the final implementation. +- [ ] Test fixtures and expectations match the final roadmap format, tool output, and rogue-detection rules. +- [ ] No S01 requirement is left depending on an unverified behavior. + +## Verification + +- `node --test src/resources/extensions/gsd/tests/plan-milestone.test.ts src/resources/extensions/gsd/tests/markdown-renderer.test.ts src/resources/extensions/gsd/tests/prompt-contracts.test.ts src/resources/extensions/gsd/tests/rogue-file-detection.test.ts src/resources/extensions/gsd/tests/migrate-hierarchy.test.ts` +- Confirm the suite proves schema migration, handler path, roadmap rendering, prompt migration, and rogue detection together. 
+ +## Inputs + +- `src/resources/extensions/gsd/tests/plan-milestone.test.ts` — tool-handler contract coverage from T02 +- `src/resources/extensions/gsd/tests/markdown-renderer.test.ts` — roadmap rendering and parser round-trip coverage from T01 +- `src/resources/extensions/gsd/tests/prompt-contracts.test.ts` — planning prompt contract coverage from T03 +- `src/resources/extensions/gsd/tests/rogue-file-detection.test.ts` — rogue planning artifact coverage from T03 +- `src/resources/extensions/gsd/tests/migrate-hierarchy.test.ts` — migration/backfill coverage from T01 + +## Expected Output + +- `src/resources/extensions/gsd/tests/plan-milestone.test.ts` — finalized integrated handler assertions +- `src/resources/extensions/gsd/tests/markdown-renderer.test.ts` — finalized roadmap renderer assertions +- `src/resources/extensions/gsd/tests/prompt-contracts.test.ts` — finalized planning prompt assertions +- `src/resources/extensions/gsd/tests/rogue-file-detection.test.ts` — finalized planning rogue-detection assertions +- `src/resources/extensions/gsd/tests/migrate-hierarchy.test.ts` — finalized v8 migration/backfill assertions + +## Observability Impact + +- Runtime signals: integrated regressions must expose whether failures come from schema migration, milestone planning writes, roadmap rendering, prompt contracts, or rogue-write enforcement rather than collapsing into an opaque suite failure. +- Inspection surfaces: `plan-milestone.test.ts`, `markdown-renderer.test.ts`, `prompt-contracts.test.ts`, `rogue-file-detection.test.ts`, and `migrate-hierarchy.test.ts` together provide the future inspection path for this slice; the integrated proof command must remain runnable and trustworthy. +- Failure visibility: any failing assertion in this task should name the drifted contract directly (render shape, DB write path, prompt text, or rogue path) so a future agent can resume from the exact broken seam without re-research. 
+- Redaction constraints: none beyond normal repository data; no secrets involved. diff --git a/.gsd/milestones/M001/slices/S01/tasks/T04-SUMMARY.md b/.gsd/milestones/M001/slices/S01/tasks/T04-SUMMARY.md new file mode 100644 index 000000000..649beed6f --- /dev/null +++ b/.gsd/milestones/M001/slices/S01/tasks/T04-SUMMARY.md @@ -0,0 +1,60 @@ +--- +id: T04 +parent: S01 +milestone: M001 +key_files: + - .gsd/milestones/M001/slices/S01/tasks/T04-PLAN.md + - src/resources/extensions/gsd/tests/plan-milestone.test.ts +key_decisions: + - Replaced invalid ESM export monkey-patching in `plan-milestone.test.ts` with observable integration assertions that verify cache-clearing effects through real roadmap parse state. + - Used the repository’s resolver-based TypeScript harness as the authoritative S01 proof path because it is the only truthful way to execute the targeted source tests in this repo. +duration: "" +verification_result: passed +completed_at: 2026-03-23T15:43:33.011Z +blocker_discovered: false +observability_surfaces: + - src/resources/extensions/gsd/tests/plan-milestone.test.ts + - src/resources/extensions/gsd/tests/markdown-renderer.test.ts + - stderr warning|stale renderer diagnostic test path + - parse-visible roadmap state before/after handler execution in integration assertions +--- + +# T04: Finalize S01 regression coverage and prove the DB-backed planning slice end to end + +**Finalize S01 regression coverage and prove the DB-backed planning slice end to end** + +## What Happened + +I executed the T04 closeout against local repo reality rather than the stale plan snapshot. First I fixed the mandatory pre-flight gap in `.gsd/milestones/M001/slices/S01/tasks/T04-PLAN.md` by adding an `## Observability Impact` section so the task documents how future agents inspect failures. 
I then read the five target test surfaces and confirmed the remaining real defect was the unfinished T02 cache-invalidation coverage in `src/resources/extensions/gsd/tests/plan-milestone.test.ts`: two tests still attempted to monkey-patch imported ESM bindings, which is not a valid harness seam. I replaced those brittle tests with observable integration assertions that prove the same contract truthfully: render failures do not advance parse-visible roadmap state, and successful milestone planning clears parse-visible roadmap state so subsequent reads reflect the newly rendered DB-backed roadmap. My first replacement hypothesis was wrong because `handlePlanMilestone()` inserts the requested milestone before rendering, so a mismatched milestone ID does not fail render. I corrected that by inducing a real write-path render failure through the fallback roadmap target path and re-ran the focused suite. After that passed, I ran the full targeted S01 regression suite under the repository’s actual TypeScript resolver harness and then ran the slice’s explicit renderer failure-path check (`stderr warning|stale`) separately. Both passed cleanly. The slice now has integrated regression proof across schema migration, handler behavior, roadmap rendering, prompt contracts, and rogue-write detection, with the failure-path renderer diagnostics also exercised directly. + +## Verification + +Verified the final S01 slice proof set under the repository’s real TypeScript test harness (`--import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types`). First ran the focused handler suite to confirm the rewritten plan-milestone cache/renderer assertions passed. Then ran the combined targeted S01 suite covering `plan-milestone.test.ts`, `markdown-renderer.test.ts`, `prompt-contracts.test.ts`, `rogue-file-detection.test.ts`, and `migrate-hierarchy.test.ts`; all tests passed. 
Finally ran `markdown-renderer.test.ts` again with `--test-name-pattern="stderr warning|stale"` to prove the slice-level diagnostic/failure-path checks pass explicitly. This verifies schema migration/backfill coverage, the DB-backed milestone planning write path, roadmap rendering from DB state, planning prompt migration, rogue detection for roadmap/plan bypasses, and renderer observability surfaces together. + +## Verification Evidence + +| # | Command | Exit Code | Verdict | Duration | +|---|---------|-----------|---------|----------| +| 1 | `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/plan-milestone.test.ts` | 0 | ✅ pass | 164ms | +| 2 | `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/plan-milestone.test.ts src/resources/extensions/gsd/tests/markdown-renderer.test.ts src/resources/extensions/gsd/tests/prompt-contracts.test.ts src/resources/extensions/gsd/tests/rogue-file-detection.test.ts src/resources/extensions/gsd/tests/migrate-hierarchy.test.ts` | 0 | ✅ pass | 1650ms | +| 3 | `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/markdown-renderer.test.ts --test-name-pattern="stderr warning|stale"` | 0 | ✅ pass | 195ms | + + +## Deviations + +Used the repository’s actual resolver-based TypeScript test harness instead of bare `node --test` because this source tree’s `.ts` tests depend on the resolver import for truthful execution. Also adapted the stale T02 cache tests to assert observable behavior rather than illegal ESM export reassignment. No scope deviation beyond those local-reality corrections. + +## Known Issues + +None. 
+ +## Diagnostics + +- Run the integrated slice proof with `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/plan-milestone.test.ts src/resources/extensions/gsd/tests/markdown-renderer.test.ts src/resources/extensions/gsd/tests/prompt-contracts.test.ts src/resources/extensions/gsd/tests/rogue-file-detection.test.ts src/resources/extensions/gsd/tests/migrate-hierarchy.test.ts`. +- Run `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/markdown-renderer.test.ts --test-name-pattern="stderr warning|stale"` to inspect the dedicated failure-path and stale-render diagnostics. +- Use `src/resources/extensions/gsd/tests/plan-milestone.test.ts` as the durable seam for cache-invalidation behavior; it now proves observable state changes instead of relying on illegal ESM export reassignment. + +## Files Created/Modified + +- `.gsd/milestones/M001/slices/S01/tasks/T04-PLAN.md` +- `src/resources/extensions/gsd/tests/plan-milestone.test.ts` diff --git a/.gsd/milestones/M001/slices/S01/tasks/T04-VERIFY.json b/.gsd/milestones/M001/slices/S01/tasks/T04-VERIFY.json new file mode 100644 index 000000000..8d6f5747e --- /dev/null +++ b/.gsd/milestones/M001/slices/S01/tasks/T04-VERIFY.json @@ -0,0 +1,18 @@ +{ + "schemaVersion": 1, + "taskId": "T04", + "unitId": "M001/S01/T04", + "timestamp": 1774280619727, + "passed": false, + "discoverySource": "package-json", + "checks": [ + { + "command": "npm run test", + "exitCode": 1, + "durationMs": 39485, + "verdict": "fail" + } + ], + "retryAttempt": 1, + "maxRetries": 2 +} diff --git a/.gsd/milestones/M001/slices/S02/S02-PLAN.md b/.gsd/milestones/M001/slices/S02/S02-PLAN.md new file mode 100644 index 000000000..a5b733992 --- /dev/null +++ b/.gsd/milestones/M001/slices/S02/S02-PLAN.md @@ -0,0 +1,74 @@ +# S02: plan_slice + plan_task tools + PLAN/task-plan renderers + +**Goal:** 
Add DB-backed slice and task planning write paths that persist flat planning payloads, render parse-compatible `S##-PLAN.md` and `tasks/T##-PLAN.md` artifacts from DB state, and keep task plan files present on disk so planning/execution recovery continues to work. +**Demo:** Running the S02 planning proof writes slice/task planning data through `gsd_plan_slice` and `gsd_plan_task`, regenerates `S02-PLAN.md` and `tasks/T01-PLAN.md`/`tasks/T02-PLAN.md` from DB, and passes runtime checks that reject missing task plan files. + +## Must-Haves + +- `gsd_plan_slice` validates a flat payload, requires an existing slice, writes slice planning plus task rows transactionally, renders `S##-PLAN.md`, and clears both state and parse caches. (R003) +- `gsd_plan_task` validates a flat payload, requires an existing parent slice, writes task planning fields, renders `tasks/T##-PLAN.md`, and clears both caches. (R004) +- `renderPlanFromDb()` and `renderTaskPlanFromDb()` emit markdown that still round-trips through `parsePlan()` / `parseTaskPlanFile()` and satisfies `auto-recovery.ts` plan-slice artifact checks, including on-disk task plan existence. (R008, R019) +- Prompt and tool registration surfaces expose the new DB-backed planning path instead of leaving slice/task planning as direct file writes. 
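The handler contract in the Must-Haves above (validate, require an existing slice, write transactionally, render, clear caches) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the real `gsd_plan_slice` handler: `planSliceSketch`, `SketchDb`, `parseCacheSketch`, and the in-memory maps are hypothetical stand-ins for the SQLite layer, renderer, and cache helpers.

```typescript
// Hypothetical sketch only: these names are NOT the real GSD implementation.
type TaskInput = { taskId: string; title: string };
type PlanSliceParams = { sliceId: string; goal: string; tasks: TaskInput[] };
type PlanResult = { ok: true; rendered: string } | { ok: false; error: string };

interface SketchDb {
  slices: Map<string, { goal: string }>;
  tasks: Map<string, TaskInput>;
}

// Stand-in for the parse cache that must be cleared after a successful render.
const parseCacheSketch = new Map<string, string>();

function planSliceSketch(db: SketchDb, params: PlanSliceParams): PlanResult {
  // 1. Validate the flat payload before touching any state.
  if (!params.sliceId || !params.goal || params.tasks.length === 0) {
    return { ok: false, error: "validation failed: sliceId, goal, and tasks are required" };
  }

  // 2. Require an existing parent slice.
  if (!db.slices.has(params.sliceId)) {
    return { ok: false, error: `missing parent: slice ${params.sliceId} does not exist` };
  }

  // 3. "Transaction": stage changes on copies and swap them in together,
  //    so slice planning and task rows become visible as one unit.
  const nextSlices = new Map(db.slices);
  const nextTasks = new Map(db.tasks);
  nextSlices.set(params.sliceId, { goal: params.goal });
  for (const task of params.tasks) {
    nextTasks.set(`${params.sliceId}/${task.taskId}`, task);
  }
  db.slices = nextSlices;
  db.tasks = nextTasks;

  // 4. Render markdown from the stored state, not from the raw input.
  const stored = db.slices.get(params.sliceId);
  const taskLines = params.tasks.map((task) => {
    const row = db.tasks.get(`${params.sliceId}/${task.taskId}`);
    return `- [ ] **${task.taskId}: ${row ? row.title : ""}**`;
  });
  const rendered = [`**Goal:** ${stored ? stored.goal : ""}`, "", "## Tasks", "", ...taskLines].join("\n");

  // 5. Invalidate caches so later reads see the newly rendered plan.
  parseCacheSketch.clear();

  return { ok: true, rendered };
}
```

Returning a result object instead of throwing keeps validation, missing-parent, and render failures distinguishable by their error strings, which is the kind of failure visibility the slice asks its tests to assert.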
+ +## Proof Level + +- This slice proves: integration +- Real runtime required: yes +- Human/UAT required: no + +## Verification + +- `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/plan-slice.test.ts src/resources/extensions/gsd/tests/plan-task.test.ts` +- `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/markdown-renderer.test.ts src/resources/extensions/gsd/tests/auto-recovery.test.ts src/resources/extensions/gsd/tests/prompt-contracts.test.ts --test-name-pattern="plan-slice|plan-task|renderPlanFromDb|renderTaskPlanFromDb|task plan|DB-backed planning"` +- `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/plan-slice.test.ts src/resources/extensions/gsd/tests/plan-task.test.ts --test-name-pattern="validation failed|render failed|cache|missing parent"` + +## Observability / Diagnostics + +- Runtime signals: handler error strings for validation / DB write / render failure, plus stale-render diagnostics from `markdown-renderer.ts` when rendered plan artifacts drift from DB state. +- Inspection surfaces: `src/resources/extensions/gsd/tests/plan-slice.test.ts`, `src/resources/extensions/gsd/tests/plan-task.test.ts`, `src/resources/extensions/gsd/tests/markdown-renderer.test.ts`, `src/resources/extensions/gsd/tests/auto-recovery.test.ts`, and SQLite rows returned by `getSlice()`, `getTask()`, and `getSliceTasks()`. +- Failure visibility: failed handler result payloads, missing `tasks/T##-PLAN.md` artifact assertions, and renderer/parser mismatches surfaced by the resolver-based test harness. +- Redaction constraints: no secrets expected; task-plan frontmatter must expose skill names only, never secret values or environment data. 
+ +## Integration Closure + +- Upstream surfaces consumed: `src/resources/extensions/gsd/tools/plan-milestone.ts`, `src/resources/extensions/gsd/bootstrap/db-tools.ts`, `src/resources/extensions/gsd/gsd-db.ts`, `src/resources/extensions/gsd/markdown-renderer.ts`, `src/resources/extensions/gsd/files.ts`, `src/resources/extensions/gsd/auto-recovery.ts`, and `src/resources/extensions/gsd/prompts/plan-slice.md`. +- New wiring introduced in this slice: canonical tool handlers/registrations for `gsd_plan_slice` and `gsd_plan_task`, DB→markdown renderers for slice and task plans, and prompt-contract coverage that points planning flows at those tools. +- What remains before the milestone is truly usable end-to-end: S03 still needs replan/reassess structural enforcement, and S04 still needs hot-path caller migration plus DB↔rendered cross-validation. + +## Tasks + +I’m splitting this into three tasks because there are three distinct failure boundaries and each needs its own proof. The highest-risk boundary is renderer compatibility: if the generated `PLAN.md` or task-plan markdown drifts from parser/runtime expectations, the rest of the slice is fake progress. That work goes first and includes the runtime contract around `skills_used` frontmatter and task-plan file existence. Once the render target is stable, the handler/registration work becomes straightforward because S01 already established the validation → transaction → render → invalidate pattern. The last task is prompt/tool-surface closure, which is intentionally small but necessary: without it, the system still has a gap between the new DB-backed implementation and the planning instructions/registrations the LLM actually sees. + +- [x] **T01: Add DB-backed slice and task plan renderers with compatibility tests** `est:1.5h` + - Why: This closes the main transition-window risk first: rendered plan artifacts must stay parse-compatible and satisfy runtime recovery checks before any new planning handler can be trusted. 
+ - Files: `src/resources/extensions/gsd/markdown-renderer.ts`, `src/resources/extensions/gsd/tests/markdown-renderer.test.ts`, `src/resources/extensions/gsd/tests/auto-recovery.test.ts`, `src/resources/extensions/gsd/files.ts` + - Do: Implement `renderPlanFromDb()` and `renderTaskPlanFromDb()` using existing DB query helpers, emit slice/task markdown that preserves `parsePlan()` and `parseTaskPlanFile()` expectations, include conservative task-plan frontmatter (`estimated_steps`, `estimated_files`, `skills_used`), and add tests that prove rendered slice plans plus task plan files satisfy `verifyExpectedArtifact("plan-slice", ...)`. + - Verify: `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/markdown-renderer.test.ts src/resources/extensions/gsd/tests/auto-recovery.test.ts --test-name-pattern="renderPlanFromDb|renderTaskPlanFromDb|plan-slice|task plan"` + - Done when: DB rows can be rendered into `S##-PLAN.md` and `tasks/T##-PLAN.md` files that parse cleanly and pass the existing plan-slice runtime artifact checks. +- [x] **T02: Implement and register gsd_plan_slice and gsd_plan_task** `est:1.5h` + - Why: This delivers the actual S02 capability: flat DB-backed planning tools for slices and tasks that write structured planning state, render truthful markdown, and clear stale caches after success. 
+ - Files: `src/resources/extensions/gsd/tools/plan-slice.ts`, `src/resources/extensions/gsd/tools/plan-task.ts`, `src/resources/extensions/gsd/bootstrap/db-tools.ts`, `src/resources/extensions/gsd/gsd-db.ts`, `src/resources/extensions/gsd/tests/plan-slice.test.ts`, `src/resources/extensions/gsd/tests/plan-task.test.ts` + - Do: Follow the S01 handler pattern exactly for both tools, add any missing DB upsert/query helpers needed to populate task planning fields and retrieve slice/task planning state, register canonical tools plus aliases in `db-tools.ts`, and test validation, missing-parent rejection, transactional DB writes, render-failure handling, idempotent reruns, and observable cache invalidation. + - Verify: `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/plan-slice.test.ts src/resources/extensions/gsd/tests/plan-task.test.ts` + - Done when: `gsd_plan_slice` and `gsd_plan_task` exist as registered DB tools, reject malformed input, render plan artifacts after successful writes, and refresh parse-visible state immediately. +- [x] **T03: Close prompt and contract coverage around DB-backed slice planning** `est:45m` + - Why: The implementation is incomplete until the planning prompt/test surface actually points at the new tools and proves the DB-backed route is the expected contract instead of manual markdown edits. + - Files: `src/resources/extensions/gsd/prompts/plan-slice.md`, `src/resources/extensions/gsd/tests/prompt-contracts.test.ts`, `src/resources/extensions/gsd/bootstrap/db-tools.ts`, `src/resources/extensions/gsd/tests/plan-slice-prompt.test.ts` + - Do: Update the slice planning prompt text to require tool-backed planning state when `gsd_plan_slice` / `gsd_plan_task` are available, tighten prompt-contract assertions for the new tools, and add/adjust prompt template tests so the planning surface stays aligned with the registered tool path. 
+ - Verify: `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/prompt-contracts.test.ts src/resources/extensions/gsd/tests/plan-slice-prompt.test.ts --test-name-pattern="plan-slice|plan task|DB-backed"` + - Done when: slice planning prompts and prompt tests explicitly reference the DB-backed slice/task planning tools and no longer leave direct plan-file writes as the intended path. + +## Files Likely Touched + +- `src/resources/extensions/gsd/gsd-db.ts` +- `src/resources/extensions/gsd/markdown-renderer.ts` +- `src/resources/extensions/gsd/tools/plan-slice.ts` +- `src/resources/extensions/gsd/tools/plan-task.ts` +- `src/resources/extensions/gsd/bootstrap/db-tools.ts` +- `src/resources/extensions/gsd/prompts/plan-slice.md` +- `src/resources/extensions/gsd/tests/plan-slice.test.ts` +- `src/resources/extensions/gsd/tests/plan-task.test.ts` +- `src/resources/extensions/gsd/tests/markdown-renderer.test.ts` +- `src/resources/extensions/gsd/tests/auto-recovery.test.ts` +- `src/resources/extensions/gsd/tests/prompt-contracts.test.ts` +- `src/resources/extensions/gsd/tests/plan-slice-prompt.test.ts` diff --git a/.gsd/milestones/M001/slices/S02/S02-RESEARCH.md b/.gsd/milestones/M001/slices/S02/S02-RESEARCH.md new file mode 100644 index 000000000..4443fa8e7 --- /dev/null +++ b/.gsd/milestones/M001/slices/S02/S02-RESEARCH.md @@ -0,0 +1,84 @@ +# S02 — Research + +**Date:** 2026-03-23 + +## Summary + +S02 is targeted research, not deep exploration. The slice is straightforward extension of the S01 pattern: add two DB-backed planning handlers (`gsd_plan_slice`, `gsd_plan_task`), add full DB→markdown renderers for `S##-PLAN.md` and `T##-PLAN.md`, register both tools, and cover the runtime contract that task plan files must still exist on disk. The active requirements this slice directly owns are R003, R004, R008, and R019. 
+ +The main constraint is that this is not just “store more planning fields.” The slice plan file and per-task plan files remain part of the runtime. `auto-recovery.ts` explicitly rejects a `plan-slice` artifact when referenced task plan files are missing, `execute-task` prompt flow expects task plans on disk, and `buildSkillActivationBlock()` consumes `skills_used` from task-plan frontmatter. So the implementation must write DB state and also render both artifact layers truthfully from that state. + +## Recommendation + +Follow the S01 handler pattern exactly: validate flat params → one transaction → render markdown from DB → invalidate both state and parse caches. Reuse the existing `insertSlice`/`upsertSlicePlanning` and `insertTask` primitives in `gsd-db.ts`; do not invent a new storage layer. Add minimal new validation/handler modules and renderer functions rather than refactoring shared infrastructure in this slice. + +Treat `S##-PLAN.md` as a slice-level rendered view from `slices` + `tasks` rows, and `T##-PLAN.md` as a task-level rendered view from one `tasks` row plus fixed frontmatter fields. Preserve existing parser/runtime compatibility instead of optimizing schema shape. That lines up with the `create-gsd-extension` skill rule to extend existing GSD extension primitives rather than introducing parallel abstractions, and with the `test` skill rule to match existing test patterns and immediately verify generated behavior under the repo’s real resolver harness. + +## Implementation Landscape + +### Key Files + +- `src/resources/extensions/gsd/tools/plan-milestone.ts` — canonical planning-tool reference. Establishes the exact validation → transaction → render → `invalidateStateCache()` + `clearParseCache()` flow S02 should mirror. +- `src/resources/extensions/gsd/bootstrap/db-tools.ts` — registers `gsd_plan_milestone`. 
S02 needs parallel registrations for `gsd_plan_slice` and `gsd_plan_task`, with the same execute/error/details shape and canonical-name guidance. +- `src/resources/extensions/gsd/gsd-db.ts` — schema v8 already contains the needed planning columns. `insertSlice`, `upsertSlicePlanning`, `insertTask`, `getSlice`, `getTask`, `getSliceTasks`, and `getMilestoneSlices` already expose most of the storage/query surface S02 needs. +- `src/resources/extensions/gsd/markdown-renderer.ts` — has `renderRoadmapFromDb()` and shared helpers `toArtifactPath()`, `writeAndStore()`, and cache invalidation. Natural place to add `renderPlanFromDb()` and `renderTaskPlanFromDb()`. +- `src/resources/extensions/gsd/templates/plan.md` — authoritative output shape for slice plans. The renderer should emit markdown parse-compatible with this structure, especially the `## Tasks` checkbox lines and `Verify:` field formatting. +- `src/resources/extensions/gsd/templates/task-plan.md` — authoritative task plan structure. Critical fields: frontmatter `estimated_steps`, `estimated_files`, `skills_used`; sections for Description, Steps, Must-Haves, Verification, optional Observability Impact, Inputs, Expected Output. +- `src/resources/extensions/gsd/files.ts` — parser compatibility target. `parsePlan()` still drives transition-window callers, and `parseTaskPlanFile()` only reads task-plan frontmatter today. Rendered files must satisfy these parsers without new parser work in this slice. +- `src/resources/extensions/gsd/auto-recovery.ts` — enforces R019. `verifyExpectedArtifact("plan-slice", ...)` fails when task IDs appear in `S##-PLAN.md` but matching `tasks/T##-PLAN.md` files are missing. +- `src/resources/extensions/gsd/auto-prompts.ts` — `buildSkillActivationBlock()` parses `skills_used` from task-plan frontmatter. If renderer omits or malforms that list, downstream executor prompt routing degrades. 
+- `src/resources/extensions/gsd/prompts/plan-slice.md` — already updated to say DB-backed tool should own state. S02 likely needs prompt contract tightening once tool names exist, but S01 already removed PLAN-as-source-of-truth framing. +- `src/resources/extensions/gsd/tests/plan-milestone.test.ts` — best reference for handler tests: validation failure, DB write success, render failure behavior, idempotent rerun, observable cache invalidation. +- `src/resources/extensions/gsd/tests/markdown-renderer.test.ts` — existing renderer/stale-repair coverage pattern. Best place for slice/task plan render tests and stale detection if needed. +- `src/resources/extensions/gsd/tests/auto-recovery.test.ts` — already proves missing task plan files break `plan-slice` artifact validity. S02 should add integration-style tests that its renderer satisfies this contract. +- `src/resources/extensions/gsd/tests/migrate-hierarchy.test.ts` — confirms legacy markdown import populates planning columns (`goal`, task status/order, etc.). Useful as parity reference when deciding which DB fields the new renderer must expose. + +### Build Order + +1. **Renderer shape first** — implement `renderPlanFromDb()` and `renderTaskPlanFromDb()` in `markdown-renderer.ts` before tool handlers. This is the highest-risk compatibility point because transition-window callers still parse markdown and runtime checks still require plan files on disk. +2. **Slice/task handler implementation second** — add `tools/plan-slice.ts` and `tools/plan-task.ts` following the S01 handler pattern, using existing DB primitives and new renderers. +3. **Tool registration third** — wire both handlers into `bootstrap/db-tools.ts` after handler behavior is stable. +4. **Prompt/test contract updates last** — only after tool names and artifact paths are real. Keep prompt work narrow: assert the prompts reference the DB-backed path and not direct artifact writes. 
+ +This order isolates the root risk first: if rendering is wrong, handlers and prompts still fail the slice. The `debug-like-expert` skill’s “verify, don’t assume” rule applies here — prove rendered files satisfy parser/runtime contracts before layering more orchestration on top. + +### Verification Approach + +Run the repo’s resolver-based TypeScript harness, not bare `node --test`. + +Primary proof command: + +`node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/plan-slice.test.ts src/resources/extensions/gsd/tests/plan-task.test.ts src/resources/extensions/gsd/tests/markdown-renderer.test.ts src/resources/extensions/gsd/tests/auto-recovery.test.ts src/resources/extensions/gsd/tests/prompt-contracts.test.ts` + +What to prove: + +- `plan-slice` handler validates flat params, rejects missing/invalid fields, verifies the slice exists, writes slice planning/task rows, renders `S##-PLAN.md`, and clears both caches. +- `plan-task` handler validates flat params, verifies parent slice exists, writes task planning fields, renders `tasks/T##-PLAN.md`, and clears both caches. +- `renderPlanFromDb()` emits parse-compatible task checkbox entries and slice sections from DB state. +- `renderTaskPlanFromDb()` writes parse-compatible frontmatter with `estimated_steps`, `estimated_files`, and `skills_used`, plus the required markdown sections. +- A rendered slice plan plus rendered task plans satisfies `verifyExpectedArtifact("plan-slice", ...)`. +- Prompt contracts mention the new DB-backed tool path rather than manual file writes, if prompts are changed. + +## Constraints + +- Schema work should stay minimal. `gsd-db.ts` already has the v8 columns needed for slice and task planning (`goal`, `success_criteria`, `proof_level`, `integration_closure`, `observability_impact`, plus task `description`, `estimate`, `files`, `verify`, `inputs`, `expected_output`). 
+- `getSliceTasks()` and `getMilestoneSlices()` still order by `id`, not an explicit sequence column. S02 should not try to solve ordering beyond the current ID-based convention; sequence-aware ordering belongs to S04 per the roadmap. +- Task-plan frontmatter is already a runtime input. `parseTaskPlanFile()` normalizes numeric strings and scalar/list `skills_used`, so rendered output should stay conservative and explicit rather than clever. +- Tool registration in this extension uses TypeBox object schemas in `db-tools.ts`; follow the existing project pattern already present for `gsd_plan_milestone`. + +## Common Pitfalls + +- **Rendering only the slice plan** — R019 will still fail because `auto-recovery.ts` checks that every task listed in `S##-PLAN.md` has a matching `tasks/T##-PLAN.md` file. +- **Forgetting cache invalidation after successful render** — S01 already proved stale parse-visible state is the failure mode; S02 must call both `invalidateStateCache()` and `clearParseCache()` after DB + render success. +- **Writing task plans without `skills_used` frontmatter** — executor prompt skill activation silently loses task-specific skill routing because `buildSkillActivationBlock()` reads that field. +- **Using a new ad hoc markdown format** — transition-window callers still depend on `parsePlan()` and task-plan conventions. Match existing template/test shapes; don’t redesign the documents. 
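The first pitfall above (a slice plan that lists tasks whose plan files are missing) can be illustrated with a small check. This is a hedged sketch, not the real `verifyExpectedArtifact()` logic in `auto-recovery.ts`: `findMissingTaskPlans` is a hypothetical helper, and the task-line regex is an assumption based on the `- [x] **T01: ...**` checkbox shape used in the slice plans in this diff.

```typescript
// Hypothetical helper: NOT the real auto-recovery check, only an illustration.
// Assumes task entries look like "- [ ] **T01: Title**" in the slice plan.
function findMissingTaskPlans(
  slicePlanMarkdown: string,
  existingTaskPlanFiles: string[],
): string[] {
  const taskLine = /^- \[[ xX]\] \*\*(T\d+):/gm;
  const existing = new Set(existingTaskPlanFiles);
  const missing: string[] = [];

  // Walk every task checkbox line and derive the plan file it implies.
  let match: RegExpExecArray | null = taskLine.exec(slicePlanMarkdown);
  while (match !== null) {
    const expected = `tasks/${match[1]}-PLAN.md`;
    if (!existing.has(expected)) {
      missing.push(expected);
    }
    match = taskLine.exec(slicePlanMarkdown);
  }
  return missing;
}
```

A renderer test could call a helper like this after rendering and assert an empty result, then delete one task-plan file and assert that the deleted path is reported.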
+ +## Skills Discovered + +| Technology | Skill | Status | +|------------|-------|--------| +| GSD extension/tooling | `create-gsd-extension` | installed | +| Test execution / harness discipline | `test` | installed | +| Root-cause-first verification | `debug-like-expert` | installed | +| SQLite / migration-heavy planning storage | `npx skills add martinholovsky/claude-skills-generator@sqlite-database-expert -g` | available | +| TypeBox schema authoring | `npx skills add epicenterhq/epicenter@typebox -g` | available | diff --git a/.gsd/milestones/M001/slices/S02/S02-SUMMARY.md b/.gsd/milestones/M001/slices/S02/S02-SUMMARY.md new file mode 100644 index 000000000..10f17c1ab --- /dev/null +++ b/.gsd/milestones/M001/slices/S02/S02-SUMMARY.md @@ -0,0 +1,132 @@ +--- +id: S02 +parent: M001 +milestone: M001 +provides: + - gsd_plan_slice tool handler — DB-backed slice planning write path + - gsd_plan_task tool handler — DB-backed task planning write path + - renderPlanFromDb() — generates S##-PLAN.md from DB state + - renderTaskPlanFromDb() — generates T##-PLAN.md from DB state + - upsertTaskPlanning() — safe planning-field updates on existing task rows + - getSliceTasks() and getTask() query functions with planning fields populated + - Prompt contract tests for plan-slice prompt DB-backed tool references +requires: + - slice: S01 + provides: Schema v8 migration with planning columns on slices/tasks tables + - slice: S01 + provides: Tool handler pattern from plan-milestone.ts (validate → transaction → render → invalidate) + - slice: S01 + provides: renderRoadmapFromDb() and markdown-renderer.ts rendering infrastructure + - slice: S01 + provides: db-tools.ts registration pattern and DB-availability checks +affects: + - S03 + - S04 +key_files: + - src/resources/extensions/gsd/markdown-renderer.ts + - src/resources/extensions/gsd/tools/plan-slice.ts + - src/resources/extensions/gsd/tools/plan-task.ts + - src/resources/extensions/gsd/bootstrap/db-tools.ts + - 
src/resources/extensions/gsd/gsd-db.ts + - src/resources/extensions/gsd/prompts/plan-slice.md + - src/resources/extensions/gsd/tests/plan-slice.test.ts + - src/resources/extensions/gsd/tests/plan-task.test.ts + - src/resources/extensions/gsd/tests/prompt-contracts.test.ts + - src/resources/extensions/gsd/tests/plan-slice-prompt.test.ts + - src/resources/extensions/gsd/tests/markdown-renderer.test.ts + - src/resources/extensions/gsd/tests/auto-recovery.test.ts +key_decisions: + - upsertTaskPlanning() updates planning fields without clobbering execution/completion state on existing task rows + - renderPlanFromDb() eagerly renders all child task-plan files so recovery checks see complete artifact set immediately + - Task-plan frontmatter uses conservative skills_used: [] — skill activation remains execution-time only + - plan-slice.md step 6 names gsd_plan_slice/gsd_plan_task as canonical write path; step 7 is degraded fallback +patterns_established: + - Flat TypeBox validation → parent-existence check → transactional DB write → render → cache invalidation pattern extended from milestone tools to slice/task tools + - Prompt contract tests as regression tripwires for tool-name and framing changes in planning prompts + - Parse-visible state assertions as ESM-safe alternative to spy-based cache invalidation testing +observability_surfaces: + - plan-slice.ts and plan-task.ts handler error payloads — structured failure messages for validation/DB/render failures + - detectStaleRenders() stderr warnings when rendered plan artifacts drift from DB state + - verifyExpectedArtifact('plan-slice', ...) 
— runtime recovery check for task-plan file existence + - SQLite artifacts table rows for rendered S##-PLAN.md and T##-PLAN.md files +drill_down_paths: + - .gsd/milestones/M001/slices/S02/tasks/T01-SUMMARY.md + - .gsd/milestones/M001/slices/S02/tasks/T02-SUMMARY.md + - .gsd/milestones/M001/slices/S02/tasks/T03-SUMMARY.md +duration: "" +verification_result: passed +completed_at: 2026-03-23T16:13:56.461Z +blocker_discovered: false +--- + +# S02: plan_slice + plan_task tools + PLAN/task-plan renderers + +**DB-backed gsd_plan_slice and gsd_plan_task tools write structured planning state to SQLite, render parse-compatible S##-PLAN.md and T##-PLAN.md artifacts, and the plan-slice prompt now names these tools as the canonical write path.** + +## What Happened + +S02 delivered the second layer of the markdown→DB migration: structured write paths for slice and task planning. The work proceeded through three tasks with distinct failure boundaries. + +T01 built the rendering foundation — `renderPlanFromDb()` and `renderTaskPlanFromDb()` in `markdown-renderer.ts`. These read slice/task rows from SQLite and emit markdown that round-trips cleanly through `parsePlan()` and `parseTaskPlanFile()`. The task-plan renderer uses conservative frontmatter (`skills_used: []`) so no speculative values leak from DB state. The slice-plan renderer sources verification/observability content from DB fields when present. Critically, `renderPlanFromDb()` eagerly renders all child task-plan files so `verifyExpectedArtifact("plan-slice", ...)` sees a complete on-disk artifact set immediately. Auto-recovery tests proved rendered task-plan files satisfy the existing file-existence checks, and that deleting a rendered task-plan file correctly fails recovery. + +T02 implemented the actual tool handlers — `handlePlanSlice()` and `handlePlanTask()` — following the S01 pattern: flat TypeBox validation → parent-existence check → transactional DB write → render → cache invalidation. 
A new `upsertTaskPlanning()` helper in `gsd-db.ts` updates planning-specific columns without clobbering completion state, enabling safe replanning of already-executed tasks. Both tools registered in `db-tools.ts` with canonical names (`gsd_plan_slice`, `gsd_plan_task`) plus aliases (`gsd_slice_plan`, `gsd_task_plan`). The test suite covers validation failures, missing-parent rejection, render-failure isolation, idempotent reruns, and parse-visible cache refresh. + +T03 closed the prompt/contract gap. The plan-slice prompt (`plan-slice.md`) was updated to name `gsd_plan_slice` and `gsd_plan_task` as the primary write path (step 6), with direct file writes explicitly positioned as a degraded fallback (step 7). Four new prompt-contract tests and one template-substitution test ensure the tool names and framing survive prompt changes. This completed the transition from "tools are optional" to "tools are the expected default." + +## Verification + +All four slice-level verification commands pass (120/120 tests): + +1. `plan-slice.test.ts` + `plan-task.test.ts` — 10/10: handler validation, parent checks, DB writes, render, cache invalidation, idempotence +2. `markdown-renderer.test.ts` + `auto-recovery.test.ts` + `prompt-contracts.test.ts` filtered to planning patterns — 60/60: renderer round-trip, task-plan file existence, stale-render detection, prompt contract alignment +3. `plan-slice.test.ts` + `plan-task.test.ts` filtered to failure/cache — 10/10: validation failures, render failures, missing-parent rejection, cache refresh +4. 
`prompt-contracts.test.ts` + `plan-slice-prompt.test.ts` filtered to plan-slice/DB-backed — 40/40: tool name assertions, degraded-fallback framing, per-task instruction, template substitution + +## Requirements Advanced + +- R014 — S02 renderers produce the artifacts that S04 cross-validation tests will compare against parsed state +- R015 — Both plan-slice and plan-task handlers invalidate state cache and parse cache after successful render, tested via parse-visible state assertions + +## Requirements Validated + +- R003 — plan-slice.test.ts proves flat payload validation, slice-exists check, DB write, S##-PLAN.md rendering, and cache invalidation +- R004 — plan-task.test.ts proves flat payload validation, parent-slice check, DB write, T##-PLAN.md rendering, and cache invalidation +- R008 — markdown-renderer.test.ts proves renderPlanFromDb() generates parse-compatible S##-PLAN.md and renderTaskPlanFromDb() generates T##-PLAN.md with frontmatter +- R019 — auto-recovery.test.ts proves task-plan files must exist on disk — verifyExpectedArtifact passes with files, fails without + +## New Requirements Surfaced + +None. + +## Requirements Invalidated or Re-scoped + +None. + +## Deviations + +T01 did not edit `src/resources/extensions/gsd/files.ts` — the existing parser contract already accepted the renderer output without changes. T02 added `upsertTaskPlanning()` as a narrow DB helper rather than modifying `insertTask()` semantics, which was not explicitly planned but necessary for safe replanning. The T01 summary had verification_result:mixed because the plan-slice.test.ts and plan-task.test.ts files did not exist yet at T01 execution time; T02 subsequently created them and all pass. + +## Known Limitations + +Task-plan frontmatter uses `skills_used: []` conservatively — skill activation remains execution-time only. The planning tools do not enforce task ordering within a slice; sequence is determined by insertion order. 
Cross-validation tests (DB state vs rendered-then-parsed state) are not yet implemented — that proof is S04's responsibility. + +## Follow-ups + +S03 needs the handler patterns from plan-slice.ts/plan-task.ts as templates for replan_slice and reassess_roadmap tools. S04 needs the query functions (getSliceTasks, getTask) and renderers (renderPlanFromDb, renderTaskPlanFromDb) as inputs for hot-path caller migration and cross-validation tests. + +## Files Created/Modified + +- `src/resources/extensions/gsd/markdown-renderer.ts` — Added renderPlanFromDb() and renderTaskPlanFromDb() — DB-backed renderers for S##-PLAN.md and T##-PLAN.md +- `src/resources/extensions/gsd/tools/plan-slice.ts` — New file — handlePlanSlice() tool handler: validate → DB write → render → cache invalidation +- `src/resources/extensions/gsd/tools/plan-task.ts` — New file — handlePlanTask() tool handler: validate → parent check → DB write → render → cache invalidation +- `src/resources/extensions/gsd/bootstrap/db-tools.ts` — Registered gsd_plan_slice and gsd_plan_task canonical tools plus gsd_slice_plan/gsd_task_plan aliases +- `src/resources/extensions/gsd/gsd-db.ts` — Added upsertTaskPlanning() helper for safe planning-field updates on existing task rows +- `src/resources/extensions/gsd/prompts/plan-slice.md` — Promoted gsd_plan_slice/gsd_plan_task to canonical write path (step 6), direct file writes to degraded fallback (step 7) +- `src/resources/extensions/gsd/tests/plan-slice.test.ts` — New file — 5 handler tests for gsd_plan_slice: validation, parent check, render, idempotence, cache +- `src/resources/extensions/gsd/tests/plan-task.test.ts` — New file — 5 handler tests for gsd_plan_task: validation, parent check, render, idempotence, cache +- `src/resources/extensions/gsd/tests/markdown-renderer.test.ts` — Extended with renderPlanFromDb/renderTaskPlanFromDb round-trip and failure tests +- `src/resources/extensions/gsd/tests/auto-recovery.test.ts` — Extended with rendered task-plan file 
existence and deletion tests for verifyExpectedArtifact +- `src/resources/extensions/gsd/tests/prompt-contracts.test.ts` — Added 4 assertions for plan-slice prompt: tool names, degraded fallback, per-task instruction +- `src/resources/extensions/gsd/tests/plan-slice-prompt.test.ts` — New file — template substitution test proving tool names survive variable replacement +- `.gsd/KNOWLEDGE.md` — Updated stale entry about missing test files, added ESM-safe testing pattern note +- `.gsd/PROJECT.md` — Updated current state to reflect S02 completion diff --git a/.gsd/milestones/M001/slices/S02/S02-UAT.md b/.gsd/milestones/M001/slices/S02/S02-UAT.md new file mode 100644 index 000000000..69348e79d --- /dev/null +++ b/.gsd/milestones/M001/slices/S02/S02-UAT.md @@ -0,0 +1,126 @@ +# S02: plan_slice + plan_task tools + PLAN/task-plan renderers — UAT + +**Milestone:** M001 +**Written:** 2026-03-23T16:13:56.462Z + +# S02: plan_slice + plan_task tools + PLAN/task-plan renderers — UAT + +**Milestone:** M001 +**Written:** 2026-03-23 + +## UAT Type + +- UAT mode: artifact-driven +- Why this mode is sufficient: All S02 deliverables are tool handlers, renderers, and prompt changes that are fully testable via the resolver-harness test suite without a live runtime. The test suite covers round-trip parsing, file-existence checks, and prompt contract assertions. + +## Preconditions + +- Working tree has `src/resources/extensions/gsd/tests/resolve-ts.mjs` available +- Node.js supports `--experimental-strip-types` and `--import` flags +- No other processes hold locks on temp SQLite DBs created by tests + +## Smoke Test + +Run `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/plan-slice.test.ts src/resources/extensions/gsd/tests/plan-task.test.ts` — all 10 tests should pass, confirming both handlers accept valid input, reject invalid input, write to DB, render artifacts, and refresh caches. 
+ +## Test Cases + +### 1. gsd_plan_slice writes planning state and renders S##-PLAN.md + +1. Call `handlePlanSlice()` with a valid payload including milestoneId, sliceId, goal, demo, mustHaves, tasks array, and filesLikelyTouched. +2. Read the slice row from SQLite. +3. Read the rendered `S##-PLAN.md` from disk. +4. Parse the rendered file through `parsePlan()`. +5. **Expected:** DB row contains goal/demo/mustHaves fields. Rendered file exists on disk. Parsed result contains all tasks from the payload. All child `T##-PLAN.md` files exist on disk. + +### 2. gsd_plan_task writes task planning and renders T##-PLAN.md + +1. Create a slice row in DB. +2. Call `handlePlanTask()` with milestoneId, sliceId, taskId, title, why, files, steps, verifyCommand, doneWhen. +3. Read the task row from SQLite. +4. Read the rendered `tasks/T##-PLAN.md` from disk. +5. Parse through `parseTaskPlanFile()`. +6. **Expected:** DB row contains steps/files/verify_command fields. Rendered file has YAML frontmatter with `estimated_steps`, `estimated_files`, `skills_used: []`. Parsed result matches input fields. + +### 3. Rendered plan artifacts satisfy auto-recovery checks + +1. Seed a slice and tasks in DB. +2. Call `renderPlanFromDb()` to write S##-PLAN.md and all T##-PLAN.md files. +3. Call `verifyExpectedArtifact("plan-slice", basePath, milestoneId, sliceId)`. +4. **Expected:** Verification passes — all task-plan files exist and the plan file has real task content. + +### 4. Missing task-plan file fails recovery verification + +1. Render a complete plan from DB (S##-PLAN.md + T##-PLAN.md files). +2. Delete one `T##-PLAN.md` file from disk. +3. Call `verifyExpectedArtifact("plan-slice", ...)`. +4. **Expected:** Verification fails with a clear message about the missing task-plan file. + +### 5. Validation rejects malformed payloads + +1. Call `handlePlanSlice()` with missing required fields (e.g., no `goal`). +2. Call `handlePlanTask()` with missing required fields (e.g., no `taskId`). +3. 
**Expected:** Both return `{ error: true, message: "..." }` with validation failure details. No DB writes. No files created. + +### 6. Missing parent slice is rejected + +1. Call `handlePlanSlice()` with a sliceId that does not exist in DB. +2. Call `handlePlanTask()` with a sliceId that does not exist in DB. +3. **Expected:** Both return error results mentioning the missing parent. No DB writes. + +### 7. Idempotent reruns refresh parse-visible state + +1. Call `handlePlanSlice()` with a valid payload. +2. Call `handlePlanSlice()` again with modified goal text. +3. Read the re-rendered S##-PLAN.md from disk. +4. **Expected:** The file contains the updated goal, not the original. DB row reflects the latest values. + +### 8. plan-slice prompt names DB-backed tools as canonical path + +1. Read `src/resources/extensions/gsd/prompts/plan-slice.md`. +2. Check for `gsd_plan_slice` and `gsd_plan_task` in the text. +3. Check that direct file writes are described as "degraded" or "fallback". +4. **Expected:** Both tool names present. Direct writes framed as fallback, not default. + +## Edge Cases + +### Render failure does not corrupt parse-visible state + +1. Seed a slice and task in DB with a valid plan. +2. Render the initial plan artifacts (S##-PLAN.md + T##-PLAN.md). +3. Simulate a render failure (e.g., invalid basePath). +4. **Expected:** Original files remain on disk unchanged. Error result returned. No cache invalidation occurs for the failed render. + +### Task planning rerun preserves completion state + +1. Insert a task row with `status: 'complete'` and a summary. +2. Call `handlePlanTask()` for the same task with new planning fields. +3. Read the task row from DB. +4. **Expected:** Planning fields (steps, files, verify_command) are updated. Completion fields (status, summary_content, completed_at) are preserved. 
+ +## Failure Signals + +- Any of the 10 `plan-slice.test.ts` / `plan-task.test.ts` tests fail +- `parsePlan()` or `parseTaskPlanFile()` cannot parse rendered artifacts +- `verifyExpectedArtifact("plan-slice", ...)` fails when all task-plan files exist +- Prompt contract tests fail to find `gsd_plan_slice` / `gsd_plan_task` in plan-slice.md + +## Requirements Proved By This UAT + +- R003 — gsd_plan_slice flat tool validates, writes DB, renders S##-PLAN.md, invalidates caches +- R004 — gsd_plan_task flat tool validates, writes DB, renders T##-PLAN.md, invalidates caches +- R008 — renderPlanFromDb() and renderTaskPlanFromDb() generate parse-compatible plan artifacts +- R019 — Task-plan files are generated on disk and validated for existence by auto-recovery + +## Not Proven By This UAT + +- Cross-validation (DB state vs parsed state parity) — deferred to S04 +- Hot-path caller migration from parser reads to DB reads — deferred to S04 +- Replan/reassess structural enforcement — deferred to S03 +- Live auto-mode integration (LLM actually calling these tools in a dispatch loop) — deferred to milestone UAT + +## Notes for Tester + +- All tests use temp directories and in-memory SQLite, so no cleanup needed. +- The resolver-harness (`resolve-ts.mjs`) is required — bare `node --test` may fail on `.js` sibling specifiers. +- T01's verification_result was "mixed" because plan-slice.test.ts didn't exist yet at T01 time. T02 created those files and all pass now. 
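The handler flow that the S02 UAT exercises (flat validation, parent-existence check, DB write, render, cache invalidation only after success) and the "Task planning rerun preserves completion state" edge case can be pictured with a small in-memory sketch. This is an illustration under stated assumptions, not the code in `plan-task.ts` or `gsd-db.ts`: `PlanStore`, `renderTaskPlan`, `handlePlanTaskSketch`, the row shape, and the error messages are all hypothetical stand-ins, and a `Map` takes the place of SQLite.

```typescript
// Hypothetical in-memory model; none of these names are the real GSD API.
type TaskRow = {
  taskId: string;
  sliceId: string;
  steps: string[]; // planning field
  status: "pending" | "complete"; // completion field
  summary: string | null; // completion field
};

type PlanTaskInput = { sliceId?: string; taskId?: string; steps?: string[] };
type Result =
  | { error: true; message: string }
  | { error: false; rendered: string };

class PlanStore {
  slices = new Set<string>();
  tasks = new Map<string, TaskRow>();
  cacheInvalidations = 0;

  // Update planning fields only; completion fields on an existing row survive.
  upsertTaskPlanning(sliceId: string, taskId: string, steps: string[]): TaskRow {
    const key = `${sliceId}/${taskId}`;
    const existing = this.tasks.get(key);
    const row: TaskRow = existing
      ? { ...existing, steps }
      : { taskId, sliceId, steps, status: "pending", summary: null };
    this.tasks.set(key, row);
    return row;
  }
}

// Stand-in renderer: emits a numbered Steps section from the stored row.
function renderTaskPlan(row: TaskRow): string {
  const steps = row.steps.map((s, i) => `${i + 1}. ${s}`).join("\n");
  return `# ${row.taskId}\n\n## Steps\n\n${steps}\n`;
}

function handlePlanTaskSketch(store: PlanStore, input: PlanTaskInput): Result {
  // 1. Flat validation: reject before touching any state.
  if (!input.sliceId || !input.taskId || !input.steps || input.steps.length === 0) {
    return { error: true, message: "validation failed: sliceId, taskId, steps required" };
  }
  // 2. Parent-existence check: the slice must already be known.
  if (!store.slices.has(input.sliceId)) {
    return { error: true, message: `missing parent slice: ${input.sliceId}` };
  }
  // 3. Write planning state without clobbering completion state.
  const row = store.upsertTaskPlanning(input.sliceId, input.taskId, input.steps);
  // 4. Render from the stored row, not from the raw input.
  const rendered = renderTaskPlan(row);
  // 5. Invalidate caches only after write and render both succeeded.
  store.cacheInvalidations += 1;
  return { error: false, rendered };
}
```

In this sketch a rerun against a task already marked `complete` replaces `steps` but leaves `status` and `summary` alone, and a rejected call returns before the invalidation counter moves. Those are the same two properties the UAT checks through DB rows and re-parsed files rather than spies.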
diff --git a/.gsd/milestones/M001/slices/S02/tasks/T01-PLAN.md b/.gsd/milestones/M001/slices/S02/tasks/T01-PLAN.md new file mode 100644 index 000000000..ecb880ea3 --- /dev/null +++ b/.gsd/milestones/M001/slices/S02/tasks/T01-PLAN.md @@ -0,0 +1,58 @@ +--- +estimated_steps: 5 +estimated_files: 4 +skills_used: + - create-gsd-extension + - test + - debug-like-expert +--- + +# T01: Add DB-backed slice and task plan renderers with compatibility tests + +**Slice:** S02 — plan_slice + plan_task tools + PLAN/task-plan renderers +**Milestone:** M001 + +## Description + +Implement the missing DB→markdown renderers for slice plans and task plans before touching tool handlers. This task owns the compatibility boundary for S02: the generated `S##-PLAN.md` and `tasks/T##-PLAN.md` files must still satisfy `parsePlan()`, `parseTaskPlanFile()`, `auto-recovery.ts`, and executor skill activation via `skills_used` frontmatter. + +## Steps + +1. Read the existing renderer helpers in `src/resources/extensions/gsd/markdown-renderer.ts` and the parser/runtime expectations in `src/resources/extensions/gsd/files.ts` and `src/resources/extensions/gsd/auto-recovery.ts`. +2. Implement `renderPlanFromDb()` so it reads slice/task rows from `src/resources/extensions/gsd/gsd-db.ts`, emits a complete slice plan document with goal, demo, must-haves, verification, and task checklist entries, and writes/stores the artifact through the existing renderer helpers. +3. Implement `renderTaskPlanFromDb()` so it emits a task plan file with valid frontmatter fields (`estimated_steps`, `estimated_files`, `skills_used`) and the required markdown sections from the task row. +4. Add renderer tests in `src/resources/extensions/gsd/tests/markdown-renderer.test.ts` covering parse compatibility, DB artifact persistence, and on-disk output shape for both renderers. +5. 
Extend `src/resources/extensions/gsd/tests/auto-recovery.test.ts` to prove a rendered slice plan plus rendered task plan files passes `verifyExpectedArtifact("plan-slice", ...)`, and that missing task-plan files still fail. + +## Must-Haves + +- [ ] `renderPlanFromDb()` generates parse-compatible `S##-PLAN.md` content from DB state. +- [ ] `renderTaskPlanFromDb()` generates parse-compatible `tasks/T##-PLAN.md` content with conservative `skills_used` frontmatter. +- [ ] Renderer tests cover both happy-path rendering and the runtime contract that task plan files must exist on disk for `plan-slice` verification. + +## Verification + +- `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/markdown-renderer.test.ts src/resources/extensions/gsd/tests/auto-recovery.test.ts --test-name-pattern="renderPlanFromDb|renderTaskPlanFromDb|plan-slice|task plan"` +- Inspect the passing assertions in `src/resources/extensions/gsd/tests/markdown-renderer.test.ts` and `src/resources/extensions/gsd/tests/auto-recovery.test.ts` for rendered `PLAN.md` / `T##-PLAN.md` behavior. + +## Observability Impact + +- Signals added/changed: stale-render diagnostics and renderer test assertions now cover slice/task plan artifacts in addition to roadmap/summary artifacts. +- How a future agent inspects this: run the targeted resolver-harness test command above and inspect generated artifacts via `getArtifact()` / disk files from the renderer tests. +- Failure state exposed: parser incompatibility, missing task-plan files, and DB/artifact drift become explicit test failures instead of silent execution-time regressions. 
+ +## Inputs + +- `src/resources/extensions/gsd/markdown-renderer.ts` — existing render helper patterns and artifact persistence hooks +- `src/resources/extensions/gsd/gsd-db.ts` — slice/task query fields available to renderers +- `src/resources/extensions/gsd/files.ts` — parser expectations for `PLAN.md` and task-plan frontmatter +- `src/resources/extensions/gsd/auto-recovery.ts` — runtime artifact checks that the rendered files must satisfy +- `src/resources/extensions/gsd/tests/markdown-renderer.test.ts` — current renderer test patterns to extend +- `src/resources/extensions/gsd/tests/auto-recovery.test.ts` — existing `plan-slice` artifact enforcement tests + +## Expected Output + +- `src/resources/extensions/gsd/markdown-renderer.ts` — new `renderPlanFromDb()` and `renderTaskPlanFromDb()` implementations +- `src/resources/extensions/gsd/tests/markdown-renderer.test.ts` — coverage for slice/task plan rendering and parse compatibility +- `src/resources/extensions/gsd/tests/auto-recovery.test.ts` — coverage proving rendered task-plan files satisfy `plan-slice` runtime checks +- `src/resources/extensions/gsd/files.ts` — only if a parser-facing compatibility adjustment is required by the new truthful renderer output diff --git a/.gsd/milestones/M001/slices/S02/tasks/T01-SUMMARY.md b/.gsd/milestones/M001/slices/S02/tasks/T01-SUMMARY.md new file mode 100644 index 000000000..d8c0973a6 --- /dev/null +++ b/.gsd/milestones/M001/slices/S02/tasks/T01-SUMMARY.md @@ -0,0 +1,66 @@ +--- +id: T01 +parent: S02 +milestone: M001 +key_files: + - src/resources/extensions/gsd/markdown-renderer.ts + - src/resources/extensions/gsd/tests/markdown-renderer.test.ts + - src/resources/extensions/gsd/tests/auto-recovery.test.ts + - .gsd/KNOWLEDGE.md +key_decisions: + - Rendered task-plan files use conservative `skills_used: []` frontmatter so execution-time skill activation remains explicit and no secret-bearing or speculative values are emitted from DB state. 
+ - Slice-plan verification content is sourced from the slice `observability_impact` field when present so the DB-backed renderer preserves inspectable diagnostics/failure-path expectations instead of emitting a placeholder-only section. + - `renderPlanFromDb()` eagerly renders all child task-plan files after writing the slice plan so `verifyExpectedArtifact("plan-slice", ...)` sees a truthful on-disk artifact set immediately. +observability_surfaces: + - "markdown-renderer.ts stderr warnings on stale renders (detectStaleRenders) — visible on stderr when rendered plans drift from DB state" + - "auto-recovery.ts verifyExpectedArtifact('plan-slice', ...) — rejects when task-plan files are missing from disk" + - "SQLite artifacts table rows for S##-PLAN.md and T##-PLAN.md — queryable proof of renderer output" +duration: "" +verification_result: mixed +completed_at: 2026-03-23T15:58:46.134Z +blocker_discovered: false +--- + +# T01: Add DB-backed slice and task plan renderers with compatibility and recovery tests + +**Add DB-backed slice and task plan renderers with compatibility and recovery tests** + +## What Happened + +Implemented DB-backed plan rendering in `src/resources/extensions/gsd/markdown-renderer.ts` by adding `renderPlanFromDb()` and `renderTaskPlanFromDb()`. The slice-plan renderer now reads slice/task rows from SQLite, emits parse-compatible `S##-PLAN.md` content with goal, demo, must-haves, verification, checklist tasks, and files-likely-touched, then persists the artifact to disk and the artifacts table. The task-plan renderer now emits `tasks/T##-PLAN.md` files with conservative YAML frontmatter (`estimated_steps`, `estimated_files`, `skills_used: []`) plus `Steps`, `Inputs`, `Expected Output`, `Verification`, and optional `Observability Impact` sections. 
Extended `markdown-renderer.test.ts` to prove DB-backed plan rendering round-trips through `parsePlan()` and `parseTaskPlanFile()`, writes truthful on-disk artifacts, stores those artifacts in SQLite, and surfaces clear failure behavior for missing task rows. Extended `auto-recovery.test.ts` to prove a rendered slice plan plus rendered task-plan files satisfies `verifyExpectedArtifact("plan-slice", ...)`, and that deleting a rendered task-plan file still fails recovery verification as intended. Also recorded the local verification gotcha in `.gsd/KNOWLEDGE.md`: the slice plan references `plan-slice.test.ts` / `plan-task.test.ts`, but those files are not present in this checkout, so the resolver-harness renderer/recovery/prompt tests are currently the inspectable proof surface for this task. + +## Verification + +Verified the task contract with the targeted resolver-harness command for `markdown-renderer.test.ts` and `auto-recovery.test.ts`; all renderer and recovery assertions passed, including explicit failure-path checks for missing task-plan files and stale-render diagnostics. Ran the broader slice-level resolver-harness command covering `markdown-renderer.test.ts`, `auto-recovery.test.ts`, and `prompt-contracts.test.ts`; it passed and confirmed the DB-backed planning prompt contract remains aligned. Attempted the slice-plan verification command for `plan-slice.test.ts` and `plan-task.test.ts`, then confirmed those referenced files do not exist in this checkout, so that command cannot currently execute here. This is a checkout/test-surface mismatch, not a regression introduced by this task. 
+ +## Verification Evidence + +| # | Command | Exit Code | Verdict | Duration | +|---|---------|-----------|---------|----------| +| 1 | `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/markdown-renderer.test.ts src/resources/extensions/gsd/tests/auto-recovery.test.ts --test-name-pattern="renderPlanFromDb|renderTaskPlanFromDb|plan-slice|task plan"` | 0 | ✅ pass | 693ms | +| 2 | `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/plan-slice.test.ts src/resources/extensions/gsd/tests/plan-task.test.ts` | 1 | ❌ fail | 51ms | +| 3 | `ls src/resources/extensions/gsd/tests/plan-slice.test.ts src/resources/extensions/gsd/tests/plan-task.test.ts src/resources/extensions/gsd/tests/prompt-contracts.test.ts` | 1 | ❌ fail | 0ms | +| 4 | `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/markdown-renderer.test.ts src/resources/extensions/gsd/tests/auto-recovery.test.ts src/resources/extensions/gsd/tests/prompt-contracts.test.ts --test-name-pattern="plan-slice|plan-task|renderPlanFromDb|renderTaskPlanFromDb|task plan|DB-backed planning"` | 0 | ✅ pass | 697ms | + + +## Deviations + +Did not edit `src/resources/extensions/gsd/files.ts`; the existing parser contract already accepted the truthful renderer output. The slice plan’s referenced `plan-slice.test.ts` and `plan-task.test.ts` verification command could not be executed because those files are absent in the working tree, so I documented that local mismatch and used the existing resolver-harness renderer/recovery/prompt tests as the effective proof surface. + +## Known Issues + +The slice plan still references `src/resources/extensions/gsd/tests/plan-slice.test.ts` and `src/resources/extensions/gsd/tests/plan-task.test.ts`, but neither file exists in this checkout. 
Until those tests land, slice-level verification for planning work must rely on the existing `markdown-renderer.test.ts`, `auto-recovery.test.ts`, and related prompt-contract tests. + +## Diagnostics + +- **Rendered artifacts on disk:** Check `S##-PLAN.md` and `tasks/T##-PLAN.md` files in the milestone/slice directory — these are the renderer output and must parse cleanly via `parsePlan()` and `parseTaskPlanFile()`. +- **Artifacts table in SQLite:** Query `SELECT * FROM artifacts WHERE path LIKE '%PLAN.md'` to verify renderer wrote artifact records. +- **Stale render detection:** Run `detectStaleRenders(db, basePath, milestoneId)` — it reports plan checkbox mismatches and missing task summaries on stderr. +- **Recovery verification:** Call `verifyExpectedArtifact("plan-slice", basePath, milestoneId, sliceId)` — returns a diagnostic object with pass/fail plus the list of missing task-plan files. + +## Files Created/Modified + +- `src/resources/extensions/gsd/markdown-renderer.ts` +- `src/resources/extensions/gsd/tests/markdown-renderer.test.ts` +- `src/resources/extensions/gsd/tests/auto-recovery.test.ts` +- `.gsd/KNOWLEDGE.md` diff --git a/.gsd/milestones/M001/slices/S02/tasks/T01-VERIFY.json b/.gsd/milestones/M001/slices/S02/tasks/T01-VERIFY.json new file mode 100644 index 000000000..f41f48982 --- /dev/null +++ b/.gsd/milestones/M001/slices/S02/tasks/T01-VERIFY.json @@ -0,0 +1,18 @@ +{ + "schemaVersion": 1, + "taskId": "T01", + "unitId": "M001/S02/T01", + "timestamp": 1774281533617, + "passed": false, + "discoverySource": "package-json", + "checks": [ + { + "command": "npm run test", + "exitCode": 1, + "durationMs": 11123, + "verdict": "fail" + } + ], + "retryAttempt": 1, + "maxRetries": 2 +} diff --git a/.gsd/milestones/M001/slices/S02/tasks/T02-PLAN.md b/.gsd/milestones/M001/slices/S02/tasks/T02-PLAN.md new file mode 100644 index 000000000..6d08d2635 --- /dev/null +++ b/.gsd/milestones/M001/slices/S02/tasks/T02-PLAN.md @@ -0,0 +1,60 @@ +--- +estimated_steps: 5 
+estimated_files: 6 +skills_used: + - create-gsd-extension + - test + - debug-like-expert +--- + +# T02: Implement and register gsd_plan_slice and gsd_plan_task + +**Slice:** S02 — plan_slice + plan_task tools + PLAN/task-plan renderers +**Milestone:** M001 + +## Description + +Add the actual DB-backed planning tools for slices and tasks, reusing the S01 handler pattern instead of inventing new plumbing. This task should leave the extension with canonical `gsd_plan_slice` and `gsd_plan_task` registrations, flat validation, transactional DB writes, truthful plan rendering, and observable cache invalidation proof. + +## Steps + +1. Read `src/resources/extensions/gsd/tools/plan-milestone.ts` and mirror its validate → transaction → render → invalidate flow for slice/task planning. +2. Add any missing DB helpers in `src/resources/extensions/gsd/gsd-db.ts` needed to upsert slice planning fields, create/update task planning rows, and query the rendered state used by the handlers. +3. Implement `src/resources/extensions/gsd/tools/plan-slice.ts` with flat input validation, parent-slice existence checks, transactional writes of slice planning plus task rows, renderer invocation, and cache invalidation after successful render. +4. Implement `src/resources/extensions/gsd/tools/plan-task.ts` with flat input validation, parent-slice existence checks, task row upsert logic, task-plan rendering, and post-success cache invalidation. +5. Register both tools and any aliases in `src/resources/extensions/gsd/bootstrap/db-tools.ts`, then add focused handler tests in `src/resources/extensions/gsd/tests/plan-slice.test.ts` and `src/resources/extensions/gsd/tests/plan-task.test.ts` for validation, idempotence, render failure behavior, and parse-visible cache updates. + +## Must-Haves + +- [ ] `gsd_plan_slice` exists as a registered DB-backed tool and writes/renders slice planning state from a flat payload. 
+- [ ] `gsd_plan_task` exists as a registered DB-backed tool and writes/renders task planning state from a flat payload. +- [ ] Both handlers invalidate `invalidateStateCache()` and `clearParseCache()` only after successful DB write + render, with observable tests proving parse-visible state updates. + +## Verification + +- `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/plan-slice.test.ts src/resources/extensions/gsd/tests/plan-task.test.ts` +- `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/plan-slice.test.ts src/resources/extensions/gsd/tests/plan-task.test.ts src/resources/extensions/gsd/tests/markdown-renderer.test.ts --test-name-pattern="cache|idempotent|render failed|validation failed|plan-slice|plan-task"` + +## Observability Impact + +- Signals added/changed: new handler error payloads for validation / DB write / render failures, plus observable cache-invalidation assertions for slice/task planning writes. +- How a future agent inspects this: run the targeted plan-slice/plan-task test files and inspect `details.operation`, DB rows, and rendered artifacts captured by those tests. +- Failure state exposed: malformed input, missing parent slice, renderer failure, and stale parse-visible state become direct testable outcomes. 
+ +## Inputs + +- `src/resources/extensions/gsd/tools/plan-milestone.ts` — canonical planning handler pattern from S01 +- `src/resources/extensions/gsd/bootstrap/db-tools.ts` — current DB tool registration surface +- `src/resources/extensions/gsd/gsd-db.ts` — existing slice/task storage and query primitives +- `src/resources/extensions/gsd/markdown-renderer.ts` — renderer functions produced by T01 +- `src/resources/extensions/gsd/tests/plan-milestone.test.ts` — reference shape for planning handler tests +- `src/resources/extensions/gsd/tests/markdown-renderer.test.ts` — renderer proof surfaces the handlers rely on + +## Expected Output + +- `src/resources/extensions/gsd/tools/plan-slice.ts` — DB-backed slice planning handler +- `src/resources/extensions/gsd/tools/plan-task.ts` — DB-backed task planning handler +- `src/resources/extensions/gsd/bootstrap/db-tools.ts` — tool registration for `gsd_plan_slice` and `gsd_plan_task` +- `src/resources/extensions/gsd/gsd-db.ts` — any missing upsert/query helpers for slice/task planning state +- `src/resources/extensions/gsd/tests/plan-slice.test.ts` — slice planning handler regression coverage +- `src/resources/extensions/gsd/tests/plan-task.test.ts` — task planning handler regression coverage diff --git a/.gsd/milestones/M001/slices/S02/tasks/T02-SUMMARY.md b/.gsd/milestones/M001/slices/S02/tasks/T02-SUMMARY.md new file mode 100644 index 000000000..8de1f0d99 --- /dev/null +++ b/.gsd/milestones/M001/slices/S02/tasks/T02-SUMMARY.md @@ -0,0 +1,72 @@ +--- +id: T02 +parent: S02 +milestone: M001 +key_files: + - .gsd/milestones/M001/slices/S02/S02-PLAN.md + - src/resources/extensions/gsd/tools/plan-slice.ts + - src/resources/extensions/gsd/tools/plan-task.ts + - src/resources/extensions/gsd/bootstrap/db-tools.ts + - src/resources/extensions/gsd/gsd-db.ts + - src/resources/extensions/gsd/tests/plan-slice.test.ts + - src/resources/extensions/gsd/tests/plan-task.test.ts +key_decisions: + - Slice/task planning writes use dedicated 
`upsertTaskPlanning()` updates layered on top of `insertTask()` seed rows so rerunning planning does not erase execution/completion fields stored on existing tasks. + - `handlePlanSlice()` follows a DB-first flow that writes slice/task planning rows transactionally, then renders the slice plan plus all task-plan files; cache invalidation remains post-render only, and observability is proven through parse-visible file state rather than internal spies. + - `handlePlanTask()` creates a pending task row only when absent, then updates planning fields and renders the task plan artifact, preserving idempotence for reruns against existing tasks. +observability_surfaces: + - "plan-slice.ts handler error payloads — structured failure messages for validation/DB/render failures returned in tool result" + - "plan-task.ts handler error payloads — structured failure messages for validation/missing-parent/render failures" + - "invalidateStateCache() + clearParseCache() after successful render — ensures callers see fresh state immediately" + - "parse-visible file state — rendered PLAN.md and task-plan files are reparseable proof of handler success" +duration: "" +verification_result: passed +completed_at: 2026-03-23T16:05:04.223Z +blocker_discovered: false +--- + +# T02: Implement DB-backed gsd_plan_slice and gsd_plan_task handlers with registrations and regression tests + +**Implement DB-backed gsd_plan_slice and gsd_plan_task handlers with registrations and regression tests** + +## What Happened + +Implemented the DB-backed slice/task planning write path for S02. I first verified the local contracts in `plan-milestone.ts`, `db-tools.ts`, `gsd-db.ts`, `markdown-renderer.ts`, and the existing renderer/handler tests, then patched the slice plan’s verification section with an explicit diagnostic check because the pre-flight called that gap out. 
Added `src/resources/extensions/gsd/tools/plan-slice.ts` and `src/resources/extensions/gsd/tools/plan-task.ts`, each mirroring the S01 pattern: flat validation, parent-slice existence checks, DB writes, renderer invocation, and cache invalidation only after successful render. In `gsd-db.ts` I added `upsertTaskPlanning()` and extended the planning record shape with optional title support so planning reruns update task planning fields without overwriting completion metadata. In `src/resources/extensions/gsd/bootstrap/db-tools.ts` I registered canonical `gsd_plan_slice` and `gsd_plan_task` tools plus aliases `gsd_slice_plan` and `gsd_task_plan`, with DB-availability checks and structured handler result payloads. Finally, I added focused regression suites in `src/resources/extensions/gsd/tests/plan-slice.test.ts` and `src/resources/extensions/gsd/tests/plan-task.test.ts` covering validation failures, missing-parent rejection, successful DB-backed renders, render-failure behavior, idempotent reruns, and parse-visible cache refresh behavior via reparsed plan artifacts. + +## Verification + +Verified the new handlers with the task’s targeted resolver-harness command for `plan-slice.test.ts` and `plan-task.test.ts`; all validation, parent-check, render-failure, idempotence, and parse-visible cache refresh assertions passed. Then ran the task’s second verification command against `plan-slice.test.ts`, `plan-task.test.ts`, and `markdown-renderer.test.ts` filtered to cache/idempotence/render-failure coverage; it passed and preserved truthful stale-render diagnostics on stderr. Finally ran the broader slice-level verification command including `markdown-renderer.test.ts`, `auto-recovery.test.ts`, and `prompt-contracts.test.ts` filtered to plan-slice/plan-task and DB-backed planning coverage; it passed, confirming the new handlers coexist with existing renderer/recovery/prompt contracts. 
+ +## Verification Evidence + +| # | Command | Exit Code | Verdict | Duration | +|---|---------|-----------|---------|----------| +| 1 | `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/plan-slice.test.ts src/resources/extensions/gsd/tests/plan-task.test.ts` | 0 | ✅ pass | 180ms | +| 2 | `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/plan-slice.test.ts src/resources/extensions/gsd/tests/plan-task.test.ts src/resources/extensions/gsd/tests/markdown-renderer.test.ts --test-name-pattern="cache|idempotent|render failed|validation failed|plan-slice|plan-task"` | 0 | ✅ pass | 228ms | +| 3 | `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/plan-slice.test.ts src/resources/extensions/gsd/tests/plan-task.test.ts src/resources/extensions/gsd/tests/markdown-renderer.test.ts src/resources/extensions/gsd/tests/auto-recovery.test.ts src/resources/extensions/gsd/tests/prompt-contracts.test.ts --test-name-pattern="plan-slice|plan-task|renderPlanFromDb|renderTaskPlanFromDb|task plan|DB-backed planning"` | 0 | ✅ pass | 731ms | + + +## Deviations + +Updated `.gsd/milestones/M001/slices/S02/S02-PLAN.md` with an explicit diagnostic verification command to satisfy the task pre-flight requirement. The implementation reused the existing DB schema and renderer contracts already present locally, so no broader replan was needed. I also added a narrow `upsertTaskPlanning()` DB helper instead of changing `insertTask()` semantics, because planning reruns must not clobber completion-state fields. + +## Known Issues + +None. 
+ +## Diagnostics + +- **Handler test suite:** Run `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/plan-slice.test.ts src/resources/extensions/gsd/tests/plan-task.test.ts` — 10 tests covering validation, parent checks, render failure, idempotence, and cache refresh. +- **Tool registration:** Check `db-tools.ts` for `gsd_plan_slice` and `gsd_plan_task` canonical names plus `gsd_slice_plan` and `gsd_task_plan` aliases. +- **DB query helpers:** `upsertTaskPlanning()` in `gsd-db.ts` — updates planning fields without clobbering completion state. +- **Handler error payloads:** Both handlers return structured `{ error: true, message: string }` on validation/DB/render failures, surfaced in tool result payloads. + +## Files Created/Modified + +- `.gsd/milestones/M001/slices/S02/S02-PLAN.md` +- `src/resources/extensions/gsd/tools/plan-slice.ts` +- `src/resources/extensions/gsd/tools/plan-task.ts` +- `src/resources/extensions/gsd/bootstrap/db-tools.ts` +- `src/resources/extensions/gsd/gsd-db.ts` +- `src/resources/extensions/gsd/tests/plan-slice.test.ts` +- `src/resources/extensions/gsd/tests/plan-task.test.ts` diff --git a/.gsd/milestones/M001/slices/S02/tasks/T02-VERIFY.json b/.gsd/milestones/M001/slices/S02/tasks/T02-VERIFY.json new file mode 100644 index 000000000..d3e582f28 --- /dev/null +++ b/.gsd/milestones/M001/slices/S02/tasks/T02-VERIFY.json @@ -0,0 +1,18 @@ +{ + "schemaVersion": 1, + "taskId": "T02", + "unitId": "M001/S02/T02", + "timestamp": 1774281912502, + "passed": false, + "discoverySource": "package-json", + "checks": [ + { + "command": "npm run test", + "exitCode": 1, + "durationMs": 34647, + "verdict": "fail" + } + ], + "retryAttempt": 1, + "maxRetries": 2 +} diff --git a/.gsd/milestones/M001/slices/S02/tasks/T03-PLAN.md b/.gsd/milestones/M001/slices/S02/tasks/T03-PLAN.md new file mode 100644 index 000000000..0f73975f1 --- /dev/null +++ 
b/.gsd/milestones/M001/slices/S02/tasks/T03-PLAN.md @@ -0,0 +1,53 @@ +--- +estimated_steps: 4 +estimated_files: 4 +skills_used: + - create-gsd-extension + - test +--- + +# T03: Close prompt and contract coverage around DB-backed slice planning + +**Slice:** S02 — plan_slice + plan_task tools + PLAN/task-plan renderers +**Milestone:** M001 + +## Description + +Finish the slice by aligning the planning prompt surface with the new implementation. This task is intentionally smaller: once the renderer and handlers exist, the remaining risk is the LLM still being told to treat direct markdown writes as normal. Tighten the prompt wording and contract tests so the DB-backed slice/task planning route is the explicit expected behavior. + +## Steps + +1. Read the current planning prompt text in `src/resources/extensions/gsd/prompts/plan-slice.md` and the existing assertions in `src/resources/extensions/gsd/tests/prompt-contracts.test.ts` and `src/resources/extensions/gsd/tests/plan-slice-prompt.test.ts`. +2. Update `src/resources/extensions/gsd/prompts/plan-slice.md` to explicitly direct slice/task planning through `gsd_plan_slice` and `gsd_plan_task` when the tool path exists, while preserving the existing decomposition instructions and output requirements. +3. Extend prompt contract tests so they assert the new tool-backed instructions and reject regressions back to manual `PLAN.md` / task-plan writes as the intended source of truth. +4. Update prompt template tests if needed so variable substitution and template integrity still pass with the new instructions. + +## Must-Haves + +- [ ] `plan-slice.md` explicitly points planning at `gsd_plan_slice` / `gsd_plan_task` instead of only warning about direct `PLAN.md` writes. +- [ ] Prompt contract tests fail if the DB-backed slice/task planning tool instructions regress. +- [ ] Prompt template tests still pass after the wording change. 
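The contract described in the must-haves above can be sketched as a simple presence check. This is a minimal illustration only: `missingToolReferences` and the sample prompt strings are hypothetical and are not taken from `prompt-contracts.test.ts`; only the two tool names come from the plan.

```typescript
// Canonical tool names, as named in the task plan.
const REQUIRED_TOOLS: string[] = ["gsd_plan_slice", "gsd_plan_task"];

// Hypothetical helper: returns the tool names that the prompt text fails to mention.
// An empty array means the prompt still satisfies the contract.
function missingToolReferences(
  promptText: string,
  tools: string[] = REQUIRED_TOOLS,
): string[] {
  return tools.filter((name) => !promptText.includes(name));
}

// Illustrative prompt fragments (assumed wording, not the real plan-slice.md).
const toolBackedPrompt =
  "Call gsd_plan_slice with the full payload, then gsd_plan_task for each task.";
const regressedPrompt = "Write PLAN.md directly to disk.";

console.log(missingToolReferences(toolBackedPrompt)); // no names missing
console.log(missingToolReferences(regressedPrompt)); // both names missing
```

A real contract test would read the prompt file from disk and fail when the returned array is non-empty, which is how a wording regression would surface.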
+ +## Verification + +- `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/prompt-contracts.test.ts src/resources/extensions/gsd/tests/plan-slice-prompt.test.ts --test-name-pattern="plan-slice|plan task|DB-backed"` +- Read the relevant assertions in `src/resources/extensions/gsd/tests/prompt-contracts.test.ts` to confirm they mention `gsd_plan_slice` / `gsd_plan_task`. + +## Inputs + +- `src/resources/extensions/gsd/prompts/plan-slice.md` — current slice planning prompt +- `src/resources/extensions/gsd/tests/prompt-contracts.test.ts` — prompt regression contract tests +- `src/resources/extensions/gsd/tests/plan-slice-prompt.test.ts` — template substitution/integrity tests +- `src/resources/extensions/gsd/bootstrap/db-tools.ts` — canonical tool names to reference in the prompt/tests + +## Expected Output + +- `src/resources/extensions/gsd/prompts/plan-slice.md` — updated DB-backed slice/task planning instructions +- `src/resources/extensions/gsd/tests/prompt-contracts.test.ts` — stronger prompt contract coverage for `gsd_plan_slice` / `gsd_plan_task` +- `src/resources/extensions/gsd/tests/plan-slice-prompt.test.ts` — updated template tests if prompt wording changes affect expectations + +## Observability Impact + +- **Signals changed:** The planning prompt now explicitly names `gsd_plan_slice` and `gsd_plan_task` tools, so any agent following the prompt will emit structured tool calls instead of raw file writes — making planning actions observable via tool-call logs rather than implicit file-write patterns. +- **Inspection surface:** `prompt-contracts.test.ts` assertions referencing the canonical tool names serve as the regression tripwire; if the prompt text drifts back to manual-write instructions, these tests fail immediately. 
+- **Failure visibility:** A regression in the prompt wording (removing tool references or re-introducing manual write instructions) is caught by the contract tests before it reaches production prompt surfaces. diff --git a/.gsd/milestones/M001/slices/S02/tasks/T03-SUMMARY.md b/.gsd/milestones/M001/slices/S02/tasks/T03-SUMMARY.md new file mode 100644 index 000000000..fcdf1ad23 --- /dev/null +++ b/.gsd/milestones/M001/slices/S02/tasks/T03-SUMMARY.md @@ -0,0 +1,69 @@ +--- +id: T03 +parent: S02 +milestone: M001 +key_files: + - src/resources/extensions/gsd/prompts/plan-slice.md + - src/resources/extensions/gsd/tests/prompt-contracts.test.ts + - src/resources/extensions/gsd/tests/plan-slice-prompt.test.ts + - .gsd/milestones/M001/slices/S02/tasks/T03-PLAN.md +key_decisions: + - The plan-slice prompt now uses `gsd_plan_slice` and `gsd_plan_task` as the primary numbered step (step 6) instead of a conditional afterthought (old step 8), with direct file writes explicitly labeled as a degraded fallback (step 7). 
+observability_surfaces: + - "prompt-contracts.test.ts — 4 new assertions for plan-slice prompt DB-backed tool references, degraded-fallback framing, and per-task tool call instruction" + - "plan-slice-prompt.test.ts — template substitution test proving tool names survive variable replacement" + - "plan-slice.md prompt text — explicit step 6 naming gsd_plan_slice/gsd_plan_task as canonical path" +duration: "" +verification_result: passed +completed_at: 2026-03-23T16:08:41.655Z +blocker_discovered: false +--- + +# T03: Update plan-slice prompt to explicitly name gsd_plan_slice/gsd_plan_task as canonical write path, add prompt contract and template regression tests + +**Update plan-slice prompt to explicitly name gsd_plan_slice/gsd_plan_task as canonical write path, add prompt contract and template regression tests** + +## What Happened + +Updated `src/resources/extensions/gsd/prompts/plan-slice.md` to replace the vague "if the tool path for this planning phase is available" language with explicit instructions naming `gsd_plan_slice` and `gsd_plan_task` as the canonical DB-backed write path for slice and task planning. The new step 6 instructs calling `gsd_plan_slice` with the full payload and `gsd_plan_task` for each task. Step 7 positions direct file writes as an explicitly degraded fallback path only used when the tools are unavailable, not the default. Removed the old step 8 that vaguely referenced "the tool path" and fixed step numbering. + +Added 4 new prompt contract tests in `prompt-contracts.test.ts`: one verifying both tool names appear and the "canonical write path" language is present, one verifying direct file writes are framed as "degraded path, not the default", one verifying the prompt no longer has a bare "Write `{{outputPath}}`" as a primary numbered step, and one verifying the prompt instructs calling `gsd_plan_task` for each task. 
+ +Added 1 new template substitution test in `plan-slice-prompt.test.ts` confirming the tool names and canonical language survive variable substitution. + +Also applied the task-plan pre-flight fix by adding an `## Observability Impact` section to T03-PLAN.md explaining how the prompt change makes planning actions observable via tool-call logs and how the contract tests serve as regression tripwires. + +## Verification + +Ran all three slice-level verification commands: (1) plan-slice.test.ts + plan-task.test.ts — 10/10 pass, (2) markdown-renderer.test.ts + auto-recovery.test.ts + prompt-contracts.test.ts filtered to planning patterns — 60/60 pass, (3) plan-slice.test.ts + plan-task.test.ts filtered to failure/cache/validation — 10/10 pass. Also ran the task-level verification command (prompt-contracts.test.ts + plan-slice-prompt.test.ts filtered to plan-slice|plan task|DB-backed) — 40/40 pass. Read back the prompt-contracts.test.ts assertions and confirmed they explicitly reference gsd_plan_slice and gsd_plan_task. 
+ +## Verification Evidence + +| # | Command | Exit Code | Verdict | Duration | +|---|---------|-----------|---------|----------| +| 1 | `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/prompt-contracts.test.ts src/resources/extensions/gsd/tests/plan-slice-prompt.test.ts --test-name-pattern="plan-slice|plan task|DB-backed"` | 0 | ✅ pass | 126ms | +| 2 | `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/plan-slice.test.ts src/resources/extensions/gsd/tests/plan-task.test.ts` | 0 | ✅ pass | 180ms | +| 3 | `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/markdown-renderer.test.ts src/resources/extensions/gsd/tests/auto-recovery.test.ts src/resources/extensions/gsd/tests/prompt-contracts.test.ts --test-name-pattern="plan-slice|plan-task|renderPlanFromDb|renderTaskPlanFromDb|task plan|DB-backed planning"` | 0 | ✅ pass | 695ms | +| 4 | `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/plan-slice.test.ts src/resources/extensions/gsd/tests/plan-task.test.ts --test-name-pattern="validation failed|render failed|cache|missing parent"` | 0 | ✅ pass | 180ms | + + +## Deviations + +None. + +## Known Issues + +None. + +## Diagnostics + +- **Prompt contract tests:** Run `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/prompt-contracts.test.ts --test-name-pattern="plan-slice"` — verifies tool names, degraded-fallback framing, and per-task instruction in the prompt. 
+- **Template substitution test:** Run `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/plan-slice-prompt.test.ts` — confirms DB-backed tool names survive variable substitution. +- **Prompt source:** Read `src/resources/extensions/gsd/prompts/plan-slice.md` — step 6 names `gsd_plan_slice` and `gsd_plan_task` as canonical; step 7 is degraded fallback. + +## Files Created/Modified + +- `src/resources/extensions/gsd/prompts/plan-slice.md` +- `src/resources/extensions/gsd/tests/prompt-contracts.test.ts` +- `src/resources/extensions/gsd/tests/plan-slice-prompt.test.ts` +- `.gsd/milestones/M001/slices/S02/tasks/T03-PLAN.md` diff --git a/.gsd/milestones/M001/slices/S02/tasks/T03-VERIFY.json b/.gsd/milestones/M001/slices/S02/tasks/T03-VERIFY.json new file mode 100644 index 000000000..c488831cd --- /dev/null +++ b/.gsd/milestones/M001/slices/S02/tasks/T03-VERIFY.json @@ -0,0 +1,18 @@ +{ + "schemaVersion": 1, + "taskId": "T03", + "unitId": "M001/S02/T03", + "timestamp": 1774282125185, + "passed": false, + "discoverySource": "package-json", + "checks": [ + { + "command": "npm run test", + "exitCode": 1, + "durationMs": 39009, + "verdict": "fail" + } + ], + "retryAttempt": 1, + "maxRetries": 2 +} diff --git a/.gsd/milestones/M001/slices/S03/S03-PLAN.md b/.gsd/milestones/M001/slices/S03/S03-PLAN.md new file mode 100644 index 000000000..514fb6e68 --- /dev/null +++ b/.gsd/milestones/M001/slices/S03/S03-PLAN.md @@ -0,0 +1,91 @@ +# S03: replan_slice + reassess_roadmap with structural enforcement + +**Goal:** `gsd_replan_slice` rejects mutations to completed tasks, `gsd_reassess_roadmap` rejects mutations to completed slices. Both write to DB tables (replan_history, assessments), render REPLAN.md/ASSESSMENT.md from DB, and re-render PLAN.md/ROADMAP.md after mutations. 
+**Demo:** Tests prove that calling replan with a completed task ID returns a structural rejection error, while modifying only incomplete tasks succeeds. Similarly, calling reassess with a completed slice ID returns a rejection error, while modifying only pending slices succeeds. Rendered REPLAN.md and ASSESSMENT.md artifacts exist on disk. Prompts name `gsd_replan_slice` and `gsd_reassess_roadmap` as the canonical tool paths. + +## Must-Haves + +- `handleReplanSlice` structurally rejects mutations (update or remove) to completed tasks +- `handleReplanSlice` writes `replan_history` row, applies task mutations, re-renders PLAN.md + task plans, renders REPLAN.md +- `handleReassessRoadmap` structurally rejects mutations (modify or remove) to completed slices +- `handleReassessRoadmap` writes `assessments` row, applies slice mutations, re-renders ROADMAP.md, renders ASSESSMENT.md +- Both handlers follow validate → enforce → transaction → render → invalidate pattern +- Both handlers invalidate state cache and parse cache after success +- `replan-slice.md` and `reassess-roadmap.md` prompts name the new tools as canonical write path +- Prompt contract tests assert tool name presence in both prompts +- DB helper functions: `insertReplanHistory()`, `insertAssessment()`, `deleteTask()`, `deleteSlice()` +- Renderers: `renderReplanFromDb()`, `renderAssessmentFromDb()` + +## Proof Level + +- This slice proves: contract +- Real runtime required: no +- Human/UAT required: no + +## Verification + +```bash +# Primary proof — replan handler: validation, structural enforcement, DB writes, rendering +node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/replan-handler.test.ts + +# Primary proof — reassess handler: validation, structural enforcement, DB writes, rendering +node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test 
src/resources/extensions/gsd/tests/reassess-handler.test.ts + +# Prompt contracts — verify prompts reference new tool names +node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/prompt-contracts.test.ts + +# Full regression — existing tests still pass +node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/plan-milestone.test.ts src/resources/extensions/gsd/tests/plan-slice.test.ts src/resources/extensions/gsd/tests/plan-task.test.ts src/resources/extensions/gsd/tests/markdown-renderer.test.ts src/resources/extensions/gsd/tests/rogue-file-detection.test.ts + +# Diagnostic — verify structured error payloads name specific task/slice IDs in rejection messages +# (covered by replan-handler.test.ts "structured error payloads" and reassess-handler.test.ts equivalents) +grep -c "structured error payloads" src/resources/extensions/gsd/tests/replan-handler.test.ts src/resources/extensions/gsd/tests/reassess-handler.test.ts +``` + +## Observability / Diagnostics + +- Runtime signals: Handler error payloads include structured rejection messages naming the specific completed task/slice IDs that blocked the mutation +- Inspection surfaces: `replan_history` and `assessments` DB tables can be queried directly; rendered REPLAN.md and ASSESSMENT.md artifacts on disk +- Failure visibility: Validation errors, structural rejection errors, render failures all return distinct `{ error: string }` payloads with actionable messages + +## Integration Closure + +- Upstream surfaces consumed: `gsd-db.ts` query functions (`getSliceTasks`, `getTask`, `getSlice`, `getMilestoneSlices`, `getMilestone`), `gsd-db.ts` mutation functions (`upsertTaskPlanning`, `upsertSlicePlanning`, `insertTask`, `insertSlice`, `transaction`), `markdown-renderer.ts` renderers (`renderPlanFromDb`, `renderRoadmapFromDb`, `writeAndStore` pattern), `files.ts` 
(`clearParseCache`), `state.ts` (`invalidateStateCache`) +- New wiring introduced in this slice: `tools/replan-slice.ts` and `tools/reassess-roadmap.ts` handler modules, tool registrations in `db-tools.ts`, prompt template references to `gsd_replan_slice` and `gsd_reassess_roadmap` +- What remains before the milestone is truly usable end-to-end: S04 hot-path caller migration, S05 flag file migration, S06 parser deprecation + +## Tasks + +- [x] **T01: Implement replan_slice handler with structural enforcement** `est:1h` + - Why: Delivers R005 — the core replan handler that queries DB for completed tasks and structurally rejects mutations to them. Also adds required DB helpers (`insertReplanHistory`, `deleteTask`, `deleteSlice`) and the REPLAN.md renderer that all downstream work depends on. + - Files: `src/resources/extensions/gsd/gsd-db.ts`, `src/resources/extensions/gsd/tools/replan-slice.ts`, `src/resources/extensions/gsd/markdown-renderer.ts`, `src/resources/extensions/gsd/tests/replan-handler.test.ts` + - Do: (1) Add `insertReplanHistory()`, `insertAssessment()`, `deleteTask()`, `deleteSlice()` to `gsd-db.ts`. `deleteTask` must first delete from `verification_evidence` (FK constraint) before deleting the task row. `deleteSlice` must delete all child tasks' evidence, then child tasks, then the slice. (2) Add `renderReplanFromDb()` and `renderAssessmentFromDb()` to `markdown-renderer.ts` — both use `writeAndStore()` pattern. REPLAN.md should contain the blocker description, what changed, and the updated task list. ASSESSMENT.md should contain the verdict, assessment text, and slice changes. (3) Create `tools/replan-slice.ts` with `handleReplanSlice()`. Params: milestoneId, sliceId, blockerTaskId, blockerDescription, whatChanged, updatedTasks array (taskId, title, description, estimate, files, verify, inputs, expectedOutput), removedTaskIds array. Validate flat params. Query `getSliceTasks()` for completed tasks (status === 'complete' or 'done'). 
Reject if any updatedTasks[].taskId or removedTaskIds element matches a completed task. In transaction: write replan_history row, apply task mutations (upsert updated tasks via insertTask+upsertTaskPlanning, delete removed tasks), insert new tasks. After transaction: re-render PLAN.md via `renderPlanFromDb()`, render REPLAN.md via `renderReplanFromDb()`, invalidate caches. (4) Write `tests/replan-handler.test.ts` using `node:test` and the same pattern as `plan-slice.test.ts`. Tests must prove: validation failures, structural rejection of completed task update, structural rejection of completed task removal, successful replan modifying only incomplete tasks, replan_history row persistence, re-rendered PLAN.md correctness, REPLAN.md existence, cache invalidation via parse-visible state. + - Verify: `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/replan-handler.test.ts` + - Done when: All replan handler tests pass, including structural rejection of completed-task mutations and successful replan of incomplete tasks with DB persistence and rendered artifacts. + +- [x] **T02: Implement reassess_roadmap handler with structural enforcement** `est:45m` + - Why: Delivers R006 — the reassess handler that queries DB for completed slices and structurally rejects mutations to them. Reuses DB helpers from T01 and the ASSESSMENT.md renderer. + - Files: `src/resources/extensions/gsd/tools/reassess-roadmap.ts`, `src/resources/extensions/gsd/tests/reassess-handler.test.ts` + - Do: (1) Create `tools/reassess-roadmap.ts` with `handleReassessRoadmap()`. Params: milestoneId, completedSliceId (the slice that just finished), verdict, assessment (text), sliceChanges object with: modified array (sliceId, title, risk, depends, demo), added array (same shape), removed array (sliceId strings). Validate flat params. Query `getMilestoneSlices()` for completed slices (status === 'complete' or 'done'). 
Reject if any modified[].sliceId or removed[] element matches a completed slice. In transaction: write assessments row (path as PK = ASSESSMENT.md artifact path, milestone_id, status=verdict, scope='roadmap', full_content=assessment text), apply slice mutations (upsert modified via `upsertSlicePlanning`, insert added via `insertSlice`, delete removed via `deleteSlice`). After transaction: re-render ROADMAP.md via `renderRoadmapFromDb()`, render ASSESSMENT.md via `renderAssessmentFromDb()`, invalidate caches. (2) Write `tests/reassess-handler.test.ts` using `node:test`. Tests must prove: validation failures, structural rejection of completed slice modification, structural rejection of completed slice removal, successful reassess modifying only pending slices, assessments row persistence, re-rendered ROADMAP.md correctness, ASSESSMENT.md existence, cache invalidation. + - Verify: `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/reassess-handler.test.ts` + - Done when: All reassess handler tests pass, including structural rejection of completed-slice mutations and successful reassess with DB persistence and rendered artifacts. + +- [ ] **T03: Register tools in db-tools.ts + update prompts + prompt contract tests** `est:30m` + - Why: Connects the handlers to the tool system so auto-mode dispatch can invoke them, and updates prompts to name the tools as canonical write paths. Extends prompt contract tests to catch regressions. 
+ - Files: `src/resources/extensions/gsd/bootstrap/db-tools.ts`, `src/resources/extensions/gsd/prompts/replan-slice.md`, `src/resources/extensions/gsd/prompts/reassess-roadmap.md`, `src/resources/extensions/gsd/tests/prompt-contracts.test.ts` + - Do: (1) Register `gsd_replan_slice` in `db-tools.ts` following the exact pattern of `gsd_plan_slice` — ensureDbOpen check, dynamic import of `../tools/replan-slice.js`, call `handleReplanSlice(params, process.cwd())`, return structured content/details. TypeBox schema matches handler params. Register alias `gsd_slice_replan`. (2) Register `gsd_reassess_roadmap` with alias `gsd_roadmap_reassess` — same pattern, dynamic import of `../tools/reassess-roadmap.js`, call `handleReassessRoadmap(params, process.cwd())`. (3) Update `replan-slice.md` prompt: add a step before the existing file-write instructions that says to use `gsd_replan_slice` tool as the canonical write path when DB-backed tools are available. Position the existing file-write instructions as degraded fallback. Name the specific tool and its parameters. (4) Update `reassess-roadmap.md` prompt: similarly add `gsd_reassess_roadmap` as canonical path. The prompt already has "Do not bypass state with manual roadmap-only edits" — strengthen by naming the specific tool. (5) Add prompt contract tests in `prompt-contracts.test.ts`: assert `replan-slice.md` contains `gsd_replan_slice`, assert `reassess-roadmap.md` contains `gsd_reassess_roadmap`. + - Verify: `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/prompt-contracts.test.ts` + - Done when: Both tools are registered with aliases, both prompts name the canonical tools, and prompt contract tests pass. 
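The structural enforcement step that T01 and T02 describe can be sketched as a pure check that runs before any transaction. This is a sketch under stated assumptions, not the handler itself: `TaskRow`, `Rejection`, and `checkReplanMutations` are hypothetical names, and the real row shape returned by `getSliceTasks()` may differ. Only the two completed-status spellings and the `{ error: string }` payload shape come from the plan.

```typescript
// Hypothetical row and result shapes for illustration only.
type TaskRow = { id: string; status: string };
type Rejection = { error: string };

// The plan treats both status spellings as completed.
const isCompleted = (status: string): boolean =>
  status === "complete" || status === "done";

// Returns a rejection naming every completed task the replan tries to touch,
// or null when all requested mutations target incomplete or new tasks.
function checkReplanMutations(
  existing: TaskRow[],
  updatedTaskIds: string[],
  removedTaskIds: string[],
): Rejection | null {
  const completed = new Set(
    existing.filter((t) => isCompleted(t.status)).map((t) => t.id),
  );
  const blockedUpdates = updatedTaskIds.filter((id) => completed.has(id));
  const blockedRemovals = removedTaskIds.filter((id) => completed.has(id));
  if (blockedUpdates.length === 0 && blockedRemovals.length === 0) return null;

  // Name the specific IDs so the caller gets an actionable message.
  const parts: string[] = [];
  if (blockedUpdates.length > 0) {
    parts.push(`cannot update completed tasks: ${blockedUpdates.join(", ")}`);
  }
  if (blockedRemovals.length > 0) {
    parts.push(`cannot remove completed tasks: ${blockedRemovals.join(", ")}`);
  }
  return { error: parts.join("; ") };
}
```

Keeping the check free of DB and file access means it can run before the transaction opens, so a rejected replan leaves both the database and the rendered artifacts untouched. The same shape would apply to slices in the reassess handler, with slice IDs in place of task IDs.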
+ +## Files Likely Touched + +- `src/resources/extensions/gsd/gsd-db.ts` +- `src/resources/extensions/gsd/markdown-renderer.ts` +- `src/resources/extensions/gsd/tools/replan-slice.ts` (new) +- `src/resources/extensions/gsd/tools/reassess-roadmap.ts` (new) +- `src/resources/extensions/gsd/bootstrap/db-tools.ts` +- `src/resources/extensions/gsd/prompts/replan-slice.md` +- `src/resources/extensions/gsd/prompts/reassess-roadmap.md` +- `src/resources/extensions/gsd/tests/replan-handler.test.ts` (new) +- `src/resources/extensions/gsd/tests/reassess-handler.test.ts` (new) +- `src/resources/extensions/gsd/tests/prompt-contracts.test.ts` diff --git a/.gsd/milestones/M001/slices/S03/S03-RESEARCH.md b/.gsd/milestones/M001/slices/S03/S03-RESEARCH.md new file mode 100644 index 000000000..97aa0b680 --- /dev/null +++ b/.gsd/milestones/M001/slices/S03/S03-RESEARCH.md @@ -0,0 +1,111 @@ +# S03 — Research + +**Date:** 2026-03-23 +**Status:** Ready for planning + +## Summary + +S03 delivers two new tool handlers — `handleReplanSlice` and `handleReassessRoadmap` — that structurally enforce preservation of completed work. The core novelty is **structural rejection**: the replan handler queries the DB for completed tasks and refuses to accept mutations to them, while the reassess handler queries for completed slices and refuses mutations to them. Both write to the existing `replan_history` and `assessments` tables created in S01's schema v8 migration. Both render markdown artifacts (REPLAN.md, ASSESSMENT.md, and re-rendered PLAN.md/ROADMAP.md) from DB state. + +This is a straightforward application of the S01/S02 handler pattern (validate → check completed state → transaction → render → invalidate) with one meaningful new dimension: the structural enforcement logic that inspects task/slice status before accepting writes. The schema tables already exist. The rendering infrastructure already exists. The prompt templates already have placeholder language about DB-backed tools.
The registration pattern is established in `db-tools.ts`. + +## Recommendation + +Follow the exact handler pattern from `plan-slice.ts` and `plan-task.ts`. The two tools have different shapes but identical control flow: + +1. **`handleReplanSlice`** — accepts milestoneId, sliceId, blockerTaskId, blockerDescription, whatChanged, updatedTasks (array), removedTaskIds (array). Queries `getSliceTasks()` to find completed tasks. Rejects if any `updatedTasks[].taskId` matches a completed task. Rejects if any `removedTaskIds` element matches a completed task. Writes `replan_history` row. Applies task mutations (upsert updated, delete removed, insert new). Re-renders PLAN.md and task plans. Renders REPLAN.md. Invalidates caches. + +2. **`handleReassessRoadmap`** — accepts milestoneId, completedSliceId, verdict, assessment, sliceChanges (modified/added/removed/reordered arrays). Queries `getMilestoneSlices()` to find completed slices. Rejects if any modified/removed/reordered slice is completed. Writes `assessments` row. Applies slice mutations (upsert modified, insert added, delete removed, reorder). Re-renders ROADMAP.md. Renders ASSESSMENT.md. Invalidates caches. + +Build order: DB helpers first (insert functions for replan_history and assessments, plus a `deleteTask` function), then handlers, then renderers for REPLAN.md and ASSESSMENT.md, then prompt updates, then tests. Tests are the primary proof surface — they must demonstrate structural rejection of completed-work mutations. + +## Implementation Landscape + +### Key Files + +- `src/resources/extensions/gsd/gsd-db.ts` (1505 lines) — Needs new functions: `insertReplanHistory()`, `insertAssessment()`, `deleteTask()`, `deleteSlice()`, and `updateSliceSequence()` (for reordering). The `replan_history` and `assessments` tables already exist (created in S01 schema v8 migration at lines 321–347). 
Current exports include `getSliceTasks()`, `getTask()`, `getSlice()`, `getMilestoneSlices()`, which provide the completed-state queries. `upsertTaskPlanning()` and `upsertSlicePlanning()` handle mutations to existing rows. `insertTask()` and `insertSlice()` use `INSERT OR IGNORE` — safe for idempotent reruns. + +- `src/resources/extensions/gsd/tools/plan-slice.ts` — Reference handler pattern for replan. Shows validate → parent check → transaction → render → cache invalidation flow. The replan handler follows this pattern but adds: (a) completed-task enforcement before writes, (b) task deletion for removedTaskIds, (c) REPLAN.md rendering. + +- `src/resources/extensions/gsd/tools/plan-milestone.ts` — Reference handler pattern for reassess. Shows how milestone-level mutations work through `upsertMilestonePlanning()` and `upsertSlicePlanning()`, followed by `renderRoadmapFromDb()`. + +- `src/resources/extensions/gsd/markdown-renderer.ts` (currently ~840 lines) — Needs two new renderers: `renderReplanFromDb()` for REPLAN.md and `renderAssessmentFromDb()` for ASSESSMENT.md. Both use the existing `writeAndStore()` helper. PLAN.md re-rendering could use a dedicated `renderReplanedPlanFromDb()`, but reusing `renderPlanFromDb()` directly should suffice because it reads from DB state (which will already reflect the mutations). The existing `renderPlanFromDb()` already handles completed vs incomplete tasks correctly in its checkbox rendering (`task.status === "done" || task.status === "complete"` → `[x]`). + +- `src/resources/extensions/gsd/tools/replan-slice.ts` — **New file.** Handler for `gsd_replan_slice`. Flat params, structural enforcement, DB writes, render, cache invalidation. + +- `src/resources/extensions/gsd/tools/reassess-roadmap.ts` — **New file.** Handler for `gsd_reassess_roadmap`. Flat params, structural enforcement, DB writes, render, cache invalidation. + +- `src/resources/extensions/gsd/bootstrap/db-tools.ts` — Register both new tools following the exact pattern used for `gsd_plan_slice` (lines 386–461). 
Each gets a canonical name (`gsd_replan_slice`, `gsd_reassess_roadmap`) and an alias (`gsd_slice_replan`, `gsd_roadmap_reassess`). + +- `src/resources/extensions/gsd/prompts/replan-slice.md` — Currently instructs direct file writes to `{{replanPath}}` and `{{planPath}}`. Must be updated to instruct `gsd_replan_slice` tool call as canonical path, with direct writes as degraded fallback. The prompt already has a line about DB-backed planning tools (from S01 updates) but doesn't name the specific tool yet. + +- `src/resources/extensions/gsd/prompts/reassess-roadmap.md` — Currently instructs direct writes to `{{assessmentPath}}` and optionally `{{roadmapPath}}`. Must be updated to instruct `gsd_reassess_roadmap` tool call as canonical path. Already has "Do not bypass state with manual roadmap-only edits" language. + +- `src/resources/extensions/gsd/tests/replan-slice.test.ts` — **New file.** Must prove: validation failures, structural rejection of completed task mutations, DB write correctness, REPLAN.md rendering, PLAN.md re-rendering, cache invalidation, idempotent reruns. + +- `src/resources/extensions/gsd/tests/reassess-roadmap.test.ts` — **New file.** Must prove: validation failures, structural rejection of completed slice mutations, DB write correctness, ASSESSMENT.md rendering, ROADMAP.md re-rendering, cache invalidation, idempotent reruns. + +- `src/resources/extensions/gsd/tests/prompt-contracts.test.ts` — Extend with assertions for replan-slice and reassess-roadmap prompts referencing the new tool names. + +### Build Order + +1. **DB helpers first** — `insertReplanHistory()`, `insertAssessment()`, `deleteTask()`, `deleteSlice()` in `gsd-db.ts`. These are pure DB functions with no rendering dependency. They unblock the handlers. + +2. **Renderers** — `renderReplanFromDb()` and `renderAssessmentFromDb()` in `markdown-renderer.ts`. These are simple markdown generators that write REPLAN.md and ASSESSMENT.md via `writeAndStore()`. 
They don't need the handlers to exist. Note: PLAN.md and ROADMAP.md re-rendering already works via existing `renderPlanFromDb()` and `renderRoadmapFromDb()`. + +3. **Handlers** — `handleReplanSlice` and `handleReassessRoadmap` in new tool files. These combine the DB helpers and renderers with the structural enforcement logic. This is where the core proof logic lives. + +4. **Registration + Prompts** — Register in `db-tools.ts`, update prompt templates to name the tools. + +5. **Tests** — Can be written alongside handlers or after. They are the primary proof surface for R005 and R006. + +### Verification Approach + +```bash +# Primary proof — replan handler: validation, structural enforcement, DB writes, rendering +node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/replan-slice.test.ts + +# Primary proof — reassess handler: validation, structural enforcement, DB writes, rendering +node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/reassess-roadmap.test.ts + +# Prompt contracts — verify prompts reference new tool names +node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/prompt-contracts.test.ts + +# Full regression — existing tests still pass +node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/plan-milestone.test.ts src/resources/extensions/gsd/tests/plan-slice.test.ts src/resources/extensions/gsd/tests/plan-task.test.ts src/resources/extensions/gsd/tests/markdown-renderer.test.ts src/resources/extensions/gsd/tests/rogue-file-detection.test.ts +``` + +Key test scenarios to prove: + +- **R005 structural enforcement**: seed a slice with T01 (complete), T02 (complete), T03 (pending). Call replan with an updatedTask targeting T01. 
Assert error containing "completed task" or similar. Call replan with removedTaskIds including T02. Assert error. Call replan modifying only T03 and adding T04. Assert success. + +- **R006 structural enforcement**: seed a milestone with S01 (complete), S02 (pending), S03 (pending). Call reassess with a modified slice targeting S01. Assert error. Call reassess modifying only S02 and adding S04. Assert success. + +- **Replan history persistence**: after successful replan, query `replan_history` table and verify a row exists with correct milestone_id, slice_id, summary. + +- **Assessment persistence**: after successful reassess, query `assessments` table and verify a row exists with correct path, milestone_id, status, full_content. + +- **Re-rendering correctness**: after replan, read the rendered PLAN.md back from disk, parse it, confirm completed tasks still show `[x]` and new/modified tasks appear correctly. + +- **Cache invalidation**: use parse-visible state assertions (read roadmap/plan before and after handler execution, confirm the parse results reflect the mutations). + +## Constraints + +- `replan_history` schema has columns: `id` (autoincrement), `milestone_id`, `slice_id`, `task_id`, `summary`, `previous_artifact_path`, `replacement_artifact_path`, `created_at`. The handler must populate these — `previous_artifact_path` is the old PLAN.md artifact path and `replacement_artifact_path` is the new one. +- `assessments` schema has columns: `path` (PK), `milestone_id`, `slice_id`, `task_id`, `status`, `scope`, `full_content`, `created_at`. The `path` is the ASSESSMENT.md artifact path, used as primary key — idempotent rewrites via INSERT OR REPLACE. +- No existing `deleteTask()` or `deleteSlice()` function in `gsd-db.ts` — these must be added. Must be careful with foreign key constraints (verification_evidence references tasks). +- `insertSlice()` uses `INSERT OR IGNORE` — safe for idempotent runs but won't update existing slice data. 
For reassess modifications to existing slices, use `upsertSlicePlanning()` plus a new `updateSliceMetadata()` or similar for title/risk/depends/demo changes. +- The resolver-based TypeScript test harness (`resolve-ts.mjs`) is required — bare `node --test` may fail on `.js` sibling specifiers. +- Cache invalidation must use parse-visible state assertions, not ESM monkey-patching (per KNOWLEDGE.md). + +## Common Pitfalls + +- **Foreign key cascading on task deletion** — The `verification_evidence` table has a foreign key referencing `tasks(milestone_id, slice_id, id)`. Deleting a task without handling this will fail. Use `DELETE FROM verification_evidence WHERE ...` before `DELETE FROM tasks WHERE ...`, or set up CASCADE in the FK (but the schema is already created without CASCADE, so the handler must delete evidence first). +- **Slice deletion vs slice reordering** — Reassess needs to distinguish between removing a slice entirely (DELETE from DB) and reordering slices (no deletion, just update sequence). The current schema doesn't have a `sequence` column — ordering is by `id` (`ORDER BY id`). If reassess reorders, it must either rename slice IDs (risky — breaks references) or add a sequence column. The simpler approach: don't support arbitrary reordering in V1 — just support add/remove/modify. Reordering can be deferred or handled by deleting and re-inserting with new IDs. But since task completions reference slice IDs, deleting completed slices is forbidden anyway, so reordering of completed slices is moot. +- **REPLAN.md path resolution** — The current `buildReplanPrompt` in `auto-prompts.ts` constructs `replanPath` as `join(base, relSlicePath(base, mid, sid) + "/" + sid + "-REPLAN.md")`. The renderer must use the same path construction pattern, or better, use `resolveSliceFile()` with the "REPLAN" suffix if it's supported — check `paths.ts` for supported suffixes. 
+- **Assessment path as PK** — The `assessments` table uses `path TEXT PRIMARY KEY`, which means the path must be deterministic and consistent. The current `buildReassessPrompt` uses `relSliceFile(base, mid, completedSliceId, "ASSESSMENT")` — the handler must compute the same path. + +## Open Risks + +- The `replan_history.task_id` column is nullable — it's not clear from the schema whether this tracks a specific blocker task or the entire replan event. R005 specifies `blockerTaskId` as a parameter, so this maps to `task_id` in the replan_history row. The handler should populate it. +- Reassess `sliceChanges.reordered` may be complex to implement without a sequence column. The pragmatic choice is to accept reorder directives but only apply them as metadata (not changing actual query ordering since `ORDER BY id` is used throughout). If the planner decides to skip reordering support in V1, this is acceptable since the milestone DoD says "replan and reassess structurally enforce preservation" — it doesn't mandate reordering support. diff --git a/.gsd/milestones/M001/slices/S03/tasks/T01-PLAN.md b/.gsd/milestones/M001/slices/S03/tasks/T01-PLAN.md new file mode 100644 index 000000000..ec588ee0b --- /dev/null +++ b/.gsd/milestones/M001/slices/S03/tasks/T01-PLAN.md @@ -0,0 +1,88 @@ +--- +estimated_steps: 4 +estimated_files: 4 +skills_used: [] +--- + +# T01: Implement replan_slice handler with structural enforcement + +**Slice:** S03 — replan_slice + reassess_roadmap with structural enforcement +**Milestone:** M001 + +## Description + +Build the `handleReplanSlice()` handler that structurally enforces preservation of completed tasks during replanning. This task also adds required DB helper functions (`insertReplanHistory`, `insertAssessment`, `deleteTask`, `deleteSlice`) and markdown renderers (`renderReplanFromDb`, `renderAssessmentFromDb`) that both the replan and reassess handlers use. 
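As a rough illustration of the completed-task check this task describes, the sketch below models it as a pure function over in-memory rows. The `TaskRow` and `ReplanRequest` shapes and the name `findCompletedTaskViolation` are hypothetical simplifications, not the handler's actual types; only the `complete`/`done` status test and the error wording come from this plan.

```typescript
// Hypothetical, simplified shapes; the real params and DB rows carry more fields.
interface TaskRow {
  id: string;
  status: string;
}

interface ReplanRequest {
  updatedTasks: { taskId: string }[];
  removedTaskIds: string[];
}

// Returns an error message for the first completed task the request touches,
// or null when the request only affects incomplete or new tasks.
function findCompletedTaskViolation(
  existing: TaskRow[],
  request: ReplanRequest,
): string | null {
  // Both "complete" and "done" count as completed.
  const completed = new Set<string>();
  for (const task of existing) {
    if (task.status === "complete" || task.status === "done") {
      completed.add(task.id);
    }
  }

  for (const task of request.updatedTasks) {
    if (completed.has(task.taskId)) {
      return `cannot modify completed task ${task.taskId}`;
    }
  }
  for (const taskId of request.removedTaskIds) {
    if (completed.has(taskId)) {
      return `cannot remove completed task ${taskId}`;
    }
  }
  return null;
}
```

Because the check runs before any write, a rejected replan would leave the DB untouched, and the message names the offending task ID.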
+ +The handler follows the established validate → enforce → transaction → render → invalidate pattern from `plan-slice.ts`. The novel addition is the structural enforcement step: before writing any mutations, query `getSliceTasks()` and reject the operation if any `updatedTasks[].taskId` or `removedTaskIds` element matches a task with status `complete` or `done`. + +## Steps + +1. **Add DB helper functions to `gsd-db.ts`:** + - `insertReplanHistory(entry)` — INSERT into `replan_history` table. Columns: milestone_id, slice_id, task_id (nullable, the blocker task), summary, previous_artifact_path, replacement_artifact_path, created_at. + - `insertAssessment(entry)` — INSERT OR REPLACE into `assessments` table (path is PK). Columns: path, milestone_id, slice_id, task_id, status, scope, full_content, created_at. + - `deleteTask(milestoneId, sliceId, taskId)` — Must first DELETE from `verification_evidence WHERE task_id = :tid AND slice_id = :sid AND milestone_id = :mid`, then DELETE from `tasks WHERE ...`. The `verification_evidence` table has a FK referencing tasks — deleting evidence first avoids FK constraint violations. + - `deleteSlice(milestoneId, sliceId)` — Must delete all child verification_evidence rows, then all child task rows, then the slice row. Use cascade-style manual deletion. + +2. **Add renderers to `markdown-renderer.ts`:** + - `renderReplanFromDb(basePath, milestoneId, sliceId, replanData)` — Generates REPLAN.md with blocker description, what changed, and summary. Uses `writeAndStore()` with artifact_type `"REPLAN"`. The `replanData` param includes blockerTaskId, blockerDescription, whatChanged. Path: `{sliceDir}/{sliceId}-REPLAN.md`. + - `renderAssessmentFromDb(basePath, milestoneId, sliceId, assessmentData)` — Generates ASSESSMENT.md with verdict, assessment text. Uses `writeAndStore()` with artifact_type `"ASSESSMENT"`. Path: `{sliceDir}/{sliceId}-ASSESSMENT.md`. + +3. 
**Create `tools/replan-slice.ts` with `handleReplanSlice()`:** + - Interface `ReplanSliceParams`: milestoneId, sliceId, blockerTaskId, blockerDescription, whatChanged, updatedTasks (array of {taskId, title, description, estimate, files, verify, inputs, expectedOutput}), removedTaskIds (string array). + - Validate all required fields (same `isNonEmptyString` pattern as plan-slice.ts). + - Query `getSlice()` to verify parent slice exists. + - Query `getSliceTasks()` to get all tasks. Build a Set of completed task IDs (status === 'complete' || status === 'done'). + - **Structural enforcement**: Check if any `updatedTasks[].taskId` is in the completed set → return `{ error: "cannot modify completed task T0X" }`. Check if any `removedTaskIds` element is in the completed set → return `{ error: "cannot remove completed task T0X" }`. + - In `transaction()`: call `insertReplanHistory()` with the replan metadata. For each updatedTask: if task exists, use `upsertTaskPlanning()` to update planning fields; if new, use `insertTask()` then `upsertTaskPlanning()`. For each removedTaskId: call `deleteTask()`. + - After transaction: call `renderPlanFromDb()` to re-render PLAN.md and task plans. Call `renderReplanFromDb()` to write REPLAN.md. Call `invalidateStateCache()` and `clearParseCache()`. + - Return `{ milestoneId, sliceId, replanPath, planPath }` on success. + +4. **Write `tests/replan-handler.test.ts`:** + - Use `node:test` (import test from 'node:test') and `node:assert/strict`. Follow the exact test setup pattern from `plan-slice.test.ts`: `makeTmpBase()`, `openDatabase()`, `cleanup()`, seed parent milestone+slice+tasks. + - Test cases: + - Validation failure (missing milestoneId) → returns `{ error }` containing "validation failed" + - Structural rejection: seed T01 as complete, T02 as pending. Call replan with updatedTasks targeting T01. Assert error contains "completed task" and "T01". + - Structural rejection: seed T01 as complete. 
Call replan with removedTaskIds containing T01. Assert error contains "completed task". + - Successful replan: seed T01 complete, T02 pending, T03 pending. Call replan updating T02 and removing T03 and adding T04. Assert success. Verify replan_history row exists in DB. Verify T02 updated in DB. Verify T03 deleted from DB. Verify T04 exists in DB. Verify rendered PLAN.md exists on disk. Verify REPLAN.md exists on disk. + - Cache invalidation: verify that re-parsing the PLAN.md after replan reflects the mutations (parse-visible state assertion). + - Idempotent rerun: call replan twice with same params, assert second call also succeeds. + +## Must-Haves + +- [ ] `insertReplanHistory()`, `insertAssessment()`, `deleteTask()`, `deleteSlice()` exported from `gsd-db.ts` +- [ ] `deleteTask()` handles FK constraint by deleting verification_evidence first +- [ ] `renderReplanFromDb()` and `renderAssessmentFromDb()` exported from `markdown-renderer.ts` +- [ ] `handleReplanSlice()` exported from `tools/replan-slice.ts` +- [ ] Structural rejection returns error naming the specific completed task ID +- [ ] Successful replan writes `replan_history` row with blocker metadata +- [ ] Successful replan re-renders PLAN.md and writes REPLAN.md via `writeAndStore()` +- [ ] Cache invalidation via `invalidateStateCache()` + `clearParseCache()` after render +- [ ] All tests in `replan-handler.test.ts` pass + +## Verification + +- `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/replan-handler.test.ts` — all tests pass +- Structural rejection tests prove completed tasks cannot be mutated +- DB persistence tests prove replan_history row exists after successful replan + +## Observability Impact + +- Signals added/changed: Replan handler error payloads include the specific completed task IDs that blocked the mutation +- How a future agent inspects this: Query `replan_history` table, read rendered REPLAN.md, 
check PLAN.md for updated task list +- Failure state exposed: Validation errors, structural rejection errors, render failures return distinct `{ error: string }` payloads + +## Inputs + +- `src/resources/extensions/gsd/gsd-db.ts` — existing DB functions: `getSliceTasks()`, `getTask()`, `getSlice()`, `insertTask()`, `upsertTaskPlanning()`, `transaction()`, `insertArtifact()` +- `src/resources/extensions/gsd/markdown-renderer.ts` — existing `writeAndStore()` pattern, `renderPlanFromDb()` for PLAN.md re-rendering +- `src/resources/extensions/gsd/tools/plan-slice.ts` — reference handler pattern (validate → transaction → render → invalidate) +- `src/resources/extensions/gsd/tests/plan-slice.test.ts` — reference test pattern (setup, seed, assert) +- `src/resources/extensions/gsd/state.ts` — `invalidateStateCache()` import +- `src/resources/extensions/gsd/files.ts` — `clearParseCache()` import + +## Expected Output + +- `src/resources/extensions/gsd/gsd-db.ts` — modified with 4 new exported functions +- `src/resources/extensions/gsd/markdown-renderer.ts` — modified with 2 new renderer functions +- `src/resources/extensions/gsd/tools/replan-slice.ts` — new handler file +- `src/resources/extensions/gsd/tests/replan-handler.test.ts` — new test file diff --git a/.gsd/milestones/M001/slices/S03/tasks/T01-SUMMARY.md b/.gsd/milestones/M001/slices/S03/tasks/T01-SUMMARY.md new file mode 100644 index 000000000..c78c93a20 --- /dev/null +++ b/.gsd/milestones/M001/slices/S03/tasks/T01-SUMMARY.md @@ -0,0 +1,66 @@ +--- +id: T01 +parent: S03 +milestone: M001 +key_files: + - src/resources/extensions/gsd/gsd-db.ts + - src/resources/extensions/gsd/markdown-renderer.ts + - src/resources/extensions/gsd/tools/replan-slice.ts + - src/resources/extensions/gsd/tests/replan-handler.test.ts + - .gsd/milestones/M001/slices/S03/S03-PLAN.md +key_decisions: + - deleteTask() deletes verification_evidence before task row to avoid FK constraint violations — cascade-style manual deletion pattern + - 
Structural enforcement checks both 'complete' and 'done' statuses as completed-task indicators + - Error payloads include the specific task ID that blocked the mutation for actionable diagnostics +duration: "" +verification_result: passed +completed_at: 2026-03-23T16:28:29.943Z +blocker_discovered: false +--- + +# T01: Implement replan_slice handler with structural enforcement, DB helpers, renderers, and tests + +**Implement replan_slice handler with structural enforcement, DB helpers, renderers, and tests** + +## What Happened + +Built the `handleReplanSlice()` handler that structurally enforces preservation of completed tasks during replanning, following the validate → enforce → transaction → render → invalidate pattern from `plan-slice.ts`. + +**Step 1 — DB helpers in `gsd-db.ts`:** Added four new exported functions: `insertReplanHistory()` writes to the `replan_history` table, `insertAssessment()` does INSERT OR REPLACE into `assessments`, `deleteTask()` handles FK constraints by deleting `verification_evidence` rows before the task row, and `deleteSlice()` performs cascade-style manual deletion (evidence → tasks → slice). Also added `getReplanHistory()` query helper for test assertions. + +**Step 2 — Renderers in `markdown-renderer.ts`:** Added `renderReplanFromDb()` which generates REPLAN.md with blocker description, what changed, and metadata sections using `writeAndStore()` with artifact_type "REPLAN". Added `renderAssessmentFromDb()` which generates ASSESSMENT.md with verdict and assessment text using artifact_type "ASSESSMENT". Both resolve slice paths via `resolveSlicePath()` with fallback. + +**Step 3 — Handler in `tools/replan-slice.ts`:** Created `handleReplanSlice()` with full validation of all required fields. Queries `getSliceTasks()` and builds a Set of completed task IDs (status === 'complete' || status === 'done'). 
Returns specific `{ error }` naming the exact task ID when any `updatedTasks[].taskId` or `removedTaskIds` element matches a completed task. In transaction: inserts replan_history row, upserts or inserts updated tasks, deletes removed tasks. After transaction: re-renders PLAN.md via `renderPlanFromDb()`, writes REPLAN.md via `renderReplanFromDb()`, invalidates both state cache and parse cache. + +**Step 4 — Tests in `tests/replan-handler.test.ts`:** Wrote 9 tests following the exact `plan-slice.test.ts` pattern (makeTmpBase, openDatabase, cleanup, seed). Tests cover: validation failure, structural rejection of completed task update, structural rejection of completed task removal, successful replan (verifies DB persistence of replan_history, task mutations, rendered artifacts), cache invalidation via re-parse, idempotent rerun, missing parent slice, "done" status alias handling, and structured error payload verification. + +**Pre-flight fix:** Added diagnostic verification step to S03-PLAN.md Verification section confirming structured error payload tests exist. + +## Verification + +Ran `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/replan-handler.test.ts` — all 9 tests pass (9/9, 0 failures, ~180ms). Ran full regression suite across plan-milestone, plan-slice, plan-task, markdown-renderer, and rogue-file-detection tests — all 25 tests pass (0 failures). Structural rejection tests prove completed tasks (both "complete" and "done" statuses) cannot be mutated or removed. DB persistence tests verify replan_history rows exist with correct metadata after successful replan. Rendered PLAN.md and REPLAN.md artifacts verified on disk. 
+ +## Verification Evidence + +| # | Command | Exit Code | Verdict | Duration | +|---|---------|-----------|---------|----------| +| 1 | `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/replan-handler.test.ts` | 0 | ✅ pass | 253ms | +| 2 | `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/plan-milestone.test.ts src/resources/extensions/gsd/tests/plan-slice.test.ts src/resources/extensions/gsd/tests/plan-task.test.ts src/resources/extensions/gsd/tests/markdown-renderer.test.ts src/resources/extensions/gsd/tests/rogue-file-detection.test.ts` | 0 | ✅ pass | 609ms | +| 3 | `grep -c 'structured error payloads' src/resources/extensions/gsd/tests/replan-handler.test.ts` | 0 | ✅ pass | 10ms | + + +## Deviations + +Added `getReplanHistory()` query helper to `gsd-db.ts` (not in plan) — needed for test assertions to verify DB persistence. Added 3 extra tests beyond the plan's 6: missing parent slice error, "done" status alias handling, and structured error payloads with specific task IDs — strengthens observability coverage. + +## Known Issues + +None. 
+ +## Files Created/Modified + +- `src/resources/extensions/gsd/gsd-db.ts` +- `src/resources/extensions/gsd/markdown-renderer.ts` +- `src/resources/extensions/gsd/tools/replan-slice.ts` +- `src/resources/extensions/gsd/tests/replan-handler.test.ts` +- `.gsd/milestones/M001/slices/S03/S03-PLAN.md` diff --git a/.gsd/milestones/M001/slices/S03/tasks/T01-VERIFY.json b/.gsd/milestones/M001/slices/S03/tasks/T01-VERIFY.json new file mode 100644 index 000000000..edf045dd9 --- /dev/null +++ b/.gsd/milestones/M001/slices/S03/tasks/T01-VERIFY.json @@ -0,0 +1,18 @@ +{ + "schemaVersion": 1, + "taskId": "T01", + "unitId": "M001/S03/T01", + "timestamp": 1774283314702, + "passed": false, + "discoverySource": "package-json", + "checks": [ + { + "command": "npm run test", + "exitCode": 1, + "durationMs": 39728, + "verdict": "fail" + } + ], + "retryAttempt": 1, + "maxRetries": 2 +} diff --git a/.gsd/milestones/M001/slices/S03/tasks/T02-PLAN.md b/.gsd/milestones/M001/slices/S03/tasks/T02-PLAN.md new file mode 100644 index 000000000..da4326acd --- /dev/null +++ b/.gsd/milestones/M001/slices/S03/tasks/T02-PLAN.md @@ -0,0 +1,75 @@ +--- +estimated_steps: 2 +estimated_files: 2 +skills_used: [] +--- + +# T02: Implement reassess_roadmap handler with structural enforcement + +**Slice:** S03 — replan_slice + reassess_roadmap with structural enforcement +**Milestone:** M001 + +## Description + +Build the `handleReassessRoadmap()` handler that structurally enforces preservation of completed slices during roadmap reassessment. This handler follows the identical control flow pattern as `handleReplanSlice()` from T01 but operates at the milestone/slice level instead of the slice/task level. It reuses the DB helpers (`insertAssessment`, `deleteSlice`) and the `renderAssessmentFromDb()` renderer from T01. + +The structural enforcement logic: before writing any mutations, query `getMilestoneSlices()` and reject if any modified or removed slice has status `complete` or `done`. + +## Steps + +1. 
**Create `tools/reassess-roadmap.ts` with `handleReassessRoadmap()`:** + - Interface `ReassessRoadmapParams`: milestoneId, completedSliceId (the slice that just finished), verdict (string — e.g. "confirmed", "adjusted"), assessment (text body), sliceChanges object with: modified (array of {sliceId, title, risk, depends, demo}), added (array of {sliceId, title, risk, depends, demo}), removed (array of sliceId strings). + - Validate all required fields. `sliceChanges` must be an object with modified, added, removed arrays (can be empty arrays but must exist). + - Query `getMilestone()` to verify milestone exists. + - Query `getMilestoneSlices()` to get all slices. Build a Set of completed slice IDs (status === 'complete' || status === 'done'). + - **Structural enforcement**: Check if any `sliceChanges.modified[].sliceId` is in the completed set → return `{ error: "cannot modify completed slice S0X" }`. Check if any `sliceChanges.removed[]` element is in the completed set → return `{ error: "cannot remove completed slice S0X" }`. + - Compute assessment artifact path: `{sliceDir}/{completedSliceId}-ASSESSMENT.md` (the assessment lives in the completed slice's directory). + - In `transaction()`: call `insertAssessment()` with path (PK), milestone_id, status=verdict, scope='roadmap', full_content=assessment text, created_at. For each modified slice: call `upsertSlicePlanning()` to update title/risk/depends/demo. For each added slice: call `insertSlice()` with id, milestoneId, title, status='pending', demo. For each removed sliceId: call `deleteSlice()`. + - After transaction: call `renderRoadmapFromDb()` to re-render ROADMAP.md. Call `renderAssessmentFromDb()` to write ASSESSMENT.md. Call `invalidateStateCache()` and `clearParseCache()`. + - Return `{ milestoneId, completedSliceId, assessmentPath, roadmapPath }` on success. + +2. **Write `tests/reassess-handler.test.ts`:** + - Use `node:test` and `node:assert/strict`. 
Follow the setup pattern from `plan-slice.test.ts`: temp directory with `.gsd/milestones/M001/` structure, `openDatabase()`, seed milestone with S01 (complete), S02 (pending), S03 (pending). + - Test cases: + - Validation failure (missing milestoneId) → returns `{ error }` containing "validation failed" + - Missing milestone → returns `{ error }` containing "not found" + - Structural rejection: call reassess with modified containing S01 (complete). Assert error contains "completed slice" and "S01". + - Structural rejection: call reassess with removed containing S01 (complete). Assert error contains "completed slice". + - Successful reassess: modify S02 title/demo, add S04, remove S03. Assert success. Verify assessments row exists in DB (query by path). Verify S02 updated in DB. Verify S03 deleted from DB. Verify S04 exists in DB. Verify ROADMAP.md re-rendered on disk. Verify ASSESSMENT.md exists on disk. + - Cache invalidation: verify parse-visible state reflects mutations. + - Idempotent rerun: call reassess twice, second also succeeds (INSERT OR REPLACE on assessments path PK). 
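The enforce-then-mutate ordering in step 1 can be sketched against an in-memory list instead of SQLite. Everything here is an assumption made for illustration: `applySliceChanges`, `SliceRow`, and the reduced `SliceChanges` shape are hypothetical, and the copy-then-return approach only imitates the all-or-nothing behavior that `transaction()` would provide in the real handler.

```typescript
// Hypothetical in-memory stand-ins for slice rows and the handler result.
interface SliceRow {
  id: string;
  title: string;
  status: string;
}

interface SliceChanges {
  modified: { sliceId: string; title: string }[];
  added: { sliceId: string; title: string }[];
  removed: string[];
}

type ReassessResult = { error: string } | { slices: SliceRow[] };

function applySliceChanges(current: SliceRow[], changes: SliceChanges): ReassessResult {
  const completed = new Set<string>();
  for (const slice of current) {
    if (slice.status === "complete" || slice.status === "done") {
      completed.add(slice.id);
    }
  }

  // Enforcement runs before any mutation, so a rejected call changes nothing.
  for (const mod of changes.modified) {
    if (completed.has(mod.sliceId)) {
      return { error: `cannot modify completed slice ${mod.sliceId}` };
    }
  }
  for (const id of changes.removed) {
    if (completed.has(id)) {
      return { error: `cannot remove completed slice ${id}` };
    }
  }

  // Work on copies so the input is never partially mutated.
  const byId = new Map<string, SliceRow>();
  for (const slice of current) {
    byId.set(slice.id, { ...slice });
  }
  for (const mod of changes.modified) {
    const row = byId.get(mod.sliceId);
    if (row) row.title = mod.title;
  }
  for (const add of changes.added) {
    // Mirrors INSERT OR IGNORE: an existing row is left untouched.
    if (!byId.has(add.sliceId)) {
      byId.set(add.sliceId, { id: add.sliceId, title: add.title, status: "pending" });
    }
  }
  for (const id of changes.removed) {
    byId.delete(id);
  }

  // Ordering by id, matching the ORDER BY id convention noted in the research.
  const slices = [...byId.values()].sort((a, b) => a.id.localeCompare(b.id));
  return { slices };
}
```

Running the same changes twice against the returned state yields the same slice set, which is the property the idempotent-rerun test case is meant to exercise.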
+ +## Must-Haves + +- [ ] `handleReassessRoadmap()` exported from `tools/reassess-roadmap.ts` +- [ ] Structural rejection returns error naming the specific completed slice ID +- [ ] Successful reassess writes `assessments` row with path PK and assessment content +- [ ] Successful reassess re-renders ROADMAP.md and writes ASSESSMENT.md via renderers +- [ ] Cache invalidation via `invalidateStateCache()` + `clearParseCache()` after render +- [ ] All tests in `reassess-handler.test.ts` pass + +## Verification + +- `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/reassess-handler.test.ts` — all tests pass +- Structural rejection tests prove completed slices cannot be mutated +- DB persistence tests prove assessments row exists after successful reassess + +## Observability Impact + +- Signals added/changed: Reassess handler error payloads include the specific completed slice IDs that blocked the mutation +- How a future agent inspects this: Query `assessments` table by path, read rendered ASSESSMENT.md, check ROADMAP.md for updated slice list +- Failure state exposed: Validation errors, structural rejection errors, render failures return distinct `{ error: string }` payloads + +## Inputs + +- `src/resources/extensions/gsd/gsd-db.ts` — `getMilestoneSlices()`, `getMilestone()`, `insertSlice()`, `upsertSlicePlanning()`, `transaction()`, `insertAssessment()`, `deleteSlice()` (the last two added by T01) +- `src/resources/extensions/gsd/markdown-renderer.ts` — `renderRoadmapFromDb()`, `renderAssessmentFromDb()` (the latter added by T01) +- `src/resources/extensions/gsd/tools/replan-slice.ts` — reference handler pattern from T01 +- `src/resources/extensions/gsd/tests/replan-handler.test.ts` — reference test pattern from T01 +- `src/resources/extensions/gsd/state.ts` — `invalidateStateCache()` +- `src/resources/extensions/gsd/files.ts` — `clearParseCache()` + +## Expected Output + +- 
`src/resources/extensions/gsd/tools/reassess-roadmap.ts` — new handler file +- `src/resources/extensions/gsd/tests/reassess-handler.test.ts` — new test file diff --git a/.gsd/milestones/M001/slices/S03/tasks/T02-SUMMARY.md b/.gsd/milestones/M001/slices/S03/tasks/T02-SUMMARY.md new file mode 100644 index 000000000..d39ba085f --- /dev/null +++ b/.gsd/milestones/M001/slices/S03/tasks/T02-SUMMARY.md @@ -0,0 +1,59 @@ +--- +id: T02 +parent: S03 +milestone: M001 +key_files: + - src/resources/extensions/gsd/tools/reassess-roadmap.ts + - src/resources/extensions/gsd/tests/reassess-handler.test.ts + - src/resources/extensions/gsd/gsd-db.ts +key_decisions: + - Added updateSliceFields() to gsd-db.ts for title/risk/depends/demo updates because upsertSlicePlanning() only handles planning-level fields (goal, success_criteria, etc.) — keeps DB API consistent rather than using raw SQL in the handler + - Added getAssessment() query helper to gsd-db.ts for test verification of assessments DB persistence — follows the same pattern as getReplanHistory() added in T01 +duration: "" +verification_result: passed +completed_at: 2026-03-23T16:32:59.273Z +blocker_discovered: false +--- + +# T02: Implement reassess_roadmap handler with structural enforcement, DB persistence, and tests + +**Implement reassess_roadmap handler with structural enforcement, DB persistence, and tests** + +## What Happened + +Built the `handleReassessRoadmap()` handler in `tools/reassess-roadmap.ts` following the identical validate → enforce → transaction → render → invalidate pattern established by `handleReplanSlice()` in T01, but operating at the milestone/slice level instead of slice/task level. + +**Handler implementation:** Validates all required fields including `sliceChanges` object with `modified`, `added`, and `removed` arrays. Queries `getMilestone()` to verify milestone exists. Queries `getMilestoneSlices()` and builds a Set of completed slice IDs (status === 'complete' || status === 'done'). 
Structural enforcement rejects any `sliceChanges.modified[].sliceId` or `sliceChanges.removed[]` element that matches a completed slice, returning `{ error }` naming the specific slice ID. In transaction: writes `assessments` row via `insertAssessment()` with path PK, applies slice modifications via `updateSliceFields()`, inserts new slices via `insertSlice()`, deletes removed slices via `deleteSlice()`. After transaction: re-renders ROADMAP.md via `renderRoadmapFromDb()`, writes ASSESSMENT.md via `renderAssessmentFromDb()`, invalidates both state cache and parse cache. + +**DB helper addition:** Added `updateSliceFields()` to `gsd-db.ts` — a targeted function that updates title/risk/depends/demo on existing slice rows. This was needed because `upsertSlicePlanning()` only handles planning fields (goal, success_criteria, etc.), not the basic slice metadata the reassess handler needs to modify. Also added `getAssessment()` query helper for test assertions. + +**Tests:** Wrote 9 tests in `reassess-handler.test.ts` following the exact pattern from `replan-handler.test.ts`. Tests cover: validation failure (missing milestoneId), missing milestone, structural rejection of completed slice modification, structural rejection of completed slice removal, successful reassess (verifies DB persistence of assessments row, slice mutations, rendered artifacts on disk), cache invalidation via getMilestoneSlices, idempotent rerun, "done" status alias handling, and structured error payload verification with specific slice IDs. + +## Verification + +Ran `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/reassess-handler.test.ts` — all 9 tests pass (0 failures, ~174ms). Ran replan handler tests — 9/9 pass (no regressions from gsd-db.ts changes). Ran full regression suite (plan-milestone, plan-slice, plan-task, markdown-renderer, rogue-file-detection) — 25/25 pass. Ran prompt contract tests — 26/26 pass. 
Diagnostic grep confirms both test files contain structured error payload assertions. + +## Verification Evidence + +| # | Command | Exit Code | Verdict | Duration | +|---|---------|-----------|---------|----------| +| 1 | `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/reassess-handler.test.ts` | 0 | ✅ pass | 174ms | +| 2 | `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/replan-handler.test.ts` | 0 | ✅ pass | 293ms | +| 3 | `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/plan-milestone.test.ts src/resources/extensions/gsd/tests/plan-slice.test.ts src/resources/extensions/gsd/tests/plan-task.test.ts src/resources/extensions/gsd/tests/markdown-renderer.test.ts src/resources/extensions/gsd/tests/rogue-file-detection.test.ts` | 0 | ✅ pass | 645ms | +| 4 | `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/prompt-contracts.test.ts` | 0 | ✅ pass | 116ms | +| 5 | `grep -c 'structured error payloads' src/resources/extensions/gsd/tests/replan-handler.test.ts src/resources/extensions/gsd/tests/reassess-handler.test.ts` | 0 | ✅ pass | 10ms | + + +## Deviations + +Added `updateSliceFields()` to `gsd-db.ts` (not in task plan's expected output) — needed because `upsertSlicePlanning()` only handles planning fields, not the basic slice fields (title/risk/depends/demo) that the reassess handler modifies. Also added `getAssessment()` query helper for test DB persistence assertions. + +## Known Issues + +None. 
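The structural enforcement step described under "What Happened" can be sketched in isolation as follows. This is a minimal illustration, not the handler itself: `SliceRow`, `SliceChanges`, and `findStructuralViolation` are hypothetical names introduced here, and the sketch assumes only what the summary states, namely that slices with status `complete` or `done` may be neither modified nor removed and that the error names the offending slice ID.

```typescript
// Hypothetical, simplified shapes for illustration only; the real row and
// parameter types live in gsd-db.ts and tools/reassess-roadmap.ts.
interface SliceRow {
  id: string;
  status: string;
}

interface SliceChanges {
  modified: Array<{ sliceId: string }>;
  added: Array<{ sliceId: string }>;
  removed: string[];
}

// Returns an error message naming the first offending slice,
// or null when the change set touches no completed slice.
function findStructuralViolation(
  existing: SliceRow[],
  changes: SliceChanges,
): string | null {
  // Collect IDs of slices that count as completed ("done" is treated as an alias).
  const completed = new Set<string>();
  for (const slice of existing) {
    if (slice.status === "complete" || slice.status === "done") {
      completed.add(slice.id);
    }
  }

  // Modifications are checked before removals.
  for (const mod of changes.modified) {
    if (completed.has(mod.sliceId)) {
      return `cannot modify completed slice ${mod.sliceId}`;
    }
  }
  for (const removedId of changes.removed) {
    if (completed.has(removedId)) {
      return `cannot remove completed slice ${removedId}`;
    }
  }
  return null;
}

// Usage: S01 is complete, so modifying it is rejected; S02 is still pending.
const exampleSlices: SliceRow[] = [
  { id: "S01", status: "complete" },
  { id: "S02", status: "pending" },
];
console.log(
  findStructuralViolation(exampleSlices, { modified: [{ sliceId: "S01" }], added: [], removed: [] }),
); // prints "cannot modify completed slice S01"
console.log(
  findStructuralViolation(exampleSlices, { modified: [{ sliceId: "S02" }], added: [], removed: [] }),
); // prints "null"
```

Running the check before the transaction opens means a rejected request leaves the database and the rendered markdown untouched, which is presumably why the handler validates first and writes afterwards.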
+ +## Files Created/Modified + +- `src/resources/extensions/gsd/tools/reassess-roadmap.ts` +- `src/resources/extensions/gsd/tests/reassess-handler.test.ts` +- `src/resources/extensions/gsd/gsd-db.ts` diff --git a/.gsd/milestones/M001/slices/S03/tasks/T03-PLAN.md b/.gsd/milestones/M001/slices/S03/tasks/T03-PLAN.md new file mode 100644 index 000000000..1029473a8 --- /dev/null +++ b/.gsd/milestones/M001/slices/S03/tasks/T03-PLAN.md @@ -0,0 +1,78 @@ +--- +estimated_steps: 5 +estimated_files: 4 +skills_used: [] +--- + +# T03: Register tools in db-tools.ts + update prompts + prompt contract tests + +**Slice:** S03 — replan_slice + reassess_roadmap with structural enforcement +**Milestone:** M001 + +## Description + +Wire the two new handlers into the tool system by registering them in `db-tools.ts`, update the prompt templates to name the specific tools as canonical write paths, and extend prompt contract tests to catch regressions. This is the integration closure task that makes the handlers callable by auto-mode dispatch. + +## Steps + +1. **Register `gsd_replan_slice` in `db-tools.ts`:** + - Add after the `gsd_plan_task` registration block (around line 531). + - Follow the exact pattern of `gsd_plan_slice`: `ensureDbOpen()` guard, dynamic `import("../tools/replan-slice.js")`, call `handleReplanSlice(params, process.cwd())`, check for `error` in result, return structured `content`/`details`. + - TypeBox schema mirrors `ReplanSliceParams`: milestoneId, sliceId, blockerTaskId, blockerDescription, whatChanged as `Type.String()`, updatedTasks as `Type.Array(Type.Object({...}))`, removedTaskIds as `Type.Array(Type.String())`. + - Name: `gsd_replan_slice`, label: `"Replan Slice"`, description mentioning structural enforcement of completed tasks. + - promptGuidelines: mention canonical name and alias. + - Register alias: `gsd_slice_replan` → `gsd_replan_slice`. + +2. **Register `gsd_reassess_roadmap` in `db-tools.ts`:** + - Same pattern. 
Dynamic `import("../tools/reassess-roadmap.js")`, call `handleReassessRoadmap(params, process.cwd())`. + - TypeBox schema mirrors `ReassessRoadmapParams`: milestoneId, completedSliceId, verdict, assessment as `Type.String()`, sliceChanges as `Type.Object({ modified: Type.Array(...), added: Type.Array(...), removed: Type.Array(Type.String()) })`. + - Name: `gsd_reassess_roadmap`, label: `"Reassess Roadmap"`. + - Register alias: `gsd_roadmap_reassess` → `gsd_reassess_roadmap`. + +3. **Update `replan-slice.md` prompt:** + - Add a new step before the existing file-write instructions (before step 3). The new step should say: "If a DB-backed planning tool is available, use `gsd_replan_slice` with the following parameters: milestoneId, sliceId, blockerTaskId, blockerDescription, whatChanged, updatedTasks, removedTaskIds. This is the canonical write path — it structurally enforces preservation of completed tasks and writes replan history to the DB." + - Reposition the existing file-write steps (writing `{{replanPath}}` and `{{planPath}}`) as the degraded fallback: "If the `gsd_replan_slice` tool is not available, fall back to writing files directly..." + - Keep all existing hard constraints about completed tasks intact — they remain as documentation even though the tool enforces them structurally. + +4. **Update `reassess-roadmap.md` prompt:** + - Add a new instruction before the "If changes are needed" section: "Use `gsd_reassess_roadmap` to persist the assessment and any roadmap changes. Pass: milestoneId, completedSliceId, verdict, assessment text, and sliceChanges with modified/added/removed arrays." + - The prompt already has "Do not bypass state with manual roadmap-only edits" — augment it with: "when `gsd_reassess_roadmap` is available". + - Keep the existing file-write instructions as degraded fallback. + +5. 
**Extend `prompt-contracts.test.ts`:** + - Add test: `replan-slice prompt names gsd_replan_slice as canonical tool` — assert `replan-slice.md` contains `gsd_replan_slice`. + - Add test: `reassess-roadmap prompt names gsd_reassess_roadmap as canonical tool` — assert `reassess-roadmap.md` contains `gsd_reassess_roadmap`. + - Update the existing test at line 170 (`"replan-slice prompt requires DB-backed planning state when available"`) if the new prompt content makes the old assertion redundant — the existing test checks for generic "DB-backed planning tool" language, the new test checks for the specific tool name. + +## Must-Haves + +- [ ] `gsd_replan_slice` registered in db-tools.ts with TypeBox schema and alias `gsd_slice_replan` +- [ ] `gsd_reassess_roadmap` registered in db-tools.ts with TypeBox schema and alias `gsd_roadmap_reassess` +- [ ] `replan-slice.md` contains `gsd_replan_slice` as canonical tool name +- [ ] `reassess-roadmap.md` contains `gsd_reassess_roadmap` as canonical tool name +- [ ] Prompt contract tests pass asserting tool name presence in both prompts +- [ ] Existing prompt contract tests still pass (no regressions) + +## Verification + +- `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/prompt-contracts.test.ts` — all tests pass including new assertions +- `grep -q 'gsd_replan_slice' src/resources/extensions/gsd/prompts/replan-slice.md` — exits 0 +- `grep -q 'gsd_reassess_roadmap' src/resources/extensions/gsd/prompts/reassess-roadmap.md` — exits 0 +- `grep -q 'gsd_replan_slice' src/resources/extensions/gsd/bootstrap/db-tools.ts` — exits 0 +- `grep -q 'gsd_reassess_roadmap' src/resources/extensions/gsd/bootstrap/db-tools.ts` — exits 0 + +## Inputs + +- `src/resources/extensions/gsd/tools/replan-slice.ts` — handler created in T01 +- `src/resources/extensions/gsd/tools/reassess-roadmap.ts` — handler created in T02 +- 
`src/resources/extensions/gsd/bootstrap/db-tools.ts` — existing registration patterns for plan_slice, plan_task +- `src/resources/extensions/gsd/prompts/replan-slice.md` — existing prompt template +- `src/resources/extensions/gsd/prompts/reassess-roadmap.md` — existing prompt template +- `src/resources/extensions/gsd/tests/prompt-contracts.test.ts` — existing prompt contract tests + +## Expected Output + +- `src/resources/extensions/gsd/bootstrap/db-tools.ts` — modified with two new tool registrations +- `src/resources/extensions/gsd/prompts/replan-slice.md` — modified to name `gsd_replan_slice` +- `src/resources/extensions/gsd/prompts/reassess-roadmap.md` — modified to name `gsd_reassess_roadmap` +- `src/resources/extensions/gsd/tests/prompt-contracts.test.ts` — modified with new tool name assertions diff --git a/src/resources/extensions/gsd/gsd-db.ts b/src/resources/extensions/gsd/gsd-db.ts index 95498098b..2e29952de 100644 --- a/src/resources/extensions/gsd/gsd-db.ts +++ b/src/resources/extensions/gsd/gsd-db.ts @@ -1579,6 +1579,30 @@ export function deleteSlice(milestoneId: string, sliceId: string): void { ).run({ ":mid": milestoneId, ":sid": sliceId }); } +export function updateSliceFields(milestoneId: string, sliceId: string, fields: { + title?: string; + risk?: string; + depends?: string[]; + demo?: string; +}): void { + if (!currentDb) throw new GSDError(GSD_STALE_STATE, "gsd-db: No database open"); + currentDb.prepare( + `UPDATE slices SET + title = COALESCE(:title, title), + risk = COALESCE(:risk, risk), + depends = COALESCE(:depends, depends), + demo = COALESCE(:demo, demo) + WHERE milestone_id = :milestone_id AND id = :id`, + ).run({ + ":milestone_id": milestoneId, + ":id": sliceId, + ":title": fields.title ?? null, + ":risk": fields.risk ?? null, + ":depends": fields.depends ? JSON.stringify(fields.depends) : null, + ":demo": fields.demo ?? 
null, + }); +} + export function getReplanHistory(milestoneId: string, sliceId?: string): Array> { if (!currentDb) return []; if (sliceId) { @@ -1590,3 +1614,11 @@ export function getReplanHistory(milestoneId: string, sliceId?: string): Array | null { + if (!currentDb) return null; + const row = currentDb.prepare( + `SELECT * FROM assessments WHERE path = :path`, + ).get({ ":path": path }); + return row ?? null; +} diff --git a/src/resources/extensions/gsd/tests/reassess-handler.test.ts b/src/resources/extensions/gsd/tests/reassess-handler.test.ts new file mode 100644 index 000000000..38908433f --- /dev/null +++ b/src/resources/extensions/gsd/tests/reassess-handler.test.ts @@ -0,0 +1,325 @@ +import test from 'node:test'; +import assert from 'node:assert/strict'; +import { mkdtempSync, mkdirSync, rmSync, existsSync, readFileSync } from 'node:fs'; +import { join } from 'node:path'; +import { tmpdir } from 'node:os'; + +import { + openDatabase, + closeDatabase, + insertMilestone, + insertSlice, + getSlice, + getMilestoneSlices, + getAssessment, + _getAdapter, +} from '../gsd-db.ts'; +import { handleReassessRoadmap } from '../tools/reassess-roadmap.ts'; + +function makeTmpBase(): string { + const base = mkdtempSync(join(tmpdir(), 'gsd-reassess-')); + mkdirSync(join(base, '.gsd', 'milestones', 'M001', 'slices', 'S01'), { recursive: true }); + mkdirSync(join(base, '.gsd', 'milestones', 'M001', 'slices', 'S02'), { recursive: true }); + mkdirSync(join(base, '.gsd', 'milestones', 'M001', 'slices', 'S03'), { recursive: true }); + return base; +} + +function cleanup(base: string): void { + try { closeDatabase(); } catch { /* noop */ } + try { rmSync(base, { recursive: true, force: true }); } catch { /* noop */ } +} + +function seedMilestoneWithSlices(opts?: { + s01Status?: string; + s02Status?: string; + s03Status?: string; +}): void { + insertMilestone({ id: 'M001', title: 'Test Milestone', status: 'active' }); + insertSlice({ id: 'S01', milestoneId: 'M001', title: 'Slice 
One', status: opts?.s01Status ?? 'complete', demo: 'Demo one.' }); + insertSlice({ id: 'S02', milestoneId: 'M001', title: 'Slice Two', status: opts?.s02Status ?? 'pending', demo: 'Demo two.' }); + insertSlice({ id: 'S03', milestoneId: 'M001', title: 'Slice Three', status: opts?.s03Status ?? 'pending', demo: 'Demo three.' }); +} + +function validReassessParams() { + return { + milestoneId: 'M001', + completedSliceId: 'S01', + verdict: 'confirmed', + assessment: 'S01 completed successfully. Roadmap is on track.', + sliceChanges: { + modified: [ + { + sliceId: 'S02', + title: 'Updated Slice Two', + risk: 'high', + depends: ['S01'], + demo: 'Updated demo two.', + }, + ], + added: [ + { + sliceId: 'S04', + title: 'New Slice Four', + risk: 'low', + depends: ['S02'], + demo: 'Demo four.', + }, + ], + removed: ['S03'], + }, + }; +} + +// ─── Tests ──────────────────────────────────────────────────────────────── + +test('handleReassessRoadmap rejects invalid payloads (missing milestoneId)', async () => { + const base = makeTmpBase(); + openDatabase(join(base, '.gsd', 'gsd.db')); + + try { + seedMilestoneWithSlices(); + const result = await handleReassessRoadmap({ ...validReassessParams(), milestoneId: '' }, base); + assert.ok('error' in result); + assert.match(result.error, /validation failed/); + assert.match(result.error, /milestoneId/); + } finally { + cleanup(base); + } +}); + +test('handleReassessRoadmap rejects missing milestone', async () => { + const base = makeTmpBase(); + openDatabase(join(base, '.gsd', 'gsd.db')); + + try { + // No milestone seeded + const result = await handleReassessRoadmap(validReassessParams(), base); + assert.ok('error' in result); + assert.match(result.error, /not found/); + } finally { + cleanup(base); + } +}); + +test('handleReassessRoadmap rejects structural violation: modifying a completed slice', async () => { + const base = makeTmpBase(); + openDatabase(join(base, '.gsd', 'gsd.db')); + + try { + seedMilestoneWithSlices({ s01Status: 
'complete', s02Status: 'pending', s03Status: 'pending' }); + + const result = await handleReassessRoadmap({ + ...validReassessParams(), + sliceChanges: { + modified: [{ sliceId: 'S01', title: 'Trying to modify completed S01' }], + added: [], + removed: [], + }, + }, base); + + assert.ok('error' in result); + assert.match(result.error, /completed slice/); + assert.match(result.error, /S01/); + } finally { + cleanup(base); + } +}); + +test('handleReassessRoadmap rejects structural violation: removing a completed slice', async () => { + const base = makeTmpBase(); + openDatabase(join(base, '.gsd', 'gsd.db')); + + try { + seedMilestoneWithSlices({ s01Status: 'complete', s02Status: 'pending', s03Status: 'pending' }); + + const result = await handleReassessRoadmap({ + ...validReassessParams(), + sliceChanges: { + modified: [], + added: [], + removed: ['S01'], + }, + }, base); + + assert.ok('error' in result); + assert.match(result.error, /completed slice/); + assert.match(result.error, /S01/); + } finally { + cleanup(base); + } +}); + +test('handleReassessRoadmap succeeds when modifying only pending slices', async () => { + const base = makeTmpBase(); + openDatabase(join(base, '.gsd', 'gsd.db')); + + try { + seedMilestoneWithSlices({ s01Status: 'complete', s02Status: 'pending', s03Status: 'pending' }); + + const params = validReassessParams(); + const result = await handleReassessRoadmap(params, base); + assert.ok(!('error' in result), `unexpected error: ${'error' in result ? 
result.error : ''}`); + + // Verify assessments row exists in DB + const assessmentPath = join('.gsd', 'milestones', 'M001', 'slices', 'S01', 'S01-ASSESSMENT.md'); + const assessment = getAssessment(assessmentPath); + assert.ok(assessment, 'assessment row should exist in DB'); + assert.equal(assessment['milestone_id'], 'M001'); + assert.equal(assessment['status'], 'confirmed'); + assert.equal(assessment['scope'], 'roadmap'); + assert.ok((assessment['full_content'] as string).includes('S01 completed successfully'), 'assessment content should be stored'); + + // Verify S02 was updated + const s02 = getSlice('M001', 'S02'); + assert.ok(s02, 'S02 should still exist'); + assert.equal(s02?.title, 'Updated Slice Two'); + assert.equal(s02?.risk, 'high'); + assert.equal(s02?.demo, 'Updated demo two.'); + + // Verify S03 was deleted + const s03 = getSlice('M001', 'S03'); + assert.equal(s03, null, 'S03 should have been deleted'); + + // Verify S04 was inserted + const s04 = getSlice('M001', 'S04'); + assert.ok(s04, 'S04 should exist as a new slice'); + assert.equal(s04?.title, 'New Slice Four'); + assert.equal(s04?.status, 'pending'); + + // Verify S01 (completed) was NOT touched + const s01 = getSlice('M001', 'S01'); + assert.ok(s01, 'S01 should still exist'); + assert.equal(s01?.status, 'complete'); + + // Verify ROADMAP.md re-rendered on disk + const roadmapPath = join(base, '.gsd', 'milestones', 'M001', 'M001-ROADMAP.md'); + assert.ok(existsSync(roadmapPath), 'ROADMAP.md should be rendered to disk'); + const roadmapContent = readFileSync(roadmapPath, 'utf-8'); + assert.ok(roadmapContent.includes('Updated Slice Two'), 'ROADMAP.md should contain updated S02 title'); + + // Verify ASSESSMENT.md exists on disk + const assessmentDiskPath = join(base, '.gsd', 'milestones', 'M001', 'slices', 'S01', 'S01-ASSESSMENT.md'); + assert.ok(existsSync(assessmentDiskPath), 'ASSESSMENT.md should be rendered to disk'); + const assessmentContent = readFileSync(assessmentDiskPath, 'utf-8'); + 
assert.ok(assessmentContent.includes('confirmed'), 'ASSESSMENT.md should contain verdict'); + assert.ok(assessmentContent.includes('S01'), 'ASSESSMENT.md should reference completed slice'); + } finally { + cleanup(base); + } +}); + +test('handleReassessRoadmap cache invalidation: getMilestoneSlices reflects mutations', async () => { + const base = makeTmpBase(); + openDatabase(join(base, '.gsd', 'gsd.db')); + + try { + seedMilestoneWithSlices({ s01Status: 'complete', s02Status: 'pending', s03Status: 'pending' }); + + const params = validReassessParams(); + const result = await handleReassessRoadmap(params, base); + assert.ok(!('error' in result), `unexpected error: ${'error' in result ? result.error : ''}`); + + // After cache invalidation, DB queries should reflect mutations + const slices = getMilestoneSlices('M001'); + const sliceIds = slices.map(s => s.id); + + // S01 should remain (completed, untouched) + assert.ok(sliceIds.includes('S01'), 'S01 should still exist after reassess'); + + // S02 should remain (modified, not removed) + assert.ok(sliceIds.includes('S02'), 'S02 should still exist after reassess'); + + // S03 should be gone (removed) + assert.ok(!sliceIds.includes('S03'), 'S03 should be gone after removal'); + + // S04 should exist (added) + assert.ok(sliceIds.includes('S04'), 'S04 should exist after addition'); + } finally { + cleanup(base); + } +}); + +test('handleReassessRoadmap is idempotent: calling twice with same params succeeds', async () => { + const base = makeTmpBase(); + openDatabase(join(base, '.gsd', 'gsd.db')); + + try { + seedMilestoneWithSlices({ s01Status: 'complete', s02Status: 'pending', s03Status: 'pending' }); + + // First call with full mutations + const params = validReassessParams(); + const first = await handleReassessRoadmap(params, base); + assert.ok(!('error' in first), `first call error: ${'error' in first ? 
first.error : ''}`); + + // Second call — S03 already deleted, S04 already exists (INSERT OR IGNORE), S02 already updated + // This should still succeed because: + // - assessments uses INSERT OR REPLACE (path PK) + // - S04 insert uses INSERT OR IGNORE + // - S02 update is idempotent + // - S03 delete on nonexistent is a no-op + const second = await handleReassessRoadmap(params, base); + assert.ok(!('error' in second), `second call error: ${'error' in second ? second.error : ''}`); + } finally { + cleanup(base); + } +}); + +test('handleReassessRoadmap rejects slice with status "done" (alias for complete)', async () => { + const base = makeTmpBase(); + openDatabase(join(base, '.gsd', 'gsd.db')); + + try { + seedMilestoneWithSlices({ s01Status: 'done', s02Status: 'pending', s03Status: 'pending' }); + + const result = await handleReassessRoadmap({ + ...validReassessParams(), + sliceChanges: { + modified: [{ sliceId: 'S01', title: 'Trying to modify done S01' }], + added: [], + removed: [], + }, + }, base); + + assert.ok('error' in result); + assert.match(result.error, /completed slice/); + assert.match(result.error, /S01/); + } finally { + cleanup(base); + } +}); + +test('handleReassessRoadmap returns structured error payloads with actionable messages', async () => { + const base = makeTmpBase(); + openDatabase(join(base, '.gsd', 'gsd.db')); + + try { + seedMilestoneWithSlices({ s01Status: 'complete', s02Status: 'complete', s03Status: 'pending' }); + + // Try to modify S01 (completed) + const modifyResult = await handleReassessRoadmap({ + ...validReassessParams(), + sliceChanges: { + modified: [{ sliceId: 'S01', title: 'x' }], + added: [], + removed: [], + }, + }, base); + assert.ok('error' in modifyResult); + assert.ok(typeof modifyResult.error === 'string', 'error should be a string'); + assert.ok(modifyResult.error.includes('S01'), 'error should name the specific slice ID S01'); + + // Try to remove S02 (completed) + const removeResult = await 
handleReassessRoadmap({ + ...validReassessParams(), + sliceChanges: { + modified: [], + added: [], + removed: ['S02'], + }, + }, base); + assert.ok('error' in removeResult); + assert.ok(removeResult.error.includes('S02'), 'error should name the specific slice ID S02'); + } finally { + cleanup(base); + } +}); diff --git a/src/resources/extensions/gsd/tools/reassess-roadmap.ts b/src/resources/extensions/gsd/tools/reassess-roadmap.ts new file mode 100644 index 000000000..e395afe64 --- /dev/null +++ b/src/resources/extensions/gsd/tools/reassess-roadmap.ts @@ -0,0 +1,203 @@ +import { clearParseCache } from "../files.js"; +import { + transaction, + getMilestone, + getMilestoneSlices, + insertSlice, + updateSliceFields, + insertAssessment, + deleteSlice, +} from "../gsd-db.js"; +import { invalidateStateCache } from "../state.js"; +import { renderRoadmapFromDb, renderAssessmentFromDb } from "../markdown-renderer.js"; +import { join } from "node:path"; + +export interface SliceChangeInput { + sliceId: string; + title: string; + risk?: string; + depends?: string[]; + demo?: string; +} + +export interface ReassessRoadmapParams { + milestoneId: string; + completedSliceId: string; + verdict: string; + assessment: string; + sliceChanges: { + modified: SliceChangeInput[]; + added: SliceChangeInput[]; + removed: string[]; + }; +} + +export interface ReassessRoadmapResult { + milestoneId: string; + completedSliceId: string; + assessmentPath: string; + roadmapPath: string; +} + +function isNonEmptyString(value: unknown): value is string { + return typeof value === "string" && value.trim().length > 0; +} + +function validateParams(params: ReassessRoadmapParams): ReassessRoadmapParams { + if (!isNonEmptyString(params?.milestoneId)) throw new Error("milestoneId is required"); + if (!isNonEmptyString(params?.completedSliceId)) throw new Error("completedSliceId is required"); + if (!isNonEmptyString(params?.verdict)) throw new Error("verdict is required"); + if 
(!isNonEmptyString(params?.assessment)) throw new Error("assessment is required"); + + if (!params.sliceChanges || typeof params.sliceChanges !== "object") { + throw new Error("sliceChanges must be an object"); + } + + if (!Array.isArray(params.sliceChanges.modified)) { + throw new Error("sliceChanges.modified must be an array"); + } + + if (!Array.isArray(params.sliceChanges.added)) { + throw new Error("sliceChanges.added must be an array"); + } + + if (!Array.isArray(params.sliceChanges.removed)) { + throw new Error("sliceChanges.removed must be an array"); + } + + // Validate each modified slice + for (let i = 0; i < params.sliceChanges.modified.length; i++) { + const s = params.sliceChanges.modified[i]; + if (!s || typeof s !== "object") throw new Error(`sliceChanges.modified[${i}] must be an object`); + if (!isNonEmptyString(s.sliceId)) throw new Error(`sliceChanges.modified[${i}].sliceId is required`); + if (!isNonEmptyString(s.title)) throw new Error(`sliceChanges.modified[${i}].title is required`); + } + + // Validate each added slice + for (let i = 0; i < params.sliceChanges.added.length; i++) { + const s = params.sliceChanges.added[i]; + if (!s || typeof s !== "object") throw new Error(`sliceChanges.added[${i}] must be an object`); + if (!isNonEmptyString(s.sliceId)) throw new Error(`sliceChanges.added[${i}].sliceId is required`); + if (!isNonEmptyString(s.title)) throw new Error(`sliceChanges.added[${i}].title is required`); + } + + return params; +} + +export async function handleReassessRoadmap( + rawParams: ReassessRoadmapParams, + basePath: string, +): Promise<ReassessRoadmapResult | { error: string }> { + // ── Validate ────────────────────────────────────────────────────── + let params: ReassessRoadmapParams; + try { + params = validateParams(rawParams); + } catch (err) { + return { error: `validation failed: ${(err as Error).message}` }; + } + + // ── Verify milestone exists ─────────────────────────────────────── + const milestone = getMilestone(params.milestoneId); + if (!milestone) { + 
return { error: `milestone not found: ${params.milestoneId}` }; + } + + // ── Structural enforcement ──────────────────────────────────────── + const existingSlices = getMilestoneSlices(params.milestoneId); + const completedSliceIds = new Set(); + for (const slice of existingSlices) { + if (slice.status === "complete" || slice.status === "done") { + completedSliceIds.add(slice.id); + } + } + + // Reject modifications to completed slices + for (const modifiedSlice of params.sliceChanges.modified) { + if (completedSliceIds.has(modifiedSlice.sliceId)) { + return { error: `cannot modify completed slice ${modifiedSlice.sliceId}` }; + } + } + + // Reject removal of completed slices + for (const removedId of params.sliceChanges.removed) { + if (completedSliceIds.has(removedId)) { + return { error: `cannot remove completed slice ${removedId}` }; + } + } + + // ── Compute assessment artifact path ────────────────────────────── + // Assessment lives in the completed slice's directory + const assessmentRelPath = join( + ".gsd", "milestones", params.milestoneId, + "slices", params.completedSliceId, + `${params.completedSliceId}-ASSESSMENT.md`, + ); + + // ── Transaction: DB mutations ───────────────────────────────────── + try { + transaction(() => { + // Record assessment + insertAssessment({ + path: assessmentRelPath, + milestoneId: params.milestoneId, + sliceId: params.completedSliceId, + status: params.verdict, + scope: "roadmap", + fullContent: params.assessment, + }); + + // Apply slice modifications + for (const mod of params.sliceChanges.modified) { + updateSliceFields(params.milestoneId, mod.sliceId, { + title: mod.title, + risk: mod.risk, + depends: mod.depends, + demo: mod.demo, + }); + } + + // Insert new slices + for (const added of params.sliceChanges.added) { + insertSlice({ + id: added.sliceId, + milestoneId: params.milestoneId, + title: added.title, + status: "pending", + risk: added.risk, + depends: added.depends, + demo: added.demo ?? 
"", + }); + } + + // Delete removed slices + for (const removedId of params.sliceChanges.removed) { + deleteSlice(params.milestoneId, removedId); + } + }); + } catch (err) { + return { error: `db write failed: ${(err as Error).message}` }; + } + + // ── Render artifacts ────────────────────────────────────────────── + try { + const roadmapResult = await renderRoadmapFromDb(basePath, params.milestoneId); + const assessmentResult = await renderAssessmentFromDb(basePath, params.milestoneId, params.completedSliceId, { + verdict: params.verdict, + assessment: params.assessment, + completedSliceId: params.completedSliceId, + }); + + // ── Invalidate caches ───────────────────────────────────────── + invalidateStateCache(); + clearParseCache(); + + return { + milestoneId: params.milestoneId, + completedSliceId: params.completedSliceId, + assessmentPath: assessmentResult.assessmentPath, + roadmapPath: roadmapResult.roadmapPath, + }; + } catch (err) { + return { error: `render failed: ${(err as Error).message}` }; + } +}