test(S02/T03): Update plan-slice prompt to explicitly name gsd_plan_sli…

- src/resources/extensions/gsd/prompts/plan-slice.md
- src/resources/extensions/gsd/tests/prompt-contracts.test.ts
- src/resources/extensions/gsd/tests/plan-slice-prompt.test.ts
- .gsd/milestones/M001/slices/S02/tasks/T03-PLAN.md
2026-03-23 10:08:44 -06:00 · d53bf56bae
commit d53bf56bae
parent a380b8ed77
7 changed files with 114 additions and 5 deletions
--- a/.gsd/milestones/M001/slices/S02/S02-PLAN.md
+++ b/.gsd/milestones/M001/slices/S02/S02-PLAN.md
@@ -51,7 +51,7 @@ I’m splitting this into three tasks because there are three distinct failure b
  - Do: Follow the S01 handler pattern exactly for both tools, add any missing DB upsert/query helpers needed to populate task planning fields and retrieve slice/task planning state, register canonical tools plus aliases in `db-tools.ts`, and test validation, missing-parent rejection, transactional DB writes, render-failure handling, idempotent reruns, and observable cache invalidation.
  - Verify: `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/plan-slice.test.ts src/resources/extensions/gsd/tests/plan-task.test.ts`
  - Done when: `gsd_plan_slice` and `gsd_plan_task` exist as registered DB tools, reject malformed input, render plan artifacts after successful writes, and refresh parse-visible state immediately.
-- [ ] **T03: Close prompt and contract coverage around DB-backed slice planning** `est:45m`
+- [x] **T03: Close prompt and contract coverage around DB-backed slice planning** `est:45m`
  - Why: The implementation is incomplete until the planning prompt/test surface actually points at the new tools and proves the DB-backed route is the expected contract instead of manual markdown edits.
  - Files: `src/resources/extensions/gsd/prompts/plan-slice.md`, `src/resources/extensions/gsd/tests/prompt-contracts.test.ts`, `src/resources/extensions/gsd/bootstrap/db-tools.ts`, `src/resources/extensions/gsd/tests/plan-slice-prompt.test.ts`
  - Do: Update the slice planning prompt text to require tool-backed planning state when `gsd_plan_slice` / `gsd_plan_task` are available, tighten prompt-contract assertions for the new tools, and add/adjust prompt template tests so the planning surface stays aligned with the registered tool path.
--- a/.gsd/milestones/M001/slices/S02/tasks/T02-VERIFY.json
+++ b/.gsd/milestones/M001/slices/S02/tasks/T02-VERIFY.json
@@ -0,0 +1,18 @@
+{
+  "schemaVersion": 1,
+  "taskId": "T02",
+  "unitId": "M001/S02/T02",
+  "timestamp": 1774281912502,
+  "passed": false,
+  "discoverySource": "package-json",
+  "checks": [
+    {
+      "command": "npm run test",
+      "exitCode": 1,
+      "durationMs": 34647,
+      "verdict": "fail"
+    }
+  ],
+  "retryAttempt": 1,
+  "maxRetries": 2
+}
--- a/.gsd/milestones/M001/slices/S02/tasks/T03-PLAN.md
+++ b/.gsd/milestones/M001/slices/S02/tasks/T03-PLAN.md
@@ -45,3 +45,9 @@ Finish the slice by aligning the planning prompt surface with the new implementa
 - `src/resources/extensions/gsd/prompts/plan-slice.md` — updated DB-backed slice/task planning instructions
 - `src/resources/extensions/gsd/tests/prompt-contracts.test.ts` — stronger prompt contract coverage for `gsd_plan_slice` / `gsd_plan_task`
 - `src/resources/extensions/gsd/tests/plan-slice-prompt.test.ts` — updated template tests if prompt wording changes affect expectations
+
+## Observability Impact
+
+- **Signals changed:** The planning prompt now explicitly names `gsd_plan_slice` and `gsd_plan_task` tools, so any agent following the prompt will emit structured tool calls instead of raw file writes — making planning actions observable via tool-call logs rather than implicit file-write patterns.
+- **Inspection surface:** `prompt-contracts.test.ts` assertions referencing the canonical tool names serve as the regression tripwire; if the prompt text drifts back to manual-write instructions, these tests fail immediately.
+- **Failure visibility:** A regression in the prompt wording (removing tool references or re-introducing manual write instructions) is caught by the contract tests before it reaches production prompt surfaces.
--- a/.gsd/milestones/M001/slices/S02/tasks/T03-SUMMARY.md
+++ b/.gsd/milestones/M001/slices/S02/tasks/T03-SUMMARY.md
@@ -0,0 +1,59 @@
+---
+id: T03
+parent: S02
+milestone: M001
+key_files:
+  - src/resources/extensions/gsd/prompts/plan-slice.md
+  - src/resources/extensions/gsd/tests/prompt-contracts.test.ts
+  - src/resources/extensions/gsd/tests/plan-slice-prompt.test.ts
+  - .gsd/milestones/M001/slices/S02/tasks/T03-PLAN.md
+key_decisions:
+  - The plan-slice prompt now uses `gsd_plan_slice` and `gsd_plan_task` as the primary numbered step (step 6) instead of a conditional afterthought (old step 8), with direct file writes explicitly labeled as a degraded fallback (step 7).
+duration: ""
+verification_result: passed
+completed_at: 2026-03-23T16:08:41.655Z
+blocker_discovered: false
+---
+
+# T03: Update plan-slice prompt to explicitly name gsd_plan_slice/gsd_plan_task as canonical write path, add prompt contract and template regression tests
+
+**Update plan-slice prompt to explicitly name gsd_plan_slice/gsd_plan_task as canonical write path, add prompt contract and template regression tests**
+
+## What Happened
+
+Updated `src/resources/extensions/gsd/prompts/plan-slice.md` to replace the vague "if the tool path for this planning phase is available" language with explicit instructions naming `gsd_plan_slice` and `gsd_plan_task` as the canonical DB-backed write path for slice and task planning. The new step 6 instructs calling `gsd_plan_slice` with the full payload and `gsd_plan_task` for each task. Step 7 positions direct file writes as an explicitly degraded fallback path only used when the tools are unavailable, not the default. Removed the old step 8 that vaguely referenced "the tool path" and fixed step numbering.
+
+Added 3 new prompt contract tests in `prompt-contracts.test.ts`: one verifying both tool names appear and the "canonical write path" language is present, one verifying direct file writes are framed as "degraded path, not the default" and that the prompt no longer has a bare "Write `{{outputPath}}`" as a primary numbered step, and one verifying the prompt instructs calling `gsd_plan_task` for each task.
+
+Added 1 new template substitution test in `plan-slice-prompt.test.ts` confirming the tool names and canonical language survive variable substitution.
+
+Also applied the task-plan pre-flight fix by adding an `## Observability Impact` section to T03-PLAN.md explaining how the prompt change makes planning actions observable via tool-call logs and how the contract tests serve as regression tripwires.
+
+## Verification
+
+Ran all three slice-level verification commands: (1) plan-slice.test.ts + plan-task.test.ts — 10/10 pass, (2) markdown-renderer.test.ts + auto-recovery.test.ts + prompt-contracts.test.ts filtered to planning patterns — 60/60 pass, (3) plan-slice.test.ts + plan-task.test.ts filtered to failure/cache/validation — 10/10 pass. Also ran the task-level verification command (prompt-contracts.test.ts + plan-slice-prompt.test.ts filtered to plan-slice|plan task|DB-backed) — 40/40 pass. Read back the prompt-contracts.test.ts assertions and confirmed they explicitly reference gsd_plan_slice and gsd_plan_task.
+
+## Verification Evidence
+
+| # | Command | Exit Code | Verdict | Duration |
+|---|---------|-----------|---------|----------|
+| 1 | `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/prompt-contracts.test.ts src/resources/extensions/gsd/tests/plan-slice-prompt.test.ts --test-name-pattern="plan-slice|plan task|DB-backed"` | 0 | ✅ pass | 126ms |
+| 2 | `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/plan-slice.test.ts src/resources/extensions/gsd/tests/plan-task.test.ts` | 0 | ✅ pass | 180ms |
+| 3 | `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/markdown-renderer.test.ts src/resources/extensions/gsd/tests/auto-recovery.test.ts src/resources/extensions/gsd/tests/prompt-contracts.test.ts --test-name-pattern="plan-slice|plan-task|renderPlanFromDb|renderTaskPlanFromDb|task plan|DB-backed planning"` | 0 | ✅ pass | 695ms |
+| 4 | `node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/plan-slice.test.ts src/resources/extensions/gsd/tests/plan-task.test.ts --test-name-pattern="validation failed|render failed|cache|missing parent"` | 0 | ✅ pass | 180ms |
+
+
+## Deviations
+
+None.
+
+## Known Issues
+
+None.
+
+## Files Created/Modified
+
+- `src/resources/extensions/gsd/prompts/plan-slice.md`
+- `src/resources/extensions/gsd/tests/prompt-contracts.test.ts`
+- `src/resources/extensions/gsd/tests/plan-slice-prompt.test.ts`
+- `.gsd/milestones/M001/slices/S02/tasks/T03-PLAN.md`
--- a/src/resources/extensions/gsd/prompts/plan-slice.md
+++ b/src/resources/extensions/gsd/prompts/plan-slice.md
@@ -63,10 +63,9 @@ Then:
   - a matching task plan file with description, steps, must-haves, verification, inputs, and expected output
   - **Inputs and Expected Output must list concrete backtick-wrapped file paths** (e.g. `` `src/types.ts` ``). These are machine-parsed to derive task dependencies — vague prose without paths breaks parallel execution. Every task must have at least one output file path.
   - Observability Impact section **only if the task touches runtime boundaries, async flows, or error paths** — omit it otherwise
-6. Write `{{outputPath}}`
-7. Write individual task plans in `{{slicePath}}/tasks/`: `T01-PLAN.md`, `T02-PLAN.md`, etc.
-8. If the tool path for this planning phase is available, call it to persist the slice planning state before finishing. Do **not** rely on direct `PLAN.md` writes as the source of truth; any plan file you write must reflect tool-backed state rather than bypass it.
-9. **Self-audit the plan.** Walk through each check — if any fail, fix the plan files before moving on:
+6. **Persist planning state through DB-backed tools.** Call `gsd_plan_slice` with the full slice planning payload (goal, demo, must-haves, verification, tasks, and metadata). Then call `gsd_plan_task` for each task to persist its planning fields. These tools write to the DB and render `{{outputPath}}` and `{{slicePath}}/tasks/T##-PLAN.md` files automatically. Do **not** rely on direct `PLAN.md` writes as the source of truth; the DB-backed tools are the canonical write path for slice and task planning state.
+7. If `gsd_plan_slice` / `gsd_plan_task` are unavailable (tool not registered), fall back to writing `{{outputPath}}` and task plan files directly — but treat this as a degraded path, not the default.
+8. **Self-audit the plan.** Walk through each check — if any fail, fix the plan files before moving on:
    - **Completion semantics:** If every task were completed exactly as written, the slice goal/demo should actually be true.
    - **Requirement coverage:** Every must-have in the slice maps to at least one task. No must-have is orphaned. If `REQUIREMENTS.md` exists, every Active requirement this slice owns maps to at least one task.
    - **Task completeness:** Every task has steps, must-haves, verification, inputs, and expected output — none are blank or vague. Inputs and Expected Output list backtick-wrapped file paths, not prose descriptions.
--- a/src/resources/extensions/gsd/tests/plan-slice-prompt.test.ts
+++ b/src/resources/extensions/gsd/tests/plan-slice-prompt.test.ts
@@ -54,6 +54,13 @@ test("plan-slice prompt: all variables substituted", () => {
  assert.ok(result.includes("S01"));
 });

+test("plan-slice prompt: DB-backed tool names survive template substitution", () => {
+  const result = loadPrompt("plan-slice", { ...BASE_VARS, commitInstruction: "Do not commit." });
+  assert.ok(result.includes("gsd_plan_slice"), "gsd_plan_slice should appear in rendered prompt");
+  assert.ok(result.includes("gsd_plan_task"), "gsd_plan_task should appear in rendered prompt");
+  assert.ok(result.includes("canonical write path"), "canonical write path language should survive substitution");
+});
+
 test("domain-work prompts use skillActivation placeholder", () => {
  const prompts = [
    "research-milestone",
--- a/src/resources/extensions/gsd/tests/prompt-contracts.test.ts
+++ b/src/resources/extensions/gsd/tests/prompt-contracts.test.ts
@@ -147,6 +147,26 @@ test("plan-slice prompt no longer frames direct PLAN writes as the source of tru
  assert.match(prompt, /Do \*\*not\*\* rely on direct `PLAN\.md` writes as the source of truth/i);
 });

+test("plan-slice prompt explicitly names gsd_plan_slice and gsd_plan_task as DB-backed planning tools", () => {
+  const prompt = readPrompt("plan-slice");
+  assert.match(prompt, /gsd_plan_slice/);
+  assert.match(prompt, /gsd_plan_task/);
+  // The prompt should describe these as the canonical write path
+  assert.match(prompt, /DB-backed tools are the canonical write path/i);
+});
+
+test("plan-slice prompt treats direct file writes as a degraded fallback, not the default", () => {
+  const prompt = readPrompt("plan-slice");
+  assert.match(prompt, /degraded path, not the default/i);
+  // Should not instruct to "Write {{outputPath}}" as a primary step
+  assert.doesNotMatch(prompt, /^\d+\.\s+Write `?\{\{outputPath\}\}`?\s*$/m);
+});
+
+test("plan-slice prompt instructs calling gsd_plan_task for each task", () => {
+  const prompt = readPrompt("plan-slice");
+  assert.match(prompt, /call `gsd_plan_task` for each task/i);
+});
+
 test("replan-slice prompt requires DB-backed planning state when available", () => {
  const prompt = readPrompt("replan-slice");
  assert.match(prompt, /DB-backed planning tool exists for this phase, use it as the source of truth/i);
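
Note on the `doesNotMatch` assertion added in `prompt-contracts.test.ts`: the sketch below illustrates how that pattern is expected to behave against the removed and added wording of step 6. It is a minimal, standalone illustration rather than code from the repository. The constants `OLD_STEP`, `NEW_STEP`, and `MIXED_PROMPT` and the helper `hasBareWriteStep` are hypothetical names, and the step strings are abbreviated from the `plan-slice.md` hunk in this commit.

```typescript
// Hypothetical, abbreviated excerpts of the plan-slice prompt (not the full file).
const OLD_STEP = "6. Write `{{outputPath}}`";
const NEW_STEP =
  "6. **Persist planning state through DB-backed tools.** Call `gsd_plan_slice` with the full slice planning payload.";

// Same pattern as the contract test: a numbered step whose entire text is
// "Write {{outputPath}}", with optional backticks around the placeholder.
const BARE_WRITE_STEP = /^\d+\.\s+Write `?\{\{outputPath\}\}`?\s*$/m;

// Hypothetical helper: true when the prompt still contains a bare write step.
function hasBareWriteStep(prompt: string): boolean {
  return BARE_WRITE_STEP.test(prompt);
}

// The `m` flag makes `^` and `$` match per line, so a bare write step is
// detected even when it sits in the middle of a longer prompt.
const MIXED_PROMPT = ["5. Decompose the slice into tasks.", OLD_STEP, "7. Self-audit the plan."].join("\n");

console.log(hasBareWriteStep(OLD_STEP)); // true
console.log(hasBareWriteStep(NEW_STEP)); // false
console.log(hasBareWriteStep(MIXED_PROMPT)); // true
```

Because the pattern is anchored to a whole line, the fallback wording in the new step 7, which mentions `{{outputPath}}` mid-sentence, would not trip the assertion; only a step that consists solely of the write instruction does.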