fix: verify UAT artifact before marking complete-slice done (#176, #175)

complete-slice verification checked only for the SUMMARY file, so when
the LLM skipped writing the UAT, the unit was still marked complete and
the UAT was never produced. Users saw doctor-created placeholder UATs
instead of real test scripts.

- verifyExpectedArtifact now checks both SUMMARY and UAT for complete-slice
- complete-slice prompt strengthened: step 7 requires concrete test cases,
  MUST line lists all three required artifacts with enforcement warning

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Lex Christopherson 2026-03-13 09:09:59 -06:00
parent b07d34b448
commit 71d3a69646
2 changed files with 26 additions and 6 deletions

@@ -2744,14 +2744,34 @@ export function resolveExpectedArtifactPath(unitType: string, unitId: string, ba
 }
 /**
- * Check whether the expected artifact for a unit exists on disk.
- * Returns true if the artifact file exists, or if the unit type has no
+ * Check whether the expected artifact(s) for a unit exist on disk.
+ * Returns true if all required artifacts exist, or if the unit type has no
  * single verifiable artifact (e.g., replan-slice).
+ *
+ * complete-slice requires both SUMMARY and UAT files; verifying only
+ * the summary allowed the unit to be marked complete when the LLM
+ * skipped writing the UAT file (see #176).
  */
 function verifyExpectedArtifact(unitType: string, unitId: string, base: string): boolean {
   const absPath = resolveExpectedArtifactPath(unitType, unitId, base);
   if (!absPath) return true;
-  return existsSync(absPath);
+  if (!existsSync(absPath)) return false;
+  // complete-slice must also produce a UAT file
+  if (unitType === "complete-slice") {
+    const parts = unitId.split("/");
+    const mid = parts[0];
+    const sid = parts[1];
+    if (mid && sid) {
+      const dir = resolveSlicePath(base, mid, sid);
+      if (dir) {
+        const uatPath = join(dir, buildSliceFileName(sid, "UAT"));
+        if (!existsSync(uatPath)) return false;
+      }
+    }
+  }
+  return true;
 }
 /**
@@ -2795,7 +2815,7 @@ function diagnoseExpectedArtifact(unitType: string, unitId: string, base: string
       return `Task ${tid} marked [x] in ${relSliceFile(base, mid!, sid!, "PLAN")} + summary written`;
     }
     case "complete-slice":
-      return `Slice ${sid} marked [x] in ${relMilestoneFile(base, mid!, "ROADMAP")} + summary written`;
+      return `Slice ${sid} marked [x] in ${relMilestoneFile(base, mid!, "ROADMAP")} + summary + UAT written`;
     case "replan-slice":
       return `${relSliceFile(base, mid!, sid!, "REPLAN")} + updated ${relSliceFile(base, mid!, sid!, "PLAN")}`;
     case "reassess-roadmap":

@@ -17,13 +17,13 @@ Then:
 4. If the slice plan includes observability/diagnostic surfaces, confirm they work. Skip this for simple slices that don't have observability sections.
 5. If `.gsd/REQUIREMENTS.md` exists, update it based on what this slice actually proved. Move requirements between Active, Validated, Deferred, Blocked, or Out of Scope only when the evidence from execution supports that change.
 6. Write `{{sliceSummaryAbsPath}}` (compress all task summaries).
-7. Write `{{sliceUatAbsPath}}`.
+7. Write `{{sliceUatAbsPath}}` — a concrete UAT script with real test cases derived from the slice plan and task summaries. Include preconditions, numbered steps with expected outcomes, and edge cases. This must NOT be a placeholder or generic template — tailor every test case to what this slice actually built.
 8. Review task summaries for `key_decisions`. Append any significant decisions to `.gsd/DECISIONS.md` if missing.
 9. Mark {{sliceId}} done in `{{roadmapPath}}` (change `[ ]` to `[x]`)
 10. Do not commit or squash-merge manually — the system auto-commits your changes and handles the merge after this unit succeeds.
 11. Update `.gsd/PROJECT.md` if it exists — refresh current state if needed.
 12. Update `.gsd/STATE.md`
-**You MUST mark {{sliceId}} as `[x]` in `{{roadmapPath}}` AND write `{{sliceSummaryAbsPath}}` before finishing.**
+**You MUST do ALL THREE before finishing: (1) write `{{sliceSummaryAbsPath}}`, (2) write `{{sliceUatAbsPath}}`, (3) mark {{sliceId}} as `[x]` in `{{roadmapPath}}`. The unit will not be marked complete if any of these files are missing.**
 When done, say: "Slice {{sliceId}} complete."
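
Review note: the two-artifact check this commit adds to `verifyExpectedArtifact` can be exercised in isolation. The following is a minimal sketch, not the project's code. `sliceDir` and `sliceFile` are hypothetical stand-ins for `resolveSlicePath` and `buildSliceFileName`, whose implementations are not shown in the diff. The directory layout, the file naming scheme, and the IDs `M001`/`S01` are assumptions made only for illustration.

```typescript
import { existsSync, mkdirSync, mkdtempSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical stand-in for resolveSlicePath; the real layout may differ.
function sliceDir(base: string, mid: string, sid: string): string {
  return join(base, mid, sid);
}

// Hypothetical stand-in for buildSliceFileName; the real naming may differ.
function sliceFile(sid: string, kind: "SUMMARY" | "UAT"): string {
  return `${sid}-${kind}.md`;
}

// Returns true only when both artifacts required for complete-slice exist.
function verifyCompleteSlice(unitId: string, base: string): boolean {
  const [mid, sid] = unitId.split("/");
  // Assumption: a unit ID without both parts is treated as unverifiable,
  // similar to the null-path case in the diff, so it is not failed here.
  if (!mid || !sid) return true;
  const dir = sliceDir(base, mid, sid);
  return (["SUMMARY", "UAT"] as const).every((kind) =>
    existsSync(join(dir, sliceFile(sid, kind))),
  );
}

// Usage: reproduce the regression scenario in a throwaway directory.
const base = mkdtempSync(join(tmpdir(), "gsd-sketch-"));
const dir = sliceDir(base, "M001", "S01");
mkdirSync(dir, { recursive: true });

// Only the summary exists: the old single-file check would have passed here.
writeFileSync(join(dir, sliceFile("S01", "SUMMARY")), "summary");
const beforeUat = verifyCompleteSlice("M001/S01", base);

// Once the UAT is written as well, the unit verifies.
writeFileSync(join(dir, sliceFile("S01", "UAT")), "uat");
const afterUat = verifyCompleteSlice("M001/S01", base);

console.log({ beforeUat, afterUat });
```

With only the SUMMARY file on disk the sketch reports failure, which is the case the previous single-file check let through (#176).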