Merge pull request #3850 from jeremymcs/fix/auto-loop-test-timeouts
fix: make gsd_complete_task the only execute-task summary path
commit ff54c91dd8
4 changed files with 31 additions and 14 deletions
@@ -25,11 +25,11 @@ Then:
 4. If the slice plan includes observability/diagnostic surfaces, confirm they work. Skip this for simple slices that don't have observability sections.
 5. If the slice involved runtime behavior, fill the **Operational Readiness** section (Q8) in the slice summary: health signal, failure signal, recovery procedure, and monitoring gaps. Omit entirely for simple slices with no runtime concerns.
 6. If this slice produced evidence that a requirement changed status (Active → Validated, Active → Deferred, etc.), call `gsd_requirement_update` with the requirement ID, updated `status`, and `validation` evidence. Do NOT write `.gsd/REQUIREMENTS.md` directly — the engine renders it from the database.
-7. Write `{{sliceSummaryPath}}` (compress all task summaries).
-8. Write `{{sliceUatPath}}` — a concrete UAT script with real test cases derived from the slice plan and task summaries. Include preconditions, numbered steps with expected outcomes, and edge cases. This must NOT be a placeholder or generic template — tailor every test case to what this slice actually built.
+7. Prepare the slice completion content you will pass to `gsd_complete_slice` using the camelCase fields `milestoneId`, `sliceId`, `sliceTitle`, `oneLiner`, `narrative`, `verification`, and `uatContent`. Do **not** manually write `{{sliceSummaryPath}}`. Do **not** manually write `{{sliceUatPath}}` — the DB-backed tool is the canonical write path for both artifacts.
+8. Draft the UAT content you will pass as `uatContent` — a concrete UAT script with real test cases derived from the slice plan and task summaries. Include preconditions, numbered steps with expected outcomes, and edge cases. This must NOT be a placeholder or generic template — tailor every test case to what this slice actually built.
 9. Review task summaries for `key_decisions`. Append any significant decisions to `.gsd/DECISIONS.md` if missing.
 10. Review task summaries for patterns, gotchas, or non-obvious lessons learned. If any would save future agents from repeating investigation or hitting the same issues, append them to `.gsd/KNOWLEDGE.md`. Only add entries that are genuinely useful — don't pad with obvious observations.
-11. Call `gsd_complete_slice` with milestoneId, sliceId, the slice summary, and the UAT result. Do NOT manually mark the roadmap checkbox — the tool writes to the DB and renders the ROADMAP.md projection automatically.
+11. Call `gsd_complete_slice` with the camelCase fields `milestoneId`, `sliceId`, `sliceTitle`, `oneLiner`, `narrative`, `verification`, and `uatContent`, plus any optional enrichment fields you have. Do NOT manually mark the roadmap checkbox — the tool writes to the DB, renders `{{sliceSummaryPath}}` and `{{sliceUatPath}}`, and updates the ROADMAP.md projection automatically.
 12. Do not run git commands — the system commits your changes and handles any merge after this unit succeeds.
 13. Update `.gsd/PROJECT.md` if it exists — refresh current state if needed: use the `write` tool with `path: ".gsd/PROJECT.md"` and `content` containing the full updated document reflecting current project state. Do NOT use the `edit` tool for this — PROJECT.md is a full-document refresh.

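Reviewer note: the complete-slice prompt now names seven camelCase fields for `gsd_complete_slice`. The sketch below is one way to picture that payload and to check it before the call. The field names come from the prompt text; the `string` types, the `CompleteSliceArgs` interface, and the `missingSliceFields` helper are assumptions made for illustration and are not taken from the tool's actual schema.

```typescript
// Hypothetical argument shape. Field names are quoted from the prompt;
// the string types are an assumption, not the tool's real schema.
interface CompleteSliceArgs {
  milestoneId: string;
  sliceId: string;
  sliceTitle: string;
  oneLiner: string;
  narrative: string;
  verification: string;
  uatContent: string;
}

const REQUIRED_SLICE_FIELDS: ReadonlyArray<keyof CompleteSliceArgs> = [
  "milestoneId",
  "sliceId",
  "sliceTitle",
  "oneLiner",
  "narrative",
  "verification",
  "uatContent",
];

// Hypothetical helper: returns the names of required fields that are
// missing or contain only whitespace.
function missingSliceFields(args: Partial<CompleteSliceArgs>): string[] {
  return REQUIRED_SLICE_FIELDS.filter((field) => {
    const value = args[field];
    return typeof value !== "string" || value.trim() === "";
  });
}

// Usage with made-up IDs: only two of the seven fields are filled in.
const gaps = missingSliceFields({ milestoneId: "M001", sliceId: "S01" });
console.log(gaps.join(", "));
```

A pre-flight check like this would surface an incomplete payload before the tool call, which matters more now that the tool, not a hand-written file, is the only write path for the summary and UAT artifacts.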
@@ -69,14 +69,14 @@ Then:
 16. If you made an architectural, pattern, library, or observability decision during this task that downstream work should know about, append it to `.gsd/DECISIONS.md` (read the template at `~/.gsd/agent/extensions/gsd/templates/decisions.md` if the file doesn't exist yet). Not every task produces decisions — only append when a meaningful choice was made.
 17. If you discover a non-obvious rule, recurring gotcha, or useful pattern during execution, append it to `.gsd/KNOWLEDGE.md`. Only add entries that would save future agents from repeating your investigation. Don't add obvious things.
 18. Read the template at `~/.gsd/agent/extensions/gsd/templates/task-summary.md`
-19. Write `{{taskSummaryPath}}`
-20. Call `gsd_complete_task` with milestoneId, sliceId, taskId, and a summary of what was accomplished. This is your final required step — do NOT manually edit PLAN.md checkboxes. The tool marks the task complete, updates the DB, and renders PLAN.md automatically.
+19. Use that template to prepare the completion content you will pass to `gsd_complete_task` using the camelCase fields `milestoneId`, `sliceId`, `taskId`, `oneLiner`, `narrative`, `verification`, and `verificationEvidence`. Do **not** manually write `{{taskSummaryPath}}` — the DB-backed tool is the canonical write path and renders the summary file for you.
+20. Call `gsd_complete_task` with milestoneId, sliceId, taskId, and the completion fields derived from the template. This is your final required step — do NOT manually edit PLAN.md checkboxes. The tool marks the task complete, updates the DB, renders `{{taskSummaryPath}}`, and updates PLAN.md automatically.
 21. Do not run git commands — the system reads your task summary after completion and creates a meaningful commit from it (type inferred from title, message from your one-liner, key files from frontmatter). Write a clear, specific one-liner in the summary — it becomes the commit message.

 All work stays in your working directory: `{{workingDirectory}}`.

 **Autonomous execution:** Do not call `ask_user_questions` or `secure_env_collect`. You are running in auto-mode — there is no human available to answer questions. Make reasonable assumptions and document them in the task summary. If a decision genuinely requires human input, note it in the summary and proceed with the best available option.

-**You MUST call `gsd_complete_task` AND write `{{taskSummaryPath}}` before finishing.**
+**You MUST call `gsd_complete_task` before finishing. Do not manually write `{{taskSummaryPath}}`.**

 When done, say: "Task {{taskId}} complete."

@@ -40,9 +40,9 @@ After all reviewers complete, aggregate their verdicts:
 - If any reviewer says NEEDS-ATTENTION → overall verdict: `needs-attention`
 - If any reviewer says FAIL → overall verdict: `needs-remediation`

-### Step 3 — Write VALIDATION File
+### Step 3 — Persist Validation

-Write to `{{validationPath}}`:
+Prepare the validation content you will pass to `gsd_validate_milestone`. Do **not** manually write `{{validationPath}}` — the DB-backed tool is the canonical write path and renders the validation file for you.

 ```markdown
 ---
@@ -69,13 +69,15 @@ reviewers: 3
 <if verdict is not pass: specific actions required>
 ```

+Call `gsd_validate_milestone` with the camelCase fields `milestoneId`, `verdict`, `remediationRound`, `successCriteriaChecklist`, `sliceDeliveryAudit`, `crossSliceIntegration`, `requirementCoverage`, `verdictRationale`, and `remediationPlan` when needed. If you include verification-class analysis, pass it in `verificationClasses`.
+
 **DB access safety:** Do NOT query `.gsd/gsd.db` directly via `sqlite3` or `node -e require('better-sqlite3')` — the engine owns the WAL connection. Use `gsd_milestone_status` to read milestone and slice state. All data you need is already inlined in the context above or accessible via the `gsd_*` tools. Direct DB access corrupts the WAL and bypasses tool-level validation.

 If verdict is `needs-remediation`:
-- Add new slices to `{{roadmapPath}}` with unchecked `[ ]` status
-- These slices will be planned and executed before validation re-runs
+- Use `gsd_reassess_roadmap` to add the remediation slices instead of editing `{{roadmapPath}}` manually
+- Those slices will be planned and executed before validation re-runs

-**You MUST write `{{validationPath}}` before finishing.**
+**You MUST call `gsd_validate_milestone` before finishing. Do not manually write `{{validationPath}}`.**

 **File system safety:** When scanning milestone directories for evidence, use `ls` or `find` to list directory contents first — never pass a directory path (e.g. `tasks/`, `slices/`) directly to the `read` tool. The `read` tool only accepts file paths, not directories.

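Reviewer note: the validate-milestone hunks above quote the rule for combining reviewer verdicts into the `verdict` field passed to `gsd_validate_milestone`. A minimal sketch of that rule follows. Two points are assumptions, because the visible context lines do not state them: that `FAIL` outranks `NEEDS-ATTENTION` when both appear, and that the result is `pass` when no reviewer flags anything. The `PASS` reviewer label and the `aggregateVerdicts` name are also hypothetical.

```typescript
// Reviewer labels NEEDS-ATTENTION and FAIL are quoted from the prompt;
// "PASS" is an assumed label for a reviewer with no findings.
type ReviewerVerdict = "PASS" | "NEEDS-ATTENTION" | "FAIL";
type OverallVerdict = "pass" | "needs-attention" | "needs-remediation";

function aggregateVerdicts(verdicts: ReviewerVerdict[]): OverallVerdict {
  // Assumption: FAIL takes precedence over NEEDS-ATTENTION.
  if (verdicts.some((v) => v === "FAIL")) {
    return "needs-remediation";
  }
  if (verdicts.some((v) => v === "NEEDS-ATTENTION")) {
    return "needs-attention";
  }
  // Assumption: nothing flagged (including an empty list) means pass.
  return "pass";
}

console.log(aggregateVerdicts(["PASS", "NEEDS-ATTENTION", "PASS"]));
```

Under these assumptions, only the `needs-remediation` outcome would lead to the `gsd_reassess_roadmap` step described in the hunk.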
@@ -71,11 +71,13 @@ test("execute-task prompt references gsd_complete_task tool", () => {
   assert.match(prompt, /gsd_complete_task/);
 });

-test("execute-task prompt instructs writing task summary before tool call", () => {
+test("execute-task prompt uses gsd_complete_task as canonical summary write path", () => {
   const prompt = readPrompt("execute-task");
-  // The prompt instructs writing the summary file AND calling the tool
   assert.match(prompt, /\{\{taskSummaryPath\}\}/);
   assert.match(prompt, /gsd_complete_task/);
+  assert.match(prompt, /DB-backed tool is the canonical write path/i);
+  assert.match(prompt, /Do \*\*not\*\* manually write `?\{\{taskSummaryPath\}\}`?/i);
+  assert.doesNotMatch(prompt, /^\d+\.\s+Write `?\{\{taskSummaryPath\}\}`?\s*$/m);
 });

 test("execute-task prompt does not instruct LLM to toggle checkboxes manually", () => {
@@ -119,10 +121,14 @@ test("guided-complete-slice prompt references gsd_slice_complete tool", () => {

 test("complete-slice prompt instructs writing summary and UAT files before tool call", () => {
   const prompt = readPrompt("complete-slice");
-  // The prompt instructs writing the summary AND UAT files, then calling the tool
   assert.match(prompt, /\{\{sliceSummaryPath\}\}/);
   assert.match(prompt, /\{\{sliceUatPath\}\}/);
   assert.match(prompt, /gsd_complete_slice/);
+  assert.match(prompt, /DB-backed tool is the canonical write path/i);
+  assert.match(prompt, /Do \*\*not\*\* manually write `?\{\{sliceSummaryPath\}\}`?/i);
+  assert.match(prompt, /Do \*\*not\*\* manually write `?\{\{sliceUatPath\}\}`?/i);
+  assert.doesNotMatch(prompt, /^\d+\.\s+Write `?\{\{sliceSummaryPath\}\}`?.*$/m);
+  assert.doesNotMatch(prompt, /^\d+\.\s+Write `?\{\{sliceUatPath\}\}`?.*$/m);
 });

 test("complete-slice prompt preserves decisions and knowledge review steps", () => {
@@ -131,6 +137,15 @@ test("complete-slice prompt preserves decisions and knowledge review steps", ()
   assert.match(prompt, /KNOWLEDGE\.md/);
 });

+test("validate-milestone prompt uses gsd_validate_milestone as canonical validation write path", () => {
+  const prompt = readPrompt("validate-milestone");
+  assert.match(prompt, /gsd_validate_milestone/);
+  assert.match(prompt, /\{\{validationPath\}\}/);
+  assert.match(prompt, /DB-backed tool is the canonical write path/i);
+  assert.match(prompt, /Do \*\*not\*\* manually write `?\{\{validationPath\}\}`?/i);
+  assert.doesNotMatch(prompt, /Write to `?\{\{validationPath\}\}`?:/i);
+});
+
 test("complete-slice prompt still contains template variables for context", () => {
   const prompt = readPrompt("complete-slice");
   assert.match(prompt, /\{\{sliceSummaryPath\}\}/);
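Reviewer note: the new assertions pair a positive match on the prohibition sentence with a multiline negative match on the old numbered "Write" step. The sketch below runs the same two regular expressions from the execute-task test against small stand-in prompts to show why the old wording fails and the new wording passes. The prompt strings are abbreviated stand-ins rather than the real prompt files, and `checkPrompt` is a hypothetical helper.

```typescript
// Both patterns are copied from the execute-task test in the diff.
const bareWriteStep = /^\d+\.\s+Write `?\{\{taskSummaryPath\}\}`?\s*$/m;
const prohibition = /Do \*\*not\*\* manually write `?\{\{taskSummaryPath\}\}`?/i;

// Hypothetical helper reporting what each pattern finds in a prompt.
function checkPrompt(prompt: string): { hasProhibition: boolean; hasBareWriteStep: boolean } {
  return {
    hasProhibition: prohibition.test(prompt),
    hasBareWriteStep: bareWriteStep.test(prompt),
  };
}

// Abbreviated stand-ins for the old and new step 19 wording.
const oldPrompt = [
  "18. Read the template",
  "19. Write `{{taskSummaryPath}}`",
  "20. Call `gsd_complete_task`",
].join("\n");

const newPrompt = [
  "18. Read the template",
  "19. Do **not** manually write `{{taskSummaryPath}}`.",
  "20. Call `gsd_complete_task`",
].join("\n");

console.log(checkPrompt(oldPrompt)); // { hasProhibition: false, hasBareWriteStep: true }
console.log(checkPrompt(newPrompt)); // { hasProhibition: true, hasBareWriteStep: false }
```

The `m` flag is what lets `^` and `$` anchor to a single numbered line inside the whole prompt. Without it the negative assertion could only match a prompt consisting of that one line, so it would pass trivially.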