Merge pull request #4019 from jeremymcs/fix/4018-anti-fabrication-guardrails

fix(gsd): enforce anti-fabrication turn-taking in discuss prompts
2026-04-12 00:29:47 -05:00 · 2026-04-12 00:29:47 -05:00 · 8ffec7fad0
commit 8ffec7fad0
parent c900e1004a d5e4938320
8 changed files with 32 additions and 3 deletions
--- a/src/resources/GSD-WORKFLOW.md
+++ b/src/resources/GSD-WORKFLOW.md
@ -275,7 +275,7 @@ Work flows through these phases. Each phase produces a file.
 **How to do it manually:**
 1. Read the roadmap to understand the scope.
 2. Identify 3-5 gray areas — implementation decisions the user cares about.
-3. Use `ask_user_questions` to discuss each area.
+3. Use `ask_user_questions` to discuss each area, one round at a time. Never fabricate user input; wait for the user's actual response before the next round.
 4. Write decisions to the appropriate context file (`M###-CONTEXT.md` or `S##-CONTEXT.md`).
 5. Do NOT discuss how to implement — only what the user wants.

--- a/src/resources/extensions/gsd/prompts/discuss.md
+++ b/src/resources/extensions/gsd/prompts/discuss.md
@ -73,6 +73,8 @@ After each round of answers, decide whether you already have enough depth to wri

 You are a thinking partner, not an interviewer.

+**Turn-taking contract (non-bypassable).** Never fabricate, simulate, or role-play user responses. Never generate fake transcript markers like `[User]`, `[Human]`, or `User:` to invent input. Ask one question round (1-3 questions) per turn, then stop and wait for the user's actual response before continuing. If you use `ask_user_questions`, call it at most once per turn and treat its returned response as the only valid structured user input for that round.
+
 **Start open, follow energy.** Let the user's enthusiasm guide where you dig deeper. If they light up about a particular aspect, explore it. If they're vague about something, that's where you probe.

 **Challenge vagueness, make abstract concrete.** When the user says something abstract ("it should be smart" / "it needs to handle edge cases" / "good UX"), push for specifics. What does "smart" mean in practice? Which edge cases? What does good UX look like for this specific interaction?
--- a/src/resources/extensions/gsd/prompts/guided-discuss-milestone.md
+++ b/src/resources/extensions/gsd/prompts/guided-discuss-milestone.md
@ -32,6 +32,8 @@ Ask **1–3 questions per round**. Keep each question focused on one of:
 - **The biggest technical unknowns / risks** — what could fail, what hasn't been proven
 - **What external systems/services this touches** — APIs, databases, third-party services

+**Never fabricate or simulate user input.** Never generate fake transcript markers like `[User]`, `[Human]`, or `User:`. Ask one question round, then wait for the user's actual response before continuing.
+
 **If `{{structuredQuestionsAvailable}}` is `true`:** use `ask_user_questions` for each round. 1–3 questions per call, each as a separate question object. Keep option labels short (3–5 words). Always include a freeform "Other / let me explain" option. When the user picks that option or writes a long freeform answer, switch to plain text follow-up for that thread before resuming structured questions. **IMPORTANT: Call `ask_user_questions` exactly once per turn. Never make multiple calls with the same or overlapping questions — wait for the user's response before asking the next round.**

 **If `{{structuredQuestionsAvailable}}` is `false`:** ask questions in plain text. Keep each round to 1–3 focused questions. Wait for answers before asking the next round.
--- a/src/resources/extensions/gsd/prompts/guided-discuss-slice.md
+++ b/src/resources/extensions/gsd/prompts/guided-discuss-slice.md
@ -22,6 +22,8 @@ Do **not** go deep — just enough that your questions reflect what's actually t

 ### Question rounds

+**Never fabricate or simulate user input.** Never generate fake transcript markers like `[User]`, `[Human]`, or `User:`. Ask one question round, then wait for the user's actual response before continuing.
+
 **If `{{structuredQuestionsAvailable}}` is `true`:** Ask **1–3 questions per round** using `ask_user_questions`. **Call `ask_user_questions` exactly once per turn — never make multiple calls with the same or overlapping questions. Wait for the user's response before asking the next round.**
 **If `{{structuredQuestionsAvailable}}` is `false`:** Ask **1–3 questions per round** in plain text. Number them and wait for the user's response before asking the next round.
 Keep each question focused on one of:
--- a/src/resources/extensions/gsd/prompts/queue.md
+++ b/src/resources/extensions/gsd/prompts/queue.md
@ -18,6 +18,7 @@ Say exactly: "What do you want to add?" — nothing else. Wait for the user's an
 ## Discussion Phase

 After they describe it, your job is to understand the new work deeply enough to create context files that a future planning session can use.
+Never fabricate or simulate user input during this discussion. Never generate fake transcript markers like `[User]`, `[Human]`, or `User:`. Ask one question round, then wait for the user's actual response before continuing.

 **If the user provides a file path or pastes a large document** (spec, design doc, product plan, chat export), read it fully before asking questions. Use it as the starting point — don't ask them to re-explain what's already in the document. Your questions should fill gaps and resolve ambiguities the document doesn't cover.

@ -36,11 +37,11 @@ Don't go deep — just enough that your next question reflects what's actually t
 - How the new work relates to existing milestones — overlap, dependencies, prerequisites
 - If `.gsd/REQUIREMENTS.md` exists: which unmet Active or Deferred requirements this queued work advances

-**Then use ask_user_questions** to dig into gray areas — scope boundaries, proof expectations, integration choices, tech preferences when they materially matter, and what's in vs out. 1-3 questions per round.
+**Then use ask_user_questions** to dig into gray areas — scope boundaries, proof expectations, integration choices, tech preferences when they materially matter, and what's in vs out. Ask 1-3 questions per round, then wait for the user's response before asking the next round.

 If a `GSD Skill Preferences` block is present in system context, use it to decide which skills to load and follow during discuss/planning work, but do not let it override the required discuss flow or artifact requirements.

-**Self-regulate:** Do **not** ask a meta "ready to queue?" question after every round. Keep going until you have enough depth to write the context well, then use a single wrap-up prompt if needed. If the user clearly keeps adding detail instead of objecting, treat that as permission to continue.
+**Self-regulate:** Do **not** ask a meta "ready to queue?" question after every round. Keep going until you have enough depth to write the context well, then use a single wrap-up prompt if needed. Do not infer permission to continue from silence or from partial prior answers — each new round requires an actual user response.

 ## Existing Milestone Awareness

--- a/src/resources/extensions/gsd/prompts/system.md
+++ b/src/resources/extensions/gsd/prompts/system.md
@ -35,6 +35,7 @@ GSD ships with bundled skills. Load the relevant skill file with the `read` tool
 - Read before edit.
 - Reproduce before fix when possible.
 - Work is not done until the relevant verification has passed.
+- **Never fabricate, simulate, or role-play user responses.** Never generate markers like `[User]`, `[Human]`, `User:`, or similar to represent user input inside your own output. Ask one question round (1-3 questions), then stop and wait for the user's actual response before continuing. If `ask_user_questions` is available, treat its returned response as the only valid structured user input for that round.
 - Never print, echo, log, or restate secrets or credentials. Report only key names and applied/skipped status.
 - Never ask the user to edit `.env` files or set secrets manually. Use `secure_env_collect`.
 - In enduring files, write current state only unless the file is explicitly historical.
--- a/src/resources/extensions/gsd/tests/prompt-contracts.test.ts
+++ b/src/resources/extensions/gsd/tests/prompt-contracts.test.ts
@ -42,9 +42,19 @@ test("system prompt references CODEBASE.md and /gsd codebase", () => {
  assert.match(prompt, /auto-refreshes it when tracked files change/i);
 });

+test("system prompt hard rules forbid fabricating user responses", () => {
+  const prompt = readPrompt("system");
+  assert.match(prompt, /never fabricate, simulate, or role-play user responses/i);
+  assert.match(prompt, /never generate markers like `?\[User\]`?, `?\[Human\]`?, `?User:`?/i);
+  assert.match(prompt, /ask one question round \(1-3 questions\), then stop and wait for the user's actual response/i);
+  assert.match(prompt, /ask_user_questions.*only valid structured user input/i);
+});
+
 test("discuss prompt allows implementation questions when they materially matter", () => {
  const prompt = readPrompt("discuss");
  assert.match(prompt, /Lead with experience, but ask implementation when it materially matters/i);
+  assert.match(prompt, /Never fabricate, simulate, or role-play user responses/i);
+  assert.match(prompt, /Ask one question round \(1-3 questions\) per turn, then stop and wait for the user's actual response/i);
  assert.match(prompt, /one gate, not two/i);
  assert.doesNotMatch(prompt, /Questions must be about the experience, not the implementation/i);
 });
@ -56,6 +66,8 @@ test("guided discussion prompts avoid wrap-up prompts after every round", () =>
  assert.match(slicePrompt, /Do \*\*not\*\* ask a meta "ready to wrap up\?" question after every round/i);
  assert.doesNotMatch(milestonePrompt, /I think I have a solid picture of this milestone\. Ready to wrap up/i);
  assert.doesNotMatch(slicePrompt, /I think I have a solid picture of this slice\. Ready to wrap up/i);
+  assert.match(milestonePrompt, /Never fabricate or simulate user input/i);
+  assert.match(slicePrompt, /Never fabricate or simulate user input/i);
 });

 test("guided milestone discussion scopes depth verification to the milestone id", () => {
@ -64,6 +76,13 @@ test("guided milestone discussion scopes depth verification to the milestone id"
  assert.doesNotMatch(prompt, /depth_verification_confirm" — this enables the write-gate downstream/i, "legacy global depth gate wording should be gone");
 });

+test("queue prompt requires waiting for user response between rounds", () => {
+  const prompt = readPrompt("queue");
+  assert.match(prompt, /Never fabricate or simulate user input during this discussion/i);
+  assert.match(prompt, /Ask 1-3 questions per round, then wait for the user's response before asking the next round\./i);
+  assert.doesNotMatch(prompt, /treat that as permission to continue/i);
+});
+
 test("guided-resume-task prompt preserves recovery state until work is superseded", () => {
  const prompt = readPrompt("guided-resume-task");
  assert.match(prompt, /Do \*\*not\*\* delete the continue file immediately/i);
--- a/src/resources/skills/create-skill/SKILL.md
+++ b/src/resources/skills/create-skill/SKILL.md
@ -78,6 +78,8 @@ Based on the user's message, route directly to the appropriate workflow:
 **If user intent is unclear, ask minimal clarifying questions:**
 - "Create a MIDI skill" → "Task-execution skill (does MIDI tasks) or domain expertise (complete MIDI knowledge base)?"
 - "Work on my skill" → "Which skill? What do you want to do with it?"
+- Ask one clarifying question round at a time, then wait for the user's actual response before asking another.
+- Never fabricate or simulate user responses while clarifying (for example, fake `[User]` markers or imagined answers).

 Then proceed directly to the workflow.
 </routing>