fix(gsd): enforce anti-fabrication turn-taking in discuss prompts

Jeremy 2026-04-12 00:04:08 -05:00
parent 804f1d4b94
commit 5aa1fe0c0c
9 changed files with 42 additions and 6 deletions

@ -275,7 +275,7 @@ Work flows through these phases. Each phase produces a file.
**How to do it manually:**
1. Read the roadmap to understand the scope.
2. Identify 3-5 gray areas — implementation decisions the user cares about.
3. Use `ask_user_questions` to discuss each area.
3. Use `ask_user_questions` to discuss each area, one round at a time. Never fabricate user input; wait for the user's actual response before the next round.
4. Write decisions to the appropriate context file (`M###-CONTEXT.md` or `S##-CONTEXT.md`).
5. Do NOT discuss how to implement — only what the user wants.

@ -29,6 +29,7 @@ This discussion proceeds through four mandatory layers. At each layer:
4. **Gate** — use `ask_user_questions` to get explicit sign-off before advancing
**Do NOT skip layers.** Each layer builds on the previous. The user must explicitly approve each layer before you proceed.
Never fabricate or simulate user input while moving through layers. Never generate fake transcript markers like `[User]`, `[Human]`, or `User:`. Ask one question round, then wait for the user's actual response before continuing.
---
@ -79,7 +80,7 @@ Include:
### Resolve Scope — Mandatory Rounds
After presenting your recommendation, you MUST complete these rounds in order. Each round uses `ask_user_questions` or direct questions. Do NOT skip rounds. Do NOT combine rounds. Do NOT jump to the Layer 1 Gate until all rounds are complete.
After presenting your recommendation, you MUST complete these rounds in order. Each round uses `ask_user_questions` or direct questions. Do NOT skip rounds. Do NOT combine rounds. Do NOT jump to the Layer 1 Gate until all rounds are complete. **Each round is multi-turn: run one round, then wait for the user's response before starting the next round.**
**Complexity calibration:** If the milestone is simple (1-2 slices, well-understood patterns, no ambiguity), you may compress rounds — but you must still explicitly address each round's topic, even if briefly. You may NOT skip rounds entirely. For complex milestones (3+ slices, novel architecture, significant ambiguity), give each round full treatment.
@ -150,7 +151,7 @@ Cover:
### Resolve Architecture — Mandatory Rounds
After presenting your recommendation, you MUST complete these rounds in order. Do NOT skip rounds. Do NOT jump to the Layer 2 Gate until all rounds are complete.
After presenting your recommendation, you MUST complete these rounds in order. Do NOT skip rounds. Do NOT jump to the Layer 2 Gate until all rounds are complete. **Each round is multi-turn: run one round, then wait for the user's response before starting the next round.**
**Complexity calibration:** If the milestone is simple (1-2 slices, well-understood patterns, no ambiguity), you may compress rounds — but you must still explicitly address each round's topic, even if briefly. You may NOT skip rounds entirely. For complex milestones (3+ slices, novel architecture, significant ambiguity), give each round full treatment.
@ -270,7 +271,7 @@ Include:
### Resolve Quality — Mandatory Rounds
After presenting your recommendation, you MUST complete these rounds in order. Do NOT skip rounds.
After presenting your recommendation, you MUST complete these rounds in order. Do NOT skip rounds. **Each round is multi-turn: run one round, then wait for the user's response before starting the next round.**
**Complexity calibration:** If the milestone is simple, you may compress rounds — but you must still explicitly address each round's topic, even if briefly. You may NOT skip rounds entirely.

@ -51,6 +51,8 @@ For subsequent rounds, continue investigating between rounds — check docs, sea
You are a thinking partner, not an interviewer.
**Turn-taking contract (non-bypassable).** Never fabricate, simulate, or role-play user responses. Never generate fake transcript markers like `[User]`, `[Human]`, or `User:` to invent input. Ask one question round (1-3 questions) per turn, then stop and wait for the user's actual response before continuing. If you use `ask_user_questions`, call it at most once per turn and treat its returned response as the only valid structured user input for that round.
**Start open, follow energy.** Let the user's enthusiasm guide where you dig deeper. If they light up about a particular aspect, explore it. If they're vague about something, that's where you probe.
**Challenge vagueness, make abstract concrete.** When the user says something abstract ("it should be smart" / "it needs to handle edge cases" / "good UX"), push for specifics. What does "smart" mean in practice? Which edge cases? What does good UX look like for this specific interaction?
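The turn-taking contract above bans fabricated transcript markers. One way a harness could check model output for them is sketched below. This is only an illustration under stated assumptions: `findFabricatedTurns` and `FAKE_TURN_MARKER` are hypothetical names that do not appear in this commit, and treating `Human:` as a marker is an extrapolation from the listed examples.

```javascript
// Hypothetical helper, not part of this commit.
// Flags lines that open with a speaker label such as "[User]", "[Human]",
// or "User:". "Human:" is included as an assumed "similar" marker.
const FAKE_TURN_MARKER = /^\s*(?:\[(?:user|human)\]|(?:user|human)\s*:)/i;

function findFabricatedTurns(output) {
  return output
    .split(/\r?\n/)
    .map((text, index) => ({ line: index + 1, text }))
    .filter(({ text }) => FAKE_TURN_MARKER.test(text));
}

const draft = [
  "Which auth provider do you prefer?",
  "[User] Let's go with OAuth.",
  "User: and keep sessions short.",
  "Great, moving on to storage.",
].join("\n");

// Lines 2 and 3 start with a speaker label, so they are reported.
console.log(findFabricatedTurns(draft).map((hit) => hit.line));
```

The pattern is anchored to the start of a line, so ordinary prose that merely mentions a user (for example "the user: field is required" mid-sentence) is not flagged. A stricter or looser rule may suit a real harness better.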

@ -32,6 +32,8 @@ Ask **1-3 questions per round**. Keep each question focused on one of:
- **The biggest technical unknowns / risks** — what could fail, what hasn't been proven
- **What external systems/services this touches** — APIs, databases, third-party services
**Never fabricate or simulate user input.** Never generate fake transcript markers like `[User]`, `[Human]`, or `User:`. Ask one question round, then wait for the user's actual response before continuing.
**If `{{structuredQuestionsAvailable}}` is `true`:** use `ask_user_questions` for each round. 1-3 questions per call, each as a separate question object. Keep option labels short (3-5 words). Always include a freeform "Other / let me explain" option. When the user picks that option or writes a long freeform answer, switch to plain text follow-up for that thread before resuming structured questions. **IMPORTANT: Call `ask_user_questions` exactly once per turn. Never make multiple calls with the same or overlapping questions — wait for the user's response before asking the next round.**
**If `{{structuredQuestionsAvailable}}` is `false`:** ask questions in plain text. Keep each round to 1-3 focused questions. Wait for answers before asking the next round.
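The structured-question rules above could be checked before a round is sent. The sketch below is a rough illustration, not the tool's real contract: the payload shape (`{ question, options }`) and the name `checkQuestionRound` are hypothetical, since the `ask_user_questions` schema is not shown in this diff. It assumes one to three questions per round, a five-word cap on option labels, and the freeform option on every question.

```javascript
// Hypothetical payload shape: { question: string, options: string[] }.
// The real ask_user_questions schema is not shown in this diff.
const FREEFORM_LABEL = "Other / let me explain";

function checkQuestionRound(questions) {
  const problems = [];
  if (questions.length < 1 || questions.length > 3) {
    problems.push(`expected 1-3 questions, got ${questions.length}`);
  }
  questions.forEach((q, i) => {
    if (!q.options.includes(FREEFORM_LABEL)) {
      problems.push(`question ${i + 1} lacks the freeform option`);
    }
    for (const label of q.options) {
      if (label === FREEFORM_LABEL) continue;
      // Assumed cap: labels longer than five words count as too long.
      const words = label.trim().split(/\s+/).length;
      if (words > 5) {
        problems.push(`question ${i + 1}: label "${label}" has ${words} words`);
      }
    }
  });
  return problems;
}

const round = [
  {
    question: "Where should sessions live?",
    options: ["Server-side store", "Signed cookies", FREEFORM_LABEL],
  },
  {
    question: "How strict is the uptime target?",
    options: ["Best effort", "Whatever the existing infrastructure team already promises"],
  },
];

// The second question is missing the freeform option and has an overlong label.
console.log(checkQuestionRound(round));
```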

@ -22,6 +22,8 @@ Do **not** go deep — just enough that your questions reflect what's actually t
### Question rounds
**Never fabricate or simulate user input.** Never generate fake transcript markers like `[User]`, `[Human]`, or `User:`. Ask one question round, then wait for the user's actual response before continuing.
**If `{{structuredQuestionsAvailable}}` is `true`:** Ask **1-3 questions per round** using `ask_user_questions`. **Call `ask_user_questions` exactly once per turn — never make multiple calls with the same or overlapping questions. Wait for the user's response before asking the next round.**
**If `{{structuredQuestionsAvailable}}` is `false`:** Ask **1-3 questions per round** in plain text. Number them and wait for the user's response before asking the next round.
Keep each question focused on one of:

@ -18,6 +18,7 @@ Say exactly: "What do you want to add?" — nothing else. Wait for the user's an
## Discussion Phase
After they describe it, your job is to understand the new work deeply enough to create context files that a future planning session can use.
Never fabricate or simulate user input during this discussion. Never generate fake transcript markers like `[User]`, `[Human]`, or `User:`. Ask one question round, then wait for the user's actual response before continuing.
**If the user provides a file path or pastes a large document** (spec, design doc, product plan, chat export), read it fully before asking questions. Use it as the starting point — don't ask them to re-explain what's already in the document. Your questions should fill gaps and resolve ambiguities the document doesn't cover.
@ -36,11 +37,11 @@ Don't go deep — just enough that your next question reflects what's actually t
- How the new work relates to existing milestones — overlap, dependencies, prerequisites
- If `.gsd/REQUIREMENTS.md` exists: which unmet Active or Deferred requirements this queued work advances
**Then use ask_user_questions** to dig into gray areas — scope boundaries, proof expectations, integration choices, tech preferences when they materially matter, and what's in vs out. 1-3 questions per round.
**Then use ask_user_questions** to dig into gray areas — scope boundaries, proof expectations, integration choices, tech preferences when they materially matter, and what's in vs out. Ask 1-3 questions per round, then wait for the user's response before asking the next round.
If a `GSD Skill Preferences` block is present in system context, use it to decide which skills to load and follow during discuss/planning work, but do not let it override the required discuss flow or artifact requirements.
**Self-regulate:** Do **not** ask a meta "ready to queue?" question after every round. Keep going until you have enough depth to write the context well, then use a single wrap-up prompt if needed. If the user clearly keeps adding detail instead of objecting, treat that as permission to continue.
**Self-regulate:** Do **not** ask a meta "ready to queue?" question after every round. Keep going until you have enough depth to write the context well, then use a single wrap-up prompt if needed. Do not infer permission to continue from silence or from partial prior answers — each new round requires an actual user response.
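The round-by-round rule above (no new round until the user has actually answered, and silence is not an answer) can be pictured as a small state gate. This is a minimal sketch under assumptions: `DiscussionTurns` and its methods are hypothetical and do not exist in this commit, and treating an empty or whitespace-only reply as silence is one interpretation of the rule.

```javascript
// Hypothetical state tracker, not part of this commit.
class DiscussionTurns {
  constructor() {
    this.round = 0;
    this.awaitingUser = false;
  }

  // Start a new question round; refuses while the previous round is unanswered.
  askRound(questions) {
    if (this.awaitingUser) {
      throw new Error(`round ${this.round} has no user response yet`);
    }
    if (questions.length < 1 || questions.length > 3) {
      throw new Error("a round carries 1-3 questions");
    }
    this.round += 1;
    this.awaitingUser = true;
    return this.round;
  }

  // Record a reply that actually came from the user.
  // Empty or whitespace-only input is treated as silence (an assumption).
  recordUserResponse(text) {
    if (!this.awaitingUser) throw new Error("no open round to answer");
    if (typeof text !== "string" || text.trim() === "") return false;
    this.awaitingUser = false;
    return true;
  }
}

const turns = new DiscussionTurns();
turns.askRound(["Which milestone does this extend?"]);
turns.recordUserResponse("");                    // silence: the round stays open
turns.recordUserResponse("M003, the sync work"); // real answer: the round closes
turns.askRound(["What is explicitly out of scope?"]);
```

Calling `askRound` a second time before a non-empty response is recorded throws, which mirrors the requirement that each new round needs an actual user response.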
## Existing Milestone Awareness

@ -35,6 +35,7 @@ GSD ships with bundled skills. Load the relevant skill file with the `read` tool
- Read before edit.
- Reproduce before fix when possible.
- Work is not done until the relevant verification has passed.
- **Never fabricate, simulate, or role-play user responses.** Never generate markers like `[User]`, `[Human]`, `User:`, or similar to represent user input inside your own output. Ask one question round (1-3 questions), then stop and wait for the user's actual response before continuing. If `ask_user_questions` is available, treat its returned response as the only valid structured user input for that round.
- Never print, echo, log, or restate secrets or credentials. Report only key names and applied/skipped status.
- Never ask the user to edit `.env` files or set secrets manually. Use `secure_env_collect`.
- In enduring files, write current state only unless the file is explicitly historical.

@ -42,13 +42,29 @@ test("system prompt references CODEBASE.md and /gsd codebase", () => {
assert.match(prompt, /auto-refreshes it when tracked files change/i);
});
test("system prompt hard rules forbid fabricating user responses", () => {
const prompt = readPrompt("system");
assert.match(prompt, /never fabricate, simulate, or role-play user responses/i);
assert.match(prompt, /never generate markers like `?\[User\]`?, `?\[Human\]`?, `?User:`?/i);
assert.match(prompt, /ask one question round \(1-3 questions\), then stop and wait for the user's actual response/i);
assert.match(prompt, /ask_user_questions.*only valid structured user input/i);
});
test("discuss prompt allows implementation questions when they materially matter", () => {
const prompt = readPrompt("discuss");
assert.match(prompt, /Lead with experience, but ask implementation when it materially matters/i);
assert.match(prompt, /Never fabricate, simulate, or role-play user responses/i);
assert.match(prompt, /Ask one question round \(1-3 questions\) per turn, then stop and wait for the user's actual response/i);
assert.match(prompt, /one gate, not two/i);
assert.doesNotMatch(prompt, /Questions must be about the experience, not the implementation/i);
});
test("discuss-prepared prompt enforces round-by-round user turn taking", () => {
const prompt = readPrompt("discuss-prepared");
assert.match(prompt, /Each round is multi-turn: run one round, then wait for the user's response before starting the next round\./i);
assert.match(prompt, /Never fabricate or simulate user input while moving through layers/i);
});
test("guided discussion prompts avoid wrap-up prompts after every round", () => {
const milestonePrompt = readPrompt("guided-discuss-milestone");
const slicePrompt = readPrompt("guided-discuss-slice");
@ -56,6 +72,8 @@ test("guided discussion prompts avoid wrap-up prompts after every round", () =>
assert.match(slicePrompt, /Do \*\*not\*\* ask a meta "ready to wrap up\?" question after every round/i);
assert.doesNotMatch(milestonePrompt, /I think I have a solid picture of this milestone\. Ready to wrap up/i);
assert.doesNotMatch(slicePrompt, /I think I have a solid picture of this slice\. Ready to wrap up/i);
assert.match(milestonePrompt, /Never fabricate or simulate user input/i);
assert.match(slicePrompt, /Never fabricate or simulate user input/i);
});
test("guided milestone discussion scopes depth verification to the milestone id", () => {
@ -64,6 +82,13 @@ test("guided milestone discussion scopes depth verification to the milestone id"
assert.doesNotMatch(prompt, /depth_verification_confirm" — this enables the write-gate downstream/i, "legacy global depth gate wording should be gone");
});
test("queue prompt requires waiting for user response between rounds", () => {
const prompt = readPrompt("queue");
assert.match(prompt, /Never fabricate or simulate user input during this discussion/i);
assert.match(prompt, /Ask 1-3 questions per round, then wait for the user's response before asking the next round\./i);
assert.doesNotMatch(prompt, /treat that as permission to continue/i);
});
test("guided-resume-task prompt preserves recovery state until work is superseded", () => {
const prompt = readPrompt("guided-resume-task");
assert.match(prompt, /Do \*\*not\*\* delete the continue file immediately/i);
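The marker assertion in the system-prompt test above uses an optional backtick around each marker, so the check passes whether or not the prompt wraps the markers in inline code. The standalone sketch below illustrates that behavior with made-up sample strings; only the pattern itself is taken from the test, and the variable names are hypothetical.

```javascript
// Same pattern as the system-prompt test: every backtick is optional.
const markerRule = /never generate markers like `?\[User\]`?, `?\[Human\]`?, `?User:`?/i;

// Sample strings invented for this illustration.
const withBackticks = "Never generate markers like `[User]`, `[Human]`, `User:`, or similar";
const withoutBackticks = "never generate markers like [User], [Human], User:, or similar";
const reordered = "Never generate markers like `[Human]`, `[User]`, `User:`";

console.log(markerRule.test(withBackticks));    // true
console.log(markerRule.test(withoutBackticks)); // true
console.log(markerRule.test(reordered));        // false: the order of markers is fixed
```

The pattern tolerates formatting changes but not rewording or reordering, so an edit that shuffles the marker list in the prompt would fail the test even if the meaning is unchanged.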

@ -78,6 +78,8 @@ Based on the user's message, route directly to the appropriate workflow:
**If user intent is unclear, ask minimal clarifying questions:**
- "Create a MIDI skill" → "Task-execution skill (does MIDI tasks) or domain expertise (complete MIDI knowledge base)?"
- "Work on my skill" → "Which skill? What do you want to do with it?"
- Ask one clarifying question round at a time, then wait for the user's actual response before asking another.
- Never fabricate or simulate user responses while clarifying (for example, fake `[User]` markers or imagined answers).
Then proceed directly to the workflow.
</routing>