Merge pull request #4019 from jeremymcs/fix/4018-anti-fabrication-guardrails
fix(gsd): enforce anti-fabrication turn-taking in discuss prompts
This commit is contained in:
commit
8ffec7fad0
8 changed files with 32 additions and 3 deletions
|
|
@ -275,7 +275,7 @@ Work flows through these phases. Each phase produces a file.
|
|||
**How to do it manually:**
|
||||
1. Read the roadmap to understand the scope.
|
||||
2. Identify 3-5 gray areas — implementation decisions the user cares about.
|
||||
3. Use `ask_user_questions` to discuss each area.
|
||||
3. Use `ask_user_questions` to discuss each area, one round at a time. Never fabricate user input; wait for the user's actual response before the next round.
|
||||
4. Write decisions to the appropriate context file (`M###-CONTEXT.md` or `S##-CONTEXT.md`).
|
||||
5. Do NOT discuss how to implement — only what the user wants.
|
||||
|
||||
|
|
|
|||
|
|
@ -73,6 +73,8 @@ After each round of answers, decide whether you already have enough depth to wri
|
|||
|
||||
You are a thinking partner, not an interviewer.
|
||||
|
||||
**Turn-taking contract (non-bypassable).** Never fabricate, simulate, or role-play user responses. Never generate fake transcript markers like `[User]`, `[Human]`, or `User:` to invent input. Ask one question round (1-3 questions) per turn, then stop and wait for the user's actual response before continuing. If you use `ask_user_questions`, call it at most once per turn and treat its returned response as the only valid structured user input for that round.
|
||||
|
||||
**Start open, follow energy.** Let the user's enthusiasm guide where you dig deeper. If they light up about a particular aspect, explore it. If they're vague about something, that's where you probe.
|
||||
|
||||
**Challenge vagueness, make abstract concrete.** When the user says something abstract ("it should be smart" / "it needs to handle edge cases" / "good UX"), push for specifics. What does "smart" mean in practice? Which edge cases? What does good UX look like for this specific interaction?
|
||||
|
|
|
|||
|
|
@ -32,6 +32,8 @@ Ask **1–3 questions per round**. Keep each question focused on one of:
|
|||
- **The biggest technical unknowns / risks** — what could fail, what hasn't been proven
|
||||
- **What external systems/services this touches** — APIs, databases, third-party services
|
||||
|
||||
**Never fabricate or simulate user input.** Never generate fake transcript markers like `[User]`, `[Human]`, or `User:`. Ask one question round, then wait for the user's actual response before continuing.
|
||||
|
||||
**If `{{structuredQuestionsAvailable}}` is `true`:** use `ask_user_questions` for each round. 1–3 questions per call, each as a separate question object. Keep option labels short (3–5 words). Always include a freeform "Other / let me explain" option. When the user picks that option or writes a long freeform answer, switch to plain text follow-up for that thread before resuming structured questions. **IMPORTANT: Call `ask_user_questions` exactly once per turn. Never make multiple calls with the same or overlapping questions — wait for the user's response before asking the next round.**
|
||||
|
||||
**If `{{structuredQuestionsAvailable}}` is `false`:** ask questions in plain text. Keep each round to 1–3 focused questions. Wait for answers before asking the next round.
|
||||
|
|
|
|||
|
|
@ -22,6 +22,8 @@ Do **not** go deep — just enough that your questions reflect what's actually t
|
|||
|
||||
### Question rounds
|
||||
|
||||
**Never fabricate or simulate user input.** Never generate fake transcript markers like `[User]`, `[Human]`, or `User:`. Ask one question round, then wait for the user's actual response before continuing.
|
||||
|
||||
**If `{{structuredQuestionsAvailable}}` is `true`:** Ask **1–3 questions per round** using `ask_user_questions`. **Call `ask_user_questions` exactly once per turn — never make multiple calls with the same or overlapping questions. Wait for the user's response before asking the next round.**
|
||||
**If `{{structuredQuestionsAvailable}}` is `false`:** Ask **1–3 questions per round** in plain text. Number them and wait for the user's response before asking the next round.
|
||||
Keep each question focused on one of:
|
||||
|
|
|
|||
|
|
@ -18,6 +18,7 @@ Say exactly: "What do you want to add?" — nothing else. Wait for the user's an
|
|||
## Discussion Phase
|
||||
|
||||
After they describe it, your job is to understand the new work deeply enough to create context files that a future planning session can use.
|
||||
Never fabricate or simulate user input during this discussion. Never generate fake transcript markers like `[User]`, `[Human]`, or `User:`. Ask one question round, then wait for the user's actual response before continuing.
|
||||
|
||||
**If the user provides a file path or pastes a large document** (spec, design doc, product plan, chat export), read it fully before asking questions. Use it as the starting point — don't ask them to re-explain what's already in the document. Your questions should fill gaps and resolve ambiguities the document doesn't cover.
|
||||
|
||||
|
|
@ -36,11 +37,11 @@ Don't go deep — just enough that your next question reflects what's actually t
|
|||
- How the new work relates to existing milestones — overlap, dependencies, prerequisites
|
||||
- If `.gsd/REQUIREMENTS.md` exists: which unmet Active or Deferred requirements this queued work advances
|
||||
|
||||
**Then use ask_user_questions** to dig into gray areas — scope boundaries, proof expectations, integration choices, tech preferences when they materially matter, and what's in vs out. 1-3 questions per round.
|
||||
**Then use ask_user_questions** to dig into gray areas — scope boundaries, proof expectations, integration choices, tech preferences when they materially matter, and what's in vs out. Ask 1-3 questions per round, then wait for the user's response before asking the next round.
|
||||
|
||||
If a `GSD Skill Preferences` block is present in system context, use it to decide which skills to load and follow during discuss/planning work, but do not let it override the required discuss flow or artifact requirements.
|
||||
|
||||
**Self-regulate:** Do **not** ask a meta "ready to queue?" question after every round. Keep going until you have enough depth to write the context well, then use a single wrap-up prompt if needed. If the user clearly keeps adding detail instead of objecting, treat that as permission to continue.
|
||||
**Self-regulate:** Do **not** ask a meta "ready to queue?" question after every round. Keep going until you have enough depth to write the context well, then use a single wrap-up prompt if needed. Do not infer permission to continue from silence or from partial prior answers — each new round requires an actual user response.
|
||||
|
||||
## Existing Milestone Awareness
|
||||
|
||||
|
|
|
|||
|
|
@ -35,6 +35,7 @@ GSD ships with bundled skills. Load the relevant skill file with the `read` tool
|
|||
- Read before edit.
|
||||
- Reproduce before fix when possible.
|
||||
- Work is not done until the relevant verification has passed.
|
||||
- **Never fabricate, simulate, or role-play user responses.** Never generate markers like `[User]`, `[Human]`, `User:`, or similar to represent user input inside your own output. Ask one question round (1-3 questions), then stop and wait for the user's actual response before continuing. If `ask_user_questions` is available, treat its returned response as the only valid structured user input for that round.
|
||||
- Never print, echo, log, or restate secrets or credentials. Report only key names and applied/skipped status.
|
||||
- Never ask the user to edit `.env` files or set secrets manually. Use `secure_env_collect`.
|
||||
- In enduring files, write current state only unless the file is explicitly historical.
|
||||
|
|
|
|||
|
|
@ -42,9 +42,19 @@ test("system prompt references CODEBASE.md and /gsd codebase", () => {
|
|||
assert.match(prompt, /auto-refreshes it when tracked files change/i);
|
||||
});
|
||||
|
||||
test("system prompt hard rules forbid fabricating user responses", () => {
|
||||
const prompt = readPrompt("system");
|
||||
assert.match(prompt, /never fabricate, simulate, or role-play user responses/i);
|
||||
assert.match(prompt, /never generate markers like `?\[User\]`?, `?\[Human\]`?, `?User:`?/i);
|
||||
assert.match(prompt, /ask one question round \(1-3 questions\), then stop and wait for the user's actual response/i);
|
||||
assert.match(prompt, /ask_user_questions.*only valid structured user input/i);
|
||||
});
|
||||
|
||||
test("discuss prompt allows implementation questions when they materially matter", () => {
|
||||
const prompt = readPrompt("discuss");
|
||||
assert.match(prompt, /Lead with experience, but ask implementation when it materially matters/i);
|
||||
assert.match(prompt, /Never fabricate, simulate, or role-play user responses/i);
|
||||
assert.match(prompt, /Ask one question round \(1-3 questions\) per turn, then stop and wait for the user's actual response/i);
|
||||
assert.match(prompt, /one gate, not two/i);
|
||||
assert.doesNotMatch(prompt, /Questions must be about the experience, not the implementation/i);
|
||||
});
|
||||
|
|
@ -56,6 +66,8 @@ test("guided discussion prompts avoid wrap-up prompts after every round", () =>
|
|||
assert.match(slicePrompt, /Do \*\*not\*\* ask a meta "ready to wrap up\?" question after every round/i);
|
||||
assert.doesNotMatch(milestonePrompt, /I think I have a solid picture of this milestone\. Ready to wrap up/i);
|
||||
assert.doesNotMatch(slicePrompt, /I think I have a solid picture of this slice\. Ready to wrap up/i);
|
||||
assert.match(milestonePrompt, /Never fabricate or simulate user input/i);
|
||||
assert.match(slicePrompt, /Never fabricate or simulate user input/i);
|
||||
});
|
||||
|
||||
test("guided milestone discussion scopes depth verification to the milestone id", () => {
|
||||
|
|
@ -64,6 +76,13 @@ test("guided milestone discussion scopes depth verification to the milestone id"
|
|||
assert.doesNotMatch(prompt, /depth_verification_confirm" — this enables the write-gate downstream/i, "legacy global depth gate wording should be gone");
|
||||
});
|
||||
|
||||
test("queue prompt requires waiting for user response between rounds", () => {
|
||||
const prompt = readPrompt("queue");
|
||||
assert.match(prompt, /Never fabricate or simulate user input during this discussion/i);
|
||||
assert.match(prompt, /Ask 1-3 questions per round, then wait for the user's response before asking the next round\./i);
|
||||
assert.doesNotMatch(prompt, /treat that as permission to continue/i);
|
||||
});
|
||||
|
||||
test("guided-resume-task prompt preserves recovery state until work is superseded", () => {
|
||||
const prompt = readPrompt("guided-resume-task");
|
||||
assert.match(prompt, /Do \*\*not\*\* delete the continue file immediately/i);
|
||||
|
|
|
|||
|
|
@ -78,6 +78,8 @@ Based on the user's message, route directly to the appropriate workflow:
|
|||
**If user intent is unclear, ask minimal clarifying questions:**
|
||||
- "Create a MIDI skill" → "Task-execution skill (does MIDI tasks) or domain expertise (complete MIDI knowledge base)?"
|
||||
- "Work on my skill" → "Which skill? What do you want to do with it?"
|
||||
- Ask one clarifying question round at a time, then wait for the user's actual response before asking another.
|
||||
- Never fabricate or simulate user responses while clarifying (for example, fake `[User]` markers or imagined answers).
|
||||
|
||||
Then proceed directly to the workflow.
|
||||
</routing>
|
||||
|
|
|
|||
Loading…
Add table
Reference in a new issue