diff --git a/.mcp.json b/.mcp.json
deleted file mode 100644
index a8e68079d..000000000
--- a/.mcp.json
+++ /dev/null
@@ -1,14 +0,0 @@
-{
-  "mcpServers": {
-    "repowise": {
-      "command": "repowise",
-      "args": [
-        "mcp",
-        "/Users/jeremymcspadden/Github/gsd-2",
-        "--transport",
-        "stdio"
-      ],
-      "description": "repowise: codebase intelligence \u2014 docs, graph, git signals, dead code, decisions"
-    }
-  }
-}
diff --git a/src/resources/extensions/gsd/prompts/discuss.md b/src/resources/extensions/gsd/prompts/discuss.md
index 743ab8b9c..160d99f91 100644
--- a/src/resources/extensions/gsd/prompts/discuss.md
+++ b/src/resources/extensions/gsd/prompts/discuss.md
@@ -49,25 +49,126 @@ This happens ONCE, before the first round. The goal: your first questions should
 
 For subsequent rounds, continue investigating between rounds — check docs, search, or scout as needed to make each round's questions smarter. But the first-round investigation is mandatory and explicit. Distribute searches across turns rather than clustering them in one turn.
 
-## Question Rounds
+## Layered Question Rounds
 
-Ask **1–3 questions per round**. Keep each round tightly focused on one or two of the depth checklist dimensions — do not try to cover all six in one round.
+Questions are organized into four layers. Each layer targets a specific depth dimension. At each layer: ask 1-3 open questions per round, investigate between rounds as needed, and gate before advancing.
 
-**If `{{structuredQuestionsAvailable}}` is `true`:** use `ask_user_questions` for each round. 1–3 questions per call, each as a separate question object. Keep option labels short (3–5 words). Always include a freeform "Other / let me explain" option. When the user picks that option or writes a long freeform answer, switch to plain text follow-up for that thread before resuming structured questions. **IMPORTANT: Call `ask_user_questions` exactly once per turn. Never make multiple calls with the same or overlapping questions — wait for the user's response before asking the next round.**
+**Default to open questions.** Use `ask_user_questions` only when there are 2-3 genuinely distinct paths with clear tradeoffs (e.g., "REST vs GraphQL" or "Postgres vs SQLite"). For nuanced design questions, ask in plain text and let the user explain.
 
-**If `{{structuredQuestionsAvailable}}` is `false`:** ask questions in plain text. Keep each round to 1–3 focused questions. Wait for answers before asking the next round.
+**If `{{structuredQuestionsAvailable}}` is `true`:** use `ask_user_questions` for binary/ternary choices. Keep option labels short (3-5 words). Always include a freeform "Other / let me explain" option. When the user picks that option or writes a long freeform answer, switch to plain text follow-up for that thread before resuming structured questions. **IMPORTANT: Call `ask_user_questions` exactly once per turn. Never make multiple calls with the same or overlapping questions — wait for the user's response before asking the next round.**
 
-After each answer set, investigate further if any answer opens a new unknown, then ask the next round.
+**If `{{structuredQuestionsAvailable}}` is `false`:** ask questions in plain text. Keep each round to 1-3 focused questions. Wait for answers before asking the next round.
 
-### Round cadence
+**Incremental persistence:** After every 2 question rounds (across any layer), silently save a `{{milestoneId}}-CONTEXT-DRAFT.md` using `gsd_summary_save` with `artifact_type: "CONTEXT-DRAFT"` and `milestone_id: "{{milestoneId}}"`. This protects confirmed work against session crashes. Do NOT mention this save to the user.
 
-After each round of answers, decide whether you already have enough depth to write strong output.
+### Identify Work Type
 
-- **Incremental persistence:** After every 2 question rounds, silently save a `{{milestoneId}}-CONTEXT-DRAFT.md` using `gsd_summary_save` with `artifact_type: "CONTEXT-DRAFT"` and `milestone_id: "{{milestoneId}}"`. This protects confirmed work against session crashes. Do NOT mention this save to the user.
-- If not ready, continue to the next round immediately. Do **not** ask a meta "ready to wrap up?" question after every round.
-- **Depth-matching rule:** Simple, well-defined work needs fewer rounds — maybe 1–2. Large, ambiguous visions need more — maybe 4+. Do not pad rounds to hit a number. Stop when the Depth Enforcement checklist below is fully satisfied.
-- Do not count the reflection step as a question round. Rounds start after reflection is confirmed.
-- When you genuinely believe the depth checklist is satisfied, move to the Depth Verification step below. Do not ask a separate "ready to wrap up?" gate — the depth verification IS the gate.
+Before starting Layer 1, identify the primary work type and state it:
+
+"Based on your description and the codebase, this is primarily **[work type]** work."
+
+Work types include: API/backend, UI/frontend, CLI/developer tool, data pipeline, ML/AI, infrastructure/platform, refactoring/migration, or a combination. The user can correct this. The classification shapes which questions deserve deep exploration at each layer.
+
+### Layer 1 — Scope
+
+Resolve what's in and what's out. Ask about:
+- Feature boundaries — what exactly ships in this milestone vs later
+- Ambiguities in the user's description — anything you're unsure about, ask
+- Dependencies — what does this work depend on, what depends on it
+- Priority — if scope needs trimming, what matters most
+
+Adapt depth to work type:
+- **CLI work:** Focus on user mental model, command grammar, what existing commands do
+- **Refactoring:** Focus on what changes vs what must stay identical
+
+**Depth-matching:** Simple, well-defined scope may need 1 round. Ambiguous or large scope may need 3-4 rounds. Don't pad rounds to hit a number.
+
+#### Layer 1 Gate
+
+Summarize scope decisions in the user's own terminology:
+- What's included, what's excluded, what's deferred
+- Key boundaries and constraints
+
+Then ask: **"Does this capture the scope? Adjust anything before we move on."**
+
+If the user adjusts, reflect the updated understanding and ask again. Do not advance until the user explicitly confirms. If the user says "looks good, let's move faster" at any gate, respect that and advance.
+
+---
+
+### Layer 2 — Architecture
+
+Resolve how it's built. Ask about:
+- Per-slice technical decisions — what approach for each major piece
+- Inter-slice contracts — how do the pieces connect
+- Library/framework choices — with evidence from investigation, not assumptions
+- Integration with existing code — what patterns to follow, what to change
+
+Adapt depth to work type:
+- **API work:** Contracts, versioning, backwards compatibility, auth boundaries
+- **UI work:** Component boundaries, state management, data flow
+- **Infrastructure:** Deployment topology, failure domains, rollback
+
+Between rounds, use your available web search tools to research technologies from the Codebase Brief. Search for "[technology] [version] best practices [current year]" and "[technology] [version] known issues". Present findings alongside your questions.
+
+#### Layer 2 Gate
+
+Summarize architecture decisions, each with:
+- The decision and rationale
+- Evidence source (codebase patterns, library docs, web research)
+- Alternatives considered
+
+Then ask: **"Does this capture the architecture? Adjust anything before we move on."**
+
+Same gate rules: reflect adjustments, wait for confirmation.
+
+---
+
+### Layer 3 — Error States
+
+Resolve what happens when things fail. Present this layer with an option:
+
+"We can go deep on error handling and failure modes, or I can apply sensible defaults based on the architecture decisions above. Which do you prefer?"
+
+If the user chooses defaults, summarize what the defaults are and gate. If the user chooses to go deep, ask about:
+- Failure modes for each major component
+- Error propagation between layers (API → frontend, service → database)
+- Timeout, retry, and circuit-breaker strategies
+- What the user sees when something fails
+
+Adapt depth to work type:
+- **API work:** Rate limiting, timeout cascades, partial failure, status codes
+- **UI work:** Loading states, optimistic updates, offline behavior, error boundaries
+- **Data pipelines:** Data corruption, checkpoint recovery, idempotency
+
+#### Layer 3 Gate
+
+Summarize error handling strategy. Then ask: **"Does this capture how errors should be handled? Adjust anything before we move on."**
+
+---
+
+### Layer 4 — Quality Bar
+
+Resolve what "done" means concretely. Ask about:
+- Per-slice acceptance criteria — specific enough for automated verification
+- Test strategy — what types of tests, what coverage expectations
+- Definition of done — what must be true before the milestone ships
+- Non-functional requirements — performance, accessibility, security if relevant
+
+Adapt depth to work type:
+- **CLI work:** Shell compatibility, error message clarity, exit code semantics
+- **Refactoring:** Behavioral equivalence tests, not just code coverage
+- **UI work:** Visual regression criteria, responsive breakpoints
+
+#### Layer 4 Gate
+
+Summarize quality bar: acceptance criteria, test strategy, definition of done. Then ask: **"Does this capture the quality bar? Adjust anything before we move on to requirements and roadmap?"**
+
+---
+
+### Layer cadence
+
+- Do not count the reflection step as a question round. Rounds start at Layer 1 after reflection is confirmed.
+- When all four layer gates have been confirmed (or skipped by the user), move to the Depth Verification step below. Do not ask a separate "ready to wrap up?" gate — the depth verification confirms the full picture.
 
 ## Questioning Philosophy
 
@@ -225,6 +326,14 @@ Once the user is satisfied, in a single pass:
 **Depth-Preservation Guidance for context.md:**
 When writing context.md, preserve the user's exact terminology, emphasis, and specific framing from the discussion. Do not paraphrase user nuance into generic summaries. If the user said "craft feel," write "craft feel" — not "high-quality user experience." If they emphasized a specific constraint or negative requirement, carry that emphasis through verbatim. The context file is downstream agents' only window into this conversation — flattening specifics into generics loses the signal that shaped every decision.
 
+**Structured sections from discussion layers:**
+When writing CONTEXT.md, include structured sections that map to the discussion layers:
+- **Scope** — what's in, what's out, what's deferred (from Layer 1 gate summary)
+- **Architectural Decisions** — each with rationale, evidence source, alternatives considered (from Layer 2 gate summary)
+- **Error Handling Strategy** — failure modes, propagation, user-facing error behavior (from Layer 3 gate summary)
+- **Acceptance Criteria** — per-slice criteria specific enough for the planner to use directly (from Layer 4 gate summary)
+These sections are in addition to whatever other context the discussion surfaced.
+
 4. Write `{{contextPath}}` — use the **Context** output template below. Preserve key risks, unknowns, existing codebase constraints, integration points, and relevant requirements surfaced during discussion.
 5. Call `gsd_plan_milestone` to create the roadmap. Decompose into demoable vertical slices with risk, depends, demo sentences, proof strategy, verification classes, milestone definition of done, requirement coverage, and a boundary map. If the milestone crosses multiple runtime boundaries, include an explicit final integration slice that proves the assembled system works end-to-end in a real environment. Use the **Roadmap** output template below to structure the tool call parameters.
 6. For each architectural or pattern decision made during discussion, call `gsd_decision_save` — the tool auto-assigns IDs and regenerates `.gsd/DECISIONS.md` automatically.