Mikael Hugo 0266ca3ec8 docs(sf): wire parentTrace into advisory-partner dispatch

Adds a Dispatch Pattern subsection showing the parentTrace shape for
advisory review. For advisory, the trace is the planner's reasoning trail
(alternatives considered, untested assumptions, explicit out-of-scope) —
not tool calls. This lets the advisory reviewer catch the gap between
what the planner thought and what the artefact says, which is exactly
what advisory review exists to catch.

2026-05-02 03:45:37 +02:00

5.8 KiB


name: advisory-partner
description: Framework for independent advisory review of plans and decisions. Runs as a separate task with the validation model, NOT as self-review by the planning model. In SF, this is the framework used by gate-evaluate (Q3/Q4) and validate-milestone (MV01-MV04). Use when dispatching a subagent to review a plan before committing to it.

Advisory Partner: Independent Review with a Different Model

This skill is for independent review, not self-review. It is meant to run as a separate agent dispatch using the validation model, giving a genuine second opinion on plans and decisions.

In SF, this pattern is already built into the pipeline:

gate-evaluate runs Q3 (security abuse surface) and Q4 (broken promises) with the validation model before slice execution
validate-milestone runs MV01-MV04 with the validation model after milestone execution
Both use a different model from the planning/execution model — that's the point

Do NOT add this to always_use_skills — that would make the planning model self-review, which misses the point. The advisory value comes from a different model challenging the plan.

When to Dispatch an Advisory Review

Use subagent to dispatch an advisory review (with the validation model) when:

A milestone plan has high risk or novel architecture — dispatch advisory before plan-milestone commits
A slice plan crosses multiple subsystems — dispatch advisory before execute-task starts
A significant architectural decision needs challenge — dispatch advisory, then write the ADR
The planning model is uncertain and needs a second opinion

Dispatch Pattern

Pass parentTrace so the advisory agent reviews the planner's actual reasoning trail, not just the artefact that landed. For advisory dispatch, the trace carries reasoning rather than tool calls: alternatives considered (including what was rejected and why), untested assumptions, points where the planner is uncertain, and what was explicitly ruled out of scope. The advisory agent uses this to find the gap between what the planner thought and what the artefact says.

subagent({
  agent: "reviewer",
  model: "<validation-tier model>",
  parentTrace: "Reasoning trail (what the planner considered):\n" +
               "- Alternatives weighed: A vs B vs C; chose B because <reason>\n" +
               "- Untested assumptions: <assumption 1>, <assumption 2>\n" +
               "- Where the planner is uncertain: <one or two specifics>\n" +
               "- What was explicitly out of scope: <items>",
  task: "Apply the advisory-partner protocol to <artefact path>. " +
        "Answer Q1-Q5, run the trap scan, end with ADVISORY VERDICT."
})
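
The parentTrace string above can also be assembled from structured notes rather than typed by hand. The following is a minimal sketch under stated assumptions: `ReasoningTrail` and `buildParentTrace` are hypothetical names introduced for illustration and are not part of SF or of the subagent tool.

```typescript
// Hypothetical shape for the planner's reasoning trail. The field names are
// illustrative only; they mirror the four bullets in the dispatch example.
interface ReasoningTrail {
  alternatives: string;
  untestedAssumptions: string[];
  uncertainties: string[];
  outOfScope: string[];
}

// Render the trail in the same four-bullet layout as the dispatch example.
// Empty lists are spelled out so the reviewer can tell "nothing recorded"
// apart from "field forgotten".
function buildParentTrace(trail: ReasoningTrail): string {
  const orNone = (items: string[]): string =>
    items.length > 0 ? items.join(", ") : "(none recorded)";
  return [
    "Reasoning trail (what the planner considered):",
    `- Alternatives weighed: ${trail.alternatives}`,
    `- Untested assumptions: ${orNone(trail.untestedAssumptions)}`,
    `- Where the planner is uncertain: ${orNone(trail.uncertainties)}`,
    `- What was explicitly out of scope: ${orNone(trail.outOfScope)}`,
  ].join("\n");
}

// Example input with made-up planning notes.
const exampleTrace: string = buildParentTrace({
  alternatives: "queue vs cron vs webhook; chose queue because retries are built in",
  untestedAssumptions: ["broker is already provisioned"],
  uncertainties: [],
  outOfScope: ["multi-region failover"],
});
```

The resulting string would be passed as the parentTrace value in the dispatch call. Keeping the notes structured until the last moment makes it harder to leave out the uncertainty or out-of-scope bullets by accident.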

The advisory verdict carries more weight when the reviewer can see what the planner thought and not just what the planner wrote. Hidden assumptions and waved-away objections are exactly what advisory review exists to catch.

Advisory Review Protocol

When running as the advisory agent, apply this framework:

1. State the Decision Under Review

Read the artefact being reviewed (CONTEXT.md, ROADMAP.md, or slice plan). Summarize:

"The plan proposes [X]. The core claim is [Y]. The alternative not taken is [Z]."

If you can't fill this in, the plan is incomplete.

2. Five Challenger Questions

Answer each from the artefact, not from general knowledge:

Q1: What problem does this actually solve? Which specific struggling moment (TODO, FIXME, missing test, user complaint) does this plan address? If none, flag it.

Q2: What assumptions are baked in? List ≥3 assumptions. For each: is it tested or untested? What happens if it's false?

Q3: What's the failure mode in 6 months? Name the specific thing that breaks, not "it doesn't work." Who notices first?

Q4: What's the simplest proof point? What's the smallest deliverable that would confirm the plan is on the right track? Does the plan include it early enough?

Q5: What's the strongest objection? Write the objection. Then answer it. If the answer is weak, that's a flag.
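
Some of the checks in the five questions are mechanical (a missing struggling moment, fewer than three assumptions, no proof point), and a reviewer could record answers in a structure that surfaces them. This is a sketch only: `ChallengerAnswers` and `mechanicalFlags` are hypothetical names, and judging whether a Q5 answer is weak remains a reviewer call that the code does not attempt.

```typescript
// Hypothetical record of the reviewer's answers; names are illustrative only.
interface Assumption {
  statement: string;
  tested: boolean;
  ifFalse: string; // what happens if the assumption does not hold
}

interface ChallengerAnswers {
  strugglingMoment: string | null; // Q1: null when the artefact names none
  assumptions: Assumption[];       // Q2
  failureMode: string;             // Q3: the specific thing that breaks
  proofPoint: string | null;       // Q4: null when the plan has none
  objection: string;               // Q5
  objectionAnswer: string;         // Q5
}

// Collect only the flags that can be decided without judgement.
function mechanicalFlags(a: ChallengerAnswers): string[] {
  const flags: string[] = [];
  if (a.strugglingMoment === null) {
    flags.push("Q1: no struggling moment identified");
  }
  if (a.assumptions.length < 3) {
    flags.push(`Q2: only ${a.assumptions.length} assumption(s) listed, need at least 3`);
  }
  const untested = a.assumptions.filter((x) => !x.tested).length;
  if (untested > 0) {
    flags.push(`Q2: ${untested} untested assumption(s)`);
  }
  if (a.failureMode.trim() === "") {
    flags.push("Q3: no specific failure mode named");
  }
  if (a.proofPoint === null) {
    flags.push("Q4: no proof point in the plan");
  }
  if (a.objectionAnswer.trim() === "") {
    flags.push("Q5: objection left unanswered");
  }
  return flags;
}
```

An empty result does not mean the plan is sound; it only means the reviewer has filled in every question and can move on to the trap scan.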

3. Trap Scan

Trap	Warning Sign
Shiny object	New tech without a diagnosed struggling moment
Scope creep	"While we're at it..." additions
Premature abstraction	Generic infrastructure with <3 real use cases
Missing consumer	Module with no real callers (only tests)
Axle not scooter	Infrastructure layer with no standalone demo value
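
Two rows of the trap table have countable warning signs (fewer than three real use cases, no callers outside tests). As a rough sketch, assuming the reviewer has already gathered those counts, they could be checked as below. `ModuleFacts` and `scanCountableTraps` are hypothetical names, and the other three traps still need a human read of the plan.

```typescript
// Hypothetical facts a reviewer might collect about a proposed module.
interface ModuleFacts {
  name: string;
  isGenericInfrastructure: boolean; // is the module pitched as reusable infrastructure?
  realUseCases: number;             // use cases that exist today
  nonTestCallers: number;           // callers outside the test suite
}

// Return the names of the countable traps that the module hits.
function scanCountableTraps(m: ModuleFacts): string[] {
  const hits: string[] = [];
  // Trap table: generic infrastructure with <3 real use cases.
  if (m.isGenericInfrastructure && m.realUseCases < 3) {
    hits.push("Premature abstraction");
  }
  // Trap table: module with no real callers (only tests).
  if (m.nonTestCallers === 0) {
    hits.push("Missing consumer");
  }
  return hits;
}
```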

4. Verdict

ADVISORY VERDICT: [PROCEED / PROCEED WITH CAVEAT / RECONSIDER]

Solid: [what's well-grounded]
Gaps: [what needs resolution]
Action: [one sentence]

PROCEED: plan is grounded, assumptions tested, clear proof point exists
PROCEED WITH CAVEAT: one specific thing must be resolved first (state it)
RECONSIDER: a core assumption is untested or a better approach exists (state it)
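
If the dispatching agent needs to act on the verdict programmatically, the block above is regular enough to format and parse. The sketch below assumes the reviewer emits the four labelled lines exactly as in the template, without the square brackets; `AdvisoryVerdict`, `formatVerdict`, and `parseVerdict` are hypothetical names, not an existing SF API.

```typescript
type Verdict = "PROCEED" | "PROCEED WITH CAVEAT" | "RECONSIDER";

interface AdvisoryVerdict {
  verdict: Verdict;
  solid: string;
  gaps: string;
  action: string;
}

// Render the verdict in the layout of the template above.
function formatVerdict(v: AdvisoryVerdict): string {
  return [
    `ADVISORY VERDICT: ${v.verdict}`,
    "",
    `Solid: ${v.solid}`,
    `Gaps: ${v.gaps}`,
    `Action: ${v.action}`,
  ].join("\n");
}

// Map free text onto one of the three allowed verdicts, or null.
function toVerdict(raw: string): Verdict | null {
  switch (raw) {
    case "PROCEED":
      return "PROCEED";
    case "PROCEED WITH CAVEAT":
      return "PROCEED WITH CAVEAT";
    case "RECONSIDER":
      return "RECONSIDER";
    default:
      return null;
  }
}

// Parse a reviewer reply; returns null if any field is missing or the
// verdict is not one of the three allowed values.
function parseVerdict(text: string): AdvisoryVerdict | null {
  const field = (label: string): string | null => {
    const m = text.match(new RegExp("^" + label + ":\\s*(.+)$", "m"));
    const value = m ? m[1] : undefined;
    return value === undefined ? null : value.trim();
  };
  const raw = field("ADVISORY VERDICT");
  const solid = field("Solid");
  const gaps = field("Gaps");
  const action = field("Action");
  if (raw === null || solid === null || gaps === null || action === null) {
    return null;
  }
  const verdict = toVerdict(raw);
  return verdict === null ? null : { verdict, solid, gaps, action };
}
```

Returning null on anything unexpected is a deliberate choice in this sketch: a malformed verdict should fall back to a human reading the reply rather than be treated as PROCEED.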

Integration with SF Gate System

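The existing SF gates are specific instances of this advisory framework. Note that the gate IDs below are not the challenger questions Q1-Q5 from the protocol above: gate Q3 asks about exploitability, for example, while challenger Q3 asks about the six-month failure mode.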

Gate	Question	Owner Turn	Model
Q3	How can this be exploited?	`gate-evaluate`	`validation`
Q4	What existing promises does this break?	`gate-evaluate`	`validation`
Q5	What breaks when dependencies fail?	`execute-task`	`execution`
Q8	How will ops know this is healthy?	`complete-slice`	`completion`
MV01-MV04	Requirements coverage, integration, acceptance criteria	`validate-milestone`	`validation`

If you want advisory review for planning decisions that the gates do not yet cover, dispatch a subagent explicitly with the advisory-partner prompt and set its model to the validation tier, as in the Dispatch Pattern section.
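
The gate table can be read as a routing map from gate to owner turn and model tier. The sketch below transcribes the rows into data and looks up which gates run on a given tier. The `Gate` type and `gatesForModel` helper are hypothetical; how SF actually stores this mapping is not shown in this document.

```typescript
type ModelTier = "validation" | "execution" | "completion";

interface Gate {
  id: string;
  question: string;
  ownerTurn: string;
  model: ModelTier;
}

// Rows transcribed from the gate table above.
const GATES: Gate[] = [
  { id: "Q3", question: "How can this be exploited?", ownerTurn: "gate-evaluate", model: "validation" },
  { id: "Q4", question: "What existing promises does this break?", ownerTurn: "gate-evaluate", model: "validation" },
  { id: "Q5", question: "What breaks when dependencies fail?", ownerTurn: "execute-task", model: "execution" },
  { id: "Q8", question: "How will ops know this is healthy?", ownerTurn: "complete-slice", model: "completion" },
  {
    id: "MV01-MV04",
    question: "Requirements coverage, integration, acceptance criteria",
    ownerTurn: "validate-milestone",
    model: "validation",
  },
];

// List the gate IDs that run on a given model tier, in table order.
function gatesForModel(model: ModelTier): string[] {
  return GATES.filter((g) => g.model === model).map((g) => g.id);
}
```

Only the rows on the validation tier are independent reviews in the sense this skill describes; the execution and completion rows are answered by the model doing the work.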

Sources

Richard Rumelt, Good Strategy/Bad Strategy — diagnosis before policy
Ryan Singer, Shape Up — proof points, appetite-based scoping
Teresa Torres, Continuous Discovery Habits — assumption testing
