singularity-forge/src/resources/extensions/sf/skills/advisory-partner/SKILL.md
Mikael Hugo 0266ca3ec8 docs(sf): wire parentTrace into advisory-partner dispatch
Adds a Dispatch Pattern subsection showing the parentTrace shape for
advisory review. For advisory, the trace is the planner's reasoning trail
(alternatives considered, untested assumptions, explicit out-of-scope) —
not tool calls. This lets the advisory reviewer catch the gap between
what the planner thought and what the artefact says, which is exactly
what advisory review exists to catch.
2026-05-02 03:45:37 +02:00


| name | description |
| --- | --- |
| advisory-partner | Framework for independent advisory review of plans and decisions. Runs as a separate task with the validation model, NOT self-review by the planning model. In SF, this is the framework used by gate-evaluate (Q3/Q4) and validate-milestone (MV01-MV04). Use when dispatching a subagent to review a plan before committing to it. |

Advisory Partner: Independent Review with a Different Model

This skill is for independent review, not self-review. It is meant to run as a separate agent dispatch using the validation model, giving a genuine second opinion on plans and decisions.

In SF, this pattern is already built into the pipeline:

  • gate-evaluate runs Q3 (security abuse surface) and Q4 (broken promises) with the validation model before slice execution
  • validate-milestone runs MV01-MV04 with the validation model after milestone execution
  • Both use a different model from the planning/execution model — that's the point

Do NOT add this to always_use_skills. Doing so would make the planning model review its own plan, which defeats the purpose: the advisory value comes from a different model challenging the plan.


When to Dispatch an Advisory Review

Use subagent to dispatch an advisory review (with the validation model) when:

  • A milestone plan has high risk or novel architecture — dispatch advisory before plan-milestone commits
  • A slice plan crosses multiple subsystems — dispatch advisory before execute-task starts
  • A significant architectural decision needs challenge — dispatch advisory, then write the ADR
  • The planning model is uncertain and needs a second opinion
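
The triggers above can be expressed as a small check. This is only a sketch: the `PlanSignals` shape and `advisoryTriggers` name are hypothetical and not part of SF, and what counts as "high risk" or "significant" is left to the caller.

```typescript
// Hypothetical summary of a plan; none of these fields come from SF itself.
interface PlanSignals {
  kind: "milestone" | "slice";
  highRisk: boolean;
  novelArchitecture: boolean;
  subsystemsTouched: number;
  significantDecision: boolean; // an architectural decision that will get an ADR
  plannerUncertain: boolean;
}

// Returns the reasons an advisory dispatch is warranted.
// An empty array means none of the listed triggers fired.
function advisoryTriggers(p: PlanSignals): string[] {
  const reasons: string[] = [];
  if (p.kind === "milestone" && (p.highRisk || p.novelArchitecture)) {
    reasons.push("milestone plan has high risk or novel architecture");
  }
  if (p.kind === "slice" && p.subsystemsTouched > 1) {
    reasons.push("slice plan crosses multiple subsystems");
  }
  if (p.significantDecision) {
    reasons.push("significant architectural decision needs challenge");
  }
  if (p.plannerUncertain) {
    reasons.push("planning model is uncertain");
  }
  return reasons;
}
```

Returning the reasons rather than a boolean lets the caller pass them on to the reviewer as context.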

Dispatch Pattern

Pass parentTrace so the advisory agent reviews the planner's actual reasoning trail — not just the artefact that landed. For advisory dispatch, the trace shape is reasoning, not tool calls: alternatives considered, assumptions made, what the planner is uncertain about, what was rejected and why. The advisory agent uses this to find the gap between what the planner thought and what the artefact says.

subagent({
  agent: "reviewer",
  model: "<validation-tier model>",
  parentTrace: "Reasoning trail (what the planner considered):\n" +
               "- Alternatives weighed: A vs B vs C; chose B because <reason>\n" +
               "- Untested assumptions: <assumption 1>, <assumption 2>\n" +
               "- Where the planner is uncertain: <one or two specifics>\n" +
               "- What was explicitly out of scope: <items>",
  task: "Apply the advisory-partner protocol to <artefact path>. " +
        "Answer Q1-Q5, run the trap scan, end with ADVISORY VERDICT."
})

The advisory verdict carries more weight when the reviewer can see what the planner thought and not just what the planner wrote. Hidden assumptions and waved-away objections are exactly what advisory review exists to catch.
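
If the planner keeps its reasoning in structured form, the parentTrace string can be assembled rather than hand-written. The sketch below assumes a hypothetical `ReasoningTrail` shape and `buildParentTrace` helper (neither is an SF API) and reproduces the four-bullet layout from the dispatch example.

```typescript
// Hypothetical shape for the planner's reasoning trail; field names are
// illustrative and not part of any documented SF API.
interface ReasoningTrail {
  alternatives: string[]; // options weighed, e.g. ["A", "B", "C"]
  chosen: string; // the option that was picked
  reason: string; // why it was picked
  untestedAssumptions: string[];
  uncertainties: string[];
  outOfScope: string[];
}

// Renders the trail in the same layout as the dispatch example:
// a header line followed by four bullets.
function buildParentTrace(trail: ReasoningTrail): string {
  // An empty list is stated explicitly so the reviewer can tell
  // "nothing recorded" apart from "field omitted".
  const list = (items: string[]): string =>
    items.length > 0 ? items.join(", ") : "(none recorded)";
  return [
    "Reasoning trail (what the planner considered):",
    `- Alternatives weighed: ${trail.alternatives.join(" vs ")}; chose ${trail.chosen} because ${trail.reason}`,
    `- Untested assumptions: ${list(trail.untestedAssumptions)}`,
    `- Where the planner is uncertain: ${list(trail.uncertainties)}`,
    `- What was explicitly out of scope: ${list(trail.outOfScope)}`,
  ].join("\n");
}
```

The returned string can be passed as the `parentTrace` value in the `subagent` call.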


Advisory Review Protocol

When running as the advisory agent, apply this framework:

1. State the Decision Under Review

Read the artefact under review (CONTEXT.md, ROADMAP.md, or a slice plan) and summarize it in this form:

"The plan proposes [X]. The core claim is [Y]. The alternative not taken is [Z]."

If you can't fill this in, the plan is incomplete.


2. Five Challenger Questions

Answer each from the artefact, not from general knowledge:

Q1: What problem does this actually solve? Which specific struggling moment (TODO, FIXME, missing test, user complaint) does this plan address? If none, flag it.

Q2: What assumptions are baked in? List ≥3 assumptions. For each: is it tested or untested? What happens if it's false?

Q3: What's the failure mode in 6 months? Name the specific thing that breaks, not "it doesn't work." Who notices first?

Q4: What's the simplest proof point? What's the smallest deliverable that would confirm the plan is on the right track? Does the plan include it early enough?

Q5: What's the strongest objection? Write the objection. Then answer it. If the answer is weak, that's a flag.
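
Q2 is the most mechanical of the five questions, so its checks can be sketched in code. The `Assumption` shape and `auditAssumptions` name below are assumptions made for illustration; the thresholds come straight from the Q2 wording (at least three assumptions, each marked tested or untested, each with a stated consequence).

```typescript
// Hypothetical record for one assumption extracted from the artefact.
interface Assumption {
  statement: string;
  tested: boolean;
  ifFalse: string; // what happens if the assumption turns out to be false
}

// Returns a list of flags for the reviewer; an empty list means Q2 raised nothing.
function auditAssumptions(assumptions: Assumption[]): string[] {
  const flags: string[] = [];
  if (assumptions.length < 3) {
    flags.push(`Only ${assumptions.length} assumption(s) listed; Q2 asks for at least 3`);
  }
  for (const a of assumptions) {
    if (!a.tested) {
      flags.push(`Untested: ${a.statement}`);
    }
    if (a.ifFalse.trim() === "") {
      flags.push(`No consequence stated: ${a.statement}`);
    }
  }
  return flags;
}
```

An untested assumption is not automatically a blocker; the flag only tells the reviewer where to look before writing the verdict.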


3. Trap Scan

| Trap | Warning Sign |
| --- | --- |
| Shiny object | New tech without a diagnosed struggling moment |
| Scope creep | "While we're at it..." additions |
| Premature abstraction | Generic infrastructure with <3 real use cases |
| Missing consumer | Module with no real callers (only tests) |
| Axle not scooter | Infrastructure layer with no standalone demo value |
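
Two of the traps have warning signs that can be counted: premature abstraction and missing consumer. A minimal sketch of those two checks follows, assuming a hypothetical `ModuleFacts` record that the reviewer fills in from the artefact. The other three traps need judgment and are not covered here.

```typescript
// Hypothetical facts about one proposed module, gathered by the reviewer.
interface ModuleFacts {
  name: string;
  isGenericInfrastructure: boolean;
  realUseCases: number; // concrete use cases named in the plan
  productionCallers: number; // callers outside the test suite
}

// Returns the names of the countable traps this module falls into.
function scanCountableTraps(m: ModuleFacts): string[] {
  const hits: string[] = [];
  // Premature abstraction: generic infrastructure with fewer than 3 real use cases.
  if (m.isGenericInfrastructure && m.realUseCases < 3) {
    hits.push("Premature abstraction");
  }
  // Missing consumer: no real callers, only tests.
  if (m.productionCallers === 0) {
    hits.push("Missing consumer");
  }
  return hits;
}
```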

4. Verdict

ADVISORY VERDICT: [PROCEED / PROCEED WITH CAVEAT / RECONSIDER]

Solid: [what's well-grounded]
Gaps: [what needs resolution]
Action: [one sentence]

  • PROCEED: plan is grounded, assumptions tested, clear proof point exists
  • PROCEED WITH CAVEAT: one specific thing must be resolved first (state it)
  • RECONSIDER: a core assumption is untested or a better approach exists (state it)
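
Because the verdict line has a fixed form, the dispatching side can read it back out of the reviewer's reply. The helper below is a sketch, not an SF function: `parseVerdict` is a hypothetical name, and it assumes the reviewer writes the verdict on its own line without the square brackets used in the template.

```typescript
type Verdict = "PROCEED" | "PROCEED WITH CAVEAT" | "RECONSIDER";

// Extracts the verdict from the reviewer's output, or returns null when no
// well-formed "ADVISORY VERDICT:" line is present.
function parseVerdict(output: string): Verdict | null {
  // The "m" flag lets ^ and $ match at line boundaries, so the verdict line
  // can appear anywhere in the reply. The longer alternative is listed first.
  const match = output.match(
    /^ADVISORY VERDICT:[ \t]*(PROCEED WITH CAVEAT|PROCEED|RECONSIDER)[ \t]*$/m
  );
  return match ? (match[1] as Verdict) : null;
}
```

Treating a missing or malformed verdict as null, instead of defaulting to PROCEED, keeps an incomplete review from being read as approval.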

Integration with SF Gate System

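The existing SF gates are specific instances of this advisory framework. The gate IDs below (Q3, Q4, Q5, Q8) are SF gate numbers and are distinct from the five challenger questions in the review protocol: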

| Gate | Question | Owner Turn | Model |
| --- | --- | --- | --- |
| Q3 | How can this be exploited? | gate-evaluate | validation |
| Q4 | What existing promises does this break? | gate-evaluate | validation |
| Q5 | What breaks when dependencies fail? | execute-task | execution |
| Q8 | How will ops know this is healthy? | complete-slice | completion |
| MV01-MV04 | Requirements coverage, integration, acceptance criteria | validate-milestone | validation |

For planning decisions that the gates do not yet cover, dispatch a subagent explicitly with the advisory-partner prompt and request that it use the validation model configuration.


Sources

  • Richard Rumelt, Good Strategy/Bad Strategy — diagnosis before policy
  • Ryan Singer, Shape Up — proof points, appetite-based scoping
  • Teresa Torres, Continuous Discovery Habits — assumption testing