feat(sf): port 16 workflow templates from gsd-2

Adds 16 ready-to-use workflow templates that gsd-2 has but SF was
missing. Each runs via /sf workflow run <name> or /sf start <name>.

Markdown phased workflows (10):
- accessibility-audit  — UI a11y scan + remediation report
- api-breaking-change  — survey callers, migrate, deprecate, schedule removal
- changelog-gen        — release notes from git log since last tag
- ci-bootstrap         — minimal-working CI pipeline
- dead-code            — find unused functions/files (report only, no delete)
- issue-triage         — classify a GitHub issue + label/priority recommendation
- observability-setup  — structured logs, metrics, tracing
- onboarding-check     — walk README as new contributor, report gaps
- performance-audit    — measure → fix → measure
- pr-review            — structured code review of a PR
- pr-triage            — bucket open PRs (merge/close/nudge)
- release              — version bump → changelog → tag → publish (gated)

YAML-step iterators (4):
- docs-sync            — backfill JSDoc/TSDoc on undocumented exports
- env-audit            — inventory env vars + flag drift
- rename-symbol        — global rename across code/tests/docs
- test-backfill        — write unit tests for untested functions

All gsd-specific refs adapted: /gsd → /sf, .gsd/ → .sf/, gsd-build/gsd-2
→ singularity-forge/sf-run.

Templates need no SF-runtime tools (sf_*, subagent, browser_*) — they
run via the bash + git + gh/npm commands the agent already has.
Discovery verified: discoverPlugins() picks up all 27 templates
(11 existing + 16 new); registry.json is 1:1 with the .md/.yaml files.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Mikael Hugo 2026-05-02 19:01:51 +02:00
parent 955ee66614
commit c3ab4bfccf
17 changed files with 1692 additions and 0 deletions


@@ -0,0 +1,88 @@
# Accessibility Audit
<template_meta>
name: accessibility-audit
version: 1
mode: oneshot
requires_project: false
artifact_dir: null
</template_meta>
<purpose>
Scan the UI layer for accessibility issues and produce a prioritized remediation
list. Oneshot — report only, no code changes.
</purpose>
<instructions>
## 1. Identify the UI stack
- React, Vue, Svelte, Angular, plain HTML, or something else?
- Which files contain user-facing templates? (e.g. `src/components/**/*.tsx`,
`pages/**/*.vue`, `templates/**/*.html`).
If the project has no UI layer (library, CLI, backend), say so and stop.
## 2. Run available a11y tooling
Prefer automated tools when installed:
- React: `@axe-core/react`, `eslint-plugin-jsx-a11y`.
- Vue: `eslint-plugin-vuejs-accessibility`.
- Any: `pa11y` or `axe` against a running dev server.
If nothing's installed, do a **static audit**: grep for the common
violations listed below.
## 3. Check the WCAG essentials
For each component/page:
1. **Images** without `alt`.
2. **Buttons** that are `<div onClick>` instead of real `<button>`.
3. **Links** without `href` or with only an icon and no label.
4. **Form inputs** without an associated `<label>` or `aria-label`.
5. **Color-only** state indicators (errors shown only with red, etc.).
6. **Focus management**: missing `:focus-visible`, tab traps, hidden focus.
7. **Headings** that skip levels (`h1` → `h3` with no `h2`).
8. **ARIA**: `role=button` on divs (should just be a button), misuse of
`aria-label` on elements that already have accessible text.
9. **Landmark regions**: missing `<main>`, `<nav>`, `<header>`.
10. **Keyboard traps**: modals/dialogs without escape handlers.
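If no tooling is installed, a few `rg` patterns give a rough first pass over the checks above. A sketch only: it assumes JSX/HTML sources and a ripgrep build with PCRE2, and the patterns will both over- and under-match.
```bash
# Rough static-audit greps (heuristics, not proof of a violation).
# Images with no alt attribute on the same tag:
rg --pcre2 -n --glob '*.{tsx,jsx,html,vue}' '<img(?![^>]*\balt=)'
# Click handlers on divs/spans instead of real buttons:
rg -n --glob '*.{tsx,jsx,vue}' '<(div|span)[^>]*onClick'
# Inputs with neither an aria-label nor an id (label association unlikely):
rg --pcre2 -n --glob '*.{tsx,jsx,html,vue}' '<input(?![^>]*(aria-label|id=))'
# role="button" bolted onto non-button elements:
rg -n --glob '*.{tsx,jsx,html,vue}' 'role="button"'
```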
## 4. Triage by severity
- **Blocker** (P0) — prevents a user from completing a core task with
keyboard / screen reader.
- **Serious** (P1) — materially degrades the experience but workable.
- **Moderate** (P2) — fixable, would benefit most users.
- **Minor** (P3) — polish.
## 5. Output
```
# A11y Audit — <date>
## Summary
<scope: how many components/pages reviewed, tool coverage>
## Blockers (P0) — <n>
- file:line — issue
Fix: <specific code suggestion>
## Serious (P1) — <n>
...
## Moderate / Minor
...
## Top 5 Recommendations
1. <highest-impact fix first>
```
## 6. Don't refactor
Suggestions should be specific enough to act on, but don't edit any files.
If the user wants to apply the fixes, suggest:
> `/sf workflow refactor "apply a11y fixes"` with this report as context.
</instructions>


@@ -0,0 +1,117 @@
# API Breaking Change Workflow
<template_meta>
name: api-breaking-change
version: 1
mode: markdown-phase
requires_project: false
artifact_dir: .sf/workflows/api-breaks/
</template_meta>
<purpose>
Remove or redesign a public API in a controlled way. Surveys all callers,
migrates them, deprecates the old surface, and schedules the removal release.
Built for APIs consumed by other modules in the repo AND by external
dependents where feasible.
</purpose>
<phases>
1. survey — Identify all callers, draft the new design
2. migrate — Land the new API and migrate internal callers
3. deprecate — Mark the old API deprecated, communicate the change
4. release — Remove the old API in a future release
</phases>
<process>
## Phase 1: Survey
**Goal:** Understand the blast radius before touching anything.
1. **Identify the old API:**
- What's the symbol/route/contract? Where is it defined?
- Is it internal-only, exported to the SDK, or a public network endpoint?
2. **Map callers:**
- Internal: `grep` the symbol across the repo. List every call site with
file:line.
- External (if applicable): check the package registry for direct
dependents, look for GitHub code search results, check the docs.
3. **Draft the new shape:**
- What's changing? (rename, signature change, semantic change?)
- What's the migration pattern for a typical caller?
- Can callers migrate incrementally, or is it all-or-nothing?
4. **Produce `SURVEY.md`** with:
- Old signature vs new signature.
- Full caller list (internal + best-effort external).
- Migration difficulty per caller type.
- Timeline proposal: deprecate in release X, remove in release Y.
5. **Gate:** Review the survey with the user. This is the "do we actually want
to do this?" checkpoint. Don't proceed if the blast radius is larger than
the benefit.
## Phase 2: Migrate
**Goal:** Land the new API and move internal callers to it.
1. **Introduce the new API alongside the old one:**
- Export the new function/class/endpoint.
- The old API still works unchanged.
- Add tests for the new API.
2. **Migrate internal callers:**
- One file at a time, atomic commits: `refactor(api): migrate <caller> to <new-api>`.
- Run tests after each batch.
3. **Add a feature flag** if helpful — some callers may need runtime toggles
during a staged rollout.
4. **Gate:** All internal callers migrated, tests green. Confirm before
proceeding to the deprecation phase.
## Phase 3: Deprecate
**Goal:** Tell external callers to migrate.
1. **Mark the old API deprecated:**
- Add `@deprecated` JSDoc / language-equivalent annotations.
- Log a runtime deprecation warning on first use (if feasible and the
language supports it). Include the migration path in the message.
2. **Update docs:**
- Changelog: a prominent `### Deprecated` section with migration guidance.
- README / API docs: note the deprecation timeline.
- If there's a `MIGRATIONS.md`, add an entry.
3. **Communicate:**
- Draft a release-notes entry with before/after code examples.
- If the API has external users, draft an issue or blog post.
4. **Ship** the deprecation release (coordinate with `/sf workflow release`).
5. **Gate:** Deprecation is live, callers have had time to migrate. Decide
the removal timeline (typically next minor or next major).
## Phase 4: Release (removal)
**Goal:** Delete the old API in a future release.
1. **Verify ecosystem readiness:**
- Have internal consumers upgraded?
- Have known external consumers upgraded? If not, is it OK to force it?
2. **Remove the old API:**
- Delete the deprecated code paths.
- Update tests.
- Update docs to remove references.
3. **Release** as part of a major version bump (semver). Document the removal
prominently in the changelog.
4. **Close the loop:** update `SURVEY.md` with the final outcome — what
shipped, what's still outstanding, any lessons learned.
</process>


@@ -0,0 +1,82 @@
# Changelog Generator
<template_meta>
name: changelog-gen
version: 1
mode: oneshot
requires_project: false
artifact_dir: null
</template_meta>
<purpose>
Generate a CHANGELOG entry (or release notes draft) from git commits since the
last release tag. Categorizes by type, follows Keep a Changelog format, and
writes to `CHANGELOG.md` (or prints if the user prefers).
</purpose>
<instructions>
## 1. Locate the range
- Find the last release tag: `git describe --tags --abbrev=0` (fall back to
the first commit if no tags exist).
- Gather commits: `git log <last_tag>..HEAD --oneline --no-merges`.
- If zero commits, say "No changes since <tag>" and stop.
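A minimal sketch of that lookup, including the no-tags fallback:
```bash
# Last release tag, or the root commit if the repo has never been tagged.
last_tag=$(git describe --tags --abbrev=0 2>/dev/null \
  || git rev-list --max-parents=0 HEAD | tail -n 1)
git log "${last_tag}..HEAD" --oneline --no-merges
# Note: with the root-commit fallback, the range excludes that first commit itself.
```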
## 2. Categorize each commit
Use the conventional-commit prefix when present:
| Prefix | Category |
| --------------------------- | --------------- |
| `feat:` / `feature:` | Added |
| `fix:` / `bugfix:` | Fixed |
| `refactor:` | Changed |
| `docs:` | Docs |
| `chore:` / `ci:` / `build:` | Maintenance |
| `perf:` | Performance |
| `BREAKING CHANGE` footer | Breaking |
For commits without a prefix, infer the category from the subject line.
## 3. Produce the entry
Format using Keep a Changelog v1.1 conventions:
```
## [Unreleased] - YYYY-MM-DD
### Added
- <user-visible description> (#PR / commit sha)
### Fixed
- …
### Changed
- …
### Breaking
- …
```
- Prefer user-visible descriptions over commit-log verbatim.
- When breaking changes are present, put the `### Breaking` section first, even though the template above lists it last.
- Omit empty sections.
## 4. Write or print
- If `CHANGELOG.md` exists, insert the new entry **after** the top-level
heading and before any existing `## [x.y.z]` entries. Do NOT touch prior
releases.
- If it doesn't exist, create one with the standard Keep a Changelog header.
- If the user's arguments include `--print`, print to the chat only — don't
write the file.
## 5. Report
End with:
- the file path (or "printed, not written"),
- the commit range used,
- the number of commits processed per category.
</instructions>


@@ -0,0 +1,144 @@
# CI Bootstrap Workflow
<template_meta>
name: ci-bootstrap
version: 1
mode: markdown-phase
requires_project: false
artifact_dir: .sf/workflows/ci/
</template_meta>
<purpose>
Set up continuous integration for a project that has none (or needs a rewrite).
Picks a provider, builds a minimal working pipeline, and incrementally adds
lint / test / build / deploy stages. Goal: green CI on the first PR after
bootstrap, not a 2000-line YAML file no one will maintain.
</purpose>
<phases>
1. discover — Understand the stack and what "passing" means today
2. design — Choose provider + plan the pipeline
3. implement — Land the config, fix local failures so CI passes
4. verify — Confirm the pipeline catches regressions
</phases>
<process>
## Phase 1: Discover
**Goal:** Know what to automate.
1. **Detect the stack:**
- Primary language(s) and version(s) — check `package.json`, `pyproject.toml`,
`go.mod`, `Cargo.toml`, `.tool-versions`, `.nvmrc`.
- Package manager (npm / pnpm / yarn / pip / poetry / cargo / go).
- Test runner(s).
- Linter(s) / formatter(s).
- Build tool(s).
2. **Run each check locally and record:**
- `<install command>` — does it complete?
- `<lint command>` — does it pass? How many warnings/errors?
- `<test command>` — does it pass? What's the duration?
- `<build command>` — does it pass? What's the output?
If any fails locally, CI will fail too. Record the failures honestly —
we'll triage in Phase 2.
3. **Check existing CI config:**
- `.github/workflows/`, `.circleci/`, `.gitlab-ci.yml`, `azure-pipelines.yml`.
- Is something already there but disabled / broken?
4. **Write `DISCOVERY.md`:**
- Stack summary.
- Current local-check status (pass/fail per step).
- Existing CI state.
- Constraints: open-source (free minutes matter), private, self-hosted?
5. **Gate:** Confirm the discovery before spending time on CI config.
## Phase 2: Design
**Goal:** Pick a provider and shape the minimal viable pipeline.
1. **Choose a provider.** Default to **GitHub Actions** if the repo is on
GitHub — it's the most portable and well-documented. Alternatives:
- GitLab CI for GitLab-hosted repos.
- CircleCI for existing infra / orb reuse.
- Ask before picking something else.
2. **Plan the pipeline stages**:
- `install` — cache-friendly dependency install.
- `lint` — run linters (non-blocking? or blocking?).
- `test` — run tests (parallelize if the suite is slow).
- `build` — run the build (only if it produces a required artifact).
- `deploy` — optional, usually a later phase.
3. **Decide triggers:**
- Pull requests to `main` — always.
- Pushes to `main` — yes, for post-merge verification.
- Nightly / cron — only if the project has flakes that need monitoring.
4. **Plan caching** — this is what makes CI fast:
- Package manager caches (`node_modules`, `.venv`, `~/.cargo`).
- Build output caches (turborepo, bazel, etc.) if the project uses them.
5. **Write `PLAN.md`:**
- Provider.
- Pipeline YAML sketch (high-level, not final).
- Jobs + their dependencies.
- Expected first-run duration (estimate).
6. **Gate:** Confirm the plan. Scope creep on CI is very real.
## Phase 3: Implement
**Goal:** Land a green pipeline.
1. **Write the CI config** — single file in the correct location.
- Use the latest stable syntax.
- Pin action versions by tag (`actions/checkout@v4`), not `latest`.
- Keep to one matrix axis unless there's a strong reason.
2. **Triage local failures first.** If Phase 1 surfaced lint or test
failures, either fix them now or explicitly mark them as
`continue-on-error` (document why).
3. **Iterate locally** using `act` for GitHub Actions if available (see the
sketch after this list), or push to a feature branch and watch the run.
4. **Commit atomically:**
```
ci: add GitHub Actions pipeline (lint, test, build)
```
5. **Append notes to `IMPL.md`** — gotchas, action version picks, caching
decisions, any YAML tricks.
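If `act` is installed, the local iteration loop from step 3 might look like this (the job and branch names are placeholders):
```bash
# List the jobs act can see for a PR event, then run the test job locally.
act pull_request --list
act pull_request --job test          # assumes a job literally named "test"
# Without act: push the branch and watch the run from the CLI.
git push -u origin ci-bootstrap      # hypothetical branch name
gh run watch
```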
## Phase 4: Verify
**Goal:** Prove the pipeline catches what it should.
1. **Green on main** — the pipeline must pass on the current `main`.
2. **Red on a broken PR** — open a test PR that:
- Introduces a lint violation.
- Breaks a test.
- Breaks the build.
Confirm CI catches each one, then revert.
3. **Check the timing** — total duration should be acceptable. If it's >15
min for a small project, look at caching and parallelization.
4. **Document for contributors** — `CONTRIBUTING.md` (or equivalent):
- How to run each check locally.
- What the CI does.
- How to debug a red build.
5. **Gate:** Final demo. Show the user:
- A green run on main.
- A red run on a deliberately-broken branch.
- The timing.
- The docs.
</process>


@@ -0,0 +1,81 @@
# Dead Code Finder
<template_meta>
name: dead-code
version: 1
mode: oneshot
requires_project: false
artifact_dir: null
</template_meta>
<purpose>
Find functions, files, and exports that appear unused. Report them with
evidence. **Do not delete anything** — the human decides what's safe to remove.
</purpose>
<instructions>
## 1. Identify the language & toolchain
Inspect the repo root to determine the primary language. Based on that, pick
the appropriate tooling:
| Language | Tool(s) |
| --------------- | ------------------------------------------------------- |
| TypeScript / JS | `ts-unused-exports`, `knip`, or manual `grep` fallback |
| Python | `vulture` |
| Go | `staticcheck` + manual dead-code detection |
| Rust | `cargo +nightly udeps`, `cargo-machete` |
| Other | language-appropriate tool or manual symbol search |
If a suitable tool is installed in the project, use it. Otherwise fall back to
a systematic manual search (see step 3).
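Typical invocations when the tools are available (flags vary by version; treat these as starting points, not required commands):
```bash
# TypeScript / JS
npx knip                            # unused files, exports, and dependencies
npx ts-unused-exports tsconfig.json
# Python
vulture src/
# Rust
cargo machete                       # unused dependencies
# Go
staticcheck ./...
```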
## 2. Scan for candidates
Look for four kinds of dead code:
1. **Unused exports** — exports no other file imports.
2. **Unused files** — files imported by nothing (and not an entry point).
3. **Dead branches** — functions that are reachable but have branches that
can never execute given the call sites.
4. **Unused dependencies** — packages in `package.json` / `pyproject.toml`
that aren't imported anywhere.
## 3. Verify each candidate
Dead-code tools are noisy. Before reporting anything, manually confirm by:
- Searching for the symbol name across the repo (`grep -r` or `rg`).
- Checking build configs (webpack, vite, rollup) for dynamic imports.
- Checking test configs for fixtures or test-only code.
- Checking for usage in templates, strings, or dynamic accessors (these
often trip tools).
If a symbol is only referenced in a test, distinguish:
- Real dead code: test exists but production never calls it.
- Test-only helper: legitimate — not dead.
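One way to double-check a candidate before it goes in the report, assuming `rg` is available and using a hypothetical symbol name:
```bash
SYMBOL=parseConfig   # hypothetical candidate symbol
# Any reference at all, including strings and dynamic accessors:
rg -n "$SYMBOL" --glob '!node_modules'
# Dynamic imports / lazy requires that static analyzers often miss:
rg -n 'import\(|require\(' src/ | rg "$SYMBOL"
# Mentions in build or test config:
rg -n "$SYMBOL" vite.config.* webpack.config.* jest.config.* 2>/dev/null || true
```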
## 4. Bucket by confidence
```
## High confidence (safe to remove)
- file path:line — symbol / file
Evidence: <why you're sure — zero refs, no dynamic import candidates, etc.>
## Medium confidence (check with author)
- file path:line — symbol
Evidence: <only referenced in one unclear place, or dynamic-import-lookalike>
## Low confidence (ignore unless suspicious)
- file path:line
Reason: <looks dead but could be a public API, plugin surface, etc.>
```
## 5. Don't delete
End with:
> To remove high-confidence items, run `/sf workflow refactor "remove dead code"`
> and pass the list above as context.
The user decides. Do **not** delete files, `git rm`, or open a PR.
</instructions>


@@ -0,0 +1,76 @@
# Docs Sync workflow — mode: yaml-step
#
# Iterate over undocumented public exports, write JSDoc / TSDoc / equivalent
# for each, and verify the docs stay coherent with the code.
version: 1
name: docs-sync
mode: yaml-step
description: >-
Find undocumented public exports across the source tree, write
per-symbol documentation, and produce a summary of what changed.
params:
src_glob: "src/**/*.ts"
doc_style: "TSDoc"
steps:
- id: inventory
name: Inventory undocumented exports
prompt: >-
Scan {{src_glob}} for exported symbols (functions, classes, types,
constants) that lack a leading doc comment. Write one line per
undocumented export:
- <file_path>::<symbol_name> — <symbol kind>
Skip internal helpers, private exports, and generated code. If
everything is already documented, write "all documented" and no
list items. Save to `undocumented.md`.
requires: []
produces:
- undocumented.md
verify:
policy: content-heuristic
minSize: 1
- id: document
name: Document one export
prompt: >-
Add {{doc_style}} documentation to the symbol indicated by
{{_iter_capture_1}} (format: `<file_path>::<symbol_name>`).
- Describe what it does, not how.
- Document all parameters and the return type.
- Note any thrown errors or side effects.
- Keep it concise — one short paragraph + `@param` / `@returns`.
Append a log line to `docs-log.md`:
`- <symbol_name> — <file:line>`.
requires:
- inventory
context_from:
- inventory
produces:
- docs-log.md
iterate:
source: undocumented.md
pattern: "^- (.+::.+?) — "
verify:
policy: content-heuristic
- id: summary
name: Docs sync summary
prompt: >-
Produce a short summary at `docs-summary.md` listing: the total
number of exports documented, any that were intentionally skipped
with a reason, and a suggested follow-up (e.g. regenerate API docs
if the project has a generator).
requires:
- document
context_from:
- document
produces:
- docs-summary.md
verify:
policy: content-heuristic


@@ -0,0 +1,88 @@
# Env Audit workflow — mode: yaml-step
#
# Inventory every environment variable referenced across code/docs/CI, then
# audit each one for drift (documented? default? required vs optional?
# mentioned in CI?).
version: 1
name: env-audit
mode: yaml-step
description: >-
Find every environment variable the project uses and flag drift —
variables referenced in code but not in docs, or vice versa.
params:
src_glob: "src/**/*.{ts,js,py,go,rb}"
ci_paths: ".github/workflows .circleci .gitlab-ci.yml"
doc_paths: "README.md docs/ .env.example"
steps:
- id: inventory
name: Inventory env-var references
prompt: >-
Find every environment variable referenced in the project. Check:
- Code ({{src_glob}}): look for `process.env.X`, `os.environ["X"]`,
`os.Getenv("X")`, and equivalent patterns.
- Docs and `.env.example` ({{doc_paths}}).
- CI configs ({{ci_paths}}).
Produce `env-inventory.md` with one line per variable:
- <VAR_NAME> — code: <y|n>, docs: <y|n>, ci: <y|n>
Sort alphabetically. If any variable appears in one location but not
another, that's the drift — the next step will audit each entry.
requires: []
produces:
- env-inventory.md
verify:
policy: content-heuristic
- id: audit-var
name: Audit a single env variable
prompt: >-
Assess the env variable referenced by {{_iter_capture_1}} against
the three locations. For this variable, write an entry in
`env-audit.md`:
## <VAR_NAME>
- Code refs: <file:line list>
- Documented: <yes — link | no — gap>
- CI: <yes | no>
- Required / optional: <deduced from usage>
- Has a default: <yes — value | no>
- Drift: <one of: none | undocumented | unused-in-code | missing-in-ci | conflicting-default>
- Recommendation: <one sentence>
If the variable is used only in third-party code we can't control,
note that and move on.
requires:
- inventory
context_from:
- inventory
produces:
- env-audit.md
iterate:
source: env-inventory.md
pattern: "^- ([A-Z][A-Z0-9_]+) — "
verify:
policy: content-heuristic
- id: report
name: Final drift report
prompt: >-
Produce `env-drift-report.md` summarizing:
- Variables with drift (by category: undocumented, unused, missing-in-ci).
- Recommended `.env.example` diffs to close the gaps.
- Any variables whose defaults or required-ness look wrong.
Keep it scannable — prioritize drift that could cause runtime failures.
requires:
- audit-var
context_from:
- audit-var
produces:
- env-drift-report.md
verify:
policy: content-heuristic


@@ -0,0 +1,84 @@
# Issue Triage
<template_meta>
name: issue-triage
version: 1
mode: oneshot
requires_project: false
artifact_dir: null
</template_meta>
<purpose>
Classify a GitHub issue, recommend labels/priority, and propose the next
concrete action. Oneshot — read the issue, think, respond. No file edits,
no state.
</purpose>
<instructions>
## 1. Fetch the issue
- The user arguments should contain an issue number (`#123`) or URL. If not,
ask once for the reference.
- Pull issue body, labels, author, age, reactions, comment count, and linked
PRs via `gh issue view <ref> --json number,title,body,labels,author,createdAt,updatedAt,reactions,comments`.
- If already closed, say so and stop unless the user specifically asked to
revisit it.
## 2. Classify
Assign exactly one of:
- `bug` — reproducible broken behavior.
- `feature-request` — new capability, not a fix.
- `question` / `support` — user needs help, no code change implied.
- `docs` — docs gap.
- `discussion` — open-ended, no clear action.
- `invalid` — duplicate, off-topic, spam.
Add these secondary labels as applicable: `needs-repro`, `needs-info`,
`good-first-issue`, `regression`, `security` (flag strongly if so),
`breaking`, `external-dep`.
## 3. Assess priority
- `p0` — production-breaking / security / data loss.
- `p1` — significantly degrades common workflow.
- `p2` — standard bug/feature.
- `p3` — minor / cosmetic / future.
Base the assessment on: blast radius, reproduction frequency, reactions,
whether a workaround exists, and whether prior issues reference it.
## 4. Recommend the next action
Write ONE of these, with a concrete next step:
- **Ask for info:** list the 1–3 specific things missing (repro steps, version,
logs). Draft the comment text.
- **Accept and schedule:** suggest a workflow to run next (e.g.
`/sf start bugfix --issue #123` or `/sf workflow small-feature`).
- **Close:** draft a polite close comment with the reason.
- **Escalate:** flag for human review with a specific reason.
## 5. Output format
```
Issue: #<n> — <title>
Author: <user> Age: <d>d Comments: <n> Reactions: <n>
Classification: <primary>, <secondary labels>
Priority: <p0/p1/p2/p3>
Why: <2–3 sentence rationale>
Next action: <recommendation>
Comment draft:
> <text to post or "n/a" if no comment needed>
```
## 6. Don't post or edit
Draft the comment and any label changes as *suggestions* — never run
`gh issue comment` or `gh issue edit` unless the user explicitly confirms.
</instructions>


@@ -0,0 +1,133 @@
# Observability Setup Workflow
<template_meta>
name: observability-setup
version: 1
mode: markdown-phase
requires_project: false
artifact_dir: .sf/workflows/observability/
</template_meta>
<purpose>
Add structured logging, metrics, and tracing to a project that has none (or
not enough). Picks tools appropriate to the stack, instruments the
highest-value code paths first, and verifies that operators can actually use
the output to debug an incident.
</purpose>
<phases>
1. survey — Understand what exists, what's missing, and what we need
2. design — Pick tools + plan instrumentation
3. implement — Add logs / metrics / traces to the prioritized paths
4. verify — Run a synthetic incident and confirm we'd catch it
</phases>
<process>
## Phase 1: Survey
**Goal:** Know the starting point so the plan is honest.
1. **Inventory existing instrumentation:**
- Logging framework (winston / pino / logging / zap / ...)?
- Log destination (stdout / file / remote aggregator)?
- Existing metrics (Prometheus / OpenTelemetry / custom)?
- Existing traces (Jaeger / Zipkin / OTEL)?
- Error reporting (Sentry / Rollbar / ...)?
2. **Identify the critical paths:**
- The top 3–5 user-facing flows.
- Any background jobs or schedulers.
- External dependencies (DBs, HTTP APIs, queues).
3. **Classify each path:**
- Fully instrumented, partially, or not at all.
- What question would an operator want to answer about this path at 2 AM?
4. **Write `SURVEY.md`:**
- Current state summary.
- Prioritized list of gaps (what's missing where).
- Constraints (budget, existing tooling, cloud provider).
5. **Gate:** Confirm priorities. Observability work is easy to over-engineer —
focus on the top 3 paths rather than blanket coverage.
## Phase 2: Design
**Goal:** Pick tools and agree on conventions before coding.
1. **Choose the stack:**
- **Logs:** structured JSON with a consistent schema (timestamp, level,
service, request_id, user_id when safe, message, fields).
- **Metrics:** counter / gauge / histogram; decide the naming scheme
(`<service>.<area>.<what>_<unit>`).
- **Traces:** OpenTelemetry is the modern default unless the project
is already committed to something else.
2. **Define the conventions** and write them to `CONVENTIONS.md`:
- Log levels: when to use debug / info / warn / error.
- Trace naming: span names as `verb.object`.
- Metric labels: what's allowed, what's banned (high-cardinality warning).
- PII / secret-scrubbing rules — critical, document them.
3. **Plan the instrumentation** — `PLAN.md`:
- For each critical path: what logs, what metrics, what trace spans.
- The order to implement (start with the highest-value path).
4. **Gate:** Review the plan. Conventions are hard to change later — get them
right now.
## Phase 3: Implement
**Goal:** Ship instrumentation one path at a time.
1. **Bootstrap the libraries:**
- Install chosen packages.
- Create a shared `observability.ts` / `observability.py` module:
logger factory, metric registry, tracer setup.
- Add env-based configuration (log level, trace sampling, metrics endpoint).
2. **Instrument one critical path end-to-end:**
- Entry-point log with all relevant context.
- Key decision points logged at debug level.
- Outbound calls wrapped in a trace span.
- Errors logged at error level with stack traces.
- Counter + histogram for the operation.
3. **Commit atomically** — one path per commit. Run the path and inspect the
output to make sure it's actually useful.
4. **Repeat** for the remaining prioritized paths.
5. **Write `IMPL.md`** as you go, noting anything that surprised you or that
operators should know.
## Phase 4: Verify
**Goal:** Prove that we'd catch a real incident.
1. **Run a synthetic incident:**
- Inject a failure (kill the DB connection, throw a timeout, slow down
a dependency).
- From the logs / metrics / traces alone, could an operator who didn't
write the code diagnose it?
2. **Fix the gaps** surfaced by the drill — usually missing context in
error logs, or metrics that don't label the failure mode.
3. **Write `VERIFY.md`:**
- What scenarios were tested.
- What was observable vs what wasn't.
- Recommended alerts to set up (thresholds, not tools).
4. **Document for operators** — update the runbook or README "operating"
section:
- Where logs go.
- How to view traces.
- Key metrics and their healthy ranges.
5. **Gate:** Final review. Observability that nobody uses is overhead, not
value — the user should be able to demo "here's how I'd debug X" using
the new instrumentation.
</process>


@@ -0,0 +1,74 @@
# Onboarding Check
<template_meta>
name: onboarding-check
version: 1
mode: oneshot
requires_project: false
artifact_dir: null
</template_meta>
<purpose>
Walk the project's README end-to-end as a brand-new contributor would, and
report every step that fails, is unclear, or is missing. Oneshot — produce a
gap report, not a fix.
</purpose>
<instructions>
## 1. Read the README top-to-bottom
- Read `README.md` (and any `CONTRIBUTING.md`, `docs/setup.md`, or other
docs the README links from its "Getting Started" section).
- Make a list of every command the docs tell you to run, in order.
## 2. Check the environment
For each prerequisite the README claims ("Node ≥ 22", "Python 3.11",
"Docker", etc.):
- Check whether the version is stated.
- Check whether it's pinned in the repo (e.g. `package.json` engines, `.nvmrc`,
`.tool-versions`, `pyproject.toml`, `Dockerfile`).
- If the README's claim and the repo's pin disagree, flag it.
## 3. Dry-run the commands
Where safe, run the commands in the **current** environment:
- `npm install` / `pip install -r requirements.txt` / equivalent.
- The "build" / "test" / "run" commands.
- The "dev server" command (spawn, wait 5s, kill it).
Skip any command that:
- Would hit external APIs with credentials (record as "needs real creds, not tested").
- Would incur real cost (cloud deploys, paid APIs).
- Would modify global state (`sudo`, package manager global installs).
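The dev-server check in particular can be run without leaving anything behind. A sketch assuming an npm `dev` script:
```bash
# Spawn the dev server, give it a few seconds, then tear it down.
npm run dev > dev-server.log 2>&1 &
dev_pid=$!
sleep 5
if kill -0 "$dev_pid" 2>/dev/null; then
  echo "dev server stayed up for 5s"
  kill "$dev_pid"
else
  echo "dev server exited early; see dev-server.log"
fi
```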
## 4. Report
```
# Onboarding Report — <project name>
## Summary
<1–2 sentences: did a new contributor have a path that “just works”?>
## Prerequisites
- [✓/✗/?] <prereq> — <notes, version mismatch, etc.>
## Steps
1. [✓/✗/?] <command> — <exit code, error, or output snippet>
## Gaps
- <missing docs>
- <commands that failed or produced surprising output>
- <undocumented side effects>
## Recommendations
- <specific README edits that would fix the top 3 gaps>
```
## 5. Don't edit
Don't modify the README or any config — just report. The author decides which
gaps to close.
</instructions>


@@ -0,0 +1,125 @@
# Performance Audit Workflow
<template_meta>
name: performance-audit
version: 1
mode: markdown-phase
requires_project: false
artifact_dir: .sf/workflows/perf/
</template_meta>
<purpose>
Find and fix real performance problems. Measure first, fix with evidence,
measure again. Avoids the common trap of "optimizations" that don't move
actual user-facing metrics.
</purpose>
<phases>
1. profile — Gather real measurements from representative workloads
2. prioritize — Pick the fixes with the best effort/impact ratio
3. fix — Apply the changes with before/after numbers
4. verify — Confirm the improvements hold under realistic load
</phases>
<process>
## Phase 1: Profile
**Goal:** Replace intuition with measurements.
1. **Define the workload.** What's slow, for whom, and under what conditions?
- Interactive: which user flow?
- Batch: which job?
- API: which endpoint, at what QPS?
Without this, you're optimizing in the dark.
2. **Establish a baseline metric** that reflects what users feel:
- Latency at p50, p95, p99.
- Throughput.
- Memory high-water-mark.
- Cold-start / warm-start times.
Pick one or two metrics — not all five.
3. **Run a profiler.**
- Node: `node --prof`, `clinic.js`, Chrome DevTools flamegraphs.
- Python: `cProfile`, `py-spy`, `scalene`.
- Go: `pprof`.
- Web: Lighthouse, Chrome Performance tab, Web Vitals.
- Database: `EXPLAIN ANALYZE`, slow query log.
4. **Write `BASELINE.md`** with:
- Exact workload description (so we can re-run it).
- Metric values.
- Profile output or flamegraph attached.
- Top 5 hot functions / queries / network calls.
5. **Gate:** The user confirms the baseline matches their experience. If it
doesn't, the workload isn't representative — go back and fix that first.
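For a Node service, step 3 might look like the sketch below; the entry point and workload driver are placeholders:
```bash
# Profile one representative run, then summarize the V8 tick log.
node --prof dist/server.js &          # placeholder entry point
server_pid=$!
./run-workload.sh                     # placeholder: replays the Phase 1 workload
kill "$server_pid"; wait "$server_pid" 2>/dev/null
node --prof-process isolate-*-v8.log > profile.txt
less profile.txt                      # skim the summary sections: hottest frames first
```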
## Phase 2: Prioritize
**Goal:** Pick the fixes that actually matter.
1. **For each hot spot in the profile**, estimate:
- Potential improvement (guesstimate the % reduction).
- Implementation effort (hours / days).
- Risk (probability of introducing bugs).
2. **Prioritize by impact / (effort × risk).** A 50% reduction in a
p99-tail function often beats a 90% reduction in a warm path.
3. **Write `PLAN.md`** with:
- A ranked list of fixes (top 3–5).
- For each: what changes, why it should help, what could go wrong.
- Explicitly call out hot spots you're choosing to SKIP and why.
4. **Gate:** Confirm the plan with the user before coding. It's cheap to
change direction here, expensive later.
## Phase 3: Fix
**Goal:** Apply changes with receipts.
1. **One fix at a time.** Each becomes an atomic commit. Don't bundle
unrelated perf changes — you'll lose the ability to attribute gains.
2. **Before/after measurement** for each fix:
- Run the same workload from Phase 1.
- Record the new metrics.
- If a fix doesn't help, revert it and say so.
3. **Commit message format:**
```
perf(<area>): <change summary>
Before: p95 400ms
After: p95 180ms
```
4. **Append to `PROGRESS.md`:**
- Fix name, before/after, whether kept or reverted.
## Phase 4: Verify
**Goal:** Make sure the improvements hold up in reality.
1. **Re-run the full Phase 1 workload.** Compare against baseline.
2. **Test under stress** — 2x the normal load, cold caches, realistic data
sizes. Perf fixes that only help a synthetic microbenchmark aren't
worth shipping.
3. **Check for regressions** elsewhere — run the full test suite, watch
memory, check other endpoints. Sometimes local gains come with global
costs.
4. **Write `REPORT.md`:**
- Summary: which metric improved by how much, and under what conditions.
- Fixes kept vs reverted.
- Remaining hot spots that weren't worth it.
- Monitoring recommendation: what metric to track so regressions surface.
5. **Gate:** Present the report. If the improvement isn't meaningful at the
user-facing level, that's important to surface — don't pretend a win.
</process>


@@ -0,0 +1,67 @@
# PR Review
<template_meta>
name: pr-review
version: 1
mode: oneshot
requires_project: false
artifact_dir: null
</template_meta>
<purpose>
Produce a structured code-review for the current branch's diff (or a named PR
if the user supplies one). No branch switching, no state tracking — emit the
review as a single response and stop.
</purpose>
<instructions>
## 1. Determine what to review
- If the user arguments include a PR number (e.g. `#123`) or a URL matching
`github.com/<owner>/<repo>/pull/<n>`, use `gh pr view <ref>` + `gh pr diff <ref>`.
- Otherwise, default to the current branch vs `main`: `git diff main...HEAD`.
- If neither has changes, say so and stop.
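A sketch of that dispatch (variable names and regexes are illustrative only):
```bash
ref="$1"   # user-supplied argument, may be empty
if [[ "$ref" =~ ^#?([0-9]+)$ || "$ref" =~ /pull/([0-9]+) ]]; then
  pr="${BASH_REMATCH[1]}"
  gh pr view "$pr"
  gh pr diff "$pr"
else
  git diff main...HEAD
fi
```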
## 2. Survey the diff
- List the files touched, grouped into: `src/`, `tests/`, `docs/`, `config/`, `other`.
- For each file, note what kind of change it is (feature, refactor, fix, test, docs).
- Flag anything that looks unusual for its directory (e.g. `.env` changes,
generated files, lockfiles with semver-major bumps).
## 3. Produce the review
Structure the output as:
```
## Summary
<2–3 sentence overview of what the PR does>
## Concerns
- <specific line references for anything that could break, regress, or harm
maintainability. Prefer `file.ts:42` anchors. Omit this section if none.>
## Suggestions
- <non-blocking improvements — naming, tests to add, small refactors>
## Tests / Verification
- <what tests were added? anything uncovered? did CI run?>
## Questions
- <open questions you can't answer from the diff alone>
```
## 4. Be concrete
- Quote 1–3 specific lines with `file:line` references in each bullet where
applicable. Vague reviews ("consider refactoring this") are worse than none.
- If the diff is >500 lines, call that out and ask whether to do a deep review
or a skim — don't silently skim a large diff.
## 5. Don't modify code
This is a oneshot review. Do **not** edit files or create artifacts. If you
suggest a change, describe it in prose — let the author decide.
</instructions>


@@ -0,0 +1,83 @@
# PR Triage
<template_meta>
name: pr-triage
version: 1
mode: oneshot
requires_project: false
artifact_dir: null
</template_meta>
<purpose>
Walk open pull requests and produce a triage report: which to merge, close,
or nudge. Oneshot — report only, no actions taken.
</purpose>
<instructions>
## 1. List open PRs
Run `gh pr list --state open --limit 50 --json number,title,author,createdAt,updatedAt,isDraft,labels,reviewDecision,mergeable,headRefName,statusCheckRollup,additions,deletions`.
If there are more than 50 open PRs, note that and use `--limit 100`. If still
more, just pick the 100 most recently updated and warn that the report is
partial.
## 2. Bucket each PR
For each PR, compute:
- **Age**: days since creation.
- **Staleness**: days since last update.
- **Size**: `additions + deletions` (small ≤ 100, medium ≤ 500, large > 500).
- **CI**: passing, failing, pending, none.
- **Reviews**: approved, changes-requested, pending, none.
- **Mergeability**: clean, conflicting, unknown.
- **Draft**: yes/no.
Put each PR into exactly ONE bucket:
- **✅ Ready to merge** — non-draft, approved, CI passing, mergeable.
- **🛠 Waiting on author** — changes-requested or CI failing.
- **👀 Needs review** — no reviews, non-draft, CI passing, ≤ 30 days old.
- **💤 Stale** — no update in 30+ days.
- **❌ Close candidate** — stale > 90 days AND (no reviews OR conflicting)
OR author is inactive OR scope is clearly superseded.
- **🚧 Draft** — explicitly drafted, not yet ready.
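A `jq` sketch for deriving the numeric fields from the JSON above; field names follow the `gh` command in step 1, and the bucketing itself stays a judgment call:
```bash
gh pr list --state open --limit 50 \
  --json number,createdAt,updatedAt,additions,deletions,reviewDecision,isDraft \
| jq -r '.[] | [
    .number,
    ((now - (.createdAt | fromdateiso8601)) / 86400 | floor),  # age in days
    ((now - (.updatedAt | fromdateiso8601)) / 86400 | floor),  # staleness in days
    (.additions + .deletions),                                  # size
    .reviewDecision,
    (.isDraft | tostring)
  ] | @tsv'
```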
## 3. Output
```
PR Triage — <N> open PRs (as of YYYY-MM-DD)
## ✅ Ready to merge (<n>)
- #123 <title> — by @author, +X/-Y, approved, CI ✓
Next: merge with `gh pr merge 123`.
## 🛠 Waiting on author (<n>)
- #124 <title> — CI failing: <reason>. Nudge author.
## 👀 Needs review (<n>)
- #125 <title> — 5 days old, no reviews yet.
## 💤 Stale (<n>)
- #126 <title> — no update in 45 days. Suggest: nudge @author, or close.
## ❌ Close candidates (<n>)
- #127 <title> — 120 days stale, conflicts with main, superseded by #140.
## 🚧 Drafts (<n>)
- #128 — <title>
```
## 4. Recommend specific actions
At the bottom, produce a short action list of the top 3–5 PRs that would have
the biggest impact if resolved (oldest bottlenecks, approved but unmerged,
easiest close candidates).
## 5. Don't act
Do not run `gh pr merge`, `gh pr close`, or `gh pr comment`. The report is
for humans to act on.
</instructions>


@@ -213,6 +213,166 @@
"artifact_dir": ".sf/workflows/product-tracking/",
"estimated_complexity": "medium",
"requires_project": false
},
"accessibility-audit": {
"name": "Accessibility Audit",
"description": "Scan UI for a11y issues and produce prioritized remediation report — oneshot, no code changes",
"file": "accessibility-audit.md",
"phases": ["audit"],
"triggers": ["accessibility", "a11y", "wcag", "screen reader", "keyboard navigation"],
"artifact_dir": null,
"estimated_complexity": "low",
"requires_project": false
},
"api-breaking-change": {
"name": "API Breaking Change",
"description": "Survey callers, migrate them, deprecate old surface, schedule removal release",
"file": "api-breaking-change.md",
"phases": ["survey", "migrate", "deprecate", "schedule"],
"triggers": ["breaking change", "api break", "deprecate", "remove api", "redesign api"],
"artifact_dir": ".sf/workflows/api-breaks/",
"estimated_complexity": "medium",
"requires_project": false
},
"changelog-gen": {
"name": "Changelog Generation",
"description": "Generate CHANGELOG entry from git commits since last release tag, Keep-a-Changelog format",
"file": "changelog-gen.md",
"phases": ["generate"],
"triggers": ["changelog", "release notes", "what changed", "since last release"],
"artifact_dir": null,
"estimated_complexity": "low",
"requires_project": false
},
"ci-bootstrap": {
"name": "CI Bootstrap",
"description": "Set up minimal working CI pipeline — provider choice, lint/test/build/deploy stages added incrementally",
"file": "ci-bootstrap.md",
"phases": ["pick", "minimal", "stages", "verify"],
"triggers": ["ci setup", "ci bootstrap", "github actions", "gitlab ci", "set up ci", "no ci"],
"artifact_dir": ".sf/workflows/ci/",
"estimated_complexity": "medium",
"requires_project": false
},
"dead-code": {
"name": "Dead Code Detection",
"description": "Find unused functions, files, exports — report with evidence, do NOT delete (human decides)",
"file": "dead-code.md",
"phases": ["scan"],
"triggers": ["dead code", "unused", "unreachable", "orphan code", "remove unused"],
"artifact_dir": null,
"estimated_complexity": "low",
"requires_project": false
},
"docs-sync": {
"name": "Docs Sync",
"description": "Inventory undocumented public exports, write per-symbol docs, summarize changes — yaml-step iteration",
"file": "docs-sync.yaml",
"phases": ["inventory", "document", "summarize"],
"triggers": ["docs sync", "document exports", "missing docs", "tsdoc", "jsdoc", "doc backfill"],
"artifact_dir": null,
"estimated_complexity": "medium",
"requires_project": false
},
"env-audit": {
"name": "Env Variable Audit",
"description": "Inventory env vars across code/docs/CI and flag drift — referenced but undocumented or vice versa",
"file": "env-audit.yaml",
"phases": ["inventory", "audit"],
"triggers": ["env audit", "env vars", "environment variables", ".env audit", "missing env"],
"artifact_dir": null,
"estimated_complexity": "low",
"requires_project": false
},
"issue-triage": {
"name": "Issue Triage",
"description": "Classify a GitHub issue, recommend labels/priority, propose next concrete action — oneshot",
"file": "issue-triage.md",
"phases": ["triage"],
"triggers": ["triage issue", "classify issue", "what to do with this issue", "issue labels"],
"artifact_dir": null,
"estimated_complexity": "minimal",
"requires_project": false
},
"observability-setup": {
"name": "Observability Setup",
"description": "Add structured logging, metrics, tracing — pick stack-appropriate tools, instrument highest-value paths first, verify operator usability",
"file": "observability-setup.md",
"phases": ["pick", "instrument", "verify"],
"triggers": ["observability", "logging", "metrics", "tracing", "telemetry setup", "monitoring"],
"artifact_dir": ".sf/workflows/observability/",
"estimated_complexity": "medium",
"requires_project": false
},
"onboarding-check": {
"name": "Onboarding Check",
"description": "Walk the README as a brand-new contributor — report every step that fails, is unclear, or missing — oneshot gap report",
"file": "onboarding-check.md",
"phases": ["walk"],
"triggers": ["onboarding", "readme check", "new contributor", "setup gaps", "first-time setup"],
"artifact_dir": null,
"estimated_complexity": "low",
"requires_project": false
},
"performance-audit": {
"name": "Performance Audit",
"description": "Find and fix real performance problems — measure, fix with evidence, measure again",
"file": "performance-audit.md",
"phases": ["measure", "fix", "verify"],
"triggers": ["performance", "perf audit", "slow", "optimize", "bottleneck", "profiling"],
"artifact_dir": ".sf/workflows/perf/",
"estimated_complexity": "medium",
"requires_project": false
},
"pr-review": {
"name": "PR Review",
"description": "Read a pull request and produce a structured review — correctness, design, tests, docs — oneshot",
"file": "pr-review.md",
"phases": ["review"],
"triggers": ["review pr", "pr review", "code review", "review pull request"],
"artifact_dir": null,
"estimated_complexity": "low",
"requires_project": false
},
"pr-triage": {
"name": "PR Triage",
"description": "Walk open PRs, bucket each (merge/close/nudge), produce triage report — oneshot, no actions",
"file": "pr-triage.md",
"phases": ["triage"],
"triggers": ["pr triage", "open prs", "stale prs", "review queue"],
"artifact_dir": null,
"estimated_complexity": "low",
"requires_project": false
},
"release": {
"name": "Release",
"description": "Cut a release in four phases with approval gates — version bump, changelog, tag, publish/announce",
"file": "release.md",
"phases": ["prepare", "bump", "publish", "announce"],
"triggers": ["release", "cut release", "ship release", "publish release", "version bump"],
"artifact_dir": ".sf/workflows/releases/",
"estimated_complexity": "medium",
"requires_project": false
},
"rename-symbol": {
"name": "Rename Symbol",
"description": "Find every reference to a symbol across code/tests/docs and rename it consistently — yaml-step iteration",
"file": "rename-symbol.yaml",
"phases": ["inventory", "rename", "verify"],
"triggers": ["rename symbol", "rename function", "rename variable", "global rename"],
"artifact_dir": null,
"estimated_complexity": "low",
"requires_project": false
},
"test-backfill": {
"name": "Test Backfill",
"description": "Iterate over untested functions, write a focused test for each, verify suite stays green — yaml-step",
"file": "test-backfill.yaml",
"phases": ["inventory", "write", "verify"],
"triggers": ["test backfill", "missing tests", "coverage debt", "untested functions", "add tests"],
"artifact_dir": null,
"estimated_complexity": "medium",
"requires_project": false
}
}
}


@@ -0,0 +1,118 @@
# Release Workflow
<template_meta>
name: release
version: 1
mode: markdown-phase
requires_project: false
artifact_dir: .sf/workflows/releases/
</template_meta>
<purpose>
Cut a software release in four phases with approval gates. Handles version
bump, changelog generation, tag creation, and publish/announce. Conservative
by default — prompts for confirmation before any action that's visible outside
the repo.
</purpose>
<phases>
1. prepare — Decide version bump, verify state
2. bump — Write version, changelog, create tag
3. publish — Push tag, run release pipeline
4. announce — Post release notes
</phases>
<process>
## Phase 1: Prepare
**Goal:** Decide the version bump and confirm the repo is ready to release.
1. **Read the state of the repo:**
- `git log <last_tag>..HEAD --oneline --no-merges` — commits since last release.
- Check for uncommitted changes (`git status`).
- Identify the branch (release from main/master unless explicitly told otherwise).
2. **Propose a semver bump:**
- `major` if any commit has a `BREAKING CHANGE:` footer or `!` suffix.
- `minor` if any commit is `feat:` / `feature:`.
- `patch` otherwise.
- Print: current version, proposed next version, and the commit categorization.
3. **Write `PREPARE.md`** in the artifact directory with:
- Proposed version.
- Commit summary grouped by type.
- Any concerns (unmerged PRs, failing CI, recent reverts).
4. **Gate:** Present the plan and ask the user to confirm the version before
proceeding. If the user wants a different bump, record the rationale.
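A rough shell heuristic for the bump proposal in step 2 (a sketch; conventional-commit edge cases still need a human eye):
```bash
last_tag=$(git describe --tags --abbrev=0)
range="${last_tag}..HEAD"
# major: BREAKING CHANGE footer or a "type!:" subject; minor: any feat; else patch.
if git log "$range" --pretty=%B | grep -qE '^BREAKING CHANGE:|^[a-z]+(\([^)]*\))?!:'; then
  bump=major
elif git log "$range" --pretty=%s | grep -qE '^(feat|feature)(\([^)]*\))?:'; then
  bump=minor
else
  bump=patch
fi
echo "proposed bump: $bump (since ${last_tag})"
```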
## Phase 2: Bump
**Goal:** Commit the version bump and write the changelog.
1. **Bump the version** in the appropriate file(s):
- Node: `package.json` (and workspace `package.json`s if monorepo).
- Python: `pyproject.toml` / `setup.py` / `__version__`.
- Rust: `Cargo.toml`.
- Other: ask if unsure.
2. **Generate the changelog entry** — follow Keep a Changelog format. Add it
to `CHANGELOG.md` under `## [x.y.z] - YYYY-MM-DD`. Preserve all existing
entries untouched.
3. **Commit:**
```
chore(release): v<x.y.z>
```
4. **Create an annotated tag:**
```
git tag -a v<x.y.z> -m "Release v<x.y.z>"
```
Don't push yet.
5. **Gate:** Show the diff (`git show HEAD`, `git show v<x.y.z>`) and confirm
before pushing.
## Phase 3: Publish
**Goal:** Push the release and kick off downstream pipelines.
1. **Push commit + tag:**
```
git push origin <branch>
git push origin v<x.y.z>
```
2. **Trigger the release pipeline** if applicable:
- GitHub Actions release workflow (often triggered by tag push).
- `npm publish`, `cargo publish`, `twine upload` (PyPI) — only if explicitly asked.
3. **Verify:**
- CI passes on the tagged commit.
- Release artifact appears where expected (GitHub releases, registry, etc).
4. **Gate:** Confirm the release is live and visible before announcing.
## Phase 4: Announce
**Goal:** Make the release discoverable.
1. **Create or update the GitHub Release** (via `gh release create` or an
existing workflow output). Include:
- Tag name.
- Title: `v<x.y.z>`.
- Body: the CHANGELOG entry for this version.
2. **Optional follow-ups** (ask the user first):
- Slack / Discord announcement draft.
- Update docs / examples that reference the version.
- Close any milestone linked to this release.
3. **Write `RELEASE.md`** in the artifact dir capturing:
- What shipped.
- Links to the release, changelog, and key PRs.
- Any post-release follow-ups.
</process>


@@ -0,0 +1,99 @@
# Rename Symbol workflow — mode: yaml-step
#
# Rename a symbol across the codebase. Inventories call sites, renames each
# file atomically, and verifies the project still builds and tests pass.
#
# Run with: /sf workflow rename-symbol old_name=foo new_name=bar
version: 1
name: rename-symbol
mode: yaml-step
description: >-
Find all occurrences of a symbol and rename it across the codebase,
verifying the build and tests stay green.
params:
old_name: "OLD_SYMBOL_HERE"
new_name: "NEW_SYMBOL_HERE"
src_glob: "src/**/*.ts"
build_command: "npm run build"
test_command: "npm test"
steps:
- id: inventory
name: Find files that reference the symbol
prompt: >-
Search {{src_glob}} for files that reference the symbol `{{old_name}}`.
Include callers, imports, and any string references that look like
identifier usage (but skip unrelated word matches in comments/docs).
Write one file path per line:
- <relative file path>
Save to `rename-targets.md`. If there are zero matches, write
"no matches" and no items.
requires: []
produces:
- rename-targets.md
verify:
policy: content-heuristic
- id: rename-file
name: Rename in one file
prompt: >-
In the file indicated by {{_iter_capture_1}}, rename every
identifier-style occurrence of `{{old_name}}` to `{{new_name}}`.
- Only replace actual code references (identifiers, imports).
- Leave unrelated substrings (e.g. comments that happen to contain
the word but aren't referring to the symbol) alone.
- Preserve surrounding formatting.
- If the file defines `{{old_name}}` directly, rename the definition too.
Append a log line to `rename-log.md`:
`- <file> — <count> replacement(s)`.
requires:
- inventory
context_from:
- inventory
produces:
- rename-log.md
iterate:
source: rename-targets.md
pattern: "^- (.+)$"
verify:
policy: content-heuristic
- id: verify-build
name: Verify the build
prompt: >-
Run the build: `{{build_command}}`. If it fails, examine the failure
and report the root cause in `rename-build.md` — the rename may have
broken something that needs a manual fix before proceeding.
requires:
- rename-file
context_from:
- rename-file
produces:
- rename-build.md
verify:
policy: shell-command
command: "{{build_command}}"
- id: verify-tests
name: Verify the test suite
prompt: >-
Run the tests: `{{test_command}}`. If any fail, report which tests
failed and why in `rename-tests.md`. Tests that reference the old
name in string literals (e.g. snapshot tests, API contracts) may
need intentional updates.
requires:
- verify-build
context_from:
- verify-build
produces:
- rename-tests.md
verify:
policy: shell-command
command: "{{test_command}}"


@@ -0,0 +1,73 @@
# Test Backfill workflow — mode: yaml-step
#
# Iterate over functions that lack unit tests, write a test for each, and
# shell-verify the suite stays green. Use for paying down coverage debt.
version: 1
name: test-backfill
mode: yaml-step
description: >-
Inventory untested functions, write a focused test for each,
and verify the test suite passes after each addition.
params:
test_command: "npm test"
src_glob: "src/**/*.ts"
steps:
- id: inventory
name: Inventory untested functions
prompt: >-
Scan the source tree (glob: {{src_glob}}) and identify exported
functions, classes, and React components that have no direct unit
test. Write one finding per line as a Markdown list item:
- <file_path>::<symbol_name> — <1-line reason it's worth testing>
Save the list to `untested.md`. If nothing is untested, write a
single line "all covered" and no list items.
requires: []
produces:
- untested.md
verify:
policy: content-heuristic
minSize: 1
- id: write-test
name: Write a test for one untested symbol
prompt: >-
Write a unit test for the symbol indicated by the captured path.
The symbol reference in {{_iter_capture_1}} has the form
`<file_path>::<symbol_name>`. Follow the project's existing test
conventions (framework, style, file naming). Add the test file
alongside existing tests or under a parallel tests/ tree.
After writing, append a one-line note to `backfill-log.md`:
`- <symbol_name> — <test file path> — <pass|fail>`.
requires:
- inventory
context_from:
- inventory
produces:
- backfill-log.md
iterate:
source: untested.md
pattern: "^- (.+::.+?) — "
verify:
policy: shell-command
command: "{{test_command}}"
- id: summary
name: Produce coverage summary
prompt: >-
Summarize the session: how many symbols got tests, how many were
skipped and why, and what remains in `untested.md` that would
benefit from human review. Save to `backfill-summary.md`.
requires:
- write-test
context_from:
- write-test
produces:
- backfill-summary.md
verify:
policy: content-heuristic