From 82633b6f5e5dc4a73c8ee113bad4be09c83a9a5b Mon Sep 17 00:00:00 2001
From: Mikael Hugo
Date: Sat, 2 May 2026 21:15:13 +0200
Subject: [PATCH] feat(sf): sf-audit-traces workflow for slow self-improvement loop
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

A standalone agent prompt that reads SF's observability sources
(self-feedback / journal / activity / judgments / forensics) and files
AT MOST 3 recurring-pattern findings via sf_self_report so they enter
the existing triage flow.

PDD spec:

Purpose: continuous self-improvement loop. SF already has the data
sources (self-feedback.jsonl, journal/, activity/, judgments/) and the
consumer pattern (triage-self-feedback → requirement-promoter). What was
missing: a standalone prompt that pulls those sources together for a
scheduled run.

Consumer: agents invoked via '/schedule every morning sf-audit-traces'
(cloud) or '/sf workflow run sf-audit-traces' (manual).

Contract:
  1. Snapshot the trace volumes (file counts + line counts) into
     evidence so reports are concrete, not prose.
  2. Bar = 3+ occurrences. Single events go to operator eyeballs, not
     permanent self-feedback entries.
  3. Hard cap of 3 entries per run. The whole point is slow iteration —
     the triage queue is human-paced, not a firehose.
  4. NEVER auto-apply. Even if the fix looks one-line obvious, file and
     stop. The triage flow decides what becomes work.
  5. Zero findings is a successful run when the system is healthy.

Failure boundary: missing source files → skip silently. Read errors →
handle gracefully. Never block on absence.

Evidence (verified during scan before writing):
  - 181 self-feedback entries (55 open, 126 resolved)
  - Top open kinds: runaway-guard-hard-pause (4), git-stage-failure (2),
    context-injection-gap (2), orphan-prompt (2)
  - Journal: 6-233 events per active day
  - Activity logs: per-unit JSONL transcripts present
  - All sources accessible via plain file reads — no special tools.

Non-goals:
  - ML training on traces
  - Cross-project trace aggregation
  - Auto-applying fixes (triage flow already does that)
  - Fast iteration (deliberately slow — 3/run cap means at most 21 new
    triage items per week even with daily runs)

Invariants:
  - Safety: agent never edits code/prompts/templates/docs.
  - Liveness: zero findings is a valid output. The agent doesn't
    fabricate patterns to justify a run.

Discovery verified: 28 total workflow templates after this commit (was
27); plugins.get('sf-audit-traces') returns the plugin from the bundled
source.

Pairs with: triage-self-feedback (reads what this files),
requirement-promoter (auto-promotes recurring kinds to requirements),
self-feedback-drain (session-start drain into repair turns). The audit
is the IN end of that pipeline; the rest of SF was already the OUT end.

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 .../sf/workflow-templates/registry.json       |  10 ++
 .../sf/workflow-templates/sf-audit-traces.md  | 105 ++++++++++++++++++
 2 files changed, 115 insertions(+)
 create mode 100644 src/resources/extensions/sf/workflow-templates/sf-audit-traces.md

diff --git a/src/resources/extensions/sf/workflow-templates/registry.json b/src/resources/extensions/sf/workflow-templates/registry.json
index 545bf60f6..eff333de4 100644
--- a/src/resources/extensions/sf/workflow-templates/registry.json
+++ b/src/resources/extensions/sf/workflow-templates/registry.json
@@ -373,6 +373,16 @@
       "artifact_dir": null,
       "estimated_complexity": "medium",
       "requires_project": false
+    },
+    "sf-audit-traces": {
+      "name": "SF Audit Traces",
+      "description": "Read SF's observability sources (self-feedback, journal, activity, judgments) and file at most 3 recurring-pattern findings via sf_self_report — designed for /schedule daily cadence",
+      "file": "sf-audit-traces.md",
+      "phases": ["audit"],
+      "triggers": ["sf audit traces", "audit sf", "self improve", "scan sf logs", "improve sf"],
+      "artifact_dir": null,
+      "estimated_complexity": "low",
+      "requires_project": true
     }
   }
 }
diff --git a/src/resources/extensions/sf/workflow-templates/sf-audit-traces.md b/src/resources/extensions/sf/workflow-templates/sf-audit-traces.md
new file mode 100644
index 000000000..86d03ed94
--- /dev/null
+++ b/src/resources/extensions/sf/workflow-templates/sf-audit-traces.md
@@ -0,0 +1,105 @@
+# SF Audit Traces
+
+
+name: sf-audit-traces
+version: 1
+mode: oneshot
+requires_project: true
+artifact_dir: null
+
+
+
+Read SF's own observability sources, identify ONE non-obvious recurring pattern,
+and file it as self-feedback so the existing triage flow can promote it.
+
+Iterate slowly: at most three entries per run. The point isn't to flood the
+queue — the point is to catch what no single session noticed.
+
+
+
+- `.sf/SELF-FEEDBACK.md` — markdown view of filed anomalies
+- `.sf/self-feedback.jsonl` — durable source of truth
+- `.sf/journal/YYYY-MM-DD.jsonl` — per-day dispatch + iteration events
+- `.sf/activity/{seq}-{type}-{id}.jsonl` — per-unit transcript
+- `.sf/judgments/*.jsonl` — recorded agent decisions (when present)
+- `.sf/forensics/*.json` — saved post-mortems (when present)
+
+
+
+
+## 1. Snapshot
+
+Run these to anchor the scan in real numbers — file paths and counts go
+into the eventual self-feedback evidence:
+
+```bash
+wc -l .sf/self-feedback.jsonl 2>/dev/null
+ls .sf/journal/ 2>/dev/null | tail -7
+ls .sf/activity/ 2>/dev/null | wc -l
+```
+
+Read the latest 7 days of journal files plus the last 30 activity files. If
+a source is missing, skip silently — never block on absence.
+
+## 2. Look for recurring patterns
+
+The bar is **3+ occurrences** across the data, not single events. Examples:
+
+- The same `kind` filed by self-feedback 3+ times in a week
+- The same dispatch rule firing then immediately being un-applied (paired
+  events in the journal)
+- The same tool error repeating across activity logs
+- The same run-away-guard pause across multiple units
+- The same auto-resolved entry kind triaged as `wontfix` repeatedly (signal
+  the detector is too noisy)
+- A judgment that proves wrong over multiple subsequent units
+
+Single events go to the operator's eyeballs, not to a permanent self-feedback
+entry. Patterns earn one.
+
+## 3. File at most three findings
+
+For each pattern:
+
+- One call to `sf_self_report` with `kind` (slug, hyphenated), `severity`
+  (`low`/`medium`/`high`/`critical` — almost always `medium`), `summary`
+  (one sentence naming the pattern), `evidence` (concrete file paths +
+  line numbers + counts), `suggestedFix` (one or two specific edits — not
+  prose).
+- Set `source: "agent"` so triage knows where it came from.
+- Cite at least three observed instances in `evidence` so the triage agent
+  can verify without re-reading every log.
+
+If you find nothing pattern-worthy, file zero. That is a successful run —
+silence is the correct output when the system is healthy.
+
+## 4. NEVER auto-apply
+
+Do not edit code, prompts, templates, or docs. The triage flow
+(`triage-self-feedback`, `requirement-promoter`) decides what becomes work.
+Your job ends at filing. Even if the fix looks one-line obvious — file it
+and stop.
+
+## 5. NEVER ship a flood
+
+Three is a hard cap. If you find a fourth, hold it for the next run. The
+slow-pace constraint is deliberate — the triage flow is a human-paced
+queue, not a firehose intake.
+
+
+
+
+A short report:
+
+- snapshot numbers (entries scanned, days covered)
+- patterns considered + which ones met the 3-occurrence bar
+- entry IDs filed (with `sf_self_report`'s returned id), or "none filed"
+  when the system is healthy
+- one sentence on what trend you'd watch next run
+
+
+
+This template pairs well with `/schedule every morning sf-audit-traces`.
+Daily cadence + the 3-entry cap means the triage queue grows by at most 21
+entries per week, which a human triage pass can clear in one sitting.
+
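
Reviewer note: the selection rule in the contract (3+ occurrences, hard cap of 3 per run, missing sources skipped silently) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the template's implementation. It assumes each line of `.sf/self-feedback.jsonl` is a JSON object with a string `kind` field; the patch does not show the record schema, so that shape is a guess. The helper names `read_kinds` and `select_findings` are hypothetical and do not exist in SF.

```python
import json
from collections import Counter
from pathlib import Path

OCCURRENCE_BAR = 3  # contract item 2: a pattern needs 3+ occurrences
MAX_FINDINGS = 3    # contract item 3: hard cap of 3 entries per run


def read_kinds(path):
    """Collect the `kind` of every JSONL record (assumed schema).

    A missing or unreadable file yields an empty list, matching the
    "skip silently, never block on absence" failure boundary.
    """
    try:
        lines = Path(path).read_text(encoding="utf-8").splitlines()
    except OSError:
        return []
    kinds = []
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate blank or malformed lines
        if isinstance(record, dict) and isinstance(record.get("kind"), str):
            kinds.append(record["kind"])
    return kinds


def select_findings(kinds, bar=OCCURRENCE_BAR, cap=MAX_FINDINGS):
    """Split recurring kinds into (file now, hold for the next run)."""
    counts = Counter(kinds)
    recurring = [(kind, n) for kind, n in counts.most_common() if n >= bar]
    return recurring[:cap], recurring[cap:]


if __name__ == "__main__":
    to_file, held = select_findings(read_kinds(".sf/self-feedback.jsonl"))
    print("file now:", to_file)
    print("hold for next run:", held)
```

An empty result is a valid outcome here, mirroring the liveness invariant that zero findings is a successful run. Anything beyond the cap is returned separately rather than dropped, which matches "if you find a fourth, hold it for the next run".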