diff --git a/docs/design-docs/index.md b/docs/design-docs/index.md
index 4ad057841..8d40f1fa5 100644
--- a/docs/design-docs/index.md
+++ b/docs/design-docs/index.md
@@ -25,6 +25,7 @@ in `docs/dev/`. Lighter design docs (problem framing, event model decisions) liv
 | [ADR-018](../dev/ADR-018-repo-native-harness-evolution.md) | Repo-Native Harness Evolution | Proposed — staged impl |
 | [ADR-019](../dev/ADR-019-workspace-vm-convergence.md) | Workspace VM Convergence — SF↔ACE incremental convergence via microVM execution layer | Proposed |
 | [ADR-020](../dev/ADR-020-internal-wire-architecture.md) | Internal Wire Architecture — `singularity-grpc` shared schema repo, gRPC for first-party services, MCP at external-tool boundary only | Proposed |
+| [ADR-021](../dev/ADR-021-versioned-documents-and-upgrade-path.md) | Versioned Documents and Upgrade Path — per-file scaffold markers, drift detection, `/sf scaffold sync` | Proposed |
 
 ## Design Docs (this directory)
 
diff --git a/docs/dev/ADR-021-versioned-documents-and-upgrade-path.md b/docs/dev/ADR-021-versioned-documents-and-upgrade-path.md
new file mode 100644
index 000000000..865ae5b7f
--- /dev/null
+++ b/docs/dev/ADR-021-versioned-documents-and-upgrade-path.md
@@ -0,0 +1,448 @@
+# ADR-021: Versioned Documents and Upgrade Path
+
+**Status:** Proposed
+**Date:** 2026-05-02
+**Deciders:** Mikael Hugo
+
+## Context
+
+SF ships a fixed set of scaffold templates via
+`src/resources/extensions/sf/agentic-docs-scaffold.ts` (the `SCAFFOLD_FILES`
+array). On project bootstrap (and on every subsequent SF auto-start), the
+scaffolder calls `ensureAgenticDocsScaffold(basePath)` which performs
+**skip-and-create**: if a target file is missing it is written, otherwise it
+is left alone.
+
+This is correct for protecting user content but wrong for everything else:
+
+1. **No drift signal.** When SF adds a new template (e.g. `harness/AGENTS.md`,
+   `ARCHITECTURE.md`) or improves an existing one (e.g.
tightens
+   `RELIABILITY.md`), existing projects never notice. Only newly-bootstrapped
+   projects benefit.
+2. **No refresh command.** A user who hears "SF has a better template now"
+   has no way to pull it.
+3. **Pending content is frozen.** A file that is still the verbatim scaffold
+   stub — neither the user nor any agent has touched it — is treated the same
+   as a fully customized doc. SF refuses to refresh it because it cannot tell
+   the two apart.
+4. **Only one file is versioned.** `PREFERENCES.md` carries
+   `last_synced_with_sf` in its frontmatter and gets a silent re-stamp via
+   `preferences-template-upgrade.ts` on drift. Every other scaffold file is
+   anonymous.
+
+The user directive: **everything needs versioning, with upgrade paths for
+documents that are not completed, perhaps with background agents.**
+
+This ADR generalises the `preferences-template-upgrade.ts` pattern to all
+scaffold-managed documents and defines a structured upgrade pipeline that can
+distinguish *pending*, *editing*, and *completed* content.
+
+## Decision
+
+Adopt a per-document state model with explicit version markers, a
+project-local manifest, drift detection, and an upgrade command. Existing
+files are not stamped; they are migrated by content-hash match against an
+archive of past template versions.
+
+### 1. Document states
+
+Every scaffold-managed file is in one of three states:
+
+| State | Definition | SF action on drift |
+|-------------|------------|--------------------|
+| `pending` | Content equals (or hashes to) a known scaffold template version. Neither user nor agent has customised it. | Silent re-write to current template. Update marker. |
+| `editing` | Marker present and stamped, content has drifted from the stamped template. Customisation in progress. | Do not overwrite. Write `.proposed` with new template + diff. Optionally dispatch background merge agent. |
+| `completed` | Marker absent, OR marker explicitly says `state=completed`. | Never modified by SF.
|
+
+#### Detection
+
+For Markdown files, the marker is an HTML comment on the **first line**:
+
+```
+
+```
+
+Fields:
+
+- `version` — SF semver that wrote/last-stamped the file.
+- `template` — logical template id (matches an entry in `SCAFFOLD_FILES`).
+- `state` — `pending` | `editing` | `completed`. Optional; default inferred
+  from hash comparison.
+- `hash` — sha256 of the file body **after the marker line** at stamp time.
+  Used to confirm `pending` (current body still matches stamped hash) vs
+  `editing` (hash mismatch ⇒ drift since stamp).
+
+For frontmatter files (`PREFERENCES.md`), reuse the existing
+`last_synced_with_sf` field and add `sf_template_state` and
+`sf_template_hash`. The frontmatter path is the prior art described in
+`preferences-template-upgrade.ts` — extend, do not duplicate.
+
+State derivation:
+
+| Marker present? | Body hash matches stamped hash? | Body hash matches some known template version? | State |
+|-----------------|---------------------------------|------------------------------------------------|-------|
+| yes | yes | n/a | `pending` |
+| yes | no | n/a | `editing` |
+| yes (`state=completed`) | n/a | n/a | `completed` |
+| no | n/a | yes (legacy match via archive) | promote to `pending`, stamp |
+| no | n/a | no | `completed` (untouched) |
+
+### 2. Universal versioning — what gets versioned
+
+Apply markers to every entry in `SCAFFOLD_FILES`.
Categories:
+
+| Category | Files (examples) | Marker mechanism |
+|----------|------------------|------------------|
+| Markdown docs | `AGENTS.md`, `ARCHITECTURE.md`, `docs/RELIABILITY.md`, `docs/SECURITY.md`, `docs/DESIGN.md`, `docs/QUALITY_SCORE.md`, `docs/RECORDS_KEEPER.md`, all `*/AGENTS.md` | HTML comment on line 1 |
+| Frontmatter docs | `.sf/PREFERENCES.md` | Frontmatter fields: `last_synced_with_sf`, `sf_template_state`, `sf_template_hash` (extends prior art in `preferences-template-upgrade.ts`) |
+| Templates / specs | `docs/design-docs/ADR-TEMPLATE.md`, `harness/specs/bootstrap.md`, `harness/AGENTS.md`, `harness/specs/AGENTS.md`, `harness/evals/AGENTS.md`, `harness/graders/AGENTS.md` | HTML comment on line 1 |
+| Reference slot text files | `docs/references/*-llms.txt` | HTML comment on line 1 (Markdown comments are valid in plain text consumed by LLMs) |
+| `.siftignore` and similar non-Markdown configs | `.siftignore` | Skip versioning. Sibling file `.sf/scaffold-manifest.json` records the applied version. (Rationale: hash-based legacy match is sufficient; markers in dotfiles fight tooling.) |
+
+#### User-content files SF must never stamp
+
+These files are user-curated by intent (per ADR-001) and **must not** appear
+in `SCAFFOLD_FILES` nor be touched by the upgrade path:
+
+- `.sf/PROJECT.md`, `.sf/DECISIONS.md`, `.sf/REQUIREMENTS.md`, `.sf/QUEUE.md`
+- `.sf/milestones/**/*` (roadmaps, plans, summaries, slice artifacts)
+- Any user-written exec plan, design doc, or product spec authored after
+  scaffold
+
+Verification: a unit test asserts that none of the above paths appear in
+`SCAFFOLD_FILES` and that `detectScaffoldDrift` ignores them.
+
+### 3. Scaffold manifest (`.sf/scaffold-manifest.json`)
+
+Per-project record of what scaffold versions have been applied. Lives under
+`.sf/` (gitignored runtime; per ADR-001 not durable user content).
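A possible TypeScript shape for this manifest, offered as a sketch rather than the actual implementation. The field names follow the JSON example in this section; the type names, `parseScaffoldManifest`, and the choice to return `null` for unreadable input are assumptions made for illustration.

```typescript
// Hypothetical types: field names mirror the JSON example in this section,
// everything else (type names, helper names) is illustrative only.
type ScaffoldState = "pending" | "editing" | "completed";

interface ManifestEntry {
  path: string;
  template: string;
  version: string;
  appliedAt: string;
  stateAtApply: ScaffoldState;
  contentHash: string;
}

interface ScaffoldManifest {
  schemaVersion: 1;
  applied: ManifestEntry[];
}

const STATES: readonly string[] = ["pending", "editing", "completed"];

// Structural check for one entry; rejects anything with a missing or mistyped field.
function isManifestEntry(value: unknown): value is ManifestEntry {
  if (typeof value !== "object" || value === null) return false;
  const e = value as Record<string, unknown>;
  const state = e["stateAtApply"];
  return (
    typeof e["path"] === "string" &&
    typeof e["template"] === "string" &&
    typeof e["version"] === "string" &&
    typeof e["appliedAt"] === "string" &&
    typeof state === "string" &&
    STATES.indexOf(state) !== -1 &&
    typeof e["contentHash"] === "string"
  );
}

// Returns null instead of throwing, so a caller can treat an unreadable
// manifest the same way as a missing one.
function parseScaffoldManifest(raw: string): ScaffoldManifest | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null) return null;
  const m = data as Record<string, unknown>;
  const applied = m["applied"];
  if (m["schemaVersion"] !== 1 || !Array.isArray(applied)) return null;
  if (!applied.every(isManifestEntry)) return null;
  return { schemaVersion: 1, applied };
}
```

Returning `null` rather than throwing keeps the manifest in the role of a cache: the caller decides what to do when it cannot be read.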
+
+```json
+{
+  "schemaVersion": 1,
+  "applied": [
+    {
+      "path": "AGENTS.md",
+      "template": "AGENTS.md",
+      "version": "2.75.2",
+      "appliedAt": "2026-04-30T12:00:00Z",
+      "stateAtApply": "pending",
+      "contentHash": "sha256:abc..."
+    },
+    {
+      "path": "docs/RELIABILITY.md",
+      "template": "docs/RELIABILITY.md",
+      "version": "2.74.0",
+      "appliedAt": "2026-03-12T08:11:02Z",
+      "stateAtApply": "pending",
+      "contentHash": "sha256:def..."
+    }
+  ]
+}
+```
+
+Writers: `agentic-docs-scaffold.ts`, `/sf scaffold sync`,
+`migrateLegacyScaffold`. Readers: `detectScaffoldDrift`, doctor check.
+Failure mode: a corrupt manifest is rebuilt by re-walking files and reading
+their markers — the manifest is a cache, the marker is the source of truth.
+
+### 4. Drift detection
+
+New module exposes:
+
+```ts
+function detectScaffoldDrift(basePath: string): ScaffoldDriftReport
+```
+
+Returns five buckets:
+
+| Bucket | Condition | Action by `/sf scaffold sync` |
+|--------|-----------|-------------------------------|
+| `missing` | Template in `SCAFFOLD_FILES` but file does not exist on disk | Write the file, stamp current version |
+| `upgradable` | File exists, marker present, `state=pending`, stamped version older than current SF | Silent re-write to current template, restamp |
+| `editing-drift` | File exists, marker present, body hash ≠ stamped hash, **template content has changed** since stamped version | Do not overwrite. Write `.proposed`. Optionally dispatch merge agent (#6). |
+| `untracked` | File exists, no marker, predates this ADR | Run legacy hash match (#7). Promote to `pending` if matched, else leave alone. |
+| `customized` | Marker says `state=completed`, OR marker absent and no legacy hash match | Skip. SF never modifies. |
+
+The report is structured (one entry per file) and can be rendered as a table
+for CLI output.
+
+### 5. Upgrade paths
+
+Per drift case:
+
+- **missing** → write current template, stamp marker. No conflict.
+- **upgradable (pending)** → silent re-render. Same pattern as
+  `upgradePreferencesFileIfDrifted` in
+  `preferences-template-upgrade.ts`: replace body, preserve any out-of-band
+  content where the file has separable regions (frontmatter for
+  `PREFERENCES.md`; for plain Markdown there is no preservation — the file
+  *is* the template).
+- **editing-drift** → produce `.proposed` with:
+  - The new template content
+  - A leading `` block describing source version,
+    target version, and a unified diff against the **stamped template
+    version**, not the current file. The agent or user can then merge
+    intentionally.
+  - The original `` is untouched.
+- **untracked** → see migration (#7). Either promote to `pending` and stamp,
+  or leave fully alone.
+- **customized** → never touch. Period.
+
+### 6. Background-agent integration (stretch)
+
+When `editing-drift` items exist and the project enables auto-merge (a new
+preference flag, default off), the sync command may dispatch a
+**`scaffold-keeper` background agent**. This agent:
+
+1. Reads the stamped template version (from the `SCAFFOLD_VERSION_ARCHIVE`).
+2. Reads the current template version.
+3. Reads the current customised file.
+4. Produces a 3-way merge: keep user customisations, apply non-conflicting
+   upstream improvements.
+5. Writes the result to `.proposed`.
+6. Posts an `approval_request` notification through the existing
+   structured-notification model so the user (or a supervisor) can review
+   and accept.
+
+This ADR does not specify the agent prompt, scheduling policy, or grading
+rubric. Those belong to a follow-up ADR; the architecture here merely
+*enables* the integration by guaranteeing a clean three-input merge surface
+(stamped version, current version, current file).
+
+### 7. Migration strategy for existing projects
+
+Existing projects have no markers.
Migration runs once on first sync after
+this ADR ships:
+
+```ts
+function migrateLegacyScaffold(basePath: string): MigrationReport
+```
+
+For each entry in `SCAFFOLD_FILES`:
+
+1. If file does not exist → `missing`, handled by sync normally.
+2. Read file, compute body sha256.
+3. Look up the hash in a new constant
+   `SCAFFOLD_VERSION_ARCHIVE: Record>`
+   that records every template body shipped by every prior SF version.
+4. If matched → file is verbatim from a known prior version. Promote to
+   `pending`, stamp marker with `version=`, then let the normal
+   `upgradable` path bring it forward.
+5. If no match → user has touched it (or it predates SF tracking entirely).
+   Leave the file alone, do **not** stamp. Treat as `customized`.
+
+The archive starts empty for the version of SF that ships this ADR (no prior
+versions had hashes for these templates). As SF evolves, every release that
+changes a template body adds an entry mapping the previous body hash to its
+SF version. Within a few releases, the archive covers the long tail of
+real-world projects.
+
+Migration is idempotent and non-destructive: unmatched files are not changed
+on disk at all.
+
+### 8. Doctor integration
+
+A new check in `doc-checker.ts` (or a sibling module to keep concerns
+separate):
+
+```ts
+function checkScaffoldFreshness(basePath: string): DoctorFinding
+```
+
+Reports:
+
+- Count by bucket: `missing`, `upgradable`, `editing-drift`, `untracked`.
+- Severity: **warning** (non-fatal). Never blocks dispatch.
+- Guidance: `Run /sf scaffold sync to refresh ${n} pending docs.`
+- If `editing-drift > 0`: `Run /sf scaffold sync --include-editing to merge customised docs.`
+
+Doctor finding integrates with the existing report rendering in
+`doc-checker.ts` style — same `status`/`note` shape so consumers don't fork.
+
+### 9. Automatic operation — primary mode
+
+**This is the headline behaviour.
Manual command operation (#10) exists only
+as an escape hatch and for dry-run inspection.**
+
+The upgrade pipeline runs automatically in three places:
+
+1. **On every SF startup** that already calls `ensureAgenticDocsScaffold`
+   (`auto-start.ts`, `guided-flow.ts`, `init-wizard.ts`). The sync runs
+   synchronously in the cheap path:
+   - `missing` items → write immediately (already current behaviour).
+   - `upgradable` items (pending state, hash matches) → silent re-render
+     immediately. No notification — no human attention needed.
+   - `untracked` items → migration tries the legacy hash match in-process; if
+     matched, promote to `pending` and re-render in the same pass.
+   - `editing-drift` items → defer to background agent (point 3).
+
+2. **After every milestone completion** (`auto-post-unit.ts`, `auto.ts:stopAuto`).
+   The state of the codebase has just changed meaningfully — a good time to
+   re-derive code-dependent docs (see "Code-as-fact verification" below).
+
+3. **Asynchronously via the existing subagent extension** for
+   `editing-drift` items and for code-as-fact re-derivation. The sync emits a
+   structured notification with `kind: "approval_request"` only when the
+   agent has produced a proposed change that needs review. Silent runs
+   produce no notification.
+
+The user **never has to know any of this is happening** for the common
+cases. The only signal is: on a fresh `sf` invocation, the scaffold catches
+up. If the agent finds something controversial, an approval-request
+notification surfaces with a `.proposed` artifact ready for review.
+
+#### Code-as-fact verification
+
+Static template refresh handles the easy case. The harder, more valuable
+case: **the document has drifted not because the template changed, but
+because the code has evolved past what the document claims.** Examples:
+
+- `ARCHITECTURE.md` says the codebase has 3 modules; the code now has 6.
+- `RELIABILITY.md` says exit codes are 0/1/10; the code now emits 0/1/10/11/12.
+- `SECURITY.md` lists the write-gate's protected paths; the code added 2 more.
+
+The scaffold-keeper background agent runs the **`records-keeper`** skill
+(already in `src/resources/extensions/sf/skills/records-keeper/SKILL.md`)
+which already specifies *"prefer source and tests for implemented behavior"*.
+That skill is the agent's contract.
+
+**Code is the fact. Documents are projections of code at a moment in time.**
+When the projection drifts, the agent re-derives by reading source. When the
+agent can re-derive non-controversially, it does so silently. When the change
+is large enough to warrant review, it surfaces a `.proposed` artifact through
+the structured notification model.
+
+This generalises records-keeper from a skill an agent *can* run to a skill
+the system runs *automatically* on a regular cadence. The records-keeper
+attempt was the right idea; this ADR is what makes it autonomous.
+
+#### Cadence
+
+- **Synchronous on startup**: only the cheap path (hash comparisons, missing
+  files, pending refreshes). Bounded by file count, runs in milliseconds.
+- **Asynchronous after milestone completion**: dispatches a scaffold-keeper
+  subagent. The subagent runs the records-keeper procedure on the docs whose
+  source dependencies have changed since last sync.
+- **Optional cron** via the existing `/loop` system: a daily background run
+  for projects that don't close milestones often.
+
+#### Failure modes are non-fatal
+
+The pipeline is designed to fail open: any error reading a marker, computing
+a hash, dispatching an agent, or writing a `.proposed` file results in the
+file being left alone, a debug log line, and SF continuing normally. The
+user-facing contract is "SF doesn't break my docs." Drift detection may
+under-fire; it must never over-fire.
+
+### 10.
Manual command — `/sf scaffold sync` (escape hatch)
+
+For dry-run inspection, forced refresh, and scoped operations:
+
+```
+/sf scaffold sync [--dry-run] [--include-editing] [--only=]
+```
+
+| Flag | Behaviour |
+|------|-----------|
+| (default) | Force the same operation that would run automatically. Useful when the user wants to refresh on demand without waiting for the next startup or milestone close. |
+| `--dry-run` | Print the drift report and the planned actions. Make no filesystem changes. The primary diagnostic mode. |
+| `--include-editing` | Synchronously merge `editing-drift` items inline (vs. the default async-via-subagent). Used when the user wants a definitive answer right now. |
+| `--only=` | Restrict the sync to a path glob. Useful for tightly scoped refreshes (`--only=harness/**`) or for re-deriving a specific code-dependent doc (`--only=docs/RELIABILITY.md`). |
+
+Exit code: 0 if no errors. Non-zero only if filesystem writes failed for
+reasons unrelated to drift (permission, disk full).
+
+## Implementation phases
+
+| Phase | Scope |
+|-------|-------|
+| **A** | Stamp markers on all `SCAFFOLD_FILES` writes. Maintain `.sf/scaffold-manifest.json`. Extend `PREFERENCES.md` frontmatter with `sf_template_state` and `sf_template_hash`. Existing `agentic-docs-scaffold.ts` callsites unchanged externally. |
+| **B** | Implement `detectScaffoldDrift`, `migrateLegacyScaffold`, and the initial `SCAFFOLD_VERSION_ARCHIVE` (empty). |
+| **C** | **Automatic synchronous sync**: extend `ensureAgenticDocsScaffold` to apply `missing` + `upgradable` + legacy-migrated items in the same pass. No new command surface required for this; the existing callsites get the upgrade behaviour for free. `checkScaffoldFreshness` doctor finding for visibility.
|
+| **D** | **Automatic asynchronous sync via existing infrastructure**: after milestone completion (`auto-post-unit.ts`, `auto.ts:stopAuto`), dispatch a `scaffold-keeper` subagent (via the existing `subagent` extension) that runs the **`records-keeper`** skill against drifted docs. Code-as-fact verification: agent reads source and re-derives content. Surfaces results as `kind: "approval_request"` notifications using the structured-notification model from ADR-019/020. |
+| **E** *(escape hatch)* | `/sf scaffold sync` command for dry-run inspection, forced refresh, and scoped operations (`--only=`, `--include-editing`). |
+
+Each phase is independently shippable and testable. Phase A alone unlocks
+the architectural property: SF can tell *pending* from *completed* on every
+project from now on. Phase C is what the user experiences as "automatic"
+for the simple cases; Phase D is what makes records-keeper autonomous for
+the code-derived cases.
+
+## Consequences
+
+### Becomes possible
+
+- Continuous template evolution: SF can iterate scaffold content freely
+  knowing pending docs auto-upgrade.
+- Visible signal for project staleness via doctor.
+- Clean separation of *SF-managed* vs *user-owned* content per file, not per
+  directory.
+- Foundation for background-agent merges of customised docs.
+- Future templates (new harness specs, new ADR types) propagate without
+  manual project-by-project edits.
+
+### Becomes harder
+
+- Scaffold manager grows from "skip-if-exists" to a state machine. Test
+  matrix grows accordingly.
+- Every template change must consider whether to bump the SF version archive
+  entry for the previous body. Forgotten archive entries cause legacy files
+  to be classified `customized` when they should be `pending`.
+- Marker format becomes load-bearing: changing it is itself a versioning
+  problem (handled by versioning the marker schema in `schemaVersion`).
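The growth of the test matrix noted above stays manageable if state derivation is a pure function of a few inputs. A minimal sketch following the state-derivation table in section 1: `deriveState`, `DerivationInput`, and the `promote-to-pending` label are hypothetical names, and the hashes are assumed to be computed elsewhere.

```typescript
// Hypothetical sketch of the state-derivation table from section 1.
// All names here are illustrative, not part of the existing SF code.
type DerivedState = "pending" | "editing" | "completed" | "promote-to-pending";

interface DerivationInput {
  // Parsed marker, or null when the file carries no marker.
  marker: { state?: "pending" | "editing" | "completed"; hash: string } | null;
  // Hash of the current file body (after the marker line, when present).
  bodyHash: string;
  // Body hashes of previously shipped templates (stand-in for the archive).
  knownTemplateHashes: ReadonlySet<string>;
}

function deriveState(input: DerivationInput): DerivedState {
  const { marker, bodyHash, knownTemplateHashes } = input;
  if (marker === null) {
    // No marker: a legacy archive match promotes the file, otherwise it is user-owned.
    return knownTemplateHashes.has(bodyHash) ? "promote-to-pending" : "completed";
  }
  // An explicit state=completed wins over any hash comparison.
  if (marker.state === "completed") return "completed";
  // Otherwise the state is inferred from the hash, as the table describes.
  return marker.hash === bodyHash ? "pending" : "editing";
}
```

With this shape, each row of the table becomes one assertion, so a forgotten case shows up as a missing test rather than as a misclassified file.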
+
+### Failure modes
+
+| Failure | Behaviour |
+|---------|-----------|
+| Corrupt `scaffold-manifest.json` | Rebuilt by re-walking files; markers are source of truth. |
+| Marker hash mismatch with stamped content (e.g. user hand-edited the marker) | File classified `editing`. SF will not overwrite. User can fix by running sync with `--include-editing` after reviewing the proposed file. |
+| `SCAFFOLD_VERSION_ARCHIVE` missing an entry for a real prior version | Affected files classified `customized` and left alone. Recoverable by adding the archive entry in a later SF release; sync will then promote on next run. |
+| Read-only filesystem | Each writer is wrapped in try/catch (same pattern as `upgradePreferencesFileIfDrifted`). Sync degrades to dry-run output. |
+| Concurrent sync runs | First writer wins; second sees no drift on second pass. Manifest writes are atomic via temp+rename. |
+
+## Alternatives Considered
+
+| Alternative | Why rejected |
+|-------------|--------------|
+| Status quo (skip-and-create only) | No visibility, no refresh path, anonymous templates. The whole point of this ADR. |
+| Aggressive refresh on every run (overwrite all scaffold files) | Destroys user customisations. Non-starter. |
+| Git-based detection (compare repo HEAD against SF init commit) | Requires clean git state at sync time, breaks under merges/rebases, conflates user commits with SF state, fragile across worktrees (per ADR-001). |
+| Per-file `..sf-meta` sidecar instead of inline marker | Doubles file count, easy to delete, harder to cite in human review. Marker travels with the file it describes. |
+| Single global `SF_VERSION` stamp on the manifest only | Cannot distinguish pending vs editing per file; a single global re-stamp would either skip everything (current behaviour) or overwrite everything (rejected). Per-file state is the minimum useful granularity. |
+| LLM-based "is this still pending?" classifier | Non-deterministic, expensive, unnecessary.
Hash equality is the right primitive. |
+
+## Migration
+
+See [#7](#7-migration-strategy-for-existing-projects). One-shot, idempotent,
+non-destructive. No project action required; first sync after upgrade
+classifies everything correctly.
+
+## Validation
+
+```bash
+# Once Phase A lands:
+node --experimental-strip-types --test \
+  src/resources/extensions/sf/tests/scaffold-versioning.test.ts
+
+# Once Phase E lands (the phase that adds the command):
+sf scaffold sync --dry-run
+# Expected output: structured drift report with bucket counts.
+```
+
+## References
+
+- ADR-001 — Branchless Worktree Architecture (`.sf/` durable vs runtime
+  boundary; informs which files are SF-managed vs user-curated).
+- ADR-018 — Repo-native Harness Evolution (template kits and harness
+  contracts that this versioning system will manage).
+- ADR-019 — Workspace VM Convergence (informs where scaffold sync runs in
+  the execution layer).
+- `src/resources/extensions/sf/preferences-template-upgrade.ts` — prior art
+  for frontmatter drift detection and silent re-stamping; the pattern this
+  ADR generalises to all scaffold-managed files.
+- `src/resources/extensions/sf/agentic-docs-scaffold.ts` — current scaffold
+  list and skip-if-exists logic, to be extended in Phase A.
+- `src/resources/extensions/sf/doc-checker.ts` — existing scaffold-content
+  checker, sibling to the new freshness check in Phase C.