docs(M003): context, requirements, and roadmap

Lex Christopherson 2026-03-14 22:02:26 -06:00
parent ac33781fd0
commit 24597873a0
6 changed files with 947 additions and 0 deletions

41
.gsd/DECISIONS.md Normal file

@@ -0,0 +1,41 @@
# Decisions Register
<!-- Append-only. Never edit or remove existing rows.
To reverse a decision, add a new row that supersedes it.
Read this file at the start of any planning or research phase. -->
| # | When | Scope | Decision | Choice | Rationale | Revisable? |
|---|------|-------|----------|--------|-----------|------------|
| D001 | M001 | arch | Secret collection insertion point | At `/gsd auto` entry (startAuto), not as a dispatch unit type | Keeps the state machine untouched. Collection is a one-time gate, not a repeating unit. Simpler, less risk of dispatch loop bugs. | Yes — if collection needs to happen mid-milestone |
| D002 | M001 | convention | Manifest file naming | `M00x-SECRETS.md` via existing `resolveMilestoneFile(base, mid, "SECRETS")` | Consistent with all other milestone-level files (CONTEXT, ROADMAP, RESEARCH). No new path resolver needed. | No |
| D003 | M001 | pattern | Summary screen interactivity | Read-only with auto-skip (no interactive deselection) | Matches the "walk away" philosophy. Simpler UX, fewer edge cases. User can always re-run collection. | Yes — if users request deselection |
| D004 | M001 | pattern | Guidance display placement | Same page as masked input (above the editor) | Single page per key — no extra navigation. User sees guidance while entering the value. | Yes — if terminal height constraints cause problems |
| D005 | M001 | convention | Manifest format | Markdown with H3 sections per key, bold fields, numbered guidance | Consistent with all other .gsd files. Parser and formatter already exist in files.ts. | No |
| D006 | M001 | arch | Destination inference | Reuse existing `detectDestination()` from get-secrets-from-user.ts | Simple file-presence checks (vercel.json → Vercel, convex/ → Convex, default → .env). Already proven. | Yes — if per-key destination override needed |
| D007 | M002 | arch | File structure after module split | Split index.ts into state.ts, lifecycle.ts, capture.ts, settle.ts, refs.ts, utils.ts, evaluate-helpers.ts, and tools/ directory | 5000-line monolith is unmaintainable; module boundaries enable safe changes. core.js already established the pattern. | No |
| D008 | M002 | library | Image resizing library | sharp | Fast, well-maintained, standard Node image processing. Replaces fragile canvas-based approach that depends on page context. | No |
| D009 | M002 | convention | Navigate screenshot default | Off by default, opt-in via parameter | Big token savings. Agent uses browser_screenshot explicitly when visual verification needed. | Yes — if agents consistently need screenshots on navigate |
| D010 | M002 | arch | Browser-side utility injection | page.addInitScript under window.__pi namespace | Survives navigation, available before page scripts, namespaced to avoid collisions. | Yes — if timing issues discovered |
| D011 | M002 | convention | Intent resolution approach | Deterministic heuristics only, no LLM calls | Predictable latency and cost. Scoring functions are testable and debuggable. | Yes — if heuristic coverage proves insufficient |
| D012 | M002 | convention | Browser reuse across sessions | Skip completely | Architecturally different from within-session work; user directed to exclude entirely. | No |
| D013 | M002/S01 | pattern | Mutable state accessor pattern | get/set functions for all 18 state variables, not `export let` | ES module live bindings break under jiti's CJS shim. Accessors guarantee consumers see mutations. | No |
| D014 | M002/S01 | pattern | ToolDeps interface location | Defined in state.ts alongside types it references | Keeps the dependency graph simple — tool files import state.ts for ToolDeps + types. | Yes — could move to separate types.ts if state.ts grows |
| D015 | M002/S01 | pattern | Factory pattern for lifecycle-dependent utils | createGetLivePagesSnapshot(ensureBrowser) instead of direct import | Avoids circular dependency between utils.ts and lifecycle.ts. Wired at orchestrator level. | No |
| D016 | M002/S01 | pattern | Tool file import strategy | Tool files import state accessors and core.js functions directly — ToolDeps carries only infrastructure functions needing lifecycle wiring | Keeps ToolDeps lean. State accessors are stable imports, not runtime-wired dependencies. Avoids bloating the deps interface with every utility. | Yes — if ToolDeps grows unwieldy |
| D017 | M002/S02 | pattern | Action tool signal classification | High-signal: click, type, key_press, select_option, set_checked, navigate, click_ref, fill_ref. Low-signal: scroll, hover, drag, upload_file, hover_ref. | High-signal tools produce meaningful page changes worth capturing body text for diffs. Low-signal tools don't change page content. fill_ref is high-signal because input value changes affect form state. | Yes — if new tools need reclassification |
| D018 | M002/S02 | pattern | postActionSummary retention | Keep postActionSummary in capture.ts for summary-only tools (go_back, go_forward, reload) but remove from action tools that do before/after diff | Summary-only tools don't do diffs and don't need beforeState — postActionSummary is the right abstraction for them. Action tools need consolidated capture. | Yes — could remove entirely if summary-only tools get before/after diff |
| D019 | M002/S02 | tuning | Zero-mutation settle thresholds | 60ms detection window, 30ms shortened quiet window, totalMutationsSeen === 0 required | Conservative thresholds — 60ms is enough time for any async DOM update to start, 30ms shortened window still catches late mutations. Requiring zero total mutations (not just current poll) prevents false short-circuits. | Yes — if real-world testing shows 60ms is too short for slow SPAs |
| D020 | M002/S04 | pattern | Form analysis evaluate location | Form analysis evaluate logic lives in tools/forms.ts, not extracted to evaluate-helpers.ts | Form-specific, not a shared utility. The label resolution heuristic is only used by form tools. Keeping it local avoids bloating the shared injection. | Yes — if S05 intent tools need label resolution |
| D021 | M002/S04 | pattern | Fill uses Playwright APIs, not evaluate | browser_fill_form uses Playwright locator.fill()/selectOption()/setChecked() instead of page.evaluate() value setting | Playwright APIs trigger proper input/change events and handle framework-specific reactivity (React, Vue). Direct value setting via evaluate skips event dispatch and breaks reactive frameworks. | No |
| D022 | M002/S04 | pattern | Fill field matching priority | Label (exact → case-insensitive) → name → placeholder → aria-label | Label is the most human-readable identifier. Name is the most reliable programmatic identifier. Placeholder and aria-label are fallbacks. Exact match before fuzzy prevents wrong-field fills. | Yes — if real-world usage shows a different priority works better |
| D023 | M002/S05 | pattern | Intent scoring model | 4 orthogonal dimensions per intent, each 0-1, summed and clamped | Consistent scoring structure across all 8 intents. Makes scoring testable and debuggable — each dimension has a named reason. 4 dimensions balance discrimination vs complexity. | Yes — could add/remove dimensions per intent if real-world usage shows imbalance |
| D024 | M002/S05 | pattern | search_field action type | Focus instead of click for search_field intent in browser_act | Search fields need keyboard focus for typing, not a click that might submit or toggle. Focus is the semantically correct action. Other intents use click. | Yes — if focus proves unreliable on specific input implementations |
| D025 | M002/S06 | pattern | Test import strategy for browser-tools | jiti CJS imports instead of ESM resolve-ts hook | The resolve-ts ESM hook breaks on core.js (plain .js file imported by TS modules). jiti handles mixed .ts/.js imports correctly from a .cjs test file. | No |
| D026 | M002/S06 | pattern | Testing module-private functions | Source extraction via readFileSync + brace-match + strip types + eval | Avoids exporting test-only APIs from production modules. Fragile to refactors but tests fail clearly when extraction breaks. Acceptable tradeoff for test code. | Yes — if private functions get exported for other reasons |
| D027 | M003 | arch | Git isolation model | Worktree-per-milestone (default for new projects) | Eliminates .gsd/ merge conflicts structurally. Each milestone gets its own worktree with isolated .gsd/ state. Branch-per-slice remains as opt-in legacy mode via git.isolation: "branch". | No |
| D028 | M003 | arch | Slice merge strategy within worktree | --no-ff merge (not squash) | Preserves full commit history as a diary of agent work. Merge commits give natural slice boundaries. Squash would destroy per-task granularity. | Yes — if commit noise proves problematic |
| D029 | M003 | arch | Milestone-to-main merge strategy | Squash merge | Main gets one clean commit per milestone. Individually revertable. Reads like a changelog. Full history preserved on milestone branch for forensics. | No |
| D030 | M003 | arch | Failure handling philosophy | Stop but self-heal | Auto-mode pauses, runs automatic repair (abort, reset, retry), resumes without user intervention in most cases. Only truly ambiguous conflicts need a human. Balances continuity with trust. | Yes — if self-heal proves unreliable |
| D031 | M003 | arch | Target user priority | Vibe coder first | Zero git errors as the default. Senior engineers configure overrides. Biggest market opportunity is users who can't use git today. | No |
| D032 | M003 | convention | Auto-worktree naming | Milestone ID as worktree name, milestone/<MID> as branch | .gsd/worktrees/M003/ with branch milestone/M003. Manual worktrees use worktree/<name> branches. No collision between auto and manual. | Yes — if naming conflicts discovered |
| D033 | M003 | arch | Migration strategy | New projects default to worktree; existing keep branch-per-slice | Detection: if project has gsd/* branches or milestone META with integration branch → legacy. Otherwise → worktree. No forced migration. | Yes — if adoption shows users want migration tooling |
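
The field matching priority recorded in D022 can be sketched as an ordered list of match passes. This is only an illustration: the `FieldInfo` shape and the `matchField` name are hypothetical and not taken from `tools/forms.ts`, and exact-string matching for name, placeholder, and aria-label is an assumption.

```typescript
// Hypothetical field descriptor; the real shape in tools/forms.ts may differ.
interface FieldInfo {
  label?: string;
  name?: string;
  placeholder?: string;
  ariaLabel?: string;
}

// Returns the first field matching `key`, trying identifiers in D022's order:
// label (exact, then case-insensitive), then name, placeholder, aria-label.
function matchField(fields: FieldInfo[], key: string): FieldInfo | undefined {
  const lower = key.toLowerCase();
  const passes: Array<(f: FieldInfo) => boolean> = [
    (f) => f.label === key,
    (f) => f.label?.toLowerCase() === lower,
    (f) => f.name === key,
    (f) => f.placeholder === key,
    (f) => f.ariaLabel === key,
  ];
  for (const pass of passes) {
    // A later pass is only consulted when every earlier pass found nothing.
    const hit = fields.find(pass);
    if (hit) return hit;
  }
  return undefined;
}
```

Running each pass over all fields before falling through to the next is what lets an exact label win over a looser match on an earlier field.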

43
.gsd/PROJECT.md Normal file

@@ -0,0 +1,43 @@
# Project
## What This Is
A pi coding agent extension (GSD — "Get Stuff Done") that provides structured planning, auto-mode execution, and project management for autonomous coding sessions. Includes proactive secret management, browser automation tools for UI verification, and worktree-isolated git architecture for zero-friction autonomous execution.
## Core Value
Auto-mode runs from start to finish without blocking. Git is invisible — no merge conflicts, no checkout errors, no state corruption. The system is automagical for vibe coders and configurable for senior engineers.
## Current State
The GSD extension is fully functional with:
- Milestone/slice/task planning hierarchy
- Auto-mode state machine with fresh-session-per-unit dispatch
- Guided `/gsd` wizard flow
- `secure_env_collect` tool with masked TUI input, multi-destination write support, guidance display, and summary screen
- Proactive secret management: planning prompts forecast secrets, manifests persist them, auto-mode collects them before first dispatch
- Browser-tools extension with 47 registered tools covering navigation, interaction, inspection, verification, tracing, debugging, form intelligence (browser_analyze_form, browser_fill_form), and intent-ranked retrieval and semantic actions (browser_find_best, browser_act)
- Browser-tools `core.js` with shared utilities for action timeline, page registry, state diffing, assertions, fingerprinting
- Branch-per-slice git model with squash merge to main (being superseded by worktree-isolated model in M003)
## Architecture / Key Patterns
- **Extension model**: pi extensions register tools, commands, hooks via `ExtensionAPI`
- **State machine**: `auto.ts` drives `dispatchNextUnit()` which reads disk state and dispatches fresh sessions
- **Secrets gate**: `startAuto()` checks `getManifestStatus()` before first dispatch
- **Disk-driven state**: `.gsd/` files are the source of truth; `STATE.md` is a derived cache
- **File parsing**: `files.ts` has markdown parsers for all GSD file types
- **Browser-tools**: Modular structure — slim `index.ts` orchestrator, 8 focused infrastructure modules (state.ts, utils.ts, evaluate-helpers.ts, lifecycle.ts, capture.ts, settle.ts, refs.ts), 11 categorized tool files under `tools/` (including forms.ts, intent.ts), shared infrastructure in `core.js` (~1000 lines). Browser-side utilities injected once via `addInitScript` under `window.__pi` namespace. Uses Playwright for browser control. Accessibility-first state representation, deterministic versioned refs, adaptive DOM settling, compact post-action summaries. Form tools use Playwright locator APIs for type-aware filling with structured result reporting. Intent tools use deterministic 4-dimension heuristic scoring for element retrieval and one-call semantic actions.
- **Prompt templates**: `prompts/` directory with mustache-like `{{var}}` substitution
- **TUI components**: `@gsd/pi-tui` provides `Editor`, `Text`, key handling, themes
- **Git architecture**: Worktree-per-milestone isolation (default for new projects). Each milestone gets its own git worktree with isolated `.gsd/` state. Slices merge via `--no-ff` into the milestone branch (preserving full commit history). Milestones squash-merge to main on completion. Legacy branch-per-slice model supported via `git.isolation: "branch"` preference.
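
The mustache-like `{{var}}` substitution mentioned under Prompt templates could look roughly like the sketch below. The `renderTemplate` name is hypothetical, and leaving unknown placeholders untouched is an assumption, not confirmed behavior of the `prompts/` loader.

```typescript
// Replaces each {{name}} with vars[name].
// Assumption: placeholders with no matching variable are left as-is.
function renderTemplate(template: string, vars: Record<string, string>): string {
  return template.replace(/\{\{(\w+)\}\}/g, (match: string, name: string) =>
    Object.prototype.hasOwnProperty.call(vars, name) ? vars[name] : match,
  );
}
```

For example, `renderTemplate("Write {{secretsOutputPath}}", { secretsOutputPath: "out.md" })` yields `"Write out.md"`.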
## Capability Contract
See `.gsd/REQUIREMENTS.md` for the explicit capability contract, requirement status, and coverage mapping.
## Milestone Sequence
- [x] M001: Proactive Secret Management — Front-loaded API key collection into planning so auto-mode runs uninterrupted (10 requirements validated)
- [x] M002: Browser Tools Performance & Intelligence — Module decomposition, action pipeline optimization, sharp-based screenshots, form intelligence, intent-ranked retrieval, semantic actions, 108-test suite (12 requirements validated)
- [ ] M003: Worktree-Isolated Git Architecture — Worktree-per-milestone isolation eliminating merge conflicts, self-healing git repair, zero git errors for vibe coders, configurable for senior engineers

553
.gsd/REQUIREMENTS.md Normal file

@@ -0,0 +1,553 @@
# Requirements
This file is the explicit capability and coverage contract for the project.
## Active
### R029 — Auto-worktree creation on milestone start
- Class: core-capability
- Status: active
- Description: When auto-mode starts a new milestone, it automatically creates a git worktree under `.gsd/worktrees/<MID>/` with branch `milestone/<MID>`, `chdir`s into it, and dispatches all units from within the worktree. The user never runs a git command.
- Why it matters: Worktree isolation gives each milestone its own `.gsd/` directory, eliminating the entire category of `.gsd/` merge conflicts that have required ~15 separate bug fixes to date.
- Source: user
- Primary owning slice: M003/S01
- Supporting slices: none
- Validation: unmapped
- Notes: Must handle: fresh milestone (no worktree yet), resumed milestone (worktree already exists), milestone started from non-main branch. Must coexist with manual `/worktree` command.
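
As a rough illustration of R029, the worktree path, branch name, and `git worktree add` arguments can be derived from the milestone ID alone. The `planAutoWorktree` name and `WorktreePlan` shape are hypothetical; the sketch covers only the fresh and resumed cases, not a milestone started from a non-main branch.

```typescript
interface WorktreePlan {
  path: string;      // where the worktree lives
  branch: string;    // branch checked out inside it
  gitArgs: string[]; // arguments for `git`, empty when nothing needs creating
}

// Derives the auto-worktree layout for a milestone.
// `worktreeExists` is supplied by the caller (e.g. from a directory check).
function planAutoWorktree(milestoneId: string, worktreeExists: boolean): WorktreePlan {
  const path = `.gsd/worktrees/${milestoneId}`;
  const branch = `milestone/${milestoneId}`;
  // Resumed milestone: reuse the existing worktree instead of re-creating it.
  const gitArgs = worktreeExists ? [] : ["worktree", "add", "-b", branch, path];
  return { path, branch, gitArgs };
}
```

After a successful create (or on resume), the caller would `process.chdir(plan.path)` before dispatching units.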
### R030 — Auto-worktree teardown + squash-merge on milestone complete
- Class: core-capability
- Status: active
- Description: When a milestone completes, the milestone branch is squash-merged to main with a rich commit message, the worktree is removed, and `process.chdir` returns to the main project root. Main receives exactly one commit per milestone.
- Why it matters: Main stays clean and always represents completed, working milestones. One commit per milestone is individually revertable.
- Source: user
- Primary owning slice: M003/S03
- Supporting slices: M003/S01
- Validation: unmapped
- Notes: Must handle: dirty worktree at teardown time (auto-commit first), failed squash-merge (self-heal), remote push after merge (if auto_push enabled).
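
The teardown sequence in R030 might be planned as an ordered list of git invocations, as in the sketch below. It assumes the commands run from the main project root with main checked out; the `planMilestoneTeardown` name and the auto-commit message are hypothetical, and the remote push and self-heal paths are left out.

```typescript
// Returns git argument lists in execution order for milestone completion.
function planMilestoneTeardown(
  milestoneId: string,
  worktreeDirty: boolean,
  squashMessage: string,
): string[][] {
  const branch = `milestone/${milestoneId}`;
  const worktree = `.gsd/worktrees/${milestoneId}`;
  const steps: string[][] = [];
  if (worktreeDirty) {
    // Auto-commit pending changes inside the worktree first (message is illustrative).
    steps.push(["-C", worktree, "add", "-A"]);
    steps.push(["-C", worktree, "commit", "-m", `chore(${milestoneId}): auto-commit before teardown`]);
  }
  // Squash the milestone branch into main as exactly one commit.
  steps.push(["merge", "--squash", branch]);
  steps.push(["commit", "-m", squashMessage]);
  // Remove the worktree once its history is safely on main.
  steps.push(["worktree", "remove", worktree]);
  return steps;
}
```

Keeping the plan as data makes the ordering easy to test without touching a real repository.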
### R031 — `--no-ff` slice merges within milestone worktree
- Class: core-capability
- Status: active
- Description: Completed slices merge into the milestone branch via `--no-ff` merge instead of squash. This preserves the full per-task commit history on the milestone branch, with merge commits providing natural slice boundaries.
- Why it matters: The commit history is a diary of the agent's work. The LLM can read `git log` to understand what happened. Squashing slices destroys this granularity. `--no-ff` merge commits give clean slice boundaries while keeping all commits.
- Source: user
- Primary owning slice: M003/S02
- Supporting slices: M003/S01
- Validation: unmapped
- Notes: This is the default for worktree-isolated mode. The branch-per-slice legacy model retains its existing squash default.
### R032 — Rich milestone-level squash commit message
- Class: core-capability
- Status: active
- Description: When a milestone squash-merges to main, the commit message summarizes all slices and their key outcomes. Format: conventional commit subject + slice task list body + branch metadata.
- Why it matters: Main's git log should read like a changelog. Each milestone commit should tell the full story of what was built.
- Source: user
- Primary owning slice: M003/S03
- Supporting slices: none
- Validation: unmapped
- Notes: Similar to current rich commit message for slice merges, but at milestone level. Should list all slices with their titles and key outcomes.
### R033 — `git.isolation` preference
- Class: core-capability
- Status: active
- Description: A `git.isolation` preference with values `"worktree"` (default for new projects) and `"branch"` (legacy model). New projects that have never run GSD default to worktree isolation. Existing projects with an established branch-per-slice history default to branch mode.
- Why it matters: Backwards compatibility — existing projects must not break. New projects get the better model by default.
- Source: user
- Primary owning slice: M003/S04
- Supporting slices: none
- Validation: unmapped
- Notes: Detection heuristic: if the project has existing `gsd/*` branches or milestone metadata with integration branch records, it's a legacy project → default to "branch". Otherwise → default to "worktree".
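
The detection heuristic in the notes above reduces to a small pure function. This is a sketch only: `defaultIsolation` and its parameters are hypothetical names, and how the branch list and milestone metadata are actually read is not shown.

```typescript
type Isolation = "worktree" | "branch";

// Picks the default isolation mode when the user has not set git.isolation.
function defaultIsolation(
  branchNames: string[],
  hasIntegrationBranchRecord: boolean,
): Isolation {
  // Existing gsd/* branches mark a project that already used branch-per-slice.
  const hasLegacyBranches = branchNames.some((b) => b.startsWith("gsd/"));
  return hasLegacyBranches || hasIntegrationBranchRecord ? "branch" : "worktree";
}
```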
### R034 — `git.merge_to_main` preference
- Class: core-capability
- Status: active
- Description: A `git.merge_to_main` preference with values `"milestone"` (default) and `"slice"`. In milestone mode, main only receives commits when milestones complete. In slice mode, each completed slice squash-merges to main immediately (current behavior).
- Why it matters: Senior engineers who want frequent integration can opt into slice-level merges. Vibe coders get the cleaner milestone-level default.
- Source: user
- Primary owning slice: M003/S04
- Supporting slices: M003/S03
- Validation: unmapped
- Notes: `merge_to_main: "slice"` with `isolation: "worktree"` is valid — slices squash-merge to main from within the worktree, but the worktree still provides `.gsd/` isolation.
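
How the two preferences combine for a completed slice can be sketched as follows. The `planSliceMerge` name and `SliceMergePlan` shape are hypothetical, and treating `isolation: "branch"` as always squashing to main (regardless of `merge_to_main`) is an assumption based on the legacy model's existing behavior.

```typescript
type Isolation = "worktree" | "branch";
type MergeToMain = "milestone" | "slice";

interface SliceMergePlan {
  target: "main" | "milestone-branch";
  strategy: "squash" | "no-ff";
}

// Decides where a finished slice merges and with which strategy.
function planSliceMerge(isolation: Isolation, mergeToMain: MergeToMain): SliceMergePlan {
  // Legacy branch mode and slice-level integration both squash straight to main.
  if (isolation === "branch" || mergeToMain === "slice") {
    return { target: "main", strategy: "squash" };
  }
  // Worktree isolation with milestone-level integration keeps full history.
  return { target: "milestone-branch", strategy: "no-ff" };
}
```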
### R035 — Self-healing git repair on failure
- Class: core-capability
- Status: active
- Description: When git operations fail during auto-mode (merge conflict, checkout failure, corrupt state), the system automatically attempts repair: abort incomplete merges, reset working tree, retry the operation. Only truly unresolvable conflicts (two humans edited the same code) pause auto-mode with a clear explanation.
- Why it matters: The north star is "automagical — just runs." Git errors are the #1 cause of auto-mode halting. Self-healing eliminates most of those stops.
- Source: user
- Primary owning slice: M003/S05
- Supporting slices: M003/S01, M003/S02, M003/S03
- Validation: unmapped
- Notes: The worktree model eliminates most `.gsd/` conflicts structurally. Self-healing handles the remaining edge cases (code conflicts, remote divergence, corrupt index).
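
One possible shape for the repair loop in R035 is an escalating retry: run the operation, and after each failure apply the next repair step before trying again. The sketch is synchronous for brevity; `withSelfHeal` and `RepairResult` are hypothetical names, and repair steps are assumed not to throw.

```typescript
interface RepairResult<T> {
  ok: boolean;
  value?: T;
  attempts: number;
  error?: string; // last failure message, set only when ok is false
}

// Runs `op`; on failure applies repairs in order, retrying after each one.
function withSelfHeal<T>(op: () => T, repairs: Array<() => void>): RepairResult<T> {
  let attempts = 0;
  let lastError = "";
  for (let i = 0; i <= repairs.length; i++) {
    attempts++;
    try {
      return { ok: true, value: op(), attempts };
    } catch (err) {
      lastError = err instanceof Error ? err.message : String(err);
      // Escalate: e.g. abort the merge first, then reset the working tree.
      if (i < repairs.length) repairs[i]();
    }
  }
  // All repairs exhausted: the caller pauses auto-mode with an explanation.
  return { ok: false, attempts, error: lastError };
}
```

A caller would pass repairs such as "abort incomplete merge" and "reset working tree" in that order, and only pause auto-mode when `ok` is false.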
### R036 — `.gsd/` conflict resolution elimination
- Class: quality-attribute
- Status: active
- Description: The ~60 lines of `.gsd/` auto-resolve conflict code in `mergeSliceToMain` and the ~44 merge-related recovery paths in `auto.ts` are simplified or removed. Worktree isolation makes most of this code structurally unnecessary.
- Why it matters: Dead conflict resolution code is maintenance burden and a source of bugs. If the architecture eliminates the problem, the code that patches it should go.
- Source: inferred
- Primary owning slice: M003/S02
- Supporting slices: M003/S06
- Validation: unmapped
- Notes: Only remove code that is genuinely unnecessary in worktree mode. Keep the legacy branch-per-slice path intact for `git.isolation: "branch"` users.
### R037 — Zero git errors for vibe coders
- Class: primary-user-loop
- Status: active
- Description: Users with zero git knowledge should never see a git error message during auto-mode. All git operations are invisible. If something fails, the system self-heals or presents a non-technical explanation with a clear action ("Run `/gsd doctor` to fix this").
- Why it matters: Vibe coders are the primary market. Git errors are incomprehensible to them and destroy trust in the system.
- Source: user
- Primary owning slice: M003/S05
- Supporting slices: all M003 slices
- Validation: unmapped
- Notes: This is a quality bar, not a single feature. Every git-touching codepath must handle errors gracefully.
### R038 — Backwards compatibility with branch-per-slice model
- Class: continuity
- Status: active
- Description: Existing projects that use the branch-per-slice model continue working exactly as they do today. No migration required. The old codepaths remain functional when `git.isolation: "branch"` is active.
- Why it matters: Breaking existing users' workflows would destroy trust.
- Source: user
- Primary owning slice: M003/S04
- Supporting slices: none
- Validation: unmapped
- Notes: All existing git-service.ts tests must continue passing in branch mode.
### R039 — Manual `/worktree` coexistence with auto-worktrees
- Class: integration
- Status: active
- Description: The manual `/worktree` command for exploration coexists with auto-mode's milestone worktrees. Different naming conventions prevent conflicts: auto-worktrees use `milestone/M003` branches, manual worktrees use `worktree/<name>` branches.
- Why it matters: Manual worktrees are a valuable exploration tool. They shouldn't be broken by auto-mode's worktree usage.
- Source: user
- Primary owning slice: M003/S01
- Supporting slices: none
- Validation: unmapped
- Notes: Auto-worktrees are created under `.gsd/worktrees/` just like manual ones, but with milestone ID as the name. The naming convention prevents branch collisions.
### R040 — Doctor git health checks
- Class: operability
- Status: active
- Description: `/gsd doctor` detects and optionally fixes git-related issues: orphaned auto-worktrees, stale milestone branches, corrupt merge state (MERGE_HEAD/SQUASH_MSG), tracked runtime files, missing gitignore patterns.
- Why it matters: When things do go wrong, users need a one-command fix. Doctor is the safety net.
- Source: inferred
- Primary owning slice: M003/S06
- Supporting slices: M003/S05
- Validation: unmapped
- Notes: Doctor already handles planning artifact issues. This extends it to git health.
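
Two of the checks listed for R040 can be illustrated over a plain snapshot of repository state. Everything here is a sketch: the `GitSnapshot` shape and `checkGitHealth` name are hypothetical, and "orphaned" is assumed to mean a milestone-named worktree directory with no active milestone.

```typescript
interface GitSnapshot {
  gitDirFiles: string[];      // file names found in the git directory
  worktreeNames: string[];    // directory names under .gsd/worktrees/
  activeMilestones: string[]; // milestone IDs still in progress
}

// Returns human-readable issue descriptions; an empty list means healthy.
function checkGitHealth(snap: GitSnapshot): string[] {
  const issues: string[] = [];
  // Leftover merge markers indicate an interrupted merge or squash.
  for (const marker of ["MERGE_HEAD", "SQUASH_MSG"]) {
    if (snap.gitDirFiles.includes(marker)) {
      issues.push(`leftover merge state: ${marker}`);
    }
  }
  // Auto-worktrees are named after milestone IDs; manual ones are ignored.
  for (const name of snap.worktreeNames) {
    if (/^M\d+$/.test(name) && !snap.activeMilestones.includes(name)) {
      issues.push(`orphaned auto-worktree: ${name}`);
    }
  }
  return issues;
}
```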
### R041 — Test coverage for worktree-isolated flow
- Class: quality-attribute
- Status: active
- Description: Test suite covers: auto-worktree create/teardown, `--no-ff` slice merge within worktree, milestone squash to main, preference switching between isolation modes, self-heal scenarios, doctor git checks. All existing git tests continue passing.
- Why it matters: The git system is the most bug-prone part of GSD. Tests prevent regressions.
- Source: inferred
- Primary owning slice: M003/S07
- Supporting slices: all M003 slices
- Validation: unmapped
- Notes: Must test both worktree and branch isolation modes.
## Validated
### R001 — Secret forecasting during milestone planning
- Class: core-capability
- Status: validated
- Description: When a milestone is planned, the LLM analyzes slices for external service dependencies and writes a secrets manifest listing every predicted API key with setup guidance.
- Why it matters: Without forecasting, auto-mode discovers missing keys mid-execution and blocks for hours waiting for user input.
- Source: user
- Primary owning slice: M001/S01
- Supporting slices: none
- Validation: plan-milestone.md Secret Forecasting section (line 62) instructs LLM to write manifest. Parser round-trip tested in parsers.test.ts.
- Notes: The plan-milestone prompt has forecasting instructions. The manifest format and parser are implemented and tested.
### R002 — Secrets manifest persisted in .gsd/
- Class: continuity
- Status: validated
- Description: The secrets manifest is a durable markdown file at `.gsd/milestones/M00x/M00x-SECRETS.md` that survives session boundaries and can be re-read by any future unit.
- Why it matters: Collection may happen in a different session than planning. The manifest must persist on disk.
- Source: user
- Primary owning slice: M001/S01
- Supporting slices: none
- Validation: parseSecretsManifest/formatSecretsManifest round-trip tested (parsers.test.ts), resolveMilestoneFile(base, mid, "SECRETS") resolves path.
- Notes: Parser/formatter implemented in files.ts. Template exists at templates/secrets-manifest.md.
### R003 — Step-by-step guidance per key
- Class: primary-user-loop
- Status: validated
- Description: Each secret in the manifest includes numbered steps for obtaining the key (navigate to dashboard → create project → generate key → copy), a dashboard URL, and a format hint.
- Why it matters: Users shouldn't have to figure out where to find each key. The guidance makes collection self-service.
- Source: user
- Primary owning slice: M001/S02
- Supporting slices: M001/S01
- Validation: collectOneSecret renders numbered dim-styled guidance steps with wrapping (collect-from-manifest.test.ts tests 6-8).
- Notes: Guidance quality is LLM-dependent and best-effort.
### R004 — Summary screen before collection
- Class: primary-user-loop
- Status: validated
- Description: Before collecting secrets one-by-one, show a read-only summary screen listing all needed keys with their status (pending / already set / skipped). Auto-skip keys that already exist in the environment.
- Why it matters: The user needs to see the full picture before entering keys. Already-set keys should not require re-entry.
- Source: user
- Primary owning slice: M001/S02
- Supporting slices: none
- Validation: showSecretsSummary() renders read-only ctx.ui.custom screen with status indicators via makeUI().progressItem() (collect-from-manifest.test.ts tests 4-5).
- Notes: Read-only with auto-skip — no interactive deselection.
### R005 — Existing key detection and silent skip
- Class: primary-user-loop
- Status: validated
- Description: Before prompting for a key, check `.env` and `process.env`. If the key already exists, mark it as "already set" in the summary and skip collection.
- Why it matters: Users shouldn't re-enter keys they've already configured. Prevents frustration and errors.
- Source: user
- Primary owning slice: M001/S02
- Supporting slices: none
- Validation: getManifestStatus cross-references checkExistingEnvKeys, categorizes env-present keys as existing (manifest-status.test.ts tests 4,7). collectSecretsFromManifest skips them (collect-from-manifest.test.ts tests 1-2).
- Notes: `checkExistingEnvKeys()` implemented in get-secrets-from-user.ts.
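
The existing-key check described in R005 might look roughly like this. It is not the actual `checkExistingEnvKeys()`; the `findExistingKeys` name, the simplified `.env` line parsing, and treating empty values as missing are all assumptions made for illustration.

```typescript
// Returns the subset of `keys` already set in .env text or the environment.
function findExistingKeys(
  keys: string[],
  dotenvText: string,
  env: Record<string, string | undefined>,
): string[] {
  const inDotenv = new Set<string>();
  for (const line of dotenvText.split(/\r?\n/)) {
    // Matches KEY=value and `export KEY=value`; comments and empty values are skipped.
    const m = /^\s*(?:export\s+)?([A-Za-z_][A-Za-z0-9_]*)\s*=\s*\S/.exec(line);
    if (m) inDotenv.add(m[1]);
  }
  return keys.filter((k) => {
    const fromEnv = env[k];
    return inDotenv.has(k) || (fromEnv !== undefined && fromEnv !== "");
  });
}
```

Keys returned here would be shown as "already set" on the summary screen and skipped during collection.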
### R006 — Smart destination detection
- Class: integration
- Status: validated
- Description: Automatically detect whether secrets should go to .env, Vercel, or Convex based on project file presence (vercel.json → Vercel, convex/ dir → Convex, default → .env).
- Why it matters: Users shouldn't have to specify the destination manually. The system should do the right thing.
- Source: user
- Primary owning slice: M001/S02
- Supporting slices: none
- Validation: collectSecretsFromManifest calls detectDestination() for destination inference. applySecrets() routes to dotenv/vercel/convex accordingly.
- Notes: `detectDestination()` implemented in get-secrets-from-user.ts.
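
The file-presence rule in R006 can be sketched as below. This is not the real `detectDestination()`: the `inferDestination` name and the injected `exists` predicate are hypothetical, and checking `vercel.json` before `convex/` when both are present is an assumption.

```typescript
type Destination = "dotenv" | "vercel" | "convex";

// Infers where secrets should be written from project file presence.
// `exists` reports whether a path relative to the project root is present.
function inferDestination(exists: (relPath: string) => boolean): Destination {
  if (exists("vercel.json")) return "vercel";
  if (exists("convex")) return "convex";
  // No platform marker found: fall back to a local .env file.
  return "dotenv";
}
```

Injecting the predicate keeps the rule testable without a real filesystem.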
### R007 — Auto-mode collection at entry point
- Class: core-capability
- Status: validated
- Description: When the user runs `/gsd auto`, check for a secrets manifest with pending keys. If found, collect them before dispatching the first slice. Collection happens once at the entry point, not as a dispatch unit.
- Why it matters: This is the primary integration point — auto-mode must not start execution with uncollected secrets.
- Source: user
- Primary owning slice: M001/S03
- Supporting slices: M001/S01, M001/S02
- Validation: startAuto() secrets gate at auto.ts:479. auto-secrets-gate.test.ts — 3/3 pass covering null manifest, pending keys, and no-pending-keys paths.
- Notes: Collection at entry point (startAuto), not as a separate unit type in dispatchNextUnit. D001 satisfied.
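
The three paths covered by the gate tests (null manifest, pending keys, no pending keys) correspond to a decision like the one sketched here. The `ManifestStatus` shape and `secretsGate` name are hypothetical; the real return type of `getManifestStatus()` may differ.

```typescript
// Hypothetical summary of a secrets manifest; null means no manifest exists.
interface ManifestStatus {
  pending: string[];
  existing: string[];
}

type GateAction = "collect" | "proceed";

// Decides whether auto-mode must collect secrets before the first dispatch.
function secretsGate(status: ManifestStatus | null): GateAction {
  if (status === null) return "proceed";
  return status.pending.length > 0 ? "collect" : "proceed";
}
```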
### R008 — Guided /gsd wizard integration
- Class: core-capability
- Status: validated
- Description: After milestone planning in the guided `/gsd` flow, trigger secret collection if a manifest exists with pending keys.
- Why it matters: Users who plan via the wizard should also get prompted for secrets before auto-mode begins.
- Source: user
- Primary owning slice: M001/S03
- Supporting slices: M001/S01, M001/S02
- Validation: guided-flow.ts calls startAuto() directly (lines 52, 486, 647, 794) — all guided flow paths that start auto-mode inherit the secrets gate.
- Notes: The guided flow dispatches to startAuto after planning. Collection is inherited via the gate.
### R009 — Planning prompts instruct LLM to forecast secrets
- Class: integration
- Status: validated
- Description: The plan-milestone prompt template includes instructions for the LLM to analyze slices for external service dependencies and write the secrets manifest.
- Why it matters: Without prompt instructions, the LLM won't know to forecast secrets.
- Source: user
- Primary owning slice: M001/S01
- Supporting slices: none
- Validation: plan-milestone.md has Secret Forecasting section at line 62 with instructions to write {{secretsOutputPath}} with H3 sections per key.
- Notes: Implemented in plan-milestone.md.
### R010 — secure_env_collect enhanced with guidance display
- Class: primary-user-loop
- Status: validated
- Description: The secure_env_collect TUI renders multi-line guidance steps above the masked input field on the same page, so the user sees setup instructions while entering the key.
- Why it matters: Without visible guidance, the user has to find keys on their own despite the LLM having generated instructions.
- Source: user
- Primary owning slice: M001/S02
- Supporting slices: none
- Validation: collectOneSecret accepts guidance parameter, renders numbered dim-styled lines with wrapTextWithAnsi above masked input (collect-from-manifest.test.ts tests 6-8).
- Notes: The guidance field is rendered in collectOneSecret().
### R015 — Module decomposition of browser-tools
- Class: quality-attribute
- Status: validated
- Description: The monolithic browser-tools index.ts (~5000 lines) is split into focused modules: shared infrastructure, tool groups, and browser-side utilities. All 43 existing tools continue to work identically.
- Why it matters: A 5000-line file is unmaintainable and makes targeted changes risky. Module boundaries enable safe refactoring and new tool development.
- Source: user
- Primary owning slice: M002/S01
- Supporting slices: none
- Validation: Extension loads via jiti, 43 tools register, browser navigate/snapshot/click work against real page, index.ts is 47-line orchestrator with zero registerTool calls, 9 tool files under tools/.
- Notes: core.js already exists with ~1000 lines of shared utilities. The split extends this pattern.
### R016 — Shared browser-side evaluate utilities
- Class: quality-attribute
- Status: validated
- Description: Common functions duplicated across page.evaluate boundaries (cssPath, simpleHash, isVisible, isEnabled, inferRole, accessibleName) are injected once and referenced from all evaluate callbacks.
- Why it matters: Currently buildRefSnapshot and resolveRefTarget each redeclare ~100 lines of identical utility code. Deduplication reduces payload size, improves maintainability, and ensures consistency.
- Source: user
- Primary owning slice: M002/S01
- Supporting slices: none
- Validation: window.__pi contains all 9 functions, survives navigation, refs.ts has zero inline redeclarations, close/reopen re-injects via addInitScript correctly.
- Notes: Uses context.addInitScript under window.__pi namespace.
### R017 — Consolidated state capture per action
- Class: core-capability
- Status: validated
- Description: The before-state capture, after-state capture, post-action summary, and recent-error check are consolidated into fewer page.evaluate calls per action.
- Why it matters: Every action tool currently runs 3-4 separate page.evaluate calls for state capture. Consolidating them reduces latency on every single browser interaction.
- Source: user
- Primary owning slice: M002/S02
- Supporting slices: M002/S01
- Validation: postActionSummary eliminated from action tools, countOpenDialogs removed from ToolDeps, consolidated capture pattern. Build passes.
- Notes: captureCompactPageState and postActionSummary merged into single evaluate.
### R018 — Conditional body text capture
- Class: core-capability
- Status: validated
- Description: Body text capture (includeBodyText: true) is skipped for low-signal actions (scroll, hover, Tab key press) and enabled for high-signal actions (navigate, click, type, submit).
- Why it matters: Capturing 4000 chars of body text on every scroll or hover is wasteful. Conditional capture reduces evaluate overhead.
- Source: user
- Primary owning slice: M002/S02
- Supporting slices: none
- Validation: explicit includeBodyText true/false per tool signal level in interaction.ts. Classification codified in D017. Build passes.
- Notes: Requires classifying each tool as high-signal or low-signal.
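
The classification described in R018 can be sketched as a lookup table. This is only an illustration: the action names, the `shouldIncludeBodyText` helper, and the choice to treat unknown actions as high-signal are assumptions, not the contents of interaction.ts.

```typescript
// Hypothetical sketch of the high-signal / low-signal split from R018.
// Action names and the default for unknown actions are assumptions.
type SignalLevel = "high" | "low";

const ACTION_SIGNAL: Record<string, SignalLevel> = {
  navigate: "high",
  click: "high",
  type: "high",
  submit: "high",
  scroll: "low",
  hover: "low",
  press_tab: "low",
};

// Returns true when body text should be captured for the given action.
function shouldIncludeBodyText(action: string): boolean {
  // Assumption: unclassified actions are treated as high-signal so that
  // no page content is silently dropped.
  const level: SignalLevel = ACTION_SIGNAL[action] ?? "high";
  return level === "high";
}
```

Keeping the table explicit means a new tool has to be classified deliberately rather than inheriting a default by accident.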
### R019 — Faster settle on zero mutations
- Class: core-capability
- Status: validated
- Description: settleAfterActionAdaptive short-circuits with a smaller quiet window when no mutation observer fires in the first 60ms.
- Why it matters: Many SPA interactions produce no DOM changes. Short-circuiting saves time on the most common case.
- Source: user
- Primary owning slice: M002/S02
- Supporting slices: none
- Validation: zero_mutation_shortcut settle reason in state.ts type union and settle.ts return path. 60ms/30ms thresholds codified in D019. Build passes.
- Notes: Track whether any mutation fired at all; if zero after 60ms, use a shorter quiet window.
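
A minimal sketch of the zero-mutation shortcut from R019, written as a pure decision function. It assumes the 30ms figure is the shortened quiet window and invents a 100ms default window purely for illustration; `chooseQuietWindow` is a hypothetical name, and only the `zero_mutation_shortcut` reason comes from the validation note.

```typescript
// Hypothetical sketch: pick the quiet window for settleAfterActionAdaptive.
// 60ms/30ms come from the validation note; the default window is assumed.
const ZERO_MUTATION_CHECK_MS = 60;
const SHORT_QUIET_WINDOW_MS = 30; // assumed meaning of the 30ms threshold
const DEFAULT_QUIET_WINDOW_MS = 100; // hypothetical default

interface SettleDecision {
  quietWindowMs: number;
  reason: "zero_mutation_shortcut" | "default";
}

function chooseQuietWindow(elapsedMs: number, mutationCount: number): SettleDecision {
  // Only shortcut once the initial check period has fully elapsed
  // and the mutation observer has not fired at all.
  if (elapsedMs >= ZERO_MUTATION_CHECK_MS && mutationCount === 0) {
    return { quietWindowMs: SHORT_QUIET_WINDOW_MS, reason: "zero_mutation_shortcut" };
  }
  return { quietWindowMs: DEFAULT_QUIET_WINDOW_MS, reason: "default" };
}
```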
### R020 — Sharp-based screenshot resizing
- Class: core-capability
- Status: validated
- Description: constrainScreenshot uses the sharp Node library for image resizing instead of bouncing buffers through page canvas context.
- Why it matters: Faster, no page dependency for image processing.
- Source: user
- Primary owning slice: M002/S03
- Supporting slices: M002/S01
- Validation: constrainScreenshot uses sharp(buffer).metadata() and sharp(buffer).resize(). Zero page.evaluate calls in capture.ts. Build passes.
- Notes: sharp added as a dependency.
### R021 — Opt-in screenshots on navigate
- Class: core-capability
- Status: validated
- Description: browser_navigate does not capture or return a screenshot by default. An explicit parameter opts in to screenshot capture.
- Why it matters: Significant token savings — the screenshot payload is large and often unnecessary.
- Source: user
- Primary owning slice: M002/S03
- Supporting slices: none
- Validation: browser_navigate has screenshot parameter default false. Capture gated. Build passes.
- Notes: Default is off. The agent can still use browser_screenshot explicitly.
### R022 — Form analysis tool (browser_analyze_form)
- Class: core-capability
- Status: validated
- Description: A browser_analyze_form tool that returns field inventory including labels, names, types, required status, current values, validation state, and submit controls.
- Why it matters: Collapses 3-8 tool calls for form analysis into one.
- Source: user
- Primary owning slice: M002/S04
- Supporting slices: M002/S01
- Validation: 7-level label resolution, form auto-detection, fieldset grouping, submit button discovery. Verified end-to-end against 12-field test form. Build passes.
- Notes: Must handle label association via for/id, wrapping label, aria-label, aria-labelledby, and placeholder.
### R023 — Form fill tool (browser_fill_form)
- Class: core-capability
- Status: validated
- Description: A browser_fill_form tool that maps labels/names/placeholders to inputs and fills them with type-aware Playwright APIs.
- Why it matters: Collapses 3-5 tool calls for form filling into one.
- Source: user
- Primary owning slice: M002/S04
- Supporting slices: M002/S01
- Validation: 5-strategy field resolution, type-aware fill via Playwright APIs, verified end-to-end with 10 fields. Build passes.
- Notes: Returns matched fields, unmatched values, fields skipped, and validation state.
### R024 — Intent-ranked element retrieval (browser_find_best)
- Class: core-capability
- Status: validated
- Description: A browser_find_best tool that returns scored candidates using deterministic heuristic ranking for 8 semantic intents.
- Why it matters: Cuts a round trip and reduces reasoning tokens for common element-finding tasks.
- Source: user
- Primary owning slice: M002/S05
- Supporting slices: M002/S01
- Validation: 8 intents implemented with 4-dimension scoring. Verified via Playwright tests. Build passes, tool count = 47.
- Notes: Deterministic heuristics only. No hidden LLM calls.
### R025 — Semantic action tool (browser_act)
- Class: core-capability
- Status: validated
- Description: A browser_act tool that resolves the top candidate for a semantic intent and executes the action in one call.
- Why it matters: Collapses 2-4 tool calls for common micro-tasks into one.
- Source: user
- Primary owning slice: M002/S05
- Supporting slices: M002/S04
- Validation: Resolves via same scoring engine as browser_find_best. Executes via Playwright locator. Returns before/after diff. Build passes, tool count = 47.
- Notes: Builds on browser_find_best for element selection. Bounded — does not loop or retry.
### R026 — Test coverage for new and refactored code
- Class: quality-attribute
- Status: validated
- Description: Test suite covers shared browser-side utilities, settle logic, screenshot resizing, form tools, and intent ranking.
- Why it matters: Regression protection for refactored and new features.
- Source: user
- Primary owning slice: M002/S06
- Supporting slices: all M002 slices
- Validation: 108 tests (63 unit + 45 integration) passing via `npm run test:browser-tools`.
- Notes: Test what's unit-testable without a browser. Integration tests with Playwright for tools that need a page.
## Deferred
### R011 — Multi-milestone secret forecasting
- Class: core-capability
- Status: deferred
- Description: Forecast secrets across all planned milestones, not just the active one.
- Why it matters: Would provide a complete picture of all secrets needed for the project.
- Source: user
- Primary owning slice: none
- Supporting slices: none
- Validation: unmapped
- Notes: Deferred — single-milestone forecasting is sufficient for now.
### R012 — Secret rotation reminders
- Class: operability
- Status: deferred
- Description: Track secret age and remind users when keys may need rotation.
- Why it matters: Security best practice, but not essential for the core workflow.
- Source: user
- Primary owning slice: none
- Supporting slices: none
- Validation: unmapped
- Notes: Deferred — out of scope for initial release.
### R027 — Browser reuse across sessions
- Class: core-capability
- Status: deferred
- Description: Keep a warm browser instance across rapid successive agent contexts to avoid ~2-3s Chrome cold-start per session.
- Why it matters: Would eliminate Chrome launch latency in auto-mode.
- Source: user
- Primary owning slice: none
- Supporting slices: none
- Validation: unmapped
- Notes: Deferred — skip completely per user direction.
### R042 — Parallel milestone execution in multiple worktrees
- Class: core-capability
- Status: deferred
- Description: Run multiple milestones simultaneously in separate worktrees with independent auto-mode sessions.
- Why it matters: Natural extension of worktree-per-milestone architecture. Would enable parallel work streams.
- Source: user
- Primary owning slice: none
- Supporting slices: none
- Validation: unmapped
- Notes: Deferred — ship sequential milestone execution first. The worktree infrastructure naturally supports this later.
### R043 — Native libgit2 write operations
- Class: quality-attribute
- Status: deferred
- Description: Extend the Rust/libgit2 native module to cover write operations (commit, merge, checkout) in addition to the current read-only queries.
- Why it matters: Would eliminate execSync overhead for git writes on the hot path.
- Source: inferred
- Primary owning slice: none
- Supporting slices: none
- Validation: unmapped
- Notes: Deferred — execSync writes are functional. Optimize later if profiling shows it matters.
## Out of Scope
### R013 — Curated service knowledge base
- Class: anti-feature
- Status: out-of-scope
- Description: A static database of known services with pre-written guidance for each API key.
- Why it matters: Prevents scope creep. LLM-generated guidance is sufficient and stays current without maintenance.
- Source: user
- Primary owning slice: none
- Supporting slices: none
- Validation: n/a
- Notes: LLM generates guidance dynamically.
### R014 — Just-in-time collection enhancement
- Class: anti-feature
- Status: out-of-scope
- Description: Detect missing secrets during task execution and collect them inline.
- Why it matters: Prevents scope confusion. M001 is about proactive collection, not reactive.
- Source: user
- Primary owning slice: none
- Supporting slices: none
- Validation: n/a
- Notes: Existing secure_env_collect already handles reactive collection.
### R028 — LLM-powered intent resolution
- Class: anti-feature
- Status: out-of-scope
- Description: Using hidden LLM calls inside browser_find_best or browser_act for intent resolution.
- Why it matters: Prevents unpredictable latency and cost.
- Source: inferred
- Primary owning slice: none
- Supporting slices: none
- Validation: n/a
- Notes: browser_find_best and browser_act use scoring heuristics, not LLM inference.
### R044 — Rebase merge strategy
- Class: anti-feature
- Status: out-of-scope
- Description: Adding rebase as a merge strategy option alongside squash and --no-ff merge.
- Why it matters: Rebase rewrites history, which conflicts with the "commit diary" philosophy. It also introduces more failure modes (rebase conflicts are harder to auto-resolve than merge conflicts).
- Source: inferred
- Primary owning slice: none
- Supporting slices: none
- Validation: n/a
- Notes: --no-ff merge + squash covers all needed use cases without history rewriting.
## Traceability
| ID | Class | Status | Primary owner | Supporting | Proof |
|---|---|---|---|---|---|
| R001 | core-capability | validated | M001/S01 | none | plan-milestone.md Secret Forecasting section, parser round-trip tests |
| R002 | continuity | validated | M001/S01 | none | parseSecretsManifest/formatSecretsManifest round-trip tested |
| R003 | primary-user-loop | validated | M001/S02 | M001/S01 | collect-from-manifest.test.ts tests 6-8 |
| R004 | primary-user-loop | validated | M001/S02 | none | collect-from-manifest.test.ts tests 4-5 |
| R005 | primary-user-loop | validated | M001/S02 | none | manifest-status.test.ts tests 4,7; collect-from-manifest.test.ts tests 1-2 |
| R006 | integration | validated | M001/S02 | none | collectSecretsFromManifest calls detectDestination() |
| R007 | core-capability | validated | M001/S03 | M001/S01, M001/S02 | auto-secrets-gate.test.ts 3/3 pass |
| R008 | core-capability | validated | M001/S03 | M001/S01, M001/S02 | guided-flow.ts calls startAuto() at lines 52, 486, 647, 794 |
| R009 | integration | validated | M001/S01 | none | plan-milestone.md Secret Forecasting section line 62 |
| R010 | primary-user-loop | validated | M001/S02 | none | collect-from-manifest.test.ts tests 6-8 |
| R011 | core-capability | deferred | none | none | unmapped |
| R012 | operability | deferred | none | none | unmapped |
| R013 | anti-feature | out-of-scope | none | none | n/a |
| R014 | anti-feature | out-of-scope | none | none | n/a |
| R015 | quality-attribute | validated | M002/S01 | none | jiti load, 43 tools register, slim index, browser spot-check |
| R016 | quality-attribute | validated | M002/S01 | none | window.__pi injection, zero inline redeclarations, survives navigation |
| R017 | core-capability | validated | M002/S02 | M002/S01 | postActionSummary eliminated, consolidated capture pattern |
| R018 | core-capability | validated | M002/S02 | none | explicit includeBodyText true/false per tool signal level |
| R019 | core-capability | validated | M002/S02 | none | zero_mutation_shortcut settle reason, 60ms/30ms thresholds |
| R020 | core-capability | validated | M002/S03 | M002/S01 | sharp-based constrainScreenshot, zero page.evaluate in capture.ts |
| R021 | core-capability | validated | M002/S03 | none | screenshot param default false, capture gated |
| R022 | core-capability | validated | M002/S04 | M002/S01 | 7-level label resolution, verified against 12-field test form |
| R023 | core-capability | validated | M002/S04 | M002/S01 | 5-strategy field resolution, verified end-to-end with 10 fields |
| R024 | core-capability | validated | M002/S05 | M002/S01 | 8-intent scoring, Playwright tests, differentiated rankings |
| R025 | core-capability | validated | M002/S05 | M002/S04 | top candidate execution, settle + diff, graceful error |
| R026 | quality-attribute | validated | M002/S06 | all M002 | 108 tests passing via npm run test:browser-tools |
| R027 | core-capability | deferred | none | none | unmapped |
| R028 | anti-feature | out-of-scope | none | none | n/a |
| R029 | core-capability | active | M003/S01 | none | unmapped |
| R030 | core-capability | active | M003/S03 | M003/S01 | unmapped |
| R031 | core-capability | active | M003/S02 | M003/S01 | unmapped |
| R032 | core-capability | active | M003/S03 | none | unmapped |
| R033 | core-capability | active | M003/S04 | none | unmapped |
| R034 | core-capability | active | M003/S04 | M003/S03 | unmapped |
| R035 | core-capability | active | M003/S05 | M003/S01, M003/S02, M003/S03 | unmapped |
| R036 | quality-attribute | active | M003/S02 | M003/S06 | unmapped |
| R037 | primary-user-loop | active | M003/S05 | all M003 | unmapped |
| R038 | continuity | active | M003/S04 | none | unmapped |
| R039 | integration | active | M003/S01 | none | unmapped |
| R040 | operability | active | M003/S06 | M003/S05 | unmapped |
| R041 | quality-attribute | active | M003/S07 | all M003 | unmapped |
| R042 | core-capability | deferred | none | none | unmapped |
| R043 | quality-attribute | deferred | none | none | unmapped |
| R044 | anti-feature | out-of-scope | none | none | n/a |
## Coverage Summary
- Active requirements: 13
- Mapped to slices: 13
- Validated: 22
- Deferred: 5
- Out of scope: 4
- Unmapped active requirements: 0
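
The summary counts can be re-derived from the Traceability table. The sketch below tallies the Status column of a markdown table; `tallyStatuses` is a hypothetical helper, not part of the GSD codebase, and it assumes the column order shown above (ID, Class, Status, ...).

```typescript
// Hypothetical helper: count rows per status in a traceability table.
// Assumes columns are ordered: ID | Class | Status | ...
function tallyStatuses(tableMarkdown: string): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const line of tableMarkdown.split("\n")) {
    const cells = line.split("|").map((cell) => cell.trim());
    // cells[0] is the empty string before the leading pipe.
    const [, id = "", , rowStatus = ""] = cells;
    // Skip the header row, the separator row, and anything that is not a data row.
    if (!/^R\d+$/.test(id) || rowStatus === "") continue;
    counts[rowStatus] = (counts[rowStatus] ?? 0) + 1;
  }
  return counts;
}
```

Applied to the table above, the tally matches the summary: 22 validated, 13 active, 5 deferred, and 4 out-of-scope.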

23
.gsd/STATE.md Normal file

@ -0,0 +1,23 @@
# GSD State
**Active Milestone:** M003 — Worktree-Isolated Git Architecture
**Active Slice:** None
**Phase:** pre-planning
## Milestone Registry
- ✅ **M001:** Proactive Secret Management
- ✅ **M002:** Browser Tools Performance & Intelligence
- 🔄 **M003:** Worktree-Isolated Git Architecture
## Recent Decisions
- D027: Git isolation model — worktree-per-milestone (default for new projects)
- D028: Slice merge strategy within worktree — --no-ff merge
- D029: Milestone-to-main merge strategy — squash merge
- D030: Failure handling philosophy — stop but self-heal
- D031: Target user priority — vibe coder first
## Blockers
- None
## Next Action
Research and plan M003. Context and roadmap are written. Ready for auto-mode.


@ -0,0 +1,114 @@
# M003: Worktree-Isolated Git Architecture
**Gathered:** 2026-03-14
**Status:** Ready for planning
## Project Description
Overhaul GSD's git system to use worktree-per-milestone isolation as the default model. Each milestone gets its own git worktree with an isolated `.gsd/` directory, eliminating the entire category of `.gsd/` merge conflicts that have caused ~15 separate bug fixes to date. Slices merge into the milestone branch via `--no-ff` (preserving full commit history as a diary of the agent's work). Milestones squash-merge to main on completion (keeping main clean). The system is automagical for vibe coders — zero git errors, zero git knowledge required — and configurable for senior engineers via preferences.
## Why This Milestone
The current branch-per-slice model shares `.gsd/` state across branches, causing merge conflicts that halt auto-mode. The CHANGELOG shows a pattern: each fix leads to a new edge case. The root cause is structural — sharing mutable state across branches. Worktree isolation eliminates the problem architecturally rather than patching symptoms.
## User-Visible Outcome
### When this milestone is complete, the user can:
- Run `/gsd auto` on a new project and have it execute start-to-finish without any git errors, merge conflicts, or mysterious halts
- See clean `git log` on main with one commit per completed milestone
- Configure `git.merge_to_main: "slice"` in preferences to get slice-level integration if they want it
- Run `/gsd doctor` to detect and fix git-related issues
- Use manual `/worktree` alongside auto-mode without conflicts
### Entry point / environment
- Entry point: `/gsd auto` CLI command, `/gsd doctor` CLI command
- Environment: local dev — any git repository
- Live dependencies involved: git CLI, optional libgit2 native module
## Completion Class
- Contract complete means: auto-worktree create/teardown lifecycle works, slice merges use `--no-ff`, milestone squashes to main, preferences switch between modes, self-heal recovers from common failures, all tests pass
- Integration complete means: the full auto-mode lifecycle (startAuto → dispatch units → complete slices → complete milestone → merge to main) works end-to-end in a real git repo with real file changes
- Operational complete means: existing projects on branch-per-slice model continue working unchanged, manual `/worktree` coexists without conflicts
## Final Integrated Acceptance
To call this milestone complete, we must prove:
- Auto-mode on a fresh project creates a worktree, executes through multiple slices, and merges the milestone to main — with zero git errors
- An existing project with branch-per-slice history continues working identically (no regression)
- A deliberately introduced merge conflict is self-healed without user intervention
- `git log main` shows exactly one squash commit per completed milestone
- `git log milestone/M003` shows full commit history with `--no-ff` merge boundaries per slice
## Risks and Unknowns
- **`process.chdir` in auto-mode** — auto-mode currently passes `basePath` to all functions but never calls `chdir`. Worktree mode needs to `chdir` into the worktree so that all tool calls (bash, read, write, edit) resolve against it. `worktree-command.ts` already does this for manual worktrees. Risk: one codepath uses `basePath` while another uses `process.cwd()`, causing split-brain.
- **Worktree `.gsd/` inheritance** — when a worktree is created, it gets a copy of the project files from the milestone branch base. It is undecided whether `.gsd/` planning files from the main tree should carry over into the worktree. Need to decide: copy planning state or start fresh.
- **State machine re-entry** — if auto-mode is paused and resumed, the worktree must be re-entered (if it still exists). The pause/resume logic in `startAuto` needs to handle this.
- **Existing orphan recovery** — the current `mergeOrphanedSliceBranches` logic needs to work within the worktree context, not just on main.
> See `.gsd/DECISIONS.md` for all architectural and pattern decisions — it is an append-only register; read it during planning, append to it during execution.
## Relevant Requirements
- R029 — Auto-worktree creation on milestone start
- R030 — Auto-worktree teardown + squash-merge on milestone complete
- R031 — `--no-ff` slice merges within milestone worktree
- R032 — Rich milestone-level squash commit message
- R033 — `git.isolation` preference
- R034 — `git.merge_to_main` preference
- R035 — Self-healing git repair on failure
- R036 — `.gsd/` conflict resolution elimination
- R037 — Zero git errors for vibe coders
- R038 — Backwards compatibility with branch-per-slice model
- R039 — Manual `/worktree` coexistence with auto-worktrees
- R040 — Doctor git health checks
- R041 — Test coverage for worktree-isolated flow
## Scope
### In Scope
- Auto-worktree lifecycle wired into `startAuto()` and `complete-milestone`
- `--no-ff` merge for slices within worktree, squash for milestone to main
- `git.isolation` and `git.merge_to_main` preferences with validation
- Self-healing git repair (abort, reset, retry) for common failure modes
- Doctor git health checks (orphaned worktrees, stale branches, corrupt state)
- Simplification of `.gsd/` conflict resolution code (worktree mode only)
- Test suite for both worktree and branch isolation modes
- Backwards compatibility with existing branch-per-slice projects
### Out of Scope / Non-Goals
- Parallel milestone execution (deferred to future milestone)
- Native libgit2 write operations (deferred)
- Rebase merge strategy (anti-feature — conflicts with commit diary philosophy)
- Remote git operations beyond existing auto-push
## Technical Constraints
- Must work with git CLI (libgit2 native module is optional, read-only)
- `process.chdir` is the mechanism for worktree switching (proven in worktree-command.ts)
- All file tools (read, write, edit, bash) resolve against `process.cwd()` — this is the reason `chdir` works
- Source files are in `src/resources/extensions/gsd/`, tests in `src/resources/extensions/gsd/tests/`
- Tests run via `npm run test:unit` and `npm run test:integration`
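
The `chdir` constraint can be illustrated with a small helper that switches the working directory, runs a callback, and restores the previous directory. This is a sketch under assumptions: `withCwd` and `isCwdInSync` are hypothetical names, and the scratch directory merely stands in for a worktree path.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical helper: run `fn` with process.cwd() set to `dir`,
// restoring the previous working directory even if `fn` throws.
function withCwd<T>(dir: string, fn: () => T): T {
  const previous = process.cwd();
  process.chdir(dir);
  try {
    return fn();
  } finally {
    process.chdir(previous);
  }
}

// Hypothetical guard against the basePath / process.cwd() split-brain risk:
// true only when both resolve to the same real directory.
function isCwdInSync(basePath: string): boolean {
  return fs.realpathSync(process.cwd()) === fs.realpathSync(basePath);
}

// Usage: a scratch directory stands in for a worktree.
const scratchDir = fs.mkdtempSync(path.join(os.tmpdir(), "gsd-worktree-"));
const inSyncInside = withCwd(scratchDir, () => isCwdInSync(scratchDir));
const inSyncOutside = isCwdInSync(scratchDir);
console.log({ inSyncInside, inSyncOutside });
```

A guard like `isCwdInSync` could run before each dispatched unit so that a mismatch between `basePath` and the working directory is caught before any file tool runs.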
## Integration Points
- `auto.ts` — primary integration point for worktree lifecycle in `startAuto()`, `dispatchNextUnit()`, `handleAgentEnd()`
- `git-service.ts` — `GitServiceImpl` class owns all git mutation operations
- `worktree.ts` — thin facade over `GitServiceImpl`, exports `ensureSliceBranch`, `mergeSliceToMain`, etc.
- `worktree-manager.ts` — existing worktree create/list/remove/merge operations
- `worktree-command.ts` — manual `/worktree` command with `process.chdir` handling
- `preferences.ts` — preference validation and loading
- `doctor.ts` — health check and auto-fix system
- `native-git-bridge.ts` — libgit2 read operations
- `dispatch-guard.ts` — prior-slice completion checking
## Open Questions
- **Worktree naming convention for auto-worktrees** — should auto-worktrees use the milestone ID as the name (`.gsd/worktrees/M003/`) or a prefixed name (`.gsd/worktrees/auto-M003/`)? Current thinking: bare milestone ID is cleaner and the branch convention (`milestone/M003` vs `worktree/<name>`) disambiguates from manual worktrees.
- **`.gsd/` file handling on worktree creation** — should the worktree inherit the main tree's `.gsd/` planning files, or should they be cleared for a fresh start? Current thinking: inherit — the worktree needs the milestone's CONTEXT.md and ROADMAP.md to continue planning.
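
The "current thinking" on naming can be sketched as three small functions. The function names are hypothetical and the convention is still an open question, so this only illustrates how the branch prefix would disambiguate auto-worktrees from manual ones.

```typescript
// Hypothetical sketch of the proposed naming convention (not yet decided).
// Auto-worktree directory: <base>/.gsd/worktrees/<milestoneId>
function autoWorktreeDir(basePath: string, milestoneId: string): string {
  const trimmedBase = basePath.replace(/\/+$/, "");
  return `${trimmedBase}/.gsd/worktrees/${milestoneId}`;
}

// Auto-worktree branch: milestone/<milestoneId>
function milestoneBranchName(milestoneId: string): string {
  return `milestone/${milestoneId}`;
}

// Manual worktrees use worktree/<name>, so the prefix alone tells them apart.
function isAutoWorktreeBranch(branch: string): boolean {
  return /^milestone\/M\d+$/.test(branch);
}
```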


@ -0,0 +1,173 @@
# M003: Worktree-Isolated Git Architecture
**Vision:** Overhaul GSD's git system so that auto-mode is automagical — zero git errors, zero merge conflicts, zero user intervention required. Each milestone gets its own isolated worktree. Main is always clean. The system just runs.
## Success Criteria
- Auto-mode on a fresh project executes through an entire milestone without any git errors or halts
- Main branch only receives commits when milestones complete (one squash commit per milestone)
- Full commit history preserved within milestone worktree branches via `--no-ff` slice merges
- Existing branch-per-slice projects continue working identically — zero regressions
- Self-healing resolves common git failures (merge conflict, checkout issue, corrupt state) without user intervention
- `/gsd doctor` detects and fixes git health issues (orphaned worktrees, stale branches, corrupt merge state)
## Key Risks / Unknowns
- **`process.chdir` coherence in auto-mode** — all tool calls must resolve against the worktree path after chdir. The worktree-command.ts has proven this works, but auto-mode's `basePath` variable and `process.cwd()` must stay in sync.
- **Worktree `.gsd/` inheritance** — creating a worktree copies project files from the base branch. `.gsd/` planning files (CONTEXT, ROADMAP) must carry through; runtime files (STATE.md, metrics, activity) must not cause conflicts.
- **State machine re-entry on resume** — pausing and resuming auto-mode must re-enter the worktree if it exists. The current pause/resume logic doesn't handle this.
## Proof Strategy
- `process.chdir` coherence → retire in S01 by proving auto-mode dispatches and executes a unit inside the worktree with all file operations resolving correctly
- Worktree `.gsd/` inheritance → retire in S01 by proving planning files are available after worktree creation and runtime files don't conflict
- State machine re-entry → retire in S01 by proving pause/resume correctly re-enters the worktree
## Verification Classes
- Contract verification: git operations produce expected branch state, file layout, and commit history in temp repos
- Integration verification: full auto-mode lifecycle (create worktree → execute slices → merge milestone → teardown) in a real git repo
- Operational verification: existing branch-per-slice projects continue working; manual `/worktree` coexists
- UAT / human verification: run auto-mode on a real project and confirm zero git errors
## Milestone Definition of Done
This milestone is complete only when all are true:
- Auto-worktree lifecycle works end-to-end (create, execute, merge, teardown)
- `--no-ff` slice merges produce correct history on milestone branch
- Milestone squash to main produces clean single commit
- `git.isolation` and `git.merge_to_main` preferences work with validation
- Self-healing recovers from common git failures without user intervention
- Existing branch-per-slice projects pass all existing tests
- `/gsd doctor` detects and fixes git health issues
- Full test suite passes for both worktree and branch isolation modes
- Success criteria re-checked against live behavior
## Requirement Coverage
- Covers: R029, R030, R031, R032, R033, R034, R035, R036, R037, R038, R039, R040, R041
- Partially covers: none
- Leaves for later: R042 (parallel milestones), R043 (native libgit2 writes)
- Orphan risks: none
## Slices
- [ ] **S01: Auto-worktree lifecycle in auto-mode** `risk:high` `depends:[]`
> After this: `startAuto()` on a new milestone creates a worktree under `.gsd/worktrees/M003/`, `chdir`s into it, and dispatches units inside the worktree. Pause/resume re-enters the worktree. Progress widget shows the worktree branch. Verified via running auto-mode unit dispatch in a temp repo worktree.
- [ ] **S02: --no-ff slice merges + conflict elimination** `risk:high` `depends:[S01]`
> After this: completed slices merge into the milestone branch via `--no-ff` instead of squash. The `.gsd/` auto-resolve conflict code in `mergeSliceToMain` is bypassed in worktree mode. `git log` on the milestone branch shows full commit history with merge commit boundaries per slice. Verified in temp repo.
- [ ] **S03: Milestone-to-main squash merge + worktree teardown** `risk:high` `depends:[S01,S02]`
> After this: `complete-milestone` squash-merges the milestone branch to main with a rich commit message listing all slices, removes the worktree, `chdir`s back to the main project root. `git log main` shows one clean commit. Auto-push works if enabled. Verified in temp repo with remote.
- [ ] **S04: Preferences + backwards compatibility** `risk:medium` `depends:[S01]`
> After this: `git.isolation: "worktree"` (default for new projects) / `"branch"` (existing projects) and `git.merge_to_main: "milestone"` / `"slice"` preferences are validated and respected. An existing project with `gsd/*` branches defaults to branch mode and works identically to today. Verified by running tests in both modes.
- [ ] **S05: Self-healing git repair** `risk:medium` `depends:[S01,S02,S03]`
> After this: when a merge fails or checkout breaks during auto-mode, the system aborts the failed operation, resets working tree state, and retries. Only truly unresolvable conflicts (real code conflicts between human-edited files) pause auto-mode. Users see non-technical messages, not raw git errors. Verified by deliberately introducing failures and confirming auto-recovery.
- [ ] **S06: Doctor + cleanup + code simplification** `risk:low` `depends:[S01,S02,S03,S05]`
> After this: `/gsd doctor` detects orphaned auto-worktrees, stale milestone branches, corrupt merge state (MERGE_HEAD/SQUASH_MSG), and tracked runtime files — and fixes them. Dead `.gsd/` conflict resolution code removed from worktree-mode paths in git-service.ts. Verified via doctor test cases.
- [ ] **S07: Test suite for worktree-isolated flow** `risk:low` `depends:[S01,S02,S03,S04,S05,S06]`
> After this: full test coverage for auto-worktree create/teardown, `--no-ff` slice merge, milestone squash, preference switching, self-heal scenarios, doctor checks. All existing git tests still pass. Both isolation modes tested. Verified via `npm run test:unit && npm run test:integration`.
<!--
Format rules (parsers depend on this exact structure):
- Checkbox line: - [ ] **S01: Title** `risk:high|medium|low` `depends:[S01,S02]`
- Demo line: > After this: one sentence showing what's demoable
- Mark done: change [ ] to [x]
- Order slices by risk (highest first)
- Each slice must be a vertical, demoable increment — not a layer
- If all slices are completed exactly as written, the milestone's promised outcome should actually work at the stated proof level
- depends:[X,Y] means X and Y must be done before this slice starts
-->
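
As an illustration of the format rules in the comment above, a checkbox line could be parsed roughly as follows. This is a sketch, not the project's actual parser: `parseSliceLine`, `SliceEntry`, and the regular expression are assumptions derived only from the stated line format.

```typescript
// Hypothetical parser for one roadmap checkbox line, e.g.
// - [ ] **S02: Title** `risk:high` `depends:[S01]`
interface SliceEntry {
  id: string;
  title: string;
  done: boolean;
  risk: "high" | "medium" | "low";
  depends: string[];
}

const SLICE_LINE =
  /^- \[( |x)\] \*\*(S\d+): (.+?)\*\* `risk:(high|medium|low)` `depends:\[([^\]]*)\]`$/;

function parseSliceLine(line: string): SliceEntry | null {
  const match = SLICE_LINE.exec(line.trim());
  if (!match) return null;
  const [, mark = " ", id = "", title = "", risk = "low", deps = ""] = match;
  return {
    id,
    title,
    done: mark === "x",
    risk: risk as SliceEntry["risk"],
    // An empty depends:[] yields an empty array.
    depends: deps
      .split(",")
      .map((dep) => dep.trim())
      .filter((dep) => dep.length > 0),
  };
}
```

Returning `null` for non-matching lines lets a caller skip the `> After this:` demo lines and any prose without special cases.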
## Boundary Map
### S01 → S02, S03, S04, S05
Produces:
- `createAutoWorktree(basePath, milestoneId)` — creates worktree, returns worktree path
- `teardownAutoWorktree(basePath, milestoneId)` — removes worktree, returns to main tree
- `isInAutoWorktree(basePath)` → boolean — detects if currently in an auto-worktree
- `getAutoWorktreePath(basePath, milestoneId)` → string | null — resolves worktree path
- `enterAutoWorktree(basePath, milestoneId)` — `process.chdir` into existing worktree
- Updated `startAuto()` in auto.ts that creates/enters worktree on milestone start
- Updated pause/resume logic that re-enters worktree on resume
Consumes:
- nothing (first slice)
### S01 → S02
Produces:
- The worktree infrastructure that S02 merges slices within
Consumes:
- nothing (first slice)
### S02 → S03
Produces:
- `mergeSliceToMilestone(basePath, milestoneId, sliceId, sliceTitle)` — `--no-ff` merge of slice branch into milestone branch within worktree
- Simplified merge path that skips `.gsd/` conflict resolution in worktree mode
Consumes from S01:
- `isInAutoWorktree()` to determine which merge strategy to use
### S02 → S06
Produces:
- Knowledge of which conflict resolution code is dead in worktree mode
Consumes from S01:
- Worktree detection functions
### S03 → S05
Produces:
- `mergeMilestoneToMain(basePath, milestoneId)` — squash-merge milestone branch to main
- `buildMilestoneCommitMessage(milestoneId, milestoneTitle, slices)` — rich squash commit
Consumes from S01:
- `teardownAutoWorktree()` for worktree removal after merge
- `isInAutoWorktree()` for detection
Consumes from S02:
- Merged milestone branch with `--no-ff` slice history
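
One possible shape for the rich squash commit message is sketched below. The parameter list follows `buildMilestoneCommitMessage(milestoneId, milestoneTitle, slices)` as listed above, but the subject format, the `Slices:` heading, and the `SliceSummary` type are assumptions made for illustration.

```typescript
// Hypothetical sketch of a milestone squash commit message builder.
// The message layout is assumed; only the parameter list comes from the roadmap.
interface SliceSummary {
  id: string;
  title: string;
}

function buildMilestoneCommitMessageSketch(
  milestoneId: string,
  milestoneTitle: string,
  slices: SliceSummary[],
): string {
  const subject = `feat(${milestoneId}): ${milestoneTitle}`;
  if (slices.length === 0) return subject;
  // One bullet per slice so the squash commit still records what was merged.
  const sliceLines = slices.map((slice) => `- ${slice.id}: ${slice.title}`).join("\n");
  return `${subject}\n\nSlices:\n${sliceLines}`;
}
```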
### S04 → S01, S02, S03
Produces:
- `git.isolation` preference — `"worktree"` | `"branch"`
- `git.merge_to_main` preference — `"milestone"` | `"slice"`
- `shouldUseWorktreeIsolation(basePath)` — resolves effective isolation mode
- Preference validation in `preferences.ts`
Consumes from S01:
- Auto-worktree functions (gated by isolation preference)
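
A minimal sketch of how the two preferences might be validated and defaulted. It assumes that projects with existing `gsd/*` branches default to `"branch"` (as S04 describes) and, as a further assumption not stated in the roadmap, that `merge_to_main` defaults to `"milestone"` in worktree mode and `"slice"` in branch mode. `resolveGitPreferences` is a hypothetical name, not the function in `preferences.ts`.

```typescript
// Hypothetical sketch of preference validation for git.isolation and git.merge_to_main.
type GitIsolation = "worktree" | "branch";
type MergeToMain = "milestone" | "slice";

interface ResolvedGitPreferences {
  isolation: GitIsolation;
  merge_to_main: MergeToMain;
  errors: string[];
}

function resolveGitPreferences(
  raw: { isolation?: unknown; merge_to_main?: unknown },
  hasLegacySliceBranches: boolean,
): ResolvedGitPreferences {
  const errors: string[] = [];

  // Default: branch mode for existing gsd/* projects, worktree mode otherwise.
  let isolation: GitIsolation = hasLegacySliceBranches ? "branch" : "worktree";
  const rawIsolation = raw.isolation;
  if (rawIsolation === "worktree" || rawIsolation === "branch") {
    isolation = rawIsolation;
  } else if (rawIsolation !== undefined) {
    errors.push('git.isolation must be "worktree" or "branch"');
  }

  // Assumption: the merge default follows the effective isolation mode.
  let mergeToMain: MergeToMain = isolation === "worktree" ? "milestone" : "slice";
  const rawMerge = raw.merge_to_main;
  if (rawMerge === "milestone" || rawMerge === "slice") {
    mergeToMain = rawMerge;
  } else if (rawMerge !== undefined) {
    errors.push('git.merge_to_main must be "milestone" or "slice"');
  }

  return { isolation, merge_to_main: mergeToMain, errors };
}
```

Invalid values fall back to the default and are reported in `errors`, so a typo in preferences would not halt auto-mode in this sketch.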
### S05 → S06
Produces:
- Structured git error handling patterns (try/abort/reset/retry)
- User-facing error message formatting
Consumes from S01:
- Worktree detection (to scope repair to correct working tree)
Consumes from S02:
- Merge operations that may fail
Consumes from S03:
- Milestone merge that may fail
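
The try/abort/reset/retry pattern can be sketched independently of git by injecting the repair steps. Everything here is hypothetical: `runWithSelfHeal`, `RepairSteps`, the retry count, and the user-facing message are illustrations. In a real implementation the repair steps might wrap commands such as `git merge --abort` and `git reset --hard`, but that mapping is an assumption.

```typescript
// Hypothetical sketch of the try/abort/reset/retry pattern from S05.
interface RepairSteps {
  abort: () => void; // e.g. abort an in-progress merge (assumed)
  reset: () => void; // e.g. restore a clean working tree (assumed)
}

type HealResult<T> =
  | { ok: true; value: T; attempts: number }
  | { ok: false; attempts: number; userMessage: string };

function runWithSelfHeal<T>(
  operation: () => T,
  repair: RepairSteps,
  maxAttempts: number = 2,
): HealResult<T> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return { ok: true, value: operation(), attempts: attempt };
    } catch {
      // Leave the repository clean before retrying or giving up.
      repair.abort();
      repair.reset();
    }
  }
  return {
    ok: false,
    attempts: maxAttempts,
    // Non-technical wording, as S05 requires; the exact text is assumed.
    userMessage: "Could not combine the latest changes automatically. Auto-mode is paused.",
  };
}
```

Running the repair steps after the final failed attempt as well means a pause always hands the user a clean working tree.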
### S06 → S07
Produces:
- Doctor git health check functions
- Simplified git-service.ts with dead code removed
Consumes from S05:
- Error handling patterns for doctor fix operations