docs(M003): context, requirements, and roadmap
parent ac33781fd0
commit 24597873a0
6 changed files with 947 additions and 0 deletions
41
.gsd/DECISIONS.md
Normal file
@@ -0,0 +1,41 @@
# Decisions Register

<!-- Append-only. Never edit or remove existing rows.
To reverse a decision, add a new row that supersedes it.
Read this file at the start of any planning or research phase. -->

| # | When | Scope | Decision | Choice | Rationale | Revisable? |
|---|------|-------|----------|--------|-----------|------------|
| D001 | M001 | arch | Secret collection insertion point | At `/gsd auto` entry (startAuto), not as a dispatch unit type | Keeps the state machine untouched. Collection is a one-time gate, not a repeating unit. Simpler, less risk of dispatch loop bugs. | Yes, if collection needs to happen mid-milestone |
| D002 | M001 | convention | Manifest file naming | `M00x-SECRETS.md` via existing `resolveMilestoneFile(base, mid, "SECRETS")` | Consistent with all other milestone-level files (CONTEXT, ROADMAP, RESEARCH). No new path resolver needed. | No |
| D003 | M001 | pattern | Summary screen interactivity | Read-only with auto-skip (no interactive deselection) | Matches the "walk away" philosophy. Simpler UX, fewer edge cases. User can always re-run collection. | Yes, if users request deselection |
| D004 | M001 | pattern | Guidance display placement | Same page as masked input (above the editor) | Single page per key, no extra navigation. User sees guidance while entering the value. | Yes, if terminal height constraints cause problems |
| D005 | M001 | convention | Manifest format | Markdown with H3 sections per key, bold fields, numbered guidance | Consistent with all other .gsd files. Parser and formatter already exist in files.ts. | No |
| D006 | M001 | arch | Destination inference | Reuse existing `detectDestination()` from get-secrets-from-user.ts | Simple file-presence checks (vercel.json → Vercel, convex/ → Convex, default → .env). Already proven. | Yes, if per-key destination override needed |
| D007 | M002 | arch | File structure after module split | Split index.ts into state.ts, lifecycle.ts, capture.ts, settle.ts, refs.ts, utils.ts, evaluate-helpers.ts, and tools/ directory | 5000-line monolith is unmaintainable; module boundaries enable safe changes. core.js already established the pattern. | No |
| D008 | M002 | library | Image resizing library | sharp | Fast, well-maintained, standard Node image processing. Replaces fragile canvas-based approach that depends on page context. | No |
| D009 | M002 | convention | Navigate screenshot default | Off by default, opt-in via parameter | Big token savings. Agent uses browser_screenshot explicitly when visual verification needed. | Yes, if agents consistently need screenshots on navigate |
| D010 | M002 | arch | Browser-side utility injection | page.addInitScript under window.__pi namespace | Survives navigation, available before page scripts, namespaced to avoid collisions. | Yes, if timing issues discovered |
| D011 | M002 | convention | Intent resolution approach | Deterministic heuristics only, no LLM calls | Predictable latency and cost. Scoring functions are testable and debuggable. | Yes, if heuristic coverage proves insufficient |
| D012 | M002 | convention | Browser reuse across sessions | Skip completely | Architecturally different from within-session work; user directed to exclude entirely. | No |
| D013 | M002/S01 | pattern | Mutable state accessor pattern | get/set functions for all 18 state variables, not `export let` | ES module live bindings break under jiti's CJS shim. Accessors guarantee consumers see mutations. | No |
| D014 | M002/S01 | pattern | ToolDeps interface location | Defined in state.ts alongside types it references | Keeps the dependency graph simple: tool files import state.ts for ToolDeps + types. | Yes, could move to separate types.ts if state.ts grows |
| D015 | M002/S01 | pattern | Factory pattern for lifecycle-dependent utils | createGetLivePagesSnapshot(ensureBrowser) instead of direct import | Avoids circular dependency between utils.ts and lifecycle.ts. Wired at orchestrator level. | No |
| D016 | M002/S01 | pattern | Tool file import strategy | Tool files import state accessors and core.js functions directly; ToolDeps carries only infrastructure functions needing lifecycle wiring | Keeps ToolDeps lean. State accessors are stable imports, not runtime-wired dependencies. Avoids bloating the deps interface with every utility. | Yes, if ToolDeps grows unwieldy |
| D017 | M002/S02 | pattern | Action tool signal classification | High-signal: click, type, key_press, select_option, set_checked, navigate, click_ref, fill_ref. Low-signal: scroll, hover, drag, upload_file, hover_ref. | High-signal tools produce meaningful page changes worth capturing body text for diffs. Low-signal tools don't change page content. fill_ref is high-signal because input value changes affect form state. | Yes, if new tools need reclassification |
| D018 | M002/S02 | pattern | postActionSummary retention | Keep postActionSummary in capture.ts for summary-only tools (go_back, go_forward, reload) but remove from action tools that do before/after diff | Summary-only tools don't do diffs and don't need beforeState; postActionSummary is the right abstraction for them. Action tools need consolidated capture. | Yes, could remove entirely if summary-only tools get before/after diff |
| D019 | M002/S02 | tuning | Zero-mutation settle thresholds | 60ms detection window, 30ms shortened quiet window, totalMutationsSeen === 0 required | Conservative thresholds: 60ms is enough time for any async DOM update to start, 30ms shortened window still catches late mutations. Requiring zero total mutations (not just current poll) prevents false short-circuits. | Yes, if real-world testing shows 60ms is too short for slow SPAs |
| D020 | M002/S04 | pattern | Form analysis evaluate location | Form analysis evaluate logic lives in tools/forms.ts, not extracted to evaluate-helpers.ts | Form-specific, not a shared utility. The label resolution heuristic is only used by form tools. Keeping it local avoids bloating the shared injection. | Yes, if S05 intent tools need label resolution |
| D021 | M002/S04 | pattern | Fill uses Playwright APIs, not evaluate | browser_fill_form uses Playwright locator.fill()/selectOption()/setChecked() instead of page.evaluate() value setting | Playwright APIs trigger proper input/change events and handle framework-specific reactivity (React, Vue). Direct value setting via evaluate skips event dispatch and breaks reactive frameworks. | No |
| D022 | M002/S04 | pattern | Fill field matching priority | Label (exact → case-insensitive) → name → placeholder → aria-label | Label is the most human-readable identifier. Name is the most reliable programmatic identifier. Placeholder and aria-label are fallbacks. Exact match before fuzzy prevents wrong-field fills. | Yes, if real-world usage shows a different priority works better |
| D023 | M002/S05 | pattern | Intent scoring model | 4 orthogonal dimensions per intent, each 0-1, summed and clamped | Consistent scoring structure across all 8 intents. Makes scoring testable and debuggable: each dimension has a named reason. 4 dimensions balance discrimination vs complexity. | Yes, could add/remove dimensions per intent if real-world usage shows imbalance |
| D024 | M002/S05 | pattern | search_field action type | Focus instead of click for search_field intent in browser_act | Search fields need keyboard focus for typing, not a click that might submit or toggle. Focus is the semantically correct action. Other intents use click. | Yes, if focus proves unreliable on specific input implementations |
| D025 | M002/S06 | pattern | Test import strategy for browser-tools | jiti CJS imports instead of ESM resolve-ts hook | The resolve-ts ESM hook breaks on core.js (plain .js file imported by TS modules). jiti handles mixed .ts/.js imports correctly from a .cjs test file. | No |
| D026 | M002/S06 | pattern | Testing module-private functions | Source extraction via readFileSync + brace-match + strip types + eval | Avoids exporting test-only APIs from production modules. Fragile to refactors but tests fail clearly when extraction breaks. Acceptable tradeoff for test code. | Yes, if private functions get exported for other reasons |
| D027 | M003 | arch | Git isolation model | Worktree-per-milestone (default for new projects) | Eliminates .gsd/ merge conflicts structurally. Each milestone gets its own worktree with isolated .gsd/ state. Branch-per-slice remains as opt-in legacy mode via git.isolation: "branch". | No |
| D028 | M003 | arch | Slice merge strategy within worktree | --no-ff merge (not squash) | Preserves full commit history as a diary of agent work. Merge commits give natural slice boundaries. Squash would destroy per-task granularity. | Yes, if commit noise proves problematic |
| D029 | M003 | arch | Milestone-to-main merge strategy | Squash merge | Main gets one clean commit per milestone. Individually revertable. Reads like a changelog. Full history preserved on milestone branch for forensics. | No |
| D030 | M003 | arch | Failure handling philosophy | Stop but self-heal | Auto-mode pauses, runs automatic repair (abort, reset, retry), resumes without user intervention in most cases. Only truly ambiguous conflicts need a human. Balances continuity with trust. | Yes, if self-heal proves unreliable |
| D031 | M003 | arch | Target user priority | Vibe coder first | Zero git errors as the default. Senior engineers configure overrides. Biggest market opportunity is users who can't use git today. | No |
| D032 | M003 | convention | Auto-worktree naming | Milestone ID as worktree name, milestone/<MID> as branch | .gsd/worktrees/M003/ with branch milestone/M003. Manual worktrees use worktree/<name> branches. No collision between auto and manual. | Yes, if naming conflicts discovered |
| D033 | M003 | arch | Migration strategy | New projects default to worktree; existing keep branch-per-slice | Detection: if project has gsd/* branches or milestone META with integration branch → legacy. Otherwise → worktree. No forced migration. | Yes, if adoption shows users want migration tooling |
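As an illustration of the matching priority recorded in D022, the following is a minimal sketch, not the code in `tools/forms.ts`. The `FieldInfo` shape and the `matchField` name are hypothetical, and it assumes that name, placeholder, and aria-label are compared exactly, which the register does not state.

```typescript
// Hypothetical field descriptor; the real shape in tools/forms.ts may differ.
interface FieldInfo {
  label?: string;
  name?: string;
  placeholder?: string;
  ariaLabel?: string;
}

// Try each rule in priority order and return the first field that any rule accepts.
function matchField(fields: FieldInfo[], key: string): FieldInfo | undefined {
  const lower = key.toLowerCase();
  const rules: Array<(f: FieldInfo) => boolean> = [
    (f) => f.label === key, // 1. exact label
    (f) => f.label !== undefined && f.label.toLowerCase() === lower, // 2. case-insensitive label
    (f) => f.name === key, // 3. name attribute (assumed exact)
    (f) => f.placeholder === key, // 4. placeholder (assumed exact)
    (f) => f.ariaLabel === key, // 5. aria-label (assumed exact)
  ];
  for (const rule of rules) {
    for (const field of fields) {
      if (rule(field)) return field;
    }
  }
  return undefined;
}
```

Running every field through one rule before moving to the next rule is what lets an exact label match win over an earlier field that only matches case-insensitively.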
43
.gsd/PROJECT.md
Normal file
@@ -0,0 +1,43 @@
|
|||
# Project
|
||||
|
||||
## What This Is
|
||||
|
||||
A pi coding agent extension (GSD — "Get Stuff Done") that provides structured planning, auto-mode execution, and project management for autonomous coding sessions. Includes proactive secret management, browser automation tools for UI verification, and worktree-isolated git architecture for zero-friction autonomous execution.
|
||||
|
||||
## Core Value
|
||||
|
||||
Auto-mode runs from start to finish without blocking. Git is invisible — no merge conflicts, no checkout errors, no state corruption. The system is automagical for vibe coders and configurable for senior engineers.
|
||||
|
||||
## Current State
|
||||
|
||||
The GSD extension is fully functional with:
|
||||
- Milestone/slice/task planning hierarchy
|
||||
- Auto-mode state machine with fresh-session-per-unit dispatch
|
||||
- Guided `/gsd` wizard flow
|
||||
- `secure_env_collect` tool with masked TUI input, multi-destination write support, guidance display, and summary screen
|
||||
- Proactive secret management: planning prompts forecast secrets, manifests persist them, auto-mode collects them before first dispatch
|
||||
- Browser-tools extension with 47 registered tools covering navigation, interaction, inspection, verification, tracing, debugging, form intelligence (browser_analyze_form, browser_fill_form), and intent-ranked retrieval and semantic actions (browser_find_best, browser_act)
|
||||
- Browser-tools `core.js` with shared utilities for action timeline, page registry, state diffing, assertions, fingerprinting
|
||||
- Branch-per-slice git model with squash merge to main (being superseded by worktree-isolated model in M003)
|
||||
|
||||
## Architecture / Key Patterns
|
||||
|
||||
- **Extension model**: pi extensions register tools, commands, hooks via `ExtensionAPI`
|
||||
- **State machine**: `auto.ts` drives `dispatchNextUnit()` which reads disk state and dispatches fresh sessions
|
||||
- **Secrets gate**: `startAuto()` checks `getManifestStatus()` before first dispatch
|
||||
- **Disk-driven state**: `.gsd/` files are the source of truth, `STATE.md` is derived cache
|
||||
- **File parsing**: `files.ts` has markdown parsers for all GSD file types
|
||||
- **Browser-tools**: Modular structure: slim `index.ts` orchestrator, 7 focused infrastructure modules (state.ts, utils.ts, evaluate-helpers.ts, lifecycle.ts, capture.ts, settle.ts, refs.ts), 11 categorized tool files under `tools/` (including forms.ts, intent.ts), shared infrastructure in `core.js` (~1000 lines). Browser-side utilities injected once via `addInitScript` under `window.__pi` namespace. Uses Playwright for browser control. Accessibility-first state representation, deterministic versioned refs, adaptive DOM settling, compact post-action summaries. Form tools use Playwright locator APIs for type-aware filling with structured result reporting. Intent tools use deterministic 4-dimension heuristic scoring for element retrieval and one-call semantic actions.
|
||||
- **Prompt templates**: `prompts/` directory with mustache-like `{{var}}` substitution
|
||||
- **TUI components**: `@gsd/pi-tui` provides `Editor`, `Text`, key handling, themes
|
||||
- **Git architecture**: Worktree-per-milestone isolation (default for new projects). Each milestone gets its own git worktree with isolated `.gsd/` state. Slices merge via `--no-ff` into the milestone branch (preserving full commit history). Milestones squash-merge to main on completion. Legacy branch-per-slice model supported via `git.isolation: "branch"` preference.
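
The mustache-like `{{var}}` substitution listed under prompt templates can be sketched as below. This is an assumption about how such substitution might work, not the project's loader: `renderTemplate` is a hypothetical name, and leaving unknown placeholders untouched is a choice made for this example.

```typescript
// Replace each {{name}} placeholder with its value; unknown names are left as-is.
function renderTemplate(template: string, vars: Record<string, string>): string {
  return template.replace(/\{\{(\w+)\}\}/g, (whole: string, name: string) => {
    const value = Object.prototype.hasOwnProperty.call(vars, name) ? vars[name] : undefined;
    return value === undefined ? whole : value;
  });
}

// Example: only the known variable is substituted.
const rendered = renderTemplate("Write {{secretsOutputPath}} for {{mid}}", {
  secretsOutputPath: "M001-SECRETS.md",
});
```

Leaving unresolved placeholders visible makes a missing variable easy to spot in the rendered prompt; a stricter variant could throw instead.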
|
||||
|
||||
## Capability Contract
|
||||
|
||||
See `.gsd/REQUIREMENTS.md` for the explicit capability contract, requirement status, and coverage mapping.
|
||||
|
||||
## Milestone Sequence
|
||||
|
||||
- [x] M001: Proactive Secret Management — Front-loaded API key collection into planning so auto-mode runs uninterrupted (10 requirements validated)
|
||||
- [x] M002: Browser Tools Performance & Intelligence — Module decomposition, action pipeline optimization, sharp-based screenshots, form intelligence, intent-ranked retrieval, semantic actions, 108-test suite (12 requirements validated)
|
||||
- [ ] M003: Worktree-Isolated Git Architecture — Worktree-per-milestone isolation eliminating merge conflicts, self-healing git repair, zero git errors for vibe coders, configurable for senior engineers
|
||||
553
.gsd/REQUIREMENTS.md
Normal file
@@ -0,0 +1,553 @@
|
|||
# Requirements
|
||||
|
||||
This file is the explicit capability and coverage contract for the project.
|
||||
|
||||
## Active
|
||||
|
||||
### R029 — Auto-worktree creation on milestone start
|
||||
- Class: core-capability
|
||||
- Status: active
|
||||
- Description: When auto-mode starts a new milestone, it automatically creates a git worktree under `.gsd/worktrees/<MID>/` with branch `milestone/<MID>`, `chdir`s into it, and dispatches all units from within the worktree. The user never runs a git command.
|
||||
- Why it matters: Worktree isolation gives each milestone its own `.gsd/` directory, eliminating the entire category of `.gsd/` merge conflicts that have caused ~15 separate bug fixes to date.
|
||||
- Source: user
|
||||
- Primary owning slice: M003/S01
|
||||
- Supporting slices: none
|
||||
- Validation: unmapped
|
||||
- Notes: Must handle: fresh milestone (no worktree yet), resumed milestone (worktree already exists), milestone started from non-main branch. Must coexist with manual `/worktree` command.
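
A minimal sketch of the fresh-versus-resumed decision in R029, assuming the caller already knows which worktree paths exist. `planWorktreeAdd` is a hypothetical helper; it only builds arguments for the real `git worktree add -b <branch> <path>` command and does not run git. The case where the branch exists but the worktree does not is not handled here.

```typescript
// Build git arguments for a milestone's auto-worktree, or null when it already exists.
function planWorktreeAdd(mid: string, existingWorktreePaths: string[]): string[] | null {
  const path = `.gsd/worktrees/${mid}`;
  const branch = `milestone/${mid}`;
  // Resumed milestone: the worktree is already on disk, so nothing needs creating.
  if (existingWorktreePaths.indexOf(path) !== -1) return null;
  // Fresh milestone: create the branch and check it out at the worktree path.
  return ["worktree", "add", "-b", branch, path];
}
```

After a successful add (or when the function returns null), the caller would change directory into `.gsd/worktrees/<MID>/` before dispatching units.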
|
||||
|
||||
### R030 — Auto-worktree teardown + squash-merge on milestone complete
|
||||
- Class: core-capability
|
||||
- Status: active
|
||||
- Description: When a milestone completes, the milestone branch is squash-merged to main with a rich commit message, the worktree is removed, and the process working directory is restored to the main project root via `process.chdir`. Main receives exactly one commit per milestone.
|
||||
- Why it matters: Main stays clean and always represents completed, working milestones. One commit per milestone is individually revertable.
|
||||
- Source: user
|
||||
- Primary owning slice: M003/S03
|
||||
- Supporting slices: M003/S01
|
||||
- Validation: unmapped
|
||||
- Notes: Must handle: dirty worktree at teardown time (auto-commit first), failed squash-merge (self-heal), remote push after merge (if auto_push enabled).
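
The teardown order described in R030 could be planned roughly as follows. This is a sketch under assumptions: `planMilestoneTeardown` is a hypothetical helper, the auto-commit message is invented for illustration, and the steps are meant to run from the main project root with main checked out. Note that `git merge --squash` only stages the combined changes, so a separate `git commit` is needed.

```typescript
// Return the git argument lists for milestone completion, in execution order.
function planMilestoneTeardown(mid: string, message: string, dirty: boolean): string[][] {
  const branch = `milestone/${mid}`;
  const worktree = `.gsd/worktrees/${mid}`;
  const steps: string[][] = [];
  if (dirty) {
    // Commit leftover changes inside the worktree before merging (message is an assumption).
    steps.push(["-C", worktree, "add", "-A"]);
    steps.push(["-C", worktree, "commit", "-m", `chore(${mid}): auto-commit before teardown`]);
  }
  steps.push(["merge", "--squash", branch]); // stages the milestone's changes on main
  steps.push(["commit", "-m", message]); // the single milestone commit
  steps.push(["worktree", "remove", worktree]); // drop the worktree directory
  return steps;
}
```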
|
||||
|
||||
### R031 — `--no-ff` slice merges within milestone worktree
|
||||
- Class: core-capability
|
||||
- Status: active
|
||||
- Description: Completed slices merge into the milestone branch via `--no-ff` merge instead of squash. This preserves the full per-task commit history on the milestone branch, with merge commits providing natural slice boundaries.
|
||||
- Why it matters: The commit history is a diary of the agent's work. The LLM can read `git log` to understand what happened. Squashing slices destroys this granularity. `--no-ff` merge commits give clean slice boundaries while keeping all commits.
|
||||
- Source: user
|
||||
- Primary owning slice: M003/S02
|
||||
- Supporting slices: M003/S01
|
||||
- Validation: unmapped
|
||||
- Notes: This is the default for worktree-isolated mode. The branch-per-slice legacy model retains its existing squash default.
|
||||
|
||||
### R032 — Rich milestone-level squash commit message
|
||||
- Class: core-capability
|
||||
- Status: active
|
||||
- Description: When a milestone squash-merges to main, the commit message summarizes all slices and their key outcomes. Format: conventional commit subject + slice task list body + branch metadata.
|
||||
- Why it matters: Main's git log should read like a changelog. Each milestone commit should tell the full story of what was built.
|
||||
- Source: user
|
||||
- Primary owning slice: M003/S03
|
||||
- Supporting slices: none
|
||||
- Validation: unmapped
|
||||
- Notes: Similar to current rich commit message for slice merges, but at milestone level. Should list all slices with their titles and key outcomes.
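
One possible shape for the message format in R032 (conventional commit subject, slice list body, branch metadata) is sketched below. The `SliceSummary` fields, the `feat` type, and the `Branch:` trailer are all assumptions made for this example; the requirement does not fix them.

```typescript
// Hypothetical per-slice summary used to build the commit body.
interface SliceSummary {
  id: string;
  title: string;
  outcome: string;
}

// Assemble subject line, one bullet per slice, and a branch trailer.
function buildMilestoneCommitMessage(mid: string, title: string, slices: SliceSummary[]): string {
  const lines: string[] = [`feat(${mid}): ${title}`, ""];
  for (const s of slices) {
    lines.push(`- ${s.id}: ${s.title} (${s.outcome})`);
  }
  lines.push("", `Branch: milestone/${mid}`);
  return lines.join("\n");
}
```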
|
||||
|
||||
### R033 — `git.isolation` preference
|
||||
- Class: core-capability
|
||||
- Status: active
|
||||
- Description: A `git.isolation` preference with values `"worktree"` (default for new projects) and `"branch"` (legacy model). New projects that have never run GSD default to worktree isolation. Existing projects with an established branch-per-slice history default to branch mode.
|
||||
- Why it matters: Backwards compatibility — existing projects must not break. New projects get the better model by default.
|
||||
- Source: user
|
||||
- Primary owning slice: M003/S04
|
||||
- Supporting slices: none
|
||||
- Validation: unmapped
|
||||
- Notes: Detection heuristic: if the project has existing `gsd/*` branches or milestone metadata with integration branch records, it's a legacy project → default to "branch". Otherwise → default to "worktree".
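
The detection heuristic in the note above can be expressed as a small pure function. This is a sketch: `defaultIsolation` is a hypothetical name, and how the branch list and the integration-branch flag are obtained is left to the caller.

```typescript
type Isolation = "worktree" | "branch";

// Legacy signals (gsd/* branches or an integration-branch record) select "branch" mode.
function defaultIsolation(branches: string[], hasIntegrationBranchRecord: boolean): Isolation {
  for (const b of branches) {
    if (b.indexOf("gsd/") === 0) return "branch";
  }
  return hasIntegrationBranchRecord ? "branch" : "worktree";
}
```

An explicit `git.isolation` value in preferences would presumably override this default; the function only covers the case where no preference is set.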
|
||||
|
||||
### R034 — `git.merge_to_main` preference
|
||||
- Class: core-capability
|
||||
- Status: active
|
||||
- Description: A `git.merge_to_main` preference with values `"milestone"` (default) and `"slice"`. In milestone mode, main only receives commits when milestones complete. In slice mode, each completed slice squash-merges to main immediately (current behavior).
|
||||
- Why it matters: Senior engineers who want frequent integration can opt into slice-level merges. Vibe coders get the cleaner milestone-level default.
|
||||
- Source: user
|
||||
- Primary owning slice: M003/S04
|
||||
- Supporting slices: M003/S03
|
||||
- Validation: unmapped
|
||||
- Notes: `merge_to_main: "slice"` with `isolation: "worktree"` is valid — slices squash-merge to main from within the worktree, but the worktree still provides `.gsd/` isolation.
|
||||
|
||||
### R035 — Self-healing git repair on failure
|
||||
- Class: core-capability
|
||||
- Status: active
|
||||
- Description: When git operations fail during auto-mode (merge conflict, checkout failure, corrupt state), the system automatically attempts repair: abort incomplete merges, reset working tree, retry the operation. Only truly unresolvable conflicts (two humans edited the same code) pause auto-mode with a clear explanation.
|
||||
- Why it matters: The north star is "automagical — just runs." Git errors are the #1 cause of auto-mode halting. Self-healing eliminates most of those stops.
|
||||
- Source: user
|
||||
- Primary owning slice: M003/S05
|
||||
- Supporting slices: M003/S01, M003/S02, M003/S03
|
||||
- Validation: unmapped
|
||||
- Notes: The worktree model eliminates most `.gsd/` conflicts structurally. Self-healing handles the remaining edge cases (code conflicts, remote divergence, corrupt index).
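
The abort, reset, retry loop in R035 might look like the sketch below. `GitRunner` and `runWithSelfHeal` are hypothetical, the runner is injected so the logic can be exercised without a repository, and using `git reset --hard HEAD` as the reset step is an assumption. That command discards uncommitted changes, so a real implementation would need to be sure nothing valuable is lost.

```typescript
// Hypothetical runner: executes `git <args>` and reports success.
type GitRunner = (args: string[]) => boolean;

// Try the operation; on failure, repair the repository state and try again.
function runWithSelfHeal(run: GitRunner, operation: string[], maxRetries: number): boolean {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    if (run(operation)) return true;
    // Repair: abort any half-finished merge (failure here is ignored), then reset the tree.
    run(["merge", "--abort"]);
    run(["reset", "--hard", "HEAD"]);
  }
  return false; // still failing after repair: the caller pauses auto-mode
}
```

Because the repair also runs after the final failed attempt, the repository is left in a clean state when auto-mode pauses.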
|
||||
|
||||
### R036 — `.gsd/` conflict resolution elimination
|
||||
- Class: quality-attribute
|
||||
- Status: active
|
||||
- Description: The ~60 lines of `.gsd/` auto-resolve conflict code in `mergeSliceToMain` and the ~44 merge-related recovery paths in `auto.ts` are simplified or removed. Worktree isolation makes most of this code structurally unnecessary.
|
||||
- Why it matters: Dead conflict resolution code is maintenance burden and a source of bugs. If the architecture eliminates the problem, the code that patches it should go.
|
||||
- Source: inferred
|
||||
- Primary owning slice: M003/S02
|
||||
- Supporting slices: M003/S06
|
||||
- Validation: unmapped
|
||||
- Notes: Only remove code that is genuinely unnecessary in worktree mode. Keep the legacy branch-per-slice path intact for `git.isolation: "branch"` users.
|
||||
|
||||
### R037 — Zero git errors for vibe coders
|
||||
- Class: primary-user-loop
|
||||
- Status: active
|
||||
- Description: Users with zero git knowledge should never see a git error message during auto-mode. All git operations are invisible. If something fails, the system self-heals or presents a non-technical explanation with a clear action ("Run `/gsd doctor` to fix this").
|
||||
- Why it matters: Vibe coders are the primary market. Git errors are incomprehensible to them and destroy trust in the system.
|
||||
- Source: user
|
||||
- Primary owning slice: M003/S05
|
||||
- Supporting slices: all M003 slices
|
||||
- Validation: unmapped
|
||||
- Notes: This is a quality bar, not a single feature. Every git-touching codepath must handle errors gracefully.
|
||||
|
||||
### R038 — Backwards compatibility with branch-per-slice model
|
||||
- Class: continuity
|
||||
- Status: active
|
||||
- Description: Existing projects that use the branch-per-slice model continue working exactly as they do today. No migration required. The old codepaths remain functional when `git.isolation: "branch"` is active.
|
||||
- Why it matters: Breaking existing users' workflows would destroy trust.
|
||||
- Source: user
|
||||
- Primary owning slice: M003/S04
|
||||
- Supporting slices: none
|
||||
- Validation: unmapped
|
||||
- Notes: All existing git-service.ts tests must continue passing in branch mode.
|
||||
|
||||
### R039 — Manual `/worktree` coexistence with auto-worktrees
|
||||
- Class: integration
|
||||
- Status: active
|
||||
- Description: The manual `/worktree` command for exploration coexists with auto-mode's milestone worktrees. Different naming conventions prevent conflicts: auto-worktrees use `milestone/M003` branches, manual worktrees use `worktree/<name>` branches.
|
||||
- Why it matters: Manual worktrees are a valuable exploration tool. They shouldn't be broken by auto-mode's worktree usage.
|
||||
- Source: user
|
||||
- Primary owning slice: M003/S01
|
||||
- Supporting slices: none
|
||||
- Validation: unmapped
|
||||
- Notes: Auto-worktrees are created under `.gsd/worktrees/` just like manual ones, but with milestone ID as the name. The naming convention prevents branch collisions.
|
||||
|
||||
### R040 — Doctor git health checks
|
||||
- Class: operability
|
||||
- Status: active
|
||||
- Description: `/gsd doctor` detects and optionally fixes git-related issues: orphaned auto-worktrees, stale milestone branches, corrupt merge state (MERGE_HEAD/SQUASH_MSG), tracked runtime files, missing gitignore patterns.
|
||||
- Why it matters: When things do go wrong, users need a one-command fix. Doctor is the safety net.
|
||||
- Source: inferred
|
||||
- Primary owning slice: M003/S06
|
||||
- Supporting slices: M003/S05
|
||||
- Validation: unmapped
|
||||
- Notes: Doctor already handles planning artifact issues. This extends it to git health.
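
As a sketch of the corrupt-merge-state check in R040, the function below looks for leftover `MERGE_HEAD` and `SQUASH_MSG` files in a git directory. `findCorruptMergeState` is a hypothetical name and the existence check is injected as a parameter, so the example makes no claim about how `/gsd doctor` actually reads the filesystem.

```typescript
// Hypothetical existence check, e.g. backed by the filesystem in real use.
type Exists = (path: string) => boolean;

// Return the names of merge-state marker files still present in the git directory.
function findCorruptMergeState(gitDir: string, exists: Exists): string[] {
  const markers = ["MERGE_HEAD", "SQUASH_MSG"];
  const found: string[] = [];
  for (const marker of markers) {
    if (exists(`${gitDir}/${marker}`)) found.push(marker);
  }
  return found;
}
```

Inside a worktree, `.git` is a file rather than a directory, so the actual git directory should be resolved first, for example with `git rev-parse --git-dir`.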
|
||||
|
||||
### R041 — Test coverage for worktree-isolated flow
|
||||
- Class: quality-attribute
|
||||
- Status: active
|
||||
- Description: Test suite covers: auto-worktree create/teardown, `--no-ff` slice merge within worktree, milestone squash to main, preference switching between isolation modes, self-heal scenarios, doctor git checks. All existing git tests continue passing.
|
||||
- Why it matters: The git system is the most bug-prone part of GSD. Tests prevent regressions.
|
||||
- Source: inferred
|
||||
- Primary owning slice: M003/S07
|
||||
- Supporting slices: all M003 slices
|
||||
- Validation: unmapped
|
||||
- Notes: Must test both worktree and branch isolation modes.
|
||||
|
||||
## Validated
|
||||
|
||||
### R001 — Secret forecasting during milestone planning
|
||||
- Class: core-capability
|
||||
- Status: validated
|
||||
- Description: When a milestone is planned, the LLM analyzes slices for external service dependencies and writes a secrets manifest listing every predicted API key with setup guidance.
|
||||
- Why it matters: Without forecasting, auto-mode discovers missing keys mid-execution and blocks for hours waiting for user input.
|
||||
- Source: user
|
||||
- Primary owning slice: M001/S01
|
||||
- Supporting slices: none
|
||||
- Validation: plan-milestone.md Secret Forecasting section (line 62) instructs LLM to write manifest. Parser round-trip tested in parsers.test.ts.
|
||||
- Notes: The plan-milestone prompt has forecasting instructions. The manifest format and parser are implemented and tested.
|
||||
|
||||
### R002 — Secrets manifest persisted in .gsd/
|
||||
- Class: continuity
|
||||
- Status: validated
|
||||
- Description: The secrets manifest is a durable markdown file at `.gsd/milestones/M00x/M00x-SECRETS.md` that survives session boundaries and can be re-read by any future unit.
|
||||
- Why it matters: Collection may happen in a different session than planning. The manifest must persist on disk.
|
||||
- Source: user
|
||||
- Primary owning slice: M001/S01
|
||||
- Supporting slices: none
|
||||
- Validation: parseSecretsManifest/formatSecretsManifest round-trip tested (parsers.test.ts), resolveMilestoneFile(base, mid, "SECRETS") resolves path.
|
||||
- Notes: Parser/formatter implemented in files.ts. Template exists at templates/secrets-manifest.md.
|
||||
|
||||
### R003 — Step-by-step guidance per key
|
||||
- Class: primary-user-loop
|
||||
- Status: validated
|
||||
- Description: Each secret in the manifest includes numbered steps for obtaining the key (navigate to dashboard → create project → generate key → copy), a dashboard URL, and a format hint.
|
||||
- Why it matters: Users shouldn't have to figure out where to find each key. The guidance makes collection self-service.
|
||||
- Source: user
|
||||
- Primary owning slice: M001/S02
|
||||
- Supporting slices: M001/S01
|
||||
- Validation: collectOneSecret renders numbered dim-styled guidance steps with wrapping (collect-from-manifest.test.ts tests 6-8).
|
||||
- Notes: Guidance quality is LLM-dependent and best-effort.
|
||||
|
||||
### R004 — Summary screen before collection
|
||||
- Class: primary-user-loop
|
||||
- Status: validated
|
||||
- Description: Before collecting secrets one-by-one, show a read-only summary screen listing all needed keys with their status (pending / already set / skipped). Auto-skip keys that already exist in the environment.
|
||||
- Why it matters: The user needs to see the full picture before entering keys. Already-set keys should not require re-entry.
|
||||
- Source: user
|
||||
- Primary owning slice: M001/S02
|
||||
- Supporting slices: none
|
||||
- Validation: showSecretsSummary() renders read-only ctx.ui.custom screen with status indicators via makeUI().progressItem() (collect-from-manifest.test.ts tests 4-5).
|
||||
- Notes: Read-only with auto-skip — no interactive deselection.
|
||||
|
||||
### R005 — Existing key detection and silent skip
|
||||
- Class: primary-user-loop
|
||||
- Status: validated
|
||||
- Description: Before prompting for a key, check `.env` and `process.env`. If the key already exists, mark it as "already set" in the summary and skip collection.
|
||||
- Why it matters: Users shouldn't re-enter keys they've already configured. Prevents frustration and errors.
|
||||
- Source: user
|
||||
- Primary owning slice: M001/S02
|
||||
- Supporting slices: none
|
||||
- Validation: getManifestStatus cross-references checkExistingEnvKeys, categorizes env-present keys as existing (manifest-status.test.ts tests 4,7). collectSecretsFromManifest skips them (collect-from-manifest.test.ts tests 1-2).
|
||||
- Notes: `checkExistingEnvKeys()` implemented in get-secrets-from-user.ts.
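
The check described in R005 can be illustrated with a simplified stand-in. This is not the real `checkExistingEnvKeys()`, whose signature is not shown here; `findExistingKeys` is a hypothetical function that takes the `.env` text and an environment map as plain arguments, and it treats a key as present if it is defined at all.

```typescript
// Return the requested keys that are already defined in the .env text or the environment.
function findExistingKeys(
  keys: string[],
  dotenvText: string,
  env: Record<string, string | undefined>,
): string[] {
  const definedInFile: string[] = [];
  for (const line of dotenvText.split("\n")) {
    // Matches `KEY=` or `export KEY=`; commented-out lines do not match.
    const match = /^\s*(?:export\s+)?([A-Za-z_][A-Za-z0-9_]*)\s*=/.exec(line);
    if (match) definedInFile.push(match[1]);
  }
  const existing: string[] = [];
  for (const key of keys) {
    if (definedInFile.indexOf(key) !== -1 || env[key] !== undefined) existing.push(key);
  }
  return existing;
}
```

Keys returned by such a check would be shown as "already set" on the summary screen and skipped during collection.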
|
||||
|
||||
### R006 — Smart destination detection
|
||||
- Class: integration
|
||||
- Status: validated
|
||||
- Description: Automatically detect whether secrets should go to .env, Vercel, or Convex based on project file presence (vercel.json → Vercel, convex/ dir → Convex, default → .env).
|
||||
- Why it matters: Users shouldn't have to specify the destination manually. The system should do the right thing.
|
||||
- Source: user
|
||||
- Primary owning slice: M001/S02
|
||||
- Supporting slices: none
|
||||
- Validation: collectSecretsFromManifest calls detectDestination() for destination inference. applySecrets() routes to dotenv/vercel/convex accordingly.
|
||||
- Notes: `detectDestination()` implemented in get-secrets-from-user.ts.
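
The file-presence rules in R006 amount to a short decision chain. The sketch below is an illustration, not the real `detectDestination()`: `inferDestination` and its injected `exists` parameter are hypothetical, and checking `vercel.json` before `convex/` is an assumption based on the order the rules are listed in.

```typescript
type Destination = "vercel" | "convex" | "dotenv";

// Pick a destination from project file presence, falling back to .env.
function inferDestination(exists: (path: string) => boolean): Destination {
  if (exists("vercel.json")) return "vercel";
  if (exists("convex")) return "convex";
  return "dotenv";
}
```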
|
||||
|
||||
### R007 — Auto-mode collection at entry point
|
||||
- Class: core-capability
|
||||
- Status: validated
|
||||
- Description: When the user runs `/gsd auto`, check for a secrets manifest with pending keys. If found, collect them before dispatching the first slice. Collection happens once at the entry point, not as a dispatch unit.
|
||||
- Why it matters: This is the primary integration point — auto-mode must not start execution with uncollected secrets.
|
||||
- Source: user
|
||||
- Primary owning slice: M001/S03
|
||||
- Supporting slices: M001/S01, M001/S02
|
||||
- Validation: startAuto() secrets gate at auto.ts:479. auto-secrets-gate.test.ts — 3/3 pass covering null manifest, pending keys, and no-pending-keys paths.
|
||||
- Notes: Collection at entry point (startAuto), not as a separate unit type in dispatchNextUnit. D001 satisfied.
|
||||
|
||||
### R008 — Guided /gsd wizard integration
|
||||
- Class: core-capability
|
||||
- Status: validated
|
||||
- Description: After milestone planning in the guided `/gsd` flow, trigger secret collection if a manifest exists with pending keys.
|
||||
- Why it matters: Users who plan via the wizard should also get prompted for secrets before auto-mode begins.
|
||||
- Source: user
|
||||
- Primary owning slice: M001/S03
|
||||
- Supporting slices: M001/S01, M001/S02
|
||||
- Validation: guided-flow.ts calls startAuto() directly (lines 52, 486, 647, 794) — all guided flow paths that start auto-mode inherit the secrets gate.
|
||||
- Notes: The guided flow dispatches to startAuto after planning. Collection is inherited via the gate.
|
||||
|
||||
### R009 — Planning prompts instruct LLM to forecast secrets
|
||||
- Class: integration
|
||||
- Status: validated
|
||||
- Description: The plan-milestone prompt template includes instructions for the LLM to analyze slices for external service dependencies and write the secrets manifest.
|
||||
- Why it matters: Without prompt instructions, the LLM won't know to forecast secrets.
|
||||
- Source: user
|
||||
- Primary owning slice: M001/S01
|
||||
- Supporting slices: none
|
||||
- Validation: plan-milestone.md has Secret Forecasting section at line 62 with instructions to write {{secretsOutputPath}} with H3 sections per key.
|
||||
- Notes: Implemented in plan-milestone.md.
|
||||
|
||||
### R010 — secure_env_collect enhanced with guidance display
|
||||
- Class: primary-user-loop
|
||||
- Status: validated
|
||||
- Description: The secure_env_collect TUI renders multi-line guidance steps above the masked input field on the same page, so the user sees setup instructions while entering the key.
|
||||
- Why it matters: Without visible guidance, the user has to find keys on their own despite the LLM having generated instructions.
|
||||
- Source: user
|
||||
- Primary owning slice: M001/S02
|
||||
- Supporting slices: none
|
||||
- Validation: collectOneSecret accepts guidance parameter, renders numbered dim-styled lines with wrapTextWithAnsi above masked input (collect-from-manifest.test.ts tests 6-8).
|
||||
- Notes: The guidance field is rendered in collectOneSecret().
|
||||
|
||||
### R015 — Module decomposition of browser-tools
|
||||
- Class: quality-attribute
|
||||
- Status: validated
|
||||
- Description: The monolithic browser-tools index.ts (~5000 lines) is split into focused modules: shared infrastructure, tool groups, and browser-side utilities. All 43 existing tools continue to work identically.
|
||||
- Why it matters: A 5000-line file is unmaintainable and makes targeted changes risky. Module boundaries enable safe refactoring and new tool development.
|
||||
- Source: user
|
||||
- Primary owning slice: M002/S01
|
||||
- Supporting slices: none
|
||||
- Validation: Extension loads via jiti, 43 tools register, browser navigate/snapshot/click work against real page, index.ts is 47-line orchestrator with zero registerTool calls, 9 tool files under tools/.
|
||||
- Notes: core.js already exists with ~1000 lines of shared utilities. The split extends this pattern.
|
||||
|
||||
### R016 — Shared browser-side evaluate utilities
|
||||
- Class: quality-attribute
|
||||
- Status: validated
|
||||
- Description: Common functions duplicated across page.evaluate boundaries (cssPath, simpleHash, isVisible, isEnabled, inferRole, accessibleName) are injected once and referenced from all evaluate callbacks.
|
||||
- Why it matters: Currently buildRefSnapshot and resolveRefTarget each redeclare ~100 lines of identical utility code. Deduplication reduces payload size, improves maintainability, and ensures consistency.
|
||||
- Source: user
|
||||
- Primary owning slice: M002/S01
|
||||
- Supporting slices: none
|
||||
- Validation: window.__pi contains all 9 functions, survives navigation, refs.ts has zero inline redeclarations, close/reopen re-injects via addInitScript correctly.
|
||||
- Notes: Uses context.addInitScript under window.__pi namespace.
|
||||
|
||||
### R017 — Consolidated state capture per action
|
||||
- Class: core-capability
|
||||
- Status: validated
|
||||
- Description: The before-state capture, after-state capture, post-action summary, and recent-error check are consolidated into fewer page.evaluate calls per action.
|
||||
- Why it matters: Every action tool currently runs 3-4 separate page.evaluate calls for state capture. Consolidating them reduces latency on every single browser interaction.
|
||||
- Source: user
|
||||
- Primary owning slice: M002/S02
|
||||
- Supporting slices: M002/S01
|
||||
- Validation: postActionSummary eliminated from action tools, countOpenDialogs removed from ToolDeps, consolidated capture pattern. Build passes.
|
||||
- Notes: captureCompactPageState and postActionSummary are merged into a single page.evaluate call.
|
||||
|
||||
### R018 — Conditional body text capture
|
||||
- Class: core-capability
|
||||
- Status: validated
|
||||
- Description: Body text capture (includeBodyText: true) is skipped for low-signal actions (scroll, hover, Tab key press) and enabled for high-signal actions (navigate, click, type, submit).
|
||||
- Why it matters: Capturing 4000 chars of body text on every scroll or hover is wasteful. Conditional capture reduces evaluate overhead.
|
||||
- Source: user
|
||||
- Primary owning slice: M002/S02
|
||||
- Supporting slices: none
|
||||
- Validation: Each tool in interaction.ts sets includeBodyText explicitly to true or false according to its signal level. Classification codified in D017. Build passes.
|
||||
- Notes: Requires classifying each tool as high-signal or low-signal.
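
A minimal sketch of the high-signal versus low-signal classification described for R018. The `ActionKind` names and the helper `shouldIncludeBodyText` are assumptions for illustration; the real per-tool flags live in interaction.ts and are not reproduced here.

```typescript
// Hypothetical action names derived from the examples in the description.
type ActionKind = "navigate" | "click" | "type" | "submit" | "scroll" | "hover" | "press_tab";

// High-signal actions are likely to change visible page content.
const HIGH_SIGNAL_ACTIONS: ReadonlySet<ActionKind> = new Set<ActionKind>([
  "navigate",
  "click",
  "type",
  "submit",
]);

// Body text is captured only for high-signal actions; scroll, hover, and Tab skip it.
function shouldIncludeBodyText(action: ActionKind): boolean {
  return HIGH_SIGNAL_ACTIONS.has(action);
}
```

Keeping the classification in one table makes it easy to audit which tools pay for body text capture.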
|
||||
|
||||
### R019 — Faster settle on zero mutations
|
||||
- Class: core-capability
|
||||
- Status: validated
|
||||
- Description: settleAfterActionAdaptive short-circuits with a smaller quiet window when no mutation observer fires in the first 60ms.
|
||||
- Why it matters: Many SPA interactions produce no DOM changes. Short-circuiting saves time on the most common case.
|
||||
- Source: user
|
||||
- Primary owning slice: M002/S02
|
||||
- Supporting slices: none
|
||||
- Validation: zero_mutation_shortcut settle reason in state.ts type union and settle.ts return path. 60ms/30ms thresholds codified in D019. Build passes.
|
||||
- Notes: Track whether any mutation fired at all; if zero after 60ms, use a shorter quiet window.
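
The short-circuit in R019 might look something like the sketch below. It assumes the 60ms/30ms thresholds mean a 60 ms observation window followed by a 30 ms shortened quiet window; that reading, the function name `chooseQuietWindow`, and the `"default"` reason label are assumptions rather than details taken from settle.ts.

```typescript
// Assumed interpretation of the thresholds codified in D019.
const ZERO_MUTATION_CHECK_MS = 60;
const SHORT_QUIET_WINDOW_MS = 30;

interface QuietWindowChoice {
  quietMs: number;
  reason: "zero_mutation_shortcut" | "default"; // "default" is a placeholder label
}

// Pick the quiet window given how long we have observed and how many mutations fired.
function chooseQuietWindow(
  elapsedMs: number,
  mutationCount: number,
  defaultQuietMs: number,
): QuietWindowChoice {
  const observedLongEnough = elapsedMs >= ZERO_MUTATION_CHECK_MS;
  if (observedLongEnough && mutationCount === 0) {
    // Nothing changed in the observation window: settle with the shorter window.
    return {
      quietMs: Math.min(SHORT_QUIET_WINDOW_MS, defaultQuietMs),
      reason: "zero_mutation_shortcut",
    };
  }
  return { quietMs: defaultQuietMs, reason: "default" };
}
```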
|
||||
|
||||
### R020 — Sharp-based screenshot resizing
|
||||
- Class: core-capability
|
||||
- Status: validated
|
||||
- Description: constrainScreenshot uses the sharp Node library for image resizing instead of bouncing buffers through page canvas context.
|
||||
- Why it matters: Resizing in Node is faster than bouncing buffers through the page, and image processing no longer depends on a live page.
|
||||
- Source: user
|
||||
- Primary owning slice: M002/S03
|
||||
- Supporting slices: M002/S01
|
||||
- Validation: constrainScreenshot uses sharp(buffer).metadata() and sharp(buffer).resize(). Zero page.evaluate calls in capture.ts. Build passes.
|
||||
- Notes: sharp added as a dependency.
|
||||
|
||||
### R021 — Opt-in screenshots on navigate
|
||||
- Class: core-capability
|
||||
- Status: validated
|
||||
- Description: browser_navigate does not capture or return a screenshot by default. An explicit parameter opts in to screenshot capture.
|
||||
- Why it matters: Significant token savings — the screenshot payload is large and often unnecessary.
|
||||
- Source: user
|
||||
- Primary owning slice: M002/S03
|
||||
- Supporting slices: none
|
||||
- Validation: browser_navigate has a screenshot parameter that defaults to false, and capture is gated on it. Build passes.
|
||||
- Notes: Default is off. The agent can still use browser_screenshot explicitly.
|
||||
|
||||
### R022 — Form analysis tool (browser_analyze_form)
|
||||
- Class: core-capability
|
||||
- Status: validated
|
||||
- Description: A browser_analyze_form tool that returns field inventory including labels, names, types, required status, current values, validation state, and submit controls.
|
||||
- Why it matters: Collapses 3-8 tool calls for form analysis into one.
|
||||
- Source: user
|
||||
- Primary owning slice: M002/S04
|
||||
- Supporting slices: M002/S01
|
||||
- Validation: 7-level label resolution, form auto-detection, fieldset grouping, submit button discovery. Verified end-to-end against 12-field test form. Build passes.
|
||||
- Notes: Must handle label association via for/id, wrapping label, aria-label, aria-labelledby, and placeholder.
|
||||
|
||||
### R023 — Form fill tool (browser_fill_form)
|
||||
- Class: core-capability
|
||||
- Status: validated
|
||||
- Description: A browser_fill_form tool that maps labels/names/placeholders to inputs and fills them with type-aware Playwright APIs.
|
||||
- Why it matters: Collapses 3-5 tool calls for form filling into one.
|
||||
- Source: user
|
||||
- Primary owning slice: M002/S04
|
||||
- Supporting slices: M002/S01
|
||||
- Validation: 5-strategy field resolution, type-aware fill via Playwright APIs, verified end-to-end with 10 fields. Build passes.
|
||||
- Notes: Returns matched fields, unmatched values, fields skipped, and validation state.
|
||||
|
||||
### R024 — Intent-ranked element retrieval (browser_find_best)
|
||||
- Class: core-capability
|
||||
- Status: validated
|
||||
- Description: A browser_find_best tool that returns scored candidates using deterministic heuristic ranking for 8 semantic intents.
|
||||
- Why it matters: Cuts a round trip and reduces reasoning tokens for common element-finding tasks.
|
||||
- Source: user
|
||||
- Primary owning slice: M002/S05
|
||||
- Supporting slices: M002/S01
|
||||
- Validation: 8 intents implemented with 4-dimension scoring. Verified via Playwright tests. Build passes, tool count = 47.
|
||||
- Notes: Deterministic heuristics only. No hidden LLM calls.
|
||||
|
||||
### R025 — Semantic action tool (browser_act)
|
||||
- Class: core-capability
|
||||
- Status: validated
|
||||
- Description: A browser_act tool that resolves the top candidate for a semantic intent and executes the action in one call.
|
||||
- Why it matters: Collapses 2-4 tool calls for common micro-tasks into one.
|
||||
- Source: user
|
||||
- Primary owning slice: M002/S05
|
||||
- Supporting slices: M002/S04
|
||||
- Validation: Resolves via same scoring engine as browser_find_best. Executes via Playwright locator. Returns before/after diff. Build passes, tool count = 47.
|
||||
- Notes: Builds on browser_find_best for element selection. Bounded — does not loop or retry.
|
||||
|
||||
### R026 — Test coverage for new and refactored code
|
||||
- Class: quality-attribute
|
||||
- Status: validated
|
||||
- Description: Test suite covers shared browser-side utilities, settle logic, screenshot resizing, form tools, and intent ranking.
|
||||
- Why it matters: Regression protection for refactored and new features.
|
||||
- Source: user
|
||||
- Primary owning slice: M002/S06
|
||||
- Supporting slices: all M002 slices
|
||||
- Validation: 108 tests (63 unit + 45 integration) passing via `npm run test:browser-tools`.
|
||||
- Notes: Test what's unit-testable without a browser. Integration tests with Playwright for tools that need a page.
|
||||
|
||||
## Deferred
|
||||
|
||||
### R011 — Multi-milestone secret forecasting
|
||||
- Class: core-capability
|
||||
- Status: deferred
|
||||
- Description: Forecast secrets across all planned milestones, not just the active one.
|
||||
- Why it matters: Would provide a complete picture of all secrets needed for the project.
|
||||
- Source: user
|
||||
- Primary owning slice: none
|
||||
- Supporting slices: none
|
||||
- Validation: unmapped
|
||||
- Notes: Deferred — single-milestone forecasting is sufficient for now.
|
||||
|
||||
### R012 — Secret rotation reminders
|
||||
- Class: operability
|
||||
- Status: deferred
|
||||
- Description: Track secret age and remind users when keys may need rotation.
|
||||
- Why it matters: Security best practice, but not essential for the core workflow.
|
||||
- Source: user
|
||||
- Primary owning slice: none
|
||||
- Supporting slices: none
|
||||
- Validation: unmapped
|
||||
- Notes: Deferred — out of scope for initial release.
|
||||
|
||||
### R027 — Browser reuse across sessions
|
||||
- Class: core-capability
|
||||
- Status: deferred
|
||||
- Description: Keep a warm browser instance across rapid successive agent contexts to avoid ~2-3s Chrome cold-start per session.
|
||||
- Why it matters: Would eliminate Chrome launch latency in auto-mode.
|
||||
- Source: user
|
||||
- Primary owning slice: none
|
||||
- Supporting slices: none
|
||||
- Validation: unmapped
|
||||
- Notes: Deferred — skip completely per user direction.
|
||||
|
||||
### R042 — Parallel milestone execution in multiple worktrees
|
||||
- Class: core-capability
|
||||
- Status: deferred
|
||||
- Description: Run multiple milestones simultaneously in separate worktrees with independent auto-mode sessions.
|
||||
- Why it matters: Natural extension of worktree-per-milestone architecture. Would enable parallel work streams.
|
||||
- Source: user
|
||||
- Primary owning slice: none
|
||||
- Supporting slices: none
|
||||
- Validation: unmapped
|
||||
- Notes: Deferred — ship sequential milestone execution first. The worktree infrastructure naturally supports this later.
|
||||
|
||||
### R043 — Native libgit2 write operations
|
||||
- Class: quality-attribute
|
||||
- Status: deferred
|
||||
- Description: Extend the Rust/libgit2 native module to cover write operations (commit, merge, checkout) in addition to the current read-only queries.
|
||||
- Why it matters: Would eliminate execSync overhead for git writes on the hot path.
|
||||
- Source: inferred
|
||||
- Primary owning slice: none
|
||||
- Supporting slices: none
|
||||
- Validation: unmapped
|
||||
- Notes: Deferred — execSync writes are functional. Optimize later if profiling shows it matters.
|
||||
|
||||
## Out of Scope
|
||||
|
||||
### R013 — Curated service knowledge base
|
||||
- Class: anti-feature
|
||||
- Status: out-of-scope
|
||||
- Description: A static database of known services with pre-written guidance for each API key.
|
||||
- Why it matters: Prevents scope creep. LLM-generated guidance is sufficient and stays current without maintenance.
|
||||
- Source: user
|
||||
- Primary owning slice: none
|
||||
- Supporting slices: none
|
||||
- Validation: n/a
|
||||
- Notes: LLM generates guidance dynamically.
|
||||
|
||||
### R014 — Just-in-time collection enhancement
|
||||
- Class: anti-feature
|
||||
- Status: out-of-scope
|
||||
- Description: Detect missing secrets during task execution and collect them inline.
|
||||
- Why it matters: Prevents scope confusion. M001 is about proactive collection, not reactive.
|
||||
- Source: user
|
||||
- Primary owning slice: none
|
||||
- Supporting slices: none
|
||||
- Validation: n/a
|
||||
- Notes: Existing secure_env_collect already handles reactive collection.
|
||||
|
||||
### R028 — LLM-powered intent resolution
|
||||
- Class: anti-feature
|
||||
- Status: out-of-scope
|
||||
- Description: Using hidden LLM calls inside browser_find_best or browser_act for intent resolution.
|
||||
- Why it matters: Prevents unpredictable latency and cost.
|
||||
- Source: inferred
|
||||
- Primary owning slice: none
|
||||
- Supporting slices: none
|
||||
- Validation: n/a
|
||||
- Notes: browser_find_best and browser_act use scoring heuristics, not LLM inference.
|
||||
|
||||
### R044 — Rebase merge strategy
|
||||
- Class: anti-feature
|
||||
- Status: out-of-scope
|
||||
- Description: Adding rebase as a merge strategy option alongside squash and --no-ff merge.
|
||||
- Why it matters: Rebase rewrites history, which conflicts with the "commit diary" philosophy. It also introduces more failure modes (rebase conflicts are harder to auto-resolve than merge conflicts).
|
||||
- Source: inferred
|
||||
- Primary owning slice: none
|
||||
- Supporting slices: none
|
||||
- Validation: n/a
|
||||
- Notes: --no-ff merge + squash covers all needed use cases without history rewriting.
|
||||
|
||||
## Traceability
|
||||
|
||||
| ID | Class | Status | Primary owner | Supporting | Proof |
|---|---|---|---|---|---|
| R001 | core-capability | validated | M001/S01 | none | plan-milestone.md Secret Forecasting section, parser round-trip tests |
| R002 | continuity | validated | M001/S01 | none | parseSecretsManifest/formatSecretsManifest round-trip tested |
| R003 | primary-user-loop | validated | M001/S02 | M001/S01 | collect-from-manifest.test.ts tests 6-8 |
| R004 | primary-user-loop | validated | M001/S02 | none | collect-from-manifest.test.ts tests 4-5 |
| R005 | primary-user-loop | validated | M001/S02 | none | manifest-status.test.ts tests 4,7; collect-from-manifest.test.ts tests 1-2 |
| R006 | integration | validated | M001/S02 | none | collectSecretsFromManifest calls detectDestination() |
| R007 | core-capability | validated | M001/S03 | M001/S01, M001/S02 | auto-secrets-gate.test.ts 3/3 pass |
| R008 | core-capability | validated | M001/S03 | M001/S01, M001/S02 | guided-flow.ts calls startAuto() at lines 52, 486, 647, 794 |
| R009 | integration | validated | M001/S01 | none | plan-milestone.md Secret Forecasting section line 62 |
| R010 | primary-user-loop | validated | M001/S02 | none | collect-from-manifest.test.ts tests 6-8 |
| R011 | core-capability | deferred | none | none | unmapped |
| R012 | operability | deferred | none | none | unmapped |
| R013 | anti-feature | out-of-scope | none | none | n/a |
| R014 | anti-feature | out-of-scope | none | none | n/a |
| R015 | quality-attribute | validated | M002/S01 | none | jiti load, 43 tools register, slim index, browser spot-check |
| R016 | quality-attribute | validated | M002/S01 | none | window.__pi injection, zero inline redeclarations, survives navigation |
| R017 | core-capability | validated | M002/S02 | M002/S01 | postActionSummary eliminated, consolidated capture pattern |
| R018 | core-capability | validated | M002/S02 | none | explicit includeBodyText true/false per tool signal level |
| R019 | core-capability | validated | M002/S02 | none | zero_mutation_shortcut settle reason, 60ms/30ms thresholds |
| R020 | core-capability | validated | M002/S03 | M002/S01 | sharp-based constrainScreenshot, zero page.evaluate in capture.ts |
| R021 | core-capability | validated | M002/S03 | none | screenshot param default false, capture gated |
| R022 | core-capability | validated | M002/S04 | M002/S01 | 7-level label resolution, verified against 12-field test form |
| R023 | core-capability | validated | M002/S04 | M002/S01 | 5-strategy field resolution, verified end-to-end with 10 fields |
| R024 | core-capability | validated | M002/S05 | M002/S01 | 8-intent scoring, Playwright tests, differentiated rankings |
| R025 | core-capability | validated | M002/S05 | M002/S04 | top candidate execution, settle + diff, graceful error |
| R026 | quality-attribute | validated | M002/S06 | all M002 | 108 tests passing via npm run test:browser-tools |
| R027 | core-capability | deferred | none | none | unmapped |
| R028 | anti-feature | out-of-scope | none | none | n/a |
| R029 | core-capability | active | M003/S01 | none | unmapped |
| R030 | core-capability | active | M003/S03 | M003/S01 | unmapped |
| R031 | core-capability | active | M003/S02 | M003/S01 | unmapped |
| R032 | core-capability | active | M003/S03 | none | unmapped |
| R033 | core-capability | active | M003/S04 | none | unmapped |
| R034 | core-capability | active | M003/S04 | M003/S03 | unmapped |
| R035 | core-capability | active | M003/S05 | M003/S01, M003/S02, M003/S03 | unmapped |
| R036 | quality-attribute | active | M003/S02 | M003/S06 | unmapped |
| R037 | primary-user-loop | active | M003/S05 | all M003 | unmapped |
| R038 | continuity | active | M003/S04 | none | unmapped |
| R039 | integration | active | M003/S01 | none | unmapped |
| R040 | operability | active | M003/S06 | M003/S05 | unmapped |
| R041 | quality-attribute | active | M003/S07 | all M003 | unmapped |
| R042 | core-capability | deferred | none | none | unmapped |
| R043 | quality-attribute | deferred | none | none | unmapped |
| R044 | anti-feature | out-of-scope | none | none | n/a |
|
||||
|
||||
## Coverage Summary
|
||||
|
||||
- Active requirements: 13
- Mapped to slices: 13
- Validated: 22
- Deferred: 5
- Out of scope: 4
- Unmapped active requirements: 0
|
||||
23
.gsd/STATE.md
Normal file
|
|
@ -0,0 +1,23 @@
|
|||
# GSD State
|
||||
|
||||
**Active Milestone:** M003 — Worktree-Isolated Git Architecture
|
||||
**Active Slice:** None
|
||||
**Phase:** pre-planning
|
||||
|
||||
## Milestone Registry
|
||||
- ✅ **M001:** Proactive Secret Management
|
||||
- ✅ **M002:** Browser Tools Performance & Intelligence
|
||||
- 🔄 **M003:** Worktree-Isolated Git Architecture
|
||||
|
||||
## Recent Decisions
|
||||
- D027: Git isolation model — worktree-per-milestone (default for new projects)
|
||||
- D028: Slice merge strategy within worktree — --no-ff merge
|
||||
- D029: Milestone-to-main merge strategy — squash merge
|
||||
- D030: Failure handling philosophy — stop but self-heal
|
||||
- D031: Target user priority — vibe coder first
|
||||
|
||||
## Blockers
|
||||
- None
|
||||
|
||||
## Next Action
|
||||
Research and plan M003. Context and roadmap are written. Ready for auto-mode.
|
||||
114
.gsd/milestones/M003/M003-CONTEXT.md
Normal file
|
|
@ -0,0 +1,114 @@
|
|||
# M003: Worktree-Isolated Git Architecture
|
||||
|
||||
**Gathered:** 2026-03-14
|
||||
**Status:** Ready for planning
|
||||
|
||||
## Project Description
|
||||
|
||||
Overhaul GSD's git system to use worktree-per-milestone isolation as the default model. Each milestone gets its own git worktree with an isolated `.gsd/` directory, eliminating the entire category of `.gsd/` merge conflicts that have caused ~15 separate bug fixes to date. Slices merge into the milestone branch via `--no-ff` (preserving full commit history as a diary of the agent's work). Milestones squash-merge to main on completion (keeping main clean). The system is automagical for vibe coders — zero git errors, zero git knowledge required — and configurable for senior engineers via preferences.
|
||||
|
||||
## Why This Milestone
|
||||
|
||||
The current branch-per-slice model shares `.gsd/` state across branches, causing merge conflicts that halt auto-mode. The CHANGELOG shows a pattern: each fix leads to a new edge case. The root cause is structural — sharing mutable state across branches. Worktree isolation eliminates the problem architecturally rather than patching symptoms.
|
||||
|
||||
## User-Visible Outcome
|
||||
|
||||
### When this milestone is complete, the user can:
|
||||
|
||||
- Run `/gsd auto` on a new project and have it execute start-to-finish without any git errors, merge conflicts, or mysterious halts
|
||||
- See clean `git log` on main with one commit per completed milestone
|
||||
- Configure `git.merge_to_main: "slice"` in preferences to get slice-level integration if they want it
|
||||
- Run `/gsd doctor` to detect and fix git-related issues
|
||||
- Use manual `/worktree` alongside auto-mode without conflicts
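
As a rough illustration of how the `git.isolation` and `git.merge_to_main` preferences might be validated, consider the sketch below. Only the value `"slice"` for `git.merge_to_main` appears in this document; the values `"worktree"`, `"branch"`, and `"milestone"`, the defaults, and the function name `validateGitPreferences` are assumptions, and the real logic in preferences.ts may differ.

```typescript
// Assumed value sets. Only "slice" is confirmed by the text above.
const ISOLATION_MODES = ["worktree", "branch"] as const;
const MERGE_TO_MAIN_MODES = ["milestone", "slice"] as const;

type IsolationMode = (typeof ISOLATION_MODES)[number];
type MergeToMainMode = (typeof MERGE_TO_MAIN_MODES)[number];

interface GitPreferences {
  isolation: IsolationMode;
  merge_to_main: MergeToMainMode;
}

interface GitPreferencesResult {
  prefs: GitPreferences;
  errors: string[];
}

// Start from assumed defaults, accept known values, and report anything else.
function validateGitPreferences(raw: {
  isolation?: unknown;
  merge_to_main?: unknown;
}): GitPreferencesResult {
  const prefs: GitPreferences = { isolation: "worktree", merge_to_main: "milestone" };
  const errors: string[] = [];

  if (raw.isolation !== undefined) {
    const match = ISOLATION_MODES.find((mode) => mode === raw.isolation);
    if (match !== undefined) prefs.isolation = match;
    else errors.push(`git.isolation must be one of: ${ISOLATION_MODES.join(", ")}`);
  }

  if (raw.merge_to_main !== undefined) {
    const match = MERGE_TO_MAIN_MODES.find((mode) => mode === raw.merge_to_main);
    if (match !== undefined) prefs.merge_to_main = match;
    else errors.push(`git.merge_to_main must be one of: ${MERGE_TO_MAIN_MODES.join(", ")}`);
  }

  return { prefs, errors };
}
```

Falling back to defaults while collecting errors lets auto-mode keep running with a sane configuration and still surface the bad value to the user.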
|
||||
|
||||
### Entry point / environment
|
||||
|
||||
- Entry point: `/gsd auto` CLI command, `/gsd doctor` CLI command
|
||||
- Environment: local dev — any git repository
|
||||
- Live dependencies involved: git CLI, optional libgit2 native module
|
||||
|
||||
## Completion Class
|
||||
|
||||
- Contract complete means: auto-worktree create/teardown lifecycle works, slice merges use `--no-ff`, milestone squashes to main, preferences switch between modes, self-heal recovers from common failures, all tests pass
|
||||
- Integration complete means: the full auto-mode lifecycle (startAuto → dispatch units → complete slices → complete milestone → merge to main) works end-to-end in a real git repo with real file changes
|
||||
- Operational complete means: existing projects on branch-per-slice model continue working unchanged, manual `/worktree` coexists without conflicts
|
||||
|
||||
## Final Integrated Acceptance
|
||||
|
||||
To call this milestone complete, we must prove:
|
||||
|
||||
- Auto-mode on a fresh project creates a worktree, executes through multiple slices, and merges the milestone to main — with zero git errors
|
||||
- An existing project with branch-per-slice history continues working identically (no regression)
|
||||
- A deliberately introduced merge conflict is self-healed without user intervention
|
||||
- `git log main` shows exactly one squash commit per completed milestone
|
||||
- `git log milestone/M003` shows full commit history with `--no-ff` merge boundaries per slice
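
The two merge shapes these acceptance checks rely on can be sketched as git argument lists. `git merge --no-ff` and `git merge --squash` are standard git options; the helper names, the example slice branch name, and the commit messages are hypothetical, and checkout steps are omitted.

```typescript
// Slice into milestone branch: always create a merge commit so each slice boundary stays visible.
function sliceMergeArgs(sliceBranch: string, message: string): string[] {
  return ["merge", "--no-ff", "-m", message, sliceBranch];
}

// Milestone into main: `git merge --squash` stages the combined changes but does not commit,
// so a separate `git commit` is required to produce the single squash commit.
function milestoneSquashCommands(milestoneBranch: string, message: string): string[][] {
  return [
    ["merge", "--squash", milestoneBranch],
    ["commit", "-m", message],
  ];
}

// Usage (branch names other than milestone/M003 are placeholders):
const exampleSliceMerge = sliceMergeArgs("example-slice-branch", "Merge slice S01");
const exampleSquash = milestoneSquashCommands("milestone/M003", "M003: worktree-isolated git");
console.log(["git", ...exampleSliceMerge].join(" "));
for (const args of exampleSquash) console.log(["git", ...args].join(" "));
```

With this split, the milestone branch keeps the full diary of slice commits while main receives exactly one commit per milestone.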
|
||||
|
||||
## Risks and Unknowns
|
||||
|
||||
- **`process.chdir` in auto-mode** — auto-mode currently passes `basePath` to all functions but doesn't `chdir`. Worktree mode needs `chdir` into the worktree so that all tool calls (bash, read, write, edit) resolve against the worktree. The worktree-command.ts already does this, but auto-mode doesn't. Risk: some codepath uses `basePath` while another uses `process.cwd()`, causing split-brain.
|
||||
- **Worktree `.gsd/` inheritance** — when a worktree is created, it gets a copy of the project files from the milestone branch base. But `.gsd/` planning files from the main tree may or may not be wanted in the worktree. Need to decide: copy planning state or start fresh.
|
||||
- **State machine re-entry** — if auto-mode is paused and resumed, the worktree must be re-entered (if it still exists). The pause/resume logic in `startAuto` needs to handle this.
|
||||
- **Existing orphan recovery** — the current `mergeOrphanedSliceBranches` logic needs to work within the worktree context, not just on main.
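
One way to catch the `basePath` versus `process.cwd()` split-brain risk early is a cheap invariant check before each dispatch. The sketch below is an assumption about how such a guard could look, not existing code; it takes the current working directory as a parameter and handles POSIX-style paths only.

```typescript
// Strip trailing slashes so "/repo/x/" and "/repo/x" compare equal (POSIX-style paths only).
function normalizePathSketch(p: string): string {
  const trimmed = p.replace(/\/+$/, "");
  return trimmed === "" ? "/" : trimmed;
}

// Throw if the path auto-mode believes it is using differs from the actual working directory.
function assertPathsInSync(basePath: string, cwd: string): void {
  if (normalizePathSketch(basePath) !== normalizePathSketch(cwd)) {
    throw new Error(`basePath and cwd diverged: basePath=${basePath} cwd=${cwd}`);
  }
}

// Usage: in real code the second argument would come from process.cwd().
assertPathsInSync("/repo/.gsd/worktrees/M003/", "/repo/.gsd/worktrees/M003");
```

Failing loudly at the first divergence is cheaper to debug than a file written to the wrong tree several units later.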
|
||||
|
||||
> See `.gsd/DECISIONS.md` for all architectural and pattern decisions — it is an append-only register; read it during planning, append to it during execution.
|
||||
|
||||
## Relevant Requirements
|
||||
|
||||
- R029 — Auto-worktree creation on milestone start
|
||||
- R030 — Auto-worktree teardown + squash-merge on milestone complete
|
||||
- R031 — `--no-ff` slice merges within milestone worktree
|
||||
- R032 — Rich milestone-level squash commit message
|
||||
- R033 — `git.isolation` preference
|
||||
- R034 — `git.merge_to_main` preference
|
||||
- R035 — Self-healing git repair on failure
|
||||
- R036 — `.gsd/` conflict resolution elimination
|
||||
- R037 — Zero git errors for vibe coders
|
||||
- R038 — Backwards compatibility with branch-per-slice model
|
||||
- R039 — Manual `/worktree` coexistence with auto-worktrees
|
||||
- R040 — Doctor git health checks
|
||||
- R041 — Test coverage for worktree-isolated flow
|
||||
|
||||
## Scope
|
||||
|
||||
### In Scope
|
||||
|
||||
- Auto-worktree lifecycle wired into `startAuto()` and `complete-milestone`
|
||||
- `--no-ff` merge for slices within worktree, squash for milestone to main
|
||||
- `git.isolation` and `git.merge_to_main` preferences with validation
|
||||
- Self-healing git repair (abort, reset, retry) for common failure modes
|
||||
- Doctor git health checks (orphaned worktrees, stale branches, corrupt state)
|
||||
- Simplification of `.gsd/` conflict resolution code (worktree mode only)
|
||||
- Test suite for both worktree and branch isolation modes
|
||||
- Backwards compatibility with existing branch-per-slice projects
|
||||
|
||||
### Out of Scope / Non-Goals
|
||||
|
||||
- Parallel milestone execution (deferred to future milestone)
|
||||
- Native libgit2 write operations (deferred)
|
||||
- Rebase merge strategy (anti-feature — conflicts with commit diary philosophy)
|
||||
- Remote git operations beyond existing auto-push
|
||||
|
||||
## Technical Constraints
|
||||
|
||||
- Must work with git CLI (libgit2 native module is optional, read-only)
|
||||
- `process.chdir` is the mechanism for worktree switching (proven in worktree-command.ts)
|
||||
- All file tools (read, write, edit, bash) resolve against `process.cwd()` — this is the reason `chdir` works
|
||||
- Source files are in `src/resources/extensions/gsd/`, tests in `src/resources/extensions/gsd/tests/`
|
||||
- Tests run via `npm run test:unit` and `npm run test:integration`
|
||||
|
||||
## Integration Points
|
||||
|
||||
- `auto.ts` — primary integration point for worktree lifecycle in `startAuto()`, `dispatchNextUnit()`, `handleAgentEnd()`
|
||||
- `git-service.ts` — `GitServiceImpl` class owns all git mutation operations
|
||||
- `worktree.ts` — thin facade over `GitServiceImpl`, exports `ensureSliceBranch`, `mergeSliceToMain`, etc.
|
||||
- `worktree-manager.ts` — existing worktree create/list/remove/merge operations
|
||||
- `worktree-command.ts` — manual `/worktree` command with `process.chdir` handling
|
||||
- `preferences.ts` — preference validation and loading
|
||||
- `doctor.ts` — health check and auto-fix system
|
||||
- `native-git-bridge.ts` — libgit2 read operations
|
||||
- `dispatch-guard.ts` — prior-slice completion checking
|
||||
|
||||
## Open Questions
|
||||
|
||||
- **Worktree naming convention for auto-worktrees** — should auto-worktrees use the milestone ID as the name (`.gsd/worktrees/M003/`) or a prefixed name (`.gsd/worktrees/auto-M003/`)? Current thinking: bare milestone ID is cleaner and the branch convention (`milestone/M003` vs `worktree/<name>`) disambiguates from manual worktrees.
|
||||
- **`.gsd/` file handling on worktree creation** — should the worktree inherit the main tree's `.gsd/` planning files, or should they be cleared for a fresh start? Current thinking: inherit — the worktree needs the milestone's CONTEXT.md and ROADMAP.md to continue planning.
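
The naming convention under "current thinking" in the first open question could be expressed as below. The paths and branch prefixes come from that question; the helper names are hypothetical and the convention itself is still undecided.

```typescript
interface WorktreeLocation {
  path: string;
  branch: string;
}

// Auto-worktrees: bare milestone ID as the directory name, `milestone/` branch prefix.
function autoWorktreeLocation(milestoneId: string): WorktreeLocation {
  return {
    path: `.gsd/worktrees/${milestoneId}/`,
    branch: `milestone/${milestoneId}`,
  };
}

// Manual worktrees created via /worktree use the `worktree/` branch prefix.
function manualWorktreeBranch(name: string): string {
  return `worktree/${name}`;
}

// The branch prefix alone is enough to tell the two kinds apart.
function isAutoWorktreeBranch(branch: string): boolean {
  return branch.startsWith("milestone/");
}
```

Under this scheme a manual worktree that happens to be named `M003` still gets the branch `worktree/M003`, so the branch names would not collide even if the directory naming needs further thought.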
|
||||
173
.gsd/milestones/M003/M003-ROADMAP.md
Normal file
|
|
@ -0,0 +1,173 @@
|
|||
# M003: Worktree-Isolated Git Architecture
|
||||
|
||||
**Vision:** Overhaul GSD's git system so that auto-mode is automagical — zero git errors, zero merge conflicts, zero user intervention required. Each milestone gets its own isolated worktree. Main is always clean. The system just runs.
|
||||
|
||||
## Success Criteria
|
||||
|
||||
- Auto-mode on a fresh project executes through an entire milestone without any git errors or halts
|
||||
- Main branch only receives commits when milestones complete (one squash commit per milestone)
|
||||
- Full commit history preserved within milestone worktree branches via `--no-ff` slice merges
|
||||
- Existing branch-per-slice projects continue working identically — zero regressions
|
||||
- Self-healing resolves common git failures (merge conflict, checkout issue, corrupt state) without user intervention
|
||||
- `/gsd doctor` detects and fixes git health issues (orphaned worktrees, stale branches, corrupt merge state)
|
||||
|
||||
## Key Risks / Unknowns
|
||||
|
||||
- **`process.chdir` coherence in auto-mode** — all tool calls must resolve against the worktree path after chdir. The worktree-command.ts has proven this works, but auto-mode's `basePath` variable and `process.cwd()` must stay in sync.
|
||||
- **Worktree `.gsd/` inheritance** — creating a worktree copies project files from the base branch. `.gsd/` planning files (CONTEXT, ROADMAP) must carry through; runtime files (STATE.md, metrics, activity) must not cause conflicts.
|
||||
- **State machine re-entry on resume** — pausing and resuming auto-mode must re-enter the worktree if it exists. The current pause/resume logic doesn't handle this.
|
||||
|
||||
## Proof Strategy
|
||||
|
||||
- `process.chdir` coherence → retire in S01 by proving auto-mode dispatches and executes a unit inside the worktree with all file operations resolving correctly
|
||||
- Worktree `.gsd/` inheritance → retire in S01 by proving planning files are available after worktree creation and runtime files don't conflict
|
||||
- State machine re-entry → retire in S01 by proving pause/resume correctly re-enters the worktree
|
||||
|
||||
## Verification Classes

- Contract verification: git operations produce expected branch state, file layout, and commit history in temp repos
- Integration verification: full auto-mode lifecycle (create worktree → execute slices → merge milestone → teardown) in a real git repo
- Operational verification: existing branch-per-slice projects continue working; manual `/worktree` coexists
- UAT / human verification: run auto-mode on a real project and confirm zero git errors
## Milestone Definition of Done

This milestone is complete only when all of the following are true:

- Auto-worktree lifecycle works end-to-end (create, execute, merge, teardown)
- `--no-ff` slice merges produce correct history on milestone branch
- Milestone squash to main produces clean single commit
- `git.isolation` and `git.merge_to_main` preferences work with validation
- Self-healing recovers from common git failures without user intervention
- Existing branch-per-slice projects pass all existing tests
- `/gsd doctor` detects and fixes git health issues
- Full test suite passes for both worktree and branch isolation modes
- Success criteria re-checked against live behavior
## Requirement Coverage

- Covers: R029, R030, R031, R032, R033, R034, R035, R036, R037, R038, R039, R040, R041
- Partially covers: none
- Leaves for later: R042 (parallel milestones), R043 (native libgit2 writes)
- Orphan risks: none
## Slices

- [ ] **S01: Auto-worktree lifecycle in auto-mode** `risk:high` `depends:[]`
> After this: `startAuto()` on a new milestone creates a worktree under `.gsd/worktrees/M003/`, `chdir`s into it, and dispatches units inside the worktree. Pause/resume re-enters the worktree. Progress widget shows the worktree branch. Verified via running auto-mode unit dispatch in a temp repo worktree.

- [ ] **S02: --no-ff slice merges + conflict elimination** `risk:high` `depends:[S01]`
> After this: completed slices merge into the milestone branch via `--no-ff` instead of squash. The `.gsd/` auto-resolve conflict code in `mergeSliceToMain` is bypassed in worktree mode. `git log` on the milestone branch shows full commit history with merge commit boundaries per slice. Verified in temp repo.

- [ ] **S03: Milestone-to-main squash merge + worktree teardown** `risk:high` `depends:[S01,S02]`
> After this: `complete-milestone` squash-merges the milestone branch to main with a rich commit message listing all slices, removes the worktree, `chdir`s back to the main project root. `git log main` shows one clean commit. Auto-push works if enabled. Verified in temp repo with remote.

- [ ] **S04: Preferences + backwards compatibility** `risk:medium` `depends:[S01]`
> After this: `git.isolation: "worktree"` (default for new projects) / `"branch"` (existing projects) and `git.merge_to_main: "milestone"` / `"slice"` preferences are validated and respected. An existing project with `gsd/*` branches defaults to branch mode and works identically to today. Verified by running tests in both modes.

- [ ] **S05: Self-healing git repair** `risk:medium` `depends:[S01,S02,S03]`
> After this: when a merge fails or checkout breaks during auto-mode, the system aborts the failed operation, resets working tree state, and retries. Only truly unresolvable conflicts (real code conflicts between human-edited files) pause auto-mode. Users see non-technical messages, not raw git errors. Verified by deliberately introducing failures and confirming auto-recovery.

- [ ] **S06: Doctor + cleanup + code simplification** `risk:low` `depends:[S01,S02,S03,S05]`
> After this: `/gsd doctor` detects orphaned auto-worktrees, stale milestone branches, corrupt merge state (MERGE_HEAD/SQUASH_MSG), and tracked runtime files — and fixes them. Dead `.gsd/` conflict resolution code removed from worktree-mode paths in git-service.ts. Verified via doctor test cases.

- [ ] **S07: Test suite for worktree-isolated flow** `risk:low` `depends:[S01,S02,S03,S04,S05,S06]`
> After this: full test coverage for auto-worktree create/teardown, `--no-ff` slice merge, milestone squash, preference switching, self-heal scenarios, doctor checks. All existing git tests still pass. Both isolation modes tested. Verified via `npm run test:unit && npm run test:integration`.
<!--
Format rules (parsers depend on this exact structure):
- Checkbox line: - [ ] **S01: Title** `risk:high|medium|low` `depends:[S01,S02]`
- Demo line: > After this: one sentence showing what's demoable
- Mark done: change [ ] to [x]
- Order slices by risk (highest first)
- Each slice must be a vertical, demoable increment — not a layer
- If all slices are completed exactly as written, the milestone's promised outcome should actually work at the stated proof level
- depends:[X,Y] means X and Y must be done before this slice starts
-->
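The checkbox-line format described in the comment above lends itself to a single regular expression. The sketch below is only an illustration of that format: `parseSliceLine`, `SLICE_LINE`, and the `Slice` type are hypothetical names, and the project's real parser may be stricter or more lenient.

```typescript
// Hypothetical shape for one roadmap slice; the real parser's types may differ.
interface Slice {
  id: string;
  title: string;
  done: boolean;
  risk: "high" | "medium" | "low";
  depends: string[];
}

// Matches: - [ ] **S01: Title** `risk:high` `depends:[S01,S02]`
const SLICE_LINE =
  /^- \[( |x)\] \*\*(S\d+): (.+?)\*\* `risk:(high|medium|low)` `depends:\[([^\]]*)\]`$/;

function parseSliceLine(line: string): Slice | null {
  const m = SLICE_LINE.exec(line.trim());
  if (!m) return null; // not a slice checkbox line (e.g. a demo line)
  const [, mark, id, title, risk, deps] = m;
  return {
    id,
    title,
    done: mark === "x",
    risk: risk as Slice["risk"],
    // An empty depends:[] yields an empty array, not [""].
    depends: deps
      .split(",")
      .map((d) => d.trim())
      .filter((d) => d.length > 0),
  };
}
```

Lines that do not match, such as the `> After this:` demo lines, return `null`, so a caller can pair each parsed slice with the demo line that follows it.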
## Boundary Map

### S01 → S02, S03, S04, S05

Produces:
- `createAutoWorktree(basePath, milestoneId)` — creates worktree, returns worktree path
- `teardownAutoWorktree(basePath, milestoneId)` — removes worktree, returns to main tree
- `isInAutoWorktree(basePath)` → boolean — detects if currently in an auto-worktree
- `getAutoWorktreePath(basePath, milestoneId)` → string | null — resolves worktree path
- `enterAutoWorktree(basePath, milestoneId)` — `process.chdir` into existing worktree
- Updated `startAuto()` in auto.ts that creates/enters worktree on milestone start
- Updated pause/resume logic that re-enters worktree on resume

Consumes:
- nothing (first slice)
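The path-resolution and re-entry helpers listed for S01 could look roughly like the following. This is a minimal sketch under stated assumptions: auto-worktrees live at `.gsd/worktrees/<milestoneId>/` under the project root (as the S01 demo line says), detection is purely path-based, and `autoWorktreeDir` is a hypothetical helper. The real functions may consult git metadata instead.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Assumption: auto-worktrees live at <basePath>/.gsd/worktrees/<milestoneId>/
function autoWorktreeDir(basePath: string, milestoneId: string): string {
  return path.join(basePath, ".gsd", "worktrees", milestoneId);
}

// Returns the worktree path if it exists on disk, otherwise null.
function getAutoWorktreePath(basePath: string, milestoneId: string): string | null {
  const dir = autoWorktreeDir(basePath, milestoneId);
  return fs.existsSync(dir) ? dir : null;
}

// Assumption: a path counts as "inside an auto-worktree" when it contains
// the segments .gsd/worktrees/<something>. No git metadata is inspected.
function isInAutoWorktree(basePath: string): boolean {
  const parts = path.resolve(basePath).split(path.sep);
  const i = parts.indexOf(".gsd");
  return i >= 0 && parts[i + 1] === "worktrees" && parts.length > i + 2;
}

// chdir into an existing worktree and return the directory the process is
// now in, so the caller can assign it to its own basePath and keep the two
// values in sync.
function enterAutoWorktree(basePath: string, milestoneId: string): string {
  const dir = getAutoWorktreePath(basePath, milestoneId);
  if (dir === null) {
    throw new Error(`No auto-worktree found for ${milestoneId}`);
  }
  process.chdir(dir);
  return process.cwd();
}
```

Returning `process.cwd()` from `enterAutoWorktree` is one way to address the `process.chdir` coherence risk: the caller stores the returned value as its new `basePath` rather than computing the path a second time.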
### S01 → S02

Produces:
- The worktree infrastructure that S02 merges slices within

Consumes:
- nothing (first slice)

### S02 → S03

Produces:
- `mergeSliceToMilestone(basePath, milestoneId, sliceId, sliceTitle)` — `--no-ff` merge of slice branch into milestone branch within worktree
- Simplified merge path that skips `.gsd/` conflict resolution in worktree mode

Consumes from S01:
- `isInAutoWorktree()` to determine which merge strategy to use
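The strategy switch that S02 describes comes down to which `git merge` flags are used. The sketch below only builds the argument list and does not run git. `sliceMergeArgs` is a hypothetical helper, and the branch name and merge message format in the example are assumptions, since the roadmap does not specify them.

```typescript
// Builds `git merge` arguments for a completed slice.
// inWorktree=true  -> --no-ff merge, keeping slice commits plus a merge commit
// inWorktree=false -> --squash merge, as in the existing branch-per-slice flow
function sliceMergeArgs(
  inWorktree: boolean,
  sliceBranch: string, // assumption: the caller knows the slice branch name
  sliceId: string,
  sliceTitle: string,
): string[] {
  if (inWorktree) {
    // Assumed message format; the real commit message may differ.
    const message = `Merge ${sliceId}: ${sliceTitle}`;
    return ["merge", "--no-ff", "-m", message, sliceBranch];
  }
  // --squash stages the changes without committing, so no -m here.
  return ["merge", "--squash", sliceBranch];
}
```

A caller would pass the result to something like `execFileSync("git", args, { cwd: worktreePath })` from `node:child_process`, with `inWorktree` taken from `isInAutoWorktree()`.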
### S02 → S06

Produces:
- Knowledge of which conflict resolution code is dead in worktree mode

Consumes from S01:
- Worktree detection functions

### S03 → S05

Produces:
- `mergeMilestoneToMain(basePath, milestoneId)` — squash-merge milestone branch to main
- `buildMilestoneCommitMessage(milestoneId, milestoneTitle, slices)` — rich squash commit

Consumes from S01:
- `teardownAutoWorktree()` for worktree removal after merge
- `isInAutoWorktree()` for detection

Consumes from S02:
- Merged milestone branch with `--no-ff` slice history
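As an illustration of the "rich commit message listing all slices" that S03 produces, the sketch below assembles a subject line plus one bullet per slice. The layout and the `SliceSummary` type are assumptions made for this example. The roadmap fixes only the function name and its three parameters.

```typescript
// Hypothetical minimal slice shape for the message builder.
interface SliceSummary {
  id: string;
  title: string;
}

// Assumed layout: "<milestoneId>: <title>" subject, blank line, then a
// "Slices:" list. The real message format may differ.
function buildMilestoneCommitMessage(
  milestoneId: string,
  milestoneTitle: string,
  slices: SliceSummary[],
): string {
  const subject = `${milestoneId}: ${milestoneTitle}`;
  if (slices.length === 0) return subject;
  const body = slices.map((s) => `- ${s.id}: ${s.title}`).join("\n");
  return `${subject}\n\nSlices:\n${body}`;
}
```

Because `git merge --squash` stages changes without creating a commit, a message like this would be supplied to the `git commit` that follows the squash.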
### S04 → S01, S02, S03

Produces:
- `git.isolation` preference — `"worktree"` | `"branch"`
- `git.merge_to_main` preference — `"milestone"` | `"slice"`
- `shouldUseWorktreeIsolation(basePath)` — resolves effective isolation mode
- Preference validation in `preferences.ts`

Consumes from S01:
- Auto-worktree functions (gated by isolation preference)
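Validation of the two preferences might be sketched as below. The allowed values and the isolation default (worktree for new projects, branch for projects that already have `gsd/*` branches) come from the S04 description. The `validateGitPreferences` name, the `hasLegacyBranches` flag, and the choice to derive the `merge_to_main` default from the isolation mode are assumptions.

```typescript
type Isolation = "worktree" | "branch";
type MergeToMain = "milestone" | "slice";

interface GitPreferences {
  isolation: Isolation;
  merge_to_main: MergeToMain;
}

// raw: the untrusted values read from the preferences file.
// hasLegacyBranches: assumed to be computed elsewhere (existing gsd/* branches).
function validateGitPreferences(
  raw: { isolation?: unknown; merge_to_main?: unknown },
  hasLegacyBranches: boolean,
): { prefs: GitPreferences; errors: string[] } {
  const errors: string[] = [];

  let isolation: Isolation = hasLegacyBranches ? "branch" : "worktree";
  const iso = raw.isolation;
  if (iso !== undefined) {
    if (iso === "worktree" || iso === "branch") isolation = iso;
    else errors.push(`git.isolation must be "worktree" or "branch", got ${JSON.stringify(iso)}`);
  }

  // Assumption: the default merge target follows the isolation mode.
  let mergeToMain: MergeToMain = isolation === "worktree" ? "milestone" : "slice";
  const mtm = raw.merge_to_main;
  if (mtm !== undefined) {
    if (mtm === "milestone" || mtm === "slice") mergeToMain = mtm;
    else errors.push(`git.merge_to_main must be "milestone" or "slice", got ${JSON.stringify(mtm)}`);
  }

  return { prefs: { isolation, merge_to_main: mergeToMain }, errors };
}
```

Invalid values fall back to the default and are reported in `errors` rather than thrown, so the caller decides whether to warn or stop. The sketch does not check whether a given combination of the two settings is meaningful.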
### S05 → S06

Produces:
- Structured git error handling patterns (try/abort/reset/retry)
- User-facing error message formatting

Consumes from S01:
- Worktree detection (to scope repair to correct working tree)

Consumes from S02:
- Merge operations that may fail

Consumes from S03:
- Milestone merge that may fail
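The try/abort/reset/retry pattern named above can be expressed as a small wrapper that takes the git operation and the recovery step as callbacks. This is a sketch, not the planned implementation: `withGitRecovery` and `RecoveryResult` are hypothetical names, the retry limit of 2 is an arbitrary choice, and the recovery callback stands in for whatever abort and reset commands S05 ends up using (for example `git merge --abort`).

```typescript
interface RecoveryResult<T> {
  ok: boolean;
  value?: T;
  attempts: number;
  lastError?: unknown;
}

// Runs `operation`; on failure runs `recover` (abort + reset) and retries,
// up to `maxAttempts` attempts in total.
function withGitRecovery<T>(
  operation: () => T,
  recover: (error: unknown) => void,
  maxAttempts = 2,
): RecoveryResult<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return { ok: true, value: operation(), attempts: attempt };
    } catch (error) {
      lastError = error;
      // Recover after every failure, including the last one, so the
      // working tree is left clean even when we give up.
      recover(error);
    }
  }
  return { ok: false, attempts: maxAttempts, lastError };
}
```

When the result has `ok: false`, the caller would pause auto-mode and translate `lastError` into a non-technical message, matching the S05 requirement that users never see raw git errors.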
### S06 → S07

Produces:
- Doctor git health check functions
- Simplified git-service.ts with dead code removed

Consumes from S05:
- Error handling patterns for doctor fix operations
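One of the doctor checks, detecting corrupt merge state, amounts to looking for leftover `MERGE_HEAD` or `SQUASH_MSG` files in the git directory. The sketch below covers detection only. `findStaleMergeState` is a hypothetical name, and the caller is assumed to pass the resolved git directory.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Marker files named in the S06 description.
const MERGE_STATE_FILES = ["MERGE_HEAD", "SQUASH_MSG"] as const;

// Returns the names of merge-state marker files present in `gitDir`.
// An empty array means no interrupted merge or squash was detected.
function findStaleMergeState(gitDir: string): string[] {
  return MERGE_STATE_FILES.filter((name) => fs.existsSync(path.join(gitDir, name)));
}
```

In a linked worktree `.git` is a file rather than a directory, so the git directory should be resolved first, for instance with `git rev-parse --git-dir`, instead of assuming `<worktree>/.git/`.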