Commit graph

605 commits

Author SHA1 Message Date
Lex Christopherson
498614c95f fix: inject observability warnings into agent prompt for enforcement (#174)
emitObservabilityWarnings only called ctx.ui.notify — the agent never
saw the warnings and ignored them entirely. Validator caught real issues
(missing observability sections, placeholder diagnostics) but had zero
enforcement.

Rename to collectObservabilityWarnings (returns issues), add
buildObservabilityRepairBlock to format issues as actionable prompt
instructions. Appended to the unit prompt so the agent reads flagged
files and fixes gaps before proceeding with the unit.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 09:20:19 -06:00
Lex Christopherson
788c356a25 fix: auto-detect headless environment for Playwright browser launch (#183)
Browser launch was hardcoded to headless: false, crashing on Linux
servers without a display server ($DISPLAY). Auto-detect headless
environments and also support FORCE_HEADLESS=true override.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 09:16:26 -06:00
Lex Christopherson
66196b4a4f fix: add configurable merge_strategy preference for slice completion (#167)
Squash merge was hardcoded, causing auto-mode to hard-stop when conflicts
arose from long-lived branches or frequently-changing .gsd/* artifacts.

Add git.merge_strategy preference ("squash" | "merge", default: squash).
"merge" uses --no-ff which preserves branch history and handles conflicts
from divergent branches more gracefully. Users hitting repeated squash
merge failures can set merge_strategy: merge in .gsd/preferences.md.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 09:16:26 -06:00
Lex Christopherson
71d3a69646 fix: verify UAT artifact before marking complete-slice done (#176, #175)
complete-slice verification only checked for the SUMMARY file, so when
the LLM skipped writing the UAT, the unit was marked complete and UAT
was never produced. Users saw doctor-created placeholder UATs instead
of real test scripts.

- verifyExpectedArtifact now checks both SUMMARY and UAT for complete-slice
- complete-slice prompt strengthened: step 7 requires concrete test cases,
  MUST line lists all three required artifacts with enforcement warning

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 09:16:26 -06:00
Lex Christopherson
94bd622f0c fix: update stale comment to reflect 7 runtime exclusion paths
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 09:16:26 -06:00
dan bachelder
064a15f988 fix(gsd-auto): require prior slices complete on main (#160) 2026-03-13 08:56:34 -06:00
deseltrus
87a1e51bc0 fix: smartStage fallback bypasses runtime exclusions when .gsd/ is gitignored (#168)
Three-layer fix for runtime files leaking into git commits:

1. Stage-then-unstage: replace pathspec excludes with git add -A followed
   by git reset HEAD for each exclusion. The old approach failed when .gsd/
   was in .gitignore — git exited non-zero before evaluating the excludes,
   and the catch fallback staged everything unconditionally.

2. Auto-cleanup: on first smartStage call per session, remove any runtime
   files that are already tracked in the index (from the fallback bug) via
   a dedicated commit. This is a one-time migration that self-heals repos
   where runtime files were accidentally committed.

3. Pre-checkout discard: after pre-switch auto-commits that exclude .gsd/,
   run git checkout -- .gsd/ to clear dirty runtime files that would
   otherwise block git checkout when the target branch has different
   tracked versions.

Also adds completed-units.json to RUNTIME_EXCLUSION_PATHS and
BASELINE_PATTERNS (was missing — metrics.json was listed but
completed-units.json was not).
2026-03-13 08:53:58 -06:00
deseltrus
dcd993064c feat(gsd): add discussion depth verification and context write-gate (#181)
- Prompt enhancements: sparring patterns, enrichment reflection, depth
  verification checkpoint, depth-preservation guidance for context.md
- Write-gate: block milestone CONTEXT.md writes until depth verification
  is confirmed via ask_user_questions with depth_verification id
- Discussion persistence: auto-save exchanges to DISCUSSION.md during
  discuss phase with structured markdown formatting
- Depth state management: track verification status, reset on discuss
  phase completion, expose getDiscussionMilestoneId() for gate checks
- 7 unit tests covering write-gate logic (pure function, no mocks needed)
2026-03-13 08:53:00 -06:00
TÂCHES
9009e5dd78 Merge pull request #158 from deseltrus/feat/guided-flow-escape-hatch
feat: add skip/discard escape hatches to no-roadmap wizard
2026-03-13 08:50:04 -06:00
Ryan Harrington
9f58583888 fix/gsd-graceful-exit: make /exit use graceful shutdown (#134)
* fix/gsd-graceful-exit: make /exit use graceful shutdown

* fix/gsd-graceful-exit: restore auto cleanup in exit command
2026-03-13 08:47:55 -06:00
TÂCHES
789a6645da feat: TTSR + blob/artifact storage (ported from oh-my-pi)
* docs(M002): context, requirements, and roadmap

* feat: port TTSR and blob/artifact storage from oh-my-pi

Phase 1 — TTSR (Time Traveling Stream Rules):
- TtsrManager: regex-based stream monitoring with scope filtering,
  repeat gating, and buffer isolation (picomatch replaces Bun.Glob)
- Rule loader: scans ~/.gsd/agent/rules/*.md and .gsd/rules/*.md
  with YAML frontmatter parsing; project rules override global
- TTSR extension: wires into pi event lifecycle (session_start,
  turn_start, message_update, turn_end, agent_end) to abort on
  match and inject violation as system reminder via sendMessage
- Interrupt template for rule violation injection

Phase 2 — Blob/Artifact Storage:
- BlobStore: content-addressed storage at ~/.gsd/agent/blobs/ using
  Node crypto (sha256), sync I/O, automatic deduplication
- ArtifactManager: session-scoped sequential artifact files stored
  alongside session JSONL (lazy dir creation, resume-safe ID scan)
- Session manager integration: prepareForPersistence externalizes
  images ≥1KB to blob store before JSONL write; resolveBlobRefs
  rehydrates on session load; truncates strings >500KB
- Bash tool artifact spill: uses ArtifactManager instead of temp
  files when available, includes artifact:// references in output

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: harden blob store, TTSR manager, and dep classification

- Validate SHA-256 hex format in BlobStore.get/has/parseBlobRef to
  prevent path traversal via crafted blob references
- Cap TTSR per-stream buffers at 512KB to prevent unbounded memory growth
- Move picomatch from devDependencies to dependencies (runtime import)
- Warn on invalid regex in TTSR rule conditions instead of silent skip
- Remove .gsd/ planning files that were force-added past .gitignore
- Add trailing newline to ttsr-interrupt.md

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* test: add tests for blob store, artifact manager, TTSR manager, and rule loader

55 tests covering:
- BlobStore put/get/has, idempotency, path traversal rejection
- parseBlobRef/isBlobRef validation, externalize/resolve round-trips
- ArtifactManager sequential IDs, lazy dir creation, session resume
- TtsrManager rule matching, scope filtering, buffer isolation,
  repeat gating, buffer size cap, injection persistence
- Rule loader frontmatter parsing, directory scanning, merge logic

Also fixes BlobStore constructor to avoid TS parameter property syntax
(incompatible with Node's strip-only TypeScript mode).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 08:43:56 -06:00
Lex Christopherson
5155d69d55 test(M002/S06): Test coverage
Tasks:
- chore(M002/S06): auto-commit after complete-slice
- chore(M002/S06): auto-commit after complete-slice
- chore(M002/S06/T02): auto-commit after execute-task
- chore(M002/S06/T02): auto-commit after execute-task
- chore(M002/S06/T01): auto-commit after execute-task
- chore(M002/S06/T01): auto-commit after execute-task
- chore(M002/S06): auto-commit after plan-slice
- chore: update state for S06 execution
- docs(S06): add slice plan

Branch: gsd/M002/S06
2026-03-13 08:04:27 -06:00
Lex Christopherson
8c549bd9c7 feat(M002/S05): Intent-ranked retrieval and semantic actions
Tasks:
- chore(M002/S05): auto-commit after complete-slice
- chore(M002/S05): auto-commit after complete-slice
- chore(M002/S05/T01): auto-commit after execute-task
- chore(M002/S05/T01): auto-commit after execute-task
- chore(M002/S05): auto-commit after plan-slice
- docs(S05): add slice plan

Branch: gsd/M002/S05
2026-03-13 08:04:27 -06:00
Lex Christopherson
57f6b0fd90 feat(M002/S04): Form intelligence
Tasks:
- chore(M002/S04): auto-commit after complete-slice
- chore(M002/S04): auto-commit after complete-slice
- chore(M002/S04/T02): auto-commit after execute-task
- chore(M002/S04/T02): auto-commit after execute-task
- chore(M002/S04/T01): auto-commit after execute-task
- chore(M002/S04): auto-commit after plan-slice
- docs(S04): add slice plan

Branch: gsd/M002/S04
2026-03-13 08:04:27 -06:00
Lex Christopherson
68cf10eed5 feat(M002/S03): Screenshot pipeline
Tasks:
- chore(M002/S03): auto-commit after complete-slice
- chore(M002/S03): auto-commit after complete-slice
- chore(M002/S03/T01): auto-commit after execute-task
- chore(M002/S03/T01): auto-commit after execute-task
- chore(M002/S03): auto-commit after plan-slice
- docs(S03): add slice plan

Branch: gsd/M002/S03
2026-03-13 08:04:27 -06:00
Lex Christopherson
3590b32252 feat(M002/S02): Action pipeline performance
Tasks:
- chore(M002/S02): auto-commit after complete-slice
- chore(M002/S02): auto-commit after complete-slice
- chore(M002/S02/T02): auto-commit after execute-task
- chore(M002/S02/T02): auto-commit after execute-task
- chore(M002/S02/T01): auto-commit after execute-task
- fix: expand tool outputs by default and pull main before slice merge
- chore(M002/S02): auto-commit after plan-slice
- docs(S02): add slice plan

Branch: gsd/M002/S02
2026-03-13 08:04:27 -06:00
Lex Christopherson
ee6dce643b chore(M002/S01): auto-commit after reassess-roadmap 2026-03-13 08:04:27 -06:00
deseltrus
f021d8cafa fix: use getExpandedText() to preserve large paste content in notes (#169)
interview-ui.ts saveEditorToState() was calling getText() which returns
paste markers like '[paste #1 2033 chars]' for content >1000 chars or
>10 lines. The actual pasted content was stored in the Editor's paste
map but never expanded back.

This silently discards user input in ask_user_questions notes — any
substantive response (voice transcripts, detailed explanations, extended
enrichments) that exceeds the paste threshold gets replaced with a
marker string. The LLM receives '[paste #1 N chars]' instead of the
user's actual words.

Fix: getText() → getExpandedText() — the Editor already has the method
that expands paste markers to their stored content. One-line change.
2026-03-13 08:01:14 -06:00
deseltrus
c276eac292 feat: add skip/discard escape hatches to no-roadmap wizard 2026-03-13 07:28:05 +01:00
TÂCHES
24302942d0 Merge pull request #153 from Jah-yee/main
fix: make /exit use graceful shutdown, add /kill for immediate exit
2026-03-12 22:49:12 -06:00
SparkLabScout
1217e03225 fix: make /exit use graceful shutdown, add /kill for immediate exit
- /exit now calls stopAuto() before exiting to save activity log and clear locks
- Added new /kill command for immediate exit without cleanup
- Fixes issue #132: /exit terminates too abruptly and leaves terminal state dirty
2026-03-13 12:29:15 +08:00
TÂCHES
2a3c2b5194 Merge pull request #151 from dbachelder/fix/pi-provider-reuse-and-extension-loading
fix: reuse Pi provider config and load Pi extensions correctly
2026-03-12 22:25:15 -06:00
TÂCHES
18348e2103 Merge pull request #135 from Jamie-BitFlight/feat/model-fallbacks
feat: add model fallback support for auto-mode phases
2026-03-12 22:24:13 -06:00
Lex Christopherson
4b09cd9a91 fix: exclude packages/ from test resolver .js→.ts rewriting
The ESM resolve hook was rewriting .js imports from vendored Pi
packages (packages/*/dist/) to .ts, breaking test resolution.
Compiled dist/ files need their .js specifiers left intact.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 22:01:00 -06:00
Lex Christopherson
c80d640d35 feat: vendor Pi source into workspace monorepo
Vendor all 4 Pi packages (tui, ai, agent-core, coding-agent) from
pi-mono v0.57.1 as @gsd/* workspace packages under packages/. This
replaces the compiled npm dependency (@mariozechner/pi-coding-agent)
and patch-package workflow, giving direct source access for
modifications.

- Copy Pi source from pi-mono v0.57.1 into packages/
- Create workspace package.json + tsconfig.json for each package
- Rename ~240 imports from @mariozechner/pi-* to @gsd/pi-*
- Apply existing patches as source edits (setModel persist, VT input)
- Remove @mariozechner/pi-coding-agent dep and patch-package
- Update build pipeline to build packages in dependency order
- Add pi-upstream git remote for future selective syncing

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 21:55:17 -06:00
dan
00cb2f36a8 fix: reuse pi provider config and extension loading 2026-03-12 20:44:09 -07:00
Lex Christopherson
f6d56991df fix: doctor post-hook no longer preempts complete-slice dispatch
Doctor's fix:true mode was creating summary stubs and marking slices
done in the roadmap during the post-hook after every task. This
short-circuited the complete-slice dispatch unit — by the time
dispatchNextUnit ran, the slice was already 'done' and the merge
guard merged it to main, so complete-slice (which writes the real
compressed summary) never got a chance to run.

Root cause: doctor conflated two responsibilities — task-level
bookkeeping (marking checkboxes) and completion state transitions
(summary stubs, roadmap marking). The post-hook should only do
the former.

Fix: added fixLevel option to runGSDDoctor. fixLevel:'task' (used
by post-hook) skips completion transition codes. fixLevel:'all'
(default, used by manual gsd doctor and resume) preserves existing
recovery behavior.

Completion transition codes gated by fixLevel:
- all_tasks_done_missing_slice_summary
- all_tasks_done_missing_slice_uat
- all_tasks_done_roadmap_not_checked
2026-03-12 21:35:57 -06:00
Lex Christopherson
d4a46d36dd fix: restore main_branch preference and implement runPreMergeCheck
Restores main_branch field on GitPreferences (removed in a prior merge conflict
resolution) and adds VALID_BRANCH_NAME validation in getMainBranch(). Implements
runPreMergeCheck with auto-detection from package.json test scripts and support
for custom commands via prefs.pre_merge_check string values.

Fixes 5 pre-existing test failures in git-service.test.ts (158/158 now pass).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 21:35:57 -06:00
Lex Christopherson
af2061bbe1 fix: cap recovery/retry prompt injection to prevent V8 OOM (#139)
The crash loop: stale state → unit redispatched → activity log grows →
retry diagnostic reads full log → prompt grows → replaceAll on huge
string → V8 heap exhaustion. Cap both the read path (10MB JSONL parse
limit) and the injection path (50K char prompt cap) to break the cycle.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 21:35:57 -06:00
Lex Christopherson
db2a409d7d fix: exclude .gsd/ from pre-switch auto-commits to prevent squash merge conflicts (#143)
Pre-switch auto-commits were including .gsd/ planning artifacts (roadmaps, STATE.md)
on both sides of a branch switch, causing reliable merge conflicts when squash-merging
slices back to main. Now pre-switch auto-commits exclude the entire .gsd/ directory,
while post-task auto-commits continue to include them normally.

Also restores VALID_BRANCH_NAME export removed in a prior merge conflict resolution.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 21:35:57 -06:00
Lex Christopherson
7a1eac6af3 feat(M001): proactive secret management
Front-load API key collection into GSD's planning phase so auto-mode
runs uninterrupted. Planning prompts forecast secrets into a manifest,
auto-mode collects pending keys before dispatching the first slice.

- getManifestStatus() queries manifest state against env
- collectSecretsFromManifest() orchestrates summary, collection, manifest update
- showSecretsSummary() read-only TUI summary with status indicators
- collectOneSecret() enhanced with guidance display above masked input
- Secrets gate in startAuto() — non-fatal, inherited by guided flow
- 19 new tests (manifest-status, collect-from-manifest, auto-secrets-gate)
- All 10 requirements (R001-R010) validated
2026-03-12 21:35:57 -06:00
vp275
d7a90cf0e6 Add --continue / -c flag to resume the most recent session
Uses the existing SessionManager.continueRecent() from the Pi SDK
to load the most recent session for the current working directory.
Mirrors the --continue flag already available in the base Pi CLI.
2026-03-13 07:51:29 +05:30
Lex Christopherson
39f0df45d5 fix: abort squash-merge on conflict and stop auto-mode instead of looping (#merge-bug-fix)
mergeSliceToMain now runs git reset --hard if git merge --squash fails,
restoring a clean working tree instead of leaving conflict markers.

The merge guard catch block in auto.ts now:
1. Detects leftover conflicted state (UU/AA/UD in porcelain status)
2. Resets the working tree if conflicts remain
3. Stops auto-mode with a clear error instead of continuing with
   corrupted .gsd/ state files that cause an infinite dispatch loop

Also fixes conflict markers in loader.ts, logo.ts, and postinstall.js
that were baked into main from a prior bad merge resolution.
2026-03-12 15:32:39 -06:00
Lex Christopherson
63f9a84e8a feat(M002/S02): enhanced secure_env_collect UX — checkExistingEnvKeys, detectDestination, guidance field, auto-detection 2026-03-12 15:32:39 -06:00
Lex Christopherson
d6ae48d2bf fix: resolve remaining merge conflict in git-service.ts 2026-03-12 15:32:39 -06:00
Lex Christopherson
86aba6ec59 fix: resolve merge conflicts from S01 branch merge into main 2026-03-12 15:32:39 -06:00
Lex Christopherson
a64508e7a8 chore: auto-commit before switching to gsd/M002/S01 2026-03-12 15:32:39 -06:00
Lex Christopherson
1c68dc2906 chore: auto-commit before switching to gsd/M002/S01 2026-03-12 15:32:39 -06:00
Lex Christopherson
f0459785c6 refactor: right-size pipeline for simple work
Remove task/project/product classification taxonomy from discuss prompt.
The LLM now sizes work based on judgment, not labels.

Key changes:
- discuss.md: Replace 3-tier classification with judgment-based sizing.
  Remove hard minimum question rounds (2 for task, 4 for project).
  Questioning depth now matches actual scope.

- plan-milestone.md: Add right-sizing doctrine. Single-slice milestones
  now write the S01 plan + task plans inline, eliminating separate
  research-slice and plan-slice sessions.

- plan-slice.md: Add right-sizing guidance. Make Proof Level,
  Integration Closure, and Observability sections conditional —
  omit entirely for simple slices instead of filling with 'none'.
  Consolidate self-audit from 10 items to 7 (remove duplicates).

- auto.ts: Skip research-slice for S01 when milestone research exists.
  Update peekNext label for plan-milestone.

- complete-slice.md: Add effort-matching guidance. Lighten observability
  verification for simple slices.

- execute-task.md: Make observability steps conditional on task plan
  content rather than always required.

- templates (plan.md, task-plan.md): Add comments making heavyweight
  sections explicitly optional for simple work.

Pipeline reduction for simple 1-slice milestone:
  Before: 9-10 sessions (research-M, plan-M, research-S, plan-S,
          execute×N, complete-S, reassess, complete-M)
  After:  5-6 sessions (research-M, plan-M [+S01 inline],
          execute×N, complete-S, complete-M)
2026-03-12 14:33:42 -06:00
Jamie McGregor Nelson
f1cf77a738 feat: add model fallback support for auto-mode phases
Adds support for specifying fallback models in GSD preferences. When a
primary model fails to switch (provider unavailable, rate limited, etc.),
GSD automatically tries the next model in the fallbacks list.

Changes:
- Add GSDPhaseModelConfig interface for per-phase model with fallbacks
- Add resolveModelWithFallbacksForUnit() function
- Update model switching in auto.ts to try fallbacks in order
- Update preferences-reference.md with fallback examples

Example usage:
```yaml
models:
  planning:
    model: claude-opus-4-6
    fallbacks:
      - openrouter/z-ai/glm-5
      - openrouter/minimax/minimax-m2.5
```

This enables cost-optimized configurations with resilience against
provider outages or credit exhaustion.
2026-03-12 16:15:54 -04:00
Lex Christopherson
ea1dbd26f5 test: add main_branch preference tests to git-service
Covers VALID_BRANCH_NAME regex validation, configured preference
returns correctly, fallback to auto-detection, and injection rejection.

Closes #108

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 14:14:26 -06:00
Lex Christopherson
9200401e55 chore: auto-commit before switching to gsd/M002/S01 2026-03-12 14:01:10 -06:00
Lex Christopherson
d43322c45d feat(M001/S05): Enhanced features — merge guards, snapshots, auto-push, rich commits 2026-03-12 13:21:18 -06:00
Lex Christopherson
d9d773e44e feat(M001/S04): Remove git commands from prompts 2026-03-12 13:21:05 -06:00
Lex Christopherson
b2e7dbdc25 feat(M001/S03): Bug fixes and doc corrections 2026-03-12 13:21:05 -06:00
Lex Christopherson
dfe9527641 feat(M001/S02): Wire GitService into codebase 2026-03-12 13:21:05 -06:00
Lex Christopherson
91cf23a634 fix(auto): prevent state machine deadlock when units fail to produce artifacts
Three fixes to the dispatch loop:

1. Don't mark a unit complete when the next dispatch is the same unit
   (retry scenario) — let the retry mechanism handle it instead of
   persisting a false completion.

2. Verify expected artifact exists on disk before marking a unit
   complete. Uses resolveExpectedArtifactPath + existsSync to gate
   persistCompletedKey calls.

3. Cross-validate idempotency: when skipping a "completed" unit, verify
   the artifact actually exists. If missing, remove the stale record
   from completed-units.json and re-run the unit.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 13:21:05 -06:00
TÂCHES
108f6b4f1d Merge pull request #79 from FacuVCanale/feat/native-web-search
feat: native Anthropic web search via before_provider_request hook
2026-03-12 11:56:00 -06:00
Lex Christopherson
b72d852771 feat: migrate provider credentials from existing Pi install
Closes #122

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 11:08:34 -06:00
Lex Christopherson
574acae114 refactor(prompts): compress system prompt from 360 to 187 lines
Cut 48% of system prompt token cost while preserving all load-bearing content:
- Remove Activity Logs section (agent never interacts with these)
- Remove Investigation escalation ladder (redundant with tool-routing)
- Remove Context economy section (obvious/redundant)
- Remove Web research vs browser execution (compressed into playbooks)
- Compress tool-routing to non-obvious entries only (scout, bg_shell, Context7)
- Compress Ask vs infer to core rule
- Compress Code structure to 5 key principles
- Compress Verification to inline task-type table
- Compress Agent-First Observability (character block carries the why)
- Compress Background processes playbook from 30 to 5 lines
- Compress Web behavior playbook from 25 to 6 lines
- Compress Libraries and Current facts into single section
- Remove BRAVE_API_KEY config (user-facing, not agent-facing)
2026-03-12 11:07:26 -06:00