3535 commits

Author SHA1 Message Date
Lex Christopherson
af2061bbe1 fix: cap recovery/retry prompt injection to prevent V8 OOM (#139)
The crash loop: stale state → unit redispatched → activity log grows →
retry diagnostic reads full log → prompt grows → replaceAll on huge
string → V8 heap exhaustion. Cap both the read path (10MB JSONL parse
limit) and the injection path (50K char prompt cap) to break the cycle.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 21:35:57 -06:00
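The injection-path cap described in af2061bbe1 can be sketched roughly as follows. This is an illustration only, not the project's code: `MAX_INJECTED_CHARS`, `capInjectedText`, the truncation marker, and the choice to keep the tail of the log rather than the head are all assumptions. The commit states only that the prompt is capped at 50K characters.

```typescript
// Hypothetical constant: the commit mentions a 50K character prompt cap.
const MAX_INJECTED_CHARS = 50_000;

/**
 * Returns `text` unchanged when it fits within `limit` characters.
 * Otherwise keeps the most recent (trailing) portion, prefixed with a marker,
 * so the result is never longer than `limit`.
 */
function capInjectedText(text: string, limit: number = MAX_INJECTED_CHARS): string {
  if (text.length <= limit) return text;
  const marker = "[truncated]\n";
  // Degenerate limits: no room for the marker, so return the bare tail.
  if (limit <= marker.length) return text.slice(text.length - Math.max(limit, 0));
  return marker + text.slice(text.length - (limit - marker.length));
}
```

Capping before any string substitution keeps the later `replaceAll` working on a bounded input, which is the point at which the commit says the heap was exhausted.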
Lex Christopherson
db2a409d7d fix: exclude .gsd/ from pre-switch auto-commits to prevent squash merge conflicts (#143)
Pre-switch auto-commits were including .gsd/ planning artifacts (roadmaps, STATE.md)
on both sides of a branch switch, causing reliable merge conflicts when squash-merging
slices back to main. Now pre-switch auto-commits exclude the entire .gsd/ directory,
while post-task auto-commits continue to include them normally.

Also restores VALID_BRANCH_NAME export removed in a prior merge conflict resolution.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 21:35:57 -06:00
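One way to express the exclusion from db2a409d7d is git's `:(exclude)` pathspec magic. The commit does not say how the exclusion is implemented, so the sketch below is an assumption: `preSwitchAddArgs` is a hypothetical helper that only builds the argument list for `git add`.

```typescript
/**
 * Builds arguments for `git add` that stage everything except the given
 * directories, using git's `:(exclude)` pathspec magic.
 * Hypothetical helper; intended to be run from the repository root.
 */
function preSwitchAddArgs(excludedDirs: string[] = [".gsd"]): string[] {
  return ["add", "-A", "--", ".", ...excludedDirs.map((dir) => `:(exclude)${dir}`)];
}
```

With the default argument this corresponds to `git add -A -- . ':(exclude).gsd'`. A post-task auto-commit would simply not pass the exclusion.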
Lex Christopherson
7a1eac6af3 feat(M001): proactive secret management
Front-load API key collection into GSD's planning phase so auto-mode
runs uninterrupted. Planning prompts forecast secrets into a manifest,
auto-mode collects pending keys before dispatching the first slice.

- getManifestStatus() queries manifest state against env
- collectSecretsFromManifest() orchestrates summary, collection, manifest update
- showSecretsSummary() read-only TUI summary with status indicators
- collectOneSecret() enhanced with guidance display above masked input
- Secrets gate in startAuto() — non-fatal, inherited by guided flow
- 19 new tests (manifest-status, collect-from-manifest, auto-secrets-gate)
- All 10 requirements (R001-R010) validated
2026-03-12 21:35:57 -06:00
Lex Christopherson
dc3c2e7d76 docs(M001): context, requirements, and roadmap 2026-03-12 21:35:57 -06:00
TÂCHES
cf0ab43b14 Merge pull request #148 from vp275/feat/continue-flag
Add --continue / -c flag to resume most recent session
2026-03-12 21:02:58 -06:00
vp275
d7a90cf0e6 Add --continue / -c flag to resume the most recent session
Uses the existing SessionManager.continueRecent() from the Pi SDK
to load the most recent session for the current working directory.
Mirrors the --continue flag already available in the base Pi CLI.
2026-03-13 07:51:29 +05:30
Lex Christopherson
59a4d06fae 2.5.1 2026-03-12 15:32:39 -06:00
Lex Christopherson
6b358491b3 docs: update changelog for v2.5.1
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 15:32:39 -06:00
Lex Christopherson
6139a22c44 chore: delete old .gsd 2026-03-12 15:32:39 -06:00
Lex Christopherson
39f0df45d5 fix: abort squash-merge on conflict and stop auto-mode instead of looping (#merge-bug-fix)
mergeSliceToMain now runs git reset --hard if git merge --squash fails,
restoring a clean working tree instead of leaving conflict markers.

The merge guard catch block in auto.ts now:
1. Detects leftover conflicted state (UU/AA/UD in porcelain status)
2. Resets the working tree if conflicts remain
3. Stops auto-mode with a clear error instead of continuing with
   corrupted .gsd/ state files that cause an infinite dispatch loop

Also fixes conflict markers in loader.ts, logo.ts, and postinstall.js
that were baked into main from a prior bad merge resolution.
2026-03-12 15:32:39 -06:00
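The conflict detection in step 1 of 39f0df45d5 can be sketched as a check over `git status --porcelain` output. The commit names UU, AA, and UD; the set below adds the other unmerged XY codes listed in the git documentation. `hasUnmergedPaths` is a hypothetical name, not the function in auto.ts.

```typescript
// XY status codes that `git status --porcelain` (v1) reports for unmerged paths.
const UNMERGED_CODES = new Set(["DD", "AU", "UD", "UA", "DU", "AA", "UU"]);

/** Returns true when any line of porcelain output starts with an unmerged code. */
function hasUnmergedPaths(porcelainOutput: string): boolean {
  return porcelainOutput
    .split("\n")
    .some((line) => UNMERGED_CODES.has(line.slice(0, 2)));
}
```

A caller would reset the working tree and stop when this returns true, rather than dispatching the next unit against conflicted state files.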
Lex Christopherson
63f9a84e8a feat(M002/S02): enhanced secure_env_collect UX — checkExistingEnvKeys, detectDestination, guidance field, auto-detection 2026-03-12 15:32:39 -06:00
Lex Christopherson
1b87bd046f docs(M002): reassess roadmap after S01 — no changes needed 2026-03-12 15:32:39 -06:00
Lex Christopherson
d6ae48d2bf fix: resolve remaining merge conflict in git-service.ts 2026-03-12 15:32:39 -06:00
Lex Christopherson
86aba6ec59 fix: resolve merge conflicts from S01 branch merge into main 2026-03-12 15:32:39 -06:00
Lex Christopherson
a64508e7a8 chore: auto-commit before switching to gsd/M002/S01 2026-03-12 15:32:39 -06:00
Lex Christopherson
1c68dc2906 chore: auto-commit before switching to gsd/M002/S01 2026-03-12 15:32:39 -06:00
TÂCHES
488d0fd4fb Merge pull request #136 from gsd-build/feat/pipeline-right-sizing
refactor: right-size pipeline for simple work
2026-03-12 14:54:42 -06:00
Lex Christopherson
f0459785c6 refactor: right-size pipeline for simple work
Remove task/project/product classification taxonomy from discuss prompt.
The LLM now sizes work based on judgment, not labels.

Key changes:
- discuss.md: Replace 3-tier classification with judgment-based sizing.
  Remove hard minimum question rounds (2 for task, 4 for project).
  Questioning depth now matches actual scope.

- plan-milestone.md: Add right-sizing doctrine. Single-slice milestones
  now write the S01 plan + task plans inline, eliminating separate
  research-slice and plan-slice sessions.

- plan-slice.md: Add right-sizing guidance. Make Proof Level,
  Integration Closure, and Observability sections conditional —
  omit entirely for simple slices instead of filling with 'none'.
  Consolidate self-audit from 10 items to 7 (remove duplicates).

- auto.ts: Skip research-slice for S01 when milestone research exists.
  Update peekNext label for plan-milestone.

- complete-slice.md: Add effort-matching guidance. Lighten observability
  verification for simple slices.

- execute-task.md: Make observability steps conditional on task plan
  content rather than always required.

- templates (plan.md, task-plan.md): Add comments making heavyweight
  sections explicitly optional for simple work.

Pipeline reduction for simple 1-slice milestone:
  Before: 9-10 sessions (research-M, plan-M, research-S, plan-S,
          execute×N, complete-S, reassess, complete-M)
  After:  5-6 sessions (research-M, plan-M [+S01 inline],
          execute×N, complete-S, complete-M)
2026-03-12 14:33:42 -06:00
Jamie McGregor Nelson
f1cf77a738 feat: add model fallback support for auto-mode phases
Adds support for specifying fallback models in GSD preferences. When a
primary model fails to switch (provider unavailable, rate limited, etc.),
GSD automatically tries the next model in the fallbacks list.

Changes:
- Add GSDPhaseModelConfig interface for per-phase model with fallbacks
- Add resolveModelWithFallbacksForUnit() function
- Update model switching in auto.ts to try fallbacks in order
- Update preferences-reference.md with fallback examples

Example usage:
```yaml
models:
  planning:
    model: claude-opus-4-6
    fallbacks:
      - openrouter/z-ai/glm-5
      - openrouter/minimax/minimax-m2.5
```

This enables cost-optimized configurations with resilience against
provider outages or credit exhaustion.
2026-03-12 16:15:54 -04:00
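The fallback order described in f1cf77a738 amounts to trying the primary model, then each fallback in sequence. The sketch below is a simplified, synchronous illustration; `PhaseModelConfig`, `switchWithFallbacks`, and the `trySwitch` callback are hypothetical and are not the `GSDPhaseModelConfig` or `resolveModelWithFallbacksForUnit()` named in the commit. A real switch would likely be asynchronous.

```typescript
// Hypothetical shape mirroring the YAML example: one model plus optional fallbacks.
interface PhaseModelConfig {
  model: string;
  fallbacks?: string[];
}

/**
 * Tries the primary model first, then each fallback in order.
 * Returns the first model id for which `trySwitch` succeeds, or undefined
 * when every candidate fails. A thrown error counts as a failed switch.
 */
function switchWithFallbacks(
  config: PhaseModelConfig,
  trySwitch: (modelId: string) => boolean,
): string | undefined {
  for (const candidate of [config.model, ...(config.fallbacks ?? [])]) {
    try {
      if (trySwitch(candidate)) return candidate;
    } catch {
      // Fall through to the next candidate.
    }
  }
  return undefined;
}
```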
Lex Christopherson
ea1dbd26f5 test: add main_branch preference tests to git-service
Covers VALID_BRANCH_NAME regex validation, configured preference
returns correctly, fallback to auto-detection, and injection rejection.

Closes #108

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 14:14:26 -06:00
Lex Christopherson
9200401e55 chore: auto-commit before switching to gsd/M002/S01 2026-03-12 14:01:10 -06:00
Lex Christopherson
4e92a49d45 docs(M002): context, requirements, and roadmap 2026-03-12 14:01:07 -06:00
Lex Christopherson
ad2931a521 2.5.0 2026-03-12 13:24:24 -06:00
Lex Christopherson
57e118c144 docs: update changelog for v2.5.0 2026-03-12 13:24:20 -06:00
Lex Christopherson
b17ab25aaa chore: remove .gsd/ from tracking (already in .gitignore) 2026-03-12 13:21:18 -06:00
Lex Christopherson
cd01a47461 feat(M001/S06): Cleanup and archive 2026-03-12 13:21:18 -06:00
Lex Christopherson
d43322c45d feat(M001/S05): Enhanced features — merge guards, snapshots, auto-push, rich commits 2026-03-12 13:21:18 -06:00
Lex Christopherson
d9d773e44e feat(M001/S04): Remove git commands from prompts 2026-03-12 13:21:05 -06:00
Lex Christopherson
b2e7dbdc25 feat(M001/S03): Bug fixes and doc corrections 2026-03-12 13:21:05 -06:00
Lex Christopherson
dfe9527641 feat(M001/S02): Wire GitService into codebase 2026-03-12 13:21:05 -06:00
Lex Christopherson
91cf23a634 fix(auto): prevent state machine deadlock when units fail to produce artifacts
Three fixes to the dispatch loop:

1. Don't mark a unit complete when the next dispatch is the same unit
   (retry scenario) — let the retry mechanism handle it instead of
   persisting a false completion.

2. Verify expected artifact exists on disk before marking a unit
   complete. Uses resolveExpectedArtifactPath + existsSync to gate
   persistCompletedKey calls.

3. Cross-validate idempotency: when skipping a "completed" unit, verify
   the artifact actually exists. If missing, remove the stale record
   from completed-units.json and re-run the unit.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 13:21:05 -06:00
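Fixes 2 and 3 in 91cf23a634 both come down to trusting a completion record only when its artifact exists on disk. A minimal sketch under that reading: `reconcileCompleted` and its parameters are hypothetical, and the file check is injected so that a caller could pass `existsSync` from `node:fs`.

```typescript
/**
 * Splits recorded completions into those whose artifact is present and
 * those that are stale (recorded as complete, but the artifact is missing).
 * `artifactPathFor` and `exists` are supplied by the caller; in practice
 * `exists` could be `existsSync` from node:fs.
 */
function reconcileCompleted(
  completedKeys: string[],
  artifactPathFor: (unitKey: string) => string,
  exists: (path: string) => boolean,
): { verified: string[]; stale: string[] } {
  const verified: string[] = [];
  const stale: string[] = [];
  for (const key of completedKeys) {
    (exists(artifactPathFor(key)) ? verified : stale).push(key);
  }
  return { verified, stale };
}
```

Stale keys would be removed from the persisted record and their units re-run, instead of being skipped as already done.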
TÂCHES
108f6b4f1d Merge pull request #79 from FacuVCanale/feat/native-web-search
feat: native Anthropic web search via before_provider_request hook
2026-03-12 11:56:00 -06:00
Lex Christopherson
428865a149 chore: auto-commit before switching to gsd/M001/S02 2026-03-12 11:31:04 -06:00
Lex Christopherson
ebacc5ad88 chore(M001/S02): auto-commit after complete-slice 2026-03-12 11:19:31 -06:00
Lex Christopherson
71984a8f0f 2.4.0 2026-03-12 11:16:20 -06:00
Lex Christopherson
f8c33aeea9 docs: update changelog for v2.4.0
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 11:16:12 -06:00
Lex Christopherson
631b2e0b86 docs: mention Pi credential migration in first launch section
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 11:14:16 -06:00
TÂCHES
352c55b70d Merge pull request #123 from gsd-build/fix/122-pi-provider-migration
feat: migrate provider credentials from existing Pi install
2026-03-12 11:10:19 -06:00
Lex Christopherson
b72d852771 feat: migrate provider credentials from existing Pi install
Closes #122

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 11:08:34 -06:00
Lex Christopherson
54f79b7a31 Merge feat/lean-system-prompt: 48% system prompt compression 2026-03-12 11:07:33 -06:00
Lex Christopherson
574acae114 refactor(prompts): compress system prompt from 360 to 187 lines
Cut 48% of system prompt token cost while preserving all load-bearing content:
- Remove Activity Logs section (agent never interacts with these)
- Remove Investigation escalation ladder (redundant with tool-routing)
- Remove Context economy section (obvious/redundant)
- Remove Web research vs browser execution (compressed into playbooks)
- Compress tool-routing to non-obvious entries only (scout, bg_shell, Context7)
- Compress Ask vs infer to core rule
- Compress Code structure to 5 key principles
- Compress Verification to inline task-type table
- Compress Agent-First Observability (character block carries the why)
- Compress Background processes playbook from 30 to 5 lines
- Compress Web behavior playbook from 25 to 6 lines
- Compress Libraries and Current facts into single section
- Remove BRAVE_API_KEY config (user-facing, not agent-facing)
2026-03-12 11:07:26 -06:00
Lex Christopherson
fa6d085eb7 Merge feat/gsd-craft-standards: security, completeness, observability 2026-03-12 10:58:43 -06:00
Lex Christopherson
4855d0a37b feat(prompts): add craft standards, completeness, and self-debugging awareness
Three additions to the GSD character block:
- Security/performance/elegance as craft instinct, not checkbox compliance
- Anti-laziness: finish what you start, no stubs, no 80% features, no skipped error handling
- Self-debugging awareness: you write code you will debug later with no memory of writing it
2026-03-12 10:58:38 -06:00
Lex Christopherson
f6dfffb61e feat(M001/S01): GitService core implementation 2026-03-12 10:56:10 -06:00
Lex Christopherson
454f104747 Merge feat/gsd-character: craftsman-engineer identity for GSD 2026-03-12 10:53:29 -06:00
Lex Christopherson
d8612ab15e feat(prompts): define GSD character and consolidate communication style
Replace the generic agent intro with a craftsman-engineer character
definition: curious about problems, warm but terse, co-owner during
planning, committed executor during auto-mode. Consolidate the
scattered Communication and Writing Style + Work Narration sections
into a single focused Communication section that preserves all
calibration signals (pushback triggers, narration examples, uncertainty
handling).
2026-03-12 10:53:23 -06:00
Facu_Viñas
a595b9e28e fix: prevent duplicate tools on provider toggle, suppress restore notifications, fix Windows test globs
- Prevent duplicate Brave tool entries when toggling providers repeatedly
  by filtering already-active tools before re-adding (BUG-1)
- Remove single quotes from test glob patterns in package.json so Windows
  shell expands them correctly (BUG-2)
- Fix test mock fire() to call all handlers instead of short-circuiting
  on first match, matching real framework behavior (BUG-3)
- Suppress "Native Anthropic web search active" notification on session
  restore (source: "restore") to reduce UX noise (BUG-4)
- Add regression tests for all 4 bugs

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 13:50:03 -03:00
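The duplicate-prevention fix (BUG-1) in a595b9e28e, filtering already-active tools before re-adding, can be illustrated on plain tool names. `addMissingTools` is a hypothetical helper; the actual tool registry API is not shown in the commit.

```typescript
/**
 * Returns the active tool names plus any names from `toAdd` that are not
 * already present, preserving order and never introducing duplicates.
 */
function addMissingTools(active: string[], toAdd: string[]): string[] {
  const seen = new Set(active);
  const result = [...active];
  for (const name of toAdd) {
    if (!seen.has(name)) {
      seen.add(name);
      result.push(name);
    }
  }
  return result;
}
```

Because the function is idempotent, toggling providers repeatedly leaves the list unchanged after the first call.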
Facu_Viñas
e22a2f7622 fix: remove Brave search tools from API payload when no BRAVE_API_KEY
The model_select event doesn't reliably fire on startup, so Brave tools
remained visible to Claude even without a key. Now before_provider_request
filters search-the-web and search_and_read from the payload directly,
ensuring Claude only sees the native web_search tool.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 13:50:03 -03:00
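The payload filtering in e22a2f7622 can be sketched as below. The tool names `search-the-web` and `search_and_read` come from the commit; the `{ name: string }` tool shape, the `hasBraveKey` flag, and the function name `withoutBraveTools` are assumptions about what the hook has access to.

```typescript
// Tool names taken from the commit message.
const BRAVE_TOOL_NAMES = new Set(["search-the-web", "search_and_read"]);

/**
 * Drops the Brave-backed search tools from a tool list when no Brave key
 * is configured; returns the list unchanged otherwise.
 */
function withoutBraveTools<T extends { name: string }>(tools: T[], hasBraveKey: boolean): T[] {
  return hasBraveKey ? tools : tools.filter((tool) => !BRAVE_TOOL_NAMES.has(tool.name));
}
```

Filtering at request time does not depend on a startup event having fired, which is the failure mode the commit describes.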
Facu_Viñas
2252a6dfca fix: strip thinking blocks from history to fix conversation replay error
The Pi SDK's streaming parser drops server_tool_use and
web_search_tool_result content blocks. When the conversation is replayed,
assistant messages are incomplete, causing the Anthropic API to reject
requests with "thinking blocks cannot be modified."

Fix: stripThinkingFromHistory() removes thinking/redacted_thinking blocks
from all assistant messages before sending, since they're all from stored
history. The model generates fresh thinking for each new turn.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 13:50:03 -03:00
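The history rewrite in 2252a6dfca can be sketched as a filter over assistant message content. The block type names `thinking` and `redacted_thinking` are those used by the Anthropic Messages API; the `Message` and `ContentBlock` shapes and the name `stripThinkingBlocks` are simplified assumptions, not the project's `stripThinkingFromHistory()`.

```typescript
// Simplified, assumed message shapes for illustration.
type ContentBlock = { type: string; [key: string]: unknown };

interface Message {
  role: "user" | "assistant";
  content: string | ContentBlock[];
}

/**
 * Returns a copy of the history in which assistant messages no longer carry
 * thinking or redacted_thinking blocks. User messages and plain-string
 * content are returned as-is.
 */
function stripThinkingBlocks(messages: Message[]): Message[] {
  return messages.map((message) => {
    if (message.role !== "assistant" || typeof message.content === "string") {
      return message;
    }
    const content = message.content.filter(
      (block) => block.type !== "thinking" && block.type !== "redacted_thinking",
    );
    return { ...message, content };
  });
}
```

The input array is not mutated, so the stored history stays intact and only the outgoing request is stripped.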
Facu_Viñas
4ba7930240 test: add tests for native Anthropic web search hook logic
12 tests covering: tool injection for claude models, non-claude passthrough,
double-injection prevention, tool deactivation/reactivation on model switch,
and session_start diagnostics.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 13:50:02 -03:00