2631 commits

Author SHA1 Message Date
Jeremy
af158235eb fix(gsd): remove background color from backdrop, fix message truncation
Backdrop was painting empty lines with dark gray background (48;5;233),
making the entire screen go black. Now uses dim + gray foreground only.

Message truncation now measures actual prefix width with visibleWidth()
instead of hardcoded 20-char estimate, and uses truncateToWidth() for
proper Unicode handling.
2026-04-06 20:11:07 -05:00
Jeremy
41d5189c4c fix(gsd): restore consistent overlay height to prevent ghost artifacts
Differential renderer can't clear old overlay positions when height
changes between filter cycles. Pad to maxVisibleRows so the overlay
stays the same size regardless of filter state.
2026-04-06 19:31:32 -05:00
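The padding approach described in this commit can be sketched roughly as follows. This is only an illustration: `padToHeight` and its `width` parameter are hypothetical names, and only `maxVisibleRows` comes from the commit message.

```typescript
// Hypothetical helper: always return exactly maxVisibleRows rows so the
// overlay occupies the same screen region on every render.
function padToHeight(rows: string[], maxVisibleRows: number, width: number): string[] {
  const blank = " ".repeat(width);
  // Clip first in case the filtered list is taller than the overlay.
  const visible = rows.slice(0, maxVisibleRows);
  while (visible.length < maxVisibleRows) {
    visible.push(blank); // blank rows overwrite whatever the previous frame drew
  }
  return visible;
}
```

Because the row count never changes, a differential renderer has no stale overlay rows left to clear between filter cycles.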
Jeremy
2c91b8c6d8 test(tui): add test for 256-color backdrop codes 2026-04-06 18:57:02 -05:00
Jeremy
c35385fe53 fix(gsd): improve notification overlay backdrop and content-fit sizing
Use dark gray background + dim foreground for visible backdrop effect
instead of barely-perceptible SGR dim. Size overlay box to content
instead of padding to fill the entire viewport.
2026-04-06 18:53:26 -05:00
Jeremy
9d1e343e41 test(gsd): add overlay backdrop and notification lock safety tests
- Overlay layout: verify backdrop dims base lines, no dim without flag,
  overlay composites on top of dimmed background
- Notification store: verify markAllRead and clearNotifications do not
  delete a foreign process's lock file
2026-04-06 17:44:34 -05:00
Jeremy
d553455732 fix(gsd): only unlink notification lock when owned, prevent foreign lock deletion
_withLock() was unconditionally unlinking the lock file in finally,
even when lock acquisition failed. This could delete another process's
lock and allow unlocked concurrent writes. Now tracks ownership and
only cleans up locks we created.
2026-04-06 17:39:44 -05:00
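A minimal sketch of the ownership-tracking idea, assuming an exclusive-create lock file. The name `withLock`, its signature, and the choice to throw when the lock is already held are assumptions for illustration, not the project's actual `_withLock()`.

```typescript
import * as fs from "node:fs";

// Hypothetical sketch: only remove the lock file if this call created it.
function withLock<T>(lockPath: string, fn: () => T): T {
  let owned = false;
  try {
    // The "wx" flag fails with EEXIST if another holder already created the file.
    fs.closeSync(fs.openSync(lockPath, "wx"));
    owned = true;
    return fn();
  } finally {
    // Without the ownership flag, a failed acquisition would delete a foreign lock here.
    if (owned) fs.unlinkSync(lockPath);
  }
}
```

When acquisition fails, the error propagates and the existing lock file is left untouched.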
Jeremy
2c4ac844f1 fix(gsd): add backdrop dimming and viewport padding to notification overlay
The notification overlay was rendering too small with few entries, allowing
underlying content to bleed through. Added viewport padding to fill the
overlay box and a new `backdrop` option to OverlayOptions that dims the
background behind modal overlays.
2026-04-06 17:34:45 -05:00
Jeremy McSpadden
6dfc422990 Merge pull request #3587 from jeremymcs/feat/persistent-notification-panel
feat(gsd): persistent notification panel
2026-04-05 22:53:30 -05:00
Jeremy
8078755e4b feat(gsd): persistent notification panel with TUI overlay, widget, and web API
Notifications from ctx.ui.notify() and workflow-logger now persist to
.gsd/notifications.jsonl instead of evaporating as transient toasts.

- notification-store: JSONL persistence with 500-entry rotation, atomic
  temp+rename rewrites, ref-counted suppress API, disk-synced counters
- notify-interceptor: WeakSet-guarded monkey-patch on ctx.ui.notify
  installed at session_start and session_switch
- notification-widget: always-on belowEditor strip showing unread count
- notification-overlay: scrollable Ctrl+Alt+N panel with severity filter
- /gsd notifications command: clear, tail, filter subcommands
- workflow-logger: warnings now also persist to notification store
- web API: GET/DELETE /api/notifications with ?countOnly support
- 16 unit tests covering store, suppress, project isolation, resync
2026-04-05 22:13:28 -05:00
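The rotation and atomic rewrite described for notification-store might look roughly like the sketch below. The 500-entry limit and the temp+rename pattern come from the commit message; `rewriteJsonl`, `MAX_ENTRIES`, and the temp-file naming are hypothetical.

```typescript
import * as fs from "node:fs";

const MAX_ENTRIES = 500; // rotation limit named in the commit message

// Hypothetical sketch: keep the newest entries, write them to a temp file,
// then rename over the target so readers never see a half-written file.
function rewriteJsonl(file: string, entries: object[]): void {
  const kept = entries.slice(-MAX_ENTRIES);
  const tmp = `${file}.${process.pid}.tmp`;
  const body = kept.map((e) => JSON.stringify(e)).join("\n");
  fs.writeFileSync(tmp, kept.length > 0 ? body + "\n" : "");
  fs.renameSync(tmp, file); // replaces the target in a single step on the same filesystem
}
```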
Jeremy McSpadden
c9d358b8fe Merge pull request #3468 from OfficialDelta/feat/enhanced-verification
feat(gsd): add enhanced verification checks for auto-mode
2026-04-05 21:59:03 -05:00
Alan Alwakeel
7af983bda2 fix: address PR #3468 review findings
1. Post-execution retry bypass (auto-verification.ts)
   - When postExecBlockingFailure is true, skip retry and pause immediately
   - Post-exec failures are cross-task consistency issues that retrying won't fix
   - Added test in post-exec-retry-bypass.test.ts

2. File path normalization (pre-execution-checks.ts)
   - Added normalizeFilePath() to handle ./path vs path equivalence
   - Normalizes backslashes, removes duplicate slashes, strips leading ./
   - Applied to checkFilePathConsistency() and checkTaskOrdering()
   - Added tests for path normalization in pre-execution-checks.test.ts

3. Pre-exec fail-closed (auto-post-unit.ts)
   - Added try/catch around runPreExecutionChecks() inside runSafely block
   - If runPreExecutionChecks throws, set preExecPauseNeeded = true
   - Used logError from workflow-logger (not raw stderr)
   - Added test in pre-execution-fail-closed.test.ts
2026-04-05 22:44:15 -04:00
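The three normalization steps listed under item 2 can be sketched as below. The function name `normalizeFilePath` comes from the commit message, but this body is an assumed implementation rather than the code in pre-execution-checks.ts.

```typescript
// Assumed implementation of the steps named in the commit message.
function normalizeFilePath(p: string): string {
  return p
    .replace(/\\/g, "/") // backslashes become forward slashes
    .replace(/\/{2,}/g, "/") // collapse duplicate slashes
    .replace(/^(\.\/)+/, ""); // strip any leading "./"
}
```

With this, `./path` and `path` compare equal after normalization.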
Jeremy McSpadden
7b7cad0de9 Merge pull request #3586 from jeremymcs/fix/elapsed-timer-lost-on-resume
fix(gsd): persist elapsed timer across session resume
2026-04-05 21:21:37 -05:00
github-actions[bot]
f6a1549edd release: v2.64.0 2026-04-06 02:11:42 +00:00
Jeremy
b6794956f8 test(gsd): add regression tests for autoStartTime persistence (#3585)
Source-code guards verifying autoStartTime is saved in paused-session.json,
restored on both resume paths, and falls back to Date.now() when missing.
2026-04-05 21:07:58 -05:00
Jeremy
a0a20599a0 fix(gsd): persist autoStartTime across session resume so elapsed timer survives /exit
autoStartTime was never saved to paused-session.json, so cross-session
resume always started with autoStartTime=0 and the widget showed no
elapsed timer. Now saved on pause, restored on resume with Date.now()
fallback for old files.

Also fixes widget layout: elapsed/ETA stays on the header line above
the milestone/branch info line.
2026-04-05 21:04:05 -05:00
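The restore-with-fallback behavior can be illustrated with a small sketch. Only `autoStartTime` and the `Date.now()` fallback come from the commit message; the `PausedSession` shape, the `restoreAutoStartTime` name, and treating 0 as "missing" are assumptions.

```typescript
// Hypothetical shape of the relevant part of paused-session.json.
interface PausedSession {
  autoStartTime?: number;
}

// Use the saved start time when present; otherwise fall back to "now",
// as older files written before the fix do not contain the field.
function restoreAutoStartTime(saved: PausedSession, now: () => number = Date.now): number {
  const t = saved.autoStartTime;
  return typeof t === "number" && t > 0 ? t : now();
}
```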
Alan Alwakeel
9711ac3efa test(gsd): add pause wiring and integration tests for enhanced verification
- pre-execution-pause-wiring.test.ts: Tests blocking check → pause control flow
- enhanced-verification-integration.test.ts: End-to-end integration coverage

Verifies that blocking pre-execution failures trigger auto-mode pause and
that the enhanced verification pipeline integrates correctly with existing
verification infrastructure.
2026-04-05 20:25:27 -04:00
Alan Alwakeel
8f2c544a91 fix(gsd): add enhanced_verification preferences to mergePreferences
The enhanced_verification_* preferences were validated and typed but not
included in mergePreferences(), causing project-level overrides to be
silently ignored. This fix ensures project preferences properly merge
with user-level defaults.
2026-04-05 19:47:34 -04:00
Alan Alwakeel
6ee0f40083 feat(gsd): wire blocking behavior and strict mode for enhanced verification
Integrates pre/post-execution checks into auto-mode:
- auto-verification.ts: runEnhancedPreChecks/runEnhancedPostChecks integration
- auto-post-unit.ts: pause control flow when blocking checks fail
- Respects enhanced_verification_strict preference for blocking vs warning

Control flow: blocking failures trigger auto-mode pause for user review.
2026-04-05 19:47:34 -04:00
Alan Alwakeel
a3d08f7125 feat(gsd): add post-execution cross-task consistency checks
Adds 3 post-execution checks that run after task completion:
- Import resolution: verifies relative imports resolve to existing files
- Export verification: confirms exported symbols are defined
- Type consistency: validates function return types match declarations

All checks follow the permissive-by-default pattern (R012) - warnings don't block.
2026-04-05 19:46:31 -04:00
Alan Alwakeel
992b321b63 feat(gsd): add pre-execution plan verification checks
Adds 4 pre-execution checks that run before each task:
- File ops review: surfaces create/edit/delete intent for manual review
- Read-before-create guard: fails when plan reads a file before creating it
- Package existence: verifies npm packages exist before install attempts
- Interface contract: warns on mismatched function signatures

Includes preference types and validation for enhanced_verification settings.
2026-04-05 19:46:31 -04:00
Jeremy McSpadden
24e0856950 Merge pull request #3583 from jeremymcs/fix/welcome-screen-full-width
fix(ui): remove 200-column cap on welcome screen width
2026-04-05 18:43:02 -05:00
Jeremy
0d92f2fbba test(ui): add regression test for full-width separator lines
Verifies separator lines extend to the full terminal width when
the terminal is wider than 200 columns.
2026-04-05 17:57:23 -05:00
Jeremy
efb4e21205 fix(ui): remove 200-column cap on welcome screen width
The welcome screen lines stopped short on wide terminals because
termWidth was capped at 200 columns. Remove the cap so separator
lines extend to the full terminal width.
2026-04-05 17:41:21 -05:00
Jeremy McSpadden
d1b7f6f85c Merge pull request #3570 from Tibsfox/fix/headless-query-extension-drift
fix(headless): sync resources and use agent dir for query
2026-04-05 16:05:56 -05:00
Jeremy McSpadden
2298b9acab Merge pull request #3576 from jeremymcs/feat/llm-safety-harness
feat(gsd): LLM safety harness for auto-mode damage control
2026-04-05 16:03:16 -05:00
Jeremy McSpadden
46b18818a6 Merge pull request #3561 from Tibsfox/fix/ollama-fallback-provider-ready
fix(pi-coding-agent): make Ollama visible to fallback resolver
2026-04-05 15:59:20 -05:00
Jeremy McSpadden
d666fea3a9 Merge pull request #3569 from Tibsfox/fix/update-diagnostics
fix(cli): show latest version and bypass npm cache in update check
2026-04-05 15:57:41 -05:00
Jeremy McSpadden
e17b50b8a4 Merge pull request #3577 from jeremymcs/fix/hardcoded-agent-paths-3575
fix(gsd): replace hardcoded agent skill paths with dynamic resolution
2026-04-05 15:56:49 -05:00
Jeremy
8d11e5d507 test: add regression tests for adversarial review fixes (#3576)
- git-checkpoint: rollback on checked-out branch, detached HEAD, ref cleanup
- ollama streaming: terminal done:true chunk content preservation
- provider registration: preflush clears queue to prevent double registration
2026-04-05 15:52:26 -05:00
Jeremy
ac20eab501 fix: address adversarial review findings for #3576
- Use `git reset --hard <sha>` for rollback instead of `git branch -f`
  which fails on checked-out branches and worktrees
- Clear pendingProviderRegistrations after preflush to prevent duplicate
  registration when bindCore() runs
- Process Ollama stream content on terminal `done:true` chunks to avoid
  truncating trailing assistant text
2026-04-05 15:48:25 -05:00
Jeremy McSpadden
369303f82f Merge pull request #3560 from Tibsfox/fix/ollama-stream-simple-crash
fix(ollama): use apiKey auth mode to avoid streamSimple crash
2026-04-05 15:22:38 -05:00
Jeremy McSpadden
adeedef328 Merge pull request #3562 from jeremymcs/fix/harden-flat-rate-guard
fix(gsd): harden flat-rate routing guard against alias/resolution gaps
2026-04-05 15:16:01 -05:00
Jeremy McSpadden
d3a38bb771 Merge pull request #3552 from Tibsfox/fix/disable-routing-copilot
fix(gsd): disable dynamic model routing for flat-rate providers
2026-04-05 15:15:50 -05:00
Jeremy
6d19cd6baf fix(gsd): replace hardcoded agent skill paths with dynamic resolution (#3575)
The system prompt hardcoded ~/.gsd/agent/skills/ paths for bundled skills,
causing ENOENT loops when skills weren't installed at those locations. The
auto-mode loop treated ENOENT as transient and retried indefinitely.

- Replace hardcoded skill paths in system.md with {{bundledSkillsTable}} template
  variable, resolved dynamically via resolveSkillReference() at runtime
- Replace hardcoded templates dir path with {{templatesDir}} variable
- Add buildBundledSkillsTable() to system-context.ts — only includes skills
  that actually exist on disk
- Export getTemplatesDir() from prompt-loader.ts
- Add Rule 4 to detect-stuck.ts: same ENOENT path seen twice in the sliding
  window triggers immediate stuck detection (missing files don't self-heal)
- Add 4 tests for Rule 4 coverage

Closes #3575
2026-04-05 15:12:19 -05:00
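Rule 4 as described above (the same ENOENT path seen twice in the sliding window) could be sketched as follows. `hasRepeatedEnoent` is a hypothetical name, and the regular expression assumes Node-style messages such as `ENOENT: no such file or directory, open '/some/path'`; the real detect-stuck.ts may track errors differently.

```typescript
// Hypothetical sketch: report "stuck" as soon as one missing path repeats.
function hasRepeatedEnoent(windowErrors: string[]): boolean {
  const seen = new Set<string>();
  for (const msg of windowErrors) {
    // Capture the quoted path that follows the ENOENT code.
    const match = /ENOENT[^'"]*['"]([^'"]+)['"]/.exec(msg);
    const path = match?.[1];
    if (path === undefined) continue; // not an ENOENT message
    if (seen.has(path)) return true; // missing files do not self-heal
    seen.add(path);
  }
  return false;
}
```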
Jeremy
0d3ef6b545 feat(gsd): add LLM safety harness for auto-mode damage control
Unified safety layer that monitors, validates, and constrains LLM behavior
during auto-mode execution. All components use warn-and-continue policy by
default (log violations, notify user, keep going).

Components:
- Evidence collector: real-time bash/write/edit tool call tracking
- Destructive command guard: classifies 10 dangerous patterns (rm -rf, force push, etc.)
- File change validator: compares git diff against task plan's expected output
- Evidence cross-reference: detects tasks marked complete with zero bash calls
- Git checkpoint: pre-unit refs/gsd/checkpoints/ for optional rollback
- Content validator: minimum quality checks on plans and summaries
- Timeout scale cap: limits timeout multiplier to 6x (was unlimited)

New preference: safety_harness with per-component toggles.
Enabled by default, auto_rollback off by default.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 15:00:06 -05:00
Tibsfox
4b68e8c9d9 test(headless): add extension path alignment test 2026-04-05 11:57:40 -07:00
Tibsfox
469acf53af test(cli): add update diagnostics regression test 2026-04-05 11:57:25 -07:00
Tibsfox
d4b6eb714c test(pi-coding-agent): add custom provider registration test 2026-04-05 11:56:29 -07:00
Tibsfox
dbe6f9d292 test(ollama): add authMode regression test 2026-04-05 11:56:15 -07:00
Jeremy McSpadden
3e09184493 Merge pull request #3566 from jeremymcs/fix/complete-slice-string-coercion
fix(gsd): coerce string arrays to objects in complete-slice/task tools
2026-04-05 13:44:40 -05:00
Tibsfox
523fcd89a8 fix(headless): sync resources and use agent dir for query 2026-04-05 11:35:11 -07:00
Jeremy
0b7764349c chore(gsd): remove copyright line from test file 2026-04-05 13:33:13 -05:00
Tibsfox
3bcd55ccfd fix(cli): show latest version and bypass npm cache in update check 2026-04-05 11:33:03 -07:00
Jeremy
e210b7efdf fix(gsd): follow CONTRIBUTING standards for #3565
- Move new coercion tests to standalone file using node:test +
  node:assert/strict (per CONTRIBUTING testing standards)
- Remove tests from legacy complete-slice.test.ts to avoid mixing
  test frameworks in the same file
2026-04-05 13:32:56 -05:00
Jeremy
6046a31c6f fix(gsd): address Codex adversarial review findings for #3565
- verificationEvidence coercion now uses sentinel values (exitCode: -1,
  verdict: "unknown") instead of fabricating passing results
- String coercion for requirements fields now parses "ID — detail"
  delimiter format to preserve semantic payload
- Added regression tests for sentinel values and delimiter parsing

Closes #3565
2026-04-05 13:30:09 -05:00
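The delimiter parsing mentioned in this commit might look like the sketch below. The "ID, em dash, detail" format comes from the commit message; `parseRequirement` and the `{ id, detail }` field names are hypothetical.

```typescript
// Hypothetical sketch: split on the " \u2014 " (space, em dash, space) delimiter
// so the detail text is preserved instead of being dropped.
function parseRequirement(raw: string): { id: string; detail: string } {
  const delimiter = " \u2014 ";
  const idx = raw.indexOf(delimiter);
  if (idx === -1) return { id: raw.trim(), detail: "" };
  return {
    id: raw.slice(0, idx).trim(),
    detail: raw.slice(idx + delimiter.length).trim(),
  };
}
```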
Jeremy
0742cf3493 fix(gsd): coerce string arrays to objects in complete-slice/task tools (#3565)
LLMs sometimes pass plain strings instead of the expected object shape
for array fields like filesModified and requires, causing TypeBox
validation to reject the input before the execute function runs. This
adds Type.Union schemas to accept both formats and normalizes strings
to objects with sensible defaults in the execute functions.

Closes #3565
2026-04-05 13:23:30 -05:00
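The normalization step can be sketched in plain TypeScript, leaving out the TypeBox schema itself. `FileEntry`, its field names, and `coerceFileEntries` are hypothetical; the source only says that strings are normalized to objects with defaults.

```typescript
// Hypothetical object shape for an array field such as filesModified.
interface FileEntry {
  path: string;
  description: string;
}

// Accept either plain strings or full objects and return objects only.
function coerceFileEntries(input: Array<string | FileEntry>): FileEntry[] {
  return input.map((entry) =>
    typeof entry === "string" ? { path: entry, description: "" } : entry,
  );
}
```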
Jeremy
3a1e9e3416 fix(gsd): harden flat-rate routing guard against alias/resolution gaps
The flat-rate provider guard from #3552 can fail open in two scenarios:

1. Provider alias mismatch — isFlatRateProvider only matched the exact
   string "github-copilot", but "copilot" appears as a provider alias
   in the codebase. Case variations could also bypass the check.
   Fix: add "copilot" alias and lowercase input before set membership.

2. Unresolved primary model — when resolveModelId returns undefined
   (stale model ID, registry mismatch), the guard was skipped entirely,
   allowing dynamic routing to downgrade models on a flat-rate backend.
   Fix: fall back to autoModeStartModel.provider and ctx.model.provider
   when primary resolution fails, disabling routing if either indicates
   a flat-rate provider.

Ref: #3453
2026-04-05 13:09:44 -05:00
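Both fixes can be illustrated with a short sketch. `isFlatRateProvider` and the two provider strings come from the commit message; `shouldDisableRouting`, its parameters, and the set name are assumptions about how the fallback might be wired.

```typescript
const FLAT_RATE_PROVIDERS = new Set(["github-copilot", "copilot"]);

// Fix 1: lowercase the input and include the alias before the membership check.
function isFlatRateProvider(provider: string | undefined): boolean {
  return provider !== undefined && FLAT_RATE_PROVIDERS.has(provider.toLowerCase());
}

// Fix 2 (hypothetical wiring): when the primary provider is unresolved,
// consult the fallback providers instead of skipping the guard.
function shouldDisableRouting(
  primaryProvider?: string,
  startModelProvider?: string,
  ctxModelProvider?: string,
): boolean {
  if (primaryProvider !== undefined) return isFlatRateProvider(primaryProvider);
  return isFlatRateProvider(startModelProvider) || isFlatRateProvider(ctxModelProvider);
}
```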
Tibsfox
935cc9a464 fix(pi-coding-agent): register models.json providers and await Ollama probe in headless mode 2026-04-05 11:09:08 -07:00
Tibsfox
352dd17e76 fix(ollama): use apiKey auth mode to avoid streamSimple crash 2026-04-05 11:06:38 -07:00
Tibsfox
9ab675a843 fix(gsd): disable dynamic model routing for flat-rate providers 2026-04-05 10:24:52 -07:00