Covers: environment promotion flow, testing dev/next/latest builds, Docker images, gating tests (including AutoSession encapsulation), rollback procedures, fixture recording, and GitHub configuration. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
5.7 KiB
CI/CD Pipeline Guide
Overview
GSD 2 uses a three-stage promotion pipeline that automatically moves merged PRs through Dev → Test → Prod environments using npm dist-tags.
PR merged to main
│
▼
┌─────────┐ ci.yml passes (build, test, typecheck)
│ DEV │ → publishes gsd-pi@<version>-dev.<sha> with @dev tag
└────┬────┘
▼ (automatic if green)
┌─────────┐ CLI smoke tests + LLM fixture replay
│ TEST │ → promotes to @next tag
└────┬────┘ → pushes Docker image as :next
▼ (manual approval required)
┌─────────┐ optional real-LLM integration tests
│ PROD │ → promotes to @latest tag
└─────────┘ → creates GitHub Release
For Contributors: Testing Your PR Before It Ships
Install the Dev Build
Every merged PR is immediately installable:
# Latest dev build (bleeding edge, every merged PR)
npx gsd-pi@dev
# Test candidate (passed smoke + fixture tests)
npx gsd-pi@next
# Stable production release
npx gsd-pi@latest # or just: npx gsd-pi
Using Docker
# Test candidate
docker run --rm -v $(pwd):/workspace ghcr.io/gsd-build/gsd-pi:next --version
# Stable
docker run --rm -v $(pwd):/workspace ghcr.io/gsd-build/gsd-pi:latest --version
Checking if a Fix Landed
- Find the PR's merge commit SHA (first 7 chars)
- Check if it's in
@dev:npm view gsd-pi@dev version- If the version ends in
-dev.<your-sha>, your PR is in dev
- If the version ends in
- Check if it promoted to
@next:npm view gsd-pi@next version - Check if it's in production:
npm view gsd-pi@latest version
For Maintainers
Pipeline Workflows
| Workflow | File | Trigger | Purpose |
|---|---|---|---|
| CI | ci.yml |
PR + push to main | Build, test, typecheck — gate for all promotions |
| Release Pipeline | pipeline.yml |
After CI succeeds on main | Three-stage promotion |
| Native Binaries | build-native.yml |
v* tags |
Cross-compile platform binaries |
| Dev Cleanup | cleanup-dev-versions.yml |
Weekly (Monday 06:00 UTC) | Unpublish -dev. versions older than 30 days |
Gating Tests
The pipeline only triggers after ci.yml passes. Key gating tests include:
- Unit tests (
npm run test:unit) — includesauto-session-encapsulation.test.tswhich enforces that all auto-mode state is encapsulated inAutoSession. Any PR adding module-level mutable state toauto.tswill fail CI and block the pipeline. - Integration tests (
npm run test:integration) - Extension typecheck (
npm run typecheck:extensions) - Package validation (
npm run validate-pack)
Approving a Prod Release
- A version reaches the Test stage automatically
- In GitHub Actions, the
prod-releasejob will show "Waiting for review" - Click Review deployments → select
prod→ Approve - The version is promoted to
@latestand a GitHub Release is created
To enable live LLM tests during Prod promotion:
- Set the
RUN_LIVE_TESTSenvironment variable totrueon theprodenvironment
Rolling Back a Release
If a broken version reaches production:
# Roll back npm
npm dist-tag add gsd-pi@<previous-good-version> latest
# Roll back Docker
docker pull ghcr.io/gsd-build/gsd-pi:<previous-good-version>
docker tag ghcr.io/gsd-build/gsd-pi:<previous-good-version> ghcr.io/gsd-build/gsd-pi:latest
docker push ghcr.io/gsd-build/gsd-pi:latest
For @dev or @next rollbacks, the next successful merge will overwrite the tag automatically.
GitHub Configuration Required
| Setting | Value |
|---|---|
Environment: dev |
No protection rules |
Environment: test |
No protection rules |
Environment: prod |
Required reviewers: maintainers |
Secret: NPM_TOKEN |
All environments |
Secret: ANTHROPIC_API_KEY |
Prod environment only |
Secret: OPENAI_API_KEY |
Prod environment only |
Variable: RUN_LIVE_TESTS |
false (set to true to enable live LLM tests) |
| GHCR | Enabled for the gsd-build org |
Docker Images
| Image | Base | Purpose | Tags |
|---|---|---|---|
ghcr.io/gsd-build/gsd-ci-builder |
node:22-bookworm |
CI build environment with Rust toolchain | :latest, :<date> |
ghcr.io/gsd-build/gsd-pi |
node:22-slim |
User-facing runtime | :latest, :next, :v<version> |
The CI builder image is rebuilt automatically when the Dockerfile changes. It eliminates ~3-5 min of toolchain setup per CI run.
LLM Fixture Tests
The fixture system records and replays LLM conversations without hitting real APIs (zero cost).
Running Fixture Tests
npm run test:fixtures
Recording New Fixtures
# Set your API key, then record
GSD_FIXTURE_MODE=record GSD_FIXTURE_DIR=./tests/fixtures/recordings \
node --experimental-strip-types tests/fixtures/record.ts
Fixtures are JSON files in tests/fixtures/recordings/. Each one captures a conversation's request/response pairs and replays them by turn index.
When to Re-Record
Re-record fixtures when:
- Provider wire format changes (e.g., new field in Anthropic response)
- Tool definitions change (affects request shape)
- System prompt changes (may cause turn count mismatch)
Version Strategy
| Tag | Published | Format | Who uses it |
|---|---|---|---|
@dev |
Every merged PR | 2.27.0-dev.a3f2c1b |
Developers verifying fixes |
@next |
Auto-promoted from dev | Same version | Early adopters, beta testers |
@latest |
Manually approved | Same version | Production users |
Old -dev. versions are cleaned up weekly (30-day retention).