feat: add GitHub Workflows skill with CI workflow and ci_monitor tool (#294)

* feat: add GitHub Workflows skill with CI workflow and ci_monitor tool

CI workflow (.github/workflows/ci.yml):
- Runs on push to main and feature branches
- Runs on pull requests to main
- Build + test pipeline using Node 22

Cross-platform CI monitoring tool for debugging GitHub Actions:
- `runs` - List recent workflow runs
- `watch` - Monitor running workflow
- `fail-fast` - Exit 1 on first failure (for scripts)
- `log-failed` - Show failed job logs
- `test-summary` - Extract test pass/fail counts
- `check-actions` - GraphQL query for action versions
- `grep` - Search logs with context
- `wait-for` - Block until deployment keyword appears

Pure Node.js - no shell interpolation, works on macOS/Windows/Linux.

Drift-immune skill that:
- Routes all CI operations through ci_monitor.cjs
- Fetches live docs from docs.github.com (no stale training data)
- Provides validation constraints (BEFORE/AFTER/EVIDENCE)

Test infrastructure changes:
- Split tests into test:unit (141 tests, ~12s) and test:integration (5 tests)
- Fixed idle-recovery.test.ts for current implementation
- Removed AGENTS.md dead code from resource-loader.ts
- Moved npm run build out of tests (fixes ENOBUFS)

When CI fails, you need observable diagnostics:
- `gh run` output is not script-friendly
- ci_monitor.cjs provides structured output for automation
- The skill ensures AI uses the tool, not stale training data

* fix: resolve imports and path for current upstream version

- Updated imports from @mariozechner/pi-coding-agent to @gsd/pi-coding-agent
- Fixed integration test path calculation to use process.cwd()
- Kept test:unit and test:integration scripts

* fix: replace search provider preference instead of accumulating

AuthStorage.set() for api_key credentials appends to the existing list
rather than replacing. When setSearchProviderPreference was called twice
with different values, the second call appended the new value, leaving
the first value at index 0, which get() returned.

Fix: call auth.remove() before auth.set() to ensure only the latest
preference is stored.
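The behaviour can be modelled in a few lines (an illustrative stand-in, not the real AuthStorage API):

```javascript
// Toy store with the same append-on-set, read-index-0 semantics as the bug.
class ListStore {
  constructor() { this.items = new Map(); }
  set(key, value) {
    const list = this.items.get(key) || [];
    list.push(value);                       // append, as AuthStorage.set() did
    this.items.set(key, list);
  }
  get(key) { return (this.items.get(key) || [])[0]; }  // index 0 wins
  remove(key) { this.items.delete(key); }
}

// The fix: clear the key first so only the latest preference survives.
function setPreference(store, pref) {
  store.remove('search-provider');
  store.set('search-provider', pref);
}
```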

https://claude.ai/code/session_01Qx7HRSDb117KzDZzdKk1KB

* fix: address all 10 open PR review comments

- package.json: run build before test:integration so a fresh checkout works
- pack-install.test.ts: replace execSync+shell redirects with execFileSync
  argument arrays (portable, no shell parsing, paths with spaces safe)
- ci_monitor.test.ts: remove unconditional passed++ after assert; move
  success message after the failed > 0 check so it only prints on success
- setup_gh.cjs: replace unzip/tar shell-outs with platform-specific
  execFileSync calls (unzip on macOS, PowerShell Expand-Archive on Windows);
  add compareVersions() for correct element-by-element semver comparison
- ci_monitor.cjs: add --repo/-R global option so repo is overrideable;
  fix getLogs() to use gh run view --log --job instead of binary REST endpoint
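The element-by-element comparison described above might look like this sketch (the real compareVersions() may differ in detail):

```javascript
// Compare semver-ish strings part by part; a naive string compare would
// get "2.10.0" < "2.9.0" wrong because "1" < "9" lexicographically.
function compareVersions(a, b) {
  const pa = a.replace(/^v/, '').split('.').map(Number);
  const pb = b.replace(/^v/, '').split('.').map(Number);
  for (let i = 0; i < Math.max(pa.length, pb.length); i++) {
    const x = pa[i] || 0, y = pb[i] || 0;   // missing parts compare as 0
    if (x !== y) return x < y ? -1 : 1;
  }
  return 0;
}
```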

https://claude.ai/code/session_01AT6CgcAB62kWcDsTJg9HZM

* fix: make all changed files fully cross-platform (Windows/macOS/Linux)

- pack-install.test.ts: use tar npm package instead of tar CLI; resolve
  gsd binary as gsd.cmd on Windows; skip shebang check on Windows
- setup_gh.cjs: use execFileSync for all binary invocations; replace
  which with where on Windows; add Windows PATH guidance; filter preferred
  install dirs by platform; unify ZIP extraction to use process.platform
  consistently; escape single quotes in PowerShell Expand-Archive args
- ci_monitor.cjs: use path.join for .github/workflows paths; replace
  all split('\n') with split(/\r?\n/) to handle Windows CRLF output
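The CRLF-safe split used throughout, shown for reference:

```javascript
// Handles both Unix (\n) and Windows (\r\n) line endings in gh output.
const splitLines = s => s.split(/\r?\n/);
```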

https://claude.ai/code/session_01AT6CgcAB62kWcDsTJg9HZM

* refactor: simplify and deduplicate changed files

- ci_monitor.cjs: memoize getRepo() so gh repo view subprocess runs at
  most once per invocation instead of once per command call in watch loops
- pack-install.test.ts: extract packTarball() helper to eliminate
  duplicate npm pack logic across two tests; remove unused contents variable
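The memoization is the classic run-at-most-once wrapper; a generic sketch (`once` is an illustrative name, the commit uses a module-level `_repo` cache):

```javascript
// Wrap an expensive call (here, a stand-in for `gh repo view`) so it
// executes at most once; later calls return the cached value.
function once(fn) {
  let value, done = false;
  return () => {
    if (!done) { value = fn(); done = true; }
    return value;
  };
}
```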

https://claude.ai/code/session_01AT6CgcAB62kWcDsTJg9HZM

* refactor: remove redundant existsSync before canWrite() in findInstallDir

canWrite() already returns false for non-existent directories, so the
pre-check was a TOCTOU-style redundancy with no behavioral value.

https://claude.ai/code/session_01AT6CgcAB62kWcDsTJg9HZM

* fix: replace tar npm package with Node built-ins (zlib + manual tar parsing)

tar is not in the dependency tree. listTarEntries() decompresses via
createGunzip() and parses the 512-byte tar block format directly,
reading name/prefix/type/size fields per POSIX ustar spec. No external
dependency required. Also fixes the broken tarball variable reference
left over from the packTarball() refactor.

https://claude.ai/code/session_01AT6CgcAB62kWcDsTJg9HZM

* remove: drop setup_gh scripts in favour of ci_monitor

setup_gh.cjs and setup_gh.py were one-shot gh CLI installers.
ci_monitor.cjs covers the day-to-day CI use case and is the tool
the skill routes through. Environments that need gh installed can
use brew/winget/distro packages directly.

https://claude.ai/code/session_01AT6CgcAB62kWcDsTJg9HZM

* fix: run only unit tests in CI — integration tests cause ENOBUFS

The integration tests (npm pack → npm install → spawn node) exceed
the buffer limits of the CI runner environment. They are documented
as requiring a manual build+run step. CI now runs test:unit only.

https://claude.ai/code/session_01AT6CgcAB62kWcDsTJg9HZM

* fix: run all tests in CI without ENOBUFS

- ci.yml: run unit and integration as separate steps; build is already
  its own step so test:integration doesn't need to rebuild
- package.json: remove npm run build from test:integration script
- pack-install.test.ts: npm install uses stdio:'ignore' to avoid
  piping large output through Node buffers (root cause of ENOBUFS);
  add early dist/ check with clear error message instead of rebuilding

https://claude.ai/code/session_01AT6CgcAB62kWcDsTJg9HZM

* fix: resolve ENOBUFS and clean up setup_gh references

- pack-install.test.ts: derive tarball filename from package.json
  instead of piping npm pack --json stdout; use stdio:ignore throughout
  to avoid exhausting OS pipe buffers on CI runners
- SKILL.md: remove setup_gh install instructions; assume gh is
  pre-installed via system package manager; point to ci_monitor.cjs
- github_project_setup.py: remove setup_gh.py reference from error message

https://claude.ai/code/session_01AT6CgcAB62kWcDsTJg9HZM

* fix: address Copilot review comments on pack-install.test.ts

- listTarEntries: collect chunks in array, Buffer.concat once on end
  instead of O(n²) repeated concat in data handler
- listTarEntries: attach error handler to createReadStream so read
  errors reject the Promise instead of crashing the process
- npm pack: use stdio:['ignore','ignore','pipe'] to preserve stderr
  for diagnostics while still avoiding ENOBUFS on stdout
- npm install: same — pipe stderr so failures include error output
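The chunk-collection pattern from the first two bullets, as a standalone sketch:

```javascript
// Push Buffers into an array and concat once on 'end' (O(n) total bytes
// copied); repeated Buffer.concat inside the 'data' handler is O(n²).
// The 'error' handler makes read failures reject instead of crashing.
function collect(stream) {
  return new Promise((resolve, reject) => {
    const chunks = [];
    stream
      .on('error', reject)
      .on('data', c => chunks.push(c))
      .on('end', () => resolve(Buffer.concat(chunks)));
  });
}
```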

https://claude.ai/code/session_01AT6CgcAB62kWcDsTJg9HZM

---------

Co-authored-by: Claude <noreply@anthropic.com>
Jamie Nelson, 2026-03-14 00:31:17 -04:00, committed by GitHub
parent 105fc0103a
commit f791731d4f
20 changed files with 3318 additions and 235 deletions

.github/workflows/ci.yml (new file, 35 lines)

@@ -0,0 +1,35 @@
name: CI
on:
push:
branches: [main, feat/**]
pull_request:
branches: [main]
jobs:
build:
runs-on: ubuntu-latest
steps:
- name: Checkout repository
uses: actions/checkout@v6
with:
fetch-depth: 0
- name: Setup Node.js
uses: actions/setup-node@v6
with:
node-version: '22'
cache: 'npm'
- name: Install dependencies
run: npm ci
- name: Build
run: npm run build
- name: Run unit tests
run: npm run test:unit
- name: Run integration tests
run: npm run test:integration

package-lock.json (generated)

@@ -1,12 +1,12 @@
{
"name": "gsd-pi",
-"version": "2.10.5",
+"version": "2.10.6",
"lockfileVersion": 3,
"requires": true,
"packages": {
"": {
"name": "gsd-pi",
-"version": "2.10.5",
+"version": "2.10.6",
"bundleDependencies": [
"@gsd/native",
"@gsd/pi-agent-core",

package.json

@@ -44,7 +44,9 @@
"build:pi": "npm run build:native-pkg && npm run build:pi-tui && npm run build:pi-ai && npm run build:pi-agent-core && npm run build:pi-coding-agent",
"build": "npm run build:pi && tsc && npm run copy-themes",
"copy-themes": "node -e \"const{mkdirSync,cpSync}=require('fs');const{resolve}=require('path');const src=resolve(__dirname,'packages/pi-coding-agent/dist/modes/interactive/theme');mkdirSync('pkg/dist/modes/interactive/theme',{recursive:true});cpSync(src,'pkg/dist/modes/interactive/theme',{recursive:true})\"",
-"test": "node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/*.test.ts src/resources/extensions/gsd/tests/*.test.mjs src/tests/*.test.ts",
+"test:unit": "node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/*.test.ts src/resources/extensions/gsd/tests/*.test.mjs src/tests/*.test.ts",
+"test:integration": "node --import ./src/resources/extensions/gsd/tests/resolve-ts.mjs --experimental-strip-types --test src/resources/extensions/gsd/tests/*integration*.test.ts src/tests/integration/*.test.ts",
+"test": "npm run test:unit && npm run test:integration",
"test:browser-tools": "node --test src/resources/extensions/browser-tools/tests/browser-tools-unit.test.cjs src/resources/extensions/browser-tools/tests/browser-tools-integration.test.mjs",
"test:native": "node --test packages/native/src/__tests__/grep.test.mjs",
"build:native": "node native/scripts/build.js",

scripts/ci_monitor.cjs (new file, 293 lines)

@@ -0,0 +1,293 @@
#!/usr/bin/env node
/**
* GitHub Actions CI/CD Workflow Monitor - Pure Node.js implementation
*/
const { spawnSync } = require('child_process');
const fs = require('fs');
const path = require('path');
const EMOJI = { success: '✅', failure: '❌', cancelled: '🚫', skipped: '⏭️', timed_out: '⏱️', in_progress: '▶️', queued: '⏳' };
const INTERVAL = 10, TIMEOUT = 3600, MAXBUF = 50 * 1024 * 1024;
// Pure Node.js gh CLI helpers - no shell strings
const gh = (args, opts = {}) => {
const r = spawnSync('gh', args, { encoding: 'utf-8', maxBuffer: opts.maxBuffer || MAXBUF, cwd: opts.cwd });
if (r.error) throw r.error;
if (r.status !== 0 && !opts.allowFail) throw new Error(r.stderr || `gh exited ${r.status}`);
return r.stdout;
};
const ghJson = (args, opts) => JSON.parse(gh(args, opts));
const cliRepo = (() => {
const a = process.argv;
const i = a.findIndex(x => x === '--repo' || x === '-R');
return i >= 0 && a[i + 1] ? a[i + 1] : null;
})();
let _repo = null;
const getRepo = () => _repo || (_repo = cliRepo || process.env.GITHUB_REPOSITORY || ghJson(['repo', 'view', '--json', 'nameWithOwner']).nameWithOwner);
const runView = (id, f = 'status,conclusion,jobs') => ghJson(['run', 'view', String(id), '--repo', getRepo(), '--json', f]);
const runList = (opts = {}) => {
const args = ['run', 'list', '--repo', getRepo(), '--limit', String(opts.limit || 10),
'--json', 'databaseId,status,conclusion,headBranch,createdAt,displayTitle,event'];
if (opts.branch) args.push('--branch', opts.branch);
return ghJson(args);
};
const getLogs = (runId, jobId) => gh(['run', 'view', String(runId), '--repo', getRepo(), '--log', '--job', String(jobId)], { maxBuffer: MAXBUF });
const findJob = (runId, name) => {
const job = runView(runId, 'jobs').jobs?.find(j => j.name === name);
if (!job) { console.error(`❌ Job "${name}" not found`); process.exit(1); }
return job;
};
const emoji = (s, c) => EMOJI[c || s] || '❓';
const sleep = ms => new Promise(r => setTimeout(r, ms));
// Commands
const cmd = {
runs: (opts = {}) => {
const list = runList({ ...opts, limit: parseInt(opts.limit) || 15 });
console.log(`\n📋 Recent runs${opts.branch ? ` for "${opts.branch}"` : ''}:\n`);
for (const r of list) {
console.log(`${emoji(r.status, r.conclusion)} ${String(r.databaseId).padEnd(12)} ${new Date(r.createdAt).toLocaleDateString()} [${(r.headBranch || '').padEnd(20)}] (${r.event || ''})`);
if (r.displayTitle) console.log(` ${r.displayTitle.substring(0, 60)}`);
}
return list;
},
watch: async (id, opts = {}) => {
const int = parseInt(opts.interval) || INTERVAL;
console.log(`👁️ Watching run ${id}...\n`);
const last = new Map();
while (true) {
const run = runView(id);
const rs = `${run.status}:${run.conclusion}`;
if (last.get('run') !== rs) { console.log(`${emoji(run.status, run.conclusion)} Run: ${run.status}${run.conclusion ? ' → ' + run.conclusion : ''}`); last.set('run', rs); }
for (const j of run.jobs || []) {
const js = `${j.status}:${j.conclusion}`;
if (last.get(`job:${j.id}`) !== js) { console.log(` ${emoji(j.status, j.conclusion)} ${j.name}: ${j.status}${j.conclusion ? ' → ' + j.conclusion : ''}`); last.set(`job:${j.id}`, js); }
}
if (run.status === 'completed') { console.log(`\n${emoji(run.status, run.conclusion)} Completed: ${run.conclusion}`); process.exit(run.conclusion === 'success' ? 0 : 1); }
await sleep(int * 1000);
}
},
'fail-fast': async (id, opts = {}) => {
const int = parseInt(opts.interval) || INTERVAL;
console.log(`🔍 Watching run ${id} (fail-fast)...\n`);
const seen = new Set();
while (true) {
const run = runView(id);
for (const j of run.jobs || []) {
if (!seen.has(j.id)) { console.log(`${emoji(j.status, j.conclusion)} ${j.name}: ${j.conclusion || j.status}`); seen.add(j.id); }
if (j.conclusion === 'failure') { console.log(`\n❌ Job "${j.name}" failed!\n📋 Run: ci_monitor.cjs log-failed ${id}`); process.exit(1); }
}
if (run.status === 'completed') { console.log(`\n${emoji(run.status, run.conclusion)} Run completed: ${run.conclusion}`); process.exit(run.conclusion === 'success' ? 0 : 1); }
await sleep(int * 1000);
}
},
'list-jobs': (id, opts = {}) => {
let jobs = runView(id).jobs || [];
if (opts.status) jobs = jobs.filter(j => j.conclusion === opts.status || j.status === opts.status);
console.log(`\n📋 Jobs in run ${id}:\n`);
for (const j of jobs) console.log(`${emoji(j.status, j.conclusion)} ${(j.conclusion || j.status || '?').padEnd(12)} ${j.name}`);
},
'log-failed': (id, opts = {}) => {
const run = runView(id, 'jobs');
if (!(run.jobs || []).some(j => j.conclusion === 'failure')) { console.log('✅ No failed jobs found.'); return; }
console.log(`\n❌ Failed jobs in run ${id}:\n`);
try { console.log(gh(['run', 'view', String(id), '--repo', getRepo(), '--log-failed'], { maxBuffer: MAXBUF }).split(/\r?\n/).slice(-(parseInt(opts.lines) || 200)).join('\n')); }
catch (e) { console.error(`Could not fetch logs: ${e.message}`); }
},
log: (id, opts = {}) => {
console.log(`\n📋 Full logs for run ${id}:\n`);
try {
let lines = gh(['run', 'view', String(id), '--repo', getRepo(), '--log'], { maxBuffer: MAXBUF }).split(/\r?\n/);
if (opts.filter) { const re = new RegExp(opts.filter, 'i'); lines = lines.filter(l => re.test(l)); console.log(`🔍 Filtered (${lines.length} lines):\n`); } // no 'g' flag: a global regex keeps lastIndex state across test() calls and skips matches
console.log(lines.slice(-(parseInt(opts.lines) || 500)).join('\n'));
} catch (e) { console.error(`Could not fetch logs: ${e.message}`); }
},
grep: (id, opts = {}) => {
if (!opts.pattern) { console.error('❌ --pattern required'); process.exit(1); }
console.log(`\n🔍 Searching for "${opts.pattern}" in run ${id}:\n`);
try {
const lines = gh(['run', 'view', String(id), '--repo', getRepo(), '--log'], { maxBuffer: MAXBUF }).split(/\r?\n/);
const re = new RegExp(opts.pattern, 'i'); // no 'g' flag: test() with /g/ carries lastIndex between calls and misses alternating lines
const matches = lines.map((l, i) => re.test(l) ? { i, l } : null).filter(Boolean);
if (!matches.length) { console.log('No matches found.'); return; }
console.log(`Found ${matches.length} matches:\n`);
const ctx = parseInt(opts.context) || 3;
for (const m of matches.slice(0, 20)) {
console.log(`--- Line ${m.i} ---`);
for (let j = Math.max(0, m.i - ctx); j < Math.min(lines.length, m.i + ctx + 1); j++)
console.log(`${j === m.i ? '>>>' : ' '} ${lines[j]}`);
}
if (matches.length > 20) console.log(`\n... and ${matches.length - 20} more`);
} catch (e) { console.error(`Could not fetch logs: ${e.message}`); }
},
'test-summary': (id, opts = {}) => {
console.log(`\n📊 Test summary for run ${id}:\n`);
try {
const logs = gh(['run', 'view', String(id), '--repo', getRepo(), '--log'], { maxBuffer: MAXBUF });
const t = logs.match(/# tests[\s:]+(\d+)/i), p = logs.match(/# pass[\s:]+(\d+)/i), f = logs.match(/# fail[\s:]+(\d+)/i);
const notOk = logs.match(/^not ok .+$/gm);
if (t) console.log(` Total tests: ${t[1]}`);
if (p) console.log(` ✅ Passed: ${p[1]}`);
if (f) console.log(` ❌ Failed: ${f[1]}`);
if (notOk?.length) { console.log(`\nFailed tests:`); notOk.slice(0, 15).forEach(x => console.log(` ${x}`)); if (notOk.length > 15) console.log(` ... and ${notOk.length - 15} more`); }
} catch (e) { console.error(`Could not fetch logs: ${e.message}`); }
},
tail: (id, job, opts = {}) => console.log(getLogs(id, findJob(id, job).id).split(/\r?\n/).slice(-(parseInt(opts.lines) || 100)).join('\n')),
'wait-for': async (id, jobName, opts = {}) => {
if (!opts.keyword) { console.error('❌ --keyword required'); process.exit(1); }
const to = (parseInt(opts.timeout) || TIMEOUT) * 1000, int = (parseInt(opts.interval) || 5) * 1000;
console.log(`🔍 Waiting for "${opts.keyword}" in "${jobName}"...\n`);
const start = Date.now();
let job = null;
while (!job && Date.now() - start < to) { job = runView(id).jobs?.find(j => j.name === jobName); if (!job) { console.log(`⏳ Waiting...`); await sleep(int); } }
if (!job) { console.error('❌ Timeout waiting for job'); process.exit(1); }
console.log(`▶️ Job started (ID: ${job.id})`);
while (Date.now() - start < to) {
try {
const logs = getLogs(id, job.id);
if (logs.includes(opts.keyword)) {
console.log(`\n✅ Found "${opts.keyword}"!`);
const lines = logs.split(/\r?\n/), idx = lines.findIndex(l => l.includes(opts.keyword));
if (idx >= 0) console.log('\n' + lines.slice(Math.max(0, idx - 2), idx + 3).join('\n'));
process.exit(0);
}
console.log(`📝 Log: ${logs.length} chars (${Math.floor((Date.now() - start) / 1000)}s)`);
} catch (e) { /* ignore */ }
await sleep(int);
}
console.error(`❌ Timeout waiting for "${opts.keyword}"`); process.exit(1);
},
analyze: (id, jobName) => {
const logs = getLogs(id, findJob(id, jobName).id);
const patterns = [
['Errors', /error[:]\s*(.+)/gi], ['NPM Errors', /npm ERR!\s*(.+)/gi], ['TypeScript', /error TS\d+:\s*(.+)/gi],
['Timeout', /timeout|timed?\s*out/gi], ['OOM', /out of memory|OOM|heap.*exceeded/gi],
['Network', /ECONNREFUSED|ETIMEDOUT|ENOTFOUND/gi], ['Bad Option', /bad option[:]\s*(.+)/gi],
];
console.log(`🔍 Analyzing "${jobName}"...\n`);
for (const [name, re] of patterns) {
const m = [...logs.matchAll(re)].slice(0, 5);
if (m.length) { console.log(`${name}:`); m.forEach(x => console.log(`${(x[1] || x[0]).trim().substring(0, 80)}`)); }
}
},
compare: (id1, id2) => {
const j1 = new Map((runView(id1, 'jobs').jobs || []).map(j => [j.name, j]));
const j2 = new Map((runView(id2, 'jobs').jobs || []).map(j => [j.name, j]));
console.log(`\n🔍 Comparing ${id1} vs ${id2}:\n`);
for (const name of new Set([...j1.keys(), ...j2.keys()])) {
const a = j1.get(name)?.conclusion || 'missing', b = j2.get(name)?.conclusion || 'missing';
console.log(`${emoji(0, a)} ${emoji(0, b)} ${name.padEnd(25)} ${a.padEnd(10)}${b}${a !== b ? ' ⚠️' : ''}`);
}
},
'branch-runs': (branch, opts = {}) => {
const list = runList({ branch, limit: parseInt(opts.limit) || 10 });
console.log(`\n📋 Runs for "${branch}":\n`);
for (const r of list) console.log(`${emoji(r.status, r.conclusion)} ${String(r.databaseId).padEnd(10)} ${new Date(r.createdAt).toLocaleDateString()} ${r.displayTitle?.substring(0, 40) || ''}`);
},
'list-workflows': (opts = {}) => {
const dir = path.join('.github', 'workflows');
if (!fs.existsSync(dir)) { console.error('❌ No .github/workflows directory'); process.exit(1); }
const files = fs.readdirSync(dir).filter(f => f.endsWith('.yml') || f.endsWith('.yaml')).sort();
if (!files.length) { console.log('No workflow files found.'); return []; }
console.log('\n📋 Workflow files:\n');
for (const f of files) {
const c = fs.readFileSync(path.join(dir, f), 'utf-8');
const nm = c.match(/^name:\s*['"]?(.+?)['"]?\s*$/m)?.[1] || '(unnamed)';
const tr = ['push', 'pull_request', 'schedule', 'workflow_dispatch', 'release'].filter(x => c.includes(`${x}:`));
console.log(`📄 ${f.padEnd(30)} ${nm.padEnd(30)} ${tr.length ? `[${tr.join(', ')}]` : ''}`);
}
return files;
},
'check-actions': (wf, opts = {}) => {
const fp = wf || path.join('.github', 'workflows', 'ci.yml');
if (!fs.existsSync(fp)) { console.error(`❌ File not found: ${fp}`); process.exit(1); }
const c = fs.readFileSync(fp, 'utf-8');
// Find all uses: statements
const actions = new Set();
const lines = c.split(/\r?\n/);
for (const line of lines) {
const m = line.match(/uses:\s*['"]?([^'"\s]+)['"]?/);
if (m && !m[1].startsWith('./') && !m[1].startsWith('docker://')) {
actions.add(m[1].split('@')[0]);
}
}
if (!actions.size) { console.log('No external actions found.'); return; }
console.log(`\n🔍 Checking ${actions.size} actions in ${fp}:\n`);
for (const a of actions) {
const [owner, repo] = a.split('/');
if (!owner || !repo) continue;
try {
const res = ghJson(['api', 'graphql', '-f', `query=query { repository(owner: "${owner}", name: "${repo}") { latestRelease { tagName } } }`]);
const latest = res?.data?.repository?.latestRelease?.tagName;
const curMatch = c.match(new RegExp(`${a.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')}@([\\w.-]+)`));
const cur = curMatch?.[1] || 'unknown';
if (latest) {
const ok = cur === latest || cur === latest.replace(/^v/, '');
console.log(`${ok ? '✅' : '⚠️'} ${a.padEnd(35)} current: ${cur.padEnd(15)} latest: ${latest}`);
} else console.log(`${a.padEnd(35)} current: ${cur.padEnd(15)} (no releases)`);
} catch (e) { console.log(`${a.padEnd(35)} Error: ${e.message?.substring(0, 50) || e}`); }
}
},
};
// CLI
const parseArgs = args => {
const r = { command: null, positional: [], options: {} };
for (let i = 0; i < args.length; i++) {
const a = args[i];
if (a.startsWith('--')) { const k = a.slice(2); const n = args[i + 1]; if (n && !n.startsWith('-')) { r.options[k] = n; i++; } else r.options[k] = true; }
else if (a.startsWith('-')) { const k = a.slice(1); const n = args[i + 1]; if (n && !n.startsWith('-')) { r.options[k] = n; i++; } else r.options[k] = true; }
else if (r.command === null) r.command = a; else r.positional.push(a);
}
return r;
};
const HELP = `
GitHub Actions CI/CD Workflow Monitor
COMMANDS:
runs [--branch <name>] List recent runs
watch <run-id> Watch run with status changes
fail-fast <run-id> Watch run, exit 1 on first failure
list-jobs <run-id> List jobs in run
log-failed <run-id> Show logs for failed jobs
log <run-id> [--filter <regex>] Show full run logs
grep <run-id> --pattern <regex> Search logs with context
test-summary <run-id> Extract test pass/fail counts
tail <run-id> <job-name> Get last N lines of job log
wait-for <run-id> <job> --keyword Block until keyword appears
analyze <run-id> <job> Pattern analysis for failures
compare <run1> <run2> Compare job statuses between runs
branch-runs <branch> List recent runs for branch
list-workflows List all workflow files
check-actions [file] Check action versions via GraphQL
OPTIONS: --interval, --timeout, --lines, --filter, --pattern, --context, --branch, --keyword, --limit, --repo/-R
`;
const REQ = {
'watch': ['run-id'], 'fail-fast': ['run-id'], 'list-jobs': ['run-id'], 'log-failed': ['run-id'],
'log': ['run-id'], 'grep': ['run-id'], 'test-summary': ['run-id'], 'tail': ['run-id', 'job-name'],
'wait-for': ['run-id', 'job-name'], 'analyze': ['run-id', 'job-name'], 'compare': ['run-id-1', 'run-id-2'],
'branch-runs': ['branch'],
};
async function main() {
const args = process.argv.slice(2);
if (!args.length || args[0] === 'help' || args[0] === '--help') { console.log(HELP); process.exit(0); }
const { command, positional, options } = parseArgs(args);
if (!cmd[command]) { console.error(`❌ Unknown command: ${command}`); console.log(HELP); process.exit(1); }
const req = REQ[command] || [];
if (req.some((_, i) => !positional[i])) { console.error(`❌ Missing: ${req.filter((_, i) => !positional[i]).join(', ')}`); process.exit(1); }
if (command === 'grep' && !options.pattern) { console.error('❌ --pattern required'); process.exit(1); }
if (command === 'wait-for' && !options.keyword) { console.error('❌ --keyword required'); process.exit(1); }
try { await cmd[command](...positional, options); }
catch (e) { console.error(`❌ Error: ${e.message}`); if (process.env.DEBUG) console.error(e.stack); process.exit(1); }
}
main();

scripts/ci_monitor.md (new file, 42 lines)

@@ -0,0 +1,42 @@
# ci_monitor.cjs
Cross-platform GitHub Actions CI monitoring tool. Pure Node.js — no shell commands.
## Usage
```bash
node scripts/ci_monitor.cjs <command>
```
**Before using:** Run `--help` to discover available arguments.
## Routing Table
| When You Need | Command |
|---------------|---------|
| List recent runs | `runs [--branch <name>]` |
| Monitor running workflow | `watch <run-id>` |
| Fail fast in scripts | `fail-fast <run-id>` |
| See why run failed | `log-failed <run-id>` |
| Test pass/fail counts | `test-summary <run-id>` |
| Check action versions | `check-actions <workflow-file>` |
| Search logs | `grep <run-id> --pattern <regex>` |
| Wait for deployment | `wait-for <run-id> <job> --keyword <text>` |
| Compare runs | `compare <run-id-1> <run-id-2>` |
## Validation Principle
**"No errors" is not validation.** Use observable output:
```bash
# NOT just "success" - show specific output
node scripts/ci_monitor.cjs test-summary <run-id>
node scripts/ci_monitor.cjs grep <run-id> --pattern "TypeError"
```
## Why Not Just Use `gh run`?
- **Observable output** — test-summary extracts counts, grep shows context
- **fail-fast** — exits 1 on first failure (for scripts)
- **GraphQL batching** — check-actions queries all versions in one request
- **Cross-platform** — no shell interpolation, works on Windows

@@ -78,7 +78,7 @@ function getExtensionKey(entryPath: string, extensionsDir: string): string {
*
* - extensions/ ~/.gsd/agent/extensions/ (always overwrite ensures updates ship on next launch)
* - agents/ ~/.gsd/agent/agents/ (always overwrite)
-* - AGENTS.md ~/.gsd/agent/AGENTS.md (always overwrite)
+* - skills/ ~/.gsd/agent/skills/ (always overwrite)
* - GSD-WORKFLOW.md is read directly from bundled path via GSD_WORKFLOW_PATH env var
*
* Always-overwrite ensures `npm update -g @glittercowboy/gsd` takes effect immediately.
@@ -107,13 +107,6 @@ export function initResources(agentDir: string): void {
if (existsSync(srcSkills)) {
cpSync(srcSkills, destSkills, { recursive: true, force: true })
}
-// Sync AGENTS.md
-const srcAgentsMd = join(resourcesDir, 'AGENTS.md')
-const destAgentsMd = join(agentDir, 'AGENTS.md')
-if (existsSync(srcAgentsMd)) {
-writeFileSync(destAgentsMd, readFileSync(srcAgentsMd))
-}
}
/**

@@ -58,6 +58,7 @@ export function getSearchProviderPreference(authPath?: string): SearchProviderPr
*/
export function setSearchProviderPreference(pref: SearchProviderPreference, authPath?: string): void {
const auth = AuthStorage.create(authPath ?? authFilePath)
+auth.remove(PREFERENCE_KEY)
auth.set(PREFERENCE_KEY, { type: 'api_key', key: pref })
}

@@ -0,0 +1,87 @@
# GitHub Workflows
**Mission:** Work with GitHub Actions without using stale training data. All syntax, versions, and parameters come from live sources.
---
## Structural Principle
**All CI operations go through ci_monitor.cjs.** Never reach for `gh` CLI directly — the script wraps it with observable output.
---
## Primary Tool: ci_monitor.cjs
**Path:** `scripts/ci_monitor.cjs`
```bash
node scripts/ci_monitor.cjs <command>
```
**Before using any command:**
- [ ] Run `--help` to discover available arguments
**Routing Table:**
| When You Need | Command |
|---------------|---------|
| List recent runs | `runs [--branch <name>]` |
| Monitor running workflow | `watch <run-id>` |
| Fail fast in scripts | `fail-fast <run-id>` |
| See why run failed | `log-failed <run-id>` |
| Test pass/fail counts | `test-summary <run-id>` |
| Check action versions | `check-actions [file]` |
| Search logs | `grep <run-id> --pattern <regex>` |
| Wait for deployment | `wait-for <run-id> <job> --keyword <text>` |
---
## Documentation Routing
**Base URL:** `https://docs.github.com/en/actions/reference/workflows-and-actions/`
**Before writing any workflow syntax:**
- [ ] Fetch the relevant `.md` file from the URL above
- [ ] Read only the section you need
| Task | File | Section |
|------|------|---------|
| Create workflow | workflow-syntax.md | `name`, `on`, `jobs` |
| Set triggers | workflow-syntax.md | `on` |
| Set permissions | workflow-syntax.md | `permissions` |
| Concurrency | workflow-syntax.md | `concurrency` |
| Reusable workflow | workflow-syntax.md | `on.workflow_call` |
| Annotations | workflow-commands.md | "Setting an error/warning/notice message" |
| Output variables | workflow-commands.md | "Environment files" |
| Conditionals | expressions.md | "Operators", "Functions" |
| Contexts | contexts.md | "<context> context" |
| Events | events-that-trigger-workflows.md | Event tables |
---
## Version Verification
| What | Where |
|------|-------|
| Action versions | `node ci_monitor.cjs check-actions <file>` |
| Node.js LTS | `curl -s https://nodejs.org/dist/index.json \| jq '.[0].version'` |
---
## Validation Constraint
**"No errors" is not validation.** Prove observable change:
```
BEFORE: [specific state]
AFTER: [different state]
EVIDENCE: [output from ci_monitor.cjs]
```
---
## References
- `references/gh/SKILL.md` — gh CLI reference
- `scripts/ci_monitor.cjs` — CI monitoring tool
- `scripts/ci_monitor.md` — Tool usage documentation

@@ -0,0 +1,255 @@
---
name: gh
description: "Install and configure the GitHub CLI (gh) for AI agent environments where gh may not be pre-installed and git remotes use local proxies instead of github.com. Provides auto-install script with SHA256 verification and GITHUB_TOKEN auth with anonymous fallback. Use when gh command not found, shutil.which(\"gh\") returns None, need GitHub API access (issues, PRs, releases, workflow runs), or repository operations fail with \"failed to determine base repo\" error. Documents required -R flag for all gh commands in proxy environments. Includes project management: GitHub Projects V2 (gh project), milestones (REST API), issue stories (lifecycle and templates), and label taxonomy management."
---
# GitHub CLI (gh) — Setup and Usage
## Purpose
Ensures the GitHub CLI (`gh`) is available and provides correct usage patterns for AI agents operating in environments where `gh` may not be pre-installed and where git remotes point to local proxies instead of `github.com`.
## When to Use
- `gh` command not found or `shutil.which("gh")` returns None
- Need to interact with GitHub API (issues, PRs, releases, workflows)
- Repository remote does not point to `github.com` (proxy environments)
- Need authenticated GitHub operations with `GITHUB_TOKEN`
- Managing GitHub Issues, Projects V2, Milestones, or Labels
---
## Installation
`gh` is assumed to be already installed via the system package manager:
- **macOS**: `brew install gh`
- **Windows**: `winget install GitHub.cli`
- **Linux (Debian/Ubuntu)**: `apt install gh`
For CI monitoring operations (watching workflow runs, checking statuses, waiting for jobs), use `ci_monitor.cjs`:
```bash
node src/resources/skills/github-workflows/references/gh/scripts/ci_monitor.cjs
```
---
## Authentication
`GITHUB_TOKEN` environment variable provides automatic authentication. No manual `gh auth login` needed.
```bash
# Verify authentication
gh auth status
```
If `GITHUB_TOKEN` is set, `gh` authenticates automatically for all API calls.
---
## Repository Detection
<repo_detection>
Git remote points to a local proxy (`127.0.0.1`), NOT `github.com`. Every `gh` command fails without explicit repo specification:
```text
failed to determine base repo: none of the git remotes configured for this
repository point to a known GitHub host.
```
**RULE: Pass `-R` (or `--repo`) on EVERY `gh` command:**
```bash
gh <command> -R gsd-build/gsd-2
```
This applies to ALL `gh` subcommands: `pr`, `issue`, `run`, `api`, `release`, `project`, etc.
</repo_detection>
---
## Common Commands (v2.87.2)
<gh_commands>
### Pull Requests
```bash
# List open PRs
gh pr list -R gsd-build/gsd-2
# View PR details
gh pr view <number> -R gsd-build/gsd-2
# Check PR CI status
gh pr checks <number> -R gsd-build/gsd-2
# Create PR
gh pr create -R gsd-build/gsd-2 --title "title" --body "body"
# View PR comments
gh api repos/gsd-build/gsd-2/pulls/<number>/comments
```
### Issues
```bash
# List issues
gh issue list -R gsd-build/gsd-2
# List by label
gh issue list -R gsd-build/gsd-2 --label "priority:p1" --state open
# Create issue with labels and milestone
gh issue create -R gsd-build/gsd-2 \
--title "feat: add feature X" \
--label "priority:p1" --label "type:feature" \
--milestone "v1.0"
# View issue
gh issue view <number> -R gsd-build/gsd-2
# Close issue with comment
gh issue close <number> -R gsd-build/gsd-2 --comment "Implemented in PR #N"
# Edit labels on issue
gh issue edit <number> -R gsd-build/gsd-2 \
--add-label "status:in-progress" \
--remove-label "status:needs-grooming"
```
### Labels
```bash
# List all labels
gh label list -R gsd-build/gsd-2
# Create label
gh label create "priority:p1" --color "E99695" \
--description "High priority" -R gsd-build/gsd-2
```
See [labels.md](./references/labels.md) for the full taxonomy and color codes.
### Projects V2
```bash
# List projects
gh project list --owner gsd-build
# Create project
gh project create --owner gsd-build --title "gsd-2 Backlog"
# Add issue to project
gh project item-add 1 --owner gsd-build \
--url https://github.com/gsd-build/gsd-2/issues/42
```
See [projects-v2.md](./references/projects-v2.md) for field creation and item editing commands.
### Milestones
`gh` has no native `milestone` subcommand — use `gh api` with the REST endpoint:
```bash
# List milestones
gh api repos/gsd-build/gsd-2/milestones
# Create milestone
gh api repos/gsd-build/gsd-2/milestones \
-X POST -f title="v1.0" -f due_on="2026-03-31T00:00:00Z"
# Assign milestone to issue
gh api repos/gsd-build/gsd-2/issues/42 \
-X PATCH -F milestone=1
```
See [milestones.md](./references/milestones.md) for full CRUD reference.
### Workflow Runs
```bash
# List recent runs
gh run list -R gsd-build/gsd-2 --limit 5
# View specific run
gh run view <run-id> -R gsd-build/gsd-2
# View failed job logs
gh run view <run-id> -R gsd-build/gsd-2 --log-failed
```
### Releases
```bash
# List releases
gh release list -R gsd-build/gsd-2
# View latest release
gh release view --repo gsd-build/gsd-2
```
### API (Direct)
```bash
# GET request
gh api repos/gsd-build/gsd-2
# POST with fields
gh api repos/gsd-build/gsd-2/issues -f title="Bug" -f body="Details"
# GraphQL
gh api graphql -f query='{ viewer { login } }'
# Paginated results
gh api repos/gsd-build/gsd-2/contributors --paginate
```
### Repository
```bash
# Clone
gh repo clone gsd-build/gsd-2
# View repo info
gh repo view -R gsd-build/gsd-2
```
</gh_commands>
---
## Output Formatting
```bash
# JSON output
gh pr list -R gsd-build/gsd-2 --json number,title,state
# JQ filtering
gh pr list -R gsd-build/gsd-2 --json number,title --jq '.[].title'
# Template formatting
gh pr list -R gsd-build/gsd-2 --json number,title \
--template '{{range .}}#{{.number}} {{.title}}{{"\n"}}{{end}}'
```
---
## Reference Files
- [labels.md](./references/labels.md) — Label taxonomy (priority, type, status), color codes, bulk setup
- [milestones.md](./references/milestones.md) — Milestone CRUD via REST API, naming conventions
- [projects-v2.md](./references/projects-v2.md) — GitHub Projects V2 commands, custom fields, GraphQL queries
- [issue-stories.md](./references/issue-stories.md) — Issue as story format, body template, lifecycle, backlog item field mapping
---
## Sources
- [GitHub CLI Manual](https://cli.github.com/manual) — official reference
- [GitHub CLI Releases](https://github.com/cli/cli/releases) — binary downloads
- [GitHub REST API — Issues](https://docs.github.com/en/rest/issues) — milestones, labels, issues
- [GitHub Projects V2 API](https://docs.github.com/en/issues/planning-and-tracking-with-projects/automating-your-project/using-the-api-to-manage-projects) — GraphQL API
- `gh version 2.87.2 (2026-02-20)` — version verified by installation test

# GitHub Issues as User Stories — Workflow and Templates
## Issue as Story Model
Each issue represents one backlog item and follows a story format with:
- **Title**: short, imperative, `[type]: description` prefix (conventional commits style)
- **Body**: user story + acceptance criteria + context
- **Labels**: priority + type + status (see [labels.md](./labels.md))
- **Milestone**: release or theme grouping (see [milestones.md](./milestones.md))
- **Project**: board item for visualization (see [projects-v2.md](./projects-v2.md))
---
## Issue Title Convention
```text
feat: add priority labels to issue taxonomy
fix: correct task_output variable reference in log_functions.sh
refactor: replace hardcoded corporate URL in validate_glfm.py
docs: document GitHub Projects V2 workflow
chore: bump marketplace.json version after plugin removal
```
Mirrors Conventional Commits to link commits to issues naturally.
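In practice, a PR body (or commit footer) references the issue it implements so GitHub links and auto-closes it on merge; for example:

```text
fix: correct task_output variable reference in log_functions.sh

Closes #42
```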
---
## Issue Body Template
```markdown
## Story
As a **{role}**, I want **{goal}** so that **{benefit}**.
## Description
{detailed description from backlog item}
## Acceptance Criteria
- [ ] {criterion 1}
- [ ] {criterion 2}
- [ ] {criterion 3}
## Context
- **Source**: {where this item came from}
- **Priority**: {P0 / P1 / P2 / Idea}
- **Added**: {YYYY-MM-DD}
- **Research questions**: {any open questions, or "None"}
## Notes
{optional: links to related issues, PRs, skills, or research}
```
---
## Issue Lifecycle
```text
Open → label: status:needs-grooming
  ↓ after grooming
label: status:in-progress (when work starts)
  ↓ during work
PR created → PR body: "Closes #N"
  ↓ PR merged
Issue auto-closed by GitHub
Milestone completion tracked
```
---
## gh CLI — Quick Commands
```bash
# Create issue
gh issue create -R Jamie-BitFlight/claude_skills \
--title "fix: correct task_output variable in log_functions.sh" \
--body "..." \
--label "priority:p1" \
--label "type:bug" \
--label "status:needs-grooming" \
--milestone "v1.0 — Skills Foundation"
# List open issues by priority
gh issue list -R Jamie-BitFlight/claude_skills \
--label "priority:p1" --state open \
--json number,title,labels,milestone
# View issue
gh issue view 42 -R Jamie-BitFlight/claude_skills
# Close with comment
gh issue close 42 -R Jamie-BitFlight/claude_skills \
--comment "Implemented in PR #45. Checklist 12/12, acceptance criteria verified."
# Edit labels
gh issue edit 42 -R Jamie-BitFlight/claude_skills \
--add-label "status:in-progress" --remove-label "status:needs-grooming"
```
---
## PyGithub — Scripted Operations (Python)
Use `PyGithub` in Python scripts — never shell out to `gh`.
```python
#!/usr/bin/env -S uv run --quiet --script
# /// script
# requires-python = ">=3.11"
# dependencies = ["PyGithub>=2.1.1"]
# ///
from __future__ import annotations
import os
from github import Auth, Github
gh = Github(auth=Auth.Token(os.environ["GITHUB_TOKEN"]))
repo = gh.get_repo("Jamie-BitFlight/claude_skills")
# Create issue with labels and milestone
issue = repo.create_issue(
    title="fix: correct task_output variable in log_functions.sh",
    body="## Story\n\nAs a developer...",
    labels=[
        repo.get_label("priority:p1"),
        repo.get_label("type:bug"),
        repo.get_label("status:needs-grooming"),
    ],
    milestone=repo.get_milestone(1),
)
print(f"Created #{issue.number}: {issue.html_url}")
# Close issue
issue = repo.get_issue(42)
issue.edit(state="closed")
issue.create_comment("Implemented in PR #45.")
```
---
## @octokit/rest — Claude Code Hooks (JavaScript)
```javascript
const { Octokit } = require('@octokit/rest');
const octokit = new Octokit({ auth: process.env.GITHUB_TOKEN });
// Create issue
const { data: issue } = await octokit.rest.issues.create({
  owner: 'Jamie-BitFlight',
  repo: 'claude_skills',
  title: 'fix: correct task_output variable',
  body: '## Story\n\nAs a developer...',
  labels: ['priority:p1', 'type:bug', 'status:needs-grooming'],
  milestone: 1,
});
// Close issue
await octokit.rest.issues.update({
  owner: 'Jamie-BitFlight',
  repo: 'claude_skills',
  issue_number: 42,
  state: 'closed',
});
```
---
## Automation Script
```bash
uv run .claude/skills/gh/scripts/github_project_setup.py issue list --priority p1
uv run .claude/skills/gh/scripts/github_project_setup.py issue create \
--title "fix: correct task_output variable" \
--priority-label priority:p1 \
--type-label type:bug \
--milestone 1
```
---
## Backlog Item ↔ GitHub Issue Field Mapping
| Per-item file field | GitHub Issue field |
|---|----|
| `name` frontmatter | Issue title |
| `metadata.priority` frontmatter (P0/P1/P2) | `priority:*` label |
| Item description body | Issue body |
| `metadata.status` frontmatter | `status:*` label |
| `metadata.plan` frontmatter | Issue body Notes section |
| `metadata.issue` frontmatter | Issue number (written back by `work-backlog-item`) |
| `last-completed` frontmatter | Issue closed date |
When `work-backlog-item` creates a GitHub Issue, it writes the issue number back to the per-item file in `.claude/backlog/` as `metadata.issue: '#N'`.
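A per-item file after write-back might look like this (a sketch; field values are hypothetical, field names follow the mapping above):

```yaml
---
name: correct task_output variable reference in log_functions.sh
metadata:
  priority: P1
  status: in-progress
  plan: fix the reference and add a regression test
  issue: '#42'
---
Item description body goes here.
```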
SOURCE: GitHub Issues documentation — <https://docs.github.com/en/issues/tracking-your-work-with-issues> (accessed 2026-02-21)
SOURCE: GitHub CLI issue manual — <https://cli.github.com/manual/gh_issue> (accessed 2026-02-21)
SOURCE: PyGithub Issue API — <https://pygithub.readthedocs.io/en/latest/github_objects/Issue.html> (accessed 2026-02-21)
SOURCE: Octokit.js REST — <https://octokit.github.io/rest.js/v20#issues-create> (accessed 2026-02-21)

# GitHub Labels — Taxonomy and Management
## When to Use What
| Context | Tool |
|---------|------|
| Quick one-off command | `gh label` CLI |
| Scripted / multi-step | `PyGithub` (Python) or `@octokit/rest` (JS) |
| Claude Code hook | `@octokit/rest` or Node.js `https` built-in |
---
## Standard Label Taxonomy
Three axes: **priority**, **type**, **status**.
### Priority Labels
| Label | Color | Description |
|-------|-------|-------------|
| `priority:p0` | `#D73A4A` | Critical — blocks work or production |
| `priority:p1` | `#E99695` | High — should be done next |
| `priority:p2` | `#F9D0C4` | Medium — do when P0/P1 are clear |
| `priority:idea` | `#BFD4F2` | Unscoped — future consideration |
### Type Labels
| Label | Color | Description |
|-------|-------|-------------|
| `type:feature` | `#0E8A16` | New capability or skill |
| `type:bug` | `#B60205` | Something is broken |
| `type:refactor` | `#5319E7` | Internal improvement, no behavior change |
| `type:docs` | `#0075CA` | Documentation only |
| `type:chore` | `#EDEDED` | Maintenance, tooling, CI |
### Status Labels
All 8 lifecycle states from the backlog state machine (`.claude/skills/backlog/references/state-machine.md`) have corresponding labels.
| Label | Color | Description |
|-------|-------|-------------|
| `status:needs-grooming` | `#FEF2C0` | Captured but not yet groomed |
| `status:groomed` | `#C2E0C6` | Grooming complete, RT-ICA APPROVED |
| `status:blocked` | `#B60205` | RT-ICA BLOCKED or AC verification FAIL |
| `status:in-milestone` | `#BFD4F2` | Assigned to an active milestone |
| `status:in-progress` | `#1D76DB` | Actively being worked |
| `status:done` | `#0E8A16` | Implementation complete, AC verified PASS |
| `status:resolved` | `#6B737B` | Closed without full implementation (obsolete/superseded) |
| `status:closed` | `#EDEDED` | Terminal — milestone archived by complete-milestone |
> Note: `status:needs-review` was previously in this taxonomy but is not part of the
> state machine lifecycle. It has been retained in `github_project_setup.py` for
> backwards compatibility but should not be applied by backlog commands.
---
## gh CLI — Quick Commands
```bash
# List all labels
gh label list -R Jamie-BitFlight/claude_skills
# Create a label
gh label create "priority:p1" \
--color "E99695" \
--description "High priority — should be done next" \
-R Jamie-BitFlight/claude_skills
# Edit a label
gh label edit "priority:p1" \
--description "High priority — updated" \
-R Jamie-BitFlight/claude_skills
# Apply labels to an issue
gh issue edit 42 -R Jamie-BitFlight/claude_skills \
--add-label "status:in-progress" \
--remove-label "status:needs-grooming"
```
---
## PyGithub — Scripted Operations (Python)
Use `PyGithub` (`github` package) in Python scripts — never shell out to `gh`.
```python
#!/usr/bin/env -S uv run --quiet --script
# /// script
# requires-python = ">=3.11"
# dependencies = ["PyGithub>=2.1.1"]
# ///
from __future__ import annotations
import os
from github import Auth, Github
gh = Github(auth=Auth.Token(os.environ["GITHUB_TOKEN"]))
repo = gh.get_repo("Jamie-BitFlight/claude_skills")
# Create a label
repo.create_label(name="priority:p1", color="E99695", description="High priority")
# Edit existing label
label = repo.get_label("priority:p1")
label.edit(name="priority:p1", color="E99695", description="Updated description")
# Apply label to issue
issue = repo.get_issue(42)
issue.add_to_labels(repo.get_label("status:in-progress"))
issue.remove_from_labels(repo.get_label("status:needs-grooming"))
```
---
## @octokit/rest — Claude Code Hooks (JavaScript)
Use `@octokit/rest` in `.cjs` hook files — never use child_process to call `gh`.
```javascript
// In a Claude Code hook (.cjs)
const { Octokit } = require('@octokit/rest');
const octokit = new Octokit({ auth: process.env.GITHUB_TOKEN });
// Apply label to issue
await octokit.rest.issues.addLabels({
  owner: 'Jamie-BitFlight',
  repo: 'claude_skills',
  issue_number: 42,
  labels: ['status:in-progress'],
});
// Remove label
await octokit.rest.issues.removeLabel({
  owner: 'Jamie-BitFlight',
  repo: 'claude_skills',
  issue_number: 42,
  name: 'status:needs-grooming',
});
```
---
## Bulk Label Setup
```bash
# Creates all taxonomy labels; skips existing
uv run .claude/skills/gh/scripts/github_project_setup.py labels \
--repo Jamie-BitFlight/claude_skills
# Force-update existing labels too
uv run .claude/skills/gh/scripts/github_project_setup.py labels \
--repo Jamie-BitFlight/claude_skills --force
```
---
## Backlog Item Priority → Issue Label Mapping
| Per-item file priority | Issue label |
|--------------------|-------------|
| P0 | `priority:p0` |
| P1 | `priority:p1` |
| P2 | `priority:p2` |
| Ideas | `priority:idea` |
SOURCE: GitHub CLI label documentation — <https://cli.github.com/manual/gh_label> (accessed 2026-02-21)
SOURCE: PyGithub label API — <https://pygithub.readthedocs.io/en/latest/github_objects/Label.html> (accessed 2026-02-21)
SOURCE: Octokit.js REST — <https://github.com/octokit/rest.js> (accessed 2026-02-21)

# GitHub Milestones — Management
`gh` has no native `milestone` subcommand. Use `gh api` (REST) for quick operations or `PyGithub` in scripts.
## When to Use What
| Context | Tool |
|---------|------|
| Quick one-off | `gh api repos/{owner}/{repo}/milestones` |
| Scripted / multi-step | `PyGithub` (`repo.create_milestone()`) |
| Claude Code hook | `@octokit/rest` |
---
## gh CLI (REST) — Quick Commands
### List Milestones
```bash
gh api repos/Jamie-BitFlight/claude_skills/milestones \
--jq '.[] | [.number, .title, .state, .open_issues, .due_on] | @tsv'
```
### Create a Milestone
```bash
gh api repos/Jamie-BitFlight/claude_skills/milestones \
-X POST \
-f title="v1.0 — Skills Foundation" \
-f description="Core skills for the claude_skills plugin marketplace" \
-f due_on="2026-03-31T00:00:00Z" \
-f state="open"
```
Returns JSON with `number` field — use this to assign issues.
### Update a Milestone
```bash
gh api repos/Jamie-BitFlight/claude_skills/milestones/1 \
-X PATCH -f due_on="2026-04-15T00:00:00Z"
```
### Assign Milestone to Issue
```bash
# -F sends value as integer (required for milestone field)
gh api repos/Jamie-BitFlight/claude_skills/issues/42 \
-X PATCH -F milestone=1
# Remove milestone
gh api repos/Jamie-BitFlight/claude_skills/issues/42 \
-X PATCH -F milestone=null
```
### List Issues in a Milestone
```bash
gh issue list -R Jamie-BitFlight/claude_skills \
--milestone "v1.0 — Skills Foundation" \
--json number,title,state,labels
```
---
## PyGithub — Scripted Operations (Python)
Use `PyGithub` in Python scripts — never shell out to `gh`.
```python
#!/usr/bin/env -S uv run --quiet --script
# /// script
# requires-python = ">=3.11"
# dependencies = ["PyGithub>=2.1.1"]
# ///
from __future__ import annotations
import os
from datetime import datetime, timezone
from github import Auth, Github
gh = Github(auth=Auth.Token(os.environ["GITHUB_TOKEN"]))
repo = gh.get_repo("Jamie-BitFlight/claude_skills")
# Create milestone
milestone = repo.create_milestone(
    title="v1.0 — Skills Foundation",
    description="Core skills for the claude_skills plugin marketplace",
    due_on=datetime(2026, 3, 31, tzinfo=timezone.utc),
)
# List milestones
for m in repo.get_milestones(state="all"):
    print(f"#{m.number} {m.title}")
# Assign milestone to issue
repo.get_issue(42).edit(milestone=repo.get_milestone(1))
# Close milestone
m = repo.get_milestone(1)
m.edit(title=m.title, state="closed")
```
---
## @octokit/rest — Claude Code Hooks (JavaScript)
```javascript
const { Octokit } = require('@octokit/rest');
const octokit = new Octokit({ auth: process.env.GITHUB_TOKEN });
// Create milestone
const { data: milestone } = await octokit.rest.issues.createMilestone({
  owner: 'Jamie-BitFlight',
  repo: 'claude_skills',
  title: 'v1.0 — Skills Foundation',
  due_on: '2026-03-31T00:00:00Z',
});
// Assign milestone to issue
await octokit.rest.issues.update({
  owner: 'Jamie-BitFlight',
  repo: 'claude_skills',
  issue_number: 42,
  milestone: milestone.number,
});
```
---
## Automation Script
```bash
uv run .claude/skills/gh/scripts/github_project_setup.py milestone list
uv run .claude/skills/gh/scripts/github_project_setup.py milestone create \
--title "v1.0 — Skills Foundation" --due 2026-03-31
uv run .claude/skills/gh/scripts/github_project_setup.py milestone start \
--number 3
uv run .claude/skills/gh/scripts/github_project_setup.py milestone start \
--number 3 --dry-run
```
---
## Milestone Naming Conventions
```text
v1.0 — Skills Foundation # initial stable release
v1.1 — Quality Gates # linting/validation improvements
v2.0 — GitHub Integration # issues, projects, milestones support
Backlog Grooming — 2026-Q1 # quarterly grooming milestone
```
SOURCE: GitHub REST API — Milestones — <https://docs.github.com/en/rest/issues/milestones> (accessed 2026-02-21)
SOURCE: PyGithub Milestone API — <https://pygithub.readthedocs.io/en/latest/github_objects/Milestone.html> (accessed 2026-02-21)
SOURCE: Octokit.js REST — <https://octokit.github.io/rest.js/v20#issues-create-milestone> (accessed 2026-02-21)

# GitHub Projects V2 — Management
GitHub Projects V2 is the current projects system (board, table, roadmap views). Managed via `gh project` CLI or GraphQL API.
**Scope requirement**: `GITHUB_TOKEN` needs `project` scope. Verify with:
```bash
gh auth status
```
## When to Use What
| Context | Tool |
|---------|------|
| Quick one-off | `gh project` CLI |
| Scripted / multi-step | GraphQL via `PyGithub` or `@octokit/graphql` (JS) |
| Claude Code hook | `@octokit/graphql` |
Note: PyGithub does not currently expose Projects V2 objects natively. Use `repo.requester` for raw GraphQL, or `@octokit/graphql` in JS hooks.
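As a sketch of the raw-GraphQL route from Python (stdlib only; the login and project number are placeholders, and the request shape follows GitHub's GraphQL endpoint conventions):

```python
import json
import os
import urllib.request

# Query a user's Project V2 by number; the selected fields mirror the gh examples below.
QUERY = """
query($login: String!, $number: Int!) {
  user(login: $login) {
    projectV2(number: $number) { id title }
  }
}
"""

def build_request(login: str, number: int) -> urllib.request.Request:
    """Build a POST request for the GitHub GraphQL endpoint."""
    payload = json.dumps({"query": QUERY, "variables": {"login": login, "number": number}})
    return urllib.request.Request(
        "https://api.github.com/graphql",
        data=payload.encode(),
        headers={
            "Authorization": f"Bearer {os.environ.get('GITHUB_TOKEN', '')}",
            "Content-Type": "application/json",
        },
    )

# urllib.request.urlopen(build_request("Jamie-BitFlight", 1)).read() would execute it.
```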
---
## gh CLI — Quick Commands
### Project Lifecycle
```bash
# Create for a user
gh project create --owner Jamie-BitFlight --title "claude_skills Backlog"
# Returns project number (e.g., 1)
# List projects
gh project list --owner Jamie-BitFlight
# Link project to repository
gh project link 1 --owner Jamie-BitFlight --repo Jamie-BitFlight/claude_skills
# View project
gh project view 1 --owner Jamie-BitFlight
```
### Custom Fields
```bash
# List fields
gh project field-list 1 --owner Jamie-BitFlight --format json
# Create Priority single-select field
gh project field-create 1 --owner Jamie-BitFlight \
--name "Priority" \
--data-type SINGLE_SELECT \
--single-select-options "P0,P1,P2,Idea"
# Create Status field
gh project field-create 1 --owner Jamie-BitFlight \
--name "Status" \
--data-type SINGLE_SELECT \
--single-select-options "Backlog,Grooming,In Progress,Review,Done"
```
### Adding Items
```bash
# Add issue to project
gh project item-add 1 --owner Jamie-BitFlight \
--url https://github.com/Jamie-BitFlight/claude_skills/issues/42
# List items
gh project item-list 1 --owner Jamie-BitFlight --format json
```
### Editing Item Fields
Field values require node IDs — retrieve from `field-list` and `item-list`.
```bash
gh project item-edit \
--project-id <project-node-id> \
--id <item-node-id> \
--field-id <field-node-id> \
--single-select-option-id <option-node-id>
```
---
## GraphQL — Get Node IDs
```bash
# Get project node ID and field option IDs
gh api graphql -f query='
{
  user(login: "Jamie-BitFlight") {
    projectV2(number: 1) {
      id
      fields(first: 20) {
        nodes {
          ... on ProjectV2SingleSelectField {
            id
            name
            options { id name }
          }
        }
      }
    }
  }
}'
```
---
## @octokit/graphql — Hooks (JavaScript)
Use `@octokit/graphql` in Claude Code hooks for Projects V2 operations.
```javascript
const { graphql } = require('@octokit/graphql');
const graphqlWithAuth = graphql.defaults({
  headers: { authorization: `token ${process.env.GITHUB_TOKEN}` },
});
// Add issue to project
const { addProjectV2ItemById } = await graphqlWithAuth(`
  mutation AddItem($projectId: ID!, $contentId: ID!) {
    addProjectV2ItemById(input: { projectId: $projectId, contentId: $contentId }) {
      item { id }
    }
  }
`, {
  projectId: 'PVT_kwXXX',
  contentId: 'I_kwXXX', // issue node ID from gh api
});
// Set single-select field value
await graphqlWithAuth(`
  mutation SetField($projectId: ID!, $itemId: ID!, $fieldId: ID!, $optionId: String!) {
    updateProjectV2ItemFieldValue(input: {
      projectId: $projectId
      itemId: $itemId
      fieldId: $fieldId
      value: { singleSelectOptionId: $optionId }
    }) {
      projectV2Item { id }
    }
  }
`, { projectId: '...', itemId: '...', fieldId: '...', optionId: '...' });
```
---
## Automation Script
```bash
# Full setup (labels + project creation instructions)
uv run .claude/skills/gh/scripts/github_project_setup.py setup \
--repo Jamie-BitFlight/claude_skills
```
---
## Standard Project Structure
```text
Project: "claude_skills Backlog"
Fields:
- Status: Backlog | Grooming | In Progress | Review | Done
- Priority: P0 | P1 | P2 | Idea
- Type: Feature | Bug | Refactor | Docs | Chore
Views:
- Board (grouped by Status)
- Table (all fields visible)
- Roadmap (grouped by Milestone)
```
SOURCE: GitHub CLI Projects documentation — <https://cli.github.com/manual/gh_project> (accessed 2026-02-21)
SOURCE: GitHub Projects V2 GraphQL API — <https://docs.github.com/en/issues/planning-and-tracking-with-projects/automating-your-project/using-the-api-to-manage-projects> (accessed 2026-02-21)
SOURCE: Octokit GraphQL — <https://github.com/octokit/graphql.js> (accessed 2026-02-21)

#!/usr/bin/env -S uv --quiet run --active --script
# /// script
# requires-python = ">=3.11"
# dependencies = [
# "typer>=0.21.0",
# "PyGithub>=2.1.1",
# ]
# ///
"""Experiment cleanup — removes GitHub resources created during workflow experiments.

Deletes only resources tagged with the experiment prefix to avoid clobbering
production data. Safe to run between iterations.

Usage:
    experiment_cleanup.py run --repo OWNER/REPO [--prefix experiment/] [--dry-run]
    experiment_cleanup.py list --repo OWNER/REPO [--prefix experiment/]
"""
from __future__ import annotations
import os
from typing import TYPE_CHECKING, Annotated
import typer
from github import Auth, Github, GithubException
if TYPE_CHECKING:
    from github.Label import Label
    from github.Repository import Repository

app = typer.Typer(help="Remove experiment-created GitHub resources between test iterations")

EXPERIMENT_LABELS = [
    "priority:p0",
    "priority:p1",
    "priority:p2",
    "priority:idea",
    "type:feature",
    "type:bug",
    "type:refactor",
    "type:docs",
    "type:chore",
    "status:in-progress",
    "status:blocked",
    "status:needs-grooming",
    "status:needs-review",
]
EXPERIMENT_MILESTONE_PREFIX = "v1.0"
EXPERIMENT_PROJECT_TITLE = "claude_skills Backlog"
def get_gh(repo_slug: str) -> tuple[Github, Repository]:
    """Authenticate and return a Github client and Repository.

    Args:
        repo_slug: Repository identifier in ``owner/repo`` format.

    Returns:
        Tuple of authenticated Github client and Repository object.
    """
    token = os.environ.get("GITHUB_TOKEN")
    if not token:
        typer.echo("ERROR: GITHUB_TOKEN not set", err=True)
        raise typer.Exit(1)
    gh = Github(auth=Auth.Token(token))
    try:
        return gh, gh.get_repo(repo_slug)
    except GithubException as exc:
        typer.echo(f"ERROR: Cannot access repo '{repo_slug}': {exc}", err=True)
        raise typer.Exit(1) from exc
@app.command("list")  # expose as "list" to match the usage docstring
def list_resources(
    repo: Annotated[str, typer.Option("--repo", help="owner/repo")] = "Jamie-BitFlight/claude_skills",
) -> None:
    """List experiment-created resources that would be removed."""
    _, repository = get_gh(repo)
    typer.echo("=== Labels (experiment taxonomy) ===")
    existing = {label.name for label in repository.get_labels()}
    for name in EXPERIMENT_LABELS:
        mark = "[EXISTS]" if name in existing else "[absent]"
        typer.echo(f" {mark} {name}")
    typer.echo("\n=== Milestones ===")
    for ms in repository.get_milestones(state="open"):
        if ms.title.startswith(EXPERIMENT_MILESTONE_PREFIX):
            typer.echo(f" [EXISTS] #{ms.number} {ms.title}")
    typer.echo("\n=== Issues with experiment labels ===")
    for label_name in EXPERIMENT_LABELS:
        if label_name not in existing:
            continue
        label_obj = repository.get_label(label_name)
        for issue in repository.get_issues(labels=[label_obj], state="all"):
            typer.echo(f" #{issue.number} [{issue.state}] {issue.title}")
def _close_issues(repository: Repository, existing_labels: dict[str, Label], prefix: str, dry_run: bool) -> int:
    """Close open issues that carry any experiment label.

    Args:
        repository: GitHub repository object.
        existing_labels: Mapping of label name to Label object.
        prefix: Log prefix for dry-run mode.
        dry_run: If True, only print what would happen.

    Returns:
        Number of issues closed.
    """
    closed = 0
    for label_name in EXPERIMENT_LABELS:
        if label_name not in existing_labels:
            continue
        label_obj = existing_labels[label_name]
        for issue in repository.get_issues(labels=[label_obj], state="open"):
            typer.echo(f"{prefix}Close issue #{issue.number}: {issue.title}")
            if not dry_run:
                issue.edit(state="closed")
            closed += 1
    return closed
def _delete_labels(existing_labels: dict[str, Label], prefix: str, dry_run: bool) -> int:
    """Delete experiment taxonomy labels.

    Args:
        existing_labels: Mapping of label name to Label object.
        prefix: Log prefix for dry-run mode.
        dry_run: If True, only print what would happen.

    Returns:
        Number of labels deleted.
    """
    deleted = 0
    for name, label_obj in existing_labels.items():
        if name in EXPERIMENT_LABELS:
            typer.echo(f"{prefix}Delete label: {name}")
            if not dry_run:
                label_obj.delete()
            deleted += 1
    return deleted
def _close_milestones(repository: Repository, prefix: str, dry_run: bool) -> int:
    """Close milestones with the experiment title prefix.

    Args:
        repository: GitHub repository object.
        prefix: Log prefix for dry-run mode.
        dry_run: If True, only print what would happen.

    Returns:
        Number of milestones closed.
    """
    closed = 0
    for ms in repository.get_milestones(state="open"):
        if ms.title.startswith(EXPERIMENT_MILESTONE_PREFIX):
            typer.echo(f"{prefix}Close milestone #{ms.number}: {ms.title}")
            if not dry_run:
                ms.edit(title=ms.title, state="closed")
            closed += 1
    return closed
@app.command()
def run(
    repo: Annotated[str, typer.Option("--repo", help="owner/repo")] = "Jamie-BitFlight/claude_skills",
    dry_run: Annotated[bool, typer.Option("--dry-run", help="Print actions without executing")] = False,
) -> None:
    """Remove experiment-created labels, milestones, and issues."""
    _, repository = get_gh(repo)
    prefix = "[DRY-RUN] " if dry_run else ""
    existing_labels = {label.name: label for label in repository.get_labels()}
    issues_closed = _close_issues(repository, existing_labels, prefix, dry_run)
    deleted_labels = _delete_labels(existing_labels, prefix, dry_run)
    closed_milestones = _close_milestones(repository, prefix, dry_run)
    typer.echo("\nCleanup summary:")
    typer.echo(f" Issues closed: {issues_closed}")
    typer.echo(f" Labels deleted: {deleted_labels}")
    typer.echo(f" Milestones closed: {closed_milestones}")
    if dry_run:
        typer.echo(" (dry-run — no changes made)")

if __name__ == "__main__":
    app()

#!/usr/bin/env -S uv --quiet run --active --script
# /// script
# requires-python = ">=3.11"
# dependencies = [
# "typer>=0.21.0",
# "PyGithub>=2.1.1",
# ]
# ///
"""GitHub Project Setup — multi-step project management automation.

Orchestrates: label creation, milestone management, project setup,
Projects V2 status updates, and backlog item issue import using the
PyGithub native library. Projects V2 GraphQL mutations use the gh CLI.

Authentication: reads GITHUB_TOKEN from environment.

Usage:
    github_project_setup.py setup --repo OWNER/REPO [--project-title TITLE]
    github_project_setup.py labels --repo OWNER/REPO [--force]
    github_project_setup.py milestone create --repo OWNER/REPO --title TITLE [--due YYYY-MM-DD]
    github_project_setup.py milestone list --repo OWNER/REPO
    github_project_setup.py milestone start --repo OWNER/REPO --number N [--dry-run]
    github_project_setup.py milestone close --repo OWNER/REPO --number N [--dry-run]
    github_project_setup.py issue create --repo OWNER/REPO --title TITLE [options]
    github_project_setup.py issue list --repo OWNER/REPO [--priority p1] [--state open]
    github_project_setup.py project update-status --project-number N --issue-number N --status STATUS
"""
from __future__ import annotations
import json
import os
import shutil
import subprocess
from datetime import UTC, datetime
from typing import TYPE_CHECKING, Annotated
import typer
from github import Auth, Github, GithubException
if TYPE_CHECKING:
    from github.Issue import Issue
    from github.Label import Label
    from github.Milestone import Milestone
    from github.Repository import Repository

app = typer.Typer(help="GitHub Project management automation via PyGithub")
milestone_app = typer.Typer(help="Milestone operations")
issue_app = typer.Typer(help="Issue operations")
project_app = typer.Typer(help="GitHub Projects V2 operations")
app.add_typer(milestone_app, name="milestone")
app.add_typer(issue_app, name="issue")
app.add_typer(project_app, name="project")
DEFAULT_REPO = "Jamie-BitFlight/claude_skills"
# Standard label taxonomy
LABELS: list[dict[str, str]] = [
    # Priority
    {"name": "priority:p0", "color": "D73A4A", "description": "Critical — blocks work or production"},
    {"name": "priority:p1", "color": "E99695", "description": "High — should be done next"},
    {"name": "priority:p2", "color": "F9D0C4", "description": "Medium — do when P0/P1 are clear"},
    {"name": "priority:idea", "color": "BFD4F2", "description": "Unscoped — future consideration"},
    # Type
    {"name": "type:feature", "color": "0E8A16", "description": "New capability or skill"},
    {"name": "type:bug", "color": "B60205", "description": "Something is broken"},
    {"name": "type:refactor", "color": "5319E7", "description": "Internal improvement, no behavior change"},
    {"name": "type:docs", "color": "0075CA", "description": "Documentation only"},
    {"name": "type:chore", "color": "EDEDED", "description": "Maintenance, tooling, CI"},
    # Status (all 8 state-machine states + legacy needs-review)
    {"name": "status:needs-grooming", "color": "FEF2C0", "description": "Captured but not yet groomed"},
    {"name": "status:groomed", "color": "C2E0C6", "description": "Grooming complete, RT-ICA APPROVED"},
    {"name": "status:blocked", "color": "B60205", "description": "RT-ICA BLOCKED or AC verification FAIL"},
    {"name": "status:in-milestone", "color": "BFD4F2", "description": "Assigned to an active milestone"},
    {"name": "status:in-progress", "color": "1D76DB", "description": "Actively being worked on"},
    {"name": "status:done", "color": "0E8A16", "description": "Implementation complete, AC verified PASS"},
    {"name": "status:resolved", "color": "6B737B", "description": "Closed without full implementation"},
    {"name": "status:closed", "color": "EDEDED", "description": "Terminal — milestone archived"},
    # Legacy label retained for backwards compatibility — not part of state machine
    {"name": "status:needs-review", "color": "D876E3", "description": "Implementation done, needs review"},
]
PRIORITY_LABEL_MAP = {"P0": "priority:p0", "P1": "priority:p1", "P2": "priority:p2", "IDEAS": "priority:idea"}
VALID_STATUSES = ("Backlog", "Grooming", "In Progress", "In Review", "Done")
# Label-to-status mapping for milestone transitions
_LABEL_TO_PROJECT_STATUS = {
"status:in-progress": "In Progress",
"status:done": "Done",
"status:needs-grooming": "Grooming",
"status:needs-review": "In Review",
"status:blocked": "Backlog",
}
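# Typical label lifecycle driven by the milestone commands below (sketch; the
# arrows summarize the transitions performed by `milestone start` and
# `milestone close`):
#   status:needs-grooming → status:in-progress → status:done
# with status:blocked, status:resolved, and status:closed as off-ramp states
# outside the happy path.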
def get_github() -> Github:
"""Return an authenticated Github client from GITHUB_TOKEN."""
token = os.environ.get("GITHUB_TOKEN")
if not token:
typer.echo("ERROR: GITHUB_TOKEN environment variable not set", err=True)
raise typer.Exit(1)
return Github(auth=Auth.Token(token))
def get_repo(gh: Github, repo_slug: str) -> Repository:
"""Return a Repository object, exit on failure.
Args:
gh: Authenticated Github client.
repo_slug: Repository identifier in ``owner/repo`` format.
Returns:
Repository object for the given slug.
"""
try:
return gh.get_repo(repo_slug)
except GithubException as exc:
typer.echo(f"ERROR: Cannot access repo '{repo_slug}': {exc}", err=True)
raise typer.Exit(1) from exc
@app.command()
def labels(
repo: Annotated[str, typer.Option("--repo", "-R")] = DEFAULT_REPO,
force: Annotated[bool, typer.Option("--force")] = False,
) -> None:
"""Create standard label taxonomy. Skips labels that already exist unless --force."""
gh = get_github()
repository = get_repo(gh, repo)
existing = {lbl.name: lbl for lbl in repository.get_labels()}
created = updated = skipped = 0
for spec in LABELS:
name = spec["name"]
if name in existing:
if force:
existing[name].edit(name=name, color=spec["color"], description=spec["description"])
typer.echo(f" updated: {name}")
updated += 1
else:
typer.echo(f" exists: {name} (--force to update)")
skipped += 1
else:
repository.create_label(name=name, color=spec["color"], description=spec["description"])
typer.echo(f" created: {name}")
created += 1
typer.echo(f"\nLabels: {created} created, {updated} updated, {skipped} skipped")
@milestone_app.command("create")
def milestone_create(
title: Annotated[str, typer.Option("--title")],
repo: Annotated[str, typer.Option("--repo", "-R")] = DEFAULT_REPO,
description: Annotated[str, typer.Option("--description")] = "",
due: Annotated[str | None, typer.Option("--due", help="Due date YYYY-MM-DD")] = None,
) -> None:
"""Create a milestone."""
gh = get_github()
repository = get_repo(gh, repo)
due_dt = datetime.strptime(due, "%Y-%m-%d").replace(tzinfo=UTC) if due else None
if due_dt is not None and description:
milestone = repository.create_milestone(title=title, description=description, due_on=due_dt)
elif due_dt is not None:
milestone = repository.create_milestone(title=title, due_on=due_dt)
elif description:
milestone = repository.create_milestone(title=title, description=description)
else:
milestone = repository.create_milestone(title=title)
typer.echo(f"Created milestone #{milestone.number}: {milestone.title}")
typer.echo(f" URL: {milestone.html_url}")
@milestone_app.command("list")
def milestone_list(repo: Annotated[str, typer.Option("--repo", "-R")] = DEFAULT_REPO) -> None:
"""List all milestones, both open and closed."""
gh = get_github()
repository = get_repo(gh, repo)
milestones = list(repository.get_milestones(state="all"))
if not milestones:
typer.echo("No milestones.")
return
for m in milestones:
due = m.due_on.strftime("%Y-%m-%d") if m.due_on else "no due date"
typer.echo(
f" #{m.number:3d} [{m.state}] {m.title} ({m.open_issues} open, {m.closed_issues} closed) due: {due}"
)
@milestone_app.command("start")
def milestone_start(
number: Annotated[int, typer.Option("--number", "-n", help="Milestone number")],
repo: Annotated[str, typer.Option("--repo", "-R")] = DEFAULT_REPO,
dry_run: Annotated[bool, typer.Option("--dry-run")] = False,
project_number: Annotated[int, typer.Option("--project-number", "-p", help="Projects V2 number")] = 0,
owner: Annotated[str, typer.Option("--owner", help="GitHub owner for Projects V2")] = "Jamie-BitFlight",
) -> None:
"""Transition open milestone issues from status:needs-grooming to status:in-progress.
When --project-number is set, also updates the Projects V2 Status field
to "In Progress" for each transitioned issue.
"""
gh = get_github()
repository = get_repo(gh, repo)
milestone = _get_open_milestone(repository, number)
if milestone.open_issues == 0:
typer.echo(
f"WARNING: Milestone #{number} '{milestone.title}' has no open issues. "
"Add items first with /group-items-to-milestone."
)
raise typer.Exit(0)
open_issues = list(repository.get_issues(milestone=milestone, state="open"))
typer.echo(f"Milestone #{milestone.number}: {milestone.title}")
typer.echo(f" {milestone.open_issues} open issue(s) — transitioning labels:\n")
for issue in open_issues:
label_names = [lbl.name for lbl in issue.labels]
typer.echo(f" #{issue.number:4d} {issue.title[:60]:<60} [{', '.join(label_names)}]")
if dry_run:
typer.echo("\n[dry-run] No changes made.")
if project_number:
typer.echo("\nProjects V2 status updates (dry-run):")
_bulk_update_project_status(owner, project_number, open_issues, "In Progress", dry_run=True)
return
in_progress_label = _ensure_label(repository, "status:in-progress", "1D76DB", "Actively being worked on")
succeeded, skipped, failed = _transition_issues(open_issues, in_progress_label)
# Update Projects V2 Status if project specified
v2_succeeded = v2_failed = 0
if project_number:
typer.echo("\nUpdating Projects V2 Status → In Progress:")
v2_succeeded, v2_failed = _bulk_update_project_status(owner, project_number, open_issues, "In Progress")
typer.echo(
f"\nMilestone #{milestone.number} '{milestone.title}' started.\n"
f" {succeeded} transitioned, {skipped} already in-progress, {failed} failed."
)
if project_number:
typer.echo(f" Projects V2: {v2_succeeded} updated, {v2_failed} failed.")
typer.echo(
f"\nWork on individual items:\n"
f" /work-backlog-item {{title}}\n"
f"\nTrack progress:\n"
f" uv run .claude/skills/gh/scripts/github_project_setup.py issue list "
f"--repo {repo}"
)
if failed:
raise typer.Exit(1)
@milestone_app.command("close")
def milestone_close(
number: Annotated[int, typer.Option("--number", "-n", help="Milestone number")],
repo: Annotated[str, typer.Option("--repo", "-R")] = DEFAULT_REPO,
dry_run: Annotated[bool, typer.Option("--dry-run")] = False,
project_number: Annotated[int, typer.Option("--project-number", "-p", help="Projects V2 number")] = 0,
owner: Annotated[str, typer.Option("--owner", help="GitHub owner for Projects V2")] = "Jamie-BitFlight",
) -> None:
"""Close a milestone: transition open issues to status:done and close the milestone.
When --project-number is set, also updates the Projects V2 Status field
to "Done" for each issue in the milestone (both open and already-closed).
"""
gh = get_github()
repository = get_repo(gh, repo)
milestone = _get_open_milestone(repository, number)
open_issues = list(repository.get_issues(milestone=milestone, state="open"))
closed_issues = list(repository.get_issues(milestone=milestone, state="closed"))
total = len(open_issues) + len(closed_issues)
typer.echo(f"Milestone #{milestone.number}: {milestone.title}")
typer.echo(f" {len(closed_issues)} closed, {len(open_issues)} open\n")
if open_issues:
typer.echo("Open issues (will be transitioned to status:done):")
for issue in open_issues:
label_names = [lbl.name for lbl in issue.labels]
typer.echo(f" #{issue.number:4d} {issue.title[:60]:<60} [{', '.join(label_names)}]")
typer.echo()
if dry_run:
typer.echo("[dry-run] No changes made.")
if project_number:
all_issues = open_issues + closed_issues
typer.echo("\nProjects V2 status updates (dry-run):")
_bulk_update_project_status(owner, project_number, all_issues, "Done", dry_run=True)
return
succeeded = skipped = failed = 0
if open_issues:
done_label = _ensure_label(repository, "status:done", "0E8A16", "Work complete, milestone closing")
succeeded, skipped, failed = _transition_to_done(open_issues, done_label)
# Close the milestone
milestone.edit(title=milestone.title, state="closed")
typer.echo(f"\nMilestone #{milestone.number} '{milestone.title}' closed.")
if open_issues:
typer.echo(f" {succeeded} transitioned to status:done, {skipped} already done, {failed} failed.")
typer.echo(f" {len(closed_issues)}/{total} issues were closed before milestone close.")
# Update Projects V2 Status for all issues in milestone
v2_succeeded = v2_failed = 0
if project_number:
all_issues = open_issues + closed_issues
typer.echo("\nUpdating Projects V2 Status → Done:")
v2_succeeded, v2_failed = _bulk_update_project_status(owner, project_number, all_issues, "Done")
typer.echo(f" Projects V2: {v2_succeeded} updated, {v2_failed} failed.")
if failed:
raise typer.Exit(1)
def _transition_to_done(open_issues: list[Issue], done_label: Label) -> tuple[int, int, int]:
"""Apply status:done label to each open issue.
Returns:
Tuple of (succeeded, skipped, failed) counts.
"""
status_labels_to_remove = {"status:in-progress", "status:needs-grooming"}
succeeded = failed = skipped = 0
typer.echo()
for issue in open_issues:
label_names = [lbl.name for lbl in issue.labels]
if "status:done" in label_names:
typer.echo(f" #{issue.number} already has status:done — skipped")
skipped += 1
continue
try:
new_label_names = [lbl.name for lbl in issue.labels if lbl.name not in status_labels_to_remove]
new_label_names.append(done_label.name)
issue.edit(labels=new_label_names)
typer.echo(f" #{issue.number} {issue.title[:60]} → status:done")
succeeded += 1
except GithubException as exc:
typer.echo(f" #{issue.number} FAILED: {exc}", err=True)
failed += 1
return succeeded, skipped, failed
def _get_open_milestone(repository: Repository, number: int) -> Milestone:
"""Fetch a milestone and verify it is open.
Args:
repository: GitHub repository object.
number: Milestone number.
Returns:
The Milestone object.
Raises:
typer.Exit: If the milestone is not found or already closed.
"""
try:
milestone = repository.get_milestone(number)
except GithubException as exc:
typer.echo(f"ERROR: Milestone #{number} not found.", err=True)
open_milestones = list(repository.get_milestones(state="open"))
if open_milestones:
typer.echo("Open milestones:", err=True)
for m in open_milestones:
typer.echo(f" #{m.number} {m.title}", err=True)
raise typer.Exit(1) from exc
if milestone.state == "closed":
typer.echo(f"ERROR: Milestone #{number} '{milestone.title}' is already closed.", err=True)
raise typer.Exit(1)
return milestone
def _find_gh_cli() -> str:
"""Locate the gh CLI binary.
Returns:
Path to gh binary.
Raises:
typer.Exit: If gh is not found on PATH.
"""
gh_path = shutil.which("gh")
if not gh_path:
typer.echo("ERROR: gh CLI not found. Install gh via your system package manager (brew/winget/apt).", err=True)
raise typer.Exit(1)
return gh_path
def _gh_graphql(query: str) -> dict:
"""Execute a GraphQL query via the gh CLI.
Args:
query: GraphQL query string.
Returns:
Parsed JSON response from the GitHub GraphQL API.
Raises:
typer.Exit: If the gh CLI call fails.
"""
gh_path = _find_gh_cli()
try:
result = subprocess.run(
[gh_path, "api", "graphql", "-f", f"query={query}"], capture_output=True, text=True, check=True
)
except subprocess.CalledProcessError as exc:
typer.echo(f"ERROR: GraphQL query failed: {exc.stderr or exc}", err=True)
raise typer.Exit(1) from exc
return json.loads(result.stdout)
def _discover_project_fields(owner: str, project_number: int) -> tuple[str, str, dict[str, str]]:
"""Discover project ID, Status field ID, and option IDs via GraphQL.
Args:
owner: GitHub user or organization login.
project_number: The project number (visible in the URL).
Returns:
Tuple of (project_id, status_field_id, option_map) where
option_map maps status name to option ID.
Raises:
typer.Exit: If project or Status field not found.
"""
query = (
'{ user(login: "' + owner + '") { projectV2(number: ' + str(project_number) + ") { id fields(first: 30) {"
" nodes { ... on ProjectV2SingleSelectField {"
" id name options { id name } } } } } } }"
)
resp = _gh_graphql(query)
project = resp.get("data", {}).get("user", {}).get("projectV2")
if not project:
typer.echo(f"ERROR: Project #{project_number} not found for user '{owner}'.", err=True)
raise typer.Exit(1)
project_id = project["id"]
for field in project["fields"]["nodes"]:
if field.get("name") == "Status":
field_id = field["id"]
option_map = {opt["name"]: opt["id"] for opt in field["options"]}
return project_id, field_id, option_map
typer.echo(f"ERROR: No 'Status' field found in project #{project_number}.", err=True)
raise typer.Exit(1)
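# Example response shape parsed above (hypothetical node IDs, trimmed to the
# fields this function reads):
#   {"data": {"user": {"projectV2": {"id": "PVT_xxx", "fields": {"nodes": [
#       {"id": "PVTSSF_xxx", "name": "Status",
#        "options": [{"id": "abc123", "name": "In Progress"}, ...]}]}}}}}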
def _find_project_item_id(project_id: str, issue_node_id: str) -> str:
"""Find or create the project item for a given issue.
Adds the issue to the project if not already present. The
``addProjectV2ItemById`` mutation is idempotent it returns the
existing item if the issue is already on the board.
Args:
project_id: GraphQL node ID of the project.
issue_node_id: GraphQL node ID of the issue.
Returns:
The project item ID.
Raises:
typer.Exit: If the mutation fails to return an item ID.
"""
query = (
"mutation { addProjectV2ItemById(input: {"
f'projectId: "{project_id}", contentId: "{issue_node_id}"'
"}) { item { id } } }"
)
resp = _gh_graphql(query)
item_id = resp.get("data", {}).get("addProjectV2ItemById", {}).get("item", {}).get("id")
if not item_id:
typer.echo("ERROR: Failed to add issue to project.", err=True)
raise typer.Exit(1)
return item_id
def _set_project_field(project_id: str, item_id: str, field_id: str, option_id: str) -> None:
"""Set a single-select field value on a project item.
Args:
project_id: GraphQL node ID of the project.
item_id: GraphQL node ID of the project item.
field_id: GraphQL node ID of the field.
option_id: GraphQL node ID of the option to set.
"""
query = (
"mutation { updateProjectV2ItemFieldValue(input: {"
f'projectId: "{project_id}", itemId: "{item_id}", '
f'fieldId: "{field_id}", '
f'value: {{singleSelectOptionId: "{option_id}"}}'
"}) { projectV2Item { id } } }"
)
_gh_graphql(query)
def _update_project_status(
owner: str, project_number: int, issue_node_id: str, status: str, *, dry_run: bool = False, issue_label: str = ""
) -> bool:
"""Update the Projects V2 Status field for a single issue.
This is the shared implementation used by both the ``project update-status``
command and the milestone start/close integration.
Args:
owner: GitHub user or organization login.
project_number: The project number.
issue_node_id: GraphQL node ID of the issue.
status: Target status value (must be in VALID_STATUSES).
dry_run: If True, report what would happen without mutating.
issue_label: Optional label for dry-run output (e.g. "#42 title").
Returns:
True if the update succeeded (or would succeed in dry-run).
"""
if status not in VALID_STATUSES:
typer.echo(f"ERROR: Invalid status '{status}'. Valid values: {', '.join(VALID_STATUSES)}", err=True)
return False
project_id, field_id, option_map = _discover_project_fields(owner, project_number)
option_id = option_map.get(status)
if not option_id:
typer.echo(
f"ERROR: Status '{status}' not found in project options. Available: {', '.join(option_map)}", err=True
)
return False
if dry_run:
label = issue_label or issue_node_id
typer.echo(f" [dry-run] Would set {label} → Status: {status}")
return True
item_id = _find_project_item_id(project_id, issue_node_id)
_set_project_field(project_id, item_id, field_id, option_id)
label = issue_label or issue_node_id
typer.echo(f" {label} → Status: {status}")
return True
def _bulk_update_project_status(
owner: str, project_number: int, issues: list[Issue], status: str, *, dry_run: bool = False
) -> tuple[int, int]:
"""Update Projects V2 Status for multiple issues.
Discovers project fields once and reuses for all issues.
Args:
owner: GitHub user or organization login.
project_number: The project number.
issues: List of PyGithub Issue objects.
status: Target status value.
dry_run: If True, report without mutating.
Returns:
Tuple of (succeeded, failed) counts.
"""
if status not in VALID_STATUSES:
typer.echo(f"ERROR: Invalid status '{status}'. Valid values: {', '.join(VALID_STATUSES)}", err=True)
return 0, len(issues)
project_id, field_id, option_map = _discover_project_fields(owner, project_number)
option_id = option_map.get(status)
if not option_id:
typer.echo(
f"ERROR: Status '{status}' not found in project options. Available: {', '.join(option_map)}", err=True
)
return 0, len(issues)
succeeded = failed = 0
for issue in issues:
label = f"#{issue.number} {issue.title[:50]}"
if dry_run:
typer.echo(f" [dry-run] Would set {label} → Status: {status}")
succeeded += 1
continue
try:
item_id = _find_project_item_id(project_id, issue.node_id)
_set_project_field(project_id, item_id, field_id, option_id)
typer.echo(f" {label} → Status: {status}")
succeeded += 1
except (typer.Exit, subprocess.CalledProcessError) as exc:
typer.echo(f" {label} FAILED: {exc}", err=True)
failed += 1
return succeeded, failed
def _ensure_label(repository: Repository, name: str, color: str, description: str) -> Label:
"""Return the label, creating it if it does not exist.
Args:
repository: GitHub repository object.
name: Label name to find or create.
color: Hex color code for the label (without ``#`` prefix).
description: Human-readable label description.
Returns:
The existing or newly created Label object.
"""
try:
return repository.get_label(name)
except GithubException:
label = repository.create_label(name=name, color=color, description=description)
typer.echo(f"\n Created label: {name}")
return label
def _transition_issues(open_issues: list[Issue], in_progress_label: Label) -> tuple[int, int, int]:
"""Apply label transition from ``status:needs-grooming`` to ``status:in-progress``.
Args:
open_issues: List of open Issue objects to transition.
in_progress_label: The ``status:in-progress`` Label to apply.
Returns:
Tuple of (succeeded, skipped, failed) counts.
"""
succeeded = failed = skipped = 0
typer.echo()
for issue in open_issues:
label_names = [lbl.name for lbl in issue.labels]
if "status:in-progress" in label_names:
typer.echo(f" #{issue.number} already has status:in-progress — skipped")
skipped += 1
continue
try:
new_label_names = [lbl.name for lbl in issue.labels if lbl.name != "status:needs-grooming"]
new_label_names.append(in_progress_label.name)
issue.edit(labels=new_label_names)
typer.echo(f" #{issue.number} {issue.title[:60]} → status:in-progress")
succeeded += 1
except GithubException as exc:
typer.echo(f" #{issue.number} FAILED: {exc}", err=True)
failed += 1
return succeeded, skipped, failed
@project_app.command("update-status")
def project_update_status(
issue_number: Annotated[int, typer.Option("--issue-number", "-i", help="GitHub issue number")],
status: Annotated[str, typer.Option("--status", "-s", help="Target status value")],
project_number: Annotated[int, typer.Option("--project-number", "-p", help="Project number")] = 1,
owner: Annotated[str, typer.Option("--owner")] = "Jamie-BitFlight",
repo: Annotated[str, typer.Option("--repo", "-R")] = DEFAULT_REPO,
dry_run: Annotated[bool, typer.Option("--dry-run")] = False,
) -> None:
"""Set the Projects V2 Status field for a GitHub issue.
Discovers field IDs dynamically via GraphQL, then updates the Status
single-select field. The issue is added to the project if not already present.
Valid statuses: Backlog, Grooming, In Progress, In Review, Done
"""
if status not in VALID_STATUSES:
typer.echo(f"ERROR: Invalid status '{status}'. Valid values: {', '.join(VALID_STATUSES)}", err=True)
raise typer.Exit(1)
gh = get_github()
repository = get_repo(gh, repo)
try:
issue = repository.get_issue(issue_number)
except GithubException as exc:
typer.echo(f"ERROR: Issue #{issue_number} not found: {exc}", err=True)
raise typer.Exit(1) from exc
typer.echo(f"Issue #{issue.number}: {issue.title}")
typer.echo(f" Project: {owner}/projects/{project_number}")
typer.echo(f" Target status: {status}")
ok = _update_project_status(
owner=owner,
project_number=project_number,
issue_node_id=issue.raw_data["node_id"],
status=status,
dry_run=dry_run,
issue_label=f"#{issue.number} {issue.title[:50]}",
)
if not ok:
raise typer.Exit(1)
if not dry_run:
typer.echo(" Done.")
@issue_app.command("create")
def issue_create(
repo: Annotated[str, typer.Option("--repo", "-R")] = DEFAULT_REPO,
title: Annotated[str, typer.Option("--title")] = "",
body: Annotated[str, typer.Option("--body")] = "",
priority_label: Annotated[str, typer.Option("--priority-label")] = "",
type_label: Annotated[str, typer.Option("--type-label")] = "",
milestone_number: Annotated[int, typer.Option("--milestone")] = 0,
) -> None:
"""Create a GitHub issue with priority/type labels and optional milestone."""
if not title:
typer.echo("ERROR: --title is required", err=True)
raise typer.Exit(1)
gh = get_github()
repository = get_repo(gh, repo)
label_names = ["status:needs-grooming"]
if priority_label:
label_names.append(priority_label)
if type_label:
label_names.append(type_label)
label_objects = []
for lbl_name in label_names:
try:
label_objects.append(repository.get_label(lbl_name))
except GithubException:
typer.echo(f" WARNING: label '{lbl_name}' not found — skipping", err=True)
milestone_obj = None
if milestone_number:
try:
milestone_obj = repository.get_milestone(milestone_number)
except GithubException:
typer.echo(f" WARNING: milestone #{milestone_number} not found — skipping", err=True)
if milestone_obj is not None:
issue = repository.create_issue(title=title, body=body or "", labels=label_objects, milestone=milestone_obj)
else:
issue = repository.create_issue(title=title, body=body or "", labels=label_objects)
typer.echo(f"Created issue #{issue.number}: {issue.title}")
typer.echo(f" URL: {issue.html_url}")
@issue_app.command("list")
def issue_list(
repo: Annotated[str, typer.Option("--repo", "-R")] = DEFAULT_REPO,
priority: Annotated[str, typer.Option("--priority")] = "",
state: Annotated[str, typer.Option("--state")] = "open",
) -> None:
"""List issues, optionally filtered by priority."""
gh = get_github()
repository = get_repo(gh, repo)
kwargs: dict = {"state": state}
if priority:
label_name = PRIORITY_LABEL_MAP.get(priority.upper(), f"priority:{priority.lower()}")
try:
kwargs["labels"] = [repository.get_label(label_name)]
except GithubException:
typer.echo(f"Label '{label_name}' not found", err=True)
issues = list(repository.get_issues(**kwargs))
if not issues:
typer.echo("No issues found.")
return
for issue in issues:
milestone_title = issue.milestone.title if issue.milestone else ""
label_names = ", ".join(lbl.name for lbl in issue.labels)
typer.echo(f" #{issue.number:4d} {issue.title[:55]:<55} [{label_names}] {milestone_title}")
@app.command()
def setup(
repo: Annotated[str, typer.Option("--repo", "-R")] = DEFAULT_REPO,
project_title: Annotated[str, typer.Option("--project-title")] = "claude_skills Backlog",
) -> None:
"""Full project setup: create label taxonomy and report next steps."""
typer.echo(f"Setting up GitHub project for {repo}...")
typer.echo("\n1. Creating label taxonomy...")
gh = get_github()
repository = get_repo(gh, repo)
existing = {lbl.name: lbl for lbl in repository.get_labels()}
created = skipped = 0
for spec in LABELS:
if spec["name"] not in existing:
repository.create_label(name=spec["name"], color=spec["color"], description=spec["description"])
typer.echo(f" created: {spec['name']}")
created += 1
else:
skipped += 1
typer.echo(f" Labels: {created} created, {skipped} already existed")
typer.echo(f"\n2. Project '{project_title}' — create via gh CLI:")
typer.echo(f' gh project create --owner {repo.split("/")[0]} --title "{project_title}"')
typer.echo("\nNote: GitHub Projects V2 requires project OAuth scope.")
typer.echo(" Use gh project commands or the GraphQL API for project creation.")
typer.echo(" See .claude/skills/gh/references/projects-v2.md for field setup commands.")
if __name__ == "__main__":
app()
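# Example end-to-end workflow (illustrative invocations of the commands defined
# above; repo and milestone/project numbers are placeholders):
#   uv run .claude/skills/gh/scripts/github_project_setup.py labels --repo owner/repo
#   uv run .claude/skills/gh/scripts/github_project_setup.py milestone create --title "v1.0" --due 2026-03-31
#   uv run .claude/skills/gh/scripts/github_project_setup.py issue create --title "feat: x" --priority-label priority:p1 --milestone 1
#   uv run .claude/skills/gh/scripts/github_project_setup.py milestone start -n 1 --project-number 1
#   uv run .claude/skills/gh/scripts/github_project_setup.py milestone close -n 1 --project-number 1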

"""Tests for github_project_setup.py — milestone/issue workflow automation.
Each test class maps to one CLI command (one step in the workflow):
Step 1 labels create/update label taxonomy
Step 2 milestone list list milestones
Step 3 milestone create create a new milestone
Step 4 issue create create an issue with labels and optional milestone
Step 5 milestone start bulk-transition status:needs-grooming status:in-progress
Step 6 issue list list issues (post-start verification)
Tests: every happy-path and key error-path for each command
How: Typer CliRunner + unittest.mock to mock PyGithub calls (no network)
Why: Prove each workflow step executes correctly and exits with the right code
"""
from __future__ import annotations
import importlib.util
import sys
from pathlib import Path
from unittest.mock import MagicMock, patch
from typer.testing import CliRunner
# ---------------------------------------------------------------------------
# Load github_project_setup as a module (filename contains underscores so
# a plain import works, but we use importlib for clarity and path safety).
# ---------------------------------------------------------------------------
_SCRIPT = Path(__file__).parent.parent / "scripts" / "github_project_setup.py"
_spec = importlib.util.spec_from_file_location("github_project_setup", _SCRIPT)
assert _spec is not None, f"Cannot find spec for {_SCRIPT}"
assert _spec.loader is not None, f"Cannot find loader for {_SCRIPT}"
_gps = importlib.util.module_from_spec(_spec)
sys.modules["github_project_setup"] = _gps
_spec.loader.exec_module(_gps)
app = _gps.app
runner = CliRunner()
# ---------------------------------------------------------------------------
# Shared helpers
# ---------------------------------------------------------------------------
def _make_label(name: str) -> MagicMock:
lbl = MagicMock()
lbl.name = name
return lbl
def _make_issue(number: int, title: str, labels: list[str]) -> MagicMock:
issue = MagicMock()
issue.number = number
issue.title = title
issue.labels = [_make_label(n) for n in labels]
return issue
def _make_milestone(
number: int,
title: str,
state: str = "open",
open_issues: int = 2,
closed_issues: int = 1,
due_on: object = None,
html_url: str = "https://github.com/owner/repo/milestone/1",
) -> MagicMock:
m = MagicMock()
m.number = number
m.title = title
m.state = state
m.open_issues = open_issues
m.closed_issues = closed_issues
m.due_on = due_on
m.html_url = html_url
return m
# ---------------------------------------------------------------------------
# Step 1 — labels command
# ---------------------------------------------------------------------------
class TestLabelsCommand:
"""Step 1: create/update the label taxonomy."""
def test_creates_missing_labels(self) -> None:
"""Labels that don't exist are created; output confirms creation."""
mock_repo = MagicMock()
mock_repo.get_labels.return_value = [] # no existing labels
with (
patch.dict("os.environ", {"GITHUB_TOKEN": "test-token"}),
patch("github_project_setup.get_repo", return_value=mock_repo),
patch("github_project_setup.Github"),
):
result = runner.invoke(app, ["labels", "--repo", "owner/repo"])
assert result.exit_code == 0, result.output
assert mock_repo.create_label.called
assert "created" in result.output
def test_skips_existing_labels_without_force(self) -> None:
"""Existing labels are skipped when --force is not passed."""
existing = _make_label("priority:p0")
mock_repo = MagicMock()
mock_repo.get_labels.return_value = [existing]
with (
patch.dict("os.environ", {"GITHUB_TOKEN": "test-token"}),
patch("github_project_setup.get_repo", return_value=mock_repo),
patch("github_project_setup.Github"),
):
result = runner.invoke(app, ["labels", "--repo", "owner/repo"])
assert result.exit_code == 0, result.output
assert "exists" in result.output
def test_force_updates_existing_labels(self) -> None:
"""--force flag causes existing labels to be updated."""
existing_label = _make_label("priority:p0")
mock_repo = MagicMock()
mock_repo.get_labels.return_value = [existing_label]
with (
patch.dict("os.environ", {"GITHUB_TOKEN": "test-token"}),
patch("github_project_setup.get_repo", return_value=mock_repo),
patch("github_project_setup.Github"),
):
result = runner.invoke(app, ["labels", "--repo", "owner/repo", "--force"])
assert result.exit_code == 0, result.output
assert existing_label.edit.called
assert "updated" in result.output
def test_missing_token_exits_nonzero(self) -> None:
"""Missing GITHUB_TOKEN exits with code 1."""
import os
env = {k: v for k, v in os.environ.items() if k != "GITHUB_TOKEN"}
with patch.dict("os.environ", env, clear=True):
result = runner.invoke(app, ["labels"])
assert result.exit_code == 1
# ---------------------------------------------------------------------------
# Step 2 — milestone list
# ---------------------------------------------------------------------------
class TestMilestoneList:
"""Step 2: list milestones (read-only verification step)."""
def test_lists_milestones(self) -> None:
"""Open and closed milestones are printed with number, state, title."""
m1 = _make_milestone(1, "v1.0 — Skills Foundation", open_issues=5)
m2 = _make_milestone(2, "v1.1 — Quality Gates", state="closed")
mock_repo = MagicMock()
mock_repo.get_milestones.return_value = [m1, m2]
with (
patch.dict("os.environ", {"GITHUB_TOKEN": "tok"}),
patch("github_project_setup.get_repo", return_value=mock_repo),
patch("github_project_setup.Github"),
):
result = runner.invoke(app, ["milestone", "list"])
assert result.exit_code == 0, result.output
assert "# 1" in result.output
assert "v1.0" in result.output
assert "# 2" in result.output
def test_empty_repo_prints_no_milestones(self) -> None:
"""Empty milestone list prints a 'No milestones.' message."""
mock_repo = MagicMock()
mock_repo.get_milestones.return_value = []
with (
patch.dict("os.environ", {"GITHUB_TOKEN": "tok"}),
patch("github_project_setup.get_repo", return_value=mock_repo),
patch("github_project_setup.Github"),
):
result = runner.invoke(app, ["milestone", "list"])
assert result.exit_code == 0
assert "No milestones" in result.output
# ---------------------------------------------------------------------------
# Step 3 — milestone create
# ---------------------------------------------------------------------------
class TestMilestoneCreate:
"""Step 3: create a new milestone."""
def test_creates_milestone_title_only(self) -> None:
"""Milestone is created with title only (no due date, no description)."""
new_ms = _make_milestone(3, "test-milestone")
mock_repo = MagicMock()
mock_repo.create_milestone.return_value = new_ms
with (
patch.dict("os.environ", {"GITHUB_TOKEN": "tok"}),
patch("github_project_setup.get_repo", return_value=mock_repo),
patch("github_project_setup.Github"),
):
result = runner.invoke(app, ["milestone", "create", "--title", "test-milestone"])
assert result.exit_code == 0, result.output
mock_repo.create_milestone.assert_called_once_with(title="test-milestone")
assert "Created milestone #3" in result.output
def test_creates_milestone_with_due_date(self) -> None:
"""Due date is parsed and passed as datetime to create_milestone."""
new_ms = _make_milestone(4, "sprint")
mock_repo = MagicMock()
mock_repo.create_milestone.return_value = new_ms
with (
patch.dict("os.environ", {"GITHUB_TOKEN": "tok"}),
patch("github_project_setup.get_repo", return_value=mock_repo),
patch("github_project_setup.Github"),
):
result = runner.invoke(app, ["milestone", "create", "--title", "sprint", "--due", "2026-03-31"])
assert result.exit_code == 0, result.output
call_kwargs = mock_repo.create_milestone.call_args[1]
assert "due_on" in call_kwargs
assert call_kwargs["due_on"].year == 2026
def test_creates_milestone_with_description(self) -> None:
"""Description is included when provided."""
new_ms = _make_milestone(5, "release")
mock_repo = MagicMock()
mock_repo.create_milestone.return_value = new_ms
with (
patch.dict("os.environ", {"GITHUB_TOKEN": "tok"}),
patch("github_project_setup.get_repo", return_value=mock_repo),
patch("github_project_setup.Github"),
):
result = runner.invoke(
app, ["milestone", "create", "--title", "release", "--description", "Release milestone"]
)
assert result.exit_code == 0, result.output
call_kwargs = mock_repo.create_milestone.call_args[1]
assert call_kwargs.get("description") == "Release milestone"
# ---------------------------------------------------------------------------
# Step 4 — issue create
# ---------------------------------------------------------------------------
class TestIssueCreate:
"""Step 4: create an issue with labels and optional milestone."""
def test_creates_issue_with_all_labels(self) -> None:
"""Issue created with priority, type, and status:needs-grooming labels."""
new_issue = _make_issue(42, "feat: add skill", [])
new_issue.html_url = "https://github.com/owner/repo/issues/42"
mock_repo = MagicMock()
mock_repo.create_issue.return_value = new_issue
mock_repo.get_label.side_effect = _make_label
with (
patch.dict("os.environ", {"GITHUB_TOKEN": "tok"}),
patch("github_project_setup.get_repo", return_value=mock_repo),
patch("github_project_setup.Github"),
):
result = runner.invoke(
app,
[
"issue",
"create",
"--title",
"feat: add skill",
"--priority-label",
"priority:p1",
"--type-label",
"type:feature",
],
)
assert result.exit_code == 0, result.output
assert "Created issue #42" in result.output
# Verify labels passed to create_issue include all three
call_kwargs = mock_repo.create_issue.call_args[1]
label_names = [lbl.name for lbl in call_kwargs["labels"]]
assert "status:needs-grooming" in label_names
assert "priority:p1" in label_names
assert "type:feature" in label_names
def test_creates_issue_with_milestone(self) -> None:
"""Issue is assigned to milestone when --milestone is provided."""
new_issue = _make_issue(43, "fix: bug", [])
new_issue.html_url = "https://github.com/owner/repo/issues/43"
milestone_obj = _make_milestone(1, "v1.0")
mock_repo = MagicMock()
mock_repo.create_issue.return_value = new_issue
mock_repo.get_label.side_effect = _make_label
mock_repo.get_milestone.return_value = milestone_obj
with (
patch.dict("os.environ", {"GITHUB_TOKEN": "tok"}),
patch("github_project_setup.get_repo", return_value=mock_repo),
patch("github_project_setup.Github"),
):
result = runner.invoke(app, ["issue", "create", "--title", "fix: bug", "--milestone", "1"])
assert result.exit_code == 0, result.output
call_kwargs = mock_repo.create_issue.call_args[1]
assert call_kwargs["milestone"] is milestone_obj
def test_missing_title_exits_nonzero(self) -> None:
"""Missing --title exits with code 1."""
with (
patch.dict("os.environ", {"GITHUB_TOKEN": "tok"}),
patch("github_project_setup.Github"),
patch("github_project_setup.get_repo", return_value=MagicMock()),
):
result = runner.invoke(app, ["issue", "create"])
assert result.exit_code == 1
def test_unknown_label_is_skipped_not_fatal(self) -> None:
"""Unknown label prints a warning but issue is still created."""
from github import GithubException
new_issue = _make_issue(44, "test", [])
new_issue.html_url = "https://github.com/owner/repo/issues/44"
mock_repo = MagicMock()
mock_repo.create_issue.return_value = new_issue
mock_repo.get_label.side_effect = GithubException(404, "not found")
with (
patch.dict("os.environ", {"GITHUB_TOKEN": "tok"}),
patch("github_project_setup.get_repo", return_value=mock_repo),
patch("github_project_setup.Github"),
):
result = runner.invoke(
app, ["issue", "create", "--title", "test", "--priority-label", "priority:p99-nonexistent"]
)
assert result.exit_code == 0, result.output
assert "Created issue" in result.output
# ---------------------------------------------------------------------------
# Step 5 — milestone start (the new command)
# ---------------------------------------------------------------------------
class TestMilestoneStart:
"""Step 5: bulk-transition status:needs-grooming → status:in-progress."""
def _setup_repo(
self, issues: list[MagicMock] | None = None, milestone_state: str = "open"
) -> tuple[MagicMock, MagicMock]:
"""Return (mock_repo, milestone) with issues attached."""
if issues is None:
issues = [
_make_issue(10, "First issue", ["priority:p1", "status:needs-grooming"]),
_make_issue(11, "Second issue", ["priority:p2", "status:needs-grooming"]),
]
milestone = _make_milestone(1, "v1.0", state=milestone_state, open_issues=len(issues))
mock_repo = MagicMock()
mock_repo.get_milestone.return_value = milestone
mock_repo.get_issues.return_value = issues
mock_repo.get_label.return_value = _make_label("status:in-progress")
return mock_repo, milestone
# --- happy path ---
def test_transitions_needs_grooming_to_in_progress(self) -> None:
"""Each issue has status:needs-grooming removed and status:in-progress added."""
mock_repo, _ = self._setup_repo()
with (
patch.dict("os.environ", {"GITHUB_TOKEN": "tok"}),
patch("github_project_setup.get_repo", return_value=mock_repo),
patch("github_project_setup.Github"),
):
result = runner.invoke(app, ["milestone", "start", "--number", "1"])
assert result.exit_code == 0, result.output
# Both issues should have been edited
for iss in mock_repo.get_issues.return_value:
iss.edit.assert_called_once()
call_kwargs = iss.edit.call_args[1]
assert "status:in-progress" in call_kwargs["labels"]
assert "status:needs-grooming" not in call_kwargs["labels"]
def test_skips_already_in_progress_issues(self) -> None:
"""Issues already labeled status:in-progress are skipped, not double-edited."""
issues = [_make_issue(20, "already done", ["priority:p1", "status:in-progress"])]
mock_repo, _ = self._setup_repo(issues=issues)
with (
patch.dict("os.environ", {"GITHUB_TOKEN": "tok"}),
patch("github_project_setup.get_repo", return_value=mock_repo),
patch("github_project_setup.Github"),
):
result = runner.invoke(app, ["milestone", "start", "--number", "1"])
assert result.exit_code == 0, result.output
# edit must NOT have been called
issues[0].edit.assert_not_called()
assert "skipped" in result.output
def test_creates_in_progress_label_when_missing(self) -> None:
"""status:in-progress label is created if it doesn't exist in the repo."""
from github import GithubException
issues = [_make_issue(30, "work", ["status:needs-grooming"])]
new_label = _make_label("status:in-progress")
mock_repo, _ = self._setup_repo(issues=issues)
mock_repo.get_label.side_effect = GithubException(404, "not found")
mock_repo.create_label.return_value = new_label
with (
patch.dict("os.environ", {"GITHUB_TOKEN": "tok"}),
patch("github_project_setup.get_repo", return_value=mock_repo),
patch("github_project_setup.Github"),
):
result = runner.invoke(app, ["milestone", "start", "--number", "1"])
assert result.exit_code == 0, result.output
mock_repo.create_label.assert_called_once()
def test_summary_counts_reported(self) -> None:
"""Final summary line shows transitioned/skipped/failed counts."""
issues = [
_make_issue(40, "pending", ["status:needs-grooming"]),
_make_issue(41, "already", ["status:in-progress"]),
]
mock_repo, _ = self._setup_repo(issues=issues)
with (
patch.dict("os.environ", {"GITHUB_TOKEN": "tok"}),
patch("github_project_setup.get_repo", return_value=mock_repo),
patch("github_project_setup.Github"),
):
result = runner.invoke(app, ["milestone", "start", "--number", "1"])
assert result.exit_code == 0, result.output
assert "1 transitioned" in result.output
assert "1 already in-progress" in result.output
assert "0 failed" in result.output
# --- error paths ---
def test_closed_milestone_exits_nonzero(self) -> None:
"""Closed milestone exits with code 1 and explains the error."""
mock_repo, _ = self._setup_repo(milestone_state="closed")
with (
patch.dict("os.environ", {"GITHUB_TOKEN": "tok"}),
patch("github_project_setup.get_repo", return_value=mock_repo),
patch("github_project_setup.Github"),
):
result = runner.invoke(app, ["milestone", "start", "--number", "1"])
assert result.exit_code == 1
assert "already closed" in result.stderr
def test_empty_milestone_exits_zero_with_warning(self) -> None:
"""Milestone with zero open issues exits 0 and prints a warning."""
milestone = _make_milestone(1, "empty", open_issues=0)
mock_repo = MagicMock()
mock_repo.get_milestone.return_value = milestone
with (
patch.dict("os.environ", {"GITHUB_TOKEN": "tok"}),
patch("github_project_setup.get_repo", return_value=mock_repo),
patch("github_project_setup.Github"),
):
result = runner.invoke(app, ["milestone", "start", "--number", "1"])
assert result.exit_code == 0
assert "no open issues" in result.output.lower()
def test_milestone_not_found_lists_open_milestones(self) -> None:
"""Non-existent milestone prints open milestones and exits 1."""
from github import GithubException
open_ms = _make_milestone(2, "v2.0")
mock_repo = MagicMock()
mock_repo.get_milestone.side_effect = GithubException(404, "not found")
mock_repo.get_milestones.return_value = [open_ms]
with (
patch.dict("os.environ", {"GITHUB_TOKEN": "tok"}),
patch("github_project_setup.get_repo", return_value=mock_repo),
patch("github_project_setup.Github"),
):
result = runner.invoke(app, ["milestone", "start", "--number", "999"])
assert result.exit_code == 1
assert "v2.0" in result.stderr
def test_per_issue_failure_continues_and_exits_nonzero(self) -> None:
"""Failure on one issue is logged; remaining issues are attempted; exit code 1."""
from github import GithubException
good_issue = _make_issue(50, "good", ["status:needs-grooming"])
bad_issue = _make_issue(51, "bad", ["status:needs-grooming"])
bad_issue.edit.side_effect = GithubException(403, "forbidden")
issues = [good_issue, bad_issue]
mock_repo, _ = self._setup_repo(issues=issues)
with (
patch.dict("os.environ", {"GITHUB_TOKEN": "tok"}),
patch("github_project_setup.get_repo", return_value=mock_repo),
patch("github_project_setup.Github"),
):
result = runner.invoke(app, ["milestone", "start", "--number", "1"])
assert result.exit_code == 1
good_issue.edit.assert_called_once() # good issue was still processed
assert "FAILED" in result.stderr
def test_preserves_non_status_labels(self) -> None:
"""priority:p1 label is preserved; only status:needs-grooming is removed."""
issues = [_make_issue(60, "preserve labels", ["priority:p1", "type:feature", "status:needs-grooming"])]
mock_repo, _ = self._setup_repo(issues=issues)
with (
patch.dict("os.environ", {"GITHUB_TOKEN": "tok"}),
patch("github_project_setup.get_repo", return_value=mock_repo),
patch("github_project_setup.Github"),
):
result = runner.invoke(app, ["milestone", "start", "--number", "1"])
assert result.exit_code == 0, result.output
call_kwargs = issues[0].edit.call_args[1]
assert "priority:p1" in call_kwargs["labels"]
assert "type:feature" in call_kwargs["labels"]
assert "status:needs-grooming" not in call_kwargs["labels"]
assert "status:in-progress" in call_kwargs["labels"]
# ---------------------------------------------------------------------------
# Step 6 — issue list (verification step after start)
# ---------------------------------------------------------------------------
class TestIssueList:
"""Step 6: list issues to verify status labels after milestone start."""
def test_lists_open_issues_with_labels_and_milestone(self) -> None:
"""Issue list shows number, title, labels, and milestone."""
issue = _make_issue(70, "do work", ["priority:p1", "status:in-progress"])
issue.milestone = _make_milestone(1, "v1.0")
mock_repo = MagicMock()
mock_repo.get_issues.return_value = [issue]
with (
patch.dict("os.environ", {"GITHUB_TOKEN": "tok"}),
patch("github_project_setup.get_repo", return_value=mock_repo),
patch("github_project_setup.Github"),
):
result = runner.invoke(app, ["issue", "list", "--state", "open"])
assert result.exit_code == 0, result.output
assert "# 70" in result.output
assert "status:in-progress" in result.output
assert "v1.0" in result.output
def test_empty_issue_list_prints_message(self) -> None:
"""No matching issues prints 'No issues found.'"""
mock_repo = MagicMock()
mock_repo.get_issues.return_value = []
with (
patch.dict("os.environ", {"GITHUB_TOKEN": "tok"}),
patch("github_project_setup.get_repo", return_value=mock_repo),
patch("github_project_setup.Github"),
):
result = runner.invoke(app, ["issue", "list"])
assert result.exit_code == 0
assert "No issues found" in result.output
def test_priority_filter_passes_label_to_api(self) -> None:
"""--priority p1 filters issues by the priority:p1 label."""
p1_label = _make_label("priority:p1")
issue = _make_issue(80, "p1 work", ["priority:p1"])
issue.milestone = None
mock_repo = MagicMock()
mock_repo.get_label.return_value = p1_label
mock_repo.get_issues.return_value = [issue]
with (
patch.dict("os.environ", {"GITHUB_TOKEN": "tok"}),
patch("github_project_setup.get_repo", return_value=mock_repo),
patch("github_project_setup.Github"),
):
result = runner.invoke(app, ["issue", "list", "--priority", "p1"])
assert result.exit_code == 0, result.output
mock_repo.get_label.assert_called_with("priority:p1")
call_kwargs = mock_repo.get_issues.call_args[1]
assert "labels" in call_kwargs


@@ -1,19 +1,19 @@
 /**
- * App-level smoke tests for the gsd CLI package.
+ * Unit tests for the gsd CLI package.
  *
  * Tests the glue code that IS the product:
  * - app-paths resolve to ~/.gsd/
  * - loader sets all required env vars
  * - resource-loader syncs bundled resources
  * - wizard loadStoredEnvKeys hydrates env
- * - npm pack produces a valid tarball
- * - tarball installs and the `gsd` binary resolves
  *
+ * Integration tests (npm pack, install, launch) are in ./integration/pack-install.test.ts
  */
 import test from "node:test";
 import assert from "node:assert/strict";
-import { execSync, spawn } from "node:child_process";
-import { existsSync, mkdirSync, mkdtempSync, readFileSync, rmSync, writeFileSync } from "node:fs";
+import { execSync } from "node:child_process";
+import { existsSync, mkdtempSync, readFileSync, rmSync, writeFileSync } from "node:fs";
 import { join } from "node:path";
 import { tmpdir } from "node:os";
 import { fileURLToPath } from "node:url";
@@ -150,57 +150,6 @@ test("initResources syncs extensions, agents, and skills to target dir", async (
// 4. wizard loadStoredEnvKeys hydration
// ═══════════════════════════════════════════════════════════════════════════
test("buildResourceLoader expands ~/.pi extension directories into entry files", async () => {
const originalHome = process.env.HOME;
const tmp = mkdtempSync(join(tmpdir(), "gsd-pi-ext-test-"));
const fakeHome = join(tmp, "home");
const fakeAgentDir = join(tmp, "agent");
const piExtensionsDir = join(fakeHome, ".pi", "agent", "extensions");
mkdirSync(piExtensionsDir, { recursive: true });
mkdirSync(fakeAgentDir, { recursive: true });
writeFileSync(
join(piExtensionsDir, "top-level.ts"),
"export default function(pi){ pi.on('agent_start', () => {}); }\n",
);
const packagedDir = join(piExtensionsDir, "packaged-ext");
mkdirSync(packagedDir, { recursive: true });
writeFileSync(
join(packagedDir, "package.json"),
JSON.stringify({ pi: { extensions: ["./custom-entry.ts"] } }, null, 2),
);
writeFileSync(
join(packagedDir, "custom-entry.ts"),
"export default function(pi){ pi.on('agent_start', () => {}); }\n",
);
process.env.HOME = fakeHome;
try {
const { buildResourceLoader } = await import("../resource-loader.ts");
const loader = buildResourceLoader(fakeAgentDir);
await loader.reload();
const { extensions, errors } = loader.getExtensions();
assert.ok(
extensions.some((ext) => ext.path.endsWith("top-level.ts")),
"loads top-level ~/.pi extension files",
);
assert.ok(
extensions.some((ext) => ext.path.endsWith("packaged-ext/custom-entry.ts")),
"loads packaged ~/.pi extensions via pi.extensions manifest",
);
assert.ok(
!errors.some((err) => err.path === piExtensionsDir),
"does not try to load the ~/.pi/agent/extensions directory itself as a module",
);
} finally {
if (originalHome) process.env.HOME = originalHome; else delete process.env.HOME;
rmSync(tmp, { recursive: true, force: true });
}
});
test("loadStoredEnvKeys hydrates process.env from auth.json", async () => {
const { loadStoredEnvKeys } = await import("../wizard.ts");
const { AuthStorage } = await import("@gsd/pi-coding-agent");
@@ -273,171 +222,3 @@ test("loadStoredEnvKeys does not overwrite existing env vars", async () => {
rmSync(tmp, { recursive: true, force: true });
}
});
// ═══════════════════════════════════════════════════════════════════════════
// 6. npm pack produces valid tarball with correct file layout
// ═══════════════════════════════════════════════════════════════════════════
test("npm pack produces tarball with required files", async () => {
// Build first
execSync("npm run build", { cwd: projectRoot, stdio: "pipe" });
// Pack
let packOutput: string;
try {
packOutput = execSync("npm pack --json 2>/dev/null", {
cwd: projectRoot,
encoding: "utf-8",
});
} catch (e: any) {
// ENOBUFS is a system buffer exhaustion, not a code issue
if (e.code === 'ENOBUFS') {
console.log(' SKIP: System buffer exhaustion (ENOBUFS)');
return;
}
throw e;
}
const packInfo = JSON.parse(packOutput);
const tarball = packInfo[0].filename;
const tarballPath = join(projectRoot, tarball);
assert.ok(existsSync(tarballPath), `tarball ${tarball} created`);
try {
// List tarball contents
const contents = execSync(`tar tzf ${tarballPath}`, { encoding: "utf-8" });
const files = contents.split("\n").filter(Boolean);
// Critical files must be present
assert.ok(files.some(f => f.includes("dist/loader.js")), "tarball contains dist/loader.js");
assert.ok(files.some(f => f.includes("dist/cli.js")), "tarball contains dist/cli.js");
assert.ok(files.some(f => f.includes("dist/app-paths.js")), "tarball contains dist/app-paths.js");
assert.ok(files.some(f => f.includes("dist/wizard.js")), "tarball contains dist/wizard.js");
assert.ok(files.some(f => f.includes("dist/resource-loader.js")), "tarball contains dist/resource-loader.js");
assert.ok(files.some(f => f.includes("pkg/package.json")), "tarball contains pkg/package.json");
assert.ok(files.some(f => f.includes("src/resources/extensions/gsd/index.ts")), "tarball contains bundled gsd extension");
// AGENTS.md was merged into system.md (commit acea86b)
assert.ok(files.some(f => f.includes("scripts/postinstall.js")), "tarball contains postinstall script");
// pkg/package.json must have piConfig
const pkgJson = readFileSync(join(projectRoot, "pkg", "package.json"), "utf-8");
const pkg = JSON.parse(pkgJson);
assert.equal(pkg.piConfig?.name, "gsd", "pkg/package.json piConfig.name is gsd");
assert.equal(pkg.piConfig?.configDir, ".gsd", "pkg/package.json piConfig.configDir is .gsd");
} finally {
// Clean up tarball
rmSync(tarballPath, { force: true });
}
});
// ═══════════════════════════════════════════════════════════════════════════
// 7. npm pack → install → gsd binary resolves
// ═══════════════════════════════════════════════════════════════════════════
test("tarball installs and gsd binary resolves", async () => {
// Build and pack
execSync("npm run build", { cwd: projectRoot, stdio: "pipe" });
let packOutput: string;
try {
packOutput = execSync("npm pack --json 2>/dev/null", {
cwd: projectRoot,
encoding: "utf-8",
});
} catch (e: any) {
// ENOBUFS is a system buffer exhaustion, not a code issue
if (e.code === 'ENOBUFS') {
console.log(' SKIP: System buffer exhaustion (ENOBUFS)');
return;
}
throw e;
}
const packInfo = JSON.parse(packOutput);
const tarball = packInfo[0].filename;
const tarballPath = join(projectRoot, tarball);
const tmp = mkdtempSync(join(tmpdir(), "gsd-install-test-"));
try {
// Install from tarball into a temp prefix
execSync(`npm install --prefix ${tmp} ${tarballPath} --no-save 2>&1`, {
encoding: "utf-8",
env: { ...process.env, PLAYWRIGHT_SKIP_BROWSER_DOWNLOAD: "1" },
});
// Verify the gsd bin exists in the installed package
const installedBin = join(tmp, "node_modules", ".bin", "gsd");
assert.ok(existsSync(installedBin), "gsd binary exists in node_modules/.bin/");
// Verify loader.js is executable (has shebang)
const installedLoader = join(tmp, "node_modules", "gsd-pi", "dist", "loader.js");
const loaderContent = readFileSync(installedLoader, "utf-8");
assert.ok(loaderContent.startsWith("#!/usr/bin/env node"), "loader.js has node shebang");
// Verify bundled resources are present
const installedGsdExt = join(tmp, "node_modules", "gsd-pi", "src", "resources", "extensions", "gsd", "index.ts");
assert.ok(existsSync(installedGsdExt), "bundled gsd extension present in installed package");
} finally {
rmSync(tarballPath, { force: true });
rmSync(tmp, { recursive: true, force: true });
}
});
// ═══════════════════════════════════════════════════════════════════════════
// 8. Launch → extensions load → no errors on stderr
// ═══════════════════════════════════════════════════════════════════════════
test("gsd launches and loads extensions without errors", async () => {
// Build first
execSync("npm run build", { cwd: projectRoot, stdio: "pipe" });
// Launch gsd with all optional keys set (skip wizard) and capture stderr.
// Kill after 5 seconds — we just need to see if extensions load.
const output = await new Promise<string>((resolve) => {
let stderr = "";
const child = spawn("node", ["dist/loader.js"], {
cwd: projectRoot,
env: {
...process.env,
BRAVE_API_KEY: "test",
BRAVE_ANSWERS_KEY: "test",
CONTEXT7_API_KEY: "test",
JINA_API_KEY: "test",
TAVILY_API_KEY: "test",
},
stdio: ["pipe", "pipe", "pipe"],
});
child.stderr.on("data", (data: Buffer) => {
stderr += data.toString();
});
// Close stdin immediately so it's non-TTY
child.stdin.end();
// Give it 5s to start up
const timer = setTimeout(() => {
child.kill("SIGTERM");
}, 5000);
child.on("close", () => {
clearTimeout(timer);
resolve(stderr);
});
});
// No extension load errors
assert.ok(
!output.includes("[gsd] Extension load error"),
`no extension load errors on stderr (got: ${output.slice(0, 500)})`,
);
// No crash / unhandled errors
assert.ok(
!output.includes("Error: Cannot find module"),
"no missing module errors",
);
assert.ok(
!output.includes("ERR_MODULE_NOT_FOUND"),
"no ERR_MODULE_NOT_FOUND",
);
});


@@ -0,0 +1,98 @@
// Tests for ci_monitor.cjs — cross-platform CI monitoring tool
//
// Sections:
// (a) Script exists and is executable
// (b) --help shows all commands
// (c) list-workflows finds workflow files
// (d) check-actions parses actions from workflow
// (e) Commands validate required arguments
import { existsSync } from 'node:fs';
import { spawnSync } from 'node:child_process';
import { join, dirname } from 'node:path';
import { fileURLToPath } from 'node:url';
const __dirname = dirname(fileURLToPath(import.meta.url));
const ROOT = join(__dirname, '..', '..');
const SCRIPT_PATH = join(ROOT, 'scripts', 'ci_monitor.cjs');
let passed = 0;
let failed = 0;
function assert(condition: boolean, message: string): void {
if (condition) {
passed++;
} else {
failed++;
console.error(` FAIL: ${message}`);
}
}
function runScript(args: string[]): { stdout: string; stderr: string; status: number | null } {
const result = spawnSync('node', [SCRIPT_PATH, ...args], {
encoding: 'utf-8',
timeout: 30000,
});
return {
stdout: result.stdout || '',
stderr: result.stderr || '',
status: result.status,
};
}
// ─── Tests ────────────────────────────────────────────────────────────────
console.log('# === (a) Script exists and is executable ===');
assert(existsSync(SCRIPT_PATH), 'ci_monitor.cjs exists');
const scriptStat = spawnSync('node', ['--check', SCRIPT_PATH], { encoding: 'utf-8' });
assert(scriptStat.status === 0, 'ci_monitor.cjs has valid JavaScript syntax');
console.log('\n# === (b) --help shows all commands ===');
const help = runScript(['--help']);
assert(help.status === 0, '--help exits with code 0');
assert(help.stdout.includes('runs'), 'help shows runs command');
assert(help.stdout.includes('watch'), 'help shows watch command');
assert(help.stdout.includes('fail-fast'), 'help shows fail-fast command');
assert(help.stdout.includes('log-failed'), 'help shows log-failed command');
assert(help.stdout.includes('test-summary'), 'help shows test-summary command');
assert(help.stdout.includes('check-actions'), 'help shows check-actions command');
assert(help.stdout.includes('grep'), 'help shows grep command');
assert(help.stdout.includes('wait-for'), 'help shows wait-for command');
console.log('\n# === (c) list-workflows finds workflow files ===');
const workflows = runScript(['list-workflows']);
// May fail if no .github/workflows exists; that's OK
if (workflows.status === 0) {
assert(workflows.stdout.includes('.yml') || workflows.stdout.includes('No workflow files') || workflows.stdout.includes('No .github'), 'list-workflows output mentions yml files or none found');
} else {
// If it fails, should be due to missing directory
assert(workflows.stderr.includes('No .github/workflows'), 'list-workflows fails gracefully when no workflows dir');
}
console.log('\n# === (d) check-actions validates workflow file ===');
const checkMissing = runScript(['check-actions', '.github/workflows/nonexistent.yml']);
assert(checkMissing.status !== 0, 'check-actions fails for missing file');
assert(checkMissing.stderr.includes('not found') || checkMissing.stderr.includes('File not found'), 'check-actions reports missing file');
console.log('\n# === (e) Commands validate required arguments ===');
const grepNoPattern = runScript(['grep', '12345']);
assert(grepNoPattern.status !== 0, 'grep fails without --pattern');
assert(grepNoPattern.stderr.includes('--pattern') || grepNoPattern.stderr.includes('required'), 'grep reports missing pattern');
const waitNoKeyword = runScript(['wait-for', '12345', 'build']);
assert(waitNoKeyword.status !== 0, 'wait-for fails without --keyword');
assert(waitNoKeyword.stderr.includes('--keyword') || waitNoKeyword.stderr.includes('required'), 'wait-for reports missing keyword');
const compareMissing = runScript(['compare', '12345']);
assert(compareMissing.status !== 0, 'compare fails with only one run-id');
// ─── Summary ───────────────────────────────────────────────────────────────
console.log('\n# ========================================');
console.log(`# Results: ${passed} passed, ${failed} failed`);
if (failed > 0) {
process.exit(1);
}
console.log('# All tests passed ✓');


@@ -0,0 +1,189 @@
/**
* Integration tests for npm pack and install.
*
* These tests spawn child processes (npm pack, node)
* and are resource-intensive. Run separately from unit tests.
*
* Prerequisite: npm run build must be run first.
*
* Run with: npm run build && npm run test:integration
*/
import test from "node:test";
import assert from "node:assert/strict";
import { execFileSync, spawn } from "node:child_process";
import { createReadStream, existsSync, mkdtempSync, readFileSync, rmSync } from "node:fs";
import { join } from "node:path";
import { tmpdir } from "node:os";
import { fileURLToPath } from "node:url";
import { createGunzip } from "node:zlib";
const projectRoot = process.cwd();
if (!existsSync(join(projectRoot, "dist"))) {
throw new Error("dist/ not found — run: npm run build");
}
function packTarball(): string {
const pkg = JSON.parse(readFileSync(join(projectRoot, "package.json"), "utf-8"));
const safeName = pkg.name.replace(/^@[^/]+\//, "").replace(/\//g, "-");
const tarball = `${safeName}-${pkg.version}.tgz`;
execFileSync("npm", ["pack"], { cwd: projectRoot, stdio: ["ignore", "ignore", "pipe"] });
return join(projectRoot, tarball);
}
/** List file paths inside a .tgz using Node built-ins only (no tar CLI or npm package). */
function listTarEntries(tarballPath: string): Promise<string[]> {
return new Promise((resolve, reject) => {
const files: string[] = [];
const chunks: Buffer[] = [];
const gunzip = createGunzip();
const input = createReadStream(tarballPath);
gunzip.on("data", (chunk: Buffer) => { chunks.push(chunk); });
gunzip.on("end", () => {
const buf = Buffer.concat(chunks);
let offset = 0;
while (offset + 512 <= buf.length) {
const header = buf.subarray(offset, offset + 512);
if (header.every(b => b === 0)) break; // end-of-archive sentinel
const name = header.subarray(0, 100).toString("utf8").replace(/\0.*/, "");
const prefix = header.subarray(345, 500).toString("utf8").replace(/\0.*/, "");
const type = String.fromCharCode(header[156]);
const size = parseInt(header.subarray(124, 136).toString("utf8").replace(/\0/g, "").trim(), 8) || 0;
if (name && type !== "5") files.push(prefix ? `${prefix}/${name}` : name);
offset += 512 + Math.ceil(size / 512) * 512;
}
resolve(files);
});
input.on("error", reject);
gunzip.on("error", reject);
input.pipe(gunzip);
});
}
// ═══════════════════════════════════════════════════════════════════════════
// 1. npm pack produces valid tarball with correct file layout
// ═══════════════════════════════════════════════════════════════════════════
test("npm pack produces tarball with required files", async () => {
const tarballPath = packTarball();
assert.ok(existsSync(tarballPath), "tarball created");
try {
const files = await listTarEntries(tarballPath);
// Critical files must be present
assert.ok(files.some(f => f.includes("dist/loader.js")), "tarball contains dist/loader.js");
assert.ok(files.some(f => f.includes("dist/cli.js")), "tarball contains dist/cli.js");
assert.ok(files.some(f => f.includes("dist/app-paths.js")), "tarball contains dist/app-paths.js");
assert.ok(files.some(f => f.includes("dist/wizard.js")), "tarball contains dist/wizard.js");
assert.ok(files.some(f => f.includes("dist/resource-loader.js")), "tarball contains dist/resource-loader.js");
assert.ok(files.some(f => f.includes("pkg/package.json")), "tarball contains pkg/package.json");
assert.ok(files.some(f => f.includes("src/resources/extensions/gsd/index.ts")), "tarball contains bundled gsd extension");
assert.ok(files.some(f => f.includes("scripts/postinstall.js")), "tarball contains postinstall script");
// pkg/package.json must have piConfig
const pkgJson = readFileSync(join(projectRoot, "pkg", "package.json"), "utf-8");
const pkg = JSON.parse(pkgJson);
assert.equal(pkg.piConfig?.name, "gsd", "pkg/package.json piConfig.name is gsd");
assert.equal(pkg.piConfig?.configDir, ".gsd", "pkg/package.json piConfig.configDir is .gsd");
} finally {
rmSync(tarballPath, { force: true });
}
});
// ═══════════════════════════════════════════════════════════════════════════
// 2. npm pack → install → gsd binary resolves
// ═══════════════════════════════════════════════════════════════════════════
test("tarball installs and gsd binary resolves", async () => {
const tarballPath = packTarball();
const tmp = mkdtempSync(join(tmpdir(), "gsd-install-test-"));
try {
// Install from tarball into a temp prefix
execFileSync("npm", ["install", "--prefix", tmp, tarballPath, "--no-save"], {
env: { ...process.env, PLAYWRIGHT_SKIP_BROWSER_DOWNLOAD: "1" },
stdio: ["ignore", "ignore", "pipe"],
});
// Verify the gsd bin exists in the installed package
const binName = process.platform === "win32" ? "gsd.cmd" : "gsd";
const installedBin = join(tmp, "node_modules", ".bin", binName);
assert.ok(existsSync(installedBin), `gsd binary exists in node_modules/.bin/ (${binName})`);
// Verify loader.js is executable (has shebang)
const installedLoader = join(tmp, "node_modules", "gsd-pi", "dist", "loader.js");
const loaderContent = readFileSync(installedLoader, "utf-8");
if (process.platform !== "win32") {
assert.ok(loaderContent.startsWith("#!/usr/bin/env node"), "loader.js has node shebang");
}
// Verify bundled resources are present
const installedGsdExt = join(tmp, "node_modules", "gsd-pi", "src", "resources", "extensions", "gsd", "index.ts");
assert.ok(existsSync(installedGsdExt), "bundled gsd extension present in installed package");
} finally {
rmSync(tarballPath, { force: true });
rmSync(tmp, { recursive: true, force: true });
}
});
// ═══════════════════════════════════════════════════════════════════════════
// 3. Launch → extensions load → no errors on stderr
// ═══════════════════════════════════════════════════════════════════════════
test("gsd launches and loads extensions without errors", async () => {
// Launch gsd with all optional keys set (skip wizard) and capture stderr.
// Kill after 5 seconds — we just need to see if extensions load.
// Assumes build already done.
const output = await new Promise<string>((resolve) => {
let stderr = "";
const child = spawn("node", ["dist/loader.js"], {
cwd: projectRoot,
env: {
...process.env,
BRAVE_API_KEY: "test",
BRAVE_ANSWERS_KEY: "test",
CONTEXT7_API_KEY: "test",
JINA_API_KEY: "test",
TAVILY_API_KEY: "test",
},
stdio: ["pipe", "pipe", "pipe"],
});
child.stderr.on("data", (data: Buffer) => {
stderr += data.toString();
});
// Close stdin immediately so it's non-TTY
child.stdin.end();
// Give it 5s to start up
const timer = setTimeout(() => {
child.kill("SIGTERM");
}, 5000);
child.on("close", () => {
clearTimeout(timer);
resolve(stderr);
});
});
// No extension load errors
assert.ok(
!output.includes("[gsd] Extension load error"),
`no extension load errors on stderr (got: ${output.slice(0, 500)})`,
);
// No crash / unhandled errors
assert.ok(
!output.includes("Error: Cannot find module"),
"no missing module errors",
);
assert.ok(
!output.includes("ERR_MODULE_NOT_FOUND"),
"no ERR_MODULE_NOT_FOUND",
);
});