singularity-forge/scripts/rtk-benchmark.mjs

feat: managed RTK integration with opt-in preference and web UI toggle (#2620)
2026-03-26 08:33:07 -07:00
#!/usr/bin/env node
import { spawnSync } from 'node:child_process'
import { homedir, tmpdir } from 'node:os'
import { join, dirname } from 'node:path'
import { mkdirSync, mkdtempSync, rmSync, writeFileSync } from 'node:fs'

// Default location of the GSD-managed RTK binary (rtk.exe on Windows).
function getManagedRtkPath() {
  return join(homedir(), '.gsd', 'agent', 'bin', process.platform === 'win32' ? 'rtk.exe' : 'rtk')
}
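
// Run a command synchronously and capture stdout/stderr as UTF-8 text.
// Throws only on spawn errors (e.g. a missing binary); non-zero exits are left to the caller.
function run(command, args, options = {}) {
  const result = spawnSync(command, args, {
    encoding: 'utf-8',
    stdio: ['ignore', 'pipe', 'pipe'],
    // Windows .cmd/.bat wrappers cannot be spawned directly and need a shell.
    shell: /\.(cmd|bat)$/i.test(command),
    ...options,
  })
  if (result.error) throw result.error
  return result
}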

// Throw a labelled error when a command exits with a non-zero status.
function ensureOk(result, label) {
  if (result.status !== 0) {
    throw new Error(`${label} failed: ${result.stderr || result.stdout || `exit ${result.status}`}`)
  }
}

// Build a synthetic project: a failing test script, 80 committed source files,
// then 25 modified and 20 untracked files so git status/diff have output to compress.
function createFixture(projectDir) {
  mkdirSync(join(projectDir, 'src', 'components'), { recursive: true })
  writeFileSync(join(projectDir, 'package.json'), JSON.stringify({
    name: 'gsd-rtk-benchmark',
    version: '1.0.0',
    scripts: {
      test: 'node test.js',
    },
  }, null, 2))

  // test.js prints 120 repetitive FAIL lines and exits with status 1.
  const testLines = []
  for (let i = 0; i < 120; i += 1) {
    const group = i % 6
    testLines.push(`console.log('FAIL src/components/file${group}.test.ts:${i + 1}: expected value ${i}')`)
  }
  testLines.push('process.exit(1)')
  writeFileSync(join(projectDir, 'test.js'), `${testLines.join('\n')}\n`)

  // Baseline source files that go into the initial commit.
  for (let i = 1; i <= 80; i += 1) {
    writeFileSync(
      join(projectDir, 'src', 'components', `file${i}.ts`),
      `export function component_${i}() {\n return "value_${i}";\n}\n`,
    )
  }

  ensureOk(run('git', ['init', '-q'], { cwd: projectDir }), 'git init')
  ensureOk(run('git', ['config', 'user.email', 'benchmark@example.com'], { cwd: projectDir }), 'git config email')
  ensureOk(run('git', ['config', 'user.name', 'Benchmark'], { cwd: projectDir }), 'git config name')
  ensureOk(run('git', ['add', '.'], { cwd: projectDir }), 'git add')
  ensureOk(run('git', ['commit', '-qm', 'init'], { cwd: projectDir }), 'git commit')

  // Modify the first 25 tracked files so `git diff` is non-empty.
  for (let i = 1; i <= 25; i += 1) {
    writeFileSync(
      join(projectDir, 'src', 'components', `file${i}.ts`),
      `export function component_${i}() {\n return "value_${i}";\n}\n// change ${i}\n`,
    )
  }

  // Add 20 untracked files so `git status` lists new paths.
  for (let i = 81; i <= 100; i += 1) {
    writeFileSync(
      join(projectDir, 'src', 'components', `file${i}.ts`),
      `export const new_${i} = ${i}\n`,
    )
  }
}

// Render the benchmark report as Markdown from the `rtk gain` summary and history output.
function renderMarkdown({ summary, history, binaryPath }) {
  const timestamp = new Date().toISOString()
  return [
    '# RTK benchmark evidence',
    '',
    `- Generated: ${timestamp}`,
    `- RTK binary: \`${binaryPath}\``,
    `- Telemetry: disabled via \`RTK_TELEMETRY_DISABLED=1\``,
    `- Fixture: synthetic git + find + ls + npm test workload`,
    '',
    '## Aggregate savings',
    '',
    '| Commands | Input tokens | Output tokens | Saved tokens | Savings | Avg command time |',
    '| --- | ---: | ---: | ---: | ---: | ---: |',
    `| ${summary.total_commands} | ${summary.total_input} | ${summary.total_output} | ${summary.total_saved} | ${summary.avg_savings_pct.toFixed(1)}% | ${summary.avg_time_ms} ms |`,
    '',
    '## Command breakdown',
    '',
    '```text',
    history.trim(),
    '```',
    '',
    '## Commands exercised',
    '',
    '- `git status`',
    '- `git diff`',
    '- `find src -type f`',
    '- `ls -R src`',
    '- `npm run test`',
    '',
  ].join('\n')
}

function main() {
  // Optional `--output <path>`: write the report to a file instead of stdout.
  const outputIndex = process.argv.indexOf('--output')
  const outputPath = outputIndex !== -1 ? process.argv[outputIndex + 1] : null
  // GSD_RTK_PATH overrides the managed binary location.
  const binaryPath = process.env.GSD_RTK_PATH || getManagedRtkPath()
  if (!binaryPath) {
    throw new Error('RTK binary path not resolved')
  }

  // Isolated workspace: a throwaway HOME keeps RTK's history separate from the real user's.
  const workspace = mkdtempSync(join(tmpdir(), 'gsd-rtk-benchmark-'))
  const homeDir = join(workspace, 'home')
  const projectDir = join(workspace, 'project')
  mkdirSync(homeDir, { recursive: true })
  mkdirSync(projectDir, { recursive: true })

  try {
    createFixture(projectDir)
    const env = {
      ...process.env,
      HOME: homeDir,
      RTK_TELEMETRY_DISABLED: '1',
    }
    const commands = [
      ['git', 'status'],
      ['git', 'diff'],
      ['find', 'src', '-type', 'f'],
      ['ls', '-R', 'src'],
      ['npm', 'run', 'test'],
    ]
    // Exit codes are deliberately not checked here: the fixture's test script always exits 1.
    for (const command of commands) {
      run(binaryPath, command, { cwd: projectDir, env })
    }

    const summaryJson = run(binaryPath, ['gain', '--all', '--format', 'json'], { cwd: projectDir, env })
    ensureOk(summaryJson, 'rtk gain --all --format json')
    const historyText = run(binaryPath, ['gain', '--history'], { cwd: projectDir, env })
    ensureOk(historyText, 'rtk gain --history')

    const parsed = JSON.parse(summaryJson.stdout)
    const markdown = renderMarkdown({
      summary: parsed.summary,
      history: historyText.stdout,
      binaryPath,
    })

    if (outputPath) {
      mkdirSync(dirname(outputPath), { recursive: true })
      writeFileSync(outputPath, markdown, 'utf-8')
      console.log(outputPath)
      return
    }
    console.log(markdown)
  } finally {
    // Always remove the temporary workspace, even when a step throws.
    rmSync(workspace, { recursive: true, force: true })
  }
}

main()