feat(browser-tools): add 10 new browser tools (#698)

Implement all features from the browser-tools feature additions proposal:

1. browser_extract — structured data extraction with JSON Schema validation
2. browser_save_state / browser_restore_state — session state persistence
3. browser_generate_test — Playwright test code generation from session
4. browser_mock_route / browser_block_urls / browser_clear_routes — network interception
5. browser_emulate_device — device emulation with 143 Playwright device presets
6. browser_visual_diff — visual regression diffing with baseline management
7. browser_save_pdf — PDF generation (Chromium page.pdf)
8. browser_zoom_region — region capture with upscaling via sharp
9. browser_action_cache — intent→selector caching for repeat visits
10. browser_check_injection — prompt injection detection on page content

Total browser tools: 47 → 60. No new dependencies — uses existing
sharp, ajv, @sinclair/typebox, and Playwright core APIs.
This commit is contained in:
Tom Boucher 2026-03-16 17:34:44 -04:00
parent a90aa0c8d6
commit ca299db1c6
14 changed files with 2344 additions and 1 deletions

14
.gitignore vendored
View file

@ -1,6 +1,5 @@
# ── GSD project state (development-only, lives in worktree branches) ──
.gsd/
.claude/
RELEASE-GUIDE.md
@ -50,3 +49,16 @@ AGENTS.md
.bg-shell/
TODOS.md
.planning/
# ── GSD baseline (auto-generated) ──
.gsd/activity/
.gsd/runtime/
.gsd/worktrees/
.gsd/auto.lock
.gsd/metrics.json
.gsd/completed-units.json
.gsd/STATE.md
.gsd/gsd.db
.gsd/DISCUSSION-MANIFEST.json
.gsd/milestones/**/*-CONTINUE.md
.gsd/milestones/**/continue.md

View file

@ -0,0 +1,312 @@
# Browser-Tools Feature Additions — Implementation Requirements
> Ref: [#698](https://github.com/gsd-build/gsd-2/issues/698)
> Status: Proposal — open for contributor review
## Current State
Browser-tools ships **47 tools** across 10 modules (~8,300 lines). The extension wraps Playwright's Chromium instance with intent resolution, semantic actions, assertions, state diffing, an action timeline, HAR/trace export, and a deterministic ref system. Context is managed via `lifecycle.ts` (browser/context/page lifecycle) and `state.ts` (session tracking).
Key existing capabilities: `browser_navigate`, `browser_click`, `browser_evaluate`, `browser_assert`, `browser_diff`, `browser_batch`, `browser_find_best`, `browser_act`, `browser_trace_start/stop`, `browser_export_har`, `browser_set_viewport`, `browser_screenshot`, `browser_snapshot_refs`.
No existing support for: storage state persistence, route interception, PDF export, structured data extraction, device emulation profiles, visual diffing, or test code generation.
---
## Feature 1: Structured Data Extraction with Schema Validation
**Tool:** `browser_extract`
### What it does
Accept a JSON Schema (or simplified shape description), extract matching structured data from the current page, validate against the schema, return typed JSON.
### Implementation requirements
| Item | Details |
|---|---|
| **New file** | `tools/extract.ts` |
| **Playwright API** | `page.evaluate()` — runs extraction logic in-page |
| **Schema validation** | Use `@sinclair/typebox` (already a dependency) for schema definition; `ajv` or inline validation for runtime checking |
| **Extraction strategy** | 1. Convert page to accessibility tree or clean text via existing `browser_get_accessibility_tree` / `browser_get_page_source` infrastructure. 2. Use `page.evaluate()` to run CSS selector-based extraction. 3. For complex extraction, pass schema + page content to the LLM via tool result and let the agent extract (Stagehand approach) |
| **Tool signature** | `browser_extract({ schema: JSONSchema, selector?: string, multiple?: boolean })``{ data: T, validationErrors?: string[] }` |
| **Dependencies** | None new — Typebox already available, `page.evaluate` is Playwright core |
| **Estimated effort** | **1624 hours** |
| **Risk** | Medium — extraction quality depends heavily on page structure; may need multiple strategies (DOM-based, a11y-tree-based, LLM-assisted) |
### Acceptance criteria
- [ ] Extracts data matching a provided JSON schema from a page
- [ ] Returns validation errors when extracted data doesn't match schema
- [ ] Supports scoping extraction to a CSS selector
- [ ] Supports extracting arrays of items (`multiple: true`)
- [ ] Handles pages with dynamic content (waits for network idle before extraction)
---
## Feature 2: Session State Persistence & Restoration
**Tools:** `browser_save_state`, `browser_restore_state`
### What it does
Save cookies, localStorage, sessionStorage, and auth tokens to disk. Restore them on a subsequent browser session to resume authenticated state without re-logging in.
### Implementation requirements
| Item | Details |
|---|---|
| **New tools in** | `tools/session.ts` (extend existing file) |
| **Playwright API** | `context.storageState()` for cookies + localStorage; `page.evaluate()` for sessionStorage (not included in Playwright's storageState) |
| **Storage location** | Session artifacts directory: `.gsd/browser-state/<name>.json` |
| **Tool signatures** | `browser_save_state({ name?: string })``{ path, cookieCount, localStorageOrigins }` / `browser_restore_state({ name?: string })``{ restored, cookieCount }` |
| **Restore mechanism** | `browser.newContext({ storageState: path })` for new sessions; `context.addCookies()` + `page.evaluate()` for mid-session restore |
| **Security** | State files may contain auth tokens — add to `.gitignore` pattern, warn in tool output |
| **Dependencies** | None new — all Playwright core APIs |
| **Estimated effort** | **812 hours** |
| **Risk** | Low — Playwright's `storageState()` is well-tested; sessionStorage requires extra handling |
### Acceptance criteria
- [ ] Saves cookies + localStorage via `context.storageState()`
- [ ] Saves sessionStorage via `page.evaluate()` (per-origin)
- [ ] Restores state on new browser context launch
- [ ] Restores state mid-session (cookies + evaluate injection)
- [ ] State files written to `.gsd/browser-state/` and gitignored
- [ ] Tool output shows count of restored items, never displays secret values
---
## Feature 3: Test Code Generation from Session
**Tool:** `browser_generate_test`
### What it does
Record agent interactions during a browser session and emit a Playwright test script. Turns AI-driven exploration into deterministic, reproducible tests.
### Implementation requirements
| Item | Details |
|---|---|
| **New file** | `tools/codegen.ts` |
| **Data source** | Action timeline (already tracked in `state.ts`) + trace data from `browser_trace_start/stop` |
| **Code generation** | Transform timeline entries (navigate, click, type, assert) into Playwright test syntax: `await page.goto(...)`, `await page.click(...)`, `await expect(page.locator(...)).toBeVisible()` |
| **Tool signature** | `browser_generate_test({ name?: string, includeAssertions?: boolean })``{ path, actionCount, testCode }` |
| **Output format** | Standard Playwright test file (`*.spec.ts`) written to project's test directory or session artifacts |
| **Selector strategy** | Prefer stable selectors: `getByRole` > `getByText` > CSS selector (use ref metadata for best selectors) |
| **Dependencies** | None new — reads from existing timeline/trace infrastructure |
| **Estimated effort** | **2030 hours** |
| **Risk** | High — generated selectors may be brittle; action timeline may not capture all nuances (hover timing, scroll position, wait conditions); output quality varies significantly by page complexity |
### Acceptance criteria
- [ ] Generates a runnable Playwright test from a recorded session
- [ ] Includes navigation, click, type, and assertion actions
- [ ] Uses stable selectors (role-based preferred over CSS)
- [ ] Generated test passes when run against the same page state
- [ ] Writes test file to configurable output path
---
## Feature 4: Network Request Interception & Mocking
**Tools:** `browser_mock_route`, `browser_block_urls`, `browser_clear_routes`
### What it does
Intercept network requests to mock API responses, block URLs (analytics, ads), simulate error conditions (500s, timeouts, slow responses).
### Implementation requirements
| Item | Details |
|---|---|
| **New file** | `tools/network-mock.ts` |
| **Playwright API** | `page.route(urlPattern, handler)` for interception; `route.fulfill()` for mock responses; `route.abort()` for blocking |
| **Tool signatures** | `browser_mock_route({ url: string, status?: number, body?: string, headers?: Record })` / `browser_block_urls({ patterns: string[] })` / `browser_clear_routes()` |
| **State tracking** | Track active routes in module state for cleanup and listing |
| **Dependencies** | None new — Playwright core API |
| **Estimated effort** | **1216 hours** |
| **Risk** | Low — Playwright's route API is mature and well-documented |
### Acceptance criteria
- [ ] Mock API responses with custom status, body, and headers
- [ ] Block requests matching URL patterns (glob or regex)
- [ ] Simulate slow responses with configurable delay
- [ ] Clear all active routes
- [ ] List active routes for debugging
- [ ] Routes survive page navigation within the same context
---
## Feature 5: Device Emulation Presets
**Tool:** `browser_emulate_device`
### What it does
One-call device simulation: viewport + user agent + touch + device scale factor. Wraps Playwright's device descriptors.
### Implementation requirements
| Item | Details |
|---|---|
| **Extend** | `tools/interaction.ts` (alongside `browser_set_viewport`) or new `tools/device.ts` |
| **Playwright API** | `playwright.devices['iPhone 15']``{ viewport, userAgent, deviceScaleFactor, isMobile, hasTouch }` applied via context recreation or page emulation |
| **Tool signature** | `browser_emulate_device({ device: string })``{ device, viewport, userAgent, isMobile }` |
| **Device list** | Expose Playwright's built-in device descriptors (~100 devices); accept fuzzy matching on device name |
| **Limitation** | Some properties (userAgent, isMobile) can only be set at context creation — may require context restart |
| **Dependencies** | None new — Playwright ships device descriptors |
| **Estimated effort** | **610 hours** |
| **Risk** | Low-Medium — context restart for full emulation changes the page state; partial emulation (viewport only) is simpler but less accurate |
### Acceptance criteria
- [ ] Accept device name (e.g., "iPhone 15", "Pixel 7") and configure full emulation
- [ ] Support fuzzy matching on device name with suggestions on no match
- [ ] Set viewport, user agent, device scale factor, touch, and mobile flag
- [ ] Warn when context restart is required and confirm with user
---
## Feature 6: Visual Diffing (Screenshot Comparison)
**Tool:** `browser_visual_diff`
### What it does
Compare two screenshots pixel-by-pixel, return a diff image and similarity score.
### Implementation requirements
| Item | Details |
|---|---|
| **New file** | `tools/visual-diff.ts` |
| **Comparison library** | `pixelmatch` (lightweight, ~200 lines, MIT) or Playwright's built-in `expect(page).toHaveScreenshot()` comparison |
| **Tool signature** | `browser_visual_diff({ baseline?: string, current?: string, threshold?: number })``{ match: boolean, similarity: number, diffPixels: number, diffImagePath?: string }` |
| **Baseline management** | Save baselines to `.gsd/browser-baselines/`; auto-name by URL + viewport |
| **Dependencies** | `pixelmatch` + `pngjs` (new deps, ~50KB total) or use Playwright's built-in comparator |
| **Estimated effort** | **1014 hours** |
| **Risk** | Medium — anti-aliasing and dynamic content (timestamps, ads) cause false positives; threshold tuning needed |
### Acceptance criteria
- [ ] Compare current page screenshot against a stored baseline
- [ ] Return similarity score (01) and diff pixel count
- [ ] Generate diff image highlighting changed regions
- [ ] Configurable threshold for pass/fail
- [ ] Support element-scoped comparison (crop to selector)
---
## Feature 7: PDF Generation
**Tool:** `browser_save_pdf`
### What it does
Render current page as PDF artifact.
### Implementation requirements
| Item | Details |
|---|---|
| **Extend** | `tools/screenshot.ts` or new `tools/pdf.ts` |
| **Playwright API** | `page.pdf({ path, format, printBackground })` — Chromium only (already our engine) |
| **Tool signature** | `browser_save_pdf({ filename?: string, format?: string, printBackground?: boolean })``{ path, pageCount, sizeBytes }` |
| **Output location** | Session artifacts directory |
| **Dependencies** | None — Playwright core API |
| **Estimated effort** | **35 hours** |
| **Risk** | Low — straightforward Playwright wrapper |
### Acceptance criteria
- [ ] Generate PDF from current page
- [ ] Support A4/Letter/custom page formats
- [ ] Include background graphics option
- [ ] Write to session artifacts with configurable filename
- [ ] Return file path and size
---
## Feature 8: Region Zoom / Targeted High-Res Capture
**Tool:** `browser_zoom_region`
### What it does
Capture and upscale a specific rectangular region for detailed inspection of dense UIs.
### Implementation requirements
| Item | Details |
|---|---|
| **Extend** | `tools/screenshot.ts` |
| **Playwright API** | `page.screenshot({ clip: { x, y, width, height } })` for region capture; upscale via `sharp` or return at native device pixel ratio |
| **Tool signature** | `browser_zoom_region({ x, y, width, height, scale?: number })` → screenshot image |
| **Dependencies** | Optional `sharp` for upscaling, or rely on Playwright's deviceScaleFactor |
| **Estimated effort** | **46 hours** |
| **Risk** | Low |
### Acceptance criteria
- [ ] Capture arbitrary rectangular region by coordinates
- [ ] Support scale factor for upscaling (2x, 3x)
- [ ] Return as inline image (same as `browser_screenshot`)
---
## Feature 9: Action Caching / Replay (Lower Priority)
**Tool:** Internal optimization, not a user-facing tool
### Implementation requirements
| Item | Details |
|---|---|
| **Cache key** | URL + DOM structure hash → selector mapping |
| **Storage** | In-memory LRU cache with optional disk persistence |
| **Integration point** | `browser_find_best` / `browser_act` — check cache before LLM resolution |
| **Estimated effort** | **1218 hours** |
| **Risk** | Medium — cache invalidation when page structure changes; stale selectors cause silent failures |
---
## Feature 10: Prompt Injection Detection (Lower Priority)
**Tool:** `browser_check_injection`
### Implementation requirements
| Item | Details |
|---|---|
| **Detection strategy** | Regex/keyword scan on screenshot OCR text or page text content for known injection patterns ("ignore previous", "system prompt", "you are now") |
| **Integration point** | Optional auto-check after `browser_screenshot` or `browser_navigate` |
| **Estimated effort** | **812 hours** |
| **Risk** | Medium — false positives on legitimate content; OCR adds latency; determined adversaries can evade keyword detection |
---
## Summary — Effort & Priority Matrix
| # | Feature | Priority | Effort | New Deps | Risk |
|---|---|---|---|---|---|
| 1 | Structured data extraction | High | 1624h | None | Medium |
| 2 | Session state persistence | High | 812h | None | Low |
| 3 | Test code generation | High | 2030h | None | High |
| 4 | Network interception/mocking | High | 1216h | None | Low |
| 5 | Device emulation presets | Medium | 610h | None | Low-Med |
| 6 | Visual diffing | Medium | 1014h | pixelmatch (~50KB) | Medium |
| 7 | PDF generation | Medium | 35h | None | Low |
| 8 | Region zoom capture | Medium | 46h | Optional sharp | Low |
| 9 | Action caching | Lower | 1218h | None | Medium |
| 10 | Prompt injection detection | Lower | 812h | None | Medium |
| | **Total** | | **~100150h** | | |
## Recommended Implementation Order
1. **PDF generation** (Feature 7) — smallest, zero deps, immediate utility, good warmup
2. **Session state persistence** (Feature 2) — high value, low risk, moderate effort
3. **Network interception** (Feature 4) — high value, low risk, Playwright API is mature
4. **Region zoom** (Feature 8) — small effort, extends existing screenshot tool
5. **Device emulation** (Feature 5) — moderate effort, extends existing viewport tool
6. **Structured extraction** (Feature 1) — high value but needs design iteration on extraction strategy
7. **Visual diffing** (Feature 6) — useful for UAT, needs threshold tuning
8. **Test code generation** (Feature 3) — high value but high risk, best tackled after timeline infrastructure is battle-tested
9. **Action caching** (Feature 9) — optimization, defer until intent resolution is a proven bottleneck
10. **Prompt injection** (Feature 10) — defensive, defer until production use cases mature
## Notes for Contributors
- All features wrap existing Playwright APIs — no custom browser extensions or CDP hacking needed
- Features 2, 4, 5, 7, 8 are straightforward Playwright wrappers with low implementation risk
- Features 1 and 3 involve more design work — open sub-issues for design discussion before implementation
- Each feature should be a separate PR with its own tests
- Follow the existing tool registration pattern in `index.ts``tools/*.ts`
- Use `Type` from `@sinclair/typebox` for tool parameter schemas (existing convention)
- Session artifacts go in the artifacts directory managed by `session.ts`

View file

@ -17,6 +17,16 @@ import { registerWaitTools } from "./tools/wait.js";
import { registerPageTools } from "./tools/pages.js";
import { registerFormTools } from "./tools/forms.js";
import { registerIntentTools } from "./tools/intent.js";
import { registerPdfTools } from "./tools/pdf.js";
import { registerStatePersistenceTools } from "./tools/state-persistence.js";
import { registerNetworkMockTools } from "./tools/network-mock.js";
import { registerDeviceTools } from "./tools/device.js";
import { registerExtractTools } from "./tools/extract.js";
import { registerVisualDiffTools } from "./tools/visual-diff.js";
import { registerZoomTools } from "./tools/zoom.js";
import { registerCodegenTools } from "./tools/codegen.js";
import { registerActionCacheTools } from "./tools/action-cache.js";
import { registerInjectionDetectionTools } from "./tools/injection-detect.js";
export default function (pi: ExtensionAPI) {
pi.on("session_shutdown", async () => { await closeBrowser(); });
@ -48,4 +58,14 @@ export default function (pi: ExtensionAPI) {
registerPageTools(pi, deps);
registerFormTools(pi, deps);
registerIntentTools(pi, deps);
registerPdfTools(pi, deps);
registerStatePersistenceTools(pi, deps);
registerNetworkMockTools(pi, deps);
registerDeviceTools(pi, deps);
registerExtractTools(pi, deps);
registerVisualDiffTools(pi, deps);
registerZoomTools(pi, deps);
registerCodegenTools(pi, deps);
registerActionCacheTools(pi, deps);
registerInjectionDetectionTools(pi, deps);
}

View file

@ -612,3 +612,28 @@ describe("constrainScreenshot", () => {
assert.equal(meta.height, 1568);
});
});
// ---------------------------------------------------------------------------
// browser_save_pdf — tool registration
// ---------------------------------------------------------------------------
describe("browser_save_pdf tool registration", () => {
it("registerPdfTools exports a function", () => {
const { registerPdfTools } = jiti("../tools/pdf.ts");
assert.equal(typeof registerPdfTools, "function", "registerPdfTools should be a function");
});
it("tool can be registered with a mock pi", () => {
const { registerPdfTools } = jiti("../tools/pdf.ts");
const registeredTools = [];
const mockPi = {
registerTool: (tool) => registeredTools.push(tool),
};
const mockDeps = {};
registerPdfTools(mockPi, mockDeps);
assert.equal(registeredTools.length, 1, "should register exactly 1 tool");
assert.equal(registeredTools[0].name, "browser_save_pdf", "tool name should be browser_save_pdf");
assert.ok(registeredTools[0].parameters, "tool should have parameters schema");
assert.equal(typeof registeredTools[0].execute, "function", "tool should have execute function");
});
});

View file

@ -0,0 +1,216 @@
import type { ExtensionAPI } from "@gsd/pi-coding-agent";
import { Type } from "@sinclair/typebox";
import type { ToolDeps } from "../state.js";
/**
* Action caching cache semantic intent selector mappings to skip LLM inference on repeat visits.
* Internal optimization that hooks into browser_find_best / browser_act.
*/
interface CacheEntry {
selector: string;
score: number;
url: string;
domHash: string;
timestamp: number;
hitCount: number;
}
const cache = new Map<string, CacheEntry>();
const MAX_CACHE_SIZE = 200;
export function registerActionCacheTools(pi: ExtensionAPI, deps: ToolDeps): void {
// -------------------------------------------------------------------------
// browser_action_cache
// -------------------------------------------------------------------------
pi.registerTool({
name: "browser_action_cache",
label: "Browser Action Cache",
description:
"Manage the action cache that maps page structure + intent → resolved selectors. " +
"Cache reduces token cost on repeat visits to same pages. " +
"Actions: 'stats' (show cache metrics), 'get' (lookup cached selector), " +
"'put' (store a selector mapping), 'clear' (flush cache).",
parameters: Type.Object({
action: Type.String({
description: "Cache action: 'stats', 'get', 'put', or 'clear'.",
}),
intent: Type.Optional(
Type.String({ description: "Semantic intent key (for get/put). E.g., 'submit_form', 'close_dialog'." }),
),
selector: Type.Optional(
Type.String({ description: "CSS selector to cache (for put)." }),
),
score: Type.Optional(
Type.Number({ description: "Confidence score 01 for the cached selector (for put)." }),
),
}),
async execute(_toolCallId, params, _signal, _onUpdate, _ctx) {
try {
const { page: p } = await deps.ensureBrowser();
const url = p.url();
switch (params.action) {
case "stats": {
const entries = [...cache.values()];
const totalHits = entries.reduce((sum, e) => sum + e.hitCount, 0);
return {
content: [{
type: "text",
text: `Action cache: ${cache.size} entries, ${totalHits} total hits\nMax size: ${MAX_CACHE_SIZE}`,
}],
details: {
size: cache.size,
maxSize: MAX_CACHE_SIZE,
totalHits,
entries: entries.map((e) => ({
url: e.url,
selector: e.selector,
hitCount: e.hitCount,
score: e.score,
})),
},
};
}
case "get": {
if (!params.intent) {
return {
content: [{ type: "text", text: "Intent parameter required for 'get' action." }],
details: { error: "missing_intent" },
isError: true,
};
}
const domHash = await computeDomHash(p);
const key = buildCacheKey(url, domHash, params.intent);
const entry = cache.get(key);
if (!entry) {
return {
content: [{ type: "text", text: `Cache miss for intent "${params.intent}" on ${url}` }],
details: { hit: false, intent: params.intent, url },
};
}
// Validate the cached selector still exists
const exists = await p.locator(entry.selector).first().isVisible().catch(() => false);
if (!exists) {
cache.delete(key);
return {
content: [{ type: "text", text: `Cache entry stale (selector no longer visible): ${entry.selector}` }],
details: { hit: false, stale: true, selector: entry.selector },
};
}
entry.hitCount++;
return {
content: [{
type: "text",
text: `Cache hit: "${params.intent}" → ${entry.selector} (score: ${entry.score}, hits: ${entry.hitCount})`,
}],
details: { hit: true, ...entry },
};
}
case "put": {
if (!params.intent || !params.selector) {
return {
content: [{ type: "text", text: "Intent and selector parameters required for 'put' action." }],
details: { error: "missing_params" },
isError: true,
};
}
const domHash = await computeDomHash(p);
const key = buildCacheKey(url, domHash, params.intent);
// Evict oldest entries if at capacity
if (cache.size >= MAX_CACHE_SIZE && !cache.has(key)) {
const oldestKey = [...cache.entries()]
.sort(([, a], [, b]) => a.timestamp - b.timestamp)[0]?.[0];
if (oldestKey) cache.delete(oldestKey);
}
const entry: CacheEntry = {
selector: params.selector,
score: params.score ?? 1.0,
url,
domHash,
timestamp: Date.now(),
hitCount: 0,
};
cache.set(key, entry);
return {
content: [{
type: "text",
text: `Cached: "${params.intent}" → ${params.selector} (cache size: ${cache.size})`,
}],
details: { stored: true, key, ...entry, cacheSize: cache.size },
};
}
case "clear": {
const size = cache.size;
cache.clear();
return {
content: [{ type: "text", text: `Action cache cleared (${size} entries removed).` }],
details: { cleared: size },
};
}
default:
return {
content: [{ type: "text", text: `Unknown action: ${params.action}. Use 'stats', 'get', 'put', or 'clear'.` }],
details: { error: "unknown_action" },
isError: true,
};
}
} catch (err: any) {
return {
content: [{ type: "text", text: `Action cache error: ${err.message}` }],
details: { error: err.message },
isError: true,
};
}
},
});
}
function buildCacheKey(url: string, domHash: string, intent: string): string {
// Normalize URL — strip hash and query params for broader matching
let normalized: string;
try {
const u = new URL(url);
normalized = `${u.origin}${u.pathname}`;
} catch {
normalized = url;
}
return `${normalized}|${domHash}|${intent}`;
}
async function computeDomHash(page: any): Promise<string> {
try {
return await page.evaluate(() => {
// Structural hash based on element count + tag distribution
const tags = new Map<string, number>();
const all = document.querySelectorAll("*");
for (const el of all) {
const tag = el.tagName;
tags.set(tag, (tags.get(tag) ?? 0) + 1);
}
const entries = [...tags.entries()].sort((a, b) => a[0].localeCompare(b[0]));
const str = entries.map(([t, c]) => `${t}:${c}`).join("|");
// Simple hash
let h = 5381;
for (let i = 0; i < str.length; i++) {
h = ((h << 5) - h + str.charCodeAt(i)) | 0;
}
return (h >>> 0).toString(16);
});
} catch {
return "unknown";
}
}

View file

@ -0,0 +1,274 @@
import type { ExtensionAPI } from "@gsd/pi-coding-agent";
import { Type } from "@sinclair/typebox";
import type { ToolDeps } from "../state.js";
import { getActionTimeline } from "../state.js";
/**
* Test code generation transform recorded browser session into a Playwright test script.
*/
export function registerCodegenTools(pi: ExtensionAPI, deps: ToolDeps): void {
pi.registerTool({
name: "browser_generate_test",
label: "Browser Generate Test",
description:
"Generate a runnable Playwright test script from the recorded action timeline. " +
"Transforms navigation, click, type, and assertion actions into standard Playwright test syntax. " +
"Uses stable selectors (role-based preferred). Writes the test file to a configurable path.",
parameters: Type.Object({
name: Type.Optional(
Type.String({ description: "Test name (used for describe/test block and filename). Default: 'recorded-session'." }),
),
outputPath: Type.Optional(
Type.String({
description:
"Output file path for the generated test. Default: writes to session artifacts directory. " +
"Use a path ending in .spec.ts for standard Playwright test convention.",
}),
),
includeAssertions: Type.Optional(
Type.Boolean({ description: "Include assertion steps from the timeline (default: true)." }),
),
}),
async execute(_toolCallId, params, _signal, _onUpdate, _ctx) {
try {
await deps.ensureBrowser();
const timeline = getActionTimeline();
if (timeline.entries.length === 0) {
return {
content: [{ type: "text", text: "No actions recorded in the current session. Interact with pages first, then generate a test." }],
details: { error: "no_actions" },
isError: true,
};
}
const testName = params.name ?? "recorded-session";
const includeAssertions = params.includeAssertions ?? true;
// Transform timeline entries into Playwright test code
const testLines: string[] = [];
const imports = new Set<string>();
imports.add("test");
imports.add("expect");
testLines.push(`test.describe('${escapeString(testName)}', () => {`);
testLines.push(` test('recorded session', async ({ page }) => {`);
let lastUrl = "";
let actionCount = 0;
for (const entry of timeline.entries) {
if (entry.status === "error" && entry.tool !== "browser_assert") continue;
const params = parseParamsSummary(entry.paramsSummary);
switch (entry.tool) {
case "browser_navigate": {
const url = params.url;
if (url && url !== lastUrl) {
testLines.push(` await page.goto(${quote(url)});`);
lastUrl = url;
actionCount++;
}
break;
}
case "browser_click": {
const selector = params.selector;
if (selector) {
testLines.push(` await page.locator(${quote(selector)}).click();`);
actionCount++;
}
break;
}
case "browser_click_ref": {
// Refs are session-specific — add comment
testLines.push(` // browser_click_ref: ${entry.paramsSummary} — replace with stable selector`);
actionCount++;
break;
}
case "browser_type": {
const selector = params.selector;
const text = params.text;
if (selector && text) {
testLines.push(` await page.locator(${quote(selector)}).fill(${quote(text)});`);
actionCount++;
}
break;
}
case "browser_fill_ref": {
testLines.push(` // browser_fill_ref: ${entry.paramsSummary} — replace with stable selector`);
actionCount++;
break;
}
case "browser_key_press": {
const key = params.key;
if (key) {
testLines.push(` await page.keyboard.press(${quote(key)});`);
actionCount++;
}
break;
}
case "browser_select_option": {
const selector = params.selector;
const option = params.option;
if (selector && option) {
testLines.push(` await page.locator(${quote(selector)}).selectOption(${quote(option)});`);
actionCount++;
}
break;
}
case "browser_set_checked": {
const selector = params.selector;
const checked = params.checked;
if (selector) {
testLines.push(` await page.locator(${quote(selector)}).setChecked(${checked === "true"});`);
actionCount++;
}
break;
}
case "browser_hover": {
const selector = params.selector;
if (selector) {
testLines.push(` await page.locator(${quote(selector)}).hover();`);
actionCount++;
}
break;
}
case "browser_wait_for": {
const condition = params.condition;
const value = params.value;
if (condition === "selector_visible" && value) {
testLines.push(` await expect(page.locator(${quote(value)})).toBeVisible();`);
actionCount++;
} else if (condition === "text_visible" && value) {
testLines.push(` await expect(page.locator('body')).toContainText(${quote(value)});`);
actionCount++;
} else if (condition === "url_contains" && value) {
testLines.push(` await page.waitForURL(${quote(`**/*${value}*`)});`);
actionCount++;
} else if (condition === "network_idle") {
testLines.push(` await page.waitForLoadState('networkidle');`);
actionCount++;
} else if (condition === "delay" && value) {
testLines.push(` await page.waitForTimeout(${value});`);
actionCount++;
}
break;
}
case "browser_assert": {
if (!includeAssertions) break;
// The assertion details are in verificationSummary
if (entry.verificationSummary) {
testLines.push(` // Assertion: ${entry.verificationSummary}`);
}
actionCount++;
break;
}
case "browser_scroll": {
const direction = params.direction;
const amount = params.amount ?? "300";
const delta = direction === "up" ? `-${amount}` : amount;
testLines.push(` await page.mouse.wheel(0, ${delta});`);
actionCount++;
break;
}
case "browser_set_viewport": {
const width = params.width;
const height = params.height;
if (width && height) {
testLines.push(` await page.setViewportSize({ width: ${width}, height: ${height} });`);
actionCount++;
}
break;
}
default:
// Skip tools that don't map to Playwright test actions
break;
}
}
testLines.push(` });`);
testLines.push(`});`);
const importLine = `import { ${[...imports].join(", ")} } from '@playwright/test';`;
const fullTest = `${importLine}\n\n${testLines.join("\n")}\n`;
// Write to file
let outputPath: string;
if (params.outputPath) {
outputPath = params.outputPath;
} else {
const safeName = deps.sanitizeArtifactName(testName, "recorded-session");
outputPath = deps.buildSessionArtifactPath(`${safeName}.spec.ts`);
}
await deps.ensureSessionArtifactDir();
const { path: writtenPath, bytes } = await deps.writeArtifactFile(outputPath, fullTest);
return {
content: [{
type: "text",
text: `Test generated: ${writtenPath}\nActions: ${actionCount}\nTimeline entries processed: ${timeline.entries.length}\n\n${fullTest}`,
}],
details: {
path: writtenPath,
bytes,
actionCount,
timelineEntries: timeline.entries.length,
testCode: fullTest,
},
};
} catch (err: any) {
return {
content: [{ type: "text", text: `Test generation failed: ${err.message}` }],
details: { error: err.message },
isError: true,
};
}
},
});
}
function escapeString(s: string): string {
return s.replace(/'/g, "\\'").replace(/\\/g, "\\\\");
}
function quote(s: string): string {
// Use single quotes for simple strings, backtick for those with quotes
if (!s.includes("'")) return `'${s}'`;
if (!s.includes("`")) return `\`${s}\``;
return `'${s.replace(/'/g, "\\'")}'`;
}
/**
* Parse the paramsSummary string back into key-value pairs.
* Format: key="value", key=value, key=[N], key={...}
*/
function parseParamsSummary(summary: string): Record<string, string> {
const result: Record<string, string> = {};
if (!summary) return result;
const regex = /(\w+)=(?:"([^"]*(?:\\"[^"]*)*)"|([^,\s]+))/g;
let match;
while ((match = regex.exec(summary)) !== null) {
const key = match[1];
const value = match[2] ?? match[3];
result[key] = value;
}
return result;
}

View file

@ -0,0 +1,183 @@
import type { ExtensionAPI } from "@gsd/pi-coding-agent";
import { Type } from "@sinclair/typebox";
import type { ToolDeps } from "../state.js";
/**
* Device emulation tool full device simulation using Playwright's built-in device descriptors.
*/
export function registerDeviceTools(pi: ExtensionAPI, deps: ToolDeps): void {
pi.registerTool({
name: "browser_emulate_device",
label: "Browser Emulate Device",
description:
"Simulate a specific device by setting viewport, user agent, device scale factor, touch, and mobile flag. " +
"Uses Playwright's built-in device descriptors (~143 devices). Accepts fuzzy matching on device name. " +
"Note: Full emulation (user agent, isMobile) requires a context restart — the current page state will be lost. " +
"The tool recreates the context with the device profile applied.",
parameters: Type.Object({
device: Type.String({
description:
"Device name (e.g., 'iPhone 15', 'Pixel 7', 'iPad Pro 11'). " +
"Case-insensitive fuzzy matching. Use 'list' to see all available devices.",
}),
}),
async execute(_toolCallId, params, _signal, _onUpdate, _ctx) {
try {
const { chromium, devices } = await import("playwright");
const allDeviceNames = Object.keys(devices);
// Handle 'list' request
if (params.device.toLowerCase() === "list") {
// Group by base device name (remove landscape variants for cleaner display)
const baseNames = allDeviceNames.filter((n) => !n.endsWith(" landscape"));
return {
content: [{
type: "text",
text: `Available devices (${allDeviceNames.length} total, ${baseNames.length} base):\n${baseNames.join("\n")}`,
}],
details: { devices: baseNames, total: allDeviceNames.length },
};
}
// Fuzzy match device name
const needle = params.device.toLowerCase();
let exactMatch = allDeviceNames.find((n) => n.toLowerCase() === needle);
if (!exactMatch) {
// Try contains match
const containsMatches = allDeviceNames.filter((n) => n.toLowerCase().includes(needle));
if (containsMatches.length === 1) {
exactMatch = containsMatches[0];
} else if (containsMatches.length > 1) {
// Pick the shortest match (most specific)
containsMatches.sort((a, b) => a.length - b.length);
exactMatch = containsMatches[0];
const suggestions = containsMatches.slice(0, 5).join(", ");
// Continue with best match but mention alternatives
} else {
// No match at all — suggest closest
const suggestions = allDeviceNames
.map((n) => ({ name: n, score: fuzzyScore(needle, n.toLowerCase()) }))
.sort((a, b) => b.score - a.score)
.slice(0, 5)
.map((s) => s.name);
return {
content: [{
type: "text",
text: `No device matching "${params.device}". Did you mean:\n${suggestions.map((s) => ` - ${s}`).join("\n")}`,
}],
details: { error: "no_match", suggestions },
isError: true,
};
}
}
const deviceDescriptor = devices[exactMatch!];
if (!deviceDescriptor) {
return {
content: [{ type: "text", text: `Device descriptor not found for "${exactMatch}"` }],
details: { error: "descriptor_not_found" },
isError: true,
};
}
// Context restart required for full emulation.
// Save current URL to navigate back after restart.
const { page: currentPage, context: currentCtx } = await deps.ensureBrowser();
const currentUrl = currentPage.url();
// Close existing browser and relaunch with device profile
await deps.closeBrowser();
// Re-launch — ensureBrowser doesn't accept device params, so we do it manually.
// This is a one-off context creation with device emulation.
const needsHeadless = process.platform === "linux" && !process.env.DISPLAY;
const launchOptions: Record<string, unknown> = {
headless: needsHeadless || process.env.FORCE_HEADLESS === "true",
};
const customPath = process.env.BROWSER_PATH;
if (customPath) launchOptions.executablePath = customPath;
const browser = await chromium.launch(launchOptions);
const context = await browser.newContext({
...deviceDescriptor,
});
// Inject evaluate helpers
const { EVALUATE_HELPERS_SOURCE } = await import("../evaluate-helpers.js");
await context.addInitScript(EVALUATE_HELPERS_SOURCE);
// Wire up state
const {
setBrowser, setContext, pageRegistry, setSessionStartedAt,
setSessionArtifactDir, resetAllState,
} = await import("../state.js");
const { registryAddPage, registrySetActive } = await import("../core.js");
// Reset state for new session
resetAllState();
setBrowser(browser);
setContext(context);
setSessionStartedAt(Date.now());
const page = await context.newPage();
const entry = registryAddPage(pageRegistry, {
page,
title: "",
url: "about:blank",
opener: null,
});
registrySetActive(pageRegistry, entry.id);
deps.attachPageListeners(page, entry.id);
// Navigate back to previous URL if it wasn't about:blank
if (currentUrl && currentUrl !== "about:blank") {
await page.goto(currentUrl, { waitUntil: "domcontentloaded", timeout: 15000 }).catch(() => {});
}
const viewport = deviceDescriptor.viewport;
const vpText = viewport ? `${viewport.width}x${viewport.height}` : "unknown";
return {
content: [{
type: "text",
text: `Device emulation active: ${exactMatch}\nViewport: ${vpText}\nUser Agent: ${deviceDescriptor.userAgent?.slice(0, 80) ?? "default"}...\nMobile: ${deviceDescriptor.isMobile ?? false}\nTouch: ${deviceDescriptor.hasTouch ?? false}\nScale Factor: ${deviceDescriptor.deviceScaleFactor ?? 1}\n\nContext was restarted for full emulation. Page state was reset.`,
}],
details: {
device: exactMatch,
viewport: vpText,
isMobile: deviceDescriptor.isMobile ?? false,
hasTouch: deviceDescriptor.hasTouch ?? false,
deviceScaleFactor: deviceDescriptor.deviceScaleFactor ?? 1,
userAgent: deviceDescriptor.userAgent,
restoredUrl: currentUrl,
},
};
} catch (err: any) {
return {
content: [{ type: "text", text: `Device emulation failed: ${err.message}` }],
details: { error: err.message },
isError: true,
};
}
},
});
}
/**
* Simple fuzzy scoring counts matching characters in order.
*/
function fuzzyScore(needle: string, haystack: string): number {
let score = 0;
let hi = 0;
for (let ni = 0; ni < needle.length && hi < haystack.length; ni++) {
const idx = haystack.indexOf(needle[ni], hi);
if (idx >= 0) {
score++;
hi = idx + 1;
}
}
return score / Math.max(needle.length, 1);
}

View file

@ -0,0 +1,229 @@
import type { ExtensionAPI } from "@gsd/pi-coding-agent";
import { Type } from "@sinclair/typebox";
import type { ToolDeps } from "../state.js";
/**
* Structured data extraction with JSON Schema validation.
*/
export function registerExtractTools(pi: ExtensionAPI, deps: ToolDeps): void {
pi.registerTool({
name: "browser_extract",
label: "Browser Extract",
description:
"Extract structured data from the current page using CSS selectors and validate against a JSON Schema. " +
"Provide a schema describing the shape of data you want. The tool extracts data by evaluating " +
"CSS selectors in the page context, then validates the result against your schema. " +
"Supports extracting single objects or arrays of items. Waits for network idle before extraction.",
parameters: Type.Object({
schema: Type.Record(Type.String(), Type.Unknown(), {
description:
"JSON Schema describing the data shape to extract. Properties should include " +
"'_selector' (CSS selector) and '_attribute' (attribute to read, default: 'textContent') hints. " +
"Example: { type: 'object', properties: { title: { _selector: 'h1', _attribute: 'textContent' }, price: { _selector: '.price', _attribute: 'textContent' } } }",
}),
selector: Type.Optional(
Type.String({ description: "CSS selector to scope extraction to a specific container element." }),
),
multiple: Type.Optional(
Type.Boolean({
description:
"If true, extract an array of items. The 'selector' parameter becomes the item container selector, " +
"and schema properties are extracted relative to each matched container.",
}),
),
}),
async execute(_toolCallId, params, _signal, _onUpdate, _ctx) {
try {
const { page: p } = await deps.ensureBrowser();
// Wait for network idle before extraction
await p.waitForLoadState("networkidle", { timeout: 10000 }).catch(() => {});
const schema = params.schema as any;
const scopeSelector = params.selector;
const multiple = params.multiple ?? false;
// Build extraction plan from schema
const extractionPlan = buildExtractionPlan(schema);
// Execute extraction in page context
const rawData = await p.evaluate(
({ plan, scope, multi }: { plan: ExtractionField[]; scope: string | undefined; multi: boolean }) => {
function extractFromContainer(container: Element, fields: typeof plan): Record<string, unknown> {
const result: Record<string, unknown> = {};
for (const field of fields) {
const el = container.querySelector(field.selector);
if (!el) {
result[field.name] = null;
continue;
}
let value: unknown;
switch (field.attribute) {
case "textContent":
value = (el.textContent ?? "").trim();
break;
case "innerText":
value = ((el as HTMLElement).innerText ?? "").trim();
break;
case "innerHTML":
value = el.innerHTML;
break;
case "href":
value = (el as HTMLAnchorElement).href ?? el.getAttribute("href");
break;
case "src":
value = (el as HTMLImageElement).src ?? el.getAttribute("src");
break;
case "value":
value = (el as HTMLInputElement).value;
break;
default:
value = el.getAttribute(field.attribute) ?? (el.textContent ?? "").trim();
}
// Type coercion
if (field.type === "number" && typeof value === "string") {
const num = parseFloat(value.replace(/[^0-9.-]/g, ""));
value = isNaN(num) ? value : num;
} else if (field.type === "boolean" && typeof value === "string") {
value = value.toLowerCase() === "true" || value === "1";
}
result[field.name] = value;
}
return result;
}
const root = scope ? document.querySelector(scope) : document.body;
if (!root) return { data: null, error: `Scope selector "${scope}" not found` };
if (multi) {
// For multiple items, scope is the item selector
const containers = scope
? document.querySelectorAll(scope)
: [document.body];
const items = Array.from(containers).map((container) =>
extractFromContainer(container, plan),
);
return { data: items, error: null };
} else {
return { data: extractFromContainer(root, plan), error: null };
}
},
{ plan: extractionPlan, scope: scopeSelector, multi: multiple },
);
if (rawData.error) {
return {
content: [{ type: "text", text: `Extraction failed: ${rawData.error}` }],
details: { error: rawData.error },
isError: true,
};
}
// Validate against schema using ajv
const validationErrors = await validateData(rawData.data, schema, multiple);
const resultText = JSON.stringify(rawData.data, null, 2);
const truncated = resultText.length > 4000 ? resultText.slice(0, 4000) + "\n...(truncated)" : resultText;
return {
content: [{
type: "text",
text: validationErrors.length > 0
? `Extracted data (with ${validationErrors.length} validation warning(s)):\n${truncated}\n\nValidation warnings:\n${validationErrors.join("\n")}`
: `Extracted data:\n${truncated}`,
}],
details: {
data: rawData.data,
validationErrors: validationErrors.length > 0 ? validationErrors : undefined,
fieldCount: extractionPlan.length,
itemCount: multiple ? (rawData.data as any[])?.length ?? 0 : 1,
},
};
} catch (err: any) {
return {
content: [{ type: "text", text: `Extraction failed: ${err.message}` }],
details: { error: err.message },
isError: true,
};
}
},
});
}
interface ExtractionField {
name: string;
selector: string;
attribute: string;
type: string;
}
function buildExtractionPlan(schema: any): ExtractionField[] {
const fields: ExtractionField[] = [];
if (!schema || typeof schema !== "object") return fields;
const properties = schema.properties ?? schema;
for (const [name, propSchema] of Object.entries(properties)) {
const prop = propSchema as any;
if (!prop || typeof prop !== "object") continue;
// Skip meta fields
if (name === "type" || name === "required" || name === "properties" || name === "$schema") continue;
const selector = prop._selector ?? prop.selector ?? `[data-field="${name}"], .${name}, #${name}`;
const attribute = prop._attribute ?? prop.attribute ?? "textContent";
const type = prop.type ?? "string";
fields.push({ name, selector, attribute, type });
}
return fields;
}
async function validateData(data: unknown, schema: any, isArray: boolean): Promise<string[]> {
const errors: string[] = [];
try {
const ajvModule = await import("ajv");
const Ajv = ajvModule.default ?? ajvModule;
const ajv = new (Ajv as any)({ allErrors: true, strict: false });
// Clean schema — remove our custom _selector/_attribute hints before validation
const cleanSchema = cleanSchemaForValidation(schema);
// Wrap in array schema if multiple
const validationSchema = isArray
? { type: "array", items: cleanSchema }
: cleanSchema;
const validate = ajv.compile(validationSchema);
const valid = validate(data);
if (!valid && validate.errors) {
for (const err of validate.errors) {
errors.push(`${err.instancePath || "/"}: ${err.message}`);
}
}
} catch (err: any) {
errors.push(`Schema validation setup failed: ${err.message}`);
}
return errors;
}
function cleanSchemaForValidation(schema: any): any {
if (!schema || typeof schema !== "object") return schema;
if (Array.isArray(schema)) return schema.map(cleanSchemaForValidation);
const cleaned: any = {};
for (const [key, value] of Object.entries(schema)) {
if (key.startsWith("_")) continue; // Remove our custom hints
if (key === "selector" && typeof value === "string") continue; // Also remove plain 'selector'
if (key === "attribute" && typeof value === "string") continue; // Also remove plain 'attribute'
cleaned[key] = cleanSchemaForValidation(value);
}
return cleaned;
}

View file

@ -0,0 +1,221 @@
import type { ExtensionAPI } from "@gsd/pi-coding-agent";
import { Type } from "@sinclair/typebox";
import type { ToolDeps } from "../state.js";
/**
* Prompt injection detection scan page content for text attempting to hijack the agent.
*/
// Known injection patterns — regex patterns that match common prompt injection attempts
const INJECTION_PATTERNS: Array<{ pattern: RegExp; category: string; severity: "high" | "medium" | "low" }> = [
// Direct instruction override attempts
{ pattern: /ignore\s+(all\s+)?previous\s+(instructions?|prompts?)/i, category: "instruction_override", severity: "high" },
{ pattern: /disregard\s+(all\s+)?previous\s+(instructions?|prompts?)/i, category: "instruction_override", severity: "high" },
{ pattern: /forget\s+(all\s+)?previous\s+(instructions?|prompts?)/i, category: "instruction_override", severity: "high" },
{ pattern: /override\s+(all\s+)?previous\s+(instructions?|prompts?)/i, category: "instruction_override", severity: "high" },
// System prompt extraction
{ pattern: /(?:what|show|reveal|display|repeat|tell)\s+(?:me\s+)?(?:your|the)\s+system\s+prompt/i, category: "prompt_extraction", severity: "high" },
{ pattern: /print\s+(?:your|the)\s+(?:system\s+)?(?:prompt|instructions)/i, category: "prompt_extraction", severity: "high" },
// Role reassignment
{ pattern: /you\s+are\s+now\s+(?:a\s+)?(?:DAN|jailbreak|unrestricted|unfiltered)/i, category: "role_reassignment", severity: "high" },
{ pattern: /act\s+as\s+(?:a\s+)?(?:DAN|jailbreak|unrestricted|unfiltered)/i, category: "role_reassignment", severity: "high" },
{ pattern: /entering\s+(?:a\s+)?(?:developer|admin|root|sudo)\s+mode/i, category: "role_reassignment", severity: "high" },
// Hidden instruction injection
{ pattern: /\[SYSTEM\]\s*:/i, category: "fake_system_message", severity: "high" },
{ pattern: /\[INST\]\s*:/i, category: "fake_system_message", severity: "medium" },
{ pattern: /<\/?system>/i, category: "fake_system_message", severity: "high" },
// Tool/action manipulation
{ pattern: /execute\s+(?:the\s+following\s+)?(?:command|code|script)/i, category: "command_injection", severity: "medium" },
{ pattern: /run\s+(?:this|the\s+following)\s+(?:command|code|script)/i, category: "command_injection", severity: "medium" },
// Invisible text / social engineering
{ pattern: /do\s+not\s+(?:read|process|show)\s+(?:the\s+)?(?:following|rest)/i, category: "social_engineering", severity: "low" },
{ pattern: /(?:this|the\s+following)\s+(?:is|are)\s+(?:your\s+)?new\s+instructions/i, category: "instruction_override", severity: "high" },
// Base64/encoded content markers
{ pattern: /base64\s*:\s*[A-Za-z0-9+\/=]{50,}/i, category: "encoded_payload", severity: "medium" },
];
export function registerInjectionDetectionTools(pi: ExtensionAPI, deps: ToolDeps): void {
pi.registerTool({
name: "browser_check_injection",
label: "Browser Check Injection",
description:
"Scan current page content for potential prompt injection attempts. " +
"Checks visible text and hidden elements for patterns that might hijack the agent. " +
"Returns findings with severity levels. Use after navigating to untrusted pages.",
parameters: Type.Object({
includeHidden: Type.Optional(
Type.Boolean({
description:
"Also scan hidden/invisible text (default: true). " +
"Hidden text is a common vector for injection attacks.",
}),
),
}),
async execute(_toolCallId, params, _signal, _onUpdate, _ctx) {
try {
const { page: p } = await deps.ensureBrowser();
const includeHidden = params.includeHidden ?? true;
// Extract text content from the page
const pageContent = await p.evaluate((scanHidden: boolean) => {
const results: Array<{ text: string; source: string; visible: boolean }> = [];
// 1. Visible text content
const bodyText = document.body?.innerText ?? "";
results.push({ text: bodyText, source: "body_visible_text", visible: true });
// 2. Title and meta
results.push({ text: document.title, source: "page_title", visible: true });
// Meta descriptions and keywords
const metas = document.querySelectorAll("meta[name], meta[property]");
for (const meta of metas) {
const content = meta.getAttribute("content");
if (content) {
results.push({
text: content,
source: `meta:${meta.getAttribute("name") || meta.getAttribute("property")}`,
visible: false,
});
}
}
if (scanHidden) {
// 3. Hidden elements (display:none, visibility:hidden, opacity:0, off-screen, aria-hidden)
const allElements = document.querySelectorAll("*");
for (const el of allElements) {
const htmlEl = el as HTMLElement;
const style = window.getComputedStyle(htmlEl);
const isHidden =
style.display === "none" ||
style.visibility === "hidden" ||
style.opacity === "0" ||
htmlEl.getAttribute("aria-hidden") === "true" ||
(htmlEl.offsetWidth === 0 && htmlEl.offsetHeight === 0);
if (isHidden && htmlEl.textContent?.trim()) {
const text = htmlEl.textContent.trim();
if (text.length > 5 && text.length < 5000) {
results.push({ text, source: "hidden_element", visible: false });
}
}
}
// 4. HTML comments
const walker = document.createTreeWalker(
document.documentElement,
NodeFilter.SHOW_COMMENT,
);
let node;
while ((node = walker.nextNode())) {
const text = (node as Comment).textContent?.trim() ?? "";
if (text.length > 10) {
results.push({ text, source: "html_comment", visible: false });
}
}
// 5. Data attributes with text content
const dataElements = document.querySelectorAll("[data-prompt], [data-instruction], [data-system]");
for (const el of dataElements) {
for (const attr of el.attributes) {
if (attr.name.startsWith("data-") && attr.value.length > 10) {
results.push({
text: attr.value,
source: `data_attribute:${attr.name}`,
visible: false,
});
}
}
}
}
return results;
}, includeHidden);
// Scan all extracted text against injection patterns
const findings: Array<{
pattern: string;
category: string;
severity: string;
source: string;
visible: boolean;
matchedText: string;
}> = [];
for (const { text, source, visible } of pageContent) {
for (const { pattern, category, severity } of INJECTION_PATTERNS) {
const match = text.match(pattern);
if (match) {
findings.push({
pattern: pattern.source.slice(0, 60),
category,
severity,
source,
visible,
matchedText: match[0].slice(0, 100),
});
}
}
}
// Deduplicate findings by category + source
const seen = new Set<string>();
const uniqueFindings = findings.filter((f) => {
const key = `${f.category}|${f.source}|${f.matchedText}`;
if (seen.has(key)) return false;
seen.add(key);
return true;
});
const highCount = uniqueFindings.filter((f) => f.severity === "high").length;
const medCount = uniqueFindings.filter((f) => f.severity === "medium").length;
const lowCount = uniqueFindings.filter((f) => f.severity === "low").length;
if (uniqueFindings.length === 0) {
return {
content: [{
type: "text",
text: `No prompt injection patterns detected.\nScanned: ${pageContent.length} text regions (hidden: ${includeHidden})`,
}],
details: {
clean: true,
scannedRegions: pageContent.length,
includeHidden,
},
};
}
const findingLines = uniqueFindings.map((f) =>
` [${f.severity.toUpperCase()}] ${f.category} in ${f.source}${!f.visible ? " (HIDDEN)" : ""}: "${f.matchedText}"`,
);
return {
content: [{
type: "text",
text: `⚠️ Prompt injection patterns detected: ${uniqueFindings.length} finding(s)\nHigh: ${highCount} | Medium: ${medCount} | Low: ${lowCount}\n\n${findingLines.join("\n")}\n\n⚠ This page may be attempting to manipulate the agent. Proceed with caution.`,
}],
details: {
clean: false,
findings: uniqueFindings,
counts: { high: highCount, medium: medCount, low: lowCount, total: uniqueFindings.length },
scannedRegions: pageContent.length,
includeHidden,
},
};
} catch (err: any) {
return {
content: [{ type: "text", text: `Injection check failed: ${err.message}` }],
details: { error: err.message },
isError: true,
};
}
},
});
}

View file

@ -0,0 +1,244 @@
import type { ExtensionAPI } from "@gsd/pi-coding-agent";
import { Type } from "@sinclair/typebox";
import type { ToolDeps } from "../state.js";
/**
* Network interception & mocking tools mock API responses, block URLs, simulate errors.
*/
interface ActiveRoute {
id: number;
pattern: string;
type: "mock" | "block";
status?: number;
delay?: number;
description: string;
}
let nextRouteId = 1;
const activeRoutes: ActiveRoute[] = [];
const routeCleanups: Map<number, () => Promise<void>> = new Map();
export function registerNetworkMockTools(pi: ExtensionAPI, deps: ToolDeps): void {
// -------------------------------------------------------------------------
// browser_mock_route
// -------------------------------------------------------------------------
pi.registerTool({
name: "browser_mock_route",
label: "Browser Mock Route",
description:
"Intercept network requests matching a URL pattern and respond with custom status, body, and headers. " +
"Supports simulating slow responses via delay parameter. " +
"Routes survive page navigation within the same context. Use browser_clear_routes to remove all mocks.",
parameters: Type.Object({
url: Type.String({
description: "URL pattern to intercept. Supports glob patterns (e.g., '**/api/users*') or exact URLs.",
}),
status: Type.Optional(
Type.Number({ description: "HTTP status code for the mock response (default: 200)." }),
),
body: Type.Optional(
Type.String({ description: "Response body string. For JSON responses, pass a JSON string." }),
),
contentType: Type.Optional(
Type.String({ description: "Content-Type header (default: 'application/json' if body looks like JSON, else 'text/plain')." }),
),
headers: Type.Optional(
Type.Record(Type.String(), Type.String(), {
description: "Additional response headers as key-value pairs.",
}),
),
delay: Type.Optional(
Type.Number({ description: "Delay in milliseconds before sending the response. Simulates slow responses." }),
),
}),
async execute(_toolCallId, params, _signal, _onUpdate, _ctx) {
try {
const { page: p } = await deps.ensureBrowser();
const routeId = nextRouteId++;
const status = params.status ?? 200;
const body = params.body ?? "";
const delay = params.delay ?? 0;
// Auto-detect content type
let contentType = params.contentType;
if (!contentType) {
try {
JSON.parse(body);
contentType = "application/json";
} catch {
contentType = "text/plain";
}
}
const headers: Record<string, string> = {
"content-type": contentType,
"access-control-allow-origin": "*",
...(params.headers ?? {}),
};
const handler = async (route: any) => {
if (delay > 0) {
await new Promise((resolve) => setTimeout(resolve, delay));
}
await route.fulfill({
status,
body,
headers,
});
};
await p.route(params.url, handler);
const cleanup = async () => {
try {
await p.unroute(params.url, handler);
} catch {
// Page may be closed
}
};
const routeInfo: ActiveRoute = {
id: routeId,
pattern: params.url,
type: "mock",
status,
delay: delay > 0 ? delay : undefined,
description: `Mock ${params.url}${status}${delay > 0 ? ` (${delay}ms delay)` : ""}`,
};
activeRoutes.push(routeInfo);
routeCleanups.set(routeId, cleanup);
return {
content: [{
type: "text",
text: `Route mocked: ${routeInfo.description}\nRoute ID: ${routeId}\nActive routes: ${activeRoutes.length}`,
}],
details: { routeId, ...routeInfo, activeRouteCount: activeRoutes.length },
};
} catch (err: any) {
return {
content: [{ type: "text", text: `Mock route failed: ${err.message}` }],
details: { error: err.message },
isError: true,
};
}
},
});
// -------------------------------------------------------------------------
// browser_block_urls
// -------------------------------------------------------------------------
pi.registerTool({
name: "browser_block_urls",
label: "Browser Block URLs",
description:
"Block network requests matching URL patterns. Useful for blocking analytics, ads, or third-party scripts. " +
"Accepts glob patterns. Routes survive page navigation.",
parameters: Type.Object({
patterns: Type.Array(Type.String(), {
description: "URL patterns to block (glob syntax, e.g., ['**/analytics*', '**/ads*']).",
}),
}),
async execute(_toolCallId, params, _signal, _onUpdate, _ctx) {
try {
const { page: p } = await deps.ensureBrowser();
const results: ActiveRoute[] = [];
for (const pattern of params.patterns) {
const routeId = nextRouteId++;
const handler = async (route: any) => {
await route.abort("blockedbyclient");
};
await p.route(pattern, handler);
const cleanup = async () => {
try {
await p.unroute(pattern, handler);
} catch {}
};
const routeInfo: ActiveRoute = {
id: routeId,
pattern,
type: "block",
description: `Block ${pattern}`,
};
activeRoutes.push(routeInfo);
routeCleanups.set(routeId, cleanup);
results.push(routeInfo);
}
return {
content: [{
type: "text",
text: `Blocked ${results.length} URL pattern(s):\n${results.map((r) => ` - ${r.description} (ID: ${r.id})`).join("\n")}\nActive routes: ${activeRoutes.length}`,
}],
details: { blocked: results, activeRouteCount: activeRoutes.length },
};
} catch (err: any) {
return {
content: [{ type: "text", text: `Block URLs failed: ${err.message}` }],
details: { error: err.message },
isError: true,
};
}
},
});
// -------------------------------------------------------------------------
// browser_clear_routes
// -------------------------------------------------------------------------
pi.registerTool({
name: "browser_clear_routes",
label: "Browser Clear Routes",
description:
"Remove all active route mocks and URL blocks. Also lists currently active routes if called with no routes active.",
parameters: Type.Object({}),
async execute(_toolCallId, _params, _signal, _onUpdate, _ctx) {
try {
await deps.ensureBrowser();
const count = activeRoutes.length;
if (count === 0) {
return {
content: [{ type: "text", text: "No active routes to clear." }],
details: { cleared: 0 },
};
}
const routeDescriptions = activeRoutes.map((r) => r.description);
// Clean up all routes
for (const [id, cleanup] of routeCleanups) {
await cleanup();
}
activeRoutes.length = 0;
routeCleanups.clear();
return {
content: [{
type: "text",
text: `Cleared ${count} route(s):\n${routeDescriptions.map((d) => ` - ${d}`).join("\n")}`,
}],
details: { cleared: count, routes: routeDescriptions },
};
} catch (err: any) {
return {
content: [{ type: "text", text: `Clear routes failed: ${err.message}` }],
details: { error: err.message },
isError: true,
};
}
},
});
}

View file

@ -0,0 +1,92 @@
import type { ExtensionAPI } from "@gsd/pi-coding-agent";
import { Type } from "@sinclair/typebox";
import type { ToolDeps } from "../state.js";
export function registerPdfTools(pi: ExtensionAPI, deps: ToolDeps): void {
pi.registerTool({
name: "browser_save_pdf",
label: "Browser Save PDF",
description:
"Render current page as PDF artifact via Playwright's page.pdf(). " +
"Supports A4/Letter/custom page formats and optional background graphics. " +
"Writes to session artifacts directory. Chromium only.",
parameters: Type.Object({
filename: Type.Optional(
Type.String({ description: "Output filename (default: auto-generated from page title + timestamp)." }),
),
format: Type.Optional(
Type.String({
description:
"Page format: 'A4' (default), 'Letter', 'Legal', 'Tabloid', or custom like '8.5in x 11in'. " +
"Custom format uses CSS dimension syntax for width x height.",
}),
),
printBackground: Type.Optional(
Type.Boolean({ description: "Include background graphics (default: true)." }),
),
}),
async execute(_toolCallId, params, _signal, _onUpdate, _ctx) {
try {
const { page: p } = await deps.ensureBrowser();
const url = p.url();
const title = await p.title().catch(() => "untitled");
// Resolve filename
const timestamp = deps.formatArtifactTimestamp(Date.now());
const safeName = deps.sanitizeArtifactName(params.filename || `${title}-${timestamp}`, `pdf-${timestamp}`);
const filename = safeName.endsWith(".pdf") ? safeName : `${safeName}.pdf`;
// Resolve format
const knownFormats = new Set(["A4", "Letter", "Legal", "Tabloid", "Ledger", "A0", "A1", "A2", "A3", "A5", "A6"]);
const formatInput = params.format ?? "A4";
let pdfOptions: Record<string, unknown> = {};
if (knownFormats.has(formatInput)) {
pdfOptions.format = formatInput;
} else {
// Custom format: parse "WIDTHin x HEIGHTin" or "WIDTHcm x HEIGHTcm" etc.
const customMatch = formatInput.match(/^(.+?)\s*[xX×]\s*(.+)$/);
if (customMatch) {
pdfOptions.width = customMatch[1]!.trim();
pdfOptions.height = customMatch[2]!.trim();
} else {
pdfOptions.format = "A4"; // fallback
}
}
pdfOptions.printBackground = params.printBackground ?? true;
// Generate PDF
await deps.ensureSessionArtifactDir();
const outputPath = deps.buildSessionArtifactPath(filename);
pdfOptions.path = outputPath;
await p.pdf(pdfOptions as any);
// Read file size
const { stat } = await import("node:fs/promises");
const fileStat = await stat(outputPath);
const sizeBytes = fileStat.size;
const sizeKB = (sizeBytes / 1024).toFixed(1);
return {
content: [
{
type: "text",
text: `PDF saved: ${outputPath}\nSize: ${sizeKB} KB\nFormat: ${formatInput}\nPage: ${title}\nURL: ${url}`,
},
],
details: { path: outputPath, sizeBytes, format: formatInput, pageUrl: url, pageTitle: title },
};
} catch (err: any) {
return {
content: [{ type: "text", text: `PDF generation failed: ${err.message}` }],
details: { error: err.message },
isError: true,
};
}
},
});
}

View file

@ -0,0 +1,202 @@
import type { ExtensionAPI } from "@gsd/pi-coding-agent";
import { Type } from "@sinclair/typebox";
import type { ToolDeps } from "../state.js";
/**
* State persistence tools save/restore cookies, localStorage, sessionStorage.
*/
const STATE_DIR = ".gsd/browser-state";
export function registerStatePersistenceTools(pi: ExtensionAPI, deps: ToolDeps): void {
// -------------------------------------------------------------------------
// browser_save_state
// -------------------------------------------------------------------------
pi.registerTool({
name: "browser_save_state",
label: "Browser Save State",
description:
"Save cookies, localStorage, and sessionStorage to disk so authenticated sessions survive browser restarts. " +
"State files are written to .gsd/browser-state/ and should be gitignored (may contain auth tokens). " +
"Never displays secret values in output.",
parameters: Type.Object({
name: Type.Optional(
Type.String({ description: "Name for the state file (default: 'default'). Used as the filename stem." }),
),
}),
async execute(_toolCallId, params, _signal, _onUpdate, _ctx) {
try {
const { context: ctx, page: p } = await deps.ensureBrowser();
const name = deps.sanitizeArtifactName(params.name ?? "default", "default");
const { mkdir, writeFile } = await import("node:fs/promises");
const path = await import("node:path");
const stateDir = path.resolve(process.cwd(), STATE_DIR);
await mkdir(stateDir, { recursive: true });
// 1. Playwright storageState: cookies + localStorage
const storageState = await ctx.storageState();
// 2. sessionStorage: must be extracted per-origin via page.evaluate
const sessionStorageData: Record<string, Record<string, string>> = {};
try {
const origin = new URL(p.url()).origin;
const ssData = await p.evaluate(() => {
const data: Record<string, string> = {};
for (let i = 0; i < sessionStorage.length; i++) {
const key = sessionStorage.key(i);
if (key) data[key] = sessionStorage.getItem(key) ?? "";
}
return data;
});
if (Object.keys(ssData).length > 0) {
sessionStorageData[origin] = ssData;
}
} catch {
// Page may not have a valid origin (about:blank, etc.)
}
const combined = {
storageState,
sessionStorage: sessionStorageData,
savedAt: new Date().toISOString(),
url: p.url(),
};
const filePath = path.join(stateDir, `${name}.json`);
await writeFile(filePath, JSON.stringify(combined, null, 2));
// Ensure .gitignore covers the state dir
const gitignorePath = path.resolve(process.cwd(), STATE_DIR, ".gitignore");
await writeFile(gitignorePath, "*\n!.gitignore\n").catch(() => {});
const cookieCount = storageState.cookies?.length ?? 0;
const localStorageOrigins = storageState.origins?.length ?? 0;
const sessionStorageOrigins = Object.keys(sessionStorageData).length;
return {
content: [{
type: "text",
text: `State saved: ${filePath}\nCookies: ${cookieCount}\nlocalStorage origins: ${localStorageOrigins}\nsessionStorage origins: ${sessionStorageOrigins}`,
}],
details: {
path: filePath,
cookieCount,
localStorageOrigins,
sessionStorageOrigins,
},
};
} catch (err: any) {
return {
content: [{ type: "text", text: `Save state failed: ${err.message}` }],
details: { error: err.message },
isError: true,
};
}
},
});
// -------------------------------------------------------------------------
// browser_restore_state
// -------------------------------------------------------------------------
pi.registerTool({
name: "browser_restore_state",
label: "Browser Restore State",
description:
"Restore cookies, localStorage, and sessionStorage from a previously saved state file. " +
"Injects cookies via context.addCookies() and storage via page.evaluate(). " +
"For full fidelity, restore before navigating to the target site.",
parameters: Type.Object({
name: Type.Optional(
Type.String({ description: "Name of the state file to restore (default: 'default')." }),
),
}),
async execute(_toolCallId, params, _signal, _onUpdate, _ctx) {
try {
const { context: ctx, page: p } = await deps.ensureBrowser();
const name = deps.sanitizeArtifactName(params.name ?? "default", "default");
const { readFile } = await import("node:fs/promises");
const path = await import("node:path");
const filePath = path.join(process.cwd(), STATE_DIR, `${name}.json`);
let raw: string;
try {
raw = await readFile(filePath, "utf-8");
} catch {
return {
content: [{ type: "text", text: `State file not found: ${filePath}` }],
details: { error: "file_not_found", path: filePath },
isError: true,
};
}
const combined = JSON.parse(raw);
const storageState = combined.storageState;
const sessionStorageData: Record<string, Record<string, string>> = combined.sessionStorage ?? {};
// 1. Restore cookies
let cookieCount = 0;
if (storageState?.cookies?.length) {
await ctx.addCookies(storageState.cookies);
cookieCount = storageState.cookies.length;
}
// 2. Restore localStorage via page.evaluate
let localStorageOrigins = 0;
if (storageState?.origins?.length) {
for (const origin of storageState.origins) {
try {
await p.evaluate((items: Array<{ name: string; value: string }>) => {
for (const { name, value } of items) {
localStorage.setItem(name, value);
}
}, origin.localStorage ?? []);
localStorageOrigins++;
} catch {
// Origin mismatch — localStorage can only be set on matching origin
}
}
}
// 3. Restore sessionStorage via page.evaluate
let sessionStorageOrigins = 0;
for (const [_origin, data] of Object.entries(sessionStorageData)) {
try {
await p.evaluate((items: Record<string, string>) => {
for (const [key, value] of Object.entries(items)) {
sessionStorage.setItem(key, value);
}
}, data);
sessionStorageOrigins++;
} catch {
// Origin mismatch
}
}
return {
content: [{
type: "text",
text: `State restored from: ${filePath}\nCookies: ${cookieCount}\nlocalStorage origins: ${localStorageOrigins}\nsessionStorage origins: ${sessionStorageOrigins}\nSaved at: ${combined.savedAt ?? "unknown"}`,
}],
details: {
path: filePath,
cookieCount,
localStorageOrigins,
sessionStorageOrigins,
savedAt: combined.savedAt,
savedUrl: combined.url,
},
};
} catch (err: any) {
return {
content: [{ type: "text", text: `Restore state failed: ${err.message}` }],
details: { error: err.message },
isError: true,
};
}
},
});
}

View file

@ -0,0 +1,209 @@
import type { ExtensionAPI } from "@gsd/pi-coding-agent";
import { Type } from "@sinclair/typebox";
import type { ToolDeps } from "../state.js";
/**
* Visual regression diffing compare current page screenshot against a stored baseline.
*/
const BASELINE_DIR = ".gsd/browser-baselines";
export function registerVisualDiffTools(pi: ExtensionAPI, deps: ToolDeps): void {
pi.registerTool({
name: "browser_visual_diff",
label: "Browser Visual Diff",
description:
"Compare current page screenshot against a stored baseline pixel-by-pixel. " +
"Returns similarity score (01), diff pixel count, and optionally generates a diff image highlighting changes. " +
"On first run with no baseline, saves the current screenshot as the baseline. " +
"Baselines are stored in .gsd/browser-baselines/ (gitignored, environment-specific).",
parameters: Type.Object({
name: Type.Optional(
Type.String({
description:
"Baseline name (default: auto-generated from URL + viewport). " +
"Use consistent names to compare the same view across runs.",
}),
),
selector: Type.Optional(
Type.String({
description: "CSS selector to scope comparison to a specific element instead of full viewport.",
}),
),
threshold: Type.Optional(
Type.Number({
description:
"Pixel matching threshold 01 (default: 0.1). " +
"Higher values are more tolerant of anti-aliasing and rendering differences.",
}),
),
updateBaseline: Type.Optional(
Type.Boolean({
description: "If true, overwrite the existing baseline with the current screenshot (default: false).",
}),
),
}),
async execute(_toolCallId, params, _signal, _onUpdate, _ctx) {
try {
const { page: p } = await deps.ensureBrowser();
const { mkdir, readFile, writeFile } = await import("node:fs/promises");
const pathMod = await import("node:path");
const baselineDir = pathMod.resolve(process.cwd(), BASELINE_DIR);
await mkdir(baselineDir, { recursive: true });
// Ensure .gitignore
const gitignorePath = pathMod.join(baselineDir, ".gitignore");
await writeFile(gitignorePath, "*\n!.gitignore\n").catch(() => {});
// Generate baseline name
const url = p.url();
const viewport = p.viewportSize();
const vpSuffix = viewport ? `${viewport.width}x${viewport.height}` : "unknown";
const autoName = deps.sanitizeArtifactName(
`${new URL(url).pathname.replace(/\//g, "-")}-${vpSuffix}`,
`baseline-${vpSuffix}`,
);
const name = deps.sanitizeArtifactName(params.name ?? autoName, autoName);
const baselinePath = pathMod.join(baselineDir, `${name}.png`);
const diffPath = pathMod.join(baselineDir, `${name}-diff.png`);
// Capture current screenshot as PNG (needed for pixel comparison)
let currentBuffer: Buffer;
if (params.selector) {
const locator = p.locator(params.selector).first();
currentBuffer = await locator.screenshot({ type: "png" });
} else {
currentBuffer = await p.screenshot({ type: "png", fullPage: false });
}
// Check if baseline exists
let baselineBuffer: Buffer | null = null;
try {
baselineBuffer = await readFile(baselinePath) as Buffer;
} catch {
// No baseline yet
}
if (!baselineBuffer || params.updateBaseline) {
// Save as new baseline
await writeFile(baselinePath, currentBuffer);
return {
content: [{
type: "text",
text: baselineBuffer
? `Baseline updated: ${baselinePath}\nSize: ${(currentBuffer.length / 1024).toFixed(1)} KB`
: `Baseline created (first run): ${baselinePath}\nSize: ${(currentBuffer.length / 1024).toFixed(1)} KB\nRe-run to compare against this baseline.`,
}],
details: {
baselinePath,
baselineCreated: !baselineBuffer,
baselineUpdated: !!baselineBuffer,
sizeBytes: currentBuffer.length,
},
};
}
// Perform pixel comparison using sharp for PNG decoding
const sharp = (await import("sharp")).default;
const baselineMeta = await sharp(baselineBuffer).metadata();
const currentMeta = await sharp(currentBuffer).metadata();
const bWidth = baselineMeta.width ?? 0;
const bHeight = baselineMeta.height ?? 0;
const cWidth = currentMeta.width ?? 0;
const cHeight = currentMeta.height ?? 0;
// If dimensions differ, report mismatch
if (bWidth !== cWidth || bHeight !== cHeight) {
return {
content: [{
type: "text",
text: `Dimension mismatch: baseline is ${bWidth}x${bHeight}, current is ${cWidth}x${cHeight}. Cannot compare.\nUse updateBaseline: true to reset.`,
}],
details: {
match: false,
dimensionMismatch: true,
baselineDimensions: { width: bWidth, height: bHeight },
currentDimensions: { width: cWidth, height: cHeight },
},
};
}
// Extract raw RGBA pixel data
const baselineRaw = await sharp(baselineBuffer).ensureAlpha().raw().toBuffer();
const currentRaw = await sharp(currentBuffer).ensureAlpha().raw().toBuffer();
const width = bWidth;
const height = bHeight;
const totalPixels = width * height;
const threshold = params.threshold ?? 0.1;
// Simple pixel-by-pixel comparison (avoiding pixelmatch dependency)
const diffData = Buffer.alloc(width * height * 4);
let diffPixels = 0;
const thresholdSq = threshold * threshold * 255 * 255 * 3;
for (let i = 0; i < totalPixels; i++) {
const offset = i * 4;
const dr = baselineRaw[offset] - currentRaw[offset];
const dg = baselineRaw[offset + 1] - currentRaw[offset + 1];
const db = baselineRaw[offset + 2] - currentRaw[offset + 2];
const distSq = dr * dr + dg * dg + db * db;
if (distSq > thresholdSq) {
diffPixels++;
// Mark diff pixels as red
diffData[offset] = 255; // R
diffData[offset + 1] = 0; // G
diffData[offset + 2] = 0; // B
diffData[offset + 3] = 255; // A
} else {
// Dim unchanged pixels
diffData[offset] = currentRaw[offset] >> 1;
diffData[offset + 1] = currentRaw[offset + 1] >> 1;
diffData[offset + 2] = currentRaw[offset + 2] >> 1;
diffData[offset + 3] = 255;
}
}
const similarity = 1 - (diffPixels / totalPixels);
const match = diffPixels === 0;
// Save diff image
await sharp(diffData, { raw: { width, height, channels: 4 } })
.png()
.toFile(diffPath);
return {
content: [{
type: "text",
text: match
? `Visual diff: MATCH (100% similar)\nBaseline: ${baselinePath}`
: `Visual diff: ${(similarity * 100).toFixed(2)}% similar\nDiff pixels: ${diffPixels} of ${totalPixels} (${((diffPixels / totalPixels) * 100).toFixed(2)}%)\nDiff image: ${diffPath}\nBaseline: ${baselinePath}`,
}],
details: {
match,
similarity,
diffPixels,
totalPixels,
diffPercentage: (diffPixels / totalPixels) * 100,
dimensions: { width, height },
baselinePath,
diffImagePath: match ? undefined : diffPath,
threshold,
},
};
} catch (err: any) {
return {
content: [{ type: "text", text: `Visual diff failed: ${err.message}` }],
details: { error: err.message },
isError: true,
};
}
},
});
}

View file

@ -0,0 +1,104 @@
import type { ExtensionAPI } from "@gsd/pi-coding-agent";
import { Type } from "@sinclair/typebox";
import type { ToolDeps } from "../state.js";
/**
* Region zoom / high-res capture capture and upscale specific page regions.
*/
export function registerZoomTools(pi: ExtensionAPI, deps: ToolDeps): void {
pi.registerTool({
name: "browser_zoom_region",
label: "Browser Zoom Region",
description:
"Capture and optionally upscale a specific rectangular region of the page for detailed inspection. " +
"Useful for dense UIs where full-page screenshots have text too small to read. " +
"Returns the region as an inline image, same as browser_screenshot.",
parameters: Type.Object({
x: Type.Number({ description: "Left coordinate of the region in CSS pixels." }),
y: Type.Number({ description: "Top coordinate of the region in CSS pixels." }),
width: Type.Number({ description: "Width of the region in CSS pixels." }),
height: Type.Number({ description: "Height of the region in CSS pixels." }),
scale: Type.Optional(
Type.Number({
description: "Upscale factor (default: 2). Use 1 for native resolution, 2-4 for zoomed detail.",
}),
),
}),
async execute(_toolCallId, params, _signal, _onUpdate, _ctx) {
try {
const { page: p } = await deps.ensureBrowser();
const { x, y, width, height } = params;
const scale = params.scale ?? 2;
// Validate dimensions
if (width <= 0 || height <= 0) {
return {
content: [{ type: "text", text: "Width and height must be positive." }],
details: { error: "invalid_dimensions" },
isError: true,
};
}
// Capture the region using Playwright's clip option
const regionBuffer = await p.screenshot({
type: "png",
clip: { x, y, width, height },
});
let outputBuffer: Buffer = regionBuffer;
let outputMime = "image/png";
// Upscale if scale > 1
if (scale > 1) {
const sharp = (await import("sharp")).default;
const targetWidth = Math.round(width * scale);
const targetHeight = Math.round(height * scale);
outputBuffer = await sharp(regionBuffer)
.resize(targetWidth, targetHeight, {
kernel: "lanczos3",
fit: "fill",
})
.png()
.toBuffer();
}
const base64Data = outputBuffer.toString("base64");
const title = await p.title();
const url = p.url();
return {
content: [
{
type: "text",
text: `Region capture: ${width}x${height} at (${x},${y})${scale > 1 ? ` upscaled ${scale}x to ${Math.round(width * scale)}x${Math.round(height * scale)}` : ""}\nPage: ${title}\nURL: ${url}`,
},
{
type: "image",
data: base64Data,
mimeType: outputMime,
},
],
details: {
region: { x, y, width, height },
scale,
outputDimensions: {
width: Math.round(width * scale),
height: Math.round(height * scale),
},
title,
url,
},
};
} catch (err: any) {
return {
content: [{ type: "text", text: `Region zoom failed: ${err.message}` }],
details: { error: err.message },
isError: true,
};
}
},
});
}