feat(M002/S05): Intent-ranked retrieval and semantic actions

Tasks: - chore(M002/S05): auto-commit after complete-slice - chore(M002/S05): auto-commit after complete-slice - chore(M002/S05/T01): auto-commit after execute-task - chore(M002/S05/T01): auto-commit after execute-task - chore(M002/S05): auto-commit after plan-slice - docs(S05): add slice plan Branch: gsd/M002/S05
2026-03-13 00:31:37 -06:00 · 2026-03-13 00:31:37 -06:00 · 8c549bd9c7
commit 8c549bd9c7
parent 57f6b0fd90
9 changed files with 768 additions and 13 deletions
--- a/.gsd/DECISIONS.md
+++ b/.gsd/DECISIONS.md
@ -28,3 +28,5 @@
 | D020 | M002/S04 | pattern | Form analysis evaluate location | Form analysis evaluate logic lives in tools/forms.ts, not extracted to evaluate-helpers.ts | Form-specific, not a shared utility. The label resolution heuristic is only used by form tools. Keeping it local avoids bloating the shared injection. | Yes — if S05 intent tools need label resolution |
 | D021 | M002/S04 | pattern | Fill uses Playwright APIs, not evaluate | browser_fill_form uses Playwright locator.fill()/selectOption()/setChecked() instead of page.evaluate() value setting | Playwright APIs trigger proper input/change events and handle framework-specific reactivity (React, Vue). Direct value setting via evaluate skips event dispatch and breaks reactive frameworks. | No |
 | D022 | M002/S04 | pattern | Fill field matching priority | Label (exact → case-insensitive) → name → placeholder → aria-label | Label is the most human-readable identifier. Name is the most reliable programmatic identifier. Placeholder and aria-label are fallbacks. Exact match before fuzzy prevents wrong-field fills. | Yes — if real-world usage shows a different priority works better |
+| D023 | M002/S05 | pattern | Intent scoring model | 4 orthogonal dimensions per intent, each 0-1, summed and clamped | Consistent scoring structure across all 8 intents. Makes scoring testable and debuggable — each dimension has a named reason. 4 dimensions balance discrimination vs complexity. | Yes — could add/remove dimensions per intent if real-world usage shows imbalance |
+| D024 | M002/S05 | pattern | search_field action type | Focus instead of click for search_field intent in browser_act | Search fields need keyboard focus for typing, not a click that might submit or toggle. Focus is the semantically correct action. Other intents use click. | Yes — if focus proves unreliable on specific input implementations |
--- a/.gsd/PROJECT.md
+++ b/.gsd/PROJECT.md
@ -16,7 +16,7 @@ The GSD extension is fully functional with:
 - Guided `/gsd` wizard flow
 - `secure_env_collect` tool with masked TUI input, multi-destination write support, guidance display, and summary screen
 - Proactive secret management: planning prompts forecast secrets, manifests persist them, auto-mode collects them before first dispatch
- Browser-tools extension with 45 registered tools covering navigation, interaction, inspection, verification, tracing, debugging, and form intelligence (browser_analyze_form, browser_fill_form)
+- Browser-tools extension with 47 registered tools covering navigation, interaction, inspection, verification, tracing, debugging, form intelligence (browser_analyze_form, browser_fill_form), and intent-ranked retrieval and semantic actions (browser_find_best, browser_act)
 - Browser-tools `core.js` with shared utilities for action timeline, page registry, state diffing, assertions, fingerprinting

 ## Architecture / Key Patterns
@ -26,7 +26,7 @@ The GSD extension is fully functional with:
 - **Secrets gate**: `startAuto()` checks `getManifestStatus()` before first dispatch
 - **Disk-driven state**: `.gsd/` files are the source of truth, `STATE.md` is derived cache
 - **File parsing**: `files.ts` has markdown parsers for all GSD file types
- **Browser-tools**: Modular structure — slim `index.ts` orchestrator, 8 focused infrastructure modules (state.ts, utils.ts, evaluate-helpers.ts, lifecycle.ts, capture.ts, settle.ts, refs.ts), 10 categorized tool files under `tools/` (including forms.ts), shared infrastructure in `core.js` (~1000 lines). Browser-side utilities injected once via `addInitScript` under `window.__pi` namespace. Uses Playwright for browser control. Accessibility-first state representation, deterministic versioned refs, adaptive DOM settling, compact post-action summaries. Form tools use Playwright locator APIs for type-aware filling with structured result reporting.
+- **Browser-tools**: Modular structure — slim `index.ts` orchestrator, 8 focused infrastructure modules (state.ts, utils.ts, evaluate-helpers.ts, lifecycle.ts, capture.ts, settle.ts, refs.ts), 11 categorized tool files under `tools/` (including forms.ts, intent.ts), shared infrastructure in `core.js` (~1000 lines). Browser-side utilities injected once via `addInitScript` under `window.__pi` namespace. Uses Playwright for browser control. Accessibility-first state representation, deterministic versioned refs, adaptive DOM settling, compact post-action summaries. Form tools use Playwright locator APIs for type-aware filling with structured result reporting. Intent tools use deterministic 4-dimension heuristic scoring for element retrieval and one-call semantic actions.
 - **Prompt templates**: `prompts/` directory with mustache-like `{{var}}` substitution
 - **TUI components**: `@gsd/pi-tui` provides `Editor`, `Text`, key handling, themes
 - **Branch-per-slice**: git branches isolate slice work, squash-merged to main on completion
--- a/.gsd/REQUIREMENTS.md
+++ b/.gsd/REQUIREMENTS.md
@ -50,24 +50,24 @@ This file is the explicit capability and coverage contract for the project.

 ### R024 — Intent-ranked element retrieval (browser_find_best)
 - Class: core-capability
- Status: active
+- Status: validated
 - Description: A browser_find_best tool that takes an intent string (e.g. "submit form", "close dialog", "primary CTA") and returns scored candidates with reasons, using deterministic heuristic ranking.
 - Why it matters: The agent frequently needs "which button submits this form?" Currently it does browser_find → gets 15 candidates → reasons about which one. A heuristic ranker cuts a round trip and reduces reasoning tokens.
 - Source: user
 - Primary owning slice: M002/S05
 - Supporting slices: M002/S01
- Validation: unmapped
+- Validation: 8 intents implemented with 4-dimension scoring (submit_form, close_dialog, primary_cta, search_field, next_step, dismiss, auth_action, back_navigation). Each returns up to 5 candidates sorted by score with CSS selectors and reason strings. Intent normalization accepts underscores/spaces/hyphens. Verified via Playwright tests against real HTML pages with differentiated rankings. Build passes, tool count = 47.
 - Notes: Deterministic heuristics only. No hidden LLM calls.

 ### R025 — Semantic action tool (browser_act)
 - Class: core-capability
- Status: active
+- Status: validated
 - Description: A browser_act tool that takes a semantic intent (e.g. "submit the current form", "close the active modal", "click the primary CTA") and executes the obvious action sequence internally.
 - Why it matters: Each of these common micro-tasks currently takes 2-4 tool calls. browser_act collapses them into one.
 - Source: user
 - Primary owning slice: M002/S05
 - Supporting slices: M002/S04
- Validation: unmapped
+- Validation: Resolves top candidate via same scoring engine as browser_find_best. Executes via Playwright locator.click() with getByRole fallback (focus for search_field). Settles via settleAfterActionAdaptive, returns before/after diff. Zero-candidate returns isError:true without throwing. Verified via Playwright test scripts. Build passes, tool count = 47.
 - Notes: Builds on browser_find_best for element selection. Bounded — does not loop or retry.

 ### R026 — Test coverage for new and refactored code
@ -345,16 +345,16 @@ This file is the explicit capability and coverage contract for the project.
 | R021 | core-capability | validated | M002/S03 | none | screenshot param default false, capture gated, browser_reload unchanged, build passes |
 | R022 | core-capability | validated | M002/S04 | M002/S01 | 7-level label resolution, form auto-detection, verified against 12-field test form |
 | R023 | core-capability | validated | M002/S04 | M002/S01 | 5-strategy field resolution, type-aware fill, verified end-to-end with 10 fields |
-| R024 | core-capability | active | M002/S05 | M002/S01 | unmapped |
-| R025 | core-capability | active | M002/S05 | M002/S04 | unmapped |
+| R024 | core-capability | validated | M002/S05 | M002/S01 | 8-intent scoring, Playwright tests, differentiated rankings, build passes |
+| R025 | core-capability | validated | M002/S05 | M002/S04 | top candidate execution via Playwright locator, settle + diff, graceful error, build passes |
 | R026 | quality-attribute | active | M002/S06 | all M002 | unmapped |
 | R027 | core-capability | deferred | none | none | unmapped |
 | R028 | anti-feature | out-of-scope | none | none | n/a |

 ## Coverage Summary

- Active requirements: 3
- Validated requirements: 19
+- Active requirements: 1
+- Validated requirements: 21
 - Deferred requirements: 3
 - Out of scope: 3
 - Unmapped active requirements: 3
--- a/.gsd/STATE.md
+++ b/.gsd/STATE.md
@ -1,7 +1,7 @@
 # GSD State

 **Active Milestone:** M002 — Browser Tools Performance & Intelligence
-**Active Slice:** S05 — Intent-ranked retrieval and semantic actions
+**Active Slice:** S06 — Test coverage
 **Phase:** planning
 **Requirements Status:** 7 active · 15 validated · 3 deferred · 3 out of scope

@ -16,4 +16,4 @@
 - None

 ## Next Action
-Plan slice S05 (Intent-ranked retrieval and semantic actions).
+Plan slice S06 (Test coverage).
--- a/.gsd/milestones/M002/M002-ROADMAP.md
+++ b/.gsd/milestones/M002/M002-ROADMAP.md
@ -70,7 +70,7 @@ This milestone is complete only when all are true:
 - [x] **S04: Form intelligence** `risk:medium` `depends:[S01]`
  > After this: browser_analyze_form returns field inventory (labels, types, required, values, validation) for any form; browser_fill_form fills fields by label/name/placeholder mapping and optionally submits — verified by running both tools against a real multi-field form.

- [ ] **S05: Intent-ranked retrieval and semantic actions** `risk:medium` `depends:[S01]`
+- [x] **S05: Intent-ranked retrieval and semantic actions** `risk:medium` `depends:[S01]`
  > After this: browser_find_best returns scored candidates for intents like "submit form", "close dialog", "primary CTA"; browser_act executes common micro-tasks in one call — verified by running both tools against real pages.

 - [ ] **S06: Test coverage** `risk:low` `depends:[S01,S02,S03,S04,S05]`
--- a/.gsd/milestones/M002/slices/S05/S05-PLAN.md
+++ b/.gsd/milestones/M002/slices/S05/S05-PLAN.md
@ -0,0 +1,52 @@
+# S05: Intent-ranked retrieval and semantic actions
+
+**Goal:** `browser_find_best` returns scored candidates for semantic intents; `browser_act` resolves the top candidate and executes it in one call.
+**Demo:** Run `browser_find_best` with intent "submit_form" against a real page with a form and get ranked candidates. Run `browser_act` with intent "close_dialog" against a page with a modal and see it dismissed.
+
+## Must-Haves
+
+- `browser_find_best` registered and functional with 8 intents: submit_form, close_dialog, primary_cta, search_field, next_step, dismiss, auth_action, back_navigation
+- Each intent uses deterministic heuristic scoring (no LLM calls) with 2+ scoring dimensions per intent
+- Candidates include CSS selectors usable with Playwright locator APIs
+- Results capped at 5 candidates, scored 0-1 with human-readable reasons
+- Intent strings normalized (accept underscores, spaces, mixed case)
+- `browser_act` resolves top candidate, executes via Playwright locator click (not evaluate click), settles, returns before/after diff
+- `browser_act` returns error (not throw) when zero candidates found
+- Both tools wired into index.ts, tool count = 47
+- Build passes
+
+## Proof Level
+
+- This slice proves: integration (new tools against real browser pages)
+- Real runtime required: yes (Playwright against real pages)
+- Human/UAT required: no (automated verification sufficient)
+
+## Verification
+
+- `npm run build` passes
+- `grep -c "pi.registerTool" src/resources/extensions/browser-tools/tools/*.ts` sums to 47
+- `browser_find_best` with intent "submit_form" against a page with a `<form>` returns candidates with scores > 0
+- `browser_find_best` with intent "close_dialog" against a page with a `[role="dialog"]` returns close button candidates
+- `browser_act` with intent "submit_form" clicks the submit button and returns before/after state
+- `browser_act` against a page with no dialog returns a graceful error (not throw) for "close_dialog" intent
+- Scoring heuristics produce differentiated rankings (top candidate scores higher than others)
+
+## Integration Closure
+
+- Upstream surfaces consumed: `evaluate-helpers.ts` (window.__pi utilities), `lifecycle.ts` (ensureBrowser, getActiveTarget), `state.ts` (ToolDeps, CompactPageState), `utils.ts` (action tracking, formatting), `core.js` (diffCompactStates), `settle.ts` (settleAfterActionAdaptive)
+- New wiring introduced: `tools/intent.ts` + import/call in `index.ts`
+- What remains before the milestone is truly usable end-to-end: S06 (test coverage)
+
+## Tasks
+
+- [x] **T01: Implement browser_find_best and browser_act with 8-intent scoring engine** `est:45m`
+  - Why: This is the entire slice — two tools sharing a single intent resolution engine, all in one file following the established forms.ts pattern. The scoring evaluate script, both tool registrations, and the index.ts wiring are tightly coupled and well within a single context window (~350 lines new code, 2 files created/modified).
+  - Files: `src/resources/extensions/browser-tools/tools/intent.ts` (new), `src/resources/extensions/browser-tools/index.ts` (wire)
+  - Do: Build `buildIntentScoringScript(intent, scope?)` as a string template evaluate returning scored candidates with cssPath selectors. Implement 8 intent scoring functions using window.__pi utilities (inferRole, accessibleName, isVisible, isEnabled, isInteractiveEl). Register `browser_find_best` (intent + optional scope → scored candidates) and `browser_act` (intent + optional scope → resolve top candidate → Playwright locator click → settle → diff). Wire via registerIntentTools import + call in index.ts.
+  - Verify: `npm run build` passes; grep tool count = 47; run both tools against real test pages via Playwright scripts
+  - Done when: Both tools registered, build passes, verified against real pages with forms and dialogs
+
+## Files Likely Touched
+
+- `src/resources/extensions/browser-tools/tools/intent.ts` (new)
+- `src/resources/extensions/browser-tools/index.ts` (wire registration)
--- a/.gsd/milestones/M002/slices/S05/tasks/T01-PLAN.md
+++ b/.gsd/milestones/M002/slices/S05/tasks/T01-PLAN.md
@ -0,0 +1,85 @@
+---
+estimated_steps: 5
+estimated_files: 2
+---
+
+# T01: Implement browser_find_best and browser_act with 8-intent scoring engine
+
+**Slice:** S05 — Intent-ranked retrieval and semantic actions
+**Milestone:** M002
+
+## Description
+
+Create `tools/intent.ts` with both `browser_find_best` and `browser_act`, sharing a single intent resolution engine built as a string template evaluate script (same pattern as forms.ts `buildFormAnalysisScript`). The scoring engine runs entirely in-browser via `page.evaluate()`, using `window.__pi` utilities for element metadata. Each of 8 intents has a candidate selector strategy and multi-dimensional scoring function. `browser_act` takes the top candidate from the same scoring logic, executes via Playwright `locator().click()` (D021), settles, and returns a before/after diff.
+
+## Steps
+
+1. **Create `tools/intent.ts`** with the `registerIntentTools(pi, deps)` export function. Define the 8 intent names as a const array and use `StringEnum` for the parameter schema. Build `buildIntentScoringScript(intent, scope?)` as a string template that:
+   - Normalizes the intent string (lowercase, strip spaces/underscores/hyphens)
+   - For each intent, selects candidate elements (e.g., submit_form → buttons/inputs inside or near forms; close_dialog → buttons inside `[role="dialog"]` or `dialog` elements)
+   - Scores each candidate 0-1 across 2-4 dimensions (structural position, role, text signals, visibility/enabled state)
+   - Returns top 5 candidates sorted by score, each with: `{ score, selector, tag, role, name, text, reason }`
+   - Uses `window.__pi.cssPath()` for selector generation, `window.__pi.inferRole()` / `window.__pi.accessibleName()` / `window.__pi.isVisible()` / `window.__pi.isEnabled()` for scoring signals
+
+2. **Implement the 8 intent scoring functions** inside the evaluate string template:
+   - `submitform` — query `button[type="submit"], input[type="submit"], button:not([type])` within forms; score by: is-submit-type, inside-form, text-suggests-submission, visible+enabled
+   - `closedialog` — query buttons/links inside `[role="dialog"], dialog, [aria-modal="true"]`; score by: text-matches-close-pattern, has-aria-label-close, is-visible, position (top-right gets a boost)
+   - `primarycta` — query all visible enabled buttons/links; score by: visual prominence (size), semantic weight (role=button > link), text-not-cancel/dismiss, position (main content area)
+   - `searchfield` — query inputs with type=search or role=searchbox or name/placeholder matching "search"; score by: type-match, placeholder-match, visibility, is-in-header/nav
+   - `nextstep` — query buttons/links with text matching next/continue/proceed/forward patterns; score by: text-match-strength, is-button, visible+enabled, not-disabled
+   - `dismiss` — query buttons/links matching close/cancel/dismiss/skip/no-thanks patterns; score by: text-match, position, inside-dialog/modal/overlay, is-visible
+   - `authaction` — query buttons/links matching login/sign-in/signup/register patterns; score by: text-match-strength, is-button-or-link, prominent-position, visible
+   - `backnavigation` — query buttons/links matching back/previous/return patterns; score by: text-match, has-back-arrow/icon, is-in-nav/header, visible
+
+3. **Register `browser_find_best`** tool:
+   - Parameters: `intent` (StringEnum of 8 intents), optional `scope` (CSS selector to narrow search)
+   - Execute: ensureBrowser → getActiveTarget → captureCompactPageState (before) → target.evaluate(buildIntentScoringScript) → format results as markdown with scores and selectors → tracked action finish
+   - Output format: numbered candidates with score, selector, role, text, and reason
+
+4. **Register `browser_act`** tool:
+   - Parameters: `intent` (same StringEnum), optional `scope` (CSS selector)
+   - Execute: ensureBrowser → captureCompactPageState (before) → target.evaluate(buildIntentScoringScript) → if zero candidates, return error → take top candidate → locator(candidate.selector).click() with getByRole fallback → settleAfterActionAdaptive → captureCompactPageState (after) → diffCompactStates → format result with before/after diff
+   - For search_field intent: focus instead of click
+   - Error handling: graceful error return when no candidates found, captureErrorScreenshot on unexpected failures
+
+5. **Wire into index.ts**: Add `import { registerIntentTools } from "./tools/intent.js"` and `registerIntentTools(pi, deps)` call. Verify build passes and tool count = 47.
+
+## Must-Haves
+
+- [ ] `browser_find_best` registered with 8-intent StringEnum parameter
+- [ ] `browser_act` registered with same 8-intent parameter
+- [ ] Intent scoring runs as a single page.evaluate() string template per call
+- [ ] Each intent has 2+ orthogonal scoring dimensions producing differentiated rankings
+- [ ] Scoring uses `window.__pi.*` utilities (no inline redeclarations)
+- [ ] Candidates include CSS selectors from `window.__pi.cssPath()`
+- [ ] Results capped at 5 candidates, scored 0-1
+- [ ] Intent string normalization handles underscores, spaces, mixed case
+- [ ] `browser_act` clicks via `target.locator(selector).click()` not `page.evaluate(() => el.click())`
+- [ ] `browser_act` returns error (not throw) when zero candidates
+- [ ] Both tools use tracked action pattern (beginTrackedAction / finishTrackedAction)
+- [ ] Tool count = 47 after wiring
+- [ ] `npm run build` passes
+
+## Verification
+
+- `npm run build` passes with zero errors
+- `grep -c "pi.registerTool" src/resources/extensions/browser-tools/tools/*.ts | awk -F: '{s+=$2} END {print s}'` outputs 47
+- Playwright verification script against a test HTML page with form + dialog:
+  - `browser_find_best` intent="submit_form" returns candidates with submit button scored highest
+  - `browser_find_best` intent="close_dialog" returns close/dismiss button inside dialog
+  - `browser_act` intent="submit_form" clicks the submit button
+  - `browser_act` intent="close_dialog" with no dialog on page returns error, not crash
+
+## Inputs
+
+- `src/resources/extensions/browser-tools/tools/forms.ts` — pattern for string template evaluates, tool registration, error handling
+- `src/resources/extensions/browser-tools/tools/interaction.ts` — pattern for Playwright locator click with getByRole fallback
+- `src/resources/extensions/browser-tools/evaluate-helpers.ts` — window.__pi API surface (9 functions)
+- `src/resources/extensions/browser-tools/index.ts` — wiring pattern (import + ToolDeps + registerXTools call)
+- `src/resources/extensions/browser-tools/state.ts` — ToolDeps interface, CompactPageState type
+- S05-RESEARCH.md — intent list, scoring guidance, common pitfalls
+
+## Expected Output
+
+- `src/resources/extensions/browser-tools/tools/intent.ts` — new file with ~350-400 lines containing `registerIntentTools(pi, deps)`, `buildIntentScoringScript()`, and both tool registrations
+- `src/resources/extensions/browser-tools/index.ts` — modified with 1 new import line + 1 new registration call
--- a/src/resources/extensions/browser-tools/index.ts
+++ b/src/resources/extensions/browser-tools/index.ts
@ -16,6 +16,7 @@ import { registerRefTools } from "./tools/refs.js";
 import { registerWaitTools } from "./tools/wait.js";
 import { registerPageTools } from "./tools/pages.js";
 import { registerFormTools } from "./tools/forms.js";
+import { registerIntentTools } from "./tools/intent.js";

 export default function (pi: ExtensionAPI) {
 	pi.on("session_shutdown", async () => { await closeBrowser(); });
@ -46,4 +47,5 @@ export default function (pi: ExtensionAPI) {
 	registerRefTools(pi, deps);         registerWaitTools(pi, deps);
 	registerPageTools(pi, deps);
 	registerFormTools(pi, deps);
+	registerIntentTools(pi, deps);
 }
--- a/src/resources/extensions/browser-tools/tools/intent.ts
+++ b/src/resources/extensions/browser-tools/tools/intent.ts
@ -0,0 +1,614 @@
+import type { ExtensionAPI } from "@gsd/pi-coding-agent";
+import { Type } from "@sinclair/typebox";
+import { StringEnum } from "@gsd/pi-ai";
+import { diffCompactStates } from "../core.js";
+import type { ToolDeps, CompactPageState } from "../state.js";
+import {
+	setLastActionBeforeState,
+	setLastActionAfterState,
+} from "../state.js";
+
+// ---------------------------------------------------------------------------
+// Intent definitions
+// ---------------------------------------------------------------------------
+
+const INTENTS = [
+	"submit_form",
+	"close_dialog",
+	"primary_cta",
+	"search_field",
+	"next_step",
+	"dismiss",
+	"auth_action",
+	"back_navigation",
+] as const;
+
+type Intent = (typeof INTENTS)[number];
+
+// ---------------------------------------------------------------------------
+// Scoring evaluate script — runs entirely in-browser via page.evaluate()
+// ---------------------------------------------------------------------------
+
+/**
+ * Builds a self-contained IIFE string that scores candidate elements for a
+ * given intent. Returns top 5 candidates sorted by score descending, each
+ * with { score, selector, tag, role, name, text, reason }.
+ *
+ * Uses window.__pi utilities (injected via addInitScript) for element
+ * metadata — no inline redeclarations.
+ */
+function buildIntentScoringScript(intent: string, scope?: string): string {
+	const scopeSelector = JSON.stringify(scope ?? null);
+
+	return `(() => {
+	var pi = window.__pi;
+	if (!pi) return { error: "window.__pi not available — browser helpers not injected" };
+
+	var intentRaw = ${JSON.stringify(intent)};
+	var normalized = intentRaw.toLowerCase().replace(/[\\s_\\-]+/g, "");
+	var scopeSel = ${scopeSelector};
+	var root = scopeSel ? document.querySelector(scopeSel) : document.body;
+	if (!root) return { error: "Scope selector not found: " + scopeSel };
+
+	// --- Shared helpers ---
+	function textOf(el) {
+		return (el.textContent || "").trim().replace(/\\s+/g, " ").slice(0, 120).toLowerCase();
+	}
+
+	function clamp01(v) { return Math.max(0, Math.min(1, v)); }
+
+	function makeCandidate(el, score, reason) {
+		return {
+			score: Math.round(clamp01(score) * 100) / 100,
+			selector: pi.cssPath(el),
+			tag: el.tagName.toLowerCase(),
+			role: pi.inferRole(el) || "",
+			name: pi.accessibleName(el) || "",
+			text: textOf(el).slice(0, 80),
+			reason: reason,
+		};
+	}
+
+	function qsa(sel) { return Array.from(root.querySelectorAll(sel)); }
+
+	function visibleEnabled(el) {
+		return pi.isVisible(el) && pi.isEnabled(el);
+	}
+
+	function textMatches(el, patterns) {
+		var t = textOf(el);
+		var n = (pi.accessibleName(el) || "").toLowerCase();
+		var combined = t + " " + n;
+		for (var i = 0; i < patterns.length; i++) {
+			if (combined.indexOf(patterns[i]) !== -1) return true;
+		}
+		return false;
+	}
+
+	function textMatchStrength(el, patterns) {
+		var t = textOf(el);
+		var n = (pi.accessibleName(el) || "").toLowerCase();
+		var combined = t + " " + n;
+		var count = 0;
+		for (var i = 0; i < patterns.length; i++) {
+			if (combined.indexOf(patterns[i]) !== -1) count++;
+		}
+		return Math.min(count / Math.max(patterns.length, 1), 1);
+	}
+
+	// --- Intent-specific scoring ---
+	var candidates = [];
+
+	if (normalized === "submitform") {
+		var els = qsa('button[type="submit"], input[type="submit"], button:not([type]), button[type="button"]');
+		for (var i = 0; i < els.length; i++) {
+			var el = els[i];
+			if (!visibleEnabled(el)) continue;
+			var d1 = el.type === "submit" || el.getAttribute("type") === "submit" ? 0.35 : 0;
+			var d2 = el.closest("form") ? 0.3 : 0;
+			var d3 = textMatches(el, ["submit", "send", "save", "create", "add", "post", "confirm", "ok", "done", "register", "sign up", "log in"]) ? 0.2 : 0;
+			var d4 = 0.15;
+			var score = d1 + d2 + d3 + d4;
+			var reasons = [];
+			if (d1 > 0) reasons.push("submit-type");
+			if (d2 > 0) reasons.push("inside-form");
+			if (d3 > 0) reasons.push("text-suggests-submit");
+			reasons.push("visible+enabled");
+			candidates.push(makeCandidate(el, score, reasons.join(", ")));
+		}
+	}
+
+	else if (normalized === "closedialog") {
+		var containers = qsa('[role="dialog"], dialog, [aria-modal="true"], [role="alertdialog"]');
+		for (var ci = 0; ci < containers.length; ci++) {
+			var btns = containers[ci].querySelectorAll("button, a, [role='button']");
+			for (var bi = 0; bi < btns.length; bi++) {
+				var el = btns[bi];
+				if (!visibleEnabled(el)) continue;
+				var d1 = textMatches(el, ["close", "cancel", "dismiss", "×", "✕", "x", "got it", "ok", "done"]) ? 0.35 : 0;
+				var ariaLbl = (el.getAttribute("aria-label") || "").toLowerCase();
+				var d2 = (ariaLbl.indexOf("close") !== -1 || ariaLbl.indexOf("dismiss") !== -1) ? 0.25 : 0;
+				var d3 = 0.2;
+				var rect = el.getBoundingClientRect();
+				var parentRect = containers[ci].getBoundingClientRect();
+				var isTopRight = rect.top - parentRect.top < 60 && parentRect.right - rect.right < 60;
+				var d4 = isTopRight ? 0.2 : 0;
+				var score = d1 + d2 + d3 + d4;
+				var reasons = [];
+				if (d1 > 0) reasons.push("text-matches-close");
+				if (d2 > 0) reasons.push("aria-label-close");
+				reasons.push("inside-dialog");
+				if (d4 > 0) reasons.push("top-right-position");
+				candidates.push(makeCandidate(el, score, reasons.join(", ")));
+			}
+		}
+	}
+
+	else if (normalized === "primarycta") {
+		var els = qsa("button, a, [role='button'], input[type='submit'], input[type='button']");
+		for (var i = 0; i < els.length; i++) {
+			var el = els[i];
+			if (!visibleEnabled(el)) continue;
+			var rect = el.getBoundingClientRect();
+			var area = rect.width * rect.height;
+			var d1 = clamp01(area / 12000);
+			var role = pi.inferRole(el);
+			var d2 = role === "button" ? 0.25 : (role === "link" ? 0.1 : 0.15);
+			var isNegative = textMatches(el, ["cancel", "dismiss", "close", "skip", "no thanks", "no, thanks", "maybe later"]);
+			var d3 = isNegative ? 0 : 0.2;
+			var inMain = !!el.closest("main, [role='main'], article, section, .hero, .content");
+			var d4 = inMain ? 0.15 : 0;
+			var score = d1 + d2 + d3 + d4;
+			var reasons = [];
+			reasons.push("size:" + Math.round(area));
+			if (d2 >= 0.25) reasons.push("button-role");
+			if (d3 > 0) reasons.push("non-dismissive");
+			if (d4 > 0) reasons.push("in-main-content");
+			candidates.push(makeCandidate(el, score, reasons.join(", ")));
+		}
+	}
+
+	else if (normalized === "searchfield") {
+		var els = qsa("input, textarea, [role='searchbox'], [role='combobox'], [contenteditable='true']");
+		for (var i = 0; i < els.length; i++) {
+			var el = els[i];
+			if (!pi.isVisible(el)) continue;
+			var type = (el.getAttribute("type") || "text").toLowerCase();
+			if (["hidden", "submit", "button", "reset", "image", "checkbox", "radio", "file"].indexOf(type) !== -1 && el.tagName.toLowerCase() === "input") continue;
+			var d1 = type === "search" || pi.inferRole(el) === "searchbox" ? 0.4 : 0;
+			var ph = (el.getAttribute("placeholder") || "").toLowerCase();
+			var nm = (el.getAttribute("name") || "").toLowerCase();
+			var ariaLbl = (el.getAttribute("aria-label") || "").toLowerCase();
+			var combined = ph + " " + nm + " " + ariaLbl;
+			var d2 = combined.indexOf("search") !== -1 || combined.indexOf("query") !== -1 || combined.indexOf("find") !== -1 ? 0.3 : 0;
+			var d3 = pi.isEnabled(el) ? 0.15 : 0;
+			var inHeader = !!el.closest("header, nav, [role='banner'], [role='navigation'], [role='search']");
+			var d4 = inHeader ? 0.15 : 0;
+			var score = d1 + d2 + d3 + d4;
+			if (score < 0.1) continue;
+			var reasons = [];
+			if (d1 > 0) reasons.push("search-type/role");
+			if (d2 > 0) reasons.push("name/placeholder-match");
+			if (d3 > 0) reasons.push("enabled");
+			if (d4 > 0) reasons.push("in-header/nav");
+			candidates.push(makeCandidate(el, score, reasons.join(", ")));
+		}
+	}
+
+	else if (normalized === "nextstep") {
+		var els = qsa("button, a, [role='button'], input[type='submit'], input[type='button']");
+		var patterns = ["next", "continue", "proceed", "forward", "go", "step"];
+		for (var i = 0; i < els.length; i++) {
+			var el = els[i];
+			if (!visibleEnabled(el)) continue;
+			var d1 = textMatchStrength(el, patterns) * 0.4;
+			if (d1 === 0) continue;
+			var role = pi.inferRole(el);
+			var d2 = role === "button" ? 0.25 : 0.1;
+			var d3 = 0.2;
+			var isDisabled = !pi.isEnabled(el);
+			var d4 = isDisabled ? 0 : 0.15;
+			var score = d1 + d2 + d3 + d4;
+			var reasons = [];
+			reasons.push("text-match");
+			if (d2 >= 0.25) reasons.push("button-role");
+			reasons.push("visible");
+			if (d4 > 0) reasons.push("enabled");
+			candidates.push(makeCandidate(el, score, reasons.join(", ")));
+		}
+	}
+
+	else if (normalized === "dismiss") {
+		var els = qsa("button, a, [role='button'], [role='link']");
+		var patterns = ["close", "cancel", "dismiss", "skip", "no thanks", "no, thanks", "maybe later", "not now", "×", "✕"];
+		for (var i = 0; i < els.length; i++) {
+			var el = els[i];
+			if (!visibleEnabled(el)) continue;
+			var d1 = textMatchStrength(el, patterns) * 0.35;
+			if (d1 === 0) continue;
+			var inOverlay = !!el.closest('[role="dialog"], dialog, [aria-modal="true"], [role="alertdialog"], .modal, .overlay, .popup, .popover, .toast, .banner');
+			var d2 = inOverlay ? 0.3 : 0.05;
+			var rect = el.getBoundingClientRect();
+			var isEdge = rect.top < 80 || rect.right > window.innerWidth - 80;
+			var d3 = isEdge ? 0.15 : 0;
+			var d4 = 0.15;
+			var score = d1 + d2 + d3 + d4;
+			var reasons = [];
+			reasons.push("text-match");
+			if (d2 >= 0.3) reasons.push("inside-overlay");
+			if (d3 > 0) reasons.push("edge-position");
+			reasons.push("visible+enabled");
+			candidates.push(makeCandidate(el, score, reasons.join(", ")));
+		}
+	}
+
+	else if (normalized === "authaction") {
+		var els = qsa("button, a, [role='button'], [role='link'], input[type='submit']");
+		var patterns = ["log in", "login", "sign in", "signin", "sign up", "signup", "register", "create account", "join", "get started"];
+		for (var i = 0; i < els.length; i++) {
+			var el = els[i];
+			if (!visibleEnabled(el)) continue;
+			var d1 = textMatchStrength(el, patterns) * 0.4;
+			if (d1 === 0) continue;
+			var role = pi.inferRole(el);
+			var d2 = (role === "button" || role === "link") ? 0.25 : 0.1;
+			var rect = el.getBoundingClientRect();
+			var inHeader = !!el.closest("header, nav, [role='banner'], [role='navigation']");
+			var isProminent = inHeader || rect.top < 200;
+			var d3 = isProminent ? 0.2 : 0.05;
+			var d4 = 0.15;
+			var score = d1 + d2 + d3 + d4;
+			var reasons = [];
+			reasons.push("text-match");
+			if (d2 >= 0.25) reasons.push("button-or-link");
+			if (d3 >= 0.2) reasons.push("prominent-position");
+			reasons.push("visible+enabled");
+			candidates.push(makeCandidate(el, score, reasons.join(", ")));
+		}
+	}
+
+	else if (normalized === "backnavigation") {
+		var els = qsa("button, a, [role='button'], [role='link']");
+		var patterns = ["back", "previous", "prev", "return", "go back"];
+		for (var i = 0; i < els.length; i++) {
+			var el = els[i];
+			if (!visibleEnabled(el)) continue;
+			var d1 = textMatchStrength(el, patterns) * 0.35;
+			if (d1 === 0) continue;
+			var innerHtml = el.innerHTML.toLowerCase();
+			var hasArrow = innerHtml.indexOf("←") !== -1 || innerHtml.indexOf("&larr") !== -1 || innerHtml.indexOf("arrow") !== -1 || innerHtml.indexOf("chevron-left") !== -1 || innerHtml.indexOf("back") !== -1;
+			var d2 = hasArrow ? 0.25 : 0;
+			var inNav = !!el.closest("header, nav, [role='banner'], [role='navigation'], .breadcrumb, .toolbar");
+			var d3 = inNav ? 0.25 : 0.05;
+			var d4 = 0.15;
+			var score = d1 + d2 + d3 + d4;
+			var reasons = [];
+			reasons.push("text-match");
+			if (d2 > 0) reasons.push("has-back-arrow/icon");
+			if (d3 >= 0.25) reasons.push("in-nav/header");
+			reasons.push("visible+enabled");
+			candidates.push(makeCandidate(el, score, reasons.join(", ")));
+		}
+	}
+
+	else {
+		return { error: "Unknown intent: " + intentRaw + ". Valid: submit_form, close_dialog, primary_cta, search_field, next_step, dismiss, auth_action, back_navigation" };
+	}
+
+	// Sort by score descending, cap at 5
+	candidates.sort(function(a, b) { return b.score - a.score; });
+	candidates = candidates.slice(0, 5);
+
+	return { intent: intentRaw, normalized: normalized, count: candidates.length, candidates: candidates };
+})()`;
+}
+
+// ---------------------------------------------------------------------------
+// Result types
+// ---------------------------------------------------------------------------
+
+interface IntentCandidate {
+	score: number;
+	selector: string;
+	tag: string;
+	role: string;
+	name: string;
+	text: string;
+	reason: string;
+}
+
+interface IntentScoringResult {
+	intent: string;
+	normalized: string;
+	count: number;
+	candidates: IntentCandidate[];
+	error?: string;
+}
+
+// ---------------------------------------------------------------------------
+// Registration
+// ---------------------------------------------------------------------------
+
+export function registerIntentTools(pi: ExtensionAPI, deps: ToolDeps): void {
+
+	// -----------------------------------------------------------------------
+	// browser_find_best
+	// -----------------------------------------------------------------------
+	pi.registerTool({
+		name: "browser_find_best",
+		label: "Find Best",
+		description:
+			"Find the best-matching element for a semantic intent. Returns up to 5 scored candidates (0-1) ranked by structural position, role, text signals, and visibility. Use this to discover which element the agent should interact with for a given goal — e.g. intent=\"submit_form\" finds submit buttons, intent=\"close_dialog\" finds close/dismiss buttons inside dialogs. Each candidate includes a CSS selector usable with browser_click.",
+		parameters: Type.Object({
+			intent: StringEnum(INTENTS, {
+				description:
+					"Semantic intent: submit_form, close_dialog, primary_cta, search_field, next_step, dismiss, auth_action, back_navigation",
+			}),
+			scope: Type.Optional(
+				Type.String({
+					description:
+						"CSS selector to narrow the search area. If omitted, searches the full page.",
+				})
+			),
+		}),
+
+		async execute(_toolCallId, params, _signal, _onUpdate, _ctx) {
+			let actionId: number | null = null;
+			let beforeState: CompactPageState | null = null;
+			try {
+				const { page: p } = await deps.ensureBrowser();
+				const target = deps.getActiveTarget();
+				beforeState = await deps.captureCompactPageState(p, {
+					selectors: params.scope ? [params.scope] : [],
+					includeBodyText: false,
+					target,
+				});
+				actionId = deps.beginTrackedAction("browser_find_best", params, beforeState.url).id;
+
+				const script = buildIntentScoringScript(params.intent, params.scope);
+				const result = await target.evaluate(script) as IntentScoringResult;
+
+				if (result.error) {
+					deps.finishTrackedAction(actionId, {
+						status: "error",
+						error: result.error,
+						beforeState,
+					});
+					return {
+						content: [{ type: "text" as const, text: result.error }],
+						details: {},
+						isError: true,
+					};
+				}
+
+				const afterState = await deps.captureCompactPageState(p, {
+					selectors: params.scope ? [params.scope] : [],
+					includeBodyText: false,
+					target,
+				});
+				setLastActionBeforeState(beforeState);
+				setLastActionAfterState(afterState);
+
+				deps.finishTrackedAction(actionId, {
+					status: "success",
+					afterUrl: afterState.url,
+					beforeState,
+					afterState,
+				});
+
+				// Format output
+				const lines: string[] = [];
+				lines.push(`Intent: ${params.intent} → ${result.count} candidate(s)`);
+				if (params.scope) lines.push(`Scope: ${params.scope}`);
+				lines.push("");
+
+				if (result.candidates.length === 0) {
+					lines.push("No candidates found for this intent on the current page.");
+				} else {
+					for (let i = 0; i < result.candidates.length; i++) {
+						const c = result.candidates[i];
+						lines.push(`${i + 1}. **${c.score}** \`${c.selector}\``);
+						lines.push(`   ${c.tag}${c.role ? ` [${c.role}]` : ""} — "${c.name || c.text}"`);
+						lines.push(`   Reason: ${c.reason}`);
+					}
+				}
+
+				return {
+					content: [{ type: "text" as const, text: lines.join("\n") }],
+					details: { intentResult: result },
+				};
+			} catch (err: unknown) {
+				const screenshot = await deps.captureErrorScreenshot(
+					(() => { try { return deps.getActivePage(); } catch { return null; } })()
+				);
+				const errMsg = deps.firstErrorLine(err);
+
+				if (actionId !== null) {
+					deps.finishTrackedAction(actionId, {
+						status: "error",
+						error: errMsg,
+						beforeState: beforeState ?? undefined,
+					});
+				}
+
+				const content: Array<{ type: "text"; text: string } | { type: "image"; data: string; mimeType: string }> = [
+					{ type: "text", text: `browser_find_best failed: ${errMsg}` },
+				];
+				if (screenshot) {
+					content.push({ type: "image", data: screenshot.data, mimeType: screenshot.mimeType });
+				}
+				return { content, details: {}, isError: true };
+			}
+		},
+	});
+
+	// -----------------------------------------------------------------------
+	// browser_act
+	// -----------------------------------------------------------------------
+	pi.registerTool({
+		name: "browser_act",
+		label: "Browser Act",
+		description:
+			"Execute a semantic action in one call. Resolves the top candidate for the given intent (same scoring as browser_find_best), performs the action (click for buttons/links, focus for search fields), settles the page, and returns a before/after diff. Use when you know what you want to accomplish semantically — e.g. intent=\"submit_form\" finds and clicks the submit button, intent=\"close_dialog\" dismisses the dialog.",
+		parameters: Type.Object({
+			intent: StringEnum(INTENTS, {
+				description:
+					"Semantic intent: submit_form, close_dialog, primary_cta, search_field, next_step, dismiss, auth_action, back_navigation",
+			}),
+			scope: Type.Optional(
+				Type.String({
+					description:
+						"CSS selector to narrow the search area. If omitted, searches the full page.",
+				})
+			),
+		}),
+
+		async execute(_toolCallId, params, _signal, _onUpdate, _ctx) {
+			let actionId: number | null = null;
+			let beforeState: CompactPageState | null = null;
+			try {
+				const { page: p } = await deps.ensureBrowser();
+				const target = deps.getActiveTarget();
+				beforeState = await deps.captureCompactPageState(p, {
+					selectors: params.scope ? [params.scope] : [],
+					includeBodyText: true,
+					target,
+				});
+				actionId = deps.beginTrackedAction("browser_act", params, beforeState.url).id;
+
+				// Score candidates
+				const script = buildIntentScoringScript(params.intent, params.scope);
+				const result = await target.evaluate(script) as IntentScoringResult;
+
+				if (result.error) {
+					deps.finishTrackedAction(actionId, {
+						status: "error",
+						error: result.error,
+						beforeState,
+					});
+					return {
+						content: [{ type: "text" as const, text: `browser_act failed: ${result.error}` }],
+						details: {},
+						isError: true,
+					};
+				}
+
+				if (result.candidates.length === 0) {
+					deps.finishTrackedAction(actionId, {
+						status: "error",
+						error: `No candidates found for intent "${params.intent}"`,
+						beforeState,
+					});
+					return {
+						content: [{
+							type: "text" as const,
+							text: `browser_act: No candidates found for intent "${params.intent}" on the current page. The page may not have the expected elements (e.g. no dialog for close_dialog, no form for submit_form).`,
+						}],
+						details: { intentResult: result },
+						isError: true,
+					};
+				}
+
+				// Take top candidate and execute action
+				const top = result.candidates[0];
+				const normalizedIntent = params.intent.toLowerCase().replace(/[\s_-]+/g, "");
+
+				if (normalizedIntent === "searchfield") {
+					// Focus instead of click for search fields
+					try {
+						await target.locator(top.selector).first().focus({ timeout: 5000 });
+					} catch {
+						// Fallback: click to focus
+						await target.locator(top.selector).first().click({ timeout: 5000 });
+					}
+				} else {
+					// Click via Playwright locator (D021)
+					try {
+						await target.locator(top.selector).first().click({ timeout: 5000 });
+					} catch {
+						// getByRole fallback from interaction.ts pattern
+						const nameMatch = top.selector.match(/\[(?:aria-label|name|placeholder)="([^"]+)"\]/i);
+						const roleName = nameMatch?.[1];
+						let clicked = false;
+						for (const role of ["button", "link", "combobox", "textbox"] as const) {
+							try {
+								const loc = roleName
+									? target.getByRole(role, { name: new RegExp(roleName, "i") })
+									: target.getByRole(role, { name: new RegExp(top.name.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"), "i") });
+								await loc.first().click({ timeout: 3000 });
+								clicked = true;
+								break;
+							} catch { /* try next role */ }
+						}
+						if (!clicked) {
+							throw new Error(`Could not click top candidate "${top.selector}" for intent "${params.intent}"`);
+						}
+					}
+				}
+
+				// Settle after action
+				await deps.settleAfterActionAdaptive(p);
+
+				// Capture after state and diff
+				const afterState = await deps.captureCompactPageState(p, {
+					selectors: params.scope ? [params.scope] : [],
+					includeBodyText: true,
+					target,
+				});
+				const diff = diffCompactStates(beforeState, afterState);
+				const summary = deps.formatCompactStateSummary(afterState);
+				const jsErrors = deps.getRecentErrors(p.url());
+
+				setLastActionBeforeState(beforeState);
+				setLastActionAfterState(afterState);
+
+				deps.finishTrackedAction(actionId, {
+					status: "success",
+					afterUrl: afterState.url,
+					diffSummary: diff.summary,
+					beforeState,
+					afterState,
+				});
+
+				// Format output
+				const lines: string[] = [];
+				lines.push(`Intent: ${params.intent}`);
+				lines.push(`Action: ${normalizedIntent === "searchfield" ? "focused" : "clicked"} top candidate (score: ${top.score})`);
+				lines.push(`Target: \`${top.selector}\` — "${top.name || top.text}"`);
+				lines.push(`Reason: ${top.reason}`);
+				lines.push("");
+				lines.push(`Diff:\n${deps.formatDiffText(diff)}`);
+				if (jsErrors.trim()) {
+					lines.push(`\nJS Errors:\n${jsErrors}`);
+				}
+				lines.push(`\nPage summary:\n${summary}`);
+
+				return {
+					content: [{ type: "text" as const, text: lines.join("\n") }],
+					details: { intentResult: result, topCandidate: top, diff },
+				};
+			} catch (err: unknown) {
+				const screenshot = await deps.captureErrorScreenshot(
+					(() => { try { return deps.getActivePage(); } catch { return null; } })()
+				);
+				const errMsg = deps.firstErrorLine(err);
+
+				if (actionId !== null) {
+					deps.finishTrackedAction(actionId, {
+						status: "error",
+						error: errMsg,
+						beforeState: beforeState ?? undefined,
+					});
+				}
+
+				const content: Array<{ type: "text"; text: string } | { type: "image"; data: string; mimeType: string }> = [
+					{ type: "text", text: `browser_act failed: ${errMsg}` },
+				];
+				if (screenshot) {
+					content.push({ type: "image", data: screenshot.data, mimeType: screenshot.mimeType });
+				}
+				return { content, details: {}, isError: true };
+			}
+		},
+	});
+}