diff --git a/.plans/token-optimization-suite.md b/.plans/token-optimization-suite.md new file mode 100644 index 000000000..62118901e --- /dev/null +++ b/.plans/token-optimization-suite.md @@ -0,0 +1,220 @@ +# Token Optimization Suite — Implementation Plan + +## Overview +Comprehensive token optimization across the GSD dispatch pipeline. Six phases targeting +prompt caching, accurate token counting, structured data compression, prompt compression, +semantic context selection, and context distillation. + +## Phase 1: Prompt Cache Optimization (P0) +**Goal:** Restructure dispatch prompt assembly for maximum cache hit rates. + +### What +Anthropic prompt caching gives 90% savings on cached input tokens. Currently, GSD places +`cache_control` on system prompts and the last user message (in `packages/pi-ai/src/providers/anthropic.ts`). +But dispatch prompts in `auto-prompts.ts` mix static and dynamic content throughout, +reducing cache prefix reuse. + +### Tasks +1. **Create `prompt-cache-optimizer.ts`** — module that separates prompt content into + cacheable (static) and dynamic (per-task) sections. + - Static: templates, plans, decisions, roadmap, project context + - Dynamic: task-specific instructions, file contents, overrides + - Export `splitForCaching(prompt: string, staticSections: string[]): { staticPrefix: string; dynamicSuffix: string }` + +2. **Add `buildCacheablePrefix()` to auto-prompts.ts** — for each builder, extract the + static portion that's reused across tasks in the same slice: + - Slice plan (same across all tasks in slice) + - Decisions register (same across all tasks) + - Requirements (same within scope) + - Templates (always the same) + +3. **Metrics tracking** — extend `metrics.ts` to track `cacheHitRate` per unit. + Already tracks `cacheRead` and `cacheWrite` tokens — add derived percentage. 
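The two pieces above — the prefix/suffix split (task 1) and the derived cache-hit percentage (task 3) — can be sketched as follows. This is a naive illustration, assuming prompt sections are blank-line separated and `staticSections` holds the exact section texts; the real module will likely be more robust:

```typescript
// Sketch only — not the final implementation.
// Partition a prompt into a cacheable static prefix and a dynamic suffix,
// assuming sections are separated by blank lines.
function splitForCaching(
  prompt: string,
  staticSections: string[],
): { staticPrefix: string; dynamicSuffix: string } {
  const sections = prompt.split("\n\n");
  const staticSet = new Set(staticSections);
  const staticParts: string[] = [];
  const dynamicParts: string[] = [];
  for (const s of sections) {
    (staticSet.has(s) ? staticParts : dynamicParts).push(s);
  }
  return {
    staticPrefix: staticParts.join("\n\n"),
    dynamicSuffix: dynamicParts.join("\n\n"),
  };
}

// Derived percentage for metrics.ts (task 3): cache reads relative to
// all input tokens processed.
function cacheHitRate(cacheRead: number, input: number): number {
  const denominator = cacheRead + input;
  return denominator === 0 ? 0 : (cacheRead / denominator) * 100;
}
```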
+ +### Files Modified +- `src/resources/extensions/gsd/prompt-cache-optimizer.ts` (NEW) +- `src/resources/extensions/gsd/auto-prompts.ts` (modify builders) +- `src/resources/extensions/gsd/metrics.ts` (add cache hit rate) +- `src/resources/extensions/gsd/tests/prompt-cache-optimizer.test.ts` (NEW) + +--- + +## Phase 2: Accurate Multi-Provider Token Counting (P1) +**Goal:** Replace GPT-4o-only tiktoken with provider-aware counting. + +### What +`token-counter.ts` uses `tiktoken` with `gpt-4o` encoder for ALL providers. Claude uses a +different tokenizer, so counts can be off by 15-25%. This causes budget under/over-allocation. + +### Tasks +1. **Add provider-aware counting** — extend `countTokens()` to accept an optional + `provider` parameter: + - `anthropic`: Use `@anthropic-ai/sdk` `messages.countTokens()` for exact counts + - `openai`: Keep tiktoken (already accurate) + - `google`/`mistral`/others: Keep chars/4 heuristic (best available) + +2. **Add `estimateTokensForProvider(text, provider)` function** — synchronous estimation + that uses provider-specific char ratios: + - Anthropic: ~3.5 chars/token (their tokenizer produces more tokens per character) + - OpenAI: ~4 chars/token (tiktoken accurate) + - Others: ~4 chars/token (conservative default) + +3. **Update `context-budget.ts`** — use provider-aware `CHARS_PER_TOKEN` constant based + on the configured execution model's provider. + +### Files Modified +- `src/resources/extensions/gsd/token-counter.ts` (extend) +- `src/resources/extensions/gsd/context-budget.ts` (provider-aware ratio) +- `src/resources/extensions/gsd/tests/token-counter.test.ts` (NEW) +- `src/resources/extensions/gsd/tests/context-budget.test.ts` (extend) + +--- + +## Phase 3: Structured Data Compression with TOON (P1) +**Goal:** Reduce token usage for structured data blocks in prompts by 30-60%. + +### What +Decisions registers, requirements lists, task plans, and metrics are passed as verbose +markdown tables. 
TOON (Token-Oriented Object Notation) removes braces/brackets/quotes, +using indentation and tabular patterns instead. + +### Tasks +1. **Add `@toon-format/toon` dependency** — install the npm package. + +2. **Create `structured-data-formatter.ts`** — module that converts structured data to + TOON format for prompt injection: + - `formatDecisionsTOON(decisions: Decision[]): string` + - `formatRequirementsTOON(requirements: Requirement[]): string` + - `formatTaskPlanTOON(tasks: TaskPlanEntry[]): string` + - Each includes a brief format header so the LLM knows how to parse it + +3. **Integrate with `context-store.ts`** — add TOON variants of `formatDecisionsForPrompt()` + and `formatRequirementsForPrompt()`. + +4. **Gate behind inline level** — `minimal` and `standard` use TOON; `full` uses markdown + (backward compatible). + +### Files Modified +- `package.json` (add dependency) +- `src/resources/extensions/gsd/structured-data-formatter.ts` (NEW) +- `src/resources/extensions/gsd/context-store.ts` (add TOON variants) +- `src/resources/extensions/gsd/auto-prompts.ts` (use TOON when level != full) +- `src/resources/extensions/gsd/tests/structured-data-formatter.test.ts` (NEW) + +--- + +## Phase 4: Prompt Compression via LLMLingua-2 (P2) +**Goal:** Compress large context blocks 3-5x while preserving semantic meaning. + +### What +When context exceeds budget, instead of dropping entire sections (current behavior), +compress them using LLMLingua-2. This preserves information density while reducing tokens. + +### Tasks +1. **Create `prompt-compressor.ts`** — wrapper around compression logic: + - `compressContext(text: string, targetRatio: number): Promise<CompressionResult>` + - Supports configurable compression ratios (2x for light, 5x for aggressive) + - Falls back to section-boundary truncation if compression fails + - Includes compression stats for metrics + +2. 
**Integrate with `context-budget.ts`** — add `compressBeforeTruncate` option: + - When content exceeds budget, try compression first + - Only truncate if compressed content still exceeds budget + - Track compression ratio in metrics + +3. **Gate behind preference** — new `compression_strategy` preference: + - `"truncate"` (default, backward-compatible): current section-boundary truncation + - `"compress"`: use LLMLingua-2 before truncating + - Budget profile auto-enables compress for `budget` and `balanced` + +### Files Modified +- `src/resources/extensions/gsd/prompt-compressor.ts` (NEW) +- `src/resources/extensions/gsd/context-budget.ts` (integrate) +- `src/resources/extensions/gsd/preferences.ts` (add compression_strategy) +- `src/resources/extensions/gsd/types.ts` (add CompressionStrategy type) +- `src/resources/extensions/gsd/tests/prompt-compressor.test.ts` (NEW) + +### Note +LLMLingua-2 JS port (`@atjsh/llmlingua-2`) is experimental. We'll implement the interface +with a fallback path so the feature degrades gracefully. If the JS port isn't stable enough, +we can use the Compresso REST API as an alternative, or implement a simpler heuristic +compression (remove redundant whitespace, deduplicate repeated patterns, abbreviate +common programming terms). + +--- + +## Phase 5: Semantic Context Selection (P2) +**Goal:** Only include semantically relevant content in prompts instead of entire files. + +### What +`diff-context.ts` currently selects recently-changed files. `auto-prompts.ts` inlines +entire files. For large files, this wastes tokens on irrelevant sections. + +### Tasks +1. **Create `semantic-chunker.ts`** — wrapper for semantic text splitting: + - `chunkByRelevance(content: string, query: string, maxChunks: number): string[]` + - Splits content into semantic chunks (function boundaries, class boundaries, etc.) 
+ - Scores chunks by relevance to the task description + - Returns top-N most relevant chunks + - Uses simple TF-IDF scoring (no embeddings needed for v1) + +2. **Integrate with `inlineFile()`** — when inlining large files (>2000 chars), + chunk and select relevant portions: + - Extract task description/plan as the "query" + - Score file chunks against the query + - Include only high-scoring chunks with `[...N chunks omitted]` markers + +3. **Add `context_selection` preference**: + - `"full"`: inline entire files (current behavior) + - `"smart"`: use semantic chunking for files over threshold + - Auto-enabled for `budget` and `balanced` profiles + +### Files Modified +- `src/resources/extensions/gsd/semantic-chunker.ts` (NEW) +- `src/resources/extensions/gsd/auto-prompts.ts` (integrate with inlineFile) +- `src/resources/extensions/gsd/preferences.ts` (add context_selection) +- `src/resources/extensions/gsd/types.ts` (add ContextSelectionMode type) +- `src/resources/extensions/gsd/tests/semantic-chunker.test.ts` (NEW) + +--- + +## Phase 6: Summary Distillation (P3) +**Goal:** Produce tighter dependency summaries when budget is constrained. + +### What +`inlineDependencySummaries()` currently concatenates full summaries from prior slices. +When a slice has many dependencies, this consumes a large portion of the context budget. + +### Tasks +1. **Create `summary-distiller.ts`** — reduces multiple summaries to a condensed form: + - `distillSummaries(summaries: string[], budgetChars: number): string` + - Extracts key facts: files modified, decisions made, patterns established + - Removes verbose prose, keeps structured data + - Preserves all `key_files`, `key_decisions`, `provides`, `requires` frontmatter + - Falls back to section-boundary truncation for non-parseable summaries + +2. 
**Integrate with `auto-prompts.ts`** — use distiller when: + - Dependency count > 2 AND budget is constrained + - InlineLevel is "minimal" or "standard" + - Budget pressure is above 50% + +### Files Modified +- `src/resources/extensions/gsd/summary-distiller.ts` (NEW) +- `src/resources/extensions/gsd/auto-prompts.ts` (integrate with inlineDependencySummaries) +- `src/resources/extensions/gsd/tests/summary-distiller.test.ts` (NEW) + +--- + +## Implementation Order +1. Phase 2 (token counting) — foundation, needed by other phases +2. Phase 1 (cache optimization) — highest ROI +3. Phase 3 (TOON format) — quick win on structured data +4. Phase 6 (summary distillation) — pure logic, no 3rd party +5. Phase 5 (semantic chunking) — TF-IDF v1, no 3rd party +6. Phase 4 (prompt compression) — depends on 3rd party stability + +## Testing Strategy +- Each phase adds dedicated unit tests +- Existing tests must continue to pass (no regressions) +- Token savings tests validate measurable reduction +- Run full test suite after each phase: `npm run test:unit` diff --git a/src/resources/extensions/gsd/auto-prompts.ts b/src/resources/extensions/gsd/auto-prompts.ts index d34622d1f..775c54f2a 100644 --- a/src/resources/extensions/gsd/auto-prompts.ts +++ b/src/resources/extensions/gsd/auto-prompts.ts @@ -21,6 +21,9 @@ import type { GSDPreferences } from "./preferences.js"; import { join } from "node:path"; import { existsSync } from "node:fs"; import { computeBudgets, resolveExecutorContextWindow } from "./context-budget.js"; +import { compressToTarget } from "./prompt-compressor.js"; +import { distillSummaries } from "./summary-distiller.js"; +import { formatDecisionsCompact, formatRequirementsCompact } from "./structured-data-formatter.js"; // ─── Executor Constraints ───────────────────────────────────────────────────── @@ -111,8 +114,21 @@ export async function inlineDependencySummaries( } const result = sections.join("\n\n"); - // When a budget is provided, truncate at section 
boundaries to fit if (budgetChars !== undefined && result.length > budgetChars) { + // For 3+ summaries, try distillation first (preserves more information) + if (sections.length >= 3) { + const rawSummaries = sections.map(s => { + // Extract content after the header line + const lines = s.split("\n"); + const contentStart = lines.findIndex(l => l.startsWith("Source:")); + return contentStart >= 0 ? lines.slice(contentStart + 1).join("\n").trim() : s; + }); + const distilled = distillSummaries(rawSummaries, budgetChars); + if (distilled.content.length <= budgetChars) { + return distilled.content; + } + } + // Fall back to section-boundary truncation const { truncateAtSectionBoundary } = await import("./context-budget.js"); return truncateAtSectionBoundary(result, budgetChars).content; } @@ -139,15 +155,19 @@ export async function inlineGsdRootFile( * Falls back to filesystem via inlineGsdRootFile when DB unavailable or empty. */ export async function inlineDecisionsFromDb( - base: string, milestoneId?: string, scope?: string, + base: string, milestoneId?: string, scope?: string, level?: InlineLevel, ): Promise { + const inlineLevel = level ?? resolveInlineLevel(); try { const { isDbAvailable } = await import("./gsd-db.js"); if (isDbAvailable()) { const { queryDecisions, formatDecisionsForPrompt } = await import("./context-store.js"); const decisions = queryDecisions({ milestoneId, scope }); if (decisions.length > 0) { - const formatted = formatDecisionsForPrompt(decisions); + // Use compact format for non-full levels to save ~35% tokens + const formatted = inlineLevel !== "full" + ? formatDecisionsCompact(decisions) + : formatDecisionsForPrompt(decisions); return `### Decisions\nSource: \`.gsd/DECISIONS.md\`\n\n${formatted}`; } } @@ -162,15 +182,19 @@ export async function inlineDecisionsFromDb( * Falls back to filesystem via inlineGsdRootFile when DB unavailable or empty. 
*/ export async function inlineRequirementsFromDb( - base: string, sliceId?: string, + base: string, sliceId?: string, level?: InlineLevel, ): Promise { + const inlineLevel = level ?? resolveInlineLevel(); try { const { isDbAvailable } = await import("./gsd-db.js"); if (isDbAvailable()) { const { queryRequirements, formatRequirementsForPrompt } = await import("./context-store.js"); const requirements = queryRequirements({ sliceId }); if (requirements.length > 0) { - const formatted = formatRequirementsForPrompt(requirements); + // Use compact format for non-full levels to save ~40% tokens + const formatted = inlineLevel !== "full" + ? formatRequirementsCompact(requirements) + : formatRequirementsForPrompt(requirements); return `### Requirements\nSource: \`.gsd/REQUIREMENTS.md\`\n\n${formatted}`; } } @@ -519,9 +543,9 @@ export async function buildPlanMilestonePrompt(mid: string, midTitle: string, ba if (inlineLevel !== "minimal") { const projectInline = await inlineProjectFromDb(base); if (projectInline) inlined.push(projectInline); - const requirementsInline = await inlineRequirementsFromDb(base); + const requirementsInline = await inlineRequirementsFromDb(base, undefined, inlineLevel); if (requirementsInline) inlined.push(requirementsInline); - const decisionsInline = await inlineDecisionsFromDb(base, mid); + const decisionsInline = await inlineDecisionsFromDb(base, mid, undefined, inlineLevel); if (decisionsInline) inlined.push(decisionsInline); } const knowledgeInlinePM = await inlineGsdRootFile(base, "knowledge.md", "Project Knowledge"); @@ -614,9 +638,9 @@ export async function buildPlanSlicePrompt( const researchInline = await inlineFileOptional(researchPath, researchRel, "Slice Research"); if (researchInline) inlined.push(researchInline); if (inlineLevel !== "minimal") { - const decisionsInline = await inlineDecisionsFromDb(base, mid); + const decisionsInline = await inlineDecisionsFromDb(base, mid, undefined, inlineLevel); if (decisionsInline) 
inlined.push(decisionsInline); - const requirementsInline = await inlineRequirementsFromDb(base, sid); + const requirementsInline = await inlineRequirementsFromDb(base, sid, inlineLevel); if (requirementsInline) inlined.push(requirementsInline); } const knowledgeInlinePS = await inlineGsdRootFile(base, "knowledge.md", "Project Knowledge"); @@ -728,6 +752,12 @@ export async function buildExecuteTaskPrompt( const budgets = computeBudgets(contextWindow); const verificationBudget = `~${Math.round(budgets.verificationBudgetChars / 1000)}K chars`; + // Compress carry-forward section when it exceeds 40% of inline context budget + const carryForwardBudget = Math.floor(budgets.inlineContextBudgetChars * 0.4); + const finalCarryForward = carryForwardSection.length > carryForwardBudget + ? compressToTarget(carryForwardSection, carryForwardBudget).content + : carryForwardSection; + return loadPrompt("execute-task", { overridesSection, workingDirectory: base, @@ -737,7 +767,7 @@ export async function buildExecuteTaskPrompt( taskPlanPath: taskPlanRelPath, taskPlanInline, slicePlanExcerpt, - carryForwardSection, + carryForwardSection: finalCarryForward, resumeSection, priorTaskLines: priorLines, taskSummaryPath, @@ -760,7 +790,7 @@ export async function buildCompleteSlicePrompt( inlined.push(await inlineFile(roadmapPath, roadmapRel, "Milestone Roadmap")); inlined.push(await inlineFile(slicePlanPath, slicePlanRel, "Slice Plan")); if (inlineLevel !== "minimal") { - const requirementsInline = await inlineRequirementsFromDb(base, sid); + const requirementsInline = await inlineRequirementsFromDb(base, sid, inlineLevel); if (requirementsInline) inlined.push(requirementsInline); } const knowledgeInlineCS = await inlineGsdRootFile(base, "knowledge.md", "Project Knowledge"); @@ -831,9 +861,9 @@ export async function buildCompleteMilestonePrompt( // Inline root GSD files (skip for minimal — completion can read these if needed) if (inlineLevel !== "minimal") { - const requirementsInline = 
await inlineRequirementsFromDb(base); + const requirementsInline = await inlineRequirementsFromDb(base, undefined, inlineLevel); if (requirementsInline) inlined.push(requirementsInline); - const decisionsInline = await inlineDecisionsFromDb(base, mid); + const decisionsInline = await inlineDecisionsFromDb(base, mid, undefined, inlineLevel); if (decisionsInline) inlined.push(decisionsInline); const projectInline = await inlineProjectFromDb(base); if (projectInline) inlined.push(projectInline); @@ -903,9 +933,9 @@ export async function buildValidateMilestonePrompt( // Inline root GSD files if (inlineLevel !== "minimal") { - const requirementsInline = await inlineRequirementsFromDb(base); + const requirementsInline = await inlineRequirementsFromDb(base, undefined, inlineLevel); if (requirementsInline) inlined.push(requirementsInline); - const decisionsInline = await inlineDecisionsFromDb(base, mid); + const decisionsInline = await inlineDecisionsFromDb(base, mid, undefined, inlineLevel); if (decisionsInline) inlined.push(decisionsInline); const projectInline = await inlineProjectFromDb(base); if (projectInline) inlined.push(projectInline); @@ -1051,9 +1081,9 @@ export async function buildReassessRoadmapPrompt( if (inlineLevel !== "minimal") { const projectInline = await inlineProjectFromDb(base); if (projectInline) inlined.push(projectInline); - const requirementsInline = await inlineRequirementsFromDb(base); + const requirementsInline = await inlineRequirementsFromDb(base, undefined, inlineLevel); if (requirementsInline) inlined.push(requirementsInline); - const decisionsInline = await inlineDecisionsFromDb(base, mid); + const decisionsInline = await inlineDecisionsFromDb(base, mid, undefined, inlineLevel); if (decisionsInline) inlined.push(decisionsInline); } const knowledgeInlineRA = await inlineGsdRootFile(base, "knowledge.md", "Project Knowledge"); diff --git a/src/resources/extensions/gsd/context-budget.ts b/src/resources/extensions/gsd/context-budget.ts index 
e39e2fdca..29bf03836 100644 --- a/src/resources/extensions/gsd/context-budget.ts +++ b/src/resources/extensions/gsd/context-budget.ts @@ -8,6 +8,9 @@ * @see D001 (module location), D002 (200K fallback), D003 (section-boundary truncation) */ +import { type TokenProvider, getCharsPerToken } from "./token-counter.js"; +import { compressToTarget } from "./prompt-compressor.js"; + // ─── Budget ratio constants ────────────────────────────────────────────────── // Percentages of total context window allocated to each budget category. // These are applied after tokens→chars conversion. @@ -93,9 +96,10 @@ export interface MinimalPreferences { * Returns deterministic output for any given input. Invalid inputs (≤ 0) * silently default to 200K (D002). */ -export function computeBudgets(contextWindow: number): BudgetAllocation { +export function computeBudgets(contextWindow: number, provider?: TokenProvider): BudgetAllocation { const effectiveWindow = contextWindow > 0 ? contextWindow : DEFAULT_CONTEXT_WINDOW; - const totalChars = effectiveWindow * CHARS_PER_TOKEN; + const charsPerToken = provider ? getCharsPerToken(provider) : CHARS_PER_TOKEN; + const totalChars = effectiveWindow * charsPerToken; return { summaryBudgetChars: Math.floor(totalChars * SUMMARY_RATIO), @@ -197,6 +201,25 @@ export function resolveExecutorContextWindow( return DEFAULT_CONTEXT_WINDOW; } +/** + * Smart context reduction: compress first, then truncate if still over budget. + * Returns the content within budget with maximum information preservation. 
+ */ +export function reduceToFit(content: string, budgetChars: number): TruncationResult { + if (!content || content.length <= budgetChars) { + return { content, droppedSections: 0 }; + } + + // Step 1: Try compression + const compressed = compressToTarget(content, budgetChars); + if (compressed.compressedChars <= budgetChars) { + return { content: compressed.content, droppedSections: 0 }; + } + + // Step 2: Truncate the compressed content at section boundaries + return truncateAtSectionBoundary(compressed.content, budgetChars); +} + // ─── Internal helpers ──────────────────────────────────────────────────────── /** diff --git a/src/resources/extensions/gsd/prompt-cache-optimizer.ts b/src/resources/extensions/gsd/prompt-cache-optimizer.ts new file mode 100644 index 000000000..36b886208 --- /dev/null +++ b/src/resources/extensions/gsd/prompt-cache-optimizer.ts @@ -0,0 +1,213 @@ +/** + * Prompt Cache Optimizer — separates prompt content into cacheable static + * prefixes and dynamic per-task suffixes to maximize provider cache hit rates. + * + * Anthropic caches by prefix match (up to 4 breakpoints, 90% savings). + * OpenAI auto-caches prompts with 1024+ stable prefix tokens (50% savings). + * Both benefit from placing static content first and dynamic content last. 
+ */ + +/** Content classification for cache optimization */ +export type ContentRole = "static" | "semi-static" | "dynamic"; + +/** A labeled section of prompt content with its cache role */ +export interface PromptSection { + /** Identifier for this section (for metrics/debugging) */ + label: string; + /** The content string */ + content: string; + /** Cache role: static (reused across tasks), semi-static (reused within scope), dynamic (per-task) */ + role: ContentRole; +} + +/** Result of optimizing prompt sections for caching */ +export interface CacheOptimizedPrompt { + /** Assembled prompt with static content first, dynamic last */ + prompt: string; + /** Character count of the cacheable prefix (static + semi-static sections) */ + cacheablePrefixChars: number; + /** Total character count */ + totalChars: number; + /** Estimated cache efficiency: cacheablePrefixChars / totalChars */ + cacheEfficiency: number; + /** Number of sections by role */ + sectionCounts: Record<ContentRole, number>; +} + +// ─── Label classification maps ─────────────────────────────────────────────── + +/** Labels that never change within a session */ +const STATIC_LABELS = new Set([ + "system-prompt", + "base-instructions", + "executor-constraints", +]); + +/** Prefix patterns for static labels (e.g. "template-*") */ +const STATIC_PREFIXES = ["template-"] as const; + +/** Labels that change per-slice but not per-task */ +const SEMI_STATIC_LABELS = new Set([ + "slice-plan", + "decisions", + "requirements", + "roadmap", + "prior-summaries", + "project-context", + "overrides", +]); + +/** Labels that change per-task */ +const DYNAMIC_LABELS = new Set([ + "task-plan", + "task-instructions", + "task-context", + "file-contents", + "diff-context", + "verification-commands", +]); + +// ─── Public API ────────────────────────────────────────────────────────────── + +/** + * Classify common GSD prompt sections by their caching potential. + * Returns the appropriate ContentRole for a section label. 
+ */ +export function classifySection(label: string): ContentRole { + if (STATIC_LABELS.has(label)) return "static"; + if (STATIC_PREFIXES.some((p) => label.startsWith(p))) return "static"; + if (SEMI_STATIC_LABELS.has(label)) return "semi-static"; + if (DYNAMIC_LABELS.has(label)) return "dynamic"; + // Conservative default: unknown labels are treated as dynamic + return "dynamic"; +} + +/** + * Build a PromptSection from content with automatic role classification. + * + * @param label Section label (e.g., "slice-plan", "task-instructions") + * @param content The section content + * @param role Optional explicit role override + */ +export function section( + label: string, + content: string, + role?: ContentRole, +): PromptSection { + return { + label, + content, + role: role ?? classifySection(label), + }; +} + +/** + * Optimize prompt sections for maximum cache hit rates. + * Reorders sections: static first, then semi-static, then dynamic. + * Preserves relative order within each role group. + * + * @param sections Array of labeled prompt sections + * @returns Cache-optimized prompt with statistics + */ +export function optimizeForCaching( + sections: PromptSection[], +): CacheOptimizedPrompt { + const groups: Record<ContentRole, PromptSection[]> = { + static: [], + "semi-static": [], + dynamic: [], + }; + + for (const s of sections) { + groups[s.role].push(s); + } + + const ordered = [ + ...groups["static"], + ...groups["semi-static"], + ...groups["dynamic"], + ]; + + const prompt = ordered.map((s) => s.content).join("\n\n"); + + const staticChars = groups["static"].reduce( + (sum, s) => sum + s.content.length, + 0, + ); + const semiStaticChars = groups["semi-static"].reduce( + (sum, s) => sum + s.content.length, + 0, + ); + + // Account for separator characters between sections in the cacheable prefix + const staticSeparators = + groups["static"].length > 0 + ? 
(groups["static"].length - 1) * 2 // "\n\n" between static sections + : 0; + const semiStaticSeparators = + groups["semi-static"].length > 0 + ? (groups["semi-static"].length - 1) * 2 + : 0; + // Separator between static and semi-static groups + const groupSeparator = + groups["static"].length > 0 && groups["semi-static"].length > 0 ? 2 : 0; + + const cacheablePrefixChars = + staticChars + + semiStaticChars + + staticSeparators + + semiStaticSeparators + + groupSeparator; + const totalChars = prompt.length; + const cacheEfficiency = totalChars > 0 ? cacheablePrefixChars / totalChars : 0; + + return { + prompt, + cacheablePrefixChars, + totalChars, + cacheEfficiency, + sectionCounts: { + static: groups["static"].length, + "semi-static": groups["semi-static"].length, + dynamic: groups["dynamic"].length, + }, + }; +} + +/** + * Estimate the cache savings for a given optimization result. + * Based on provider pricing: + * - Anthropic: 90% savings on cached tokens + * - OpenAI: 50% savings on cached tokens + * + * @param result The cache-optimized prompt + * @param provider Provider name for savings calculation + * @returns Estimated savings as a decimal (0.0-1.0) + */ +export function estimateCacheSavings( + result: CacheOptimizedPrompt, + provider: "anthropic" | "openai" | "other", +): number { + switch (provider) { + case "anthropic": + return result.cacheEfficiency * 0.9; + case "openai": + return result.cacheEfficiency * 0.5; + case "other": + return 0; + } +} + +/** + * Compute cache hit rate from token usage metrics. + * Returns a percentage 0-100. 
+ */ +export function computeCacheHitRate(usage: { + cacheRead: number; + cacheWrite: number; + input: number; +}): number { + const denominator = usage.cacheRead + usage.input; + if (denominator === 0) return 0; + return (usage.cacheRead / denominator) * 100; +} diff --git a/src/resources/extensions/gsd/prompt-compressor.ts b/src/resources/extensions/gsd/prompt-compressor.ts new file mode 100644 index 000000000..7f72b45ce --- /dev/null +++ b/src/resources/extensions/gsd/prompt-compressor.ts @@ -0,0 +1,508 @@ +/** + * Prompt Compressor — deterministic text compression for context reduction. + * + * Applies a series of lossless and near-lossless transformations to reduce + * token count while preserving semantic meaning. No LLM calls, no external + * dependencies. Sub-millisecond for typical prompt sizes. + * + * Compression techniques (applied in order): + * 1. Redundant whitespace normalization + * 2. Markdown formatting reduction (collapse verbose tables, lists) + * 3. Common phrase abbreviation + * 4. Repeated pattern deduplication + * 5. Low-information content removal (empty sections, boilerplate) + */ + +export type CompressionLevel = "light" | "moderate" | "aggressive"; + +export interface CompressionResult { + /** The compressed content */ + content: string; + /** Original character count */ + originalChars: number; + /** Compressed character count */ + compressedChars: number; + /** Savings percentage (0-100) */ + savingsPercent: number; + /** Which compression level was applied */ + level: CompressionLevel; + /** Number of transformations applied */ + transformationsApplied: number; +} + +export interface CompressionOptions { + /** Compression intensity. Default: "moderate" */ + level?: CompressionLevel; + /** Preserve markdown headings (useful for section-boundary truncation). Default: true */ + preserveHeadings?: boolean; + /** Preserve code blocks verbatim. 
Default: true */ + preserveCodeBlocks?: boolean; + /** Target character count (compression stops when achieved). Default: no target */ + targetChars?: number; +} + +// ─── Phrase Abbreviation Map ──────────────────────────────────────────────── + +/** + * Build a regex that matches a verbose phrase even when split across lines. + * Whitespace between words is matched with \s+ to handle line wrapping. + */ +function phraseRegex(phrase: string): RegExp { + const words = phrase.split(/\s+/); + const pattern = `\\b${words.join("\\s+")}\\b`; + return new RegExp(pattern, "gi"); +} + +const VERBOSE_PHRASES: Array<[RegExp, string]> = [ + [phraseRegex("In order to"), "To"], + [phraseRegex("It is important to note that"), "Note:"], + [phraseRegex("As mentioned previously"), "(see above)"], + [phraseRegex("The following"), "These"], + [phraseRegex("In addition to"), "Also,"], + [phraseRegex("Due to the fact that"), "Because"], + [phraseRegex("At this point in time"), "Now"], + [phraseRegex("For the purpose of"), "For"], + [phraseRegex("In the event that"), "If"], + [phraseRegex("With regard to"), "Re:"], + [phraseRegex("Prior to"), "Before"], + [phraseRegex("Subsequent to"), "After"], + [phraseRegex("In accordance with"), "Per"], + [phraseRegex("A number of"), "Several"], + [phraseRegex("In the case of"), "For"], + [phraseRegex("On the basis of"), "Based on"], +]; + +// ─── Code Block Extraction ────────────────────────────────────────────────── + +interface ExtractedBlocks { + text: string; + blocks: Map<string, string>; +} + +function extractCodeBlocks(content: string): ExtractedBlocks { + const blocks = new Map<string, string>(); + let counter = 0; + + const text = content.replace(/```[\s\S]*?```/g, (match) => { + const placeholder = `\x00CODEBLOCK_${counter++}\x00`; + blocks.set(placeholder, match); + return placeholder; + }); + + return { text, blocks }; +} + +function restoreCodeBlocks(text: string, blocks: Map<string, string>): string { + let result = text; + for (const [placeholder, block] of blocks) { + result = 
result.replace(placeholder, block); + } + return result; +} + +// ─── Light Transformations ────────────────────────────────────────────────── + +function normalizeWhitespace(content: string): string { + // Collapse 3+ consecutive blank lines to 2 + let result = content.replace(/(\n\s*){3,}\n/g, "\n\n"); + // Trim trailing whitespace on every line + result = result.replace(/[ \t]+$/gm, ""); + return result; +} + +function removeMarkdownComments(content: string): string { + return content.replace(/<!--[\s\S]*?-->/g, ""); +} + +function removeHorizontalRules(content: string): string { + // Remove horizontal rules (---, ***, ___) that stand alone on a line + return content.replace(/^\s*[-*_]{3,}\s*$/gm, ""); +} + +function collapseEmptyListItems(content: string): string { + // Collapse repeated empty list items (- \n- \n- \n) into one + return content.replace(/(^[ \t]*[-*+]\s*$\n){2,}/gm, "$1"); +} + +function applyLightTransformations(content: string): { content: string; count: number } { + let count = 0; + let result = content; + + const after1 = normalizeWhitespace(result); + if (after1 !== result) count++; + result = after1; + + const after2 = removeMarkdownComments(result); + if (after2 !== result) count++; + result = after2; + + const after3 = removeHorizontalRules(result); + if (after3 !== result) count++; + result = after3; + + const after4 = collapseEmptyListItems(result); + if (after4 !== result) count++; + result = after4; + + return { content: result, count }; +} + +// ─── Moderate Transformations ─────────────────────────────────────────────── + +function abbreviateVerbosePhrases(content: string): { content: string; count: number } { + let count = 0; + let result = content; + + for (const [pattern, replacement] of VERBOSE_PHRASES) { + const after = result.replace(pattern, replacement); + if (after !== result) count++; + result = after; + } + + return { content: result, count }; +} + +function removeBoilerplateLines(content: string): string { + const lines = 
content.split("\n"); + const filtered = lines.filter((line) => { + const trimmed = line.trim(); + // Remove lines that are just N/A, (none), (empty), (not applicable) + if (/^(?:N\/A|\(none\)|\(empty\)|\(not applicable\))$/i.test(trimmed)) { + return false; + } + return true; + }); + return filtered.join("\n"); +} + +function deduplicateConsecutiveLines(content: string): string { + const lines = content.split("\n"); + const result: string[] = []; + + for (let i = 0; i < lines.length; i++) { + if (i === 0 || lines[i] !== lines[i - 1] || lines[i].trim() === "") { + result.push(lines[i]); + } + } + + return result.join("\n"); +} + +function collapseTableFormatting(content: string): string { + // Remove excessive padding in markdown table cells + // Collapses padded cells like "| cell   |" down to "| cell |" + return content.replace(/\|[ \t]{2,}([^|\n]*?)[ \t]{2,}\|/g, (_, cellContent) => { + return `| ${cellContent.trim()} |`; + }); +} + +function applyModerateTransformations(content: string): { content: string; count: number } { + let count = 0; + let result = content; + + const phraseResult = abbreviateVerbosePhrases(result); + count += phraseResult.count; + result = phraseResult.content; + + const after1 = removeBoilerplateLines(result); + if (after1 !== result) count++; + result = after1; + + const after2 = deduplicateConsecutiveLines(result); + if (after2 !== result) count++; + result = after2; + + const after3 = collapseTableFormatting(result); + if (after3 !== result) count++; + result = after3; + + return { content: result, count }; +} + +// ─── Aggressive Transformations ───────────────────────────────────────────── + +function removeMarkdownEmphasis(content: string): string { + // Bold: **text** or __text__ + let result = content.replace(/\*\*(.+?)\*\*/g, "$1"); + result = result.replace(/__(.+?)__/g, "$1"); + // Italic: *text* or _text_ (single, not inside words) + result = result.replace(/(?<![\w*])\*([^*\n]+)\*(?![\w*])/g, "$1"); + result = result.replace(/(?<![\w_])_([^_\n]+)_(?![\w_])/g, "$1"); + return result; +} + +function removeMarkdownLinks(content: string): string { + // [text](url) -> keep the link text, drop the URL + return content.replace(/\[([^\]]+)\]\([^)]+\)/g, "$1"); +} + +function truncateLongLines(content: string): string { + const result = content.split("\n").map((line) =>
{ + if (line.length <= 300) return line; + // Find a sentence boundary (. ! ?) near the 300 char mark + const truncateZone = line.slice(0, 300); + const lastSentenceEnd = Math.max( + truncateZone.lastIndexOf(". "), + truncateZone.lastIndexOf("! "), + truncateZone.lastIndexOf("? "), + ); + if (lastSentenceEnd > 150) { + return line.slice(0, lastSentenceEnd + 1); + } + // Fallback: cut at last space before 300 + const lastSpace = truncateZone.lastIndexOf(" "); + if (lastSpace > 150) { + return line.slice(0, lastSpace); + } + return truncateZone; + }); + return result.join("\n"); +} + +function removeBulletMarkers(content: string): string { + // Remove bullet markers: - , * , + , numbered (1. 2. etc) + return content.replace(/^[ \t]*(?:[-*+]|\d+\.)\s+/gm, ""); +} + +function removeBlockquoteMarkers(content: string): string { + return content.replace(/^[ \t]*>+\s?/gm, ""); +} + +function deduplicateStructuralPatterns(content: string): string { + // Deduplicate consecutive lines that match the same "Key: value" pattern + const lines = content.split("\n"); + const result: string[] = []; + const seen = new Set(); + let lastWasStructural = false; + + for (const line of lines) { + const trimmed = line.trim(); + // Detect structural patterns: "Key: value" + const structMatch = trimmed.match(/^(\w[\w\s]*?):\s+(.+)$/); + if (structMatch) { + if (seen.has(trimmed)) { + lastWasStructural = true; + continue; + } + seen.add(trimmed); + lastWasStructural = true; + } else { + // Reset seen set when structural block ends + if (!lastWasStructural || trimmed === "") { + seen.clear(); + } + lastWasStructural = false; + } + result.push(line); + } + + return result.join("\n"); +} + +function applyAggressiveTransformations( + content: string, + preserveHeadings: boolean, +): { content: string; count: number } { + let count = 0; + let result = content; + + const after1 = removeMarkdownEmphasis(result); + if (after1 !== result) count++; + result = after1; + + const after2 = 
removeMarkdownLinks(result); + if (after2 !== result) count++; + result = after2; + + const after3 = truncateLongLines(result); + if (after3 !== result) count++; + result = after3; + + const after4 = removeBulletMarkers(result); + if (after4 !== result) count++; + result = after4; + + const after5 = removeBlockquoteMarkers(result); + if (after5 !== result) count++; + result = after5; + + const after6 = deduplicateStructuralPatterns(result); + if (after6 !== result) count++; + result = after6; + + return { content: result, count }; +} + +// ─── Heading Preservation ─────────────────────────────────────────────────── + +interface ExtractedHeadings { + text: string; + headings: Map<string, string>; +} + +function extractHeadings(content: string): ExtractedHeadings { + const headings = new Map<string, string>(); + let counter = 0; + + const text = content.replace(/^(#{1,6}\s.+)$/gm, (match) => { + const placeholder = `\x00HEADING_${counter++}\x00`; + headings.set(placeholder, match); + return placeholder; + }); + + return { text, headings }; +} + +function restoreHeadings(text: string, headings: Map<string, string>): string { + let result = text; + for (const [placeholder, heading] of headings) { + result = result.replace(placeholder, heading); + } + return result; +} + +// ─── Public API ───────────────────────────────────────────────────────────── + +/** + * Compress prompt content using deterministic text transformations. + */ +export function compressPrompt(content: string, options?: CompressionOptions): CompressionResult { + const level = options?.level ?? "moderate"; + const preserveHeadings = options?.preserveHeadings ?? true; + const preserveCodeBlocks = options?.preserveCodeBlocks ??
true; + + if (content === "") { + return { + content: "", + originalChars: 0, + compressedChars: 0, + savingsPercent: 0, + level, + transformationsApplied: 0, + }; + } + + const originalChars = content.length; + let working = content; + let totalTransformations = 0; + + // Extract code blocks if preserving + let codeBlocks: Map<string, string> | null = null; + if (preserveCodeBlocks) { + const extracted = extractCodeBlocks(working); + working = extracted.text; + codeBlocks = extracted.blocks; + } + + // Extract headings if preserving + let headings: Map<string, string> | null = null; + if (preserveHeadings) { + const extracted = extractHeadings(working); + working = extracted.text; + headings = extracted.headings; + } + + // Apply light transformations (always) + const lightResult = applyLightTransformations(working); + working = lightResult.content; + totalTransformations += lightResult.count; + + // Check target + if (options?.targetChars && getRestoredLength(working, codeBlocks, headings) <= options.targetChars) { + return buildResult(working, originalChars, level, totalTransformations, codeBlocks, headings); + } + + // Apply moderate transformations + if (level === "moderate" || level === "aggressive") { + const modResult = applyModerateTransformations(working); + working = modResult.content; + totalTransformations += modResult.count; + + if (options?.targetChars && getRestoredLength(working, codeBlocks, headings) <= options.targetChars) { + return buildResult(working, originalChars, level, totalTransformations, codeBlocks, headings); + } + } + + // Apply aggressive transformations + if (level === "aggressive") { + const aggResult = applyAggressiveTransformations(working, preserveHeadings); + working = aggResult.content; + totalTransformations += aggResult.count; + } + + return buildResult(working, originalChars, level, totalTransformations, codeBlocks, headings); +} + +/** + * Compress with a target size — applies progressively more aggressive + * compression until the target is reached or
all transformations exhausted. + */ +export function compressToTarget(content: string, targetChars: number): CompressionResult { + if (content.length <= targetChars) { + return { + content, + originalChars: content.length, + compressedChars: content.length, + savingsPercent: 0, + level: "light", + transformationsApplied: 0, + }; + } + + const levels: CompressionLevel[] = ["light", "moderate", "aggressive"]; + + for (const level of levels) { + const result = compressPrompt(content, { level, targetChars }); + if (result.compressedChars <= targetChars) { + return result; + } + // If aggressive and still over target, return best effort + if (level === "aggressive") { + return result; + } + } + + // Unreachable, but satisfy TypeScript + return compressPrompt(content, { level: "aggressive" }); +} + +// ─── Helpers ──────────────────────────────────────────────────────────────── + +function getRestoredLength( + text: string, + codeBlocks: Map<string, string> | null, + headings: Map<string, string> | null, +): number { + let result = text; + if (headings) result = restoreHeadings(result, headings); + if (codeBlocks) result = restoreCodeBlocks(result, codeBlocks); + return result.length; +} + +function buildResult( + working: string, + originalChars: number, + level: CompressionLevel, + transformationsApplied: number, + codeBlocks: Map<string, string> | null, + headings: Map<string, string> | null, +): CompressionResult { + let content = working; + if (headings) content = restoreHeadings(content, headings); + if (codeBlocks) content = restoreCodeBlocks(content, codeBlocks); + + const compressedChars = content.length; + const savingsPercent = originalChars > 0 + ?
Math.round(((originalChars - compressedChars) / originalChars) * 10000) / 100 + : 0; + + return { + content, + originalChars, + compressedChars, + savingsPercent, + level, + transformationsApplied, + }; +} diff --git a/src/resources/extensions/gsd/semantic-chunker.ts b/src/resources/extensions/gsd/semantic-chunker.ts new file mode 100644 index 000000000..41747dd89 --- /dev/null +++ b/src/resources/extensions/gsd/semantic-chunker.ts @@ -0,0 +1,336 @@ +// GSD Extension — Semantic Chunker with TF-IDF Relevance Scoring +// Splits code/text into semantic chunks and selects the most relevant ones for a given task. +// Pure TypeScript — no external dependencies. + +// ─── Types ────────────────────────────────────────────────────────────────── + +export interface Chunk { + content: string; + startLine: number; + endLine: number; + score: number; +} + +export interface ChunkResult { + chunks: Chunk[]; + totalChunks: number; + omittedChunks: number; + savingsPercent: number; +} + +interface ChunkOptions { + minLines?: number; + maxLines?: number; +} + +interface RelevanceOptions { + maxChunks?: number; + minChunkLines?: number; + maxChunkLines?: number; + minScore?: number; +} + +// ─── Constants ────────────────────────────────────────────────────────────── + +const CODE_BOUNDARY_RE = /^(export\s+)?(async\s+)?(function|class|interface|type|const|enum)\s/; + +const MARKDOWN_HEADING_RE = /^#{1,6}\s/; + +const STOP_WORDS = new Set([ + "the", "a", "an", "is", "are", "was", "were", "be", "to", "of", "in", + "for", "on", "with", "at", "by", "from", "this", "that", "it", "as", + "or", "and", "not", "but", "if", "do", "no", "so", "up", "its", "has", + "had", "get", "set", "can", "may", "all", "use", "new", "one", "two", + "also", "each", "than", "been", "into", "most", "only", "over", "such", + "how", "some", "any", "our", "his", "her", "out", "did", "let", "say", "she", +]); + +const DEFAULT_MIN_LINES = 3; +const DEFAULT_MAX_LINES = 80; +const DEFAULT_MAX_CHUNKS = 5; +const 
DEFAULT_MIN_SCORE = 0.1; + +// ─── Content Type Detection ───────────────────────────────────────────────── + +type ContentType = "code" | "markdown" | "text"; + +function detectContentType(lines: string[]): ContentType { + let codeSignals = 0; + let mdSignals = 0; + const sampleSize = Math.min(lines.length, 50); + + for (let i = 0; i < sampleSize; i++) { + const line = lines[i]; + if (CODE_BOUNDARY_RE.test(line) || /^\s*import\s/.test(line)) { + codeSignals++; + } + if (MARKDOWN_HEADING_RE.test(line)) { + mdSignals++; + } + } + + if (mdSignals >= 2 && mdSignals > codeSignals) return "markdown"; + if (codeSignals >= 2) return "code"; + return "text"; +} + +// ─── Tokenizer ────────────────────────────────────────────────────────────── + +function tokenize(text: string): string[] { + return text + .toLowerCase() + .split(/[\s\W]+/) + .filter((w) => w.length >= 2 && !STOP_WORDS.has(w)); +} + +// ─── splitIntoChunks ──────────────────────────────────────────────────────── + +export function splitIntoChunks( + content: string, + options?: ChunkOptions, +): Chunk[] { + if (!content || content.trim().length === 0) return []; + + const minLines = options?.minLines ?? DEFAULT_MIN_LINES; + const maxLines = options?.maxLines ?? DEFAULT_MAX_LINES; + const lines = content.split("\n"); + + if (lines.length === 0) return []; + + const contentType = detectContentType(lines); + let boundaries: number[]; + + switch (contentType) { + case "code": + boundaries = findCodeBoundaries(lines); + break; + case "markdown": + boundaries = findMarkdownBoundaries(lines); + break; + default: + boundaries = findTextBoundaries(lines); + break; + } + + // Always include 0 as first boundary + if (boundaries.length === 0 || boundaries[0] !== 0) { + boundaries.unshift(0); + } + + // Build raw chunks from boundaries + const rawChunks: Chunk[] = []; + for (let i = 0; i < boundaries.length; i++) { + const start = boundaries[i]; + const end = i + 1 < boundaries.length ? 
boundaries[i + 1] - 1 : lines.length - 1; + const chunkLines = lines.slice(start, end + 1); + rawChunks.push({ + content: chunkLines.join("\n"), + startLine: start + 1, // 1-based + endLine: end + 1, // 1-based + score: 0, + }); + } + + // Split oversized chunks at maxLines + const splitChunks: Chunk[] = []; + for (const chunk of rawChunks) { + const chunkLineCount = chunk.endLine - chunk.startLine + 1; + if (chunkLineCount <= maxLines) { + splitChunks.push(chunk); + } else { + const chunkLines = chunk.content.split("\n"); + for (let offset = 0; offset < chunkLines.length; offset += maxLines) { + const slice = chunkLines.slice(offset, offset + maxLines); + splitChunks.push({ + content: slice.join("\n"), + startLine: chunk.startLine + offset, + endLine: chunk.startLine + offset + slice.length - 1, + score: 0, + }); + } + } + } + + // Merge tiny chunks into predecessor + const merged: Chunk[] = []; + for (const chunk of splitChunks) { + const chunkLineCount = chunk.endLine - chunk.startLine + 1; + if (chunkLineCount < minLines && merged.length > 0) { + const prev = merged[merged.length - 1]; + prev.content += "\n" + chunk.content; + prev.endLine = chunk.endLine; + } else { + merged.push({ ...chunk }); + } + } + + return merged; +} + +function findCodeBoundaries(lines: string[]): number[] { + const boundaries: number[] = []; + for (let i = 0; i < lines.length; i++) { + if (CODE_BOUNDARY_RE.test(lines[i])) { + // Also consider a blank line before a boundary marker + if (i > 0 && lines[i - 1].trim() === "" && !boundaries.includes(i)) { + boundaries.push(i); + } else if (!boundaries.includes(i)) { + boundaries.push(i); + } + } + } + return boundaries; +} + +function findMarkdownBoundaries(lines: string[]): number[] { + const boundaries: number[] = []; + for (let i = 0; i < lines.length; i++) { + if (MARKDOWN_HEADING_RE.test(lines[i])) { + boundaries.push(i); + } + } + return boundaries; +} + +function findTextBoundaries(lines: string[]): number[] { + const boundaries: 
number[] = [0]; + for (let i = 1; i < lines.length; i++) { + if (lines[i - 1].trim() === "" && lines[i].trim() !== "") { + boundaries.push(i); + } + } + return boundaries; +} + +// ─── scoreChunks ──────────────────────────────────────────────────────────── + +export function scoreChunks(chunks: Chunk[], query: string): Chunk[] { + if (chunks.length === 0) return []; + + const queryTerms = tokenize(query); + if (queryTerms.length === 0) { + return chunks.map((c) => ({ ...c, score: 0 })); + } + + const totalChunks = chunks.length; + + // Pre-compute IDF for each query term + const termChunkCounts = new Map<string, number>(); + const chunkTokenSets: Set<string>[] = []; + + for (const chunk of chunks) { + const tokens = new Set(tokenize(chunk.content)); + chunkTokenSets.push(tokens); + for (const term of queryTerms) { + if (tokens.has(term)) { + termChunkCounts.set(term, (termChunkCounts.get(term) ?? 0) + 1); + } + } + } + + const idf = new Map<string, number>(); + for (const term of queryTerms) { + const df = termChunkCounts.get(term) ?? 0; + idf.set(term, Math.log(1 + totalChunks / (1 + df))); + } + + // Score each chunk + const scored = chunks.map((chunk, idx) => { + const chunkTokens = tokenize(chunk.content); + const totalTerms = chunkTokens.length; + if (totalTerms === 0) return { ...chunk, score: 0 }; + + // Count term frequencies + const termFreq = new Map<string, number>(); + for (const token of chunkTokens) { + termFreq.set(token, (termFreq.get(token) ?? 0) + 1); + } + + let score = 0; + for (const term of queryTerms) { + const tf = (termFreq.get(term) ?? 0) / totalTerms; + const termIdf = idf.get(term) ??
0; + score += tf * termIdf; + } + + return { ...chunk, score }; + }); + + // Normalize to 0-1 + const maxScore = Math.max(...scored.map((c) => c.score)); + if (maxScore > 0) { + for (const chunk of scored) { + chunk.score = chunk.score / maxScore; + } + } + + return scored; +} + +// ─── chunkByRelevance ─────────────────────────────────────────────────────── + +export function chunkByRelevance( + content: string, + query: string, + options?: RelevanceOptions, +): ChunkResult { + const maxChunks = options?.maxChunks ?? DEFAULT_MAX_CHUNKS; + const minScore = options?.minScore ?? DEFAULT_MIN_SCORE; + const minLines = options?.minChunkLines ?? DEFAULT_MIN_LINES; + const maxLines = options?.maxChunkLines ?? DEFAULT_MAX_LINES; + + const rawChunks = splitIntoChunks(content, { minLines, maxLines }); + if (rawChunks.length === 0) { + return { chunks: [], totalChunks: 0, omittedChunks: 0, savingsPercent: 0 }; + } + + const scored = scoreChunks(rawChunks, query); + + // Filter by minScore and take top maxChunks by score + const qualifying = scored + .filter((c) => c.score >= minScore) + .sort((a, b) => b.score - a.score) + .slice(0, maxChunks); + + // Return in original document order (by startLine) + const selected = qualifying.sort((a, b) => a.startLine - b.startLine); + + const totalChars = content.length; + const selectedChars = selected.reduce((sum, c) => sum + c.content.length, 0); + const savingsPercent = totalChars > 0 + ? 
Math.round(((totalChars - selectedChars) / totalChars) * 100) + : 0; + + return { + chunks: selected, + totalChunks: rawChunks.length, + omittedChunks: rawChunks.length - selected.length, + savingsPercent: Math.max(0, savingsPercent), + }; +} + +// ─── formatChunks ─────────────────────────────────────────────────────────── + +export function formatChunks(result: ChunkResult, filePath: string): string { + if (result.chunks.length === 0) { + return `[${filePath}: empty or no relevant chunks]`; + } + + const parts: string[] = []; + let lastEndLine = 0; + + for (const chunk of result.chunks) { + // Show omission gap + if (lastEndLine > 0 && chunk.startLine > lastEndLine + 1) { + const gapLines = chunk.startLine - lastEndLine - 1; + parts.push(`[...${gapLines} lines omitted...]`); + } + + parts.push(`[Lines ${chunk.startLine}-${chunk.endLine}]`); + parts.push(chunk.content); + + lastEndLine = chunk.endLine; + } + + return parts.join("\n"); +} diff --git a/src/resources/extensions/gsd/structured-data-formatter.ts b/src/resources/extensions/gsd/structured-data-formatter.ts new file mode 100644 index 000000000..20c3768eb --- /dev/null +++ b/src/resources/extensions/gsd/structured-data-formatter.ts @@ -0,0 +1,144 @@ +/** + * Structured Data Formatter — compact notation for prompt injection. + * + * Converts GSD data structures into a token-efficient format that removes + * markdown table overhead, redundant labels, and formatting while remaining + * perfectly readable by LLMs. 
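+ *
+ * For example (illustrative values), a decisions table row such as
+ *
+ *   | D-001 | planning | global | storage | sqlite | zero-ops | yes |
+ *
+ * becomes the compact line
+ *
+ *   D-001 | planning | global | storage | sqlite | zero-ops | yes
+ *
+ * once the header row, dash separator, and cell padding are dropped.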
+ * + * Format rules: + * - No table pipes, dashes, or header rows + * - Use indentation (2 spaces) for structure instead of delimiters + * - Omit field names when the pattern is clear from a header + * - Use single-line entries for simple records + * - Use multi-line with indentation for complex records + */ + +// --------------------------------------------------------------------------- +// Types (inline — no imports from other GSD modules) +// --------------------------------------------------------------------------- + +interface DecisionInput { + id: string; + when_context: string; + scope: string; + decision: string; + choice: string; + rationale: string; + revisable: string; +} + +interface RequirementInput { + id: string; + class: string; + status: string; + description: string; + why: string; + primary_owner: string; + validation: string; +} + +interface TaskPlanInput { + id: string; + title: string; + description: string; + done: boolean; + estimate: string; + files?: string[]; + verify?: string; +} + +// --------------------------------------------------------------------------- +// Decisions +// --------------------------------------------------------------------------- + +/** Compact format for a single decision record (pipe-separated, no padding). */ +export function formatDecisionCompact(decision: DecisionInput): string { + return [ + decision.id, + decision.when_context, + decision.scope, + decision.decision, + decision.choice, + decision.rationale, + decision.revisable, + ].join(" | "); +} + +/** Format multiple decisions in compact notation with a Fields header. 
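+ *
+ * @example Output shape for two records (illustrative values):
+ *
+ *   # Decisions (compact)
+ *   Fields: id | when | scope | decision | choice | rationale | revisable
+ *
+ *   D-001 | planning | global | storage | sqlite | zero-ops | yes
+ *   D-002 | execution | S01 | retries | backoff | transient failures | no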
*/ +export function formatDecisionsCompact(decisions: DecisionInput[]): string { + if (decisions.length === 0) { + return "# Decisions (compact)\n(none)"; + } + + const header = "# Decisions (compact)\nFields: id | when | scope | decision | choice | rationale | revisable"; + const lines = decisions.map(formatDecisionCompact); + return `${header}\n\n${lines.join("\n")}`; +} + +// --------------------------------------------------------------------------- +// Requirements +// --------------------------------------------------------------------------- + +/** Compact format for a single requirement record (multi-line). */ +export function formatRequirementCompact(req: RequirementInput): string { + const lines: string[] = []; + lines.push(`${req.id} [${req.class}] (${req.status}) owner:${req.primary_owner}`); + lines.push(` ${req.description}`); + lines.push(` why: ${req.why}`); + lines.push(` validate: ${req.validation}`); + return lines.join("\n"); +} + +/** Format multiple requirements in compact notation. */ +export function formatRequirementsCompact(requirements: RequirementInput[]): string { + if (requirements.length === 0) { + return "# Requirements (compact)\n(none)"; + } + + const header = "# Requirements (compact)"; + const blocks = requirements.map(formatRequirementCompact); + return `${header}\n\n${blocks.join("\n\n")}`; +} + +// --------------------------------------------------------------------------- +// Task Plans +// --------------------------------------------------------------------------- + +/** Compact format for task plan entries. */ +export function formatTaskPlanCompact(tasks: TaskPlanInput[]): string { + if (tasks.length === 0) { + return "# Tasks (compact)\n(none)"; + } + + const header = "# Tasks (compact)"; + const blocks = tasks.map((t) => { + const check = t.done ? 
"x" : " "; + const lines: string[] = []; + lines.push(`${t.id} [${check}] ${t.title} (${t.estimate})`); + if (t.files && t.files.length > 0) { + lines.push(` files: ${t.files.join(", ")}`); + } + if (t.verify) { + lines.push(` verify: ${t.verify}`); + } + lines.push(` ${t.description}`); + return lines.join("\n"); + }); + + return `${header}\n\n${blocks.join("\n\n")}`; +} + +// --------------------------------------------------------------------------- +// Savings measurement +// --------------------------------------------------------------------------- + +/** + * Measure the token savings of compact format vs markdown format. + * Returns savings as a percentage (0-100). + * A positive number means compact is smaller (saves tokens). + */ +export function measureSavings(compactContent: string, markdownContent: string): number { + if (markdownContent.length === 0) return 0; + const saved = markdownContent.length - compactContent.length; + return (saved / markdownContent.length) * 100; +} diff --git a/src/resources/extensions/gsd/summary-distiller.ts b/src/resources/extensions/gsd/summary-distiller.ts new file mode 100644 index 000000000..1aee5b203 --- /dev/null +++ b/src/resources/extensions/gsd/summary-distiller.ts @@ -0,0 +1,258 @@ +/** + * Summary distiller — extracts essential structured data from SUMMARY.md files, + * dropping verbose prose to save context budget. 
+ */ + +export interface DistillationResult { + content: string; + summaryCount: number; + savingsPercent: number; + originalChars: number; + distilledChars: number; +} + +interface ParsedFrontmatter { + id: string; + provides: string[]; + requires: string[]; + key_files: string[]; + key_decisions: string[]; + patterns_established: string[]; +} + +interface DistilledEntry { + id: string; + oneLiner: string; + provides: string[]; + requires: string[]; + key_files: string[]; + key_decisions: string[]; + patterns: string[]; +} + +// ─── Frontmatter parsing ───────────────────────────────────────────────────── + +function parseFrontmatter(raw: string): ParsedFrontmatter { + const result: ParsedFrontmatter = { + id: "", + provides: [], + requires: [], + key_files: [], + key_decisions: [], + patterns_established: [], + }; + + // Extract frontmatter block between --- markers + const fmMatch = raw.match(/^---\r?\n([\s\S]*?)\r?\n---/); + if (!fmMatch) return result; + + const fmBlock = fmMatch[1]; + const lines = fmBlock.split(/\r?\n/); + + let currentKey: string | null = null; + + for (const line of lines) { + // Scalar value: key: value + const scalarMatch = line.match(/^(\w[\w_]*):\s*(.+)$/); + if (scalarMatch) { + const [, key, value] = scalarMatch; + currentKey = key; + setScalar(result, key, value.trim()); + continue; + } + + // Array-start key with empty value: key:\n or key: []\n + const arrayStartMatch = line.match(/^(\w[\w_]*):\s*(\[\])?\s*$/); + if (arrayStartMatch) { + currentKey = arrayStartMatch[1]; + continue; + } + + // Array item: - value + const itemMatch = line.match(/^\s+-\s+(.+)$/); + if (itemMatch && currentKey) { + pushItem(result, currentKey, itemMatch[1].trim()); + continue; + } + } + + return result; +} + +function setScalar(fm: ParsedFrontmatter, key: string, value: string): void { + if (key === "id") fm.id = value; +} + +function pushItem(fm: ParsedFrontmatter, key: string, value: string): void { + switch (key) { + case "provides": 
fm.provides.push(value); break; + case "requires": fm.requires.push(value); break; + case "key_files": fm.key_files.push(value); break; + case "key_decisions": fm.key_decisions.push(value); break; + case "patterns_established": fm.patterns_established.push(value); break; + } +} + +// ─── Body parsing ──────────────────────────────────────────────────────────── + +function extractTitleAndOneLiner(body: string): { id: string; oneLiner: string } { + const lines = body.split(/\r?\n/); + let titleId = ""; + let oneLiner = ""; + let foundTitle = false; + + for (const line of lines) { + const titleMatch = line.match(/^#\s+(\S+):\s*(.*)$/); + if (titleMatch && !foundTitle) { + titleId = titleMatch[1]; + // If the title line itself has text after "S01: ", use that as a fallback + if (titleMatch[2].trim()) { + oneLiner = titleMatch[2].trim(); + } + foundTitle = true; + continue; + } + + // First non-empty line after the title is the one-liner + if (foundTitle && !oneLiner && line.trim() && !line.startsWith("#")) { + oneLiner = line.trim(); + break; + } + } + + return { id: titleId, oneLiner }; +} + +function getBodyAfterFrontmatter(raw: string): string { + const fmMatch = raw.match(/^---\r?\n[\s\S]*?\r?\n---\r?\n?/); + if (fmMatch) { + return raw.slice(fmMatch[0].length); + } + return raw; +} + +// ─── Public API ────────────────────────────────────────────────────────────── + +/** + * Distill a single SUMMARY.md content string into a compact structured block. 
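+ *
+ * @example (illustrative) A summary whose frontmatter has id "S01",
+ * provides [auth-api], key_files [src/auth.ts], and whose body starts
+ * with "# S01: Session-based auth" distills to:
+ *
+ *   ## S01: Session-based auth
+ *   provides: auth-api
+ *   key_files: src/auth.ts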
+ */ +export function distillSingle(summary: string): string { + const fm = parseFrontmatter(summary); + const body = getBodyAfterFrontmatter(summary); + const { id: titleId, oneLiner } = extractTitleAndOneLiner(body); + + const id = fm.id || titleId || "???"; + + return formatEntry({ + id, + oneLiner, + provides: fm.provides, + requires: fm.requires, + key_files: fm.key_files, + key_decisions: fm.key_decisions, + patterns: fm.patterns_established, + }); +} + +function formatEntry(entry: DistilledEntry): string { + return formatEntryWithDropLevel(entry, 0); +} + +/** + * Format an entry, progressively dropping fields based on dropLevel: + * 0 = full output + * 1 = drop patterns + * 2 = drop patterns + key_decisions + * 3 = drop patterns + key_decisions + key_files + */ +function formatEntryWithDropLevel(entry: DistilledEntry, dropLevel: number): string { + const lines: string[] = []; + lines.push(`## ${entry.id}: ${entry.oneLiner}`); + + if (entry.provides.length > 0) { + lines.push(`provides: ${entry.provides.join(", ")}`); + } + if (entry.requires.length > 0) { + lines.push(`requires: ${entry.requires.join(", ")}`); + } + if (dropLevel < 3 && entry.key_files.length > 0) { + lines.push(`key_files: ${entry.key_files.join(", ")}`); + } + if (dropLevel < 2 && entry.key_decisions.length > 0) { + lines.push(`key_decisions: ${entry.key_decisions.join(", ")}`); + } + if (dropLevel < 1 && entry.patterns.length > 0) { + lines.push(`patterns: ${entry.patterns.join(", ")}`); + } + + return lines.join("\n"); +} + +/** + * Distill multiple SUMMARY.md contents into a budget-constrained output. 
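+ *
+ * Drop levels 0-3 are tried in order; e.g. (illustrative) at drop level 2
+ * an entry renders without its patterns and key_decisions lines:
+ *
+ *   ## S01: Session-based auth
+ *   provides: auth-api
+ *   requires: db-schema
+ *   key_files: src/auth.ts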
+ */ +export function distillSummaries(summaries: string[], budgetChars: number): DistillationResult { + const originalChars = summaries.reduce((sum, s) => sum + s.length, 0); + + if (summaries.length === 0) { + return { + content: "", + summaryCount: 0, + savingsPercent: 0, + originalChars: 0, + distilledChars: 0, + }; + } + + // Parse all entries up front + const entries: DistilledEntry[] = summaries.map((summary) => { + const fm = parseFrontmatter(summary); + const body = getBodyAfterFrontmatter(summary); + const { id: titleId, oneLiner } = extractTitleAndOneLiner(body); + return { + id: fm.id || titleId || "???", + oneLiner, + provides: fm.provides, + requires: fm.requires, + key_files: fm.key_files, + key_decisions: fm.key_decisions, + patterns: fm.patterns_established, + }; + }); + + // Try progressively more aggressive dropping until it fits + for (let dropLevel = 0; dropLevel <= 3; dropLevel++) { + const blocks = entries.map((e) => formatEntryWithDropLevel(e, dropLevel)); + const content = blocks.join("\n\n"); + if (content.length <= budgetChars) { + const distilledChars = content.length; + return { + content, + summaryCount: summaries.length, + savingsPercent: originalChars > 0 + ? Math.round((1 - distilledChars / originalChars) * 100) + : 0, + originalChars, + distilledChars, + }; + } + } + + // Even at max drop level it doesn't fit — truncate + const blocks = entries.map((e) => formatEntryWithDropLevel(e, 3)); + let content = blocks.join("\n\n"); + if (content.length > budgetChars) { + content = content.slice(0, Math.max(0, budgetChars - 15)) + "\n[...truncated]"; + } + + const distilledChars = content.length; + return { + content, + summaryCount: summaries.length, + savingsPercent: originalChars > 0 + ? 
Math.round((1 - distilledChars / originalChars) * 100) + : 0, + originalChars, + distilledChars, + }; +} diff --git a/src/resources/extensions/gsd/tests/context-budget.test.ts b/src/resources/extensions/gsd/tests/context-budget.test.ts index 1e3f1c67c..6ac2531f6 100644 --- a/src/resources/extensions/gsd/tests/context-budget.test.ts +++ b/src/resources/extensions/gsd/tests/context-budget.test.ts @@ -18,6 +18,8 @@ import { resolveExecutorContextWindow, } from "../context-budget.js"; +import type { TokenProvider } from "../token-counter.js"; + // ─── Test helpers ───────────────────────────────────────────────────────────── function makeRegistry(models: MinimalModel[]): MinimalModelRegistry { @@ -281,3 +283,70 @@ describe("context-budget: resolveExecutorContextWindow", () => { assert.equal(result, 200_000); // falls through to default }); }); + +// ─── computeBudgets with provider ───────────────────────────────────────────── + +describe("context-budget: computeBudgets with provider", () => { + it("anthropic budgets differ from default budgets for same window", () => { + const defaultBudgets = computeBudgets(200_000); + const anthropicBudgets = computeBudgets(200_000, "anthropic"); + + // anthropic uses 3.5 chars/token vs default 4.0 + // so anthropic totalChars = 200K * 3.5 = 700K vs default 200K * 4 = 800K + assert.ok( + anthropicBudgets.summaryBudgetChars < defaultBudgets.summaryBudgetChars, + `anthropic summary (${anthropicBudgets.summaryBudgetChars}) should be less than default (${defaultBudgets.summaryBudgetChars})`, + ); + assert.ok( + anthropicBudgets.inlineContextBudgetChars < defaultBudgets.inlineContextBudgetChars, + `anthropic inline (${anthropicBudgets.inlineContextBudgetChars}) should be less than default (${defaultBudgets.inlineContextBudgetChars})`, + ); + }); + + it("openai provider matches default budgets (both use 4.0 chars/token)", () => { + const defaultBudgets = computeBudgets(128_000); + const openaiBudgets = computeBudgets(128_000, "openai"); + 
+ assert.deepStrictEqual(openaiBudgets, defaultBudgets); + }); + + it("anthropic budgets are proportional to 3.5 chars/token", () => { + const b = computeBudgets(200_000, "anthropic"); + // 200K tokens * 3.5 chars/token = 700K chars total + assert.equal(b.summaryBudgetChars, Math.floor(700_000 * 0.15)); + assert.equal(b.inlineContextBudgetChars, Math.floor(700_000 * 0.40)); + assert.equal(b.verificationBudgetChars, Math.floor(700_000 * 0.10)); + }); + + it("bedrock budgets match anthropic (both use 3.5 chars/token)", () => { + const anthropicBudgets = computeBudgets(200_000, "anthropic"); + const bedrockBudgets = computeBudgets(200_000, "bedrock"); + + assert.deepStrictEqual(bedrockBudgets, anthropicBudgets); + }); + + it("default behavior unchanged when no provider is passed", () => { + const b = computeBudgets(128_000); + // 128K * 4 = 512K + assert.equal(b.summaryBudgetChars, Math.floor(512_000 * 0.15)); + assert.equal(b.inlineContextBudgetChars, Math.floor(512_000 * 0.40)); + assert.equal(b.verificationBudgetChars, Math.floor(512_000 * 0.10)); + assert.equal(b.continueThresholdPercent, 70); + assert.equal(b.taskCountRange.min, 2); + assert.equal(b.taskCountRange.max, 5); + }); + + it("task count range is unaffected by provider", () => { + const defaultBudgets = computeBudgets(200_000); + const anthropicBudgets = computeBudgets(200_000, "anthropic"); + + assert.deepStrictEqual(anthropicBudgets.taskCountRange, defaultBudgets.taskCountRange); + assert.equal(anthropicBudgets.continueThresholdPercent, defaultBudgets.continueThresholdPercent); + }); + + it("handles zero input with provider — defaults to 200K", () => { + const b = computeBudgets(0, "anthropic"); + const b200 = computeBudgets(200_000, "anthropic"); + assert.deepStrictEqual(b, b200); + }); +}); diff --git a/src/resources/extensions/gsd/tests/prompt-cache-optimizer.test.ts b/src/resources/extensions/gsd/tests/prompt-cache-optimizer.test.ts new file mode 100644 index 000000000..67e01d685 --- /dev/null +++ 
b/src/resources/extensions/gsd/tests/prompt-cache-optimizer.test.ts @@ -0,0 +1,314 @@ +/** + * Unit tests for prompt-cache-optimizer.ts — cache-aware prompt reordering. + */ + +import { describe, it } from "node:test"; +import assert from "node:assert/strict"; + +import { + type ContentRole, + type PromptSection, + classifySection, + section, + optimizeForCaching, + estimateCacheSavings, + computeCacheHitRate, +} from "../prompt-cache-optimizer.js"; + +// ─── classifySection ───────────────────────────────────────────────────────── + +describe("prompt-cache-optimizer: classifySection", () => { + it("classifies system-prompt as static", () => { + assert.equal(classifySection("system-prompt"), "static"); + }); + + it("classifies base-instructions as static", () => { + assert.equal(classifySection("base-instructions"), "static"); + }); + + it("classifies executor-constraints as static", () => { + assert.equal(classifySection("executor-constraints"), "static"); + }); + + it("classifies template-* prefixed labels as static", () => { + assert.equal(classifySection("template-code"), "static"); + assert.equal(classifySection("template-review"), "static"); + assert.equal(classifySection("template-"), "static"); + }); + + it("classifies slice-plan as semi-static", () => { + assert.equal(classifySection("slice-plan"), "semi-static"); + }); + + it("classifies decisions as semi-static", () => { + assert.equal(classifySection("decisions"), "semi-static"); + }); + + it("classifies requirements as semi-static", () => { + assert.equal(classifySection("requirements"), "semi-static"); + }); + + it("classifies roadmap as semi-static", () => { + assert.equal(classifySection("roadmap"), "semi-static"); + }); + + it("classifies prior-summaries as semi-static", () => { + assert.equal(classifySection("prior-summaries"), "semi-static"); + }); + + it("classifies project-context as semi-static", () => { + assert.equal(classifySection("project-context"), "semi-static"); + }); + + 
it("classifies overrides as semi-static", () => { + assert.equal(classifySection("overrides"), "semi-static"); + }); + + it("classifies task-plan as dynamic", () => { + assert.equal(classifySection("task-plan"), "dynamic"); + }); + + it("classifies task-instructions as dynamic", () => { + assert.equal(classifySection("task-instructions"), "dynamic"); + }); + + it("classifies task-context as dynamic", () => { + assert.equal(classifySection("task-context"), "dynamic"); + }); + + it("classifies file-contents as dynamic", () => { + assert.equal(classifySection("file-contents"), "dynamic"); + }); + + it("classifies diff-context as dynamic", () => { + assert.equal(classifySection("diff-context"), "dynamic"); + }); + + it("classifies verification-commands as dynamic", () => { + assert.equal(classifySection("verification-commands"), "dynamic"); + }); + + it("defaults unknown labels to dynamic", () => { + assert.equal(classifySection("something-unknown"), "dynamic"); + assert.equal(classifySection(""), "dynamic"); + assert.equal(classifySection("random-label"), "dynamic"); + }); +}); + +// ─── section() helper ──────────────────────────────────────────────────────── + +describe("prompt-cache-optimizer: section()", () => { + it("auto-classifies based on label", () => { + const s = section("system-prompt", "You are an assistant."); + assert.equal(s.label, "system-prompt"); + assert.equal(s.content, "You are an assistant."); + assert.equal(s.role, "static"); + }); + + it("auto-classifies semi-static labels", () => { + const s = section("slice-plan", "Plan content here."); + assert.equal(s.role, "semi-static"); + }); + + it("auto-classifies dynamic labels", () => { + const s = section("task-instructions", "Do this task."); + assert.equal(s.role, "dynamic"); + }); + + it("allows manual role override", () => { + const s = section("unknown-label", "content", "static"); + assert.equal(s.role, "static"); + }); + + it("override takes precedence over auto-classification", () => { + 
const s = section("system-prompt", "content", "dynamic"); + assert.equal(s.role, "dynamic"); + }); +}); + +// ─── optimizeForCaching ────────────────────────────────────────────────────── + +describe("prompt-cache-optimizer: optimizeForCaching", () => { + it("orders static before semi-static before dynamic", () => { + const sections: PromptSection[] = [ + { label: "task", content: "DYNAMIC", role: "dynamic" }, + { label: "plan", content: "SEMI", role: "semi-static" }, + { label: "sys", content: "STATIC", role: "static" }, + ]; + + const result = optimizeForCaching(sections); + const parts = result.prompt.split("\n\n"); + assert.equal(parts[0], "STATIC"); + assert.equal(parts[1], "SEMI"); + assert.equal(parts[2], "DYNAMIC"); + }); + + it("preserves relative order within the same role group", () => { + const sections: PromptSection[] = [ + { label: "d1", content: "D-first", role: "dynamic" }, + { label: "d2", content: "D-second", role: "dynamic" }, + { label: "s1", content: "S-first", role: "static" }, + { label: "s2", content: "S-second", role: "static" }, + ]; + + const result = optimizeForCaching(sections); + const parts = result.prompt.split("\n\n"); + assert.equal(parts[0], "S-first"); + assert.equal(parts[1], "S-second"); + assert.equal(parts[2], "D-first"); + assert.equal(parts[3], "D-second"); + }); + + it("calculates cacheEfficiency correctly", () => { + const sections: PromptSection[] = [ + { label: "sys", content: "AAAA", role: "static" }, // 4 chars + { label: "plan", content: "BBBB", role: "semi-static" }, // 4 chars + { label: "task", content: "CCCC", role: "dynamic" }, // 4 chars + ]; + + const result = optimizeForCaching(sections); + // Cacheable prefix = "AAAA" + "\n\n" + "BBBB" = 10 chars + // Total = "AAAA\n\nBBBB\n\nCCCC" = 16 chars + assert.equal(result.cacheablePrefixChars, 10); + assert.equal(result.totalChars, 16); + assert.ok(Math.abs(result.cacheEfficiency - 10 / 16) < 0.001); + }); + + it("returns correct section counts", () => { + const 
sections: PromptSection[] = [ + { label: "a", content: "x", role: "static" }, + { label: "b", content: "y", role: "static" }, + { label: "c", content: "z", role: "semi-static" }, + { label: "d", content: "w", role: "dynamic" }, + ]; + + const result = optimizeForCaching(sections); + assert.deepEqual(result.sectionCounts, { + static: 2, + "semi-static": 1, + dynamic: 1, + }); + }); + + it("handles empty sections array", () => { + const result = optimizeForCaching([]); + assert.equal(result.prompt, ""); + assert.equal(result.cacheablePrefixChars, 0); + assert.equal(result.totalChars, 0); + assert.equal(result.cacheEfficiency, 0); + assert.deepEqual(result.sectionCounts, { + static: 0, + "semi-static": 0, + dynamic: 0, + }); + }); + + it("handles only static sections (100% cacheable)", () => { + const sections: PromptSection[] = [ + { label: "sys", content: "Hello", role: "static" }, + ]; + + const result = optimizeForCaching(sections); + assert.equal(result.cacheEfficiency, 1); + assert.equal(result.cacheablePrefixChars, result.totalChars); + }); + + it("handles only dynamic sections (0% cacheable)", () => { + const sections: PromptSection[] = [ + { label: "task", content: "Do something", role: "dynamic" }, + ]; + + const result = optimizeForCaching(sections); + assert.equal(result.cacheablePrefixChars, 0); + assert.equal(result.cacheEfficiency, 0); + }); +}); + +// ─── estimateCacheSavings ──────────────────────────────────────────────────── + +describe("prompt-cache-optimizer: estimateCacheSavings", () => { + it("returns 90% of cache efficiency for anthropic", () => { + const result = optimizeForCaching([ + { label: "sys", content: "AAAA", role: "static" }, + { label: "task", content: "CCCC", role: "dynamic" }, + ]); + // cacheEfficiency = 4 / 10 = 0.4 + const savings = estimateCacheSavings(result, "anthropic"); + assert.ok(Math.abs(savings - result.cacheEfficiency * 0.9) < 0.001); + }); + + it("returns 50% of cache efficiency for openai", () => { + const result = 
optimizeForCaching([ + { label: "sys", content: "AAAA", role: "static" }, + { label: "task", content: "CCCC", role: "dynamic" }, + ]); + const savings = estimateCacheSavings(result, "openai"); + assert.ok(Math.abs(savings - result.cacheEfficiency * 0.5) < 0.001); + }); + + it("returns 0 for other providers", () => { + const result = optimizeForCaching([ + { label: "sys", content: "AAAA", role: "static" }, + ]); + assert.equal(estimateCacheSavings(result, "other"), 0); + }); + + it("returns 0 when cache efficiency is 0", () => { + const result = optimizeForCaching([ + { label: "task", content: "CCCC", role: "dynamic" }, + ]); + assert.equal(estimateCacheSavings(result, "anthropic"), 0); + assert.equal(estimateCacheSavings(result, "openai"), 0); + }); +}); + +// ─── computeCacheHitRate ───────────────────────────────────────────────────── + +describe("prompt-cache-optimizer: computeCacheHitRate", () => { + it("computes hit rate as percentage", () => { + const rate = computeCacheHitRate({ + cacheRead: 800, + cacheWrite: 200, + input: 200, + }); + // 800 / (800 + 200) * 100 = 80% + assert.equal(rate, 80); + }); + + it("returns 0 when no cache activity", () => { + const rate = computeCacheHitRate({ + cacheRead: 0, + cacheWrite: 0, + input: 0, + }); + assert.equal(rate, 0); + }); + + it("returns 100 when everything is from cache", () => { + const rate = computeCacheHitRate({ + cacheRead: 1000, + cacheWrite: 0, + input: 0, + }); + assert.equal(rate, 100); + }); + + it("returns 0 when nothing is from cache", () => { + const rate = computeCacheHitRate({ + cacheRead: 0, + cacheWrite: 500, + input: 1000, + }); + assert.equal(rate, 0); + }); + + it("ignores cacheWrite in hit rate calculation", () => { + const rate = computeCacheHitRate({ + cacheRead: 500, + cacheWrite: 9999, + input: 500, + }); + // 500 / (500 + 500) * 100 = 50% + assert.equal(rate, 50); + }); +}); diff --git a/src/resources/extensions/gsd/tests/prompt-compressor.test.ts 
b/src/resources/extensions/gsd/tests/prompt-compressor.test.ts
new file mode 100644
index 000000000..36f99b4f8
--- /dev/null
+++ b/src/resources/extensions/gsd/tests/prompt-compressor.test.ts
@@ -0,0 +1,529 @@
+import test from "node:test";
+import assert from "node:assert/strict";
+
+import {
+  compressPrompt,
+  compressToTarget,
+} from "../prompt-compressor.js";
+import type {
+  CompressionLevel,
+  CompressionResult,
+  CompressionOptions,
+} from "../prompt-compressor.js";
+
+// ─── Test Fixtures ──────────────────────────────────────────────────────────
+
+const WHITESPACE_HEAVY = `# Section One
+
+Some content here.
+
+
+
+Another paragraph here.
+
+
+Yet another paragraph.
+
+
+
+# Section Two
+
+More content.`;
+
+const MARKDOWN_COMMENTS = `# Title
+
+<!-- editorial note that should be stripped -->
+
+Some content here.
+
+<!-- another hidden comment -->
+
+More content.`;
+
+const HORIZONTAL_RULES = `# Section One
+
+Some content.
+
+---
+
+# Section Two
+
+More content.
+
+***
+
+# Section Three
+
+Final content.`;
+
+const VERBOSE_PROSE = `In order to implement this feature, it is important to note that the following
+requirements must be met. Due to the fact that the system operates in real-time,
+prior to deployment we need to verify all components. In addition to the main
+module, a number of auxiliary services are required. In the event that a service
+fails, subsequent to the failure, the system should recover. For the purpose of
+monitoring, in accordance with our SLA, with regard to uptime, at this point in
+time we achieve 99.9%. On the basis of recent data, in the case of peak traffic,
+as mentioned previously, the system scales automatically.`;
+
+const BOILERPLATE_CONTENT = `# Requirements
+
+## Feature A
+Must support pagination.
+
+## Feature B
+N/A
+
+## Feature C
+(none)
+
+## Feature D
+(empty)
+
+## Feature E
+(not applicable)
+
+## Feature F
+Must handle errors gracefully.`;
+
+const DUPLICATE_LINES = `Status: active
+Status: active
+Status: active
+Priority: high
+Name: test project
+Name: test project`;
+
+const EMPHASIS_CONTENT = `This is **bold text** and this is *italic text*.
+Also __underline bold__ and _underline italic_.
+Check [this link](https://example.com) and [another](https://test.org).`;
+
+const CODE_BLOCK_CONTENT = `# Setup Guide
+
+In order to configure the system, run the following command:
+
+\`\`\`typescript
+const config = {
+  debug: true,
+  verbose: false,
+  timeout: 3000,
+};
+\`\`\`
+
+Due to the fact that configuration is loaded at startup, prior to
+running the application, verify the config file exists.
+
+\`\`\`bash
+ls -la config.json
+\`\`\`
+
+The following steps complete the setup.`;
+
+const HEADING_CONTENT = `# Main Title
+
+## Subsection A
+
+In order to do something, the following steps are needed.
+
+## Subsection B
+
+More content here with **emphasis** and [a link](https://example.com).
+
+### Sub-subsection
+
+Details here.`;
+
+const REALISTIC_GSD_CONTENT = `# Project: GSD Task Manager
+
+<!-- internal note: keep this document in sync with the roadmap -->
+
+## Decisions
+
+| Decision ID   | Title                        | Status     | Date         |
+|---------------|------------------------------|------------|--------------|
+| DEC-001       | Use TypeScript               | Approved   | 2024-01-15   |
+| DEC-002       | Adopt monorepo               | Approved   | 2024-01-20   |
+| DEC-003       | Use node:test                | Approved   | 2024-02-01   |
+
+## Requirements
+
+### Must-Have
+
+In order to support the core workflow, it is important to note that the following
+requirements are non-negotiable. Due to the fact that the system must operate in
+CI environments, prior to any release, all tests must pass.
+ +- The system must handle concurrent operations +- The system must handle concurrent operations +- Error recovery must be automatic +- Configuration must be file-based +- Configuration must be file-based + +### Nice-to-Have + +N/A + +### Out of Scope + +(none) + +--- + +## Implementation Notes + +> In accordance with our coding standards, all modules should follow +> the single responsibility principle. With regard to testing, a number of +> integration tests should supplement unit tests. + +For the purpose of maintaining code quality, at this point in time we require +100% branch coverage on critical paths. In the event that coverage drops below +the threshold, subsequent to the detection, the CI pipeline should fail. + +**Important**: The following constraints apply: +- *Memory usage* must stay under 512MB +- *CPU usage* must not exceed 80% sustained +- Response times under 100ms for the 95th percentile + +In addition to the above, the system should support plugin architecture. +As mentioned previously, this was decided in DEC-001. + +--- + +## Status + +Status: active +Status: active +Priority: high +Sprint: 14 +Sprint: 14 +Milestone: v2.1.0`; + +const LONG_LINE = "This is a very long line that goes on and on. It contains multiple sentences that discuss various topics. The purpose of this line is to test the truncation functionality. When lines exceed 300 characters, they should be truncated at a sentence boundary. This ensures that the compressed output remains readable. Additional text is added here to make sure we exceed the 300 character limit for testing purposes. Even more text follows to pad the line further."; + +const BLOCKQUOTE_CONTENT = `> This is a blockquote +> with multiple lines +> that should have markers removed. + +Normal paragraph here. + +> Another blockquote.`; + +const BULLET_LIST = `Some intro text: + +- First item in the list +- Second item in the list +* Third item with star ++ Fourth item with plus +1. Numbered item one +2. 
Numbered item two
+
+Closing text.`;
+
+// ─── Light Compression Tests ────────────────────────────────────────────────
+
+test("light compression removes extra whitespace", () => {
+  const result = compressPrompt(WHITESPACE_HEAVY, { level: "light" });
+  assert.ok(result.compressedChars < result.originalChars, "should reduce size");
+  assert.ok(!result.content.includes(" \n"), "should not have trailing spaces");
+  // Should not have 3+ consecutive blank lines
+  assert.ok(!result.content.match(/\n\s*\n\s*\n\s*\n/), "should not have 3+ blank lines");
+  assert.equal(result.level, "light");
+});
+
+test("light compression removes markdown comments", () => {
+  const result = compressPrompt(MARKDOWN_COMMENTS, { level: "light" });
+  assert.ok(!result.content.includes("<!--"), "should not contain comment start");
+  assert.ok(!result.content.includes("-->"), "should not contain comment end");
+  assert.ok(result.content.includes("# Title"), "should preserve heading");
+  assert.ok(result.content.includes("Some content here."), "should preserve normal content");
+});
+
+test("light compression removes horizontal rules", () => {
+  const result = compressPrompt(HORIZONTAL_RULES, { level: "light" });
+  assert.ok(!result.content.match(/^---$/m), "should not contain ---");
+  assert.ok(!result.content.match(/^\*\*\*$/m), "should not contain ***");
+  assert.ok(result.content.includes("# Section One"), "should preserve headings");
+  assert.ok(result.content.includes("# Section Two"), "should preserve headings");
+});
+
+test("light compression preserves code blocks", () => {
+  const result = compressPrompt(CODE_BLOCK_CONTENT, { level: "light" });
+  assert.ok(result.content.includes("const config = {"), "should preserve code block content");
+  assert.ok(result.content.includes("```typescript"), "should preserve code fence");
+  assert.ok(result.content.includes("```bash"), "should preserve code fence");
+});
+
+// ─── Moderate Compression Tests ─────────────────────────────────────────────
+
+test("moderate compression abbreviates verbose phrases", () => {
+  const result 
= compressPrompt(VERBOSE_PROSE, { level: "moderate" }); + assert.ok(result.content.includes("To implement"), "should abbreviate 'In order to'"); + assert.ok(result.content.includes("Because"), "should abbreviate 'Due to the fact that'"); + assert.ok(result.content.includes("Before deployment"), "should abbreviate 'Prior to'"); + assert.ok(result.content.includes("Also,"), "should abbreviate 'In addition to'"); + assert.ok(result.content.includes("Several"), "should abbreviate 'A number of'"); + assert.ok(result.content.includes("If"), "should abbreviate 'In the event that'"); + assert.ok(result.content.includes("After"), "should abbreviate 'Subsequent to'"); + assert.ok(!result.content.includes("For the purpose of"), "should abbreviate 'For the purpose of'"); + assert.ok(result.content.includes("Per"), "should abbreviate 'In accordance with'"); + assert.ok(result.content.includes("Re:"), "should abbreviate 'With regard to'"); + assert.ok(result.content.includes("Now"), "should abbreviate 'At this point in time'"); + assert.ok(result.content.includes("Based on"), "should abbreviate 'On the basis of'"); + assert.ok(result.content.includes("(see above)"), "should abbreviate 'As mentioned previously'"); + assert.ok(result.compressedChars < result.originalChars, "should reduce size"); +}); + +test("moderate compression deduplicates consecutive lines", () => { + const input = "Line one\nLine one\nLine one\nLine two\nLine three\nLine three"; + const result = compressPrompt(input, { level: "moderate" }); + const lines = result.content.split("\n").filter((l) => l.trim() !== ""); + // Count occurrences of "Line one" + const lineOneCount = lines.filter((l) => l === "Line one").length; + assert.equal(lineOneCount, 1, "should deduplicate 'Line one'"); + const lineThreeCount = lines.filter((l) => l === "Line three").length; + assert.equal(lineThreeCount, 1, "should deduplicate 'Line three'"); +}); + +test("moderate compression removes boilerplate", () => { + const result = 
compressPrompt(BOILERPLATE_CONTENT, { level: "moderate" }); + assert.ok(!result.content.match(/^\s*N\/A\s*$/m), "should remove N/A lines"); + assert.ok(!result.content.includes("(none)"), "should remove (none)"); + assert.ok(!result.content.includes("(empty)"), "should remove (empty)"); + assert.ok(!result.content.includes("(not applicable)"), "should remove (not applicable)"); + assert.ok(result.content.includes("Must support pagination"), "should keep real content"); + assert.ok(result.content.includes("Must handle errors"), "should keep real content"); +}); + +test("moderate compression collapses table formatting", () => { + const table = `| Name | Value | Status | +| foo | bar | active |`; + const result = compressPrompt(table, { level: "moderate" }); + // Should have reduced padding + assert.ok(result.compressedChars < result.originalChars, "should reduce table padding"); +}); + +// ─── Aggressive Compression Tests ─────────────────────────────────────────── + +test("aggressive compression removes emphasis and links", () => { + const result = compressPrompt(EMPHASIS_CONTENT, { level: "aggressive" }); + assert.ok(!result.content.includes("**"), "should remove bold markers"); + assert.ok(!result.content.includes("__"), "should remove underline bold markers"); + assert.ok(result.content.includes("bold text"), "should keep bold text content"); + assert.ok(result.content.includes("italic text"), "should keep italic text content"); + assert.ok(result.content.includes("this link"), "should keep link text"); + assert.ok(!result.content.includes("https://example.com"), "should remove link URLs"); + assert.ok(!result.content.includes("https://test.org"), "should remove link URLs"); +}); + +test("aggressive compression removes bullet markers", () => { + const result = compressPrompt(BULLET_LIST, { level: "aggressive" }); + assert.ok(!result.content.match(/^- /m), "should remove dash bullets"); + assert.ok(!result.content.match(/^\* /m), "should remove star bullets"); + 
assert.ok(!result.content.match(/^\+ /m), "should remove plus bullets"); + assert.ok(!result.content.match(/^\d+\. /m), "should remove numbered bullets"); + assert.ok(result.content.includes("First item"), "should keep bullet content"); + assert.ok(result.content.includes("Numbered item"), "should keep numbered content"); +}); + +test("aggressive compression removes blockquote markers", () => { + const result = compressPrompt(BLOCKQUOTE_CONTENT, { level: "aggressive" }); + assert.ok(!result.content.match(/^> /m), "should remove blockquote markers"); + assert.ok(result.content.includes("This is a blockquote"), "should keep blockquote content"); + assert.ok(result.content.includes("Normal paragraph"), "should keep normal content"); +}); + +test("aggressive compression truncates long lines", () => { + const result = compressPrompt(LONG_LINE, { level: "aggressive" }); + const lines = result.content.split("\n"); + for (const line of lines) { + assert.ok(line.length <= 300, `line should be <= 300 chars, got ${line.length}`); + } +}); + +test("aggressive compression deduplicates structural patterns", () => { + const result = compressPrompt(DUPLICATE_LINES, { level: "aggressive" }); + const lines = result.content.split("\n").filter((l) => l.trim() !== ""); + const statusCount = lines.filter((l) => l.includes("Status: active")).length; + assert.equal(statusCount, 1, "should keep only one Status: active"); + const nameCount = lines.filter((l) => l.includes("Name: test project")).length; + assert.equal(nameCount, 1, "should keep only one Name: test project"); +}); + +// ─── Preservation Tests ───────────────────────────────────────────────────── + +test("code block preservation protects code from compression", () => { + const result = compressPrompt(CODE_BLOCK_CONTENT, { + level: "aggressive", + preserveCodeBlocks: true, + }); + // Code blocks should be untouched + assert.ok(result.content.includes("const config = {"), "code block preserved"); + 
assert.ok(result.content.includes("debug: true,"), "code block details preserved"); + assert.ok(result.content.includes("ls -la config.json"), "bash code block preserved"); + // But surrounding prose should be compressed + assert.ok(!result.content.includes("In order to"), "prose should be compressed"); + assert.ok(!result.content.includes("Due to the fact that"), "prose should be compressed"); +}); + +test("code block preservation can be disabled", () => { + const result = compressPrompt(CODE_BLOCK_CONTENT, { + level: "aggressive", + preserveCodeBlocks: false, + }); + // Phrase abbreviation still works on surrounding text + assert.ok(result.compressedChars < result.originalChars, "should still compress"); +}); + +test("heading preservation keeps headings intact", () => { + const result = compressPrompt(HEADING_CONTENT, { + level: "aggressive", + preserveHeadings: true, + }); + assert.ok(result.content.includes("# Main Title"), "should preserve h1"); + assert.ok(result.content.includes("## Subsection A"), "should preserve h2"); + assert.ok(result.content.includes("## Subsection B"), "should preserve h2"); + assert.ok(result.content.includes("### Sub-subsection"), "should preserve h3"); +}); + +// ─── compressToTarget Tests ───────────────────────────────────────────────── + +test("compressToTarget tries progressively harder levels", () => { + // Set a target that light compression cannot reach + const lightResult = compressPrompt(REALISTIC_GSD_CONTENT, { level: "light" }); + const moderateResult = compressPrompt(REALISTIC_GSD_CONTENT, { level: "moderate" }); + + // Target between light and moderate results + const target = Math.floor((lightResult.compressedChars + moderateResult.compressedChars) / 2); + const result = compressToTarget(REALISTIC_GSD_CONTENT, target); + + // Should have used at least moderate + assert.ok( + result.level === "moderate" || result.level === "aggressive", + `should use moderate or aggressive, got ${result.level}`, + ); + 
assert.ok(result.compressedChars <= target, "should meet target"); +}); + +test("compressToTarget returns best effort when target unreachable", () => { + // Set an impossibly small target + const result = compressToTarget(REALISTIC_GSD_CONTENT, 10); + assert.equal(result.level, "aggressive", "should try aggressive as last resort"); + assert.ok(result.compressedChars > 10, "cannot reach impossibly small target"); + assert.ok( + result.compressedChars < REALISTIC_GSD_CONTENT.length, + "should still compress as much as possible", + ); +}); + +test("compressToTarget returns unchanged if already under target", () => { + const result = compressToTarget("short text", 1000); + assert.equal(result.content, "short text"); + assert.equal(result.savingsPercent, 0); + assert.equal(result.transformationsApplied, 0); +}); + +// ─── Realistic GSD Content Test ───────────────────────────────────────────── + +test("realistic GSD content compresses significantly", () => { + const result = compressPrompt(REALISTIC_GSD_CONTENT, { level: "aggressive" }); + + // Should achieve meaningful compression + assert.ok(result.savingsPercent > 15, `should achieve >15% savings, got ${result.savingsPercent}%`); + assert.ok(result.transformationsApplied > 3, "should apply multiple transformations"); + + // Key content preserved + assert.ok(result.content.includes("# Project: GSD Task Manager"), "title preserved"); + assert.ok(result.content.includes("DEC-001"), "decision IDs preserved"); + assert.ok(result.content.includes("TypeScript"), "decision content preserved"); + assert.ok(result.content.includes("## Decisions"), "section headings preserved"); + assert.ok(result.content.includes("## Requirements"), "section headings preserved"); + + // Comments removed + assert.ok(!result.content.includes("