docs: add extension docs
parent 86a6456aef
commit 99c72ea18e
107 changed files with 9291 additions and 2 deletions
3
.gitignore
vendored
|
|
@ -38,5 +38,4 @@ dist/
|
|||
.gsd*.tgz
|
||||
.gsd
|
||||
.artifacts/
|
||||
AGENTS.md
|
||||
docs/
|
||||
AGENTS.md
|
||||
222
docs/agent-knowledge-index.md
Normal file
|
|
@ -0,0 +1,222 @@
|
|||
# Agent Knowledge Index
|
||||
|
||||
Use this file as a machine-operational routing table for pi docs and research references.
|
||||
|
||||
Rules:
|
||||
|
||||
- Read only the specific files relevant to the current task.
|
||||
- Prefer the primary bundle first.
|
||||
- Read files in parallel when the task clearly maps to multiple known references.
|
||||
- Use absolute paths directly with `read`.
|
||||
- Follow conditional references only when the primary bundle does not answer the question.
|
||||
|
||||
## Pi architecture
|
||||
|
||||
Use when:
|
||||
|
||||
- understanding how pi works end to end
|
||||
- tracing subsystem relationships
|
||||
- understanding sessions, compaction, models, tools, or prompt flow
|
||||
- deciding how to embed pi in a branded app, custom CLI, desktop app, or web product
|
||||
|
||||
Read first:
|
||||
|
||||
- `/Users/lexchristopherson/.gsd/docs/what-is-pi/01-what-pi-is.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/what-is-pi/04-the-architecture-how-everything-fits-together.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/what-is-pi/05-the-agent-loop-how-pi-thinks.md`
|
||||
|
||||
Read together when relevant:
|
||||
|
||||
- `/Users/lexchristopherson/.gsd/docs/what-is-pi/06-tools-how-pi-acts-on-the-world.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/what-is-pi/07-sessions-memory-that-branches.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/what-is-pi/08-compaction-how-pi-manages-context-limits.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/what-is-pi/09-the-customization-stack.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/what-is-pi/10-providers-models-multi-model-by-default.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/what-is-pi/13-context-files-project-instructions.md`
|
||||
|
||||
Follow-up if needed:
|
||||
|
||||
- `/Users/lexchristopherson/.gsd/docs/what-is-pi/03-the-four-modes-of-operation.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/what-is-pi/11-the-interactive-tui.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/what-is-pi/12-the-message-queue-talking-while-pi-thinks.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/what-is-pi/14-the-sdk-rpc-embedding-pi.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/what-is-pi/15-pi-packages-the-ecosystem.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/what-is-pi/16-why-pi-matters-what-makes-it-different.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/what-is-pi/17-file-reference-all-documentation.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/what-is-pi/18-quick-reference-commands-shortcuts.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/what-is-pi/19-building-branded-apps-on-top-of-pi.md`
|
||||
|
||||
## Context engineering, hooks, and context flow
|
||||
|
||||
Use when:
|
||||
|
||||
- understanding how user prompts flow through to the LLM
|
||||
- working with the `before_agent_start`, `context`, `tool_call`, `tool_result`, and `input` hooks
|
||||
- injecting, filtering, or transforming LLM context
|
||||
- understanding message types and what the LLM actually sees
|
||||
- coordinating multiple extensions
|
||||
- building mode systems, presets, or context management extensions
|
||||
- debugging why the LLM does or doesn't see certain information
|
||||
|
||||
Read first:
|
||||
|
||||
- `/Users/lexchristopherson/.gsd/docs/context-and-hooks/01-the-context-pipeline.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/context-and-hooks/02-hook-reference.md`
|
||||
|
||||
Read together when relevant:
|
||||
|
||||
- `/Users/lexchristopherson/.gsd/docs/context-and-hooks/03-context-injection-patterns.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/context-and-hooks/04-message-types-and-llm-visibility.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/context-and-hooks/05-inter-extension-communication.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/context-and-hooks/06-advanced-patterns-from-source.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/context-and-hooks/07-the-system-prompt-anatomy.md`
|
||||
|
||||
## Extension development
|
||||
|
||||
Use when:
|
||||
|
||||
- building or modifying extensions
|
||||
- adding tools, commands, hooks, renderers, state, or packaging
|
||||
|
||||
Read first:
|
||||
|
||||
- `/Users/lexchristopherson/.gsd/docs/extending-pi/01-what-are-extensions.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/extending-pi/02-architecture-mental-model.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/extending-pi/03-getting-started.md`
|
||||
|
||||
Read together when relevant:
|
||||
|
||||
- `/Users/lexchristopherson/.gsd/docs/extending-pi/06-the-extension-lifecycle.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/extending-pi/07-events-the-nervous-system.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/extending-pi/08-extensioncontext-what-you-can-access.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/extending-pi/09-extensionapi-what-you-can-do.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/extending-pi/10-custom-tools-giving-the-llm-new-abilities.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/extending-pi/11-custom-commands-user-facing-actions.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/extending-pi/14-custom-rendering-controlling-what-the-user-sees.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/extending-pi/25-slash-command-subcommand-patterns.md` (for subcommand-style slash command UX via `getArgumentCompletions()`)
|
||||
- `/Users/lexchristopherson/.gsd/docs/extending-pi/15-system-prompt-modification.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/extending-pi/22-key-rules-gotchas.md`
|
||||
|
||||
Follow-up if needed:
|
||||
|
||||
- `/Users/lexchristopherson/.gsd/docs/extending-pi/04-extension-locations-discovery.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/extending-pi/05-extension-structure-styles.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/extending-pi/12-custom-ui-visual-components.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/extending-pi/13-state-management-persistence.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/extending-pi/16-compaction-session-control.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/extending-pi/17-model-provider-management.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/extending-pi/18-remote-execution-tool-overrides.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/extending-pi/19-packaging-distribution.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/extending-pi/20-mode-behavior.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/extending-pi/21-error-handling.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/extending-pi/23-file-reference-documentation.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/extending-pi/24-file-reference-example-extensions.md`
|
||||
|
||||
## Pi UI and TUI
|
||||
|
||||
Use when:
|
||||
|
||||
- building dialogs, widgets, overlays, custom editors, or UI renderers
|
||||
- working on TUI layout or display behavior
|
||||
|
||||
Read first:
|
||||
|
||||
- `/Users/lexchristopherson/.gsd/docs/pi-ui-tui/01-the-ui-architecture.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/pi-ui-tui/03-entry-points-how-ui-gets-on-screen.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/pi-ui-tui/22-quick-reference-all-ui-apis.md`
|
||||
|
||||
Read together when relevant:
|
||||
|
||||
- `/Users/lexchristopherson/.gsd/docs/pi-ui-tui/04-built-in-dialog-methods.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/pi-ui-tui/05-persistent-ui-elements.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/pi-ui-tui/06-ctx-ui-custom-full-custom-components.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/pi-ui-tui/07-built-in-components-the-building-blocks.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/pi-ui-tui/12-overlays-floating-modals-and-panels.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/pi-ui-tui/13-custom-editors-replacing-the-input.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/pi-ui-tui/14-tool-rendering-custom-tool-display.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/pi-ui-tui/15-message-rendering-custom-message-display.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/pi-ui-tui/21-common-mistakes-and-how-to-avoid-them.md`
|
||||
|
||||
Follow-up if needed:
|
||||
|
||||
- `/Users/lexchristopherson/.gsd/docs/pi-ui-tui/02-the-component-interface-foundation-of-everything.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/pi-ui-tui/08-high-level-components-from-pi-coding-agent.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/pi-ui-tui/09-keyboard-input-how-to-handle-keys.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/pi-ui-tui/10-line-width-the-cardinal-rule.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/pi-ui-tui/11-theming-colors-and-styles.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/pi-ui-tui/16-performance-caching-and-invalidation.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/pi-ui-tui/17-theme-changes-and-invalidation.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/pi-ui-tui/18-ime-support-the-focusable-interface.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/pi-ui-tui/19-building-a-complete-component-step-by-step.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/pi-ui-tui/20-real-world-patterns-from-examples.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/pi-ui-tui/23-file-reference-example-extensions-with-ui.md`
|
||||
|
||||
## Building coding agents
|
||||
|
||||
Use when:
|
||||
|
||||
- designing agent behavior
|
||||
- improving autonomy, speed, context handling, or decomposition
|
||||
- solving hard ambiguity, safety, or verification problems
|
||||
|
||||
Read first:
|
||||
|
||||
- `/Users/lexchristopherson/.gsd/docs/building-coding-agents/01-work-decomposition.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/building-coding-agents/06-maximizing-agent-autonomy-superpowers.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/building-coding-agents/11-god-tier-context-engineering.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/building-coding-agents/12-handling-ambiguity-contradiction.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/building-coding-agents/26-cross-cutting-themes-where-all-4-models-converge.md`
|
||||
|
||||
Read together when relevant:
|
||||
|
||||
- `/Users/lexchristopherson/.gsd/docs/building-coding-agents/03-state-machine-context-management.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/building-coding-agents/04-optimal-storage-for-project-context.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/building-coding-agents/05-parallelization-strategy.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/building-coding-agents/07-system-prompt-llm-vs-deterministic-split.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/building-coding-agents/08-speed-optimization.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/building-coding-agents/10-top-10-pitfalls-to-avoid.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/building-coding-agents/17-irreversible-operations-safety-architecture.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/building-coding-agents/20-error-taxonomy-routing.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/building-coding-agents/24-security-trust-boundaries.md`
|
||||
|
||||
Follow-up if needed:
|
||||
|
||||
- `/Users/lexchristopherson/.gsd/docs/building-coding-agents/02-what-to-keep-discard-from-human-engineering.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/building-coding-agents/09-top-10-tips-for-a-world-class-agent.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/building-coding-agents/13-long-running-memory-fidelity.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/building-coding-agents/14-multi-agent-semantic-conflict-resolution.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/building-coding-agents/15-legacy-code-brownfield-onboarding.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/building-coding-agents/16-encoding-taste-aesthetics.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/building-coding-agents/18-the-handoff-problem-agent-human-maintainability.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/building-coding-agents/19-when-to-scrap-and-start-over.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/building-coding-agents/21-cost-quality-tradeoff-model-routing.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/building-coding-agents/22-cross-project-learning-reusable-intelligence.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/building-coding-agents/23-evolution-across-project-scale.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/building-coding-agents/25-designing-for-non-technical-users-vibe-coders.md`
|
||||
|
||||
## Pi product docs
|
||||
|
||||
Use when:
|
||||
|
||||
- the user asks about pi itself, its SDK, extensions, themes, skills, packages, TUI, prompt templates, keybindings, or custom providers
|
||||
|
||||
Read first:
|
||||
|
||||
- `/Users/lexchristopherson/.nvm/versions/node/v22.20.0/lib/node_modules/@mariozechner/pi-coding-agent/README.md`
|
||||
|
||||
Read together when relevant:
|
||||
|
||||
- `/Users/lexchristopherson/.nvm/versions/node/v22.20.0/lib/node_modules/@mariozechner/pi-coding-agent/docs/extensions.md`
|
||||
- `/Users/lexchristopherson/.nvm/versions/node/v22.20.0/lib/node_modules/@mariozechner/pi-coding-agent/docs/themes.md`
|
||||
- `/Users/lexchristopherson/.nvm/versions/node/v22.20.0/lib/node_modules/@mariozechner/pi-coding-agent/docs/skills.md`
|
||||
- `/Users/lexchristopherson/.nvm/versions/node/v22.20.0/lib/node_modules/@mariozechner/pi-coding-agent/docs/prompt-templates.md`
|
||||
- `/Users/lexchristopherson/.nvm/versions/node/v22.20.0/lib/node_modules/@mariozechner/pi-coding-agent/docs/tui.md`
|
||||
- `/Users/lexchristopherson/.nvm/versions/node/v22.20.0/lib/node_modules/@mariozechner/pi-coding-agent/docs/keybindings.md`
|
||||
- `/Users/lexchristopherson/.nvm/versions/node/v22.20.0/lib/node_modules/@mariozechner/pi-coding-agent/docs/sdk.md`
|
||||
- `/Users/lexchristopherson/.nvm/versions/node/v22.20.0/lib/node_modules/@mariozechner/pi-coding-agent/docs/custom-provider.md`
|
||||
- `/Users/lexchristopherson/.nvm/versions/node/v22.20.0/lib/node_modules/@mariozechner/pi-coding-agent/docs/models.md`
|
||||
- `/Users/lexchristopherson/.nvm/versions/node/v22.20.0/lib/node_modules/@mariozechner/pi-coding-agent/docs/packages.md`
|
||||
|
||||
Follow-up if needed:
|
||||
|
||||
- `/Users/lexchristopherson/.nvm/versions/node/v22.20.0/lib/node_modules/@mariozechner/pi-coding-agent/examples`
|
||||
34
docs/building-coding-agents/01-work-decomposition.md
Normal file
|
|
@ -0,0 +1,34 @@
|
|||
# Work Decomposition
|
||||
|
||||
**The universal consensus:** Elite engineers never jump from vision to code. They use **progressive decomposition** through layers of abstraction.
|
||||
|
||||
### The Compression Ladder
|
||||
|
||||
```
Vision → Capabilities → Systems/Architecture → Features → Tasks
```
|
||||
|
||||
Each layer answers a different question:
|
||||
|
||||
| Layer | Question |
|-------|----------|
| Vision | What world are we creating? |
| Capabilities | What must the product be able to do? |
| Systems | What infrastructure enables those capabilities? |
| Features | What does the user interact with? |
| Tasks | What exact code gets written? |
|
||||
|
||||
### Core Principles (All 4 Models Agree)
|
||||
|
||||
- **Start with outcomes, not features.** Define "done" before anything else. Not "build a login page" but "a user can securely access their dashboard using OAuth."
|
||||
- **Vertical slices over horizontal layers.** Build thin end-to-end slices (UI → API → DB) rather than completing all backend before all frontend. Each slice is independently demoable and testable.
|
||||
- **The 1-Day Rule.** If a task takes longer than a day, it's not a task — it's a milestone. Break it down further until each item is a single, clear action completable in one sitting.
|
||||
- **Risk-first exploration.** Identify the hardest/most uncertain parts first. Spike on unknowns before committing to architecture. "Kill the biggest risks while they are still cheap to fix."
|
||||
- **Interface-first design.** Define contracts between components before building them. This enables parallel work and creates natural verification checkpoints.
|
||||
- **MECE decomposition.** Tasks should be Mutually Exclusive (no overlap) and Collectively Exhaustive (complete the vision when all are done).
|
||||
|
||||
### The Recursive Heuristic
|
||||
|
||||
> If something feels fuzzy, break it down one level deeper. Keep decomposing until it is obvious how to start each task.
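
The recursive heuristic and the 1-Day Rule can be checked mechanically once work items carry rough estimates. The sketch below is illustrative only: `WorkItem`, `findOversizedLeaves`, and the 8-hour reading of "a day" are assumptions, not part of pi or of the source docs.

```typescript
// Hypothetical shape for one node on the compression ladder.
interface WorkItem {
  title: string;
  estimatedHours: number; // rough estimate, only meaningful for leaf items
  children: WorkItem[];
}

// Assumption: "a day" means a single 8-hour sitting.
const ONE_DAY_HOURS = 8;

// Walk the tree and collect leaf items that still break the 1-Day Rule,
// i.e. items that need to be decomposed one level deeper.
function findOversizedLeaves(item: WorkItem): string[] {
  if (item.children.length === 0) {
    return item.estimatedHours > ONE_DAY_HOURS ? [item.title] : [];
  }
  let oversized: string[] = [];
  for (const child of item.children) {
    oversized = oversized.concat(findOversizedLeaves(child));
  }
  return oversized;
}

const loginFeature: WorkItem = {
  title: "OAuth login",
  estimatedHours: 0,
  children: [
    { title: "Define token contract", estimatedHours: 2, children: [] },
    { title: "Build the whole auth backend", estimatedHours: 24, children: [] },
  ],
};

// Items listed here are milestones, not tasks, and should be split further.
const needsSplitting = findOversizedLeaves(loginFeature);
```

Running the check after every planning pass gives a concrete stopping condition: decomposition is finished when the returned list is empty.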
|
||||
|
||||
---
|
||||
|
|
@ -0,0 +1,38 @@
|
|||
# What to Keep & Discard from Human Engineering
|
||||
|
||||
### KEEP & Amplify
|
||||
|
||||
| Practice | Why It Matters More for AI |
|----------|---------------------------|
| **Clear product intent & experience specs** | AI needs direction, not instructions. "How should it feel?" drives architecture. |
| **Acceptance criteria as the backbone** | Becomes TDD at its logical extreme: human writes tests in natural language, AI makes them true. |
| **Vertical slicing** | Even more critical: prevents AI from going deep down a wrong path fast and confidently. |
| **Interface-first approach** | Creates natural checkpoints, makes systems modular and replaceable. |
| **Explicit constraints & non-functional requirements** | Narrows the search space. Without them AI may produce technically correct but strategically wrong systems. |
| **Architecture Decision Records (ADRs)** | Prevents AI from "accidentally" undoing decisions made weeks ago. |
| **Feedback loops** | Build → test → observe → refine. Accelerated to machine speed. |
|
||||
|
||||
### DISCARD
|
||||
|
||||
| Practice | Why It's Dead Weight |
|----------|---------------------|
| **Estimation rituals** (story points, velocity, sprint planning) | AI doesn't get tired, doesn't context-switch, works at machine speed. |
| **Communication overhead** (standups, design reviews, PR reviews) | Only one communication channel matters: human ↔ agent. |
| **Manual code review for style** | Automated linting + formatting handles this deterministically. |
| **Step-by-step instructions** | Provide outcomes, not "how." |
| **Heavy upfront documentation** | AI can read the entire repo instantly. Document *intent* and *why*, not *how*. |
| **Gradual skill-building** | No ramp-up, no knowledge silos, no "only Sarah knows how that module works." |
| **Defensive architecture against human error** | Tests still needed, but for a different reason: verifying AI's interpretation of intent. |
|
||||
|
||||
### The New Human Role
|
||||
|
||||
| Responsibility | Description |
|---------------|-------------|
| **Defining "good"** | Vision, personas, experience specs, success metrics |
| **Taste & judgment** | Aesthetics, emotional experience, brand voice |
| **Strategic decisions** | Which problems matter, product pivots |
| **Gut checks at milestones** | Does this *feel* right? |
|
||||
|
||||
> **The core shift:** Human = intention + taste. AI = exploration + execution.
|
||||
|
||||
---
|
||||
|
|
@ -0,0 +1,46 @@
|
|||
# State Machine & Context Management
|
||||
|
||||
### The Fundamental Tension
|
||||
|
||||
The agent needs to understand the whole project to make good decisions, but any single context window degrades with too much information — not just from token limits but from **attention dilution**.
|
||||
|
||||
### Layered Memory Architecture (Universal Agreement)
|
||||
|
||||
```
Project Manifest (always loaded, <1000 tokens)
    ↓
Task Context (per-task, relevant files + specs)
    ↓
Retrieval Layer (pull-based, on-demand)
    ↓
Ground Truth (filesystem, git, actual code)
```
|
||||
|
||||
| Layer | Content | Access Pattern | Token Impact |
|-------|---------|---------------|--------------|
| **Working Context** (L1) | Current task + 3–5 relevant files | Dynamically assembled per LLM call | 8k–25k tokens |
| **Session/Episodic** (L2) | Compressed history + recent decisions | Auto-summarized at transitions | Summary only |
| **Project Semantic** (L3) | Full codebase summaries, dependency graph, ADRs | Vector + Graph retrieval | Pointers only |
| **Ground Truth** (L4) | Actual files, git history, test results | Agent reads via tools | Zero in prompt |
|
||||
|
||||
### The State Machine
|
||||
|
||||
The agent should always be in one explicit state:
|
||||
|
||||
```
PLAN → IMPLEMENT → TEST → DEBUG → VERIFY → DOCUMENT
```
|
||||
|
||||
**Critical transitions that matter:**
|
||||
- **Task completion:** Defined by automated tests passing + acceptance criteria met
|
||||
- **Stuck detection:** Triggered by repeated failed attempts or missing information
|
||||
- **Plan revision:** Triggered when completed tasks reveal wrong assumptions
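
One way to keep the agent in exactly one explicit state is a typed transition table. The sketch below is a minimal illustration under stated assumptions: the allowed-transition map, the `advancePhase` and `phaseIsStuck` helpers, and the threshold of three failed attempts are all hypothetical choices, not taken from pi.

```typescript
type AgentPhase = "PLAN" | "IMPLEMENT" | "TEST" | "DEBUG" | "VERIFY" | "DOCUMENT";

// Assumed transition table: the linear flow plus the loops implied by
// debugging (TEST <-> DEBUG) and plan revision (DEBUG -> PLAN).
const ALLOWED_TRANSITIONS: Record<AgentPhase, AgentPhase[]> = {
  PLAN: ["IMPLEMENT"],
  IMPLEMENT: ["TEST"],
  TEST: ["DEBUG", "VERIFY"],
  DEBUG: ["TEST", "PLAN"],
  VERIFY: ["DOCUMENT", "DEBUG"],
  DOCUMENT: ["PLAN"],
};

interface AgentState {
  phase: AgentPhase;
  failedAttempts: number; // how many times this task has entered DEBUG
}

// Assumption: three failed attempts counts as "repeated failure".
const MAX_FAILED_ATTEMPTS = 3;

// Move to the next phase, rejecting any transition not in the table.
function advancePhase(state: AgentState, next: AgentPhase): AgentState {
  if (ALLOWED_TRANSITIONS[state.phase].indexOf(next) === -1) {
    throw new Error("Illegal transition " + state.phase + " -> " + next);
  }
  let failedAttempts = state.failedAttempts;
  if (next === "DEBUG") failedAttempts += 1; // each entry into DEBUG is a failed attempt
  if (next === "PLAN") failedAttempts = 0;   // a new or revised plan starts fresh
  return { phase: next, failedAttempts };
}

// Stuck detection: triggered by repeated failed attempts.
function phaseIsStuck(state: AgentState): boolean {
  return state.failedAttempts >= MAX_FAILED_ATTEMPTS;
}

let current: AgentState = { phase: "PLAN", failedAttempts: 0 };
const path: AgentPhase[] = ["IMPLEMENT", "TEST", "DEBUG", "TEST", "DEBUG", "TEST", "DEBUG"];
for (const next of path) {
  current = advancePhase(current, next);
}
```

Because the table is plain data, the same structure can be serialized as the structured state that the principles below recommend.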
|
||||
|
||||
### Key Principles
|
||||
|
||||
- **Summarize aggressively between phases.** Don't carry full implementation context forward — carry compressed summaries: what was built, what decisions were made, what interfaces were created.
|
||||
- **Pull-based, not push-based context.** Don't preload everything the agent might need. Let it ask for what it discovers it needs.
|
||||
- **Use structured state for reliability.** Natural language summaries drift. Use JSON/typed configs for anything the system needs to track. Reserve natural language for reasoning.
|
||||
- **The filesystem is external memory.** The codebase itself is the most detailed representation of current state. Hold *understanding* about code in context, not the code itself.
|
||||
|
||||
---
|
||||
|
|
@ -0,0 +1,56 @@
|
|||
# Optimal Storage for Project Context
|
||||
|
||||
### The Universal Answer: Plain Text Files in the Repo + Structured State Store
|
||||
|
||||
All four models converge on a hybrid approach. The key insight: **don't over-engineer with databases and vector stores, but don't under-engineer with a single massive file either.**
|
||||
|
||||
### The Optimal Stack
|
||||
|
||||
| Storage | What Lives Here | Why |
|---------|----------------|-----|
| **Project Manifest** (`PROJECT.md`) | Vision, principles, architecture overview, component status | Always loaded, <1000 tokens, single source of truth |
| **Structured State** (JSON/SQLite/Postgres) | Task status, phase, dependencies, verification results | Machine-parseable, drives state machine transitions |
| **Context Directory** (`.context/` or `.ai/`) | Architecture docs, task specs, decision records | Organized for retrieval, not human browsing |
| **Git Repository** | Actual source code, test results | Ultimate ground truth, never duplicated |
| **Knowledge Graph** (optional at scale) | File → function → dependency relationships | Enables "what breaks if I change this?" queries |
|
||||
|
||||
### Why Plain Files Win
|
||||
|
||||
- AI reads files directly — no query language, no ORM, no API calls
|
||||
- Version control comes free via git
|
||||
- Human can read and edit with any text editor
|
||||
- Survives tooling changes — not locked into any system
|
||||
|
||||
### Why NOT Vector Stores (as primary)
|
||||
|
||||
- Project context is **structured** — you know where things are
|
||||
- Vector stores return **approximately relevant** results — approximate is often wrong in codebases
|
||||
- They can't represent state, relationships, or task progress
|
||||
|
||||
### The Hybrid Format
|
||||
|
||||
Individual files use **YAML frontmatter + Markdown body**:
|
||||
```yaml
---
status: in_progress
dependencies: [AUTH-01, DB-02]
acceptance_criteria:
  - User can reset password via email
  - Token expires after 30 minutes
---

## Task: Password Reset Flow
[Rich narrative description and context here]
```
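
A deterministic loader can separate the machine-readable frontmatter from the narrative body before anything reaches the model. The sketch below is a simplified illustration: `splitFrontmatter` and `readScalar` are hypothetical helpers, and only flat `key: value` lines are handled. A real implementation would likely hand the frontmatter string to a full YAML parser.

```typescript
interface ContextFileParts {
  frontmatter: string; // raw text between the two `---` delimiters
  body: string;        // Markdown narrative after the closing delimiter
}

// Split a context file into frontmatter and body.
// Files without a well-formed frontmatter block are returned as body only.
function splitFrontmatter(text: string): ContextFileParts {
  const lines = text.split("\n");
  if ((lines[0] || "").trim() !== "---") {
    return { frontmatter: "", body: text };
  }
  for (let i = 1; i < lines.length; i++) {
    if ((lines[i] || "").trim() === "---") {
      return {
        frontmatter: lines.slice(1, i).join("\n"),
        body: lines.slice(i + 1).join("\n").trim(),
      };
    }
  }
  return { frontmatter: "", body: text }; // no closing delimiter found
}

// Read a top-level scalar such as `status: in_progress`.
// Nested keys and lists are out of scope for this sketch.
function readScalar(frontmatter: string, key: string): string | undefined {
  const prefix = key + ":";
  for (const line of frontmatter.split("\n")) {
    if (line.indexOf(prefix) === 0) {
      return line.slice(prefix.length).trim();
    }
  }
  return undefined;
}

const sampleTaskFile = [
  "---",
  "status: in_progress",
  "dependencies: [AUTH-01, DB-02]",
  "---",
  "",
  "## Task: Password Reset Flow",
].join("\n");

const taskParts = splitFrontmatter(sampleTaskFile);
const taskStatus = readScalar(taskParts.frontmatter, "status");
```

Keeping the split in code means the state machine can read `status` without spending any model tokens on parsing.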
|
||||
|
||||
### Size Discipline
|
||||
|
||||
| File | Target Size |
|------|------------|
| Project Manifest | <1,000 tokens |
| Individual task files (completed) | <500 tokens |
| Architecture doc | <2,000 tokens |
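
These targets are easy to enforce with a pre-commit or pre-load check. The sketch below assumes a crude estimate of roughly four characters per token, which is only a rule of thumb. A real check would use the tokenizer of the model in question. The names `ContextFileKind`, `roughTokenCount`, and `fitsBudget` are hypothetical.

```typescript
type ContextFileKind = "manifest" | "completedTask" | "architecture";

// Ceilings taken from the size table above; files must stay strictly below them.
const TOKEN_CEILING: Record<ContextFileKind, number> = {
  manifest: 1000,
  completedTask: 500,
  architecture: 2000,
};

// Assumption: about 4 characters per token. This is a heuristic, not a tokenizer.
function roughTokenCount(text: string): number {
  return Math.ceil(text.length / 4);
}

// True when the file is estimated to be under its ceiling.
function fitsBudget(kind: ContextFileKind, text: string): boolean {
  return roughTokenCount(text) < TOKEN_CEILING[kind];
}

const shortManifest = new Array(3997).join("a"); // 3996 characters
const longManifest = new Array(4002).join("a");  // 4001 characters

const shortFits = fitsBudget("manifest", shortManifest);
const longFits = fitsBudget("manifest", longManifest);
```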
|
||||
|
||||
> The context system isn't just storage — it's a **compression engine**. Its job is to maintain maximum useful understanding in minimum token footprint.
|
||||
|
||||
---
|
||||
62
docs/building-coding-agents/05-parallelization-strategy.md
Normal file
|
|
@ -0,0 +1,62 @@
|
|||
# Parallelization Strategy
|
||||
|
||||
### Core Principle
|
||||
|
||||
> Parallelize across boundaries, serialize within them.
|
||||
|
||||
The quality of parallelization is directly determined by the quality of interface definitions.
|
||||
|
||||
### The Diamond Pattern
|
||||
|
||||
```
Planning (narrow, serial)
    ↓
Fan Out (parallel execution)
    ↓
Convergence (integration verification)
    ↓
Fan Out (next parallel set)
```
|
||||
|
||||
### Phase-by-Phase Strategy
|
||||
|
||||
#### Planning: Mostly Serial, with Parallel Spikes
|
||||
- High-level decomposition must be serial (one coherent act of reasoning)
|
||||
- **Parallelize uncertainty resolution:** Multiple spikes investigating different risks simultaneously
|
||||
- Output: A dependency graph that explicitly identifies what can be parallelized
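
Given such a dependency graph, the parallelizable sets fall out of a layered topological sort. The sketch below is one possible shape, with hypothetical names (`DependencyGraph`, `planWaves`) and made-up task IDs. Each returned wave contains tasks whose dependencies were all completed in earlier waves, so tasks within a wave can fan out in parallel.

```typescript
// Maps each task ID to the IDs it depends on.
type DependencyGraph = Record<string, string[]>;

// Group tasks into waves: wave N only depends on waves 0..N-1.
function planWaves(graph: DependencyGraph): string[][] {
  const done: Record<string, boolean> = {};
  let remaining = Object.keys(graph);
  const waves: string[][] = [];

  while (remaining.length > 0) {
    // A task is ready when every dependency finished in an earlier wave.
    const ready = remaining.filter((id) =>
      (graph[id] || []).every((dep) => done[dep] === true)
    );
    if (ready.length === 0) {
      throw new Error("Cycle or unknown dependency among: " + remaining.join(", "));
    }
    ready.forEach((id) => { done[id] = true; });
    remaining = remaining.filter((id) => done[id] !== true);
    waves.push(ready);
  }
  return waves;
}

const buildWaves = planWaves({
  "api-contract": [],
  "frontend": ["api-contract"],
  "backend": ["api-contract"],
  "integration": ["frontend", "backend"],
});
```

For this input the plan is the contract first, then frontend and backend together, then integration, which mirrors the serial, fan-out, convergence shape of the Diamond Pattern.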
|
||||
|
||||
#### Execution: Massive Parallelization with Right Topology
|
||||
|
||||
| Work Type | Strategy |
|-----------|----------|
| **Independent leaf tasks** | Embarrassingly parallel: one agent per module |
| **Dependent chains** | Serial within chain, but chains run in parallel |
| **Convergence points** | Strictly serial: integration verification |
|
||||
|
||||
**Critical insight:** The frontend doesn't need the real API — it needs the API *contract*. Once contracts exist, both sides build in parallel.
|
||||
|
||||
#### Testing: The Most Interesting Story
|
||||
- **Unit tests:** Same agent, same context, atomic with code
|
||||
- **Cross-task tests:** All parallel by definition
|
||||
- **Integration tests:** Parallel across different boundaries
|
||||
- **E2E tests:** Serial (exercises whole system)
|
||||
|
||||
#### Verification: Deliberate Redundancy
|
||||
- **Adversarial verification:** Separate reviewer agent with fresh context evaluates against spec
|
||||
- **Red-team parallelism:** Agent tries to break the implementation
|
||||
|
||||
### Coordination Rules
|
||||
|
||||
- Agents communicate through the **filesystem**, never directly
|
||||
- Each agent works on a **branch** — merge on success, discard on failure
|
||||
- One agent per file at a time (file locking)
|
||||
- Optimal concurrency: **3–8 simultaneous agents** for most projects
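
The "one agent per file at a time" rule can be enforced by a small lock table owned by the orchestrator. The sketch below is an in-memory illustration with a hypothetical `FileLockTable` class. A multi-process setup would need locks that live on disk or in a shared store. Claims are all-or-nothing, so an agent never holds only part of the file set it needs.

```typescript
class FileLockTable {
  // path -> agent ID currently holding it (prototype-free object used as a map)
  private owners: Record<string, string> = Object.create(null);

  // Claim every path for one agent, or claim nothing if any path is taken.
  tryAcquire(agentId: string, paths: string[]): boolean {
    const blocked = paths.some(
      (p) => this.owners[p] !== undefined && this.owners[p] !== agentId
    );
    if (blocked) return false;
    paths.forEach((p) => { this.owners[p] = agentId; });
    return true;
  }

  // Release everything an agent holds, e.g. after its branch is merged or discarded.
  release(agentId: string): void {
    Object.keys(this.owners).forEach((p) => {
      if (this.owners[p] === agentId) delete this.owners[p];
    });
  }
}

const locks = new FileLockTable();
const firstClaim = locks.tryAcquire("agent-a", ["src/auth.ts", "src/db.ts"]);
const conflictingClaim = locks.tryAcquire("agent-b", ["src/db.ts", "src/ui.ts"]);
const independentClaim = locks.tryAcquire("agent-c", ["src/ui.ts"]);
locks.release("agent-a");
const retryClaim = locks.tryAcquire("agent-b", ["src/db.ts"]);
```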
|
||||
|
||||
### Anti-Patterns
|
||||
|
||||
- ❌ Don't parallelize tasks that modify the same files
|
||||
- ❌ Don't parallelize interacting decisions
|
||||
- ❌ Don't skip convergence/integration verification
|
||||
- ❌ Don't over-parallelize (coordination tax eats gains above ~8 agents)
|
||||
|
||||
---
|
||||
|
|
@ -0,0 +1,44 @@
|
|||
# Maximizing Agent Autonomy & Superpowers
|
||||
|
||||
### The Foundational Insight
|
||||
|
||||
> Autonomy comes from **self-correction**, not from getting it right the first time. The power isn't in the initial generation — it's in iteration speed and feedback signal quality.
|
||||
|
||||
### The Essential Tool Arsenal
|
||||
|
||||
| Category | Tools | Why |
|----------|-------|-----|
| **Execution Environment** | Terminal, filesystem, git, package manager | Closes the write → run → debug → verify loop |
| **Verification** | Test runner, linter, type checker, security scanner | Ground truth over self-assessment |
| **Observation** | Logs, browser/renderer, performance profiler | Sees what users would see |
| **Exploration** | Code search, documentation lookup, web research | Self-directed learning |
| **Recovery** | Git revert, branch management, checkpoints | Safety net that enables boldness |
|
||||
|
||||
### Self-Verification Architecture
|
||||
|
||||
Every task completion should self-evaluate against a checklist:
|
||||
1. Does the code compile?
|
||||
2. Do all existing tests still pass?
|
||||
3. Do new tests pass?
|
||||
4. Does the application actually start?
|
||||
5. Can I exercise the feature and see expected behavior?
|
||||
6. Does this match acceptance criteria point by point?
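
The checklist can be run as an ordered list of checks that stops at the first failure, so the agent knows exactly where to resume. This is a sketch with hypothetical names (`VerificationCheck`, `runChecklist`). The stubbed `run` functions stand in for real compiler, test runner, and application probes.

```typescript
interface VerificationCheck {
  name: string;
  run: () => boolean; // true when the check passes
}

interface VerificationReport {
  passed: boolean;
  failedAt: string | null; // name of the first failing check, if any
  completed: string[];     // checks that passed before the failure
}

// Run checks in order and stop at the first failure.
function runChecklist(checks: VerificationCheck[]): VerificationReport {
  const completed: string[] = [];
  for (const check of checks) {
    if (!check.run()) {
      return { passed: false, failedAt: check.name, completed };
    }
    completed.push(check.name);
  }
  return { passed: true, failedAt: null, completed };
}

// Stubbed example: the third check fails, so the fourth never runs.
let startProbeRan = false;
const checklistReport = runChecklist([
  { name: "code compiles", run: () => true },
  { name: "existing tests pass", run: () => true },
  { name: "new tests pass", run: () => false },
  { name: "application starts", run: () => { startProbeRan = true; return true; } },
]);
```

Ordering the checks from cheapest to most expensive keeps the feedback loop fast, since a compile failure is reported before any browser or end-to-end probe is started.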
|
||||
|
||||
### Debugging Superpowers
|
||||
|
||||
- **Temporary instrumentation:** Add logging, remove after diagnosis
|
||||
- **Bisection:** Walk back through changes to find where regression was introduced
|
||||
- **Minimal reproduction:** Strip away everything except exact conditions that trigger failure
|
||||
- **Exploratory tests:** Quick throwaway scripts to test hypotheses
|
||||
|
||||
### Meta-Cognitive Layer
|
||||
|
||||
- **Scratchpad:** External reasoning space to track hypotheses, attempts, and outcomes
|
||||
- **Stuck detection:** After N failed attempts, trigger step-back with fresh context and explicitly different approach
|
||||
- **Structured escalation:** "Here's what I'm trying, here's what I've tried, here's what I think the issue is, here's what I need from you"
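
The scratchpad, stuck detection, and structured escalation fit together as one small data flow. The sketch below uses hypothetical names throughout (`AttemptRecord`, `shouldStepBack`, `buildEscalation`) and assumes N = 3. The fields of `EscalationReport` follow the four parts of the escalation message quoted above.

```typescript
interface AttemptRecord {
  approach: string;
  outcome: string;
}

interface EscalationReport {
  trying: string;          // "here's what I'm trying"
  tried: string[];         // "here's what I've tried"
  suspectedIssue: string;  // "here's what I think the issue is"
  needFromHuman: string;   // "here's what I need from you"
}

// Assumption: N = 3 failed attempts triggers a step-back.
const STEP_BACK_AFTER = 3;

// Append a failed attempt to the scratchpad without mutating the old log.
function recordAttempt(log: AttemptRecord[], approach: string, outcome: string): AttemptRecord[] {
  return log.concat([{ approach, outcome }]);
}

function shouldStepBack(log: AttemptRecord[]): boolean {
  return log.length >= STEP_BACK_AFTER;
}

// Guard for the "explicitly different approach" requirement.
function isNewApproach(log: AttemptRecord[], approach: string): boolean {
  return log.every((a) => a.approach !== approach);
}

function buildEscalation(
  goal: string,
  log: AttemptRecord[],
  suspectedIssue: string,
  needFromHuman: string
): EscalationReport {
  return {
    trying: goal,
    tried: log.map((a) => a.approach + " -> " + a.outcome),
    suspectedIssue,
    needFromHuman,
  };
}

let scratchpad: AttemptRecord[] = [];
scratchpad = recordAttempt(scratchpad, "bump dependency", "same type error");
scratchpad = recordAttempt(scratchpad, "clear build cache", "same type error");
scratchpad = recordAttempt(scratchpad, "pin compiler version", "same type error");

const escalation = shouldStepBack(scratchpad)
  ? buildEscalation(
      "make the build pass",
      scratchpad,
      "type definitions out of sync with runtime package",
      "confirm which package version is the intended one"
    )
  : null;
```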
|
||||
|
||||
### The Philosophy
|
||||
|
||||
> You're not trying to build an agent that doesn't make mistakes. You're building one that **catches and fixes its own mistakes faster than a human would notice them**. Not intelligence — **closed-loop execution with rich feedback**.
|
||||
|
||||
---
|
||||
|
|
@ -0,0 +1,82 @@
|
|||
# System Prompt & LLM vs Deterministic Split
|
||||
|
||||
### The Core Separation Principle
|
||||
|
||||
> If you could write an if-else statement that handles it correctly every time, **it should not be in the LLM's context**. Every token the model spends reasoning about something deterministic is wasted and introduces hallucination risk.
|
||||
|
||||
### What the LLM Owns
|
||||
|
||||
| Capability | Why LLM |
|-----------|---------|
| Understanding intent | Interpretation, judgment |
| Architectural reasoning | Weighing tradeoffs |
| Code generation | Creative, context-dependent |
| Debugging & diagnosis | Abductive reasoning, hypothesis formation |
| Self-critique & quality assessment | Judgment calls |
|
||||
|
||||
### What TypeScript/Deterministic Code Owns
|
||||
|
||||
| Capability | Why Deterministic |
|-----------|-------------------|
| State machine transitions | Typed state object, no ambiguity |
| Context assembly | Predict + pre-load what agent needs |
| File operations | Validate paths, handle encoding, manage permissions |
| Test execution & result parsing | Structured results, not raw terminal output |
| Build & environment management | Install deps, start servers, manage ports |
| Code formatting | Run prettier automatically, never waste LLM tokens |
| Task scheduling & dependency resolution | Graph traversal, instant vs 5-second LLM call |
| Summarization triggers | Mechanical workflow, LLM provides content |
|
||||
|
||||
### Modular System Prompt Architecture
|
||||
|
||||
```
Base Layer (always present, ~500 tokens)
  → Identity, core behavioral rules, general approach

Phase-Specific Layer (swapped based on state)
  → Planning mode: decomposition, interfaces, risks
  → Execution mode: implementation, testing, iteration
  → Debugging mode: diagnosis, hypothesis testing, isolation

Task-Specific Layer (assembled fresh per task)
  → Current spec, acceptance criteria, relevant contracts, prior attempts

Tools Layer
  → Available tool definitions and parameters
```
|
||||
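The layering above could be assembled roughly as follows. This is a sketch under stated assumptions: the layer texts are placeholders, and `buildSystemPrompt`, `AgentMode`, and `TaskLayerInput` are hypothetical names rather than an existing API.

```typescript
type AgentMode = "planning" | "execution" | "debugging";

// Placeholder layer text; real prompts would be longer and likely loaded from files.
const BASE_LAYER = "You are a coding agent. Follow the project conventions.";

const PHASE_LAYERS: Record<AgentMode, string> = {
  planning: "Decompose the work, define interfaces, list risks.",
  execution: "Implement, run tests, iterate until they pass.",
  debugging: "Diagnose, form hypotheses, isolate the fault.",
};

interface TaskLayerInput {
  spec: string;
  acceptanceCriteria: string[];
  priorAttempts: string[];
}

function buildSystemPrompt(mode: AgentMode, task: TaskLayerInput, toolNames: string[]): string {
  const taskLayer = [
    `SPEC: ${task.spec}`,
    "ACCEPTANCE CRITERIA:",
    ...task.acceptanceCriteria.map((c) => `- ${c}`),
    // Prior attempts are included only when there are any.
    ...(task.priorAttempts.length > 0
      ? ["PRIOR ATTEMPTS:", ...task.priorAttempts.map((a) => `- ${a}`)]
      : []),
  ].join("\n");
  const toolsLayer = `TOOLS: ${toolNames.join(", ")}`;
  // Order mirrors the diagram: base, phase, task, tools.
  return [BASE_LAYER, PHASE_LAYERS[mode], taskLayer, toolsLayer].join("\n\n");
}
```

Only the phase layer changes when the state machine moves, so swapping modes is a lookup rather than a prompt rewrite.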
|
||||
### Tool Design Philosophy
|
||||
|
||||
> Each tool should do one thing, do it completely, and return structured results the LLM can immediately act on.
|
||||
|
||||
**Bad:** LLM calls `readFile` → `parseJSON` → `runCommand` (3 calls, 3 failure points)
|
||||
**Good:** LLM calls `runTests(filter)` → gets structured pass/fail with locations (1 call, clean result)
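A sketch of what a structured `runTests` result might look like. The `RawTestCase` shape is an assumption about what a test-runner adapter could emit; it does not describe any particular runner's output format.

```typescript
// Hypothetical raw record, as a test-runner adapter might emit it.
interface RawTestCase {
  name: string;
  file: string;
  line: number;
  ok: boolean;
  message?: string;
}

interface TestFailureDetail {
  name: string;
  location: string; // "file:line"
  message: string;
}

interface TestRunSummary {
  passed: number;
  failed: number;
  failures: TestFailureDetail[];
}

// Collapses raw cases into counts plus per-failure details the model can act on.
function summarizeTestRun(cases: RawTestCase[]): TestRunSummary {
  const failures = cases
    .filter((c) => !c.ok)
    .map((c) => ({
      name: c.name,
      location: `${c.file}:${c.line}`,
      message: c.message ?? "no message reported",
    }));
  return { passed: cases.length - failures.length, failed: failures.length, failures };
}
```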
|
||||
|
||||
### Essential Tools
|
||||
|
||||
| Tool | Returns |
|------|---------|
| `runTests` | Structured results: pass count, fail count, per-failure details |
| `readFiles` | Batched file contents (array of paths, not one at a time) |
| `writeFile` | Auto-formats before writing |
| `searchCodebase` | Grep-like results with file paths and line numbers |
| `getProjectState` | Manifest + current task spec + related task statuses |
| `updateTaskStatus` | Handles downstream state updates automatically |
| `buildProject` | Structured errors with file paths and line numbers |
| `browserCheck` | Screenshot or structured description of rendered output |
| `commitChanges` | Enforces conventions, runs pre-commit hooks |
| `revertToCheckpoint` | Rolls back to last known good state |
|
||||
|
||||
### Prompt Patterns That Maximize Agency
|
||||
|
||||
1. **Tell it what it CAN do, not what it can't.** "Full authority as long as acceptance criteria and tests pass."
|
||||
2. **Explicit permission to iterate.** "First attempt doesn't need to be perfect. Write, run, observe, improve."
|
||||
3. **Clear exit conditions.** Concrete, measurable, unambiguous definition of "done."
|
||||
4. **Built-in scratchpad.** "Write reasoning in thinking blocks. Track attempts and outcomes."
|
||||
5. **Recovery protocol.** "After 3 failed approaches, produce structured escalation."
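The recovery protocol in item 5 is a natural candidate for orchestrator code rather than prompt text. A minimal sketch, assuming hypothetical `AttemptRecord` and `EscalationNote` shapes and the threshold of 3 quoted above:

```typescript
interface AttemptRecord {
  approach: string;
  outcome: string;
}

interface EscalationNote {
  taskId: string;
  attempts: AttemptRecord[];
  question: string;
}

const MAX_FAILED_APPROACHES = 3; // threshold quoted in the recovery protocol

// Appends the latest failure and produces a structured escalation once the limit is hit.
function recordFailure(
  taskId: string,
  history: AttemptRecord[],
  latest: AttemptRecord,
): { history: AttemptRecord[]; escalation: EscalationNote | null } {
  const updated = [...history, latest];
  if (updated.length < MAX_FAILED_APPROACHES) {
    return { history: updated, escalation: null };
  }
  return {
    history: updated,
    escalation: {
      taskId,
      attempts: updated,
      question: `Tried ${updated.length} approaches without success. Which constraint should change?`,
    },
  };
}
```

Counting attempts deterministically means the escalation fires on schedule even if the model believes the next try will work.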
|
||||
|
||||
### The Meta-Principle
|
||||
|
||||
> Your TypeScript orchestrator is the deterministic skeleton — workflow, state, context, tools, coordination. The LLM is the reasoning muscle — understanding, creativity, judgment, problem-solving. **Neither should do the other's job.** When you get this right, the LLM becomes dramatically more capable because it's only doing what it's good at, with exactly the context it needs.
|
||||
|
||||
---
|
||||
60
docs/building-coding-agents/08-speed-optimization.md
Normal file
|
|
@ -0,0 +1,60 @@
|
|||
# Speed Optimization
|
||||
|
||||
### The #1 Speed Principle
|
||||
|
||||
> The fastest possible operation is the one you don't perform. Before optimizing any step, ask: does this step need to exist at all?
|
||||
|
||||
### Speed Levers (Ranked by Impact)
|
||||
|
||||
#### 1. Minimize LLM Calls
|
||||
- **Batch intent into single calls.** Don't generate code, then tests, then docs separately. One call: "implement, test, and document." TypeScript splits the output.
|
||||
- **Deterministic fast paths.** Missing import? Syntax error? Fix without an LLM call if the fix is mechanical.
|
||||
- Audit call chains ruthlessly — most systems have 50%+ unnecessary sequential calls.
|
||||
|
||||
#### 2. Make Feedback Loops Instantaneous
|
||||
- Use test watch mode (no cold start)
|
||||
- Run only relevant test subsets (track which files affect which tests)
|
||||
- Incremental builds (hot module reloading)
|
||||
- Async, non-blocking file writes
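Running only relevant test subsets requires some record of which files affect which tests. One way to sketch it, assuming a hypothetical `TestImpactIndex` built from coverage data on an earlier full run:

```typescript
// Hypothetical index from source file to the tests that exercise it.
type TestImpactIndex = Record<string, string[] | undefined>;

// Returns the de-duplicated, sorted set of tests touched by the changed files.
function selectAffectedTests(changedFiles: string[], impact: TestImpactIndex): string[] {
  const selected: Record<string, boolean> = {};
  for (const file of changedFiles) {
    // Own-property check avoids picking up inherited keys such as "constructor".
    const known = Object.prototype.hasOwnProperty.call(impact, file);
    const tests = known ? (impact[file] ?? []) : [];
    for (const test of tests) {
      selected[test] = true;
    }
  }
  return Object.keys(selected).sort();
}
```

A file missing from the index contributes no tests in this sketch; a more cautious orchestrator might fall back to the full suite in that case.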
|
||||
|
||||
#### 3. Precompute Context
|
||||
- Predict what the agent will need based on task definition
|
||||
- Pre-load into the prompt — no tool calls needed mid-generation
|
||||
- **Speculative pre-fetching** (like CPU cache prefetching)
|
||||
|
||||
#### 4. Parallelize Independent Work
|
||||
- Minimize startup cost for new parallel agents (pre-built templates, warm connections)
|
||||
- Use the dependency graph to identify independent work automatically
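Identifying independent work from the dependency graph can be done with a simple layered topological sort. The `PlannedTask` shape and function name are assumptions made for this sketch.

```typescript
interface PlannedTask {
  id: string;
  dependsOn: string[];
}

// Groups tasks into waves: every task in a wave depends only on earlier waves,
// so the tasks inside one wave can run in parallel.
function planParallelWaves(tasks: PlannedTask[]): string[][] {
  const done: Record<string, boolean> = {};
  let remaining = tasks.slice();
  const waves: string[][] = [];
  while (remaining.length > 0) {
    const ready = remaining.filter((t) => t.dependsOn.every((d) => done[d] === true));
    if (ready.length === 0) {
      const stuck = remaining.map((t) => t.id).join(", ");
      throw new Error(`Dependency cycle or missing task among: ${stuck}`);
    }
    for (const t of ready) {
      done[t.id] = true;
    }
    remaining = remaining.filter((t) => done[t.id] !== true);
    waves.push(ready.map((t) => t.id));
  }
  return waves;
}
```

The number of waves is a rough proxy for the critical path: adding agents beyond the widest wave buys nothing.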
|
||||
|
||||
#### 5. Stream Everything, Block on Nothing
|
||||
- Process tokens as they arrive
|
||||
- Pipeline parallelism: start formatting code while commit message is still generating
|
||||
|
||||
#### 6. Cache Aggressively
|
||||
- In-memory cache of everything the agent might need
|
||||
- Cross-task caching for unchanged files
|
||||
- Cache LLM results for deterministic inputs (boilerplate, type definitions)
|
||||
|
||||
#### 7. Minimize Token Waste
|
||||
- Dense context, not verbose context
|
||||
- Structured formats for structured data
|
||||
- Minify reference code that's informational, not for modification
|
||||
|
||||
### Anti-Patterns That Murder Speed
|
||||
|
||||
| Anti-Pattern | Fix |
|-------------|-----|
| Re-verifying things that can't have changed | Dependency-aware selective re-verification |
| Excessive self-reflection on simple tasks | Complexity-based workflow routing |
| Over-summarization between micro-steps | Only full context reset at task boundaries |
| Waiting for human approval on auto-verifiable work | Human checkpoints at milestones, not tasks |
| Quadratic history growth | Aggressive compression at every transition |
| Synchronous blocking tools | Async everything, pipeline parallelism |
|
||||
|
||||
### The Speed Multiplier Nobody Talks About
|
||||
|
||||
**Failure prediction.** Track patterns across tasks. If certain task types fail on first attempt, pre-load extra guidance. Preventing a failed iteration is faster than executing one.
|
||||
|
||||
> The magical feeling of speed comes from only doing things that matter, and then doing those things as fast as possible. The system should feel like the agent knew what to do and just did it.
|
||||
|
||||
---
|
||||
|
|
@ -0,0 +1,33 @@
|
|||
# Top 10 Tips for a World-Class Agent
|
||||
|
||||
### 1. The Orchestrator Is the Product, Not the Model
|
||||
The model is a commodity. Two teams using the same model produce wildly different results based on orchestration quality. Invest 70% of effort in the orchestrator, 30% in prompt engineering.
|
||||
|
||||
### 2. Context Assembly Is a Craft
|
||||
Profile your context like you'd profile code. Measure which context elements correlate with first-attempt success. Prune relentlessly. The right files, in the right order, with the right framing, at the right level of detail.
|
||||
|
||||
### 3. Make the Feedback Loop the Fastest Thing
|
||||
Treat feedback loop latency like a game engine treats frame rate. Incremental builds, targeted tests, pre-warmed servers, cached deps. Put it on a dashboard you look at every day.
|
||||
|
||||
### 4. Build First-Class Error Recovery Into Every Layer
|
||||
Retry with variation (never the same way twice), automatic rollback, structured escalation, ability to park blocked tasks. **Design failure paths first** — they'll get more use than you expect.
|
||||
|
||||
### 5. Verify Through Execution, Not Self-Assessment
|
||||
An agent that asks itself "is this correct?" says yes 90% of the time regardless. Run the code, observe results, get ground truth. Self-assessment supplements execution-based verification, never replaces it.
|
||||
|
||||
### 6. Return Structured, Actionable Data from Every Tool
|
||||
Don't return raw terminal output. Return structured objects: what passed, what failed, where, why. Remove cognitive load from the model — it directly translates to better decisions.
|
||||
|
||||
### 7. Use a DAG, Not a Flat List
|
||||
Explicit inputs, outputs, dependencies, acceptance criteria per task. Maximizes parallelism, identifies critical path, enables smart impact tracing when things change.
|
||||
|
||||
### 8. Keep the Manifest Small and Always Current
|
||||
One file, <1000 tokens, always included. Updated automatically after every task completion. If it drifts from reality, everything downstream suffers.
|
||||
|
||||
### 9. Build Observability From Day One
|
||||
Log every LLM call. Track iterations per task type, token usage, failure rates, first-attempt success rates. This is your training data for improving the orchestrator. Teams that instrument well improve 10x faster.
|
||||
|
||||
### 10. Make Human Touchpoints High-Leverage and Low-Friction
|
||||
Present specific questions with context, not walls of text. "The API could return nested or flat fields — which fits your vision?" is a 5-second decision. "Please review everything" takes 20 minutes.
|
||||
|
||||
---
|
||||
33
docs/building-coding-agents/10-top-10-pitfalls-to-avoid.md
Normal file
|
|
@ -0,0 +1,33 @@
|
|||
# Top 10 Pitfalls to Avoid
|
||||
|
||||
### 1. Putting Workflow Logic in the Prompt
|
||||
Control flow belongs in TypeScript with actual conditionals and state tracking. Prompts that describe workflows are fragile, inconsistently followed, and impossible to debug with a debugger.
|
||||
|
||||
### 2. Unbounded Context Accumulation
|
||||
Each iteration adds noise. After 7 iterations, context is bloated with stale information from attempts 1–5. **Carry forward only current state and most recent error.** Summarize or discard everything else.
|
||||
|
||||
### 3. Trusting the Model's Self-Assessment of Completion
|
||||
Models are biased toward completion. Never let the model be the sole judge. Use deterministic checks: tests pass, it builds, acceptance criteria have corresponding passing tests.
|
||||
|
||||
### 4. Over-Engineering Tools Before Understanding Workflows
|
||||
Start with a small general-purpose set (file read/write, execute command, run tests). Watch where the agent struggles in real tasks. Then build specialized tools to solve observed problems.
|
||||
|
||||
### 5. Neglecting the Cold-Start Problem
|
||||
The first task is fundamentally different from the twentieth. Use deterministic templates for project scaffolding, conventions, and test infrastructure before handing off to the agent.
|
||||
|
||||
### 6. Too Much Autonomy Too Early
|
||||
An agent going slightly wrong for 2 hours produces a mountain of throwaway code. Start with more checkpoints than needed. Earn autonomy incrementally for proven task types.
|
||||
|
||||
### 7. Ignoring Compounding Inconsistency
|
||||
Different naming, different patterns, different structures across files = technical debt that confuses the agent itself later. Enforce consistency through linting or by showing existing examples before new code.
|
||||
|
||||
### 8. Building for the Demo, Not the Recovery
|
||||
The demo is the happy path. The product is what happens when tests fail, builds break, APIs change. **Spend 2x as much time on failure/recovery paths.** The agent spends more time recovering than succeeding on the first attempt.
|
||||
|
||||
### 9. Treating All Tasks as Equally Complex
|
||||
Simple utility functions and complex state management shouldn't go through the same workflow. Classify by complexity. Simple tasks get a fast path. Complex tasks get the full treatment.
|
||||
|
||||
### 10. Not Measuring What Actually Matters
|
||||
Don't just track tokens and costs. Measure: first-attempt success rate, iterations to completion, human intervention frequency, code survival rate (does it survive the next 3 tasks?), stuck-detection accuracy. These guide real improvement.
|
||||
|
||||
---
|
||||
|
|
@ -0,0 +1,97 @@
|
|||
# God-Tier Context Engineering
|
||||
|
||||
### The Core Principle
|
||||
|
||||
> God-tier context engineering treats the context window as a **designed experience for the model**, not as a bucket you throw information into. The context window is the UX of your agent. Design it accordingly.
|
||||
|
||||
### The 10 Commandments of Context Engineering
|
||||
|
||||
#### 1. The Pyramid of Relevance
|
||||
- **Sharp focus:** Active files at full detail
|
||||
- **Present but compressed:** Interface contracts, manifest, task definition
|
||||
- **Summarized or absent:** Other components' internals, completed task histories
|
||||
|
||||
Each tier has a token budget. If the full-resolution tier is large, the outer tiers compress harder.
|
||||
|
||||
#### 2. Context Is a Cache, Not a History
|
||||
Treat it like a CPU cache: holds exactly what's needed now, everything else evicted. The question isn't "what has happened" but "what does the model need to see right now?"
|
||||
|
||||
#### 3. Separate Reference from Instruction
|
||||
- **Instruction context** (what to do) → beginning and end of prompt (highest attention)
|
||||
- **Reference context** (helpful info) → middle, clearly delineated
|
||||
|
||||
Manage them independently. Compress reference aggressively while keeping instructions at full detail.
|
||||
|
||||
#### 4. Earn Every Token's Place
|
||||
Implement a token budget system:
|
||||
|
||||
| Category | Budget |
|----------|--------|
| System prompt + behavioral instructions | ~15% |
| Manifest | ~5% |
| Task spec + acceptance criteria | ~20% |
| Active code files | ~40% |
| Interface contracts | ~10% |
| Reserve (tool results, errors) | ~10% |
|
||||
|
||||
When any category exceeds budget, intelligently summarize (not truncate).
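To make the budget concrete, here is a sketch that turns the percentages above into per-category token limits and flags categories that need summarizing. The category keys and function names are hypothetical; only the percentages come from the table.

```typescript
// Percentages from the budget table; the key names are illustrative.
const BUDGET_PERCENT = {
  systemPrompt: 15,
  manifest: 5,
  taskSpec: 20,
  activeCode: 40,
  interfaceContracts: 10,
  reserve: 10,
};

type BudgetCategory = keyof typeof BUDGET_PERCENT;
const CATEGORIES = Object.keys(BUDGET_PERCENT) as BudgetCategory[];

// Converts a total window size into a token limit per category.
function allocateBudget(windowTokens: number): Record<BudgetCategory, number> {
  const limits = {} as Record<BudgetCategory, number>;
  for (const category of CATEGORIES) {
    limits[category] = Math.floor((windowTokens * BUDGET_PERCENT[category]) / 100);
  }
  return limits;
}

// Lists the categories whose current usage exceeds their limit.
function overBudget(usage: Record<BudgetCategory, number>, windowTokens: number): BudgetCategory[] {
  const limits = allocateBudget(windowTokens);
  return CATEGORIES.filter((category) => usage[category] > limits[category]);
}
```

For a 20,000-token call this gives 8,000 tokens to active code and 1,000 to the manifest; anything returned by `overBudget` would be routed to summarization rather than truncated.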
|
||||
|
||||
#### 5. Write for the Model's Attention Pattern
|
||||
- Critical info at the very beginning and reiterated at the end
|
||||
- Structured blocks with clear headers and delimiters
|
||||
- Consistent formatting conventions
|
||||
|
||||
```
TASK: Implement password reset flow
STATUS: New
DEPENDS ON: auth-module (complete), email-service (complete)
ACCEPTANCE CRITERIA:
- User can request reset via email
- Token expires after 30 minutes
- New password meets existing validation rules
- All existing auth tests pass
RELEVANT INTERFACES: [below]
ACTIVE FILES: [below]
```
|
||||
|
||||
#### 6. Compress at Every State Transition
|
||||
- Task completion → 50–100 token completion record
|
||||
- Use a **dedicated summarization call** with a tight prompt (not the working agent self-summarizing)
|
||||
- **Cascading summarization:** Task summaries → milestone summaries → phase summaries (5:1 compression ratio at each level)
|
||||
|
||||
#### 7. Use the Filesystem as Your Infinite Context Window
|
||||
- Organize files for retrieval, not human browsing
|
||||
- Predictable naming conventions = instant lookup
|
||||
- Essentially a custom database on top of the filesystem
|
||||
|
||||
#### 8. Profile Context Quality, Not Just Size
|
||||
Track first-attempt success rate as a function of context composition. What was in context when it succeeded vs failed? Let data guide what constitutes high-quality context.
|
||||
|
||||
#### 9. Dynamic Context Based on Task Phase
|
||||
Different phases need different context:
|
||||
|
||||
| Phase | Optimal Context |
|-------|----------------|
| Understanding | Spec, acceptance criteria, broad architectural context |
| Implementation | Active files, interface contracts, coding patterns |
| Debugging | Failing test output, relevant code, test code |
| Verification | Acceptance criteria prominently, ability to exercise feature |
|
||||
|
||||
#### 10. Design for Context Recovery
|
||||
- **Checkpoint** context state at task starts and phase transitions
|
||||
- On detected confusion (repeated failures, increasing iterations, off-task output): **roll back to checkpoint** and re-enter with fresh context + concise failure info + strategy hint
|
||||
- Structured recovery ≠ naive retry. It rebuilds context from scratch with learned information.
|
||||
|
||||
### The God-Tier Strategy in One Sentence
|
||||
|
||||
> Orchestrator-assembled minimal slice + persistent hierarchical memory. Every single LLM call stays 8k–25k tokens while the agent has perfect knowledge of a 500k-line codebase and months of project history.
|
||||
|
||||
---
|
||||
|
||||
---
|
||||
|
||||
# Part II: The Hard Problems (Grey Area Synthesis)
|
||||
|
||||
> Synthesized from a second round of deep conversations with all four models, targeting the 13 hardest unsolved problems in autonomous coding agents — plus a critical question on accessibility for non-technical users.
|
||||
|
||||
---
|
||||
|
|
@ -0,0 +1,56 @@
|
|||
# Handling Ambiguity & Contradiction
|
||||
|
||||
**The universal consensus:** This is the highest-cost failure mode. An agent confidently building the wrong thing based on a reasonable-but-incorrect interpretation burns hours of work discovered only at milestone reviews.
|
||||
|
||||
### The Three-Layer Strategy (All 4 Models Agree)
|
||||
|
||||
#### Layer 1: Classification of Ambiguity Type
|
||||
|
||||
Every requirement should be classified during planning:
|
||||
|
||||
| Classification | Action |
|---------------|--------|
| **Clear and actionable** | Proceed autonomously |
| **Ambiguous but decidable with sensible defaults** | Proceed + document assumptions |
| **Genuinely unclear or contradictory** | Halt and escalate to human |
|
||||
|
||||
> The middle category is where most real work lives. "The user should be able to reset their password" has a hundred implied decisions. A good agent resolves these with sensible defaults and **documents the assumptions it made** — it doesn't ask about every one.
|
||||
|
||||
#### Layer 2: The Assumption Ledger
|
||||
|
||||
Every task completion appends an entry to `assumptions.md` listing every interpretive decision the agent made, along with a confidence score:
|
||||
|
||||
```json
{
  "assumptions": [
    "Password reset tokens expire after 30 minutes (common security practice)",
    "Email delivery, not SMS",
    "No password history check"
  ],
  "confidence": 0.82
}
```
|
||||
|
||||
The human reviews these at **milestones, not in real-time** — preserving speed while maintaining correctness.
|
||||
|
||||
#### Layer 3: Contradiction Detection Pass
|
||||
|
||||
Before execution begins, a **dedicated reasoning pass** (separate from planning) scans for conflicts:
|
||||
- Do requirements contradict each other?
|
||||
- Do acceptance criteria conflict with stated architecture?
|
||||
- Are there implicit assumptions in one requirement that violate another?
|
||||
|
||||
### Escalation Threshold
|
||||
|
||||
- **Impact confined to current task** → decide and document
|
||||
- **Impact touches interface contracts** → escalate (wrong interpretation cascades)
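Combining the Layer 1 classification with the escalation threshold gives a small decision function. This is one possible reading of the rules above, with hypothetical type names; in particular, escalating a "decidable" ambiguity whenever it touches an interface contract is an interpretation, not something the source spells out as code.

```typescript
type AmbiguityClass = "clear" | "decidable" | "unclear";
type ImpactScope = "currentTask" | "interfaceContract";
type AmbiguityAction = "proceed" | "proceedAndDocument" | "escalate";

function decideAmbiguityAction(kind: AmbiguityClass, scope: ImpactScope): AmbiguityAction {
  if (kind === "unclear") return "escalate";
  if (kind === "clear") return "proceed";
  // Decidable with defaults: safe only when a wrong guess stays local to the task.
  return scope === "interfaceContract" ? "escalate" : "proceedAndDocument";
}
```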
|
||||
|
||||
Grok adds a **"Multi-Hypothesis Planning"** approach: when underspecification is detected, generate three distinct "Intent Hypotheses" (The Minimalist Path, The Scalable Path, The Feature-Rich Path). If the semantic distance between them exceeds a threshold, hard-halt and present a decision matrix to the human.
|
||||
|
||||
### The Deepest Pitfall
|
||||
|
||||
Models don't naturally express uncertainty — they pick an interpretation and run with it as if it's obviously correct. The system prompt must explicitly instruct confidence-level flagging, and the orchestrator must treat low-confidence decisions differently from high-confidence ones.
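One way the orchestrator might treat low-confidence decisions differently is to pull them forward for review instead of waiting for the milestone. The sketch below assumes ledger entries shaped like the JSON example earlier in this section; the `REVIEW_THRESHOLD` value of 0.7 is an arbitrary placeholder, not a recommended setting.

```typescript
interface AssumptionLedgerEntry {
  taskId: string;
  assumptions: string[];
  confidence: number; // 0..1, self-reported by the model
}

// Hypothetical cutoff; it would need tuning against observed rework.
const REVIEW_THRESHOLD = 0.7;

// Splits entries into those reviewed early and those that wait for the milestone.
function splitForReview(entries: AssumptionLedgerEntry[]): {
  needsEarlyReview: AssumptionLedgerEntry[];
  milestoneReview: AssumptionLedgerEntry[];
} {
  return {
    needsEarlyReview: entries.filter((e) => e.confidence < REVIEW_THRESHOLD),
    milestoneReview: entries.filter((e) => e.confidence >= REVIEW_THRESHOLD),
  };
}
```

Self-reported confidence is itself a model output, so it is best treated as a routing hint rather than a measurement.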
|
||||
|
||||
> **Reported result:** Grok claims this pattern cuts wrong-path rework by ~65% in 2026 evaluations.
|
||||
|
||||
---
|
||||
|
|
@ -0,0 +1,34 @@
|
|||
# Long-Running Memory Fidelity
|
||||
|
||||
**The core problem:** Every compression loses information. Over enough compressions, summaries drift from reality like a photocopy of a photocopy. The system can't easily tell it's happening because it only sees the current summary, not what was lost.
|
||||
|
||||
### Multi-Tier Memory with Different Decay Rates
|
||||
|
||||
| Tier | Decay Rate | Content | Update Strategy |
|------|-----------|---------|-----------------|
| **Manifest** | Fast (updates every task) | Current state only, <1000 tokens | Continuous overwrite, no history |
| **Decision Log** | Never decays (append-only) | Every significant architectural decision + rationale | Never summarized, grows linearly |
| **Task Archive** | Medium | Compressed task completion records | Available for retrieval, not routinely loaded |
|
||||
|
||||
### The Critical Mechanism: Periodic Reconciliation
|
||||
|
||||
All four models converge on some form of automated audit:
|
||||
|
||||
- **Claude:** Every milestone or N tasks — agent compares manifest against actual codebase
|
||||
- **Gemini:** Every N commits, spawn a "History Auditor" agent whose sole job is manifest-vs-code comparison
|
||||
- **GPT:** Self-healing summaries with checksums — when source files change, invalidate and regenerate
|
||||
- **Grok:** Deterministic "Memory Fidelity Audit" node every 5 checkpoints — samples key invariants, scores drift 0-100, auto-rebuilds if drift >15%
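The checksum-based variant can be sketched in a few lines: store a checksum of the source alongside each summary, and treat the summary as stale when the file's current checksum differs or the file is gone. The hash below is a simple 32-bit polynomial rolling hash chosen to keep the example dependency-free; a real system would more likely use a cryptographic digest. All names are hypothetical.

```typescript
// Simple non-cryptographic checksum, used here only for illustration.
function checksum(text: string): number {
  let h = 0;
  for (let i = 0; i < text.length; i++) {
    h = (h * 31 + text.charCodeAt(i)) >>> 0;
  }
  return h;
}

interface StoredSummary {
  path: string;
  sourceChecksum: number; // checksum of the file when the summary was written
  summary: string;
}

// Returns paths whose summaries must be regenerated from the current source.
function findStaleSummaries(
  summaries: StoredSummary[],
  readSource: (path: string) => string | undefined,
): string[] {
  return summaries
    .filter((s) => {
      const current = readSource(s.path);
      return current === undefined || checksum(current) !== s.sourceChecksum;
    })
    .map((s) => s.path);
}
```

Regenerating from the source file, rather than patching the old summary, is what keeps this consistent with the rule against summarizing summaries.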
|
||||
|
||||
### The Golden Rule
|
||||
|
||||
> **Never summarize summaries.** Each compression layer regenerates from the one below. The codebase is always the lossless source of truth.
|
||||
|
||||
### The Most Dangerous Form of Drift
|
||||
|
||||
Not factual inaccuracy — **the loss of "why."** The manifest says "auth uses JWT tokens." Three months ago there was a long discussion about why JWT was chosen over session-based auth. That context is exactly what gets compressed away. The **append-only decision log** solves this by preserving *why* indefinitely even as *what* gets continuously compressed.
|
||||
|
||||
### Phase Boundary Refresh
|
||||
|
||||
For very long projects (weeks/months), **rebuild the manifest from scratch** at phase boundaries by having the agent read the actual codebase + decision log — rather than carrying forward the old manifest with incremental updates. This is the equivalent of defragmenting a hard drive.
|
||||
|
||||
---
|
||||
|
|
@ -0,0 +1,25 @@
|
|||
# Multi-Agent Semantic Conflict Resolution
|
||||
|
||||
**The hard case:** Git-level merge conflicts are easy. The real problem is code that merges cleanly but doesn't work — agents honoring the same typed interface while disagreeing on semantics (e.g., Agent A returns `null` for "not found," Agent B treats `null` as "error").
|
||||
|
||||
### Three Lines of Defense (Universal Agreement)
|
||||
|
||||
#### 1. Semantically Rich Interface Contracts
|
||||
|
||||
Don't just define type signatures — define **behavioral contracts**: What does `null` mean? What are the error semantics? What invariants does the caller rely on? Contracts should be miniature specs, not just type definitions.
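For the `null` example specifically, one option is to encode the behavioral contract in the type itself so that "not found" and "error" cannot be confused. The `UserRecord` and `LookupResult` types are illustrative assumptions, not part of any existing interface.

```typescript
interface UserRecord {
  id: string;
  email: string;
}

// "Absent" and "failed" are different shapes, so neither side has to guess
// what a bare null means.
type LookupResult =
  | { kind: "found"; user: UserRecord }
  | { kind: "notFound" }
  | { kind: "error"; reason: string };

function describeLookup(result: LookupResult): string {
  switch (result.kind) {
    case "found":
      return `user ${result.user.email}`;
    case "notFound":
      return "no such user (not an error)";
    case "error":
      return `lookup failed: ${result.reason}`;
  }
}
```

Types only cover part of a behavioral contract; invariants such as ordering or idempotency still need to be written down and tested.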
|
||||
|
||||
#### 2. Pre-Written Integration Tests
|
||||
|
||||
Write integration tests **during planning, before parallel execution begins** — tests that exercise semantic expectations, not just types. These are waiting when parallel branches converge.
|
||||
|
||||
#### 3. Dedicated Integration/Reconciliation Agent
|
||||
|
||||
After parallel branches merge, a focused agent gets: interface contracts + both implementations + integration tests. Its job is finding semantic mismatches, not rebuilding.
|
||||
|
||||
### The Highest-Value Technique
|
||||
|
||||
**Adversarial edge-case generation at integration points.** The integration agent reads both implementations, sees how each handles boundaries, and generates new tests that specifically probe the assumption gaps between them. This catches the subtlest bugs.
|
||||
|
||||
Gemini adds the concept of a **"Shadow Merge"** agent that runs "Cross-Impact Analysis" before actual merge — looking for "Logical Race Conditions" where Worker A changed a utility that Worker B relied on, even when the git merge is clean.
|
||||
|
||||
---
|
||||
|
|
@ -0,0 +1,35 @@
|
|||
# Legacy Code & Brownfield Onboarding
|
||||
|
||||
**The fundamental difference:** Greenfield = design → implement. Brownfield = **observe → infer → validate → modify.**
|
||||
|
||||
### The Onboarding Pipeline (All 4 Models Agree)
|
||||
|
||||
#### Phase 1: Structural Analysis (Deterministic)
|
||||
- Dependency graph mapping
|
||||
- Module identification, LOC per component
|
||||
- Test coverage analysis, entry point discovery
|
||||
- Database schema mapping
|
||||
|
||||
#### Phase 2: Convention Extraction (LLM-Assisted)
|
||||
- Sample representative files across modules
|
||||
- Identify: error handling patterns, naming conventions, API structure, DB access patterns, testing patterns
|
||||
- Output: a **conventions document** that becomes critical reference context
|
||||
|
||||
#### Phase 3: Pattern Mining
|
||||
- Extract implicit "tribal knowledge" — workarounds for browser bugs, special customer cases, performance hacks that look like mistakes
|
||||
- Generate decision records into project state
|
||||
|
||||
### The Cardinal Rules
|
||||
|
||||
| Rule | Why |
|------|-----|
| **Observe first, edit later** | Agents must never modify code they don't understand |
| **Preserve local consistency over global ideals** | Resist the "Junior Refactor": don't "fix" legacy code to modern standards |
| **Add characterization tests before modifying** | Tests that document *current behavior*, not *correct behavior* |
| **Minimal, surgical modifications** | Refactoring is a separate task requiring explicit human approval |
|
||||
|
||||
### The Biggest Pitfall
|
||||
|
||||
The agent will try to refactor legacy code to match its sense of good patterns. Left unchecked, this produces massive diffs that change behavior in subtle ways. **Enforce strict rules:** modifications to legacy code should be minimal and surgical.
|
||||
|
||||
---
|
||||
34
docs/building-coding-agents/16-encoding-taste-aesthetics.md
Normal file
|
|
@ -0,0 +1,34 @@
|
|||
# Encoding Taste & Aesthetics
|
||||
|
||||
**The honest frontier:** This is where all four models are most candid about current limitations.
|
||||
|
||||
### What CAN Be Automated
|
||||
|
||||
| Technique | Description |
|-----------|-------------|
| **Reference-based extraction** | "Feels like Linear" → extract concrete attributes: spacing ratios, animation timing curves, color relationships, typography |
| **Style specification** | Convert extracted attributes to verifiable parameters: "transitions 150-200ms ease-out, 8px grid spacing, specific contrast ratios" |
| **Automated verification** | Lighthouse scores, visual regression tests, accessibility audits, performance budgets, design system linting |
| **Visual comparison** | Render output, compare against reference screenshots using vision-capable models |
| **A/B comparison** | Show two versions, human picks which "feels better"; faster than absolute judgment |
|
||||
|
||||
### What CANNOT Be Automated
|
||||
|
||||
The **gestalt** — the overall feeling, emotional response, sense of quality emerging from a thousand small interacting decisions. *Does this feel premium? Fast? Trustworthy?* These are fundamentally subjective.
|
||||
|
||||
### The Optimal Strategy
|
||||
|
||||
**Narrow the gap** by converting as much "taste" as possible into **concrete, verifiable specifications upfront:**
|
||||
|
||||
- Not "use nice spacing" → "16px between sections, 8px between related elements, 4px between tightly coupled elements"
|
||||
- Exact animation timing curves, color values with contrast ratios, typography weights and sizes
|
||||
|
||||
Then **reserve human review for the remaining subjective layer** with structured, specific questions:
|
||||
|
||||
> "Does the density feel right? Does the transition timing feel snappy enough? Does the empty state feel intentional or broken?"
|
||||
|
||||
### The Emerging Frontier
|
||||
|
||||
Vision-capable models for aesthetic evaluation: render the output, capture a screenshot, and compare it against references on specific visual dimensions. Imperfect but improving rapidly. Grok reports ~80-85% of taste can be automated this way; the remaining 15-20% stays human-only.
|
||||
|
||||
---
|
||||
|
|
@ -0,0 +1,31 @@
|
|||
# Irreversible Operations & Safety Architecture
|
||||
|
||||
**The core principle (universal agreement):** Irreversible operations should **never be executed by the agent.** The agent prepares them; the human executes them.
|
||||
|
||||
### Risk-Graded Action Classification
|
||||
|
||||
| Class | Examples | Policy |
|-------|----------|--------|
| **Reversible** | Code edits, UI changes, unit tests | Full autonomy + auto-revert on failure |
| **Semi-Reversible** | New files, dependencies | Auto-execute + git checkpoint |
| **Irreversible** | DB migrations, external API changes, data transformations | Human-in-the-loop required |
| **External Side-Effect** | Payment charges, third-party API calls with side effects | Human approval + dry-run + rollback plan |
|
||||
|
||||
### Per-Operation Protocols
|
||||
|
||||
| Operation | Agent Does | Human Does |
|-----------|-----------|-----------|
| **Database migrations** | Write migration + rollback + tests, run against test DB, produce review package | Review package, execute migration |
| **External APIs** | Build + test against sandbox/mock versions | Switch from sandbox to production |
| **Deployment** | Produce artifacts, verify in staging | Trigger production deployment |
|
||||
|
||||
### The Classification Must Be:
|
||||
- **Static and deterministic** (not left to the agent's judgment)
|
||||
- **Conservative** (if there's doubt, classify as irreversible)
|
||||
- **Enforced by the orchestrator** (the agent never encounters an irreversible operation without interception)
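The three properties above map onto a static lookup with a conservative default. A minimal sketch follows; the operation names in `RISK_TABLE` are hypothetical and would be replaced by whatever tool names the orchestrator actually exposes.

```typescript
type RiskClass = "reversible" | "semiReversible" | "irreversible" | "externalSideEffect";

// Static table owned by the orchestrator, never by the model.
const RISK_TABLE: Record<string, RiskClass> = {
  editFile: "reversible",
  runUnitTests: "reversible",
  createFile: "semiReversible",
  addDependency: "semiReversible",
  runDbMigration: "irreversible",
  chargePayment: "externalSideEffect",
};

function classifyOperation(operation: string): RiskClass {
  // Conservative default: anything not explicitly listed is treated as irreversible.
  const known = Object.prototype.hasOwnProperty.call(RISK_TABLE, operation);
  return known ? RISK_TABLE[operation] : "irreversible";
}

// Gate the orchestrator checks before dispatching any tool call.
function requiresHuman(operation: string): boolean {
  const risk = classifyOperation(operation);
  return risk === "irreversible" || risk === "externalSideEffect";
}
```

Because unknown operations default to `irreversible`, adding a new tool without classifying it fails safe instead of silently gaining autonomy.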
|
||||
|
||||
### The Subtlety Most Miss
|
||||
|
||||
Data transformations that technically don't delete anything but **lose information through reformatting**. Converting a nullable column to non-nullable with a default value permanently destroys the distinction between rows that had real values and rows that got the default. These must be flagged with the same severity as deletions.
|
||||
|
||||
---
|
||||
|
|
@ -0,0 +1,31 @@
|
|||
# The Handoff Problem: Agent → Human Maintainability
|
||||
|
||||
**The failure modes of AI-generated code** that all four models identify:
|
||||
|
||||
### Known Anti-Patterns
|
||||
|
||||
| Pattern | Problem | Fix |
|---------|---------|-----|
| **Flat code** | Everything in one function/file to reduce inconsistency risk | Enforce human-friendly modular patterns |
| **Clever solutions** | Dense functional chains (`filter().map().reduce().flatMap()`) | Max 3 chained operations; extract named intermediates |
| **Useless comments** | `// filter active users` above a filter call | Require *why* comments, skip *what* comments |
| **Over-abstraction** | Creates clever custom abstractions no human can follow | Enforce standard framework patterns over custom inventions |
| **Missing breadcrumbs** | No README files in directories, no ADRs, no diagrams | Include documentation in task completion checklist |
|
||||
|
||||
### The Architecture That Maximizes Handoff Quality
|
||||
|
||||
**Enforce well-known frameworks and conventions** over custom patterns. A codebase using standard Next.js/Express/React patterns is immediately navigable. A codebase with custom-invented patterns requires learning a new system.
|
||||
|
||||
### Verification Mechanism
|
||||
|
||||
**Automated readability test:** Periodically have a **separate agent** (with no knowledge of the building agent's decisions) attempt to add a feature using only the code and docs. If it struggles, a human will too.
|
||||
|
||||
### Gemini's "Boring Code" Principle
|
||||
|
||||
> Humans hate "clever" AI code; they love "boring" AI code. Run a **Complexity Linter** — if a function has cyclomatic complexity >10, the reviewer agent rejects it.
|
||||
|
||||
### Grok's Maintainability Checklist
|
||||
|
||||
Every file gets auto-generated JSDoc/TS comments, and every major decision gets an ADR. No magic numbers, no over-abstraction. The critic node computes a mandatory "maintainability score" (cyclomatic complexity + test coverage + comment density).
|
||||
|
||||
---
|
||||
|
|
@ -0,0 +1,27 @@
|
|||
# When to Scrap and Start Over
|
||||
|
||||
### The Four Signals (Cross-Model Convergence)
|
||||
|
||||
| Signal | What It Looks Like |
|
||||
|--------|-------------------|
|
||||
| **Iteration count trending upward** | Task 1: 3 iterations. Task 2: 5. Task 3: 8. Complexity compounding, not resolving. |
|
||||
| **Test flakiness increasing** | Previously passing tests intermittently fail — hidden coupling being strained |
|
||||
| **Same files modified repeatedly** | Every task touches the same core module — god object absorbing too much responsibility |
|
||||
| **Acceptance criteria requiring exceptions** | "Works except when X" / "Passes if you ignore test Y" — agent negotiating with criteria |
|
||||
|
||||
### The Reassessment Protocol
|
||||
|
||||
When thresholds are crossed, trigger a **focused LLM call** with: manifest + original spec + task summaries + signal data. Prompt: *"Is the current approach viable or would a different architecture serve better? If different, what and why?"*
|
||||
|
||||
### The Critical Architectural Enabler: Make Rewrites Cheap
|
||||
|
||||
- Clean interface contracts + good test suites → rewriting internals while preserving interfaces is low-risk
|
||||
- Tests verify new implementation against same criteria
|
||||
- Interface contracts ensure nothing downstream breaks
|
||||
- **Every major approach on a branch** that can be discarded without affecting anything else
|
||||
|
||||
Gemini's **"Sunk-Cost Heuristic"**: Monitor "Task Re-entry Rate." If the same 3 tests have been attempted >5 times, or if the refactor-to-feature ratio exceeds 4:1, trigger a "Whiteboard Session."
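A minimal sketch of how an orchestrator might evaluate these two thresholds. The `ReentryStats` shape and the function name are illustrative assumptions, and "the same 3 tests" is read here as "at least three tests, each attempted more than five times":

```typescript
// Hypothetical bookkeeping the orchestrator would maintain per project.
interface ReentryStats {
  attemptsByTest: Record<string, number>; // test name -> fix attempts so far
  refactorTasks: number; // tasks that only restructured code
  featureTasks: number; // tasks that delivered new behavior
}

function shouldTriggerWhiteboard(stats: ReentryStats): boolean {
  // Count tests that have been re-entered more than five times.
  const stuckTests = Object.values(stats.attemptsByTest).filter(
    (attempts) => attempts > 5,
  ).length;

  // Refactor-to-feature ratio; all-refactor work counts as unbounded.
  let ratio = 0;
  if (stats.featureTasks > 0) {
    ratio = stats.refactorTasks / stats.featureTasks;
  } else if (stats.refactorTasks > 0) {
    ratio = Infinity;
  }

  return stuckTests >= 3 || ratio > 4;
}
```

Keeping the check deterministic means the decision to reassess never depends on the agent's own opinion of its progress.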
|
||||
|
||||
Grok adds **parallel experimentation**: create a "Rewrite Branch" subgraph, run the same vision on a clean slate for one vertical slice, compare metrics. Only merge if superior. Cost is near-zero because it runs in parallel and is discarded on failure.
|
||||
|
||||
---
|
||||
28
docs/building-coding-agents/20-error-taxonomy-routing.md
Normal file
|
|
@ -0,0 +1,28 @@
|
|||
# Error Taxonomy & Routing
|
||||
|
||||
**The key insight:** Different errors have fundamentally different causes and optimal resolution strategies. Treating them uniformly is one of the biggest sources of wasted iterations.
|
||||
|
||||
### The Optimal Taxonomy
|
||||
|
||||
| Error Class | Context Needed | Optimal Handler | Escalation |
|
||||
|-------------|---------------|-----------------|------------|
|
||||
| **Syntax/Type** | Error message + offending file + types | Deterministic fast path (no LLM needed) | Only if fast path fails |
|
||||
| **Logic** | Failing test (expected vs actual) + implementation + spec | LLM with medium, focused context | After 3 attempts |
|
||||
| **Design** | Original spec + architecture + interface contracts + implementation | LLM with broad context | Often needs human input |
|
||||
| **Performance** | Profiling data + benchmarks + code | Specialist optimization agent | If regression >2x |
|
||||
| **Security** | Static analysis results + secure pattern reference | Conservative fix prompt | Always flag for review |
|
||||
| **Environment** | Environment config + recent dep changes + error output | Specialized env context | If not auto-resolved |
|
||||
| **Flaky Tests** | Results of repeated runs confirming flakiness | Quarantine, don't fix | To infrastructure agent |
|
||||
|
||||
### Critical Routing Rules
|
||||
|
||||
- **Flaky tests:** Detect by running failing tests multiple times. If inconsistent, **quarantine** — never trigger a fix cycle.
|
||||
- **Environment errors:** Classify as potentially environmental when they appear in build/startup rather than tests.
|
||||
- **Security:** Caught by static analysis in the deterministic layer, not by the LLM. Run security linting after every task.
|
||||
- **Syntax/Type:** Hit a deterministic fast path first. Missing import? Search codebase for the export. Only escalate to LLM if mechanical fix fails.
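The routing rules above can be sketched as a small deterministic classifier. This is an illustration under assumptions, not a reference implementation: the `Failure` fields are hypothetical, and design, performance, and security errors are left to other signals (spec review, profiling, static analysis) that are not modeled here.

```typescript
type ErrorClass =
  | "syntax-type" | "logic" | "design" | "performance"
  | "security" | "environment" | "flaky-test";

interface Route { handler: string; escalation: string; }

// Handler and escalation text taken from the taxonomy table.
const ROUTES: Record<ErrorClass, Route> = {
  "syntax-type": { handler: "deterministic fast path (no LLM)", escalation: "only if fast path fails" },
  logic: { handler: "LLM with medium, focused context", escalation: "after 3 attempts" },
  design: { handler: "LLM with broad context", escalation: "often needs human input" },
  performance: { handler: "specialist optimization agent", escalation: "if regression >2x" },
  security: { handler: "conservative fix prompt", escalation: "always flag for review" },
  environment: { handler: "specialized env context", escalation: "if not auto-resolved" },
  "flaky-test": { handler: "quarantine, don't fix", escalation: "infrastructure agent" },
};

// Hypothetical failure record assembled by the orchestrator.
interface Failure {
  phase: "build" | "startup" | "test";
  compilerDiagnostic: boolean; // true when the compiler or type checker reported it
  rerunPassed: boolean[]; // outcome of each rerun of the failing test
}

// Mixed pass/fail results across reruns indicate flakiness.
function isFlaky(rerunPassed: boolean[]): boolean {
  return rerunPassed.some((p) => p) && rerunPassed.some((p) => !p);
}

function classify(f: Failure): ErrorClass {
  if (f.compilerDiagnostic) return "syntax-type";
  if (f.phase !== "test") return "environment"; // build/startup, not tests
  if (isFlaky(f.rerunPassed)) return "flaky-test"; // never enters a fix cycle
  return "logic";
}

function route(f: Failure): Route {
  return ROUTES[classify(f)];
}
```

The flakiness check runs before the logic fallback, so an inconsistent test is quarantined rather than sent to the LLM.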
|
||||
|
||||
### The Architecture
|
||||
|
||||
The orchestrator classifies every error → selects the appropriate context assembly strategy → optionally selects a different prompt framing. The agent experiences this as *"I got exactly the information I need"* rather than *"I got a dump of everything."*
|
||||
|
||||
---
|
||||
|
|
@ -0,0 +1,26 @@
|
|||
# Cost-Quality Tradeoff & Model Routing
|
||||
|
||||
### The Key Insight
|
||||
|
||||
Quality requirements vary enormously across task types, but most systems use the same model for everything.
|
||||
|
||||
### The Optimal Model Routing Strategy (All 4 Agree)
|
||||
|
||||
| Task Type | Model Tier | Rationale |
|
||||
|-----------|-----------|-----------|
|
||||
| **Planning, architecture, critique** | Frontier (always) | Planning errors cascade through every downstream task |
|
||||
| **Ambiguity resolution** | Frontier | Wrong interpretation = wasted execution |
|
||||
| **Well-specified implementation** (CRUD, standard UI, utilities) | Mid-tier / capable but cheaper | Task is well-defined, patterns established |
|
||||
| **Code review, test generation** | Mid-tier | Evaluating against known criteria, not generating novel solutions |
|
||||
| **Summarization** (task records, manifest updates) | Lightest viable | Language competence, minimal reasoning depth |
|
||||
| **Boilerplate** | Small/fast model | Predictable output, low reasoning requirements |
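As a sketch, the table reduces to a static lookup owned by the orchestrator. The task-type and tier names below are illustrative labels chosen for this example, not identifiers from any real system:

```typescript
type TaskType =
  | "planning" | "architecture" | "critique" | "ambiguity-resolution"
  | "implementation" | "code-review" | "test-generation"
  | "summarization" | "boilerplate";

type ModelTier = "frontier" | "mid-tier" | "lightest-viable" | "small-fast";

// One row of the routing table per task type.
const TIER_BY_TASK: Record<TaskType, ModelTier> = {
  planning: "frontier",
  architecture: "frontier",
  critique: "frontier",
  "ambiguity-resolution": "frontier",
  implementation: "mid-tier",
  "code-review": "mid-tier",
  "test-generation": "mid-tier",
  summarization: "lightest-viable",
  boilerplate: "small-fast",
};

function selectTier(task: TaskType): ModelTier {
  return TIER_BY_TASK[task];
}
```

Because the mapping is a `Record` over the full `TaskType` union, adding a new task type without assigning it a tier fails at compile time.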
|
||||
|
||||
### The Non-Obvious Cost Optimization
|
||||
|
||||
> **Reducing wasted tokens is higher leverage than reducing token price.** A bloated context window costs money on every single call. Trimming 500 unnecessary tokens from context assembly saves more over a project than switching to a model that's 10% cheaper.
|
||||
|
||||
### Measurement
|
||||
|
||||
Track **cost-per-successful-task**, not cost-per-task. If the cheaper model requires twice as many iterations, it's not actually cheaper. Grok reports 60-70% cost reduction with zero quality loss when routing is done at the orchestrator level.
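A small worked example of the metric, using made-up numbers (costs are in cents to keep the arithmetic exact). The `RunStats` shape is an assumption for illustration:

```typescript
interface RunStats {
  costPerAttemptCents: number; // average cost of one attempt
  attempts: number; // total attempts, including retries
  successes: number; // tasks that ultimately passed verification
}

function costPerSuccessfulTask(stats: RunStats): number {
  if (stats.successes === 0) return Infinity; // nothing succeeded
  return (stats.costPerAttemptCents * stats.attempts) / stats.successes;
}

// Hypothetical comparison: the cheaper model needs two attempts per success.
const cheaper = costPerSuccessfulTask({ costPerAttemptCents: 6, attempts: 200, successes: 100 });
const pricier = costPerSuccessfulTask({ costPerAttemptCents: 10, attempts: 100, successes: 100 });
```

Here the model that is 40% cheaper per attempt ends up at 12 cents per successful task against 10 cents for the pricier one.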
|
||||
|
||||
---
|
||||
|
|
@ -0,0 +1,30 @@
|
|||
# Cross-Project Learning & Reusable Intelligence
|
||||
|
||||
### What Transfers Well
|
||||
|
||||
| Type | Transferability | Example |
|
||||
|------|----------------|---------|
|
||||
| **Problem-solving patterns** (abstract) | ✅ High | "When implementing OAuth, these are the common pitfalls and the architecture that avoids them" |
|
||||
| **Code templates & scaffolding** | ✅ With adaptation | Proven auth module structure, tested payment integration pattern |
|
||||
| **Learned pitfalls** | ✅ High | "When integrating Stripe, these edge cases around webhooks most implementations miss" |
|
||||
| **Project-specific conventions** | ❌ Does not transfer | Architectural decisions are contextual |
|
||||
| **Domain logic** | ❌ Does not transfer | Business rules are project-specific |
|
||||
|
||||
### The Optimal Architecture: A Pattern Library
|
||||
|
||||
Each pattern includes:
|
||||
- Description of the problem it solves
|
||||
- The approach and tradeoffs
|
||||
- Common pitfalls
|
||||
- Verification tests
|
||||
- Reference implementation
|
||||
|
||||
### Growth Through Extraction, Not Manual Curation
|
||||
|
||||
When a task completes with high quality (first-attempt success, no subsequent modifications, clean review), flag it as a **candidate for pattern extraction.** A dedicated pass determines whether the solution embodies a generalizable pattern.
|
||||
|
||||
### The Critical Constraint
|
||||
|
||||
Patterns should be **descriptive, not prescriptive** — "here's an approach that has worked well, with these tradeoffs" not "always do it this way." Grok adds an overfitting guard: require **3+ project examples** before promoting to reusable.
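One way to represent a library entry and the promotion guard, as a sketch. Field names are assumptions that mirror the list of pattern contents above; the default of three projects follows the guard just described:

```typescript
// A pattern library entry: descriptive, with tradeoffs stated explicitly.
interface Pattern {
  problem: string;
  approach: string;
  tradeoffs: string[];
  pitfalls: string[];
  verificationTests: string[];
  referenceImplementation: string;
}

// A candidate extracted from high-quality completed tasks.
interface Candidate {
  pattern: Pattern;
  seenInProjects: string[]; // project identifiers, possibly with repeats
}

// Promote only when the pattern has appeared in enough distinct projects.
function canPromote(candidate: Candidate, minProjects = 3): boolean {
  return new Set(candidate.seenInProjects).size >= minProjects;
}
```

Counting distinct projects rather than occurrences keeps one project that reuses its own solution many times from qualifying a pattern on its own.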
|
||||
|
||||
---
|
||||
|
|
@ -0,0 +1,31 @@
|
|||
# Evolution Across Project Scale
|
||||
|
||||
### Phase Transitions (All 4 Models Converge)
|
||||
|
||||
#### 0–1k LOC: The Monolithic Phase
|
||||
- Everything fits in one context window
|
||||
- Agent reads entire codebase, makes globally coherent decisions
|
||||
- Orchestrator is simple, manifest barely needed
|
||||
- **This is where most demos live**
|
||||
|
||||
#### 1k–10k LOC: The Modular Phase
|
||||
- Codebase no longer fits in one context window
|
||||
- **What breaks first: consistency** — agent sees fragments that gradually diverge
|
||||
- Requirements: modular context assembly, manifest as essential map, interface contracts, convention enforcement (linting, formatting)
|
||||
|
||||
#### 10k–50k LOC: The Architectural Phase
|
||||
- Relationships between components become non-obvious
|
||||
- Changing one thing might affect ten others through indirect dependencies
|
||||
- **What breaks:** planning quality — planner can't understand full system
|
||||
- Requirements: dependency-aware context assembly, impact analysis before execution, more conservative/incremental plans
|
||||
|
||||
#### 50k–100k+ LOC: The Organizational Phase
|
||||
- System of systems — no single agent context can reason about the whole thing
|
||||
- **What breaks:** integration — interactions between components become so numerous that integration testing becomes the bottleneck
|
||||
- Requirements: hierarchical planning (system-level planner → component-level agents), continuous integration verification, possibly distributed orchestrator, hierarchy of manifests
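The phase boundaries can be expressed as a simple lookup that the orchestrator consults when choosing its planning strategy. This is a sketch: the ranges above share their endpoints, so assigning each boundary value to the larger phase is an assumption made here.

```typescript
type Phase = "monolithic" | "modular" | "architectural" | "organizational";

function phaseFor(linesOfCode: number): Phase {
  if (linesOfCode < 1000) return "monolithic";
  if (linesOfCode < 10000) return "modular";
  if (linesOfCode < 50000) return "architectural";
  return "organizational";
}

// What breaks first in each phase, per the descriptions above.
const FIRST_FAILURE: Record<Phase, string> = {
  monolithic: "nothing yet: everything fits in one context window",
  modular: "consistency",
  architectural: "planning quality",
  organizational: "integration",
};
```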
|
||||
|
||||
### The Meta-Insight
|
||||
|
||||
> The architecture of your agentic system should **mirror the architecture of the software it's building.** Microservices projects need a more distributed orchestrator. Monolithic projects can use a simpler one.
|
||||
|
||||
---
|
||||
35
docs/building-coding-agents/24-security-trust-boundaries.md
Normal file
|
|
@ -0,0 +1,35 @@
|
|||
# Security & Trust Boundaries
|
||||
|
||||
### Hard Boundaries — Things the Agent Should NEVER Do (Universal Agreement)
|
||||
|
||||
| Forbidden Action | Why |
|
||||
|-----------------|-----|
|
||||
| Access production systems directly | Agent's world is the dev environment, full stop |
|
||||
| Access or embed secrets | API keys, credentials should never appear in agent context or output |
|
||||
| Make network requests to arbitrary destinations | Restrictive firewall, whitelist only required services |
|
||||
| Modify its own orchestrator, prompts, or tools | Prevents removing safety constraints |
|
||||
| Execute commands outside the project directory | Sandbox to project dir + temp working dirs only |
|
||||
|
||||
### The Sandboxing Architecture
|
||||
|
||||
| Layer | Mechanism |
|
||||
|-------|-----------|
|
||||
| **Execution** | Containerized (Docker + seccomp), restricted filesystem, network policy |
|
||||
| **Filesystem** | Content-addressable storage — agent *proposes* changes, backend validates before writing |
|
||||
| **Secrets** | Vault proxy with short-lived tokens, never direct credentials |
|
||||
| **Commands** | Parsed and blocked for dangerous patterns (`rm -rf /`, `curl` to unknown hosts) |
|
||||
| **Dependencies** | Approved dependency list — new deps require auto-approval (pre-approved list) or human approval |
|
||||
|
||||
### The Capability-Based Security Model
|
||||
|
||||
The orchestrator runs **outside** the sandbox. The agent requests operations through a controlled API. The orchestrator validates every request before executing. The agent doesn't have direct access to anything — it has access to **tools that the orchestrator mediates.**
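A minimal sketch of the kind of check the orchestrator might run before executing a requested shell command. The allowlisted host and the two patterns are hypothetical, and string matching alone is not a sufficient defense (for example, a `curl` call without a URL scheme is not caught here); it would sit in front of the container and network policy, not replace them.

```typescript
type Decision = { allowed: true } | { allowed: false; reason: string };

// Hypothetical allowlist of hosts the project is permitted to reach.
const ALLOWED_HOSTS = new Set<string>(["registry.npmjs.org"]);

function reviewCommand(command: string): Decision {
  // Block recursive deletion of the filesystem root.
  if (/\brm\s+-rf\s+\/(\s|$)/.test(command)) {
    return { allowed: false, reason: "destructive delete of /" };
  }

  // For curl, every URL in the command must point at an allowlisted host.
  if (/\bcurl\b/.test(command)) {
    const urlPattern = /https?:\/\/([^\/\s:]+)/g;
    let match: RegExpExecArray | null;
    while ((match = urlPattern.exec(command)) !== null) {
      const host = match[1] ?? "";
      if (!ALLOWED_HOSTS.has(host)) {
        return { allowed: false, reason: `host not allowlisted: ${host}` };
      }
    }
  }

  return { allowed: true };
}
```

The agent only ever sees the `Decision`; the command itself runs, if at all, on the orchestrator's side of the boundary.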
|
||||
|
||||
### The Subtle Risk
|
||||
|
||||
The agent introduces vulnerabilities not through malice but through **plausible-looking insecure patterns**: string concatenation for SQL queries, disabling CORS for convenience, logging sensitive data for debugging. Security linting rules should be tuned to catch these **AI-common patterns** specifically.
|
||||
|
||||
### The Trust Model
|
||||
|
||||
> Think of the agent as a **highly capable but unvetted contractor.** Give them the codebase and dev environment. Don't give them production credentials, deployment access, or the ability to modify security infrastructure. The goal isn't to make the agent safe by limiting capabilities — it's to make the **environment** safe so the agent can be maximally capable within it.
|
||||
|
||||
---
|
||||
|
|
@ -0,0 +1,116 @@
|
|||
# Designing for Non-Technical Users ("Vibe Coders")
|
||||
|
||||
**The question that matters most** — because everything else is worthless if only engineers can use it.
|
||||
|
||||
### The Fundamental Principle (All 4 Models Converge)
|
||||
|
||||
> The human should **never have to think in code.** Not in input, not in output, not in error messages, not in verification, not in debugging. The entire technical layer should be absorbed by the system. The human operates purely in **intent, vision, preference, and judgment.**
|
||||
|
||||
### The Core Philosophy
|
||||
|
||||
| Human Provides | System Provides |
|
||||
|---------------|-----------------|
|
||||
| Vision & imagination | Engineering intelligence |
|
||||
| Taste & aesthetic judgment | Technical translation |
|
||||
| Direction & priorities | Architecture & implementation |
|
||||
| "This feels off — calmer, like Linear" | Concrete CSS/animation/spacing changes |
|
||||
|
||||
### The 10 Pillars of a Magical Non-Technical Experience
|
||||
|
||||
#### 1. Intent-Based Input, Not Specification
|
||||
Users speak naturally: *"I want an app where people can upload recipes and find them by ingredient."* The system runs a **discovery conversation** that feels like talking to a brilliant product partner — not filling out a requirements form. Behind the scenes, answers compile into structured specs, acceptance criteria, and interface contracts the human never sees.
|
||||
|
||||
> **Critical:** Questions should be about the *experience*, not the *implementation.* Never "relational or document store?" Always "should search find exact matches only, or also substitutable ingredients?"
|
||||
|
||||
#### 2. Show the Thing, Not the Process
|
||||
After each milestone: a **working preview**, not a task list. The human interacts with the real thing at every checkpoint — clicks around, feels it, reacts. Progress is communicated as capability, not code: *"Your app can now save workouts and retrieve them later"* — not *"implemented REST endpoint."*
|
||||
|
||||
#### 3. Collaborative Builder, Not Command Executor
|
||||
The agent should feel like a senior co-founder:
|
||||
```
|
||||
User: I want something like Notion but for recipes.
|
||||
|
||||
Agent: Here's how I'd approach that:
|
||||
- Recipe database with tagging
|
||||
- Search by ingredient
|
||||
- Meal planner
|
||||
|
||||
Would you like to prioritize simplicity or advanced features?
|
||||
```
|
||||
This implicitly educates the user while avoiding wrong builds from vague specs.
|
||||
|
||||
#### 4. Problems, Not Errors
|
||||
The human should **never see a stack trace**. Technical failures are either resolved silently or translated to domain-level questions:
|
||||
|
||||
| ❌ Never Show | ✅ Show Instead |
|
||||
|--------------|----------------|
|
||||
| `TypeError: Cannot read property 'map' of undefined` | "The recipe list isn't displaying correctly. I'm fixing it now — should be ready in a few minutes." |
|
||||
| `ECONNREFUSED localhost:5432` | "I'm having trouble connecting to the database. Working on it." |
|
||||
| Ambiguous technical decision | "When someone searches 'chicken,' should results include recipes where chicken is optional?" |
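A rough sketch of the translation step, assuming a simple pattern table. The patterns, the wording of the generic messages, and the fallback text are placeholders; a real system would tailor messages to the feature involved, as the table's examples do.

```typescript
interface Translation {
  pattern: RegExp; // matched against the raw technical error
  message: string; // what the user sees instead
}

const TRANSLATIONS: Translation[] = [
  {
    pattern: /^TypeError:/,
    message: "Something isn't displaying correctly. I'm fixing it now.",
  },
  {
    pattern: /ECONNREFUSED .*:5432/,
    message: "I'm having trouble connecting to the database. Working on it.",
  },
];

// Never returns the raw error text, even when no pattern matches.
function toUserMessage(rawError: string): string {
  const hit = TRANSLATIONS.find((t) => t.pattern.test(rawError));
  return hit ? hit.message : "I ran into a problem and I'm working on it.";
}
```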
|
||||
|
||||
#### 5. Reactions, Not Reviews
|
||||
Design for **reactions** to the running app, not code reviews. Like working with an interior designer: *"I love the color but the couch feels too big."* Visual, spatial, experiential feedback. **A/B comparison** is the most powerful pattern: show two versions, human picks which "feels better" in seconds.
|
||||
|
||||
#### 6. Engineering Tradeoffs as Simple Choices
|
||||
Instead of *"Which auth provider?"* → ask *"Which matters more: A) Simplicity B) Maximum customization C) Enterprise security"* — the system maps answers to technical decisions automatically.
|
||||
|
||||
#### 7. Safety Blanket
|
||||
- Auto-backups at every slice + "undo entire feature" button
|
||||
- **"Vibe Checkpoints"** — before every major change, a save point. "Go back to how it was ten minutes ago."
|
||||
- Deployment previews before anything goes live
|
||||
- No irreversible actions without plain-English confirmation
|
||||
|
||||
#### 8. Progressive Disclosure
|
||||
Start ultra-simple. Offer "Advanced mode" toggle only if the user ever asks. The system should **progressively reveal engineering** — at first pure vision → later architecture tweaking → eventually deep collaboration. Many users will never leave the simple mode, and that's fine.
|
||||
|
||||
#### 9. Implicit Teaching
|
||||
When the user asks *"why is that taking longer?"*:
|
||||
> "The recipe search needs to look through all recipes every time. I'm adding an index — think of it like a table of contents — so it can find things faster."
|
||||
|
||||
Optional, triggered by curiosity, expressed in analogy. Over time, users develop useful mental models of software **without it ever being mandatory.**
|
||||
|
||||
#### 10. Invisible Deployment & Operations
|
||||
"I want to share this with people" → receive a URL. Behind the scenes: hosting, domain, database, SSL, CI/CD. Ongoing maintenance equally invisible. Simple dashboard: *"Your recipe app had 340 visitors this week. Everything is running smoothly."*
|
||||
|
||||
### The Translation Layer (The Magic Glue)
|
||||
|
||||
A deterministic "Human Translator" node at the front of every orchestrator cycle:
|
||||
|
||||
```
|
||||
Raw user message + references
|
||||
↓
|
||||
[Human Translator]
|
||||
↓
|
||||
Precise assumptions, invariants, success criteria
|
||||
↓
|
||||
[Rest of the god-tier orchestrator pipeline]
|
||||
```
|
||||
|
||||
The rest of the graph never sees "vibe language" — only clean spec. This preserves all technical quality while shielding the user.
|
||||
|
||||
### The Scope Protection Layer
|
||||
|
||||
Non-technical users often don't realize how complex their requests are. The system must be honest — gently:
|
||||
|
||||
> *"That's a great idea. Adding social features is significant — it involves user profiles, a follow system, a feed algorithm, and notifications. It'll take as long as everything we've built so far. Want me to go ahead, or finish core recipe features first?"*
|
||||
|
||||
This respects agency while providing the information needed for good decisions.
|
||||
|
||||
### The Meta-Principle
|
||||
|
||||
> The system is a **creative tool**, not a development tool. It should feel like Photoshop or Ableton — a powerful instrument that lets a person with vision manifest that vision without understanding the underlying mechanics. A music producer doesn't need to understand digital signal processing. A filmmaker doesn't need to understand codec compression. **A person with a great app idea shouldn't need to understand React component lifecycle.**
|
||||
|
||||
### What Makes It Feel Magical
|
||||
|
||||
The most powerful systems feel magical when they:
|
||||
- Understand vague ideas
|
||||
- Ask smart clarifying questions
|
||||
- Translate intent into architecture
|
||||
- Show visible progress quickly
|
||||
- Make experimentation safe
|
||||
- Explain decisions clearly
|
||||
- Hide complexity without blocking power
|
||||
|
||||
> When these align, the user experiences: **"I can build anything I imagine."** That feeling is the real product.
|
||||
|
||||
---
|
||||
|
|
@ -0,0 +1,33 @@
|
|||
# Cross-Cutting Themes (Where All 4 Models Converge)
|
||||
|
||||
### Original Themes (Reinforced)
|
||||
|
||||
These ideas appeared independently in all four conversations across both rounds, indicating the highest-confidence principles:
|
||||
|
||||
1. **The LLM should only do what requires judgment.** Everything deterministic belongs in code.
|
||||
2. **Vertical slices are non-negotiable.** End-to-end working increments at every stage.
|
||||
3. **Context leanness = quality.** Less (but more relevant) context produces better outputs than more context.
|
||||
4. **Execution-based verification beats self-assessment.** Run the code. Trust test results over the model's opinion.
|
||||
5. **The orchestrator is the product.** The model is a commodity; the system around it is the differentiator.
|
||||
6. **State must be structured and deterministic.** Never let the LLM manage its own lifecycle or memory.
|
||||
7. **Speed comes from removing unnecessary work.** Not from doing the same work faster.
|
||||
8. **Failure recovery matters more than happy-path perfection.** Design the error paths first.
|
||||
9. **Human involvement should be high-leverage.** Specific questions with context, not open-ended reviews.
|
||||
10. **The system improves over time.** Track patterns, cache solutions, learn from failures.
|
||||
|
||||
### New Themes (From Grey Area Deep-Dives)
|
||||
|
||||
11. **Document assumptions, don't ask about every one.** Proceed with sensible defaults + transparent logging. Review at milestones, not in real-time.
|
||||
12. **The codebase is the lossless source of truth.** Summaries are lossy caches that must be periodically reconciled against actual code. Never summarize summaries.
|
||||
13. **Semantic conflicts are harder than syntactic ones.** Interface contracts must be behavioral specs, not just type signatures. Integration testing is a first-class concern, not an afterthought.
|
||||
14. **Observe before modifying.** Especially in legacy codebases — the agent must understand existing patterns before changing them. Preserve local consistency over global ideals.
|
||||
15. **Taste can be ~80-85% automated.** Convert subjective preferences to concrete, verifiable specs. Reserve human judgment for the remaining gestalt. The gap is closing fast with vision-capable models.
|
||||
16. **Irreversible operations are categorically different.** The agent prepares; the human executes. No exceptions.
|
||||
17. **"Boring" code is good code.** For handoff, enforce standard patterns, limit complexity, and write *why* comments. Automated readability testing catches problems before humans encounter them.
|
||||
18. **Make rewrites cheap, not rare.** Clean interfaces + good tests + branch-based experimentation = rewriting is a safe, routine operation rather than a crisis.
|
||||
19. **Route errors by type, not by severity.** Different error classes need different context, different handlers, and different escalation thresholds. Flaky tests should be quarantined, not fixed.
|
||||
20. **The magic is the translation layer.** For non-technical users, the entire value proposition is the invisible bridge between human intent and technical execution. Every moment the user has to think like a developer is a failure.
|
||||
|
||||
---
|
||||
|
||||
*Generated March 2026. Updated with grey-area deep-dive synthesis. Source material: two rounds of parallel deep-dive conversations with Claude (Anthropic), Gemini (Google), GPT (OpenAI), and Grok (xAI) on optimal autonomous AI coding agent architecture — including the 13 hardest unsolved problems and designing for non-technical users.*
|
||||
37
docs/building-coding-agents/README.md
Normal file
|
|
@ -0,0 +1,37 @@
|
|||
# Building Coding Agents — Research
|
||||
|
||||
> Split into individual files for easier consumption.
|
||||
|
||||
## Table of Contents
|
||||
|
||||
- [01. Work Decomposition](./01-work-decomposition.md)
|
||||
- [02. What to Keep & Discard from Human Engineering](./02-what-to-keep-discard-from-human-engineering.md)
|
||||
- [03. State Machine & Context Management](./03-state-machine-context-management.md)
|
||||
- [04. Optimal Storage for Project Context](./04-optimal-storage-for-project-context.md)
|
||||
- [05. Parallelization Strategy](./05-parallelization-strategy.md)
|
||||
- [06. Maximizing Agent Autonomy & Superpowers](./06-maximizing-agent-autonomy-superpowers.md)
|
||||
- [07. System Prompt & LLM vs Deterministic Split](./07-system-prompt-llm-vs-deterministic-split.md)
|
||||
- [08. Speed Optimization](./08-speed-optimization.md)
|
||||
- [09. Top 10 Tips for a World-Class Agent](./09-top-10-tips-for-a-world-class-agent.md)
|
||||
- [10. Top 10 Pitfalls to Avoid](./10-top-10-pitfalls-to-avoid.md)
|
||||
- [11. God-Tier Context Engineering](./11-god-tier-context-engineering.md)
|
||||
- [12. Handling Ambiguity & Contradiction](./12-handling-ambiguity-contradiction.md)
|
||||
- [13. Long-Running Memory Fidelity](./13-long-running-memory-fidelity.md)
|
||||
- [14. Multi-Agent Semantic Conflict Resolution](./14-multi-agent-semantic-conflict-resolution.md)
|
||||
- [15. Legacy Code & Brownfield Onboarding](./15-legacy-code-brownfield-onboarding.md)
|
||||
- [16. Encoding Taste & Aesthetics](./16-encoding-taste-aesthetics.md)
|
||||
- [17. Irreversible Operations & Safety Architecture](./17-irreversible-operations-safety-architecture.md)
|
||||
- [18. The Handoff Problem: Agent → Human Maintainability](./18-the-handoff-problem-agent-human-maintainability.md)
|
||||
- [19. When to Scrap and Start Over](./19-when-to-scrap-and-start-over.md)
|
||||
- [20. Error Taxonomy & Routing](./20-error-taxonomy-routing.md)
|
||||
- [21. Cost-Quality Tradeoff & Model Routing](./21-cost-quality-tradeoff-model-routing.md)
|
||||
- [22. Cross-Project Learning & Reusable Intelligence](./22-cross-project-learning-reusable-intelligence.md)
|
||||
- [23. Evolution Across Project Scale](./23-evolution-across-project-scale.md)
|
||||
- [24. Security & Trust Boundaries](./24-security-trust-boundaries.md)
|
||||
- [25. Designing for Non-Technical Users ("Vibe Coders")](./25-designing-for-non-technical-users-vibe-coders.md)
|
||||
- [26. Cross-Cutting Themes (Where All 4 Models Converge)](./26-cross-cutting-themes-where-all-4-models-converge.md)
|
||||
|
||||
---
|
||||
|
||||
*Split into per-section files for surgical context loading.*
|
||||
|
||||
249
docs/context-and-hooks/01-the-context-pipeline.md
Normal file
|
|
@ -0,0 +1,249 @@
|
|||
# The Context Pipeline
|
||||
|
||||
The full journey of a user prompt from keypress to LLM input, through every transformation stage. Understanding this pipeline is the foundation of all context engineering in pi.
|
||||
|
||||
---
|
||||
|
||||
## The Pipeline at a Glance
|
||||
|
||||
```
|
||||
User types prompt and hits Enter
|
||||
│
|
||||
├─► Extension command check (/command)
|
||||
│ If match → run handler, skip everything below
|
||||
│
|
||||
├─► input event
|
||||
│ Extensions can: transform text/images, intercept entirely, or pass through
|
||||
│
|
||||
├─► Skill expansion (/skill:name)
|
||||
│ Skill file content injected into prompt text
|
||||
│
|
||||
├─► Prompt template expansion (/template)
|
||||
│ Template file content merged into prompt text
|
||||
│
|
||||
├─► before_agent_start event [ONCE per user prompt]
|
||||
│ Extensions can:
|
||||
│ • Inject custom messages (appended after user message)
|
||||
│ • Modify the system prompt (chained across extensions)
|
||||
│
|
||||
├─► Agent.prompt(messages)
|
||||
│ Messages array: [user message, ...nextTurn messages, ...extension messages]
|
||||
│
|
||||
│ ┌── Turn loop (repeats while LLM calls tools) ──────────────┐
|
||||
│ │ │
|
||||
│ │ transformContext (= context event) [EVERY turn] │
|
||||
│ │ Extensions receive AgentMessage[] deep copy │
|
||||
│ │ Can filter, reorder, inject, or replace messages │
|
||||
│ │ Multiple handlers chain: each sees previous output │
|
||||
│ │ │
|
||||
│ │ convertToLlm [EVERY turn, AFTER context event] │
|
||||
│ │ AgentMessage[] → Message[] │
|
||||
│ │ Custom types mapped to user role │
|
||||
│ │ bashExecution (!! prefix) filtered out │
|
||||
│ │ Not extensible — hardcoded in messages.ts │
|
||||
│ │ │
|
||||
│ │ LLM call │
|
||||
│ │ System prompt + converted messages + tool definitions │
|
||||
│ │ │
|
||||
│ │ Tool execution (if LLM calls tools) │
|
||||
│ │ tool_call event → can block │
|
||||
│ │ execute runs │
|
||||
│ │ tool_result event → can modify result │
|
||||
│ │ Steering check → may skip remaining tools │
|
||||
│ │ │
|
||||
│ │ Follow-up check (if no more tool calls) │
|
||||
│ │ Queued follow-up messages become next turn input │
|
||||
│ │ │
|
||||
│ └────────────────────────────────────────────────────────────┘
|
||||
│
|
||||
└─► agent_end event
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Stage-by-Stage Detail
|
||||
|
||||
### Stage 1: Extension Command Check
|
||||
|
||||
The first thing that happens. If the text starts with `/` and matches a registered extension command, the command handler runs and **the prompt never reaches the agent**. No events fire. No LLM call happens.
|
||||
|
||||
This means extension commands are fully synchronous escape hatches — they execute even during streaming (they're checked before any queuing logic).
|
||||
|
||||
### Stage 2: Input Event
|
||||
|
||||
```typescript
|
||||
pi.on("input", async (event, ctx) => {
|
||||
// event.text — the raw user input
|
||||
// event.images — attached images, if any
|
||||
// event.source — "interactive" | "rpc" | "extension"
|
||||
|
||||
// Three possible return values:
|
||||
return { action: "continue" }; // pass through unchanged
|
||||
return { action: "transform", text: "new text" }; // rewrite the input
|
||||
return { action: "handled" }; // swallow entirely
|
||||
});
|
||||
```
|
||||
|
||||
**Chaining:** Multiple `input` handlers chain. If handler A returns `transform`, handler B sees the transformed text. If any handler returns `handled`, the pipeline stops — no LLM call.
|
||||
|
||||
**Timing:** Fires before skill/template expansion. Your handler sees the raw `/skill:name args` text, not the expanded content.
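To make the chaining rule concrete, here is a self-contained model of it. This is not pi's runner: `runInputChain` and the synchronous handler type are simplifications invented for illustration, reusing only the three `action` values shown above.

```typescript
type InputResult =
  | { action: "continue" }
  | { action: "transform"; text: string }
  | { action: "handled" };

// Simplified: real handlers are async and also receive images and a context.
type InputHandler = (text: string) => InputResult;

function runInputChain(
  handlers: InputHandler[],
  rawText: string,
): { handled: boolean; text: string } {
  let text = rawText;
  for (const handler of handlers) {
    const result = handler(text);
    if (result.action === "handled") {
      return { handled: true, text }; // pipeline stops, no LLM call
    }
    if (result.action === "transform") {
      text = result.text; // later handlers see the rewritten text
    }
  }
  return { handled: false, text };
}

// Example handlers: A rewrites the input, B observes A's output.
const seenByB: string[] = [];
const handlerA: InputHandler = (text) => ({ action: "transform", text: text.trim() });
const handlerB: InputHandler = (text) => {
  seenByB.push(text);
  return { action: "continue" };
};
const chained = runInputChain([handlerA, handlerB], "  hello  ");
```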
|
||||
|
||||
### Stage 3: Skill and Template Expansion
|
||||
|
||||
Deterministic text substitution. `/skill:name args` becomes the skill file content wrapped in `<skill>` tags. `/template args` becomes the template file content. These are string replacements — no events fire.
|
||||
|
||||
### Stage 4: before_agent_start
|
||||
|
||||
```typescript
|
||||
pi.on("before_agent_start", async (event, ctx) => {
|
||||
// event.prompt — the expanded user prompt text
|
||||
// event.images — attached images
|
||||
// event.systemPrompt — current system prompt (may be modified by earlier extensions)
|
||||
|
||||
return {
|
||||
message: {
|
||||
customType: "my-context",
|
||||
content: "Context the LLM should see",
|
||||
display: false, // UI rendering only — LLM ALWAYS sees it
|
||||
},
|
||||
systemPrompt: event.systemPrompt + "\nExtra instructions",
|
||||
};
|
||||
});
|
||||
```
|
||||
|
||||
**Critical facts:**
|
||||
- Fires **once** per user prompt, not per turn
|
||||
- System prompts **chain**: Extension A modifies it, Extension B sees the modified version in `event.systemPrompt`
|
||||
- Messages **accumulate**: All extensions' messages are collected and injected as separate entries
|
||||
- If no extension returns a `systemPrompt`, the base system prompt is restored (modifications made for a previous prompt don't persist)
|
||||
|
||||
**Message injection order in the final array:**
|
||||
```
|
||||
[user message] → [nextTurn messages] → [extension messages from before_agent_start]
|
||||
```
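The chaining and accumulation rules for `before_agent_start` can be modeled in a few lines. This is an illustrative sketch, not pi's implementation: `runBeforeAgentStart` and the simplified synchronous handler type are assumptions made for the example.

```typescript
interface CustomMessage {
  customType: string;
  content: string;
  display: boolean;
}

interface StartResult {
  message?: CustomMessage;
  systemPrompt?: string;
}

type StartHandler = (event: {
  prompt: string;
  systemPrompt: string;
}) => StartResult | undefined;

function runBeforeAgentStart(
  handlers: StartHandler[],
  prompt: string,
  baseSystemPrompt: string,
): { systemPrompt: string; messages: CustomMessage[] } {
  // Every prompt starts from the base, so earlier modifications don't persist.
  let systemPrompt = baseSystemPrompt;
  const messages: CustomMessage[] = [];

  for (const handler of handlers) {
    const result = handler({ prompt, systemPrompt });
    if (result?.message) messages.push(result.message); // messages accumulate
    if (result?.systemPrompt !== undefined) {
      systemPrompt = result.systemPrompt; // next handler sees this version
    }
  }
  return { systemPrompt, messages };
}
```

Running it with no handlers returns the base system prompt unchanged, which matches the restore behavior described above.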
|
||||
|
||||
### Stage 5: The Turn Loop
|
||||
|
||||
This is where the LLM is actually called. The turn loop repeats for each LLM response that includes tool calls.
|
||||
|
||||
#### 5a: transformContext / context event
|
||||
|
||||
The `context` event is wired as the `transformContext` callback on the Agent. It fires on **every turn** within the agent loop.
|
||||
|
||||
```typescript
|
||||
// Inside the agent loop (agent-loop.ts):
|
||||
let messages = context.messages;
|
||||
if (config.transformContext) {
|
||||
messages = await config.transformContext(messages, signal);
|
||||
}
|
||||
const llmMessages = await config.convertToLlm(messages);
|
||||
```
|
||||
|
||||
The `context` event handler in the runner creates a `structuredClone` deep copy:
|
||||
|
||||
```typescript
|
||||
// runner.ts emitContext():
|
||||
let currentMessages = structuredClone(messages);
|
||||
// ...each handler receives and can modify currentMessages
|
||||
```
|
||||
|
||||
**This means:**
|
||||
- You get a deep copy — safe to mutate, splice, filter, or replace
|
||||
- You work at the `AgentMessage[]` level (includes custom types)
|
||||
- Multiple handlers chain: each sees the output of the previous
|
||||
- **You cannot modify the system prompt here** — only `before_agent_start` can do that
|
||||
- The messages include everything: user messages, assistant responses, tool results, custom messages, bash executions, compaction summaries, branch summaries
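
To make the deep-copy guarantee concrete, here is a standalone sketch using the standard `structuredClone` function. The `Msg` shape is a simplified assumption, not pi's `AgentMessage` type.

```typescript
// Standalone illustration of deep-copy semantics; the message shape is simplified.
interface Msg { role: string; content: { type: "text"; text: string }[] }

const sessionMessages: Msg[] = [
  { role: "user", content: [{ type: "text", text: "original" }] },
  { role: "toolResult", content: [{ type: "text", text: "big output" }] },
];

// What a handler would receive: a deep copy, so nested objects are not shared.
const copy = structuredClone(sessionMessages);
copy[0].content[0].text = "rewritten"; // mutate a nested field
copy.splice(1, 1); // drop the tool result from this LLM call only

// The source array and its nested content are untouched.
const originalText = sessionMessages[0].content[0].text;
const originalLength = sessionMessages.length;
```

Because nothing is shared with the source array, a handler can edit in place and return the result without affecting stored history.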
|
||||
|
||||
#### 5b: convertToLlm
|
||||
|
||||
After `context` event processing, `convertToLlm` maps `AgentMessage[]` to `Message[]`:
|
||||
|
||||
| AgentMessage role | Converted to | Notes |
|
||||
|---|---|---|
|
||||
| `user` | `user` | Pass through |
|
||||
| `assistant` | `assistant` | Pass through |
|
||||
| `toolResult` | `toolResult` | Pass through |
|
||||
| `custom` | `user` | Content preserved, `display` field ignored |
|
||||
| `bashExecution` | `user` | Unless `excludeFromContext` (`!!` prefix) → filtered out |
|
||||
| `compactionSummary` | `user` | Wrapped in `<summary>` tags |
|
||||
| `branchSummary` | `user` | Wrapped in `<summary>` tags |
|
||||
|
||||
**`convertToLlm` is not extensible.** It's a hardcoded function in `messages.ts`. If you need to change how messages appear to the LLM, do it in the `context` event handler before this stage.
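
The mapping table can be read as a single function. The sketch below is a hypothetical re-creation for illustration only: the `text` field, the type names, and the exact `<summary>` wrapping are assumptions, and the real function in `messages.ts` may differ.

```typescript
// Hypothetical re-creation of the mapping table; field names are assumptions.
type AgentMsg =
  | { role: "user" | "assistant" | "toolResult"; text: string }
  | { role: "custom"; text: string; display: boolean }
  | { role: "bashExecution"; text: string; excludeFromContext: boolean }
  | { role: "compactionSummary" | "branchSummary"; text: string };

interface LlmMsg { role: "user" | "assistant" | "toolResult"; text: string }

function convertSketch(messages: AgentMsg[]): LlmMsg[] {
  const out: LlmMsg[] = [];
  for (const m of messages) {
    switch (m.role) {
      case "user":
      case "assistant":
      case "toolResult":
        out.push({ role: m.role, text: m.text }); // pass through
        break;
      case "custom":
        out.push({ role: "user", text: m.text }); // `display` is ignored
        break;
      case "bashExecution":
        // `!!` executions are filtered out entirely
        if (!m.excludeFromContext) out.push({ role: "user", text: m.text });
        break;
      case "compactionSummary":
      case "branchSummary":
        out.push({ role: "user", text: `<summary>${m.text}</summary>` });
        break;
    }
  }
  return out;
}

const converted = convertSketch([
  { role: "user", text: "hi" },
  { role: "custom", text: "hidden context", display: false },
  { role: "bashExecution", text: "ls output", excludeFromContext: true },
  { role: "compactionSummary", text: "earlier work" },
]);
```

Note that the `display: false` custom message still reaches the LLM as a `user` message, while the excluded bash execution does not.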
|
||||
|
||||
#### 5c: LLM Call
|
||||
|
||||
The converted messages plus system prompt plus tool definitions go to the LLM provider. The system prompt used is whatever was set by `before_agent_start` (or the base prompt if no extension modified it).
|
||||
|
||||
#### 5d: Tool Execution and Interception
|
||||
|
||||
When the LLM responds with tool calls, they execute sequentially:
|
||||
|
||||
```
|
||||
For each tool call:
|
||||
tool_call event → can { block: true, reason: "..." }
|
||||
If blocked → Error("reason") becomes the tool result
|
||||
tool_execution_start event (informational)
|
||||
tool.execute() runs
|
||||
tool_execution_end event (informational)
|
||||
tool_result event → can modify { content, details, isError }
|
||||
|
||||
Steering check → if steering messages queued:
|
||||
Remaining tools get "Skipped due to queued user message"
|
||||
Steering messages become input for next turn
|
||||
```
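
The blocking step in the flow above can be sketched as follows. `runToolCall` and its types are hypothetical and cover only the `tool_call` check and execution; the informational events, `tool_result`, and steering are left out.

```typescript
// Hypothetical model of tool_call blocking; not pi's agent loop.
interface ToolCall { name: string; input: Record<string, unknown> }
interface ToolResult { content: string; isError: boolean }
type ToolCallHandler = (call: ToolCall) => { block: true; reason: string } | undefined;

function runToolCall(
  call: ToolCall,
  execute: (call: ToolCall) => string,
  onCall: ToolCallHandler[],
): ToolResult {
  for (const handler of onCall) {
    const verdict = handler(call);
    // First block wins: the reason becomes an error result and the tool never runs.
    if (verdict?.block) return { content: verdict.reason, isError: true };
  }
  try {
    return { content: execute(call), isError: false };
  } catch (err) {
    return { content: err instanceof Error ? err.message : String(err), isError: true };
  }
}

let executed = 0;
const execute = (call: ToolCall): string => {
  executed++;
  return `ran ${call.name}`;
};
const readOnlyGuard: ToolCallHandler = (call) =>
  call.name === "write" ? { block: true, reason: "Not allowed in read-only mode" } : undefined;

const blocked = runToolCall({ name: "write", input: {} }, execute, [readOnlyGuard]);
const allowed = runToolCall({ name: "read", input: {} }, execute, [readOnlyGuard]);
```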
|
||||
|
||||
### Stage 6: Follow-up and Continuation
|
||||
|
||||
When the LLM finishes and has no more tool calls:
|
||||
1. Check for steering messages → if any, start new turn with them
|
||||
2. Check for follow-up messages → if any, start new turn with them
|
||||
3. If neither → `agent_end` fires, agent goes idle
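
The three checks above amount to a small priority rule. The sketch below is a hypothetical decision function (the `nextStep` name and the string-array queues are assumptions) showing that steering messages take precedence over follow-ups.

```typescript
// Hypothetical decision function mirroring the three checks above.
type NextStep =
  | { kind: "turn"; source: "steering" | "followUp"; messages: string[] }
  | { kind: "agent_end" };

function nextStep(steering: string[], followUp: string[]): NextStep {
  if (steering.length > 0) return { kind: "turn", source: "steering", messages: steering };
  if (followUp.length > 0) return { kind: "turn", source: "followUp", messages: followUp };
  return { kind: "agent_end" }; // nothing queued: the agent goes idle
}

const afterSteer = nextStep(["stop and re-read config"], ["verify edits"]);
const afterFollow = nextStep([], ["verify edits"]);
const idle = nextStep([], []);
```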
|
||||
|
||||
---
|
||||
|
||||
## What the LLM Actually Sees
|
||||
|
||||
For any given turn, the LLM receives:
|
||||
|
||||
```
|
||||
System prompt (base + before_agent_start modifications)
|
||||
+
|
||||
Messages (after context event filtering, after convertToLlm mapping)
|
||||
+
|
||||
Tool definitions (active tools with names, descriptions, parameter schemas)
|
||||
```
|
||||
|
||||
The system prompt includes:
|
||||
- Base prompt (tool descriptions, guidelines, pi docs reference, date/time, cwd)
|
||||
- `promptSnippet` overrides from active tools (replaces tool description in "Available tools")
|
||||
- `promptGuidelines` from active tools (appended to "Guidelines" section)
|
||||
- `appendSystemPrompt` from settings/config
|
||||
- Project context files (AGENTS.md, CLAUDE.md from cwd ancestors)
|
||||
- Skills listing (names + descriptions, agent uses `read` to load them)
|
||||
- Any `before_agent_start` modifications
|
||||
|
||||
---
|
||||
|
||||
## Key Timing Distinctions
|
||||
|
||||
| Hook | When | How often | Can modify |
|
||||
|------|------|-----------|-----------|
|
||||
| `input` | Before expansion | Once per user input | Input text |
|
||||
| `before_agent_start` | After expansion, before agent loop | Once per user prompt | System prompt + inject messages |
|
||||
| `context` | Before each LLM call | Every turn in agent loop | Message array |
|
||||
| `tool_call` | Before each tool execution | Per tool call | Block execution |
|
||||
| `tool_result` | After each tool execution | Per tool call | Result content/details |
|
||||
|
||||
---
|
||||
|
||||
## The Deep Copy Question
|
||||
|
||||
When do you get a safe-to-mutate copy vs a reference?
|
||||
|
||||
| Hook | What you receive | Safe to mutate? |
|
||||
|------|-----------------|-----------------|
|
||||
| `context` | `structuredClone` deep copy | Yes |
|
||||
| `before_agent_start` | `event.systemPrompt` is a string (immutable) | Return new string |
|
||||
| `tool_call` | `event.input` is the raw args object | Do not mutate — return `block` |
|
||||
| `tool_result` | `{ ...event }` shallow spread | Return new values, don't mutate |
|
||||
| `input` | `event.text` is a string (immutable) | Return new text via `transform` |
|
||||
465
docs/context-and-hooks/02-hook-reference.md
Normal file
|
|
@ -0,0 +1,465 @@
|
|||
# Hook Reference
|
||||
|
||||
Complete behavioral specification of every hook in pi's extension system. Covers timing, chaining semantics, return shapes, and edge cases not in the extending-pi docs.
|
||||
|
||||
---
|
||||
|
||||
## Hook Categories
|
||||
|
||||
1. **Input hooks** — intercept user input before the agent
|
||||
2. **Agent lifecycle hooks** — control the agent loop boundary
|
||||
3. **Per-turn hooks** — fire on every LLM call within an agent run
|
||||
4. **Tool hooks** — intercept individual tool executions
|
||||
5. **Session hooks** — respond to session lifecycle changes
|
||||
6. **Model hooks** — respond to model changes
|
||||
7. **Resource hooks** — provide dynamic resources at startup
8. **User bash hooks**: intercept commands the user runs with the `!` or `!!` prefix
|
||||
|
||||
---
|
||||
|
||||
## 1. Input Hooks
|
||||
|
||||
### `input`
|
||||
|
||||
**When:** User submits text (Enter in editor, RPC message, or `pi.sendUserMessage` from an extension with `source: "extension"`).
|
||||
|
||||
**Before:** Skill expansion, template expansion, and the built-in command check. Extension commands are the exception: they are checked before `input` fires, so a matching extension command bypasses this hook.
|
||||
|
||||
**Chaining:** Sequential through all extensions. Each handler sees the text output of the previous handler's `transform`. First `handled` stops the chain and the pipeline.
|
||||
|
||||
```typescript
|
||||
pi.on("input", async (event, ctx) => {
|
||||
// event.text: string — current text (possibly transformed by earlier handler)
|
||||
// event.images: ImageContent[] | undefined
|
||||
// event.source: "interactive" | "rpc" | "extension"
|
||||
|
||||
// Option 1: Pass through
|
||||
return { action: "continue" };
|
||||
// or return nothing (undefined) — same as continue
|
||||
|
||||
// Option 2: Transform
|
||||
return { action: "transform", text: "rewritten", images: newImages };
|
||||
|
||||
// Option 3: Swallow (no LLM call, no further handlers)
|
||||
return { action: "handled" };
|
||||
});
|
||||
```
|
||||
|
||||
**Edge cases:**
|
||||
- Extension commands (`/mycommand`) are checked **before** `input` fires. If it matches, `input` never fires.
|
||||
- Built-in commands (`/new`, `/model`, etc.) are checked **after** `input` transforms. So `input` can transform text into a built-in command, or transform a built-in command into something else.
|
||||
- Images can be replaced via `transform`. Omitting `images` in the transform result preserves the original images.
|
||||
|
||||
---
|
||||
|
||||
## 2. Agent Lifecycle Hooks
|
||||
|
||||
### `before_agent_start`
|
||||
|
||||
**When:** After input processing, skill/template expansion, and the user message is constructed — but before `agent.prompt()` is called.
|
||||
|
||||
**Fires:** Once per user prompt. Does NOT fire on subsequent turns within the same agent run.
|
||||
|
||||
**Chaining:**
|
||||
- **System prompt:** Chains. Extension A modifies `event.systemPrompt`, Extension B sees that modified version. If no extension returns a `systemPrompt`, the base prompt is used, which resets any modifications made for a previous prompt.
|
||||
- **Messages:** Accumulate. All `message` results are collected into an array. Each becomes a separate `CustomMessage` with `role: "custom"` injected after the user message.
|
||||
|
||||
```typescript
|
||||
pi.on("before_agent_start", async (event, ctx) => {
|
||||
// event.prompt: string — expanded user prompt text
|
||||
// event.images: ImageContent[] | undefined
|
||||
// event.systemPrompt: string — current system prompt (may be chained from earlier extension)
|
||||
|
||||
return {
|
||||
// Optional: inject a custom message into the session
|
||||
message: {
|
||||
customType: "my-extension", // identifies the message type
|
||||
content: "Text the LLM sees", // string or (TextContent | ImageContent)[]
|
||||
display: true, // controls UI rendering, NOT LLM visibility
|
||||
details: { any: "data" }, // for custom rendering and state reconstruction
|
||||
},
|
||||
|
||||
// Optional: modify the system prompt for this agent run
|
||||
systemPrompt: event.systemPrompt + "\nNew instructions",
|
||||
};
|
||||
});
|
||||
```
|
||||
|
||||
**Critical detail:** The `display` field controls whether the message shows in the TUI chat log. The LLM **always** sees the message content regardless of `display`. All custom messages become `user` role messages in `convertToLlm`.
|
||||
|
||||
**Error handling:** If a handler throws, the error is captured and reported via `emitError`. Other handlers still run. The pipeline is not stopped.
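
A minimal sketch of this isolation behavior, assuming a simplified handler shape: `runIsolated` is hypothetical, and the `reported` array stands in for whatever `emitError` does with captured errors.

```typescript
// Hypothetical model of error isolation between handlers.
type Handler = () => string | undefined;

function runIsolated(handlers: Handler[], reported: string[]): string[] {
  const results: string[] = [];
  for (const handler of handlers) {
    try {
      const value = handler();
      if (value !== undefined) results.push(value);
    } catch (err) {
      // Capture and report; later handlers still run.
      reported.push(err instanceof Error ? err.message : String(err));
    }
  }
  return results;
}

const reported: string[] = [];
const results = runIsolated(
  [
    () => "A",
    () => {
      throw new Error("B failed");
    },
    () => "C",
  ],
  reported,
);
```

The failing handler contributes nothing, and handlers on either side of it are unaffected.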
|
||||
|
||||
### `agent_start`
|
||||
|
||||
**When:** The agent loop begins (after `before_agent_start`, after `agent.prompt()` is called).
|
||||
|
||||
**Fires:** Once per agent run. Informational only — no return value.
|
||||
|
||||
```typescript
|
||||
pi.on("agent_start", async (event, ctx) => {
|
||||
// event: { type: "agent_start" }
|
||||
// Useful for: starting timers, resetting per-run state
|
||||
});
|
||||
```
|
||||
|
||||
### `agent_end`
|
||||
|
||||
**When:** The agent loop finishes (all turns complete, no more tool calls, no queued messages).
|
||||
|
||||
**Fires:** Once per agent run.
|
||||
|
||||
```typescript
|
||||
pi.on("agent_end", async (event, ctx) => {
|
||||
// event.messages: AgentMessage[] — all messages produced during this run
|
||||
// Useful for: final summaries, state persistence, triggering follow-up actions
|
||||
});
|
||||
```
|
||||
|
||||
**Subtlety:** `event.messages` contains only the NEW messages from this agent run, not the full conversation history. Use `ctx.sessionManager.getBranch()` for the full history.
|
||||
|
||||
---
|
||||
|
||||
## 3. Per-Turn Hooks
|
||||
|
||||
### `turn_start`
|
||||
|
||||
**When:** Each turn within the agent loop begins (before the LLM call).
|
||||
|
||||
```typescript
|
||||
pi.on("turn_start", async (event, ctx) => {
|
||||
// event.turnIndex: number — 0-based index of this turn within the agent run
|
||||
// event.timestamp: number — when the turn started
|
||||
});
|
||||
```
|
||||
|
||||
### `context`
|
||||
|
||||
**When:** Before each LLM call, after the turn starts. This is the last chance to modify what the LLM sees.
|
||||
|
||||
**Fires:** Every turn. If the LLM requests tools in three consecutive responses before giving its final answer, `context` fires 4 times (once for the initial call and once per loop-back).
|
||||
|
||||
**Chaining:** Sequential. Each handler receives the output of the previous. First handler gets a `structuredClone` deep copy of the agent's message array.
|
||||
|
||||
```typescript
|
||||
pi.on("context", async (event, ctx) => {
|
||||
// event.messages: AgentMessage[] — deep copy, safe to mutate
|
||||
|
||||
// Filter out messages
|
||||
const filtered = event.messages.filter(m => !isIrrelevant(m));
|
||||
return { messages: filtered };
|
||||
|
||||
// Or inject messages
|
||||
return { messages: [...event.messages, syntheticMessage] };
|
||||
|
||||
// Or return nothing to pass through unchanged
|
||||
});
|
||||
```
|
||||
|
||||
**What `event.messages` contains:**
|
||||
- All roles: `user`, `assistant`, `toolResult`, `custom`, `bashExecution`, `compactionSummary`, `branchSummary`
|
||||
- The user message from the current prompt
|
||||
- Custom messages injected by `before_agent_start`
|
||||
- Tool results from earlier turns in this agent run
|
||||
- Steering/follow-up messages that became turn inputs
|
||||
- Historical messages from the session (including compaction summaries)
|
||||
|
||||
**What it does NOT contain:**
|
||||
- The system prompt (use `before_agent_start` for that)
|
||||
- Tool definitions (use `pi.setActiveTools()` for that)
|
||||
|
||||
### `turn_end`
|
||||
|
||||
**When:** After the LLM responds and all tool calls for this turn complete.
|
||||
|
||||
```typescript
|
||||
pi.on("turn_end", async (event, ctx) => {
|
||||
// event.turnIndex: number
|
||||
// event.message: AgentMessage — the assistant's response message
|
||||
// event.toolResults: ToolResultMessage[] — results from tools called this turn
|
||||
});
|
||||
```
|
||||
|
||||
### `message_start` / `message_update` / `message_end`
|
||||
|
||||
**When:** Message lifecycle events. `message_update` fires only for assistant messages during streaming (token-by-token).
|
||||
|
||||
```typescript
|
||||
pi.on("message_start", async (event, ctx) => {
|
||||
// event.message: AgentMessage — user, assistant, toolResult, or custom
|
||||
});
|
||||
|
||||
pi.on("message_update", async (event, ctx) => {
|
||||
// event.message: AgentMessage — partial assistant message (streaming)
|
||||
// event.assistantMessageEvent: AssistantMessageEvent — the specific token event
|
||||
});
|
||||
|
||||
pi.on("message_end", async (event, ctx) => {
|
||||
// event.message: AgentMessage — final message
|
||||
// Messages are persisted to the session file at this point
|
||||
});
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 4. Tool Hooks
|
||||
|
||||
### `tool_call`
|
||||
|
||||
**When:** After the LLM requests a tool call, before it executes.
|
||||
|
||||
**Chaining:** Sequential. If any handler returns `{ block: true }`, execution stops immediately. The block reason becomes an Error that is caught and returned as the tool result with `isError: true`.
|
||||
|
||||
```typescript
|
||||
pi.on("tool_call", async (event, ctx) => {
|
||||
// event.toolCallId: string
|
||||
// event.toolName: string
|
||||
// event.input: typed based on tool (use isToolCallEventType for narrowing)
|
||||
|
||||
// Block execution
|
||||
return { block: true, reason: "Not allowed in read-only mode" };
|
||||
|
||||
// Allow execution (return nothing or undefined)
|
||||
});
|
||||
```
|
||||
|
||||
**Type narrowing:**
|
||||
```typescript
|
||||
import { isToolCallEventType } from "@mariozechner/pi-coding-agent";
|
||||
|
||||
pi.on("tool_call", async (event, ctx) => {
|
||||
if (isToolCallEventType("bash", event)) {
|
||||
event.input.command; // string — typed!
|
||||
}
|
||||
if (isToolCallEventType("write", event)) {
|
||||
event.input.path; // string
|
||||
event.input.content; // string
|
||||
}
|
||||
// Custom tools need explicit type params:
|
||||
if (isToolCallEventType<"my_tool", { action: string }>("my_tool", event)) {
|
||||
event.input.action; // string
|
||||
}
|
||||
});
|
||||
```
|
||||
|
||||
### `tool_execution_start` / `tool_execution_update` / `tool_execution_end`
|
||||
|
||||
Informational events during tool execution. No return values.
|
||||
|
||||
```typescript
|
||||
pi.on("tool_execution_start", async (event) => {
|
||||
// event.toolCallId, event.toolName, event.args
|
||||
});
|
||||
|
||||
pi.on("tool_execution_update", async (event) => {
|
||||
// event.partialResult — streaming progress from onUpdate callback
|
||||
});
|
||||
|
||||
pi.on("tool_execution_end", async (event) => {
|
||||
// event.result, event.isError
|
||||
});
|
||||
```
|
||||
|
||||
### `tool_result`
|
||||
|
||||
**When:** After a tool finishes executing, before the result is returned to the agent loop.
|
||||
|
||||
**Chaining:** Sequential. Each handler can modify the result. Modifications accumulate across handlers. All handlers see the evolving `currentEvent` with content/details/isError updated by previous handlers.
|
||||
|
||||
```typescript
|
||||
pi.on("tool_result", async (event, ctx) => {
|
||||
// event.toolCallId: string
|
||||
// event.toolName: string
|
||||
// event.input: Record<string, unknown>
|
||||
// event.content: (TextContent | ImageContent)[]
|
||||
// event.details: unknown
|
||||
// event.isError: boolean
|
||||
|
||||
// Modify the result
|
||||
return {
|
||||
content: [...event.content, { type: "text", text: "\n\nAudit: logged" }],
|
||||
isError: false, // can flip error state
|
||||
};
|
||||
|
||||
// Return nothing to pass through unchanged
|
||||
});
|
||||
```
|
||||
|
||||
**Also fires for errors:** If tool execution throws, `tool_result` still fires with `isError: true` and the error message as content. Extensions can modify even error results.
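
The accumulation semantics described for `tool_result` can be sketched with a simplified event shape. `runToolResult`, `ResultEvent`, and the two sample handlers are assumptions for illustration, not pi's types.

```typescript
// Hypothetical model of tool_result chaining; shapes are simplified.
interface ResultEvent { toolName: string; content: string[]; details: unknown; isError: boolean }
type ResultPatch = Partial<Pick<ResultEvent, "content" | "details" | "isError">>;
type ResultHandler = (event: ResultEvent) => ResultPatch | undefined;

function runToolResult(handlers: ResultHandler[], initial: ResultEvent): ResultEvent {
  let current: ResultEvent = { ...initial }; // shallow copy of the event
  for (const handler of handlers) {
    const patch = handler(current);
    if (patch) current = { ...current, ...patch }; // later handlers see the update
  }
  return current;
}

// Appends an audit line by returning a new array instead of mutating the input.
const audit: ResultHandler = (e) => ({ content: e.content.concat("Audit: logged") });
// Flips the error flag on an error result.
const soften: ResultHandler = (e) => (e.isError ? { isError: false } : undefined);

const finalResult = runToolResult([audit, soften], {
  toolName: "bash",
  content: ["command failed"],
  details: null,
  isError: true,
});
```

Each handler returns only the fields it wants to change, and fields it omits carry over from the previous handler.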
|
||||
|
||||
---
|
||||
|
||||
## 5. Session Hooks
|
||||
|
||||
### `session_start`
|
||||
|
||||
**When:** Initial session load (startup) and after session switch/fork. Also fires after `/reload`.
|
||||
|
||||
**Use for:** State restoration from session entries, initial setup.
|
||||
|
||||
```typescript
|
||||
pi.on("session_start", async (_event, ctx) => {
|
||||
// Restore state from session
|
||||
for (const entry of ctx.sessionManager.getBranch()) {
|
||||
if (entry.type === "custom" && entry.customType === "my-state") {
|
||||
myState = entry.data;
|
||||
}
|
||||
}
|
||||
});
|
||||
```
|
||||
|
||||
### `session_before_switch` / `session_switch`
|
||||
|
||||
**When:** Before/after `/new` or `/resume`.
|
||||
|
||||
```typescript
|
||||
pi.on("session_before_switch", async (event) => {
|
||||
// event.reason: "new" | "resume"
|
||||
// event.targetSessionFile?: string (only for resume)
|
||||
return { cancel: true }; // prevent the switch
|
||||
});
|
||||
```
|
||||
|
||||
### `session_before_fork` / `session_fork`
|
||||
|
||||
**When:** Before/after `/fork`.
|
||||
|
||||
```typescript
|
||||
pi.on("session_before_fork", async (event) => {
|
||||
// event.entryId: string — the entry being forked from
|
||||
return { cancel: true };
|
||||
// or
|
||||
return { skipConversationRestore: true }; // fork without restoring messages
|
||||
});
|
||||
```
|
||||
|
||||
### `session_before_compact` / `session_compact`
|
||||
|
||||
**When:** Before/after compaction (manual or auto).
|
||||
|
||||
```typescript
|
||||
pi.on("session_before_compact", async (event) => {
|
||||
// event.preparation: CompactionPreparation
|
||||
// event.branchEntries: SessionEntry[]
|
||||
// event.customInstructions?: string
|
||||
// event.signal: AbortSignal
|
||||
|
||||
return { cancel: true };
|
||||
// or provide custom compaction:
|
||||
return {
|
||||
compaction: {
|
||||
summary: "My custom summary",
|
||||
firstKeptEntryId: event.preparation.firstKeptEntryId,
|
||||
tokensBefore: event.preparation.tokensBefore,
|
||||
}
|
||||
};
|
||||
});
|
||||
```
|
||||
|
||||
### `session_before_tree` / `session_tree`
|
||||
|
||||
**When:** Before/after `/tree` navigation.
|
||||
|
||||
```typescript
|
||||
pi.on("session_before_tree", async (event) => {
|
||||
// event.preparation: TreePreparation
|
||||
// event.signal: AbortSignal
|
||||
|
||||
return { cancel: true };
|
||||
// or provide custom summary:
|
||||
return {
|
||||
summary: { summary: "Custom branch summary" },
|
||||
label: "my-label",
|
||||
};
|
||||
});
|
||||
```
|
||||
|
||||
### `session_shutdown`
|
||||
|
||||
**When:** Process exit (Ctrl+C, Ctrl+D, SIGTERM, `ctx.shutdown()`).
|
||||
|
||||
```typescript
|
||||
pi.on("session_shutdown", async (_event, ctx) => {
|
||||
// Last chance to persist state
|
||||
// Keep it fast — process is exiting
|
||||
});
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 6. Model Hooks
|
||||
|
||||
### `model_select`
|
||||
|
||||
**When:** Model changes via `/model`, Ctrl+P cycling, or session restore.
|
||||
|
||||
```typescript
|
||||
pi.on("model_select", async (event, ctx) => {
|
||||
// event.model: Model — the new model
|
||||
// event.previousModel: Model | undefined
|
||||
// event.source: "set" | "cycle" | "restore"
|
||||
});
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 7. Resource Hooks
|
||||
|
||||
### `resources_discover`
|
||||
|
||||
**When:** At startup and after `/reload`. Lets extensions provide additional skill, prompt template, and theme paths.
|
||||
|
||||
**Not documented in extending-pi docs.** This is how extensions ship their own resources.
|
||||
|
||||
```typescript
|
||||
// Assumes `join` is imported at the top of the extension: import { join } from "node:path";
pi.on("resources_discover", async (event, ctx) => {
|
||||
// event.cwd: string
|
||||
// event.reason: "startup" | "reload"
|
||||
|
||||
return {
|
||||
skillPaths: [join(__dirname, "skills", "SKILL.md")],
|
||||
promptPaths: [join(__dirname, "prompts", "my-template.md")],
|
||||
themePaths: [join(__dirname, "themes", "dark.json")],
|
||||
};
|
||||
});
|
||||
```
|
||||
|
||||
**Behavior:** Returned paths are loaded by the resource loader and integrated into the system prompt (skills) and available commands (prompts/themes). The system prompt is rebuilt after resources are extended.
|
||||
|
||||
---
|
||||
|
||||
## 8. User Bash Hooks
|
||||
|
||||
### `user_bash`
|
||||
|
||||
**When:** User executes a command via `!` or `!!` prefix in the editor.
|
||||
|
||||
```typescript
|
||||
pi.on("user_bash", async (event, ctx) => {
|
||||
// event.command: string
|
||||
// event.excludeFromContext: boolean (true if !! prefix)
|
||||
// event.cwd: string
|
||||
|
||||
// Provide custom execution (e.g., SSH)
|
||||
return {
|
||||
operations: { execute: (cmd) => sshExec(remote, cmd) },
|
||||
};
|
||||
|
||||
// Or provide a full replacement result
|
||||
return {
|
||||
result: { output: "custom output", exitCode: 0, cancelled: false, truncated: false },
|
||||
};
|
||||
});
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Execution Order Across Extensions
|
||||
|
||||
All hooks iterate through extensions in **load order** (project-local first, then global, then explicitly configured via `-e`). Within each extension, handlers for the same event run in registration order.
|
||||
|
||||
For hooks that chain (e.g., `context`, `before_agent_start.systemPrompt`, `input`, `tool_result`):
|
||||
- Extension A's handler runs first, Extension B sees A's output
|
||||
- Load order determines priority
|
||||
|
||||
For hooks that short-circuit (e.g., `tool_call` with `block`, `input` with `handled`, session `cancel`):
|
||||
- First extension to return the short-circuit value wins
|
||||
- Remaining handlers are skipped
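
The ordering and short-circuit rules above can be sketched as a nested loop. `ExtensionSketch` and `runShortCircuit` are hypothetical names, and the `"cancel"` return value stands in for any short-circuit result (`block`, `handled`, or session `cancel`).

```typescript
// Hypothetical model: iterate extensions in load order, handlers in registration order.
interface ExtensionSketch { name: string; handlers: Array<() => "cancel" | undefined> }

function runShortCircuit(extensions: ExtensionSketch[]): { ran: string[]; cancelledBy: string | null } {
  const ran: string[] = [];
  for (const ext of extensions) { // load order
    for (let i = 0; i < ext.handlers.length; i++) { // registration order
      ran.push(`${ext.name}#${i}`);
      if (ext.handlers[i]() === "cancel") return { ran, cancelledBy: ext.name };
    }
  }
  return { ran, cancelledBy: null };
}

const order = runShortCircuit([
  { name: "project-local", handlers: [() => undefined, () => "cancel"] },
  { name: "global", handlers: [() => undefined] }, // never reached
]);
```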
|
||||
413
docs/context-and-hooks/03-context-injection-patterns.md
Normal file
|
|
@ -0,0 +1,413 @@
|
|||
# Context Injection Patterns
|
||||
|
||||
Practical recipes for injecting, filtering, transforming, and managing context through pi's hook system. Each pattern includes when to use it, which hook to use, and the exact implementation.
|
||||
|
||||
---
|
||||
|
||||
## Pattern 1: Per-Prompt System Prompt Modification
|
||||
|
||||
**Use when:** You want to change the LLM's behavior for the entire agent run based on some condition.
|
||||
|
||||
**Hook:** `before_agent_start`
|
||||
|
||||
```typescript
|
||||
let debugMode = false;
|
||||
|
||||
pi.registerCommand("debug", {
|
||||
handler: async (_args, ctx) => {
|
||||
debugMode = !debugMode;
|
||||
ctx.ui.notify(debugMode ? "Debug mode ON" : "Debug mode OFF");
|
||||
},
|
||||
});
|
||||
|
||||
pi.on("before_agent_start", async (event) => {
|
||||
if (debugMode) {
|
||||
return {
|
||||
systemPrompt: event.systemPrompt + `
|
||||
|
||||
## Debug Mode
|
||||
- Show your reasoning for each decision
|
||||
- Before executing any tool, explain what you expect to happen
|
||||
- After each tool result, explain what you learned
|
||||
- If something unexpected happens, stop and explain before continuing`,
|
||||
};
|
||||
}
|
||||
});
|
||||
```
|
||||
|
||||
**Why `before_agent_start` and not `context`:** The system prompt is separate from the message array. `context` can only modify messages, not the system prompt.
|
||||
|
||||
---
|
||||
|
||||
## Pattern 2: Invisible Context Injection
|
||||
|
||||
**Use when:** You need the LLM to know something without the user seeing it in the chat.
|
||||
|
||||
**Hook:** `before_agent_start` with `display: false`
|
||||
|
||||
```typescript
|
||||
pi.on("before_agent_start", async (event, ctx) => {
|
||||
const gitBranch = await getBranch();
|
||||
const recentCommits = await getRecentCommits(5);
|
||||
|
||||
return {
|
||||
message: {
|
||||
customType: "git-context",
|
||||
content: `[Git Context] Branch: ${gitBranch}\nRecent commits:\n${recentCommits}`,
|
||||
display: false, // User doesn't see this in chat
|
||||
// But the LLM DOES see it — display only controls UI rendering
|
||||
},
|
||||
};
|
||||
});
|
||||
```
|
||||
|
||||
**Important:** `display: false` hides from UI only. The LLM always receives custom messages as `user` role content. There is no way to inject LLM-invisible metadata through `sendMessage` or `before_agent_start`.
|
||||
|
||||
---
|
||||
|
||||
## Pattern 3: Conditional Context Filtering
|
||||
|
||||
**Use when:** Some messages in the history are no longer relevant and waste context tokens.
|
||||
|
||||
**Hook:** `context`
|
||||
|
||||
```typescript
|
||||
// `currentMode` is extension state maintained elsewhere (e.g. toggled by a command)
pi.on("context", async (event) => {
  // Count bash executions up front so only the 10 most recent are kept
  const totalBash = event.messages.filter(m => m.role === "bashExecution").length;
  let bashCount = 0;

  return {
    messages: event.messages.filter(m => {
      // Remove custom messages from a previous mode
      if (m.role === "custom" && m.customType === "plan-mode-context") {
        return currentMode === "plan"; // only keep if still in plan mode
      }

      // Remove old bash executions beyond the last 10
      if (m.role === "bashExecution") {
        return bashCount++ >= totalBash - 10;
      }

      return true;
    }),
  };
});
|
||||
```
|
||||
|
||||
**Why `context` and not `before_agent_start`:** `context` fires every turn and can see the full message array including tool results from earlier turns. `before_agent_start` fires once and can only inject — it can't filter existing messages.
|
||||
|
||||
---
|
||||
|
||||
## Pattern 4: Dynamic Context Injection Per Turn
|
||||
|
||||
**Use when:** You want to add context that changes between turns (e.g., current file state, running process output).
|
||||
|
||||
**Hook:** `context`
|
||||
|
||||
```typescript
|
||||
pi.on("context", async (event, ctx) => {
|
||||
// Inject a synthetic message at the end of the conversation
|
||||
const liveStatus = await getProcessStatus();
|
||||
|
||||
const contextMessage = {
|
||||
role: "user" as const,
|
||||
content: [{ type: "text" as const, text: `[Live Status] ${liveStatus}` }],
|
||||
timestamp: Date.now(),
|
||||
};
|
||||
|
||||
return {
|
||||
messages: [...event.messages, contextMessage],
|
||||
};
|
||||
});
|
||||
```
|
||||
|
||||
**Caution:** Messages injected in `context` are NOT persisted to the session. They exist only for the LLM call. Next turn, you'll need to inject again. This is actually useful — it means the context is always fresh.
|
||||
|
||||
---
|
||||
|
||||
## Pattern 5: Deferred Context (Next Turn)
|
||||
|
||||
**Use when:** You want to attach context to the user's next prompt without interrupting the current conversation.
|
||||
|
||||
**Mechanism:** `pi.sendMessage` with `deliverAs: "nextTurn"`
|
||||
|
||||
```typescript
|
||||
// Queue context for the next user prompt
|
||||
pi.sendMessage(
|
||||
{
|
||||
customType: "deferred-context",
|
||||
content: "The test suite passed with 47/47 tests",
|
||||
display: false,
|
||||
},
|
||||
{ deliverAs: "nextTurn" }
|
||||
);
|
||||
```
|
||||
|
||||
**How it works internally:** The message is stored in `_pendingNextTurnMessages` and injected into the `messages` array when the next `agent.prompt()` is called, after the user message. Unlike `context` hook injection, these messages ARE persisted to the session.
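
A minimal sketch of the deferred-delivery behavior described above. `SessionSketch` and its members are hypothetical, and strings stand in for real message objects.

```typescript
// Hypothetical model of deferred delivery; not pi's session implementation.
interface Queued { customType: string; content: string }

class SessionSketch {
  private pendingNextTurn: Queued[] = [];
  readonly persisted: string[] = [];

  sendNextTurn(message: Queued): void {
    this.pendingNextTurn.push(message); // nothing is sent to the LLM yet
  }

  prompt(userText: string): string[] {
    // The user message comes first, then everything queued for this turn.
    const batch = [`user:${userText}`].concat(
      this.pendingNextTurn.map((m) => `custom:${m.content}`),
    );
    this.pendingNextTurn = []; // delivered exactly once
    this.persisted.push(...batch); // these messages are saved with the session
    return batch;
  }
}

const session = new SessionSketch();
session.sendNextTurn({ customType: "deferred-context", content: "47/47 tests passed" });
const firstPrompt = session.prompt("what next?");
const secondPrompt = session.prompt("and then?");
```

The queued message rides along with the first prompt only; the second prompt carries just the user message.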
|
||||
|
||||
---
|
||||
|
||||
## Pattern 6: Context Window Management
|
||||
|
||||
**Use when:** You're approaching the context limit and need to intelligently prune.
|
||||
|
||||
**Hook:** `context`
|
||||
|
||||
```typescript
|
||||
pi.on("context", async (event, ctx) => {
|
||||
const usage = ctx.getContextUsage();
|
||||
if (!usage || usage.percent === null || usage.percent < 70) {
|
||||
return; // plenty of room
|
||||
}
|
||||
|
||||
// Aggressive pruning: remove tool results beyond the last 20
|
||||
let toolResultCount = 0;
|
||||
const total = event.messages.filter(m => m.role === "toolResult").length;
|
||||
|
||||
return {
|
||||
messages: event.messages.filter(m => {
|
||||
if (m.role === "toolResult") {
|
||||
toolResultCount++;
|
||||
// Keep last 20 tool results
|
||||
return toolResultCount > total - 20;
|
||||
}
|
||||
return true;
|
||||
}),
|
||||
};
|
||||
});
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Pattern 7: Steering with Context
|
||||
|
||||
**Use when:** You want to redirect the agent mid-run with additional context.
|
||||
|
||||
**Mechanism:** `pi.sendMessage` with `deliverAs: "steer"`
|
||||
|
||||
```typescript
|
||||
// During an agent run, inject a steering message
|
||||
pi.sendMessage(
|
||||
{
|
||||
customType: "user-feedback",
|
||||
content: "IMPORTANT: The user just updated the config file. Re-read config.json before continuing.",
|
||||
display: true,
|
||||
},
|
||||
{ deliverAs: "steer" }
|
||||
);
|
||||
```
|
||||
|
||||
**What happens:** The current tool call finishes, remaining queued tool calls are skipped (they get error results saying "Skipped due to queued user message"), and the steering message becomes input for the next turn.
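
The skip behavior can be sketched with a simplified sequential tool loop. `runToolsWithSteering` is hypothetical; only the "Skipped due to queued user message" text comes from the description above.

```typescript
// Hypothetical model of steering during sequential tool execution.
interface ToolOutcome { tool: string; content: string; isError: boolean }

function runToolsWithSteering(
  tools: string[],
  steeringQueue: string[],
  onToolDone: (tool: string) => void,
): ToolOutcome[] {
  const outcomes: ToolOutcome[] = [];
  let skipping = false;
  for (const tool of tools) {
    if (skipping) {
      outcomes.push({ tool, content: "Skipped due to queued user message", isError: true });
      continue;
    }
    outcomes.push({ tool, content: `ran ${tool}`, isError: false });
    onToolDone(tool);
    if (steeringQueue.length > 0) skipping = true; // checked after each tool finishes
  }
  return outcomes;
}

const steeringQueue: string[] = [];
const outcomes = runToolsWithSteering(["read", "edit", "bash"], steeringQueue, (tool) => {
  // Simulate a steering message arriving while the first tool is running.
  if (tool === "read") steeringQueue.push("Re-read config.json before continuing.");
});
```

Here `read` completes normally, while `edit` and `bash` receive error results because the steering message arrived first.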
|
||||
|
||||
---
|
||||
|
||||
## Pattern 8: Follow-Up Context After Completion
|
||||
|
||||
**Use when:** You want to trigger another LLM turn after the agent finishes, with additional context.
|
||||
|
||||
**Mechanism:** `pi.sendMessage` with `deliverAs: "followUp"`
|
||||
|
||||
```typescript
|
||||
pi.on("agent_end", async (event, ctx) => {
|
||||
// Check if the agent made changes that need verification
|
||||
const hasEdits = event.messages.some(m =>
|
||||
m.role === "toolResult" && m.toolName === "edit"
|
||||
);
|
||||
|
||||
if (hasEdits) {
|
||||
pi.sendMessage(
|
||||
{
|
||||
customType: "auto-verify",
|
||||
content: "You just made edits. Please verify them by running the test suite.",
|
||||
display: true,
|
||||
},
|
||||
{ deliverAs: "followUp", triggerTurn: true }
|
||||
);
|
||||
}
|
||||
});
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Pattern 9: Tool-Scoped Context via promptGuidelines
|
||||
|
||||
**Use when:** You want context that only appears when specific tools are active.
|
||||
|
||||
**Mechanism:** `promptGuidelines` on tool registration
|
||||
|
||||
```typescript
|
||||
pi.registerTool({
|
||||
name: "deploy",
|
||||
label: "Deploy",
|
||||
description: "Deploy the application",
|
||||
promptSnippet: "Deploy the application to staging or production",
|
||||
promptGuidelines: [
|
||||
"Always run tests before deploying",
|
||||
"Never deploy to production without explicit user confirmation",
|
||||
"After deploying, verify the health check endpoint",
|
||||
],
|
||||
parameters: Type.Object({ /* ... */ }),
|
||||
async execute(toolCallId, params, signal, onUpdate, ctx) { /* ... */ },
|
||||
});
|
||||
```
|
||||
|
||||
**Behavior:** The `promptGuidelines` are added to the "Guidelines" section of the system prompt ONLY when the `deploy` tool is in the active tool set. If the tool is disabled via `pi.setActiveTools(...)`, the guidelines disappear.
|
||||
|
||||
---
|
||||
|
||||
## Pattern 10: Persistent State as Context
|
||||
|
||||
**Use when:** You need state that survives session resume AND is visible to the LLM.
|
||||
|
||||
**Mechanism:** Tool result `details` + `session_start` reconstruction + `before_agent_start` injection
|
||||
|
||||
```typescript
|
||||
let projectFacts: string[] = [];
|
||||
|
||||
pi.on("session_start", async (_event, ctx) => {
|
||||
// Reconstruct from session
|
||||
projectFacts = [];
|
||||
for (const entry of ctx.sessionManager.getBranch()) {
|
||||
if (entry.type === "message" && entry.message.role === "toolResult") {
|
||||
if (entry.message.toolName === "learn_fact") {
|
||||
projectFacts = entry.message.details?.facts ?? [];
|
||||
}
|
||||
}
|
||||
}
|
||||
});
|
||||
|
||||
pi.registerTool({
|
||||
name: "learn_fact",
|
||||
label: "Learn Fact",
|
||||
description: "Record a fact about the project",
|
||||
parameters: Type.Object({ fact: Type.String() }),
|
||||
async execute(toolCallId, params) {
|
||||
projectFacts.push(params.fact);
|
||||
return {
|
||||
content: [{ type: "text", text: `Learned: ${params.fact}` }],
|
||||
details: { facts: [...projectFacts] }, // snapshot in details for branching
|
||||
};
|
||||
},
|
||||
});
|
||||
|
||||
pi.on("before_agent_start", async (event) => {
|
||||
if (projectFacts.length > 0) {
|
||||
return {
|
||||
message: {
|
||||
customType: "project-facts",
|
||||
content: `Known project facts:\n${projectFacts.map(f => `- ${f}`).join("\n")}`,
|
||||
display: false,
|
||||
},
|
||||
};
|
||||
}
|
||||
});
|
||||
```
|
||||
|
||||
**Why this works for branching:** State lives in tool result `details`, so when the user forks from an earlier point, `session_start` reconstructs from `getBranch()` (the current path), not the full history. Old branches' facts don't leak into new branches.
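
To make the branch isolation concrete, here is a minimal standalone sketch. It assumes session entries form a tree linked by parent ids; `SketchEntry`, `branchOf`, and `factsOnBranch` are hypothetical names used only for illustration and are not part of pi's session manager.

```typescript
// Hypothetical session tree: each entry points at its parent.
type SketchEntry = { id: string; parentId: string | null; facts?: string[] };

// Walk from the current leaf back to the root, then reverse: the "branch".
function branchOf(entries: SketchEntry[], leafId: string): SketchEntry[] {
  const byId: Record<string, SketchEntry> = {};
  for (const e of entries) byId[e.id] = e;
  const path: SketchEntry[] = [];
  let cursor: SketchEntry | undefined = byId[leafId];
  while (cursor) {
    path.push(cursor);
    cursor = cursor.parentId === null ? undefined : byId[cursor.parentId];
  }
  return path.reverse();
}

// Same rule as the session_start handler: the last snapshot on the path wins.
function factsOnBranch(entries: SketchEntry[], leafId: string): string[] {
  let facts: string[] = [];
  for (const e of branchOf(entries, leafId)) {
    if (e.facts) facts = e.facts;
  }
  return facts;
}

const sketchTree: SketchEntry[] = [
  { id: "a", parentId: null },
  { id: "b", parentId: "a", facts: ["uses pnpm"] },
  { id: "c", parentId: "b", facts: ["uses pnpm", "tests in /spec"] }, // old branch
  { id: "d", parentId: "b" }, // fork taken after "b"
];

const forkFacts = factsOnBranch(sketchTree, "d"); // only the fact recorded before the fork
const oldFacts = factsOnBranch(sketchTree, "c"); // both facts from the old branch
```

Iterating over every entry instead of the branch path would pick up the snapshot on `c` even when the active leaf is `d`, which is the leak the pattern avoids.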
|
||||
|
||||
---
|
||||
|
||||
## Pattern 11: Input Preprocessing / Macros
|
||||
|
||||
**Use when:** You want custom syntax that expands before the LLM sees it.
|
||||
|
||||
**Hook:** `input`
|
||||
|
||||
```typescript
|
||||
import { readFileSync } from "node:fs"; // required by the @file expansion below

pi.on("input", async (event) => {
|
||||
// Expand @file references to file contents
|
||||
const expanded = event.text.replace(/@(\S+)/g, (match, filePath) => {
|
||||
try {
|
||||
const content = readFileSync(filePath, "utf-8");
|
||||
return `\`\`\`${filePath}\n${content}\n\`\`\``;
|
||||
} catch {
|
||||
return match; // leave unchanged if can't read
|
||||
}
|
||||
});
|
||||
|
||||
if (expanded !== event.text) {
|
||||
return { action: "transform", text: expanded };
|
||||
}
|
||||
return { action: "continue" };
|
||||
});
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Pattern 12: Context-Aware Tool Blocking
|
||||
|
||||
**Use when:** You want to prevent certain tool usage based on conversation context.
|
||||
|
||||
**Hook:** `tool_call` with `context` awareness
|
||||
|
||||
```typescript
|
||||
let inPlanMode = false;
|
||||
|
||||
pi.on("tool_call", async (event, ctx) => {
|
||||
if (!inPlanMode) return;
|
||||
|
||||
const destructiveTools = ["edit", "write", "bash"];
|
||||
|
||||
if (event.toolName === "bash" && isToolCallEventType("bash", event)) {
|
||||
// Allow read-only bash commands
|
||||
if (isSafeCommand(event.input.command)) return;
|
||||
}
|
||||
|
||||
if (destructiveTools.includes(event.toolName)) {
|
||||
return {
|
||||
block: true,
|
||||
reason: `Plan mode active: ${event.toolName} is not allowed. Use /plan to exit plan mode.`,
|
||||
};
|
||||
}
|
||||
});
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Anti-Patterns
|
||||
|
||||
### ❌ Don't: Modify system prompt in `context`
|
||||
|
||||
```typescript
|
||||
// WRONG — context event can only modify messages, not the system prompt
|
||||
pi.on("context", async (event, ctx) => {
|
||||
// This does nothing to the system prompt
|
||||
return { systemPrompt: "new prompt" }; // ← not a valid return field
|
||||
});
|
||||
```
|
||||
|
||||
### ❌ Don't: Rely on `display: false` for security
|
||||
|
||||
```typescript
|
||||
// WRONG — display: false only hides from UI, LLM still sees it
|
||||
pi.on("before_agent_start", async () => ({
|
||||
message: {
|
||||
customType: "secret",
|
||||
content: "API_KEY=sk-1234", // LLM receives this as a user message!
|
||||
display: false,
|
||||
},
|
||||
}));
|
||||
```
|
||||
|
||||
### ❌ Don't: Use `context` for one-time injection
|
||||
|
||||
```typescript
|
||||
// WRONG: context fires every turn, so one-time injection depends on a fragile guard flag
|
||||
let injected = false;
|
||||
pi.on("context", async (event) => {
|
||||
if (!injected) {
|
||||
injected = true;
|
||||
return { messages: [...event.messages, myMessage] };
|
||||
}
|
||||
});
|
||||
// Problem: after compaction or session restore, injected resets to false
|
||||
```
|
||||
|
||||
Use `before_agent_start` with `message` for one-time per-prompt injection instead.
|
||||
|
||||
### ❌ Don't: Use `getEntries()` for branch-aware state
|
||||
|
||||
```typescript
|
||||
// WRONG — getEntries() returns ALL entries including dead branches
|
||||
for (const entry of ctx.sessionManager.getEntries()) { /* ... */ }
|
||||
|
||||
// CORRECT — getBranch() returns only entries on the current branch path
|
||||
for (const entry of ctx.sessionManager.getBranch()) { /* ... */ }
|
||||
```
|
||||
209
docs/context-and-hooks/04-message-types-and-llm-visibility.md
Normal file
|
|
@ -0,0 +1,209 @@
|
|||
# Message Types and LLM Visibility
|
||||
|
||||
Every message in pi has an `AgentMessage` type. These messages go through `convertToLlm` before the LLM sees them. This document specifies exactly what the LLM receives for each message type and what it never sees.
|
||||
|
||||
---
|
||||
|
||||
## The AgentMessage Type Hierarchy
|
||||
|
||||
Pi uses `AgentMessage` as its internal message type, which is a union of standard LLM messages and custom application messages:
|
||||
|
||||
```typescript
|
||||
// Standard LLM messages
|
||||
type Message = UserMessage | AssistantMessage | ToolResultMessage;
|
||||
|
||||
// Custom messages added by pi's coding agent
|
||||
interface CustomAgentMessages {
|
||||
bashExecution: BashExecutionMessage;
|
||||
custom: CustomMessage;
|
||||
branchSummary: BranchSummaryMessage;
|
||||
compactionSummary: CompactionSummaryMessage;
|
||||
}
|
||||
|
||||
// The union
|
||||
type AgentMessage = Message | CustomAgentMessages[keyof CustomAgentMessages];
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Message Type → LLM Conversion Table
|
||||
|
||||
| AgentMessage type | `role` seen by LLM | Content transformation | When excluded |
|
||||
|---|---|---|---|
|
||||
| `user` | `user` | Pass through unchanged | Never |
|
||||
| `assistant` | `assistant` | Pass through unchanged | Never |
|
||||
| `toolResult` | `toolResult` | Pass through unchanged | Never |
|
||||
| `custom` | `user` | `content` preserved as-is (string → `[{type:"text",text}]`) | Never — ALL custom messages reach the LLM |
|
||||
| `bashExecution` | `user` | Formatted: `` Ran `cmd`\n```\noutput\n``` `` | When `excludeFromContext: true` (`!!` prefix) |
|
||||
| `compactionSummary` | `user` | Wrapped: `The conversation history before this point was compacted into the following summary:\n<summary>\n...\n</summary>` | Never |
|
||||
| `branchSummary` | `user` | Wrapped: `The following is a summary of a branch that this conversation came back from:\n<summary>\n...\n</summary>` | Never |
|
||||
|
||||
---
|
||||
|
||||
## Custom Messages In Detail
|
||||
|
||||
Custom messages are created by:
|
||||
1. `pi.sendMessage()` — extension-injected messages
|
||||
2. `before_agent_start` returning a `message` — per-prompt context injection
|
||||
|
||||
### The `display` Field Misconception
|
||||
|
||||
```typescript
|
||||
pi.sendMessage({
|
||||
customType: "my-context",
|
||||
content: "This text goes to the LLM",
|
||||
display: false, // ← ONLY controls UI rendering
|
||||
});
|
||||
```
|
||||
|
||||
**What `display` controls:**
|
||||
- `true`: Message appears in the TUI chat log (rendered via `registerMessageRenderer` if one exists, or default rendering)
|
||||
- `false`: Message is hidden from the TUI chat log
|
||||
|
||||
**What `display` does NOT control:**
|
||||
- LLM visibility — the LLM ALWAYS receives the content as a `user` role message
|
||||
- Session persistence — the message is ALWAYS persisted to the session file
|
||||
|
||||
### How Custom Messages Become User Messages
|
||||
|
||||
In `convertToLlm` (messages.ts):
|
||||
|
||||
```typescript
|
||||
case "custom": {
|
||||
const content = typeof m.content === "string"
|
||||
? [{ type: "text", text: m.content }]
|
||||
: m.content;
|
||||
return {
|
||||
role: "user",
|
||||
content,
|
||||
timestamp: m.timestamp,
|
||||
};
|
||||
}
|
||||
```
|
||||
|
||||
The `customType`, `display`, and `details` fields are all stripped. The LLM sees a plain user message with the content.
|
||||
|
||||
---
|
||||
|
||||
## Bash Execution Messages
|
||||
|
||||
These messages are created when the user runs a command with the `!` or `!!` prefix.
|
||||
|
||||
### `!` (included in context)
|
||||
|
||||
```typescript
|
||||
// User types: !ls -la
|
||||
// LLM sees:
|
||||
{
|
||||
role: "user",
|
||||
content: [{ type: "text", text: "Ran `ls -la`\n```\n<output>\n```" }]
|
||||
}
|
||||
```
|
||||
|
||||
Exit code, cancellation, and truncation info are appended as needed:
|
||||
- Non-zero exit: `\n\nCommand exited with code N`
|
||||
- Cancelled: `\n\n(command cancelled)`
|
||||
- Truncated: `\n\n[Output truncated. Full output: /path/to/file]`
|
||||
|
||||
### `!!` (excluded from context)
|
||||
|
||||
```typescript
|
||||
// User types: !!echo secret
|
||||
// LLM sees: NOTHING — filtered out by convertToLlm
|
||||
```
|
||||
|
||||
The `excludeFromContext` flag on `BashExecutionMessage` causes `convertToLlm` to return `undefined` for this message, effectively removing it.
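
The formatting and exclusion rules for bash messages can be summarised in one function. This is a sketch, not the code in `messages.ts`: `BashSketch`, `bashToLlm`, and every field name except `excludeFromContext` are assumptions, as is the choice to report cancellation in preference to an exit code.

```typescript
// Hypothetical shape; field names other than excludeFromContext are assumptions.
type BashSketch = {
  command: string;
  output: string;
  exitCode?: number;
  cancelled?: boolean;
  fullOutputPath?: string; // set when the output was truncated
  excludeFromContext?: boolean; // true for the `!!` prefix
};

type LlmUserSketch = { role: "user"; content: { type: "text"; text: string }[] };

function bashToLlm(m: BashSketch): LlmUserSketch | undefined {
  // `!!` commands never reach the LLM.
  if (m.excludeFromContext) return undefined;

  const fence = "`" + "`" + "`"; // three backticks, built piecewise for readability
  let text = "Ran `" + m.command + "`\n" + fence + "\n" + m.output + "\n" + fence;
  if (m.cancelled) {
    text += "\n\n(command cancelled)";
  } else if (m.exitCode !== undefined && m.exitCode !== 0) {
    text += "\n\nCommand exited with code " + m.exitCode;
  }
  if (m.fullOutputPath) {
    text += "\n\n[Output truncated. Full output: " + m.fullOutputPath + "]";
  }
  return { role: "user", content: [{ type: "text", text }] };
}

const visibleBash = bashToLlm({ command: "npm test", output: "1 failing", exitCode: 1 });
const hiddenBash = bashToLlm({ command: "echo secret", output: "secret", excludeFromContext: true });
```

Returning `undefined` for the excluded case mirrors the filtering described above: the caller drops the entry, so the LLM never learns that the command ran.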
|
||||
|
||||
---
|
||||
|
||||
## Compaction and Branch Summary Messages
|
||||
|
||||
These are synthetic messages created by pi's session management.
|
||||
|
||||
### Compaction Summary
|
||||
|
||||
When the context is compacted, older messages are replaced with a summary:
|
||||
|
||||
```typescript
|
||||
// LLM sees:
|
||||
{
|
||||
role: "user",
|
||||
content: [{
|
||||
type: "text",
|
||||
text: "The conversation history before this point was compacted into the following summary:\n\n<summary>\n[LLM-generated summary of the compacted conversation]\n</summary>"
|
||||
}]
|
||||
}
|
||||
```
|
||||
|
||||
### Branch Summary
|
||||
|
||||
When navigating away from a branch and back, the abandoned branch gets summarized:
|
||||
|
||||
```typescript
|
||||
// LLM sees:
|
||||
{
|
||||
role: "user",
|
||||
content: [{
|
||||
type: "text",
|
||||
text: "The following is a summary of a branch that this conversation came back from:\n\n<summary>\n[summary of the branch]\n</summary>"
|
||||
}]
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## What the LLM Never Sees
|
||||
|
||||
1. **`appendEntry` data** — Extension-private entries (`pi.appendEntry("my-state", data)`) are stored in the session file but NEVER included in the message array. They're not `AgentMessage` types at all — they're `CustomEntry` session entries.
|
||||
|
||||
2. **`details` on custom messages** — The `details` field is for rendering and state reconstruction. `convertToLlm` strips it.
|
||||
|
||||
3. **`details` on tool results** — Tool result `details` are stripped by the LLM message conversion. Only `content` reaches the LLM.
|
||||
|
||||
4. **`!!` bash execution output** — Explicitly excluded from context.
|
||||
|
||||
5. **Tool definitions not in the active set** — If a tool is registered but not in `getActiveTools()`, the LLM doesn't know it exists.
|
||||
|
||||
6. **`promptSnippet` and `promptGuidelines` from inactive tools** — Only active tools contribute to the system prompt.
|
||||
|
||||
---
|
||||
|
||||
## The Message Array Order
|
||||
|
||||
For a typical conversation, the message array the LLM sees (after `context` event and `convertToLlm`) looks like:
|
||||
|
||||
```
|
||||
1. [compactionSummary → user] (if compaction happened)
|
||||
2. [branchSummary → user] (if navigated back from a branch)
|
||||
3. [user] (first user message after compaction)
|
||||
4. [assistant] (LLM response)
|
||||
5. [toolResult] (tool results)
|
||||
6. [user] (next user message)
|
||||
7. [custom → user] (extension-injected message)
|
||||
8. ...continues...
|
||||
9. [user] (current prompt)
|
||||
10. [custom → user] (before_agent_start injected messages)
|
||||
11. [custom → user] (nextTurn queued messages)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Implications for Extension Authors
|
||||
|
||||
### If you want the LLM to see something:
|
||||
- Use `before_agent_start` → `message` for per-prompt context
|
||||
- Use `context` event to inject into the message array per-turn
|
||||
- Use `pi.sendMessage` for standalone messages
|
||||
- Use `before_agent_start` → `systemPrompt` for system-level instructions
|
||||
|
||||
### If you want to hide something from the LLM:
|
||||
- Use `pi.appendEntry` — never reaches the message array
|
||||
- Use tool result `details` — stored in session but stripped before LLM
|
||||
- Use the `context` event to filter messages OUT of the array
|
||||
- There is NO way to inject UI-only messages that participate in the conversation flow — `display: false` only hides from the TUI, not from the LLM
|
||||
|
||||
### If you want something to survive compaction:
|
||||
- Store it in tool result `details` (survives in the kept entries)
|
||||
- Store it in `appendEntry` (survives as session data, not messages)
|
||||
- Re-inject it via `before_agent_start` every time (survives because you regenerate it)
|
||||
- Messages in the compacted range are replaced by the compaction summary — they're gone from the LLM's perspective
|
||||
233
docs/context-and-hooks/05-inter-extension-communication.md
Normal file
|
|
@ -0,0 +1,233 @@
|
|||
# Inter-Extension Communication
|
||||
|
||||
How extensions communicate with each other, share state, and coordinate behavior.
|
||||
|
||||
---
|
||||
|
||||
## pi.events — The Shared Event Bus
|
||||
|
||||
Every extension receives the same `pi.events` instance. It's a simple pub/sub bus keyed by channel name; payloads are untyped (`unknown`).
|
||||
|
||||
### API
|
||||
|
||||
```typescript
|
||||
// Emit an event on a channel
|
||||
pi.events.emit("my-channel", { action: "started", id: 123 });
|
||||
|
||||
// Subscribe to a channel — returns an unsubscribe function
|
||||
const unsub = pi.events.on("my-channel", (data) => {
|
||||
// data is typed as `unknown` — you must cast
|
||||
const payload = data as { action: string; id: number };
|
||||
console.log(payload.action); // "started"
|
||||
});
|
||||
|
||||
// Later: stop listening
|
||||
unsub();
|
||||
```
|
||||
|
||||
### Characteristics
|
||||
|
||||
| Property | Behavior |
|
||||
|---|---|
|
||||
| **Typing** | `data` is `unknown`. No generics. Cast at the consumer. |
|
||||
| **Error handling** | Handlers are wrapped in async try/catch. Errors log to `console.error` but don't propagate to emitter or crash the session. |
|
||||
| **Ordering** | Handlers fire in subscription order (order of `pi.events.on` calls). |
|
||||
| **Persistence** | No replay, no persistence. If you emit before anyone subscribes, the event is lost. |
|
||||
| **Scope** | Shared across ALL extensions in the session. The bus is created once and passed to every extension's `createExtensionAPI`. |
|
||||
| **Lifecycle** | The bus is cleared on extension reload (`/reload`). Subscriptions from the old extension instances are gone. |
|
||||
|
||||
### Example: Extension A Signals Extension B
|
||||
|
||||
```typescript
|
||||
// Extension A: plan-mode.ts
|
||||
export default function (pi: ExtensionAPI) {
let planEnabled = false; // toggled by the /plan command below
|
||||
pi.registerCommand("plan", {
|
||||
handler: async (_args, ctx) => {
|
||||
planEnabled = !planEnabled;
|
||||
pi.events.emit("mode-change", { mode: planEnabled ? "plan" : "normal" });
|
||||
},
|
||||
});
|
||||
}
|
||||
|
||||
// Extension B: status-display.ts
|
||||
export default function (pi: ExtensionAPI) {
|
||||
pi.events.on("mode-change", (data) => {
|
||||
const { mode } = data as { mode: string };
|
||||
// React to mode change
|
||||
});
|
||||
}
|
||||
```
|
||||
|
||||
### Limitations
|
||||
|
||||
- **No request/response** — emit is fire-and-forget. If you need a response, use shared state or a callback pattern.
|
||||
- **No guaranteed delivery** — if the subscriber hasn't loaded yet (load order matters), the event is missed.
|
||||
- **No channel namespacing** — use descriptive channel names to avoid collisions (e.g., `"myext:event"` rather than `"update"`).
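
The characteristics and limitations above can be reproduced with a very small bus. The following is a hypothetical stand-in written for illustration, not pi's implementation: `createSketchBus` is an invented name, handlers run synchronously, and errors are collected in an array where the real bus is described as logging them to `console.error`.

```typescript
type BusHandler = (data: unknown) => void;

// Hypothetical stand-in for pi.events, written only to illustrate its semantics.
function createSketchBus() {
  const channels: Record<string, BusHandler[]> = {};
  const errors: unknown[] = [];
  return {
    errors,
    on(channel: string, handler: BusHandler): () => void {
      const list = channels[channel] || (channels[channel] = []);
      list.push(handler);
      // The returned function removes only this handler.
      return () => {
        const i = list.indexOf(handler);
        if (i >= 0) list.splice(i, 1);
      };
    },
    emit(channel: string, data: unknown): void {
      // Copy the list so that unsubscribing during emit does not skip handlers.
      for (const handler of (channels[channel] || []).slice()) {
        try {
          handler(data); // handlers run in subscription order
        } catch (err) {
          errors.push(err); // isolated: later handlers still run
        }
      }
    },
  };
}

const sketchBus = createSketchBus();
const seen: string[] = [];

sketchBus.emit("myext:ready", { n: 0 }); // no subscribers yet: lost, no replay
sketchBus.on("myext:ready", () => { seen.push("first"); });
sketchBus.on("myext:ready", () => { throw new Error("boom"); });
const unsubThird = sketchBus.on("myext:ready", (data) => {
  seen.push("third:" + (data as { n: number }).n);
});
sketchBus.emit("myext:ready", { n: 1 });
unsubThird();
sketchBus.emit("myext:ready", { n: 2 });
```

The first emit is lost because nothing is subscribed yet, the throwing handler does not stop the third one, and after `unsubThird()` only the first handler still records anything.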
|
||||
|
||||
---
|
||||
|
||||
## Shared State Patterns
|
||||
|
||||
### Pattern 1: Shared Module State
|
||||
|
||||
If two extensions are loaded from the same package (via `package.json` `pi.extensions` array), they can share state through module-level variables in a shared file.
|
||||
|
||||
```
|
||||
my-extension/
|
||||
├── package.json # pi.extensions: ["./a.ts", "./b.ts"]
|
||||
├── a.ts # import { state } from "./shared.ts"
|
||||
├── b.ts # import { state } from "./shared.ts"
|
||||
└── shared.ts # export const state = { count: 0 }
|
||||
```
|
||||
|
||||
**Caveat:** jiti module caching means the shared module is loaded once. But on `/reload`, everything is re-imported from scratch — shared state resets.
|
||||
|
||||
### Pattern 2: Event Bus as State Channel
|
||||
|
||||
Use `pi.events` to broadcast state changes. Each extension maintains its own copy.
|
||||
|
||||
```typescript
|
||||
// Extension A: authoritative state owner
|
||||
let items: string[] = [];
|
||||
|
||||
function addItem(item: string) {
|
||||
items.push(item);
|
||||
pi.events.emit("items:updated", { items: [...items] });
|
||||
}
|
||||
|
||||
// Extension B: state consumer
|
||||
let mirroredItems: string[] = [];
|
||||
|
||||
pi.events.on("items:updated", (data) => {
|
||||
mirroredItems = (data as { items: string[] }).items;
|
||||
});
|
||||
```
|
||||
|
||||
### Pattern 3: Session Entries as Coordination Points
|
||||
|
||||
Extensions can read each other's `appendEntry` data from the session:
|
||||
|
||||
```typescript
|
||||
// Extension A writes:
|
||||
pi.appendEntry("ext-a-config", { theme: "dark", verbose: true });
|
||||
|
||||
// Extension B reads during session_start:
|
||||
pi.on("session_start", async (_event, ctx) => {
|
||||
for (const entry of ctx.sessionManager.getEntries()) {
|
||||
if (entry.type === "custom" && entry.customType === "ext-a-config") {
|
||||
const config = entry.data as { theme: string; verbose: boolean };
|
||||
// Use config from Extension A
|
||||
}
|
||||
}
|
||||
});
|
||||
```
|
||||
|
||||
**Downside:** This only works after `session_start`. Not suitable for real-time coordination during a turn.
|
||||
|
||||
---
|
||||
|
||||
## Multi-Extension Coordination Patterns
|
||||
|
||||
### Pattern: Mode Manager
|
||||
|
||||
One extension acts as the mode authority, others react:
|
||||
|
||||
```typescript
|
||||
// mode-manager.ts — the authority
|
||||
export default function (pi: ExtensionAPI) {
|
||||
let currentMode: "plan" | "execute" | "review" = "execute";
|
||||
|
||||
pi.registerCommand("mode", {
|
||||
handler: async (args, ctx) => {
|
||||
const newMode = args.trim() as typeof currentMode;
|
||||
if (!["plan", "execute", "review"].includes(newMode)) {
|
||||
ctx.ui.notify(`Invalid mode: ${newMode}`, "error");
|
||||
return;
|
||||
}
|
||||
currentMode = newMode;
|
||||
pi.events.emit("mode:changed", { mode: currentMode });
|
||||
ctx.ui.notify(`Mode: ${currentMode}`);
|
||||
},
|
||||
});
|
||||
|
||||
// Other extensions can query current mode via event
|
||||
pi.events.on("mode:query", () => {
|
||||
pi.events.emit("mode:current", { mode: currentMode });
|
||||
});
|
||||
}
|
||||
|
||||
// tool-guard.ts — reacts to mode changes
|
||||
export default function (pi: ExtensionAPI) {
|
||||
let currentMode = "execute";
|
||||
|
||||
pi.events.on("mode:changed", (data) => {
|
||||
currentMode = (data as { mode: string }).mode;
|
||||
});
|
||||
|
||||
pi.on("tool_call", async (event) => {
|
||||
if (currentMode === "plan" && ["edit", "write"].includes(event.toolName)) {
|
||||
return { block: true, reason: "Plan mode: write operations disabled" };
|
||||
}
|
||||
if (currentMode === "review" && event.toolName === "bash") {
|
||||
return { block: true, reason: "Review mode: bash disabled" };
|
||||
}
|
||||
});
|
||||
}
|
||||
```
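
The `mode:query` handler above only helps if the consumer is already listening for the reply. The sketch below shows one way the consumer side could look. It uses a tiny synchronous stand-in for `pi.events` (`modeBus`, an assumption for this example), and `queryMode` is a hypothetical helper; with the real bus the reply may arrive later, so the fallback value matters.

```typescript
type ModeHandler = (data: unknown) => void;

// Tiny synchronous stand-in for pi.events (an assumption for this sketch).
const modeHandlers: Record<string, ModeHandler[]> = {};
const modeBus = {
  on(channel: string, handler: ModeHandler): () => void {
    const list = modeHandlers[channel] || (modeHandlers[channel] = []);
    list.push(handler);
    return () => {
      const i = list.indexOf(handler);
      if (i >= 0) list.splice(i, 1);
    };
  },
  emit(channel: string, data: unknown): void {
    for (const handler of (modeHandlers[channel] || []).slice()) handler(data);
  },
};

// Authority side, mirroring mode-manager.ts.
let authorityMode = "review";
modeBus.on("mode:query", () => {
  modeBus.emit("mode:current", { mode: authorityMode });
});

// Consumer side: subscribe to the reply channel BEFORE emitting the query.
function queryMode(onReply: (mode: string) => void): void {
  const unsub = modeBus.on("mode:current", (data) => {
    unsub(); // one-shot: stop listening after the first reply
    onReply((data as { mode: string }).mode);
  });
  modeBus.emit("mode:query", {});
}

let initialMode = "execute"; // fallback if no authority is loaded
queryMode((mode) => { initialMode = mode; });
```

Subscribing first and unsubscribing on the first reply keeps the exchange one-shot, so later `mode:current` broadcasts triggered by other extensions do not call the callback again.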
|
||||
|
||||
### Pattern: Extension Priority Chain
|
||||
|
||||
When multiple extensions handle the same hook, load order determines priority. Project-local extensions load before global ones. Within a directory, files are discovered in filesystem order.
|
||||
|
||||
If you need explicit priority control:
|
||||
|
||||
```typescript
|
||||
// priority-extension.ts
|
||||
export default function (pi: ExtensionAPI) {
|
||||
// Announce on a known channel so other extensions can defer (only extensions that already subscribed receive this)
|
||||
pi.events.emit("priority:registered", { name: "security-guard" });
|
||||
|
||||
pi.on("tool_call", async (event) => {
|
||||
// This runs first if loaded first
|
||||
if (isUnsafe(event)) {
|
||||
return { block: true, reason: "Security policy violation" };
|
||||
}
|
||||
});
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## The ExtensionContext in Tools
|
||||
|
||||
Tools registered by extensions receive `ExtensionContext` as their 5th `execute` parameter. This is the same context event handlers get:
|
||||
|
||||
```typescript
|
||||
pi.registerTool({
|
||||
name: "my_tool",
|
||||
// ...
|
||||
async execute(toolCallId, params, signal, onUpdate, ctx) {
|
||||
// ctx.ui — dialog methods, notifications, widgets
|
||||
// ctx.sessionManager — read session state
|
||||
// ctx.model — current model
|
||||
// ctx.cwd — working directory
|
||||
// ctx.hasUI — false in print/json mode
|
||||
// ctx.isIdle() — agent state
|
||||
// ctx.abort() — abort current operation
|
||||
// ctx.getContextUsage() — token usage
|
||||
// ctx.compact() — trigger compaction
|
||||
// ctx.getSystemPrompt() — current system prompt
|
||||
|
||||
if (ctx.hasUI) {
|
||||
const confirmed = await ctx.ui.confirm("Proceed?", "This will modify files");
|
||||
if (!confirmed) {
|
||||
return { content: [{ type: "text", text: "Cancelled by user" }] };
|
||||
}
|
||||
}
|
||||
|
||||
// ... do work
|
||||
},
|
||||
});
|
||||
```
|
||||
|
||||
**Important:** The `ctx` is freshly created via `runner.createContext()` for each tool execution. It reflects the current state at call time (current model, current session, etc.), not the state when the tool was registered.
|
||||
382
docs/context-and-hooks/06-advanced-patterns-from-source.md
Normal file
|
|
@ -0,0 +1,382 @@
|
|||
# Advanced Patterns from Source
|
||||
|
||||
Production patterns extracted from the pi codebase, built-in extensions, and real extension examples. Each pattern shows the mechanism, the source of truth, and when to use it.
|
||||
|
||||
---
|
||||
|
||||
## Pattern 1: Mode-Aware Tool Sets with Context Injection
|
||||
|
||||
**Source:** `plan-mode/index.ts` — the built-in plan mode extension.
|
||||
|
||||
This pattern combines tool set management, tool call blocking, context event filtering, and before_agent_start injection into a cohesive mode system.
|
||||
|
||||
### The Architecture
|
||||
|
||||
```
|
||||
/plan toggle → sets planModeEnabled
|
||||
├─► setActiveTools(PLAN_MODE_TOOLS) # restrict available tools
|
||||
├─► tool_call guard # block unsafe bash even if tool is active
|
||||
├─► before_agent_start # inject mode-specific instructions
|
||||
├─► context # filter stale mode messages on mode exit
|
||||
└─► agent_end # check plan output, offer execution
|
||||
```
|
||||
|
||||
### Key Insight: Defense in Depth
|
||||
|
||||
The plan mode uses THREE layers of tool control:
|
||||
|
||||
1. **`setActiveTools`** — removes write tools from the active set entirely. The LLM doesn't even know they exist.
|
||||
2. **`tool_call` guard** — even for allowed tools like `bash`, blocks destructive commands via an allowlist.
|
||||
3. **`context` filter** — when exiting plan mode, removes stale plan mode context messages so they don't confuse the LLM in normal mode.
|
||||
|
||||
```typescript
|
||||
// Layer 1: Tool set
|
||||
if (planModeEnabled) {
|
||||
pi.setActiveTools(["read", "bash", "grep", "find", "ls"]);
|
||||
}
|
||||
|
||||
// Layer 2: Bash guard
|
||||
pi.on("tool_call", async (event) => {
|
||||
if (!planModeEnabled || event.toolName !== "bash") return;
|
||||
if (!isSafeCommand(event.input.command)) {
|
||||
return { block: true, reason: "Plan mode: command blocked" };
|
||||
}
|
||||
});
|
||||
|
||||
// Layer 3: Context cleanup on mode exit
|
||||
pi.on("context", async (event) => {
|
||||
if (planModeEnabled) return; // keep plan context when in plan mode
|
||||
return {
|
||||
messages: event.messages.filter(m => {
|
||||
// Remove plan mode markers from context
|
||||
if (m.customType === "plan-mode-context") return false;
|
||||
return true;
|
||||
}),
|
||||
};
|
||||
});
|
||||
```
|
||||
|
||||
### Why This Matters
|
||||
|
||||
A naive implementation would just change the tool set. But:
|
||||
- `bash` stays in the "read-only" tool set by name, yet it can still run `rm -rf`
|
||||
- Stale context messages from a previous mode can confuse the LLM
|
||||
- The LLM might try to work around restrictions if it sees the mode instructions but has the tools available
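
The examples call `isSafeCommand` without defining it. A minimal allowlist-based version might look like the following. The patterns are assumptions chosen for illustration, the real plan-mode extension may use different rules, and an allowlist this small is a starting point rather than a security boundary.

```typescript
// Hypothetical allowlist; the real plan-mode extension may use different rules.
const SAFE_COMMAND_PATTERNS: RegExp[] = [
  /^(ls|cat|head|tail|wc|pwd|grep|rg)(\s|$)/,
  /^git\s+(status|log|diff|show)(\s|$)/,
];

// Shell features that can chain, redirect, or substitute commands.
const SHELL_ESCAPES = /[;&|<>`\n]|\$\(/;

function isSafeCommand(command: string): boolean {
  const trimmed = command.trim();
  // Reject anything that could smuggle a second command past the allowlist.
  if (SHELL_ESCAPES.test(trimmed)) return false;
  // Allow only commands whose first word (or git subcommand) is listed.
  return SAFE_COMMAND_PATTERNS.some((pattern) => pattern.test(trimmed));
}
```

Checking for chaining and redirection before matching the allowlist matters: without it, `ls; rm -rf /` would pass because it starts with an allowed command.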
|
||||
|
||||
---
|
||||
|
||||
## Pattern 2: Preset System with Dynamic Model + Tool + Prompt Configuration
|
||||
|
||||
**Source:** `preset.ts` — the built-in preset extension.
|
||||
|
||||
This pattern shows how to build a full configuration management system that coordinates model, thinking level, tools, and system prompt from a single config file.
|
||||
|
||||
### The Architecture
|
||||
|
||||
```
|
||||
presets.json → load on session_start
|
||||
│
|
||||
├─► /preset command → applyPreset(name)
|
||||
├─► Ctrl+Shift+U → cyclePreset()
|
||||
├─► --preset flag → applyPreset on startup
|
||||
│
|
||||
applyPreset:
|
||||
├─► pi.setModel() → switch model
|
||||
├─► pi.setThinkingLevel() → adjust thinking
|
||||
├─► pi.setActiveTools() → reconfigure tools
|
||||
└─► store activePreset → before_agent_start reads it
|
||||
|
||||
before_agent_start:
|
||||
└─► append preset.instructions to system prompt
|
||||
```
|
||||
|
||||
### Key Insight: Deferred System Prompt Application
|
||||
|
||||
The preset doesn't modify the system prompt during `applyPreset`. It stores `activePreset` and lets `before_agent_start` read it:
|
||||
|
||||
```typescript
|
||||
// On apply — just store
|
||||
activePresetName = name;
|
||||
activePreset = preset;
|
||||
|
||||
// On each prompt — inject
|
||||
pi.on("before_agent_start", async (event) => {
|
||||
if (activePreset?.instructions) {
|
||||
return {
|
||||
systemPrompt: `${event.systemPrompt}\n\n${activePreset.instructions}`,
|
||||
};
|
||||
}
|
||||
});
|
||||
```
|
||||
|
||||
This is better than calling `agent.setSystemPrompt()` directly because:
|
||||
- `before_agent_start` fires on every prompt, keeping the system prompt current
|
||||
- The base system prompt is rebuilt by pi when tools change — a direct set would be overwritten
|
||||
- Other extensions can see and further modify the prompt in the chain
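
The chaining behaviour in the last point can be pictured as a fold over handlers. This is a hypothetical model, not pi's runner: `runPromptChain` and `PromptHandler` are invented names, and it assumes each handler receives the prompt as left by the previous one.

```typescript
type PromptHandler = (systemPrompt: string) => { systemPrompt?: string } | undefined;

// Hypothetical model of the before_agent_start chain; not pi's runner.
function runPromptChain(basePrompt: string, handlers: PromptHandler[]): string {
  let current = basePrompt;
  for (const handler of handlers) {
    const result = handler(current);
    // A handler that returns nothing leaves the prompt untouched.
    if (result && result.systemPrompt !== undefined) current = result.systemPrompt;
  }
  return current;
}

const presetInstructions: string | undefined = "Prefer small, reviewable diffs.";

const chainedPrompt = runPromptChain("BASE", [
  // Preset extension: append, never replace, so the rebuilt base prompt survives.
  (p) => (presetInstructions ? { systemPrompt: p + "\n\n" + presetInstructions } : undefined),
  // A second extension sees the preset's addition and extends it further.
  (p) => ({ systemPrompt: p + "\n\n## Project Rules\n- .claude/rules/style.md" }),
  // A handler with nothing to add.
  () => undefined,
]);
```

Because every handler appends to what it receives, the order of extensions changes the layout of the final prompt but no handler's contribution is lost.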
|
||||
|
||||
---
|
||||
|
||||
## Pattern 3: Progress Tracking with Widget + State Persistence
|
||||
|
||||
**Source:** `plan-mode/index.ts` — todo item tracking during plan execution.
|
||||
|
||||
### The Architecture
|
||||
|
||||
```
|
||||
Plan created (assistant message with "Plan:" section)
|
||||
→ extractTodoItems() parses numbered steps
|
||||
→ todoItems stored in memory
|
||||
→ ui.setWidget() shows progress
|
||||
→ appendEntry() persists state
|
||||
|
||||
Each turn:
|
||||
→ turn_end checks for [DONE:n] markers
|
||||
→ markCompletedSteps() updates todoItems
|
||||
→ updateStatus() refreshes widget
|
||||
|
||||
Session resume:
|
||||
→ session_start restores from appendEntry
|
||||
→ Re-scans messages after last execute marker for [DONE:n]
|
||||
→ Rebuilds completion state
|
||||
```
|
||||
|
||||
### Key Insight: Dual State Reconstruction
|
||||
|
||||
On session resume, the extension does TWO things:
|
||||
|
||||
1. **Reads the persisted state** from `appendEntry`:
|
||||
```typescript
|
||||
const planModeEntry = entries
|
||||
.filter(e => e.type === "custom" && e.customType === "plan-mode")
|
||||
.pop();
|
||||
```
|
||||
|
||||
2. **Re-scans assistant messages** for completion markers:
|
||||
```typescript
|
||||
// Only scan messages AFTER the last plan-mode-execute marker; `messages` must already be limited to that range
|
||||
const allText = messages.map(getTextContent).join("\n");
|
||||
markCompletedSteps(allText, todoItems);
|
||||
```
|
||||
|
||||
This handles the case where the extension crashed or was reloaded mid-execution — the persisted state might be stale, but the messages are the source of truth.
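
A sketch of the two helpers named in the architecture diagram follows. The function names come from the text above, but these bodies are assumptions: the numbered-step format, the `TodoSketch` type, and the return value of `markCompletedSteps` are illustrative rather than taken from `plan-mode/index.ts`.

```typescript
type TodoSketch = { step: number; text: string; completed: boolean };

// Parse numbered lines ("1. Do X" or "2) Do Y") that follow a "Plan:" header.
function extractTodoItems(message: string): TodoSketch[] {
  const planStart = message.indexOf("Plan:");
  if (planStart < 0) return [];
  const items: TodoSketch[] = [];
  for (const line of message.slice(planStart).split("\n")) {
    const match = /^\s*(\d+)[.)]\s+(.+)$/.exec(line);
    if (match) {
      items.push({ step: Number(match[1]), text: (match[2] || "").trim(), completed: false });
    }
  }
  return items;
}

// Mark every step whose [DONE:n] marker appears in the text; returns how many changed.
function markCompletedSteps(text: string, items: TodoSketch[]): number {
  const marker = /\[DONE:(\d+)\]/g;
  let changed = 0;
  let match: RegExpExecArray | null;
  while ((match = marker.exec(text)) !== null) {
    const step = Number(match[1]);
    for (const item of items) {
      if (item.step === step && !item.completed) {
        item.completed = true;
        changed++;
      }
    }
  }
  return changed;
}

const todoItems = extractTodoItems("Plan:\n1. Read config\n2. Update schema\n3. Run tests");
// Persisted state may be stale; replaying the messages is idempotent.
markCompletedSteps("Config loaded [DONE:1]", todoItems);
markCompletedSteps("Config loaded [DONE:1]\nSchema updated [DONE:2]", todoItems);
```

Since marking a step twice changes nothing, re-scanning the messages on resume is safe even when the persisted state already recorded some completions.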
|
||||
|
||||
---
|
||||
|
||||
## Pattern 4: Dynamic Resource Injection
|
||||
|
||||
**Source:** `dynamic-resources/index.ts` — extension that ships its own skills and themes.
|
||||
|
||||
```typescript
|
||||
import { dirname, join } from "node:path";
|
||||
import { fileURLToPath } from "node:url";
|
||||
|
||||
const baseDir = dirname(fileURLToPath(import.meta.url));
|
||||
|
||||
export default function (pi: ExtensionAPI) {
|
||||
pi.on("resources_discover", () => {
|
||||
return {
|
||||
skillPaths: [join(baseDir, "SKILL.md")],
|
||||
promptPaths: [join(baseDir, "dynamic.md")],
|
||||
themePaths: [join(baseDir, "dynamic.json")],
|
||||
};
|
||||
});
|
||||
}
|
||||
```
|
||||
|
||||
### How It Works Internally
|
||||
|
||||
After `session_start`, the runner calls `emitResourcesDiscover()`. The returned paths are processed through the `ResourceLoader`:
|
||||
|
||||
1. Skills → loaded, added to system prompt's skill listing
|
||||
2. Prompts → loaded as prompt templates, available via `/templatename`
|
||||
3. Themes → loaded, available via `/theme` or `ctx.ui.setTheme()`
|
||||
|
||||
The system prompt is rebuilt after resources are extended, so new skills appear in the same prompt turn.
|
||||
|
||||
### When to Use
|
||||
|
||||
- Extension packages that need custom skills (e.g., a deployment extension with a "deploy checklist" skill)
|
||||
- Theme packs distributed as extensions
|
||||
- Dynamic prompt templates that depend on the project context
|
||||
|
||||
---
|
||||
|
||||
## Pattern 5: Claude Rules Integration
|
||||
|
||||
**Source:** `claude-rules.ts` — scanning `.claude/rules/` for per-project rules.
|
||||
|
||||
### The Architecture
|
||||
|
||||
```
|
||||
session_start:
|
||||
→ Scan .claude/rules/ for .md files (recursive)
|
||||
→ Store file list
|
||||
|
||||
before_agent_start:
|
||||
→ Append file list to system prompt
|
||||
→ Agent uses read tool to load specific rules on demand
|
||||
```
|
||||
|
||||
### Key Insight: Listing, Not Loading
|
||||
|
||||
The extension does NOT load rule file contents into the system prompt. It lists the files:
|
||||
|
||||
```typescript
|
||||
pi.on("before_agent_start", async (event) => {
|
||||
if (ruleFiles.length === 0) return;
|
||||
|
||||
const rulesList = ruleFiles.map(f => `- .claude/rules/${f}`).join("\n");
|
||||
|
||||
return {
|
||||
systemPrompt: event.systemPrompt + `
|
||||
|
||||
## Project Rules
|
||||
The following project rules are available in .claude/rules/:
|
||||
${rulesList}
|
||||
When working on tasks related to these rules, use the read tool to load the relevant rule files.`,
|
||||
};
|
||||
});
|
||||
```
|
||||
|
||||
This is context-efficient: the system prompt grows by one line per rule file, not by the full contents of every rule. The LLM loads specific rules via `read` only when relevant.
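The `session_start` half of this pattern (scan recursively, store the file list) could look roughly like the sketch below. It is not taken from `claude-rules.ts`: the helper name `scanRuleFiles`, the sorting, and the forward-slash normalization are assumptions made for illustration, and the demo directory is a throwaway temp folder rather than a real `.claude/rules/`.

```typescript
import { existsSync, mkdirSync, mkdtempSync, readdirSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join, relative, sep } from "node:path";

// Hypothetical helper: recursively collect .md files under rulesDir.
// Returns sorted paths relative to rulesDir, using forward slashes.
function scanRuleFiles(rulesDir: string): string[] {
  if (!existsSync(rulesDir)) return []; // no rules directory, nothing to list

  const found: string[] = [];
  const walk = (dir: string): void => {
    for (const entry of readdirSync(dir, { withFileTypes: true })) {
      const full = join(dir, entry.name);
      if (entry.isDirectory()) {
        walk(full);
      } else if (entry.isFile() && entry.name.endsWith(".md")) {
        found.push(relative(rulesDir, full).split(sep).join("/"));
      }
    }
  };
  walk(rulesDir);
  return found.sort();
}

// Demo against a temporary directory tree.
const demoRoot = mkdtempSync(join(tmpdir(), "rules-demo-"));
mkdirSync(join(demoRoot, "api"), { recursive: true });
writeFileSync(join(demoRoot, "style.md"), "# Style");
writeFileSync(join(demoRoot, "api", "errors.md"), "# Errors");
writeFileSync(join(demoRoot, "notes.txt"), "not a rule file");

const ruleFiles = scanRuleFiles(demoRoot);
console.log(ruleFiles);
```

Only the relative paths are kept in memory, which is what keeps the later `before_agent_start` append cheap: one bullet per file, regardless of how large each rule file is.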
|
||||
|
||||
---
|
||||
|
||||
## Pattern 6: Remote Execution via Tool Wrapping
|
||||
|
||||
**Source:** The SSH extension pattern and `createBashTool` with pluggable operations.
|
||||
|
||||
### The Architecture
|
||||
|
||||
Tools support pluggable `operations` that replace the underlying I/O:
|
||||
|
||||
```typescript
|
||||
import { createBashTool } from "@mariozechner/pi-coding-agent";
|
||||
|
||||
// Create a bash tool that executes via SSH.
// `sshExec` and `remoteHost` are placeholders for your own SSH helper and target host.
|
||||
const remoteBash = createBashTool(cwd, {
|
||||
operations: {
|
||||
execute: async (command, options) => {
|
||||
return sshExec(remoteHost, command, options);
|
||||
},
|
||||
},
|
||||
});
|
||||
|
||||
// Register it as the bash tool (overrides built-in)
|
||||
pi.registerTool({
|
||||
...remoteBash,
|
||||
name: "bash", // same name = overrides built-in
|
||||
});
|
||||
```
|
||||
|
||||
### The spawnHook Alternative
|
||||
|
||||
For lighter customization (e.g., environment setup):
|
||||
|
||||
```typescript
|
||||
const bashTool = createBashTool(cwd, {
|
||||
spawnHook: ({ command, cwd, env }) => ({
|
||||
command: `source ~/.profile\n${command}`,
|
||||
cwd: `/mnt/sandbox${cwd}`,
|
||||
env: { ...env, CI: "1" },
|
||||
}),
|
||||
});
|
||||
```
|
||||
|
||||
### User Bash Hook for `!` Commands
|
||||
|
||||
The `user_bash` event lets you intercept user-typed bash commands (not LLM-initiated ones):
|
||||
|
||||
```typescript
|
||||
pi.on("user_bash", async (event) => {
|
||||
// Route user bash commands through SSH too
|
||||
return {
|
||||
operations: {
|
||||
execute: (cmd, opts) => sshExec(remoteHost, cmd, opts),
|
||||
},
|
||||
};
|
||||
});
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Pattern 7: Extension-Aware Compaction
|
||||
|
||||
**Source:** `session_before_compact` in agent-session.ts.
|
||||
|
||||
### Custom Compaction Summary
|
||||
|
||||
Override the default LLM-generated summary:
|
||||
|
||||
```typescript
|
||||
pi.on("session_before_compact", async (event) => {
|
||||
// Build a domain-specific summary (`buildCustomSummary` is your own helper)
|
||||
const summary = buildCustomSummary(event.branchEntries);
|
||||
|
||||
return {
|
||||
compaction: {
|
||||
summary,
|
||||
firstKeptEntryId: event.preparation.firstKeptEntryId,
|
||||
tokensBefore: event.preparation.tokensBefore,
|
||||
},
|
||||
};
|
||||
});
|
||||
```
|
||||
|
||||
### Compaction-Aware State
|
||||
|
||||
If your extension stores state in messages that might get compacted away, you need a reconstruction strategy:
|
||||
|
||||
```typescript
|
||||
pi.on("session_start", async (_event, ctx) => {
|
||||
// Check if there's been a compaction
|
||||
const entries = ctx.sessionManager.getBranch();
|
||||
const hasCompaction = entries.some(e => e.type === "compaction");
|
||||
|
||||
if (hasCompaction) {
|
||||
// State before compaction is gone from messages
|
||||
// Fall back to appendEntry data or re-derive from remaining messages
|
||||
restoreFromAppendEntries(entries);
|
||||
} else {
|
||||
// Full message history available
|
||||
restoreFromToolResults(entries);
|
||||
}
|
||||
});
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Pattern 8: The Complete Extension Initialization Sequence
|
||||
|
||||
From the source code, the full initialization order is:
|
||||
|
||||
```
|
||||
1. Extension factory function runs
|
||||
├─► pi.on() — register event handlers
|
||||
├─► pi.registerTool() — register tools
|
||||
├─► pi.registerCommand() — register commands
|
||||
├─► pi.registerShortcut() — register shortcuts
|
||||
├─► pi.registerFlag() — register CLI flags
|
||||
└─► pi.registerProvider() — queued (not yet applied)
|
||||
|
||||
2. ExtensionRunner created with all extensions
|
||||
|
||||
3. bindCore() — action methods become live
|
||||
├─► pi.sendMessage, pi.setActiveTools, etc. now work
|
||||
└─► Queued provider registrations flushed to ModelRegistry
|
||||
|
||||
4. bindExtensions() — UI context and command context connected
|
||||
└─► setUIContext(), bindCommandContext()
|
||||
|
||||
5. session_start event fires
|
||||
└─► Extensions restore state from session
|
||||
|
||||
6. resources_discover event fires
|
||||
└─► Extensions provide additional skill/prompt/theme paths
|
||||
|
||||
7. System prompt rebuilt with new resources
|
||||
|
||||
8. Ready for first user prompt
|
||||
```
|
||||
|
||||
**Important timing:** During step 1, action methods (`sendMessage`, `setActiveTools`, etc.) will throw. You can only register handlers and tools during the factory function. Use `session_start` for anything that needs runtime access.
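The timing rule can be illustrated with a toy model. The class below is not pi's `ExtensionAPI`; `ToyExtensionApi`, its `emit` method, and the error message are invented for this sketch. It only mimics the documented behavior: registration works during the factory call, action methods work after `bindCore()`.

```typescript
type Handler = () => void;

// Toy stand-in for the extension API (hypothetical, for illustration only).
class ToyExtensionApi {
  private bound = false;
  private handlers = new Map<string, Handler[]>();
  readonly sent: string[] = [];

  // Registration is always allowed, including during the factory function.
  on(event: string, handler: Handler): void {
    const list = this.handlers.get(event) ?? [];
    list.push(handler);
    this.handlers.set(event, list);
  }

  // Action method: throws until the runtime has been bound.
  sendMessage(text: string): void {
    if (!this.bound) throw new Error("sendMessage called before bindCore()");
    this.sent.push(text);
  }

  bindCore(): void {
    this.bound = true;
  }

  emit(event: string): void {
    for (const handler of this.handlers.get(event) ?? []) handler();
  }
}

const pi = new ToyExtensionApi();

// Step 1: the factory only registers; runtime work is deferred to session_start.
function extensionFactory(api: ToyExtensionApi): void {
  api.on("session_start", () => api.sendMessage("state restored"));
}
extensionFactory(pi);

// Calling an action method at this point fails, as described above.
let threwDuringFactoryPhase = false;
try {
  pi.sendMessage("too early");
} catch {
  threwDuringFactoryPhase = true;
}

// Steps 3 and 5: runtime bound, then session_start fires.
pi.bindCore();
pi.emit("session_start");
console.log(threwDuringFactoryPhase, pi.sent);
```

The design point is that the factory body stays declarative; anything that touches live runtime state moves into a handler.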
|
||||
316
docs/context-and-hooks/07-the-system-prompt-anatomy.md
Normal file
|
|
@ -0,0 +1,316 @@
|
|||
# The System Prompt Anatomy
|
||||
|
||||
How pi's system prompt is built, what goes into it, when it's rebuilt, and every lever you have to shape it.
|
||||
|
||||
---
|
||||
|
||||
## The Final Prompt Structure
|
||||
|
||||
When `buildSystemPrompt()` runs, it assembles sections in this exact order:
|
||||
|
||||
```
|
||||
┌──────────────────────────────────────────────────┐
|
||||
│ 1. Base prompt (default or SYSTEM.md override) │
|
||||
│ ├── Identity statement │
|
||||
│ ├── Available tools list │
|
||||
│ ├── Custom tools note │
|
||||
│ ├── Guidelines │
|
||||
│ └── Pi documentation pointers │
|
||||
│ │
|
||||
│ 2. Append system prompt (APPEND_SYSTEM.md) │
|
||||
│ │
|
||||
│ 3. Project context files │
|
||||
│ ├── ~/.gsd/agent/AGENTS.md (global) │
|
||||
│ ├── Ancestor AGENTS.md / CLAUDE.md files │
|
||||
│ └── cwd AGENTS.md / CLAUDE.md │
|
||||
│ │
|
||||
│ 4. Skills listing │
|
||||
│ └── <available_skills> XML block │
|
||||
│ │
|
||||
│ 5. Date/time and working directory │
|
||||
└──────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
After `buildSystemPrompt()`, extensions can further modify via `before_agent_start`.
|
||||
|
||||
---
|
||||
|
||||
## Section 1: The Base Prompt
|
||||
|
||||
### Default Base Prompt (no SYSTEM.md)
|
||||
|
||||
When no SYSTEM.md exists, pi uses its built-in base:
|
||||
|
||||
```
|
||||
You are an expert coding assistant operating inside pi, a coding agent harness.
|
||||
You help users by reading files, executing commands, editing code, and writing new files.
|
||||
|
||||
Available tools:
|
||||
- read: Read file contents
|
||||
- bash: Execute bash commands (ls, grep, find, etc.)
|
||||
- edit: Make surgical edits to files (find exact text and replace)
|
||||
- write: Create or overwrite files
|
||||
- my_custom_tool: [promptSnippet or description]
|
||||
|
||||
In addition to the tools above, you may have access to other custom tools
|
||||
depending on the project.
|
||||
|
||||
Guidelines:
|
||||
- Use bash for file operations like ls, rg, find
|
||||
- Use read to examine files before editing. You must use this tool instead of cat or sed.
|
||||
- Use edit for precise changes (old text must match exactly)
|
||||
- Use write only for new files or complete rewrites
|
||||
- [extension tool promptGuidelines inserted here]
|
||||
- Be concise in your responses
|
||||
- Show file paths clearly when working with files
|
||||
|
||||
Pi documentation (read only when the user asks about pi itself...):
|
||||
- Main documentation: [path]
|
||||
- Additional docs: [path]
|
||||
- Examples: [path]
|
||||
```
|
||||
|
||||
### SYSTEM.md Override (full replacement)
|
||||
|
||||
If `.gsd/SYSTEM.md` (project) or `~/.gsd/agent/SYSTEM.md` (global) exists, its contents **completely replace** the default base prompt above. The tools list, guidelines, pi docs pointers — all gone. You own the entire base.
|
||||
|
||||
Project takes precedence over global. Only one SYSTEM.md is used (first found wins).
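A minimal sketch of the "first found wins" lookup, assuming the project file is resolved relative to the current working directory. The function name `resolveSystemPromptFile` and the injected `exists` predicate are hypothetical; they are used here so the logic can be shown without touching the real filesystem.

```typescript
import { posix } from "node:path";

// Hypothetical resolver: project SYSTEM.md first, then the global one.
// POSIX path helpers are used only to keep the demo output platform-independent.
function resolveSystemPromptFile(
  cwd: string,
  globalAgentDir: string,
  exists: (path: string) => boolean,
): string | undefined {
  const candidates = [
    posix.join(cwd, ".gsd", "SYSTEM.md"), // project (assumed relative to cwd)
    posix.join(globalAgentDir, "SYSTEM.md"), // global
  ];
  return candidates.find((candidate) => exists(candidate));
}

const bothPresent = new Set([
  "/home/you/projects/myapp/.gsd/SYSTEM.md",
  "/home/you/.gsd/agent/SYSTEM.md",
]);
const globalOnly = new Set(["/home/you/.gsd/agent/SYSTEM.md"]);

const chosenWithBoth = resolveSystemPromptFile(
  "/home/you/projects/myapp",
  "/home/you/.gsd/agent",
  (path) => bothPresent.has(path),
);
const chosenGlobalOnly = resolveSystemPromptFile(
  "/home/you/projects/myapp",
  "/home/you/.gsd/agent",
  (path) => globalOnly.has(path),
);
console.log(chosenWithBoth, chosenGlobalOnly);
```

When the function returns `undefined`, the built-in base prompt would be used instead.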
|
||||
|
||||
**What still gets appended even with a custom SYSTEM.md:**
|
||||
- APPEND_SYSTEM.md content
|
||||
- Project context files (AGENTS.md / CLAUDE.md)
|
||||
- Skills listing (if the `read` tool is active)
|
||||
- Date/time and cwd
|
||||
|
||||
**What you lose:**
|
||||
- The entire default prompt structure
|
||||
- Built-in tool descriptions and guidelines
|
||||
- Pi documentation pointers
|
||||
- Dynamic guidelines from `promptGuidelines` on tools
|
||||
|
||||
### How Tool Descriptions Appear
|
||||
|
||||
Each active tool gets a line in "Available tools":
|
||||
|
||||
```
|
||||
- toolname: [one-line description]
|
||||
```
|
||||
|
||||
The description is determined by priority:
|
||||
1. `promptSnippet` from the tool registration (if provided)
|
||||
2. Built-in description from `toolDescriptions` map (for read, bash, edit, write, grep, find, ls)
|
||||
3. The tool's `name` as fallback
|
||||
|
||||
`promptSnippet` is normalized: newlines collapsed to spaces, trimmed to a single line.
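The priority order and the normalization step might be sketched as follows. This is an illustration, not pi's source: the `ToolInfo` shape, the `describeTool` helper, and the choice to treat an empty snippet as absent are assumptions. The map entries are copied from the default prompt shown earlier.

```typescript
// Hypothetical shape holding only the fields this sketch needs.
interface ToolInfo {
  name: string;
  promptSnippet?: string;
}

// Stand-in for the built-in `toolDescriptions` map.
const toolDescriptions = new Map<string, string>([
  ["read", "Read file contents"],
  ["bash", "Execute bash commands (ls, grep, find, etc.)"],
  ["edit", "Make surgical edits to files (find exact text and replace)"],
  ["write", "Create or overwrite files"],
]);

// Collapse every whitespace run (including newlines) to one space, then trim.
function normalizeSnippet(snippet: string): string {
  return snippet.replace(/\s+/g, " ").trim();
}

// Priority: promptSnippet, then built-in description, then the tool name.
function describeTool(tool: ToolInfo): string {
  const snippet = tool.promptSnippet ? normalizeSnippet(tool.promptSnippet) : "";
  const description = snippet || toolDescriptions.get(tool.name) || tool.name;
  return `- ${tool.name}: ${description}`;
}

const customLine = describeTool({ name: "deploy", promptSnippet: "Deploy the app\n  to staging" });
const builtInLine = describeTool({ name: "read" });
const fallbackLine = describeTool({ name: "my_tool" });
console.log([customLine, builtInLine, fallbackLine].join("\n"));
```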
|
||||
|
||||
### How Guidelines Are Built
|
||||
|
||||
Guidelines are assembled dynamically based on which tools are active:
|
||||
|
||||
| Condition | Guideline |
|
||||
|---|---|
|
||||
| bash active, no grep/find/ls | "Use bash for file operations like ls, rg, find" |
|
||||
| bash active + grep/find/ls | "Prefer grep/find/ls tools over bash for file exploration" |
|
||||
| read + edit active | "Use read to examine files before editing" |
|
||||
| edit active | "Use edit for precise changes (old text must match exactly)" |
|
||||
| write active | "Use write only for new files or complete rewrites" |
|
||||
| edit or write active | "When summarizing your actions, output plain text directly" |
|
||||
| Always | "Be concise in your responses" |
|
||||
| Always | "Show file paths clearly when working with files" |
|
||||
|
||||
**Extension tool guidelines** from `promptGuidelines` are appended after the built-in guidelines. They're deduplicated (same string appears only once even if multiple tools register it).
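The table and the deduplication rule can be expressed as one small function. This is a sketch under assumptions: `buildGuidelines` is a hypothetical name, the guideline strings are the shortened forms from the table, and the exact position of extension guidelines relative to the two always-on bullets is a guess.

```typescript
// Hypothetical reconstruction of the guideline assembly described in the table.
function buildGuidelines(activeTools: string[], extensionGuidelines: string[] = []): string[] {
  const has = (name: string): boolean => activeTools.includes(name);
  const hasExplorationTool = has("grep") || has("find") || has("ls");
  const guidelines: string[] = [];

  if (has("bash")) {
    guidelines.push(
      hasExplorationTool
        ? "Prefer grep/find/ls tools over bash for file exploration"
        : "Use bash for file operations like ls, rg, find",
    );
  }
  if (has("read") && has("edit")) guidelines.push("Use read to examine files before editing");
  if (has("edit")) guidelines.push("Use edit for precise changes (old text must match exactly)");
  if (has("write")) guidelines.push("Use write only for new files or complete rewrites");
  if (has("edit") || has("write")) {
    guidelines.push("When summarizing your actions, output plain text directly");
  }
  guidelines.push("Be concise in your responses", "Show file paths clearly when working with files");

  // Extension guidelines come after the built-ins; identical strings appear once.
  for (const guideline of extensionGuidelines) {
    const trimmed = guideline.trim();
    if (trimmed && !guidelines.includes(trimmed)) guidelines.push(trimmed);
  }
  return guidelines;
}

const defaultSet = buildGuidelines(
  ["read", "bash", "edit", "write"],
  ["Run tests after edits", "Run tests after edits"],
);
const withGrep = buildGuidelines(["bash", "grep"]);
console.log(defaultSet.map((line) => `- ${line}`).join("\n"));
```

Because the output depends only on the active tool set and the registered guidelines, it is natural for the prompt to be rebuilt whenever either changes.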
|
||||
|
||||
---
|
||||
|
||||
## Section 2: Append System Prompt
|
||||
|
||||
If `.gsd/APPEND_SYSTEM.md` (project) or `~/.gsd/agent/APPEND_SYSTEM.md` (global) exists, its contents are appended after the base prompt.
|
||||
|
||||
This is the safe way to add project-wide instructions without replacing the default prompt. It works with both the default base and a custom SYSTEM.md.
|
||||
|
||||
---
|
||||
|
||||
## Section 3: Project Context Files
|
||||
|
||||
Pi walks the filesystem collecting context files:
|
||||
|
||||
```
|
||||
1. ~/.gsd/agent/AGENTS.md (global)
|
||||
2. Walk from cwd upward to root:
|
||||
- Each directory: check for AGENTS.md, then CLAUDE.md (first found wins per directory)
|
||||
- Files are collected root-down (ancestors first, cwd last)
|
||||
```
|
||||
|
||||
All found files are concatenated under a "# Project Context" header:
|
||||
|
||||
```markdown
|
||||
# Project Context
|
||||
|
||||
Project-specific instructions and guidelines:
|
||||
|
||||
## /Users/you/.gsd/agent/AGENTS.md
|
||||
|
||||
[global AGENTS.md content]
|
||||
|
||||
## /Users/you/projects/myapp/AGENTS.md
|
||||
|
||||
[project AGENTS.md content]
|
||||
```
|
||||
|
||||
**AGENTS.md vs CLAUDE.md:** Both are treated identically. Per directory, AGENTS.md is checked first. If it exists, CLAUDE.md in the same directory is skipped.
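The walk described above can be sketched as follows. The function `collectContextFiles` and the injected `exists` predicate are hypothetical, chosen so the ordering logic can be demonstrated without a real filesystem. Any deduplication the actual implementation performs is not modeled here.

```typescript
import { posix } from "node:path";

// Hypothetical walker: global file first, then ancestors root-down, cwd last.
// POSIX path helpers keep the demo output identical on every platform.
function collectContextFiles(
  cwd: string,
  globalAgentDir: string,
  exists: (path: string) => boolean,
): string[] {
  const result: string[] = [];
  const globalFile = posix.join(globalAgentDir, "AGENTS.md");
  if (exists(globalFile)) result.push(globalFile);

  // Walk upward from cwd, keeping at most one file per directory.
  const perDirectory: string[] = [];
  let dir = cwd;
  for (;;) {
    const match = ["AGENTS.md", "CLAUDE.md"]
      .map((name) => posix.join(dir, name))
      .find((candidate) => exists(candidate)); // AGENTS.md is checked first
    if (match) perDirectory.push(match);

    const parent = posix.dirname(dir);
    if (parent === dir) break; // reached the filesystem root
    dir = parent;
  }

  // Collected cwd-first, so reverse to get ancestors first.
  return result.concat(perDirectory.reverse());
}

const present = new Set([
  "/home/you/.gsd/agent/AGENTS.md",
  "/home/you/projects/CLAUDE.md",
  "/home/you/projects/myapp/AGENTS.md",
  "/home/you/projects/myapp/CLAUDE.md", // skipped: AGENTS.md exists in the same directory
]);

const contextFiles = collectContextFiles(
  "/home/you/projects/myapp",
  "/home/you/.gsd/agent",
  (path) => present.has(path),
);
console.log(contextFiles);
```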
|
||||
|
||||
---
|
||||
|
||||
## Section 4: Skills Listing
|
||||
|
||||
If the `read` tool is active and skills are loaded, an XML block is appended:
|
||||
|
||||
```xml
|
||||
The following skills provide specialized instructions for specific tasks.
|
||||
Use the read tool to load a skill's file when the task matches its description.
|
||||
When a skill file references a relative path, resolve it against the skill directory.
|
||||
|
||||
<available_skills>
|
||||
<skill>
|
||||
<name>commit-outstanding</name>
|
||||
<description>Commit all uncommitted files in logical groups</description>
|
||||
<location>/Users/you/.gsd/agent/skills/commit-outstanding/SKILL.md</location>
|
||||
</skill>
|
||||
</available_skills>
|
||||
```
|
||||
|
||||
Skills with `disable-model-invocation: true` in their frontmatter are excluded from this listing.
|
||||
|
||||
**Key design:** Only names, descriptions, and file paths go into the system prompt. The full skill content is NOT loaded. The agent uses the `read` tool to load specific skills on demand. This keeps the system prompt small even with many skills.
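A rough sketch of how such a listing could be rendered from skill metadata. The `SkillInfo` shape, the `formatSkillsBlock` name, the indentation, and the XML escaping are assumptions for illustration; only the element names and the two exclusion rules come from the description above.

```typescript
// Hypothetical metadata shape; the full SKILL.md body is deliberately absent.
interface SkillInfo {
  name: string;
  description: string;
  location: string;
  disableModelInvocation?: boolean;
}

function escapeXml(text: string): string {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Returns an empty string when the listing should be omitted entirely.
function formatSkillsBlock(skills: SkillInfo[], readToolActive: boolean): string {
  const visible = skills.filter((skill) => !skill.disableModelInvocation);
  if (!readToolActive || visible.length === 0) return "";

  const lines: string[] = ["<available_skills>"];
  for (const skill of visible) {
    lines.push(
      "  <skill>",
      `    <name>${escapeXml(skill.name)}</name>`,
      `    <description>${escapeXml(skill.description)}</description>`,
      `    <location>${escapeXml(skill.location)}</location>`,
      "  </skill>",
    );
  }
  lines.push("</available_skills>");
  return lines.join("\n");
}

const demoSkills: SkillInfo[] = [
  {
    name: "commit-outstanding",
    description: "Commit all uncommitted files in logical groups",
    location: "/Users/you/.gsd/agent/skills/commit-outstanding/SKILL.md",
  },
  {
    name: "internal-only",
    description: "Hidden from the model",
    location: "/Users/you/.gsd/agent/skills/internal-only/SKILL.md",
    disableModelInvocation: true,
  },
];

const skillsBlock = formatSkillsBlock(demoSkills, true);
console.log(skillsBlock);
```

The prompt cost grows by a handful of short lines per skill, independent of how long each skill file is.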
|
||||
|
||||
---
|
||||
|
||||
## Section 5: Date/Time and CWD
|
||||
|
||||
Always appended last:
|
||||
|
||||
```
|
||||
Current date and time: Saturday, March 7, 2026 at 08:55:05 AM CST
|
||||
Current working directory: /Users/you/projects/myapp
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## When the System Prompt Is Rebuilt
|
||||
|
||||
The base system prompt (`_baseSystemPrompt`) is rebuilt in these situations:
|
||||
|
||||
| Trigger | What happens |
|
||||
|---|---|
|
||||
| **Startup** (`_buildRuntime`) | Full rebuild with initial tool set |
|
||||
| **`setActiveToolsByName()`** | Rebuild with new tool set (guidelines and snippets change) |
|
||||
| **`reload()`** (`/reload`) | Full rebuild — reloads SYSTEM.md, APPEND_SYSTEM.md, context files, skills, extensions |
|
||||
| **`extendResourcesFromExtensions()`** | Rebuild after `resources_discover` adds new skills/prompts/themes |
|
||||
| **`_refreshToolRegistry()`** | Rebuild when extension tools change dynamically |
|
||||
|
||||
### Per-Prompt Modifications
|
||||
|
||||
On each user prompt, the `before_agent_start` hook can modify the system prompt. This modification is **not persisted** — the base prompt is restored if no extension modifies it on the next prompt:
|
||||
|
||||
```
|
||||
User prompt 1:
|
||||
before_agent_start → extensions modify system prompt → LLM sees modified version
|
||||
|
||||
User prompt 2:
|
||||
before_agent_start → no extensions modify → LLM sees base system prompt (reset)
|
||||
```
|
||||
|
||||
This means `before_agent_start` modifications are truly per-prompt. You cannot make a permanent system prompt change through this hook alone (the change must be re-applied every time).
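The reset behavior can be modeled in a few lines. This is a simulation, not pi's runner: the `PromptHandler` type, the `prompt` field on the event, and the assumption that each handler receives the previous handler's output are all illustrative.

```typescript
// Hypothetical handler type modeled on the before_agent_start return shape.
type PromptHandler = (event: {
  prompt: string;
  systemPrompt: string;
}) => { systemPrompt?: string } | undefined;

// Every user prompt starts again from the base prompt; nothing carries over.
function effectiveSystemPrompt(
  basePrompt: string,
  userPrompt: string,
  handlers: PromptHandler[],
): string {
  let current = basePrompt;
  for (const handler of handlers) {
    const result = handler({ prompt: userPrompt, systemPrompt: current });
    if (result?.systemPrompt !== undefined) current = result.systemPrompt;
  }
  return current;
}

const base = "You are an expert coding assistant.";

// Modifies the prompt only for inputs that start with "review:".
const reviewMode: PromptHandler = (event) =>
  event.prompt.startsWith("review:")
    ? { systemPrompt: `${event.systemPrompt}\n\nFocus on code review.` }
    : undefined;

const firstPrompt = effectiveSystemPrompt(base, "review: check auth.ts", [reviewMode]);
const secondPrompt = effectiveSystemPrompt(base, "rename the helper", [reviewMode]);
console.log(firstPrompt !== base, secondPrompt === base);
```

An extension that wants a lasting change therefore keeps its own state and returns the modified prompt on every invocation.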
|
||||
|
||||
---
|
||||
|
||||
## Every Lever for Shaping the System Prompt
|
||||
|
||||
From static configuration to dynamic extension hooks, ordered from broadest to most targeted:
|
||||
|
||||
### Static (file-based, loaded at startup)
|
||||
|
||||
| Mechanism | Scope | Effect |
|
||||
|---|---|---|
|
||||
| `SYSTEM.md` | Replace base prompt entirely | Nuclear option — you own everything |
|
||||
| `APPEND_SYSTEM.md` | Append to base prompt | Safe additive instructions |
|
||||
| `AGENTS.md` / `CLAUDE.md` | Project context section | Per-project conventions and rules |
|
||||
| Skill `SKILL.md` files | Skills listing | On-demand capability descriptions |
|
||||
|
||||
### Dynamic (extension-based, runtime)
|
||||
|
||||
| Mechanism | Scope | Timing | Effect |
|
||||
|---|---|---|---|
|
||||
| `before_agent_start` → `systemPrompt` | Full prompt | Per user prompt | Modify/append/replace system prompt |
|
||||
| `promptSnippet` on tools | Tool description line | When tool set changes | Custom one-liner in "Available tools" |
|
||||
| `promptGuidelines` on tools | Guidelines section | When tool set changes | Add behavioral bullets |
|
||||
| `pi.setActiveTools()` | Tool list + guidelines | Immediate, next prompt | Add/remove tools (rebuilds prompt) |
|
||||
| `resources_discover` event | Skills listing | Startup + reload | Inject additional skills from extensions |
|
||||
|
||||
### Per-Turn (message-based, not system prompt)
|
||||
|
||||
These don't modify the system prompt but add to what the LLM sees:
|
||||
|
||||
| Mechanism | Timing | Effect |
|
||||
|---|---|---|
|
||||
| `before_agent_start` → `message` | Per user prompt | Inject custom message (becomes user role) |
|
||||
| `context` event | Per LLM turn | Filter/inject/transform message array |
|
||||
| `pi.sendMessage()` | Anytime | Inject custom message into conversation |
|
||||
|
||||
---
|
||||
|
||||
## Practical Tradeoffs
|
||||
|
||||
### SYSTEM.md vs before_agent_start
|
||||
|
||||
| | SYSTEM.md | before_agent_start |
|
||||
|---|---|---|
|
||||
| **Persistence** | Permanent until file changes | Per-prompt, must re-apply |
|
||||
| **Dynamism** | Static file content | Can compute based on state |
|
||||
| **Tool awareness** | Loses built-in tool guidelines | Preserves base prompt, appends |
|
||||
| **Composability** | Only one SYSTEM.md (project or global) | Multiple extensions can chain |
|
||||
|
||||
**Recommendation:** Use SYSTEM.md only when you genuinely need to replace the entire prompt (e.g., custom agent personality, non-coding use case). Use `before_agent_start` for everything else.
|
||||
|
||||
### APPEND_SYSTEM.md vs AGENTS.md
|
||||
|
||||
Both append content, but they appear in different sections:
|
||||
|
||||
- **APPEND_SYSTEM.md** appears immediately after the base prompt, before "# Project Context"
|
||||
- **AGENTS.md** appears inside "# Project Context" with a `## filepath` header
|
||||
|
||||
For the LLM, the two are functionally equivalent. Use APPEND_SYSTEM.md for instructions that feel like system-level directives. Use AGENTS.md for project-specific conventions and context.
|
||||
|
||||
### promptGuidelines vs before_agent_start
|
||||
|
||||
| | promptGuidelines | before_agent_start |
|
||||
|---|---|---|
|
||||
| **Scope** | Only when the tool is active | Always (or conditionally in your code) |
|
||||
| **Positioning** | Inside "Guidelines" section | Appended to end (or wherever you put it) |
|
||||
| **Tool coupling** | Automatically appears/disappears with tool | Independent of tool state |
|
||||
|
||||
**Recommendation:** Use `promptGuidelines` for instructions directly related to tool usage. Use `before_agent_start` for behavioral modifications independent of tool state.
|
||||
|
||||
---
|
||||
|
||||
## The Full Context Surface Area
|
||||
|
||||
Everything the LLM sees on a given turn:
|
||||
|
||||
```
|
||||
System prompt (built from all sources above + before_agent_start mods)
|
||||
+
|
||||
Message array (after context event filtering + convertToLlm):
|
||||
- Compaction summaries (user role)
|
||||
- Branch summaries (user role)
|
||||
- Historical user/assistant/toolResult messages
|
||||
- Bash execution results (user role, unless !! excluded)
|
||||
- Custom messages from extensions (user role)
|
||||
- Current prompt + before_agent_start injected messages
|
||||
+
|
||||
Tool definitions:
|
||||
- name, description, parameter JSON schema
|
||||
- Only for active tools (pi.getActiveTools())
|
||||
```
|
||||
|
||||
Understanding this complete surface area — and which levers control which parts — is the key to effective context engineering in pi.
|
||||
17
docs/context-and-hooks/README.md
Normal file
|
|
@ -0,0 +1,17 @@
|
|||
# Context & Hooks — Deep Reference
|
||||
|
||||
How context flows through pi, how to intercept and shape it, and advanced patterns for extension authors.
|
||||
|
||||
These documents fill gaps between the high-level extending-pi docs and the actual source implementation. Read the extending-pi docs first for fundamentals, then use these for precision work.
|
||||
|
||||
## Documents
|
||||
|
||||
| # | Document | When to read |
|
||||
|---|----------|-------------|
|
||||
| 01 | [The Context Pipeline](01-the-context-pipeline.md) | Understanding the full journey of a user prompt through every transformation stage to the LLM |
|
||||
| 02 | [Hook Reference](02-hook-reference.md) | Complete behavioral specification of every hook — timing, chaining, return shapes, edge cases |
|
||||
| 03 | [Context Injection Patterns](03-context-injection-patterns.md) | Practical recipes for injecting, filtering, transforming, and managing context |
|
||||
| 04 | [Message Types and LLM Visibility](04-message-types-and-llm-visibility.md) | How every message type is converted for the LLM, what it sees, what it doesn't |
|
||||
| 05 | [Inter-Extension Communication](05-inter-extension-communication.md) | `pi.events`, shared state patterns, and multi-extension coordination |
|
||||
| 06 | [Advanced Patterns from Source](06-advanced-patterns-from-source.md) | Production patterns extracted from the pi codebase and built-in extensions |
|
||||
| 07 | [The System Prompt Anatomy](07-the-system-prompt-anatomy.md) | How the system prompt is built, every input source, when it's rebuilt, and every lever to shape it |
|
||||
18
docs/extending-pi/01-what-are-extensions.md
Normal file
|
|
@ -0,0 +1,18 @@
|
|||
# What Are Extensions?
|
||||
|
||||
|
||||
Extensions are TypeScript modules that hook into pi's runtime to extend its behavior. They can:
|
||||
|
||||
- **Register custom tools** the LLM can call (via `pi.registerTool()`)
|
||||
- **Intercept and modify events** — block dangerous tool calls, transform user input, inject context
|
||||
- **Register slash commands** (`/mycommand`) for the user
|
||||
- **Render custom UI** — dialogs, selectors, games, overlays, custom editors
|
||||
- **Persist state** across session restarts
|
||||
- **Control how tool calls and messages appear** in the TUI
|
||||
- **Modify the system prompt** dynamically per-turn
|
||||
- **Manage models and providers** — register custom providers, switch models
|
||||
- **Override built-in tools** — wrap `read`, `bash`, `edit`, `write` with custom logic
|
||||
|
||||
**Why this matters:** Extensions are the primary mechanism for customizing pi. They turn pi from a generic coding agent into *your* coding agent — with your guardrails, your tools, your workflow.
|
||||
|
||||
---
|
||||
34
docs/extending-pi/02-architecture-mental-model.md
Normal file
|
|
@ -0,0 +1,34 @@
|
|||
# Architecture & Mental Model
|
||||
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────────────────┐
|
||||
│ Pi Runtime │
|
||||
│ │
|
||||
│ ┌──────────┐ ┌──────────┐ ┌──────────────────┐ │
|
||||
│ │ Session │ │ Agent │ │ Tool Executor │ │
|
||||
│ │ Manager │ │ Loop │ │ │ │
|
||||
│ └────┬─────┘ └────┬─────┘ └────────┬─────────┘ │
|
||||
│ │ │ │ │
|
||||
│ └──────────────┼─────────────────┘ │
|
||||
│ │ │
|
||||
│ ┌───────▼────────┐ │
|
||||
│ │ Event System │ ◄── All events flow │
|
||||
│ └───────┬────────┘ through here │
|
||||
│ │ │
|
||||
│ ┌────────────┼────────────┐ │
|
||||
│ ▼ ▼ ▼ │
|
||||
│ Extension A Extension B Extension C │
|
||||
│ │
|
||||
└─────────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
**Key concepts:**
|
||||
|
||||
- **Extensions are loaded once** when pi starts (or on `/reload`). Your default export function runs, and you subscribe to events and register tools/commands during that function call.
|
||||
- **Events are the communication mechanism.** Pi emits events at every stage of its lifecycle. Your extension listens and reacts.
|
||||
- **Tools are the LLM's interface to your extension.** The LLM sees tool descriptions in its system prompt and calls them when appropriate.
|
||||
- **Commands are the user's interface.** Users type `/mycommand` to invoke your extension directly.
|
||||
- **State lives in tool result `details`** for proper branching/forking support, or in `pi.appendEntry()` for extension-private state.
|
||||
|
||||
---
|
||||
36
docs/extending-pi/03-getting-started.md
Normal file
|
|
@ -0,0 +1,36 @@
|
|||
# Getting Started
|
||||
|
||||
|
||||
### Minimal Extension
|
||||
|
||||
Create `~/.gsd/agent/extensions/my-extension.ts`:
|
||||
|
||||
```typescript
|
||||
import type { ExtensionAPI } from "@mariozechner/pi-coding-agent";
|
||||
|
||||
export default function (pi: ExtensionAPI) {
|
||||
pi.on("session_start", async (_event, ctx) => {
|
||||
ctx.ui.notify("Extension loaded!", "info");
|
||||
});
|
||||
}
|
||||
```
|
||||
|
||||
### Testing
|
||||
|
||||
```bash
|
||||
# Quick test (doesn't need to be in extensions dir)
|
||||
pi -e ./my-extension.ts
|
||||
|
||||
# Or just place it in the extensions dir and start pi
|
||||
pi
|
||||
```
|
||||
|
||||
### Hot Reload
|
||||
|
||||
Extensions in auto-discovered locations (`~/.gsd/agent/extensions/` or `.gsd/extensions/`) can be hot-reloaded:
|
||||
|
||||
```
|
||||
/reload
|
||||
```
|
||||
|
||||
---
|
||||
32
docs/extending-pi/04-extension-locations-discovery.md
Normal file
|
|
@ -0,0 +1,32 @@
|
|||
# Extension Locations & Discovery
|
||||
|
||||
|
||||
### Auto-Discovery Paths
|
||||
|
||||
| Location | Scope |
|
||||
|----------|-------|
|
||||
| `~/.gsd/agent/extensions/*.ts` | Global (all projects) |
|
||||
| `~/.gsd/agent/extensions/*/index.ts` | Global (subdirectory) |
|
||||
| `.gsd/extensions/*.ts` | Project-local |
|
||||
| `.gsd/extensions/*/index.ts` | Project-local (subdirectory) |
|
||||
|
||||
### Additional Paths (via settings.json)
|
||||
|
||||
```json
|
||||
{
|
||||
"extensions": [
|
||||
"/path/to/local/extension.ts",
|
||||
"/path/to/local/extension/dir"
|
||||
],
|
||||
"packages": [
|
||||
"npm:@foo/bar@1.0.0",
|
||||
"git:github.com/user/repo@v1"
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
### Security Warning
|
||||
|
||||
> Extensions run with your **full system permissions**. They can execute arbitrary code, read/write any file, make network requests. Only install from sources you trust.
|
||||
|
||||
---
|
||||
54
docs/extending-pi/05-extension-structure-styles.md
Normal file
|
|
@ -0,0 +1,54 @@
|
|||
# Extension Structure & Styles
|
||||
|
||||
|
||||
### Single File (simplest)
|
||||
|
||||
```
|
||||
~/.gsd/agent/extensions/
|
||||
└── my-extension.ts
|
||||
```
|
||||
|
||||
### Directory with index.ts (multi-file)
|
||||
|
||||
```
|
||||
~/.gsd/agent/extensions/
|
||||
└── my-extension/
|
||||
├── index.ts # Entry point (must export default function)
|
||||
├── tools.ts
|
||||
└── utils.ts
|
||||
```
|
||||
|
||||
### Package with Dependencies (npm packages needed)
|
||||
|
||||
```
|
||||
~/.gsd/agent/extensions/
|
||||
└── my-extension/
|
||||
├── package.json
|
||||
├── package-lock.json
|
||||
├── node_modules/
|
||||
└── src/
|
||||
└── index.ts
|
||||
```
|
||||
|
||||
```json
|
||||
// package.json
|
||||
{
|
||||
"name": "my-extension",
|
||||
"dependencies": { "zod": "^3.0.0" },
|
||||
"pi": { "extensions": ["./src/index.ts"] }
|
||||
}
|
||||
```
|
||||
|
||||
Run `npm install` in the extension directory. Imports from `node_modules/` resolve automatically.
|
||||
|
||||
### Available Imports
|
||||
|
||||
| Package | Purpose |
|
||||
|---------|---------|
|
||||
| `@mariozechner/pi-coding-agent` | Extension types (`ExtensionAPI`, `ExtensionContext`, event types, utilities) |
|
||||
| `@sinclair/typebox` | Schema definitions for tool parameters (`Type.Object`, `Type.String`, etc.) |
|
||||
| `@mariozechner/pi-ai` | AI utilities (`StringEnum` for Google-compatible enums) |
|
||||
| `@mariozechner/pi-tui` | TUI components (`Text`, `Box`, `Container`, `SelectList`, etc.) |
|
||||
| Node.js built-ins | `node:fs`, `node:path`, `node:child_process`, etc. |
|
||||
|
||||
---
|
||||
42
docs/extending-pi/06-the-extension-lifecycle.md
Normal file
|
|
@ -0,0 +1,42 @@
|
|||
# The Extension Lifecycle
|
||||
|
||||
|
||||
```
|
||||
pi starts
|
||||
│
|
||||
└─► Extension default function runs
|
||||
├── pi.on("event", handler) ← Subscribe to events
|
||||
├── pi.registerTool({...}) ← Register tools
|
||||
├── pi.registerCommand(...) ← Register commands
|
||||
└── pi.registerShortcut(...) ← Register keyboard shortcuts
|
||||
|
||||
└─► session_start event fires
|
||||
│
|
||||
▼
|
||||
User types a prompt ─────────────────────────────────────┐
|
||||
│ │
|
||||
├─► Extension commands checked (bypass if match) │
|
||||
├─► input event (can intercept/transform) │
|
||||
├─► Skill/template expansion │
|
||||
├─► before_agent_start (inject message, modify │
|
||||
│ system prompt) │
|
||||
├─► agent_start │
|
||||
│ │
|
||||
│ ┌── Turn loop (repeats while LLM calls tools)──┐│
|
||||
│ │ turn_start ││
|
||||
│ │ context (can modify messages sent to LLM) ││
|
||||
│ │ LLM responds → may call tools: ││
|
||||
│ │ tool_call (can BLOCK) ││
|
||||
│ │ tool_execution_start/update/end ││
|
||||
│ │ tool_result (can MODIFY) ││
|
||||
│ │ turn_end ││
|
||||
│ └───────────────────────────────────────────────┘│
|
||||
│ │
|
||||
└─► agent_end │
|
||||
│
|
||||
User types another prompt ◄──────────────────────────────┘
|
||||
```
|
||||
|
||||
**Critical insight:** The event system is your primary mechanism for interacting with pi. Every meaningful thing that happens emits an event, and most events let you modify or block the behavior.
|
||||
|
||||
---
|
||||
93
docs/extending-pi/07-events-the-nervous-system.md
Normal file
|
|
@ -0,0 +1,93 @@
|
|||
# Events — The Nervous System
|
||||
|
||||
|
||||
Events are the core of the extension system. They fall into six categories:
|
||||
|
||||
### 7.1 Session Events
|
||||
|
||||
| Event | When | Can Return |
|
||||
|-------|------|------------|
|
||||
| `session_start` | Session loads | — |
|
||||
| `session_before_switch` | Before `/new` or `/resume` | `{ cancel: true }` |
|
||||
| `session_switch` | After session switch | — |
|
||||
| `session_before_fork` | Before `/fork` | `{ cancel: true }` or `{ skipConversationRestore: true }` |
|
||||
| `session_fork` | After fork | — |
|
||||
| `session_before_compact` | Before compaction | `{ cancel: true }` or `{ compaction: {...} }` (custom summary) |
|
||||
| `session_compact` | After compaction | — |
|
||||
| `session_before_tree` | Before `/tree` navigation | `{ cancel: true }` or `{ summary: {...} }` |
|
||||
| `session_tree` | After tree navigation | — |
|
||||
| `session_shutdown` | On exit (Ctrl+C, Ctrl+D, SIGTERM) | — |
|
||||
|
||||
### 7.2 Agent Events
|
||||
|
||||
| Event | When | Can Return |
|
||||
|-------|------|------------|
|
||||
| `before_agent_start` | After user prompt, before agent loop | `{ message: {...}, systemPrompt: "..." }` |
|
||||
| `agent_start` | Agent loop begins | — |
|
||||
| `agent_end` | Agent loop ends | — |
|
||||
| `turn_start` | Each LLM turn begins | — |
|
||||
| `turn_end` | Each LLM turn ends | — |
|
||||
| `context` | Before each LLM call | `{ messages: [...] }` (modified copy) |
|
||||
| `message_start/update/end` | Message lifecycle | — |
|
||||
|
||||
### 7.3 Tool Events
|
||||
|
||||
| Event | When | Can Return |
|
||||
|-------|------|------------|
|
||||
| `tool_call` | Before tool executes | `{ block: true, reason: "..." }` |
|
||||
| `tool_execution_start` | Tool begins executing | — |
|
||||
| `tool_execution_update` | Tool sends progress | — |
|
||||
| `tool_execution_end` | Tool finishes | — |
|
||||
| `tool_result` | After tool executes | `{ content: [...], details: {...}, isError: bool }` (modify result) |
|
||||
|
||||
### 7.4 Input Events
|
||||
|
||||
| Event | When | Can Return |
|
||||
|-------|------|------------|
|
||||
| `input` | User input received (before skill/template expansion) | `{ action: "transform", text: "..." }` or `{ action: "handled" }` or `{ action: "continue" }` |
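The three return shapes in the table can be exercised with a pure function. The `?fix` and `?ping` shorthands, the `expandShorthand` name, and the `event.text` field in the wiring comment are invented for illustration; only the `action` values come from the table.

```typescript
// Return shapes taken from the table above.
type InputResult =
  | { action: "transform"; text: string }
  | { action: "handled" }
  | { action: "continue" };

// Hypothetical shorthand expander for user input.
function expandShorthand(text: string): InputResult {
  const trimmed = text.trim();
  if (trimmed === "?ping") {
    return { action: "handled" }; // fully handled here, nothing reaches the agent
  }
  if (trimmed.startsWith("?fix ")) {
    const detail = trimmed.slice("?fix ".length);
    return { action: "transform", text: `Find and fix the bug described here: ${detail}` };
  }
  return { action: "continue" }; // leave the input untouched
}

// Possible wiring inside an extension (field name `text` is an assumption):
// pi.on("input", async (event) => expandShorthand(event.text));

const transformed = expandShorthand("?fix login loop");
const handled = expandShorthand("?ping");
const untouched = expandShorthand("rename the helper");
console.log(transformed, handled, untouched);
```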
|
||||
|
||||
### 7.5 Model Events
|
||||
|
||||
| Event | When | Can Return |
|-------|------|------------|
| `model_select` | Model changes (`/model`, Ctrl+P, restore) | — |
|
||||
|
||||
### 7.6 User Bash Events
|
||||
|
||||
| Event | When | Can Return |
|-------|------|------------|
| `user_bash` | User runs `!` or `!!` commands | `{ operations: ... }` or `{ result: {...} }` |
|
||||
|
||||
### Event Handler Signature
|
||||
|
||||
```typescript
|
||||
pi.on("event_name", async (event, ctx: ExtensionContext) => {
|
||||
// event — typed payload for this event
|
||||
// ctx — access to UI, session, model, and control flow
|
||||
|
||||
// Return undefined for no action, or a typed response object
|
||||
});
|
||||
```
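As an illustration of returning a typed response, the sketch below models a `tool_call` handler that blocks destructive shell commands. The `ToolCallLike` shape, the `DANGEROUS` patterns, and the `guardBash` helper are assumptions made for this example so that it runs without pi installed; only the `{ block: true, reason }` return shape comes from the Tool Events table.

```typescript
// Hypothetical, trimmed-down shape of a tool_call event (not pi's real type).
type ToolCallLike = { toolName: string; input: { command?: string } };
type BlockResponse = { block: true; reason: string };

// Patterns treated as destructive for this sketch; adjust for your project.
const DANGEROUS: RegExp[] = [/\brm\s+-rf\b/, /\bgit\s+push\s+--force\b/];

// Returns a block response for risky bash commands, undefined otherwise.
function guardBash(event: ToolCallLike): BlockResponse | undefined {
  if (event.toolName !== "bash") return undefined;
  const command = event.input.command ?? "";
  const hit = DANGEROUS.find((pattern) => pattern.test(command));
  return hit ? { block: true, reason: `Blocked by guard: ${hit.source}` } : undefined;
}

// In an extension this could be wired up roughly as:
//   pi.on("tool_call", async (event, ctx) => guardBash(event));
```

Keeping the decision in a pure function makes it easy to test separately from the handler registration.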
|
||||
|
||||
### Type Narrowing for Tool Events
|
||||
|
||||
```typescript
|
||||
import { isToolCallEventType, isBashToolResult } from "@mariozechner/pi-coding-agent";
|
||||
|
||||
pi.on("tool_call", async (event, ctx) => {
|
||||
if (isToolCallEventType("bash", event)) {
|
||||
// event.input is typed as { command: string; timeout?: number }
|
||||
}
|
||||
if (isToolCallEventType("write", event)) {
|
||||
// event.input is typed as { path: string; content: string }
|
||||
}
|
||||
});
|
||||
|
||||
pi.on("tool_result", async (event, ctx) => {
|
||||
if (isBashToolResult(event)) {
|
||||
// event.details is typed as BashToolDetails
|
||||
}
|
||||
});
|
||||
```
|
||||
|
||||
---
|
||||
85
docs/extending-pi/08-extensioncontext-what-you-can-access.md
Normal file
|
|
@ -0,0 +1,85 @@
|
|||
# ExtensionContext — What You Can Access
|
||||
|
||||
|
||||
Every event handler receives `ctx: ExtensionContext`. This is your window into pi's runtime state.
|
||||
|
||||
### ctx.ui — User Interaction
|
||||
|
||||
The primary way to interact with the user. See [Section 12: Custom UI](#12-custom-ui--visual-components) for full details.
|
||||
|
||||
```typescript
|
||||
// Dialogs (blocking, wait for user response)
|
||||
const choice = await ctx.ui.select("Pick one:", ["A", "B", "C"]);
|
||||
const ok = await ctx.ui.confirm("Delete?", "This cannot be undone");
|
||||
const name = await ctx.ui.input("Name:", "placeholder");
|
||||
const text = await ctx.ui.editor("Edit:", "prefilled text");
|
||||
|
||||
// Non-blocking UI
|
||||
ctx.ui.notify("Done!", "info"); // Toast notification
|
||||
ctx.ui.setStatus("my-ext", "Active"); // Footer status
|
||||
ctx.ui.setWidget("my-id", ["Line 1"]); // Widget above/below editor
|
||||
ctx.ui.setTitle("pi - my project"); // Terminal title
|
||||
ctx.ui.setEditorText("Prefill text"); // Set editor content
|
||||
ctx.ui.setWorkingMessage("Thinking..."); // Working message during streaming
|
||||
```
|
||||
|
||||
### ctx.hasUI
|
||||
|
||||
`ctx.hasUI` is `false` in print mode (`-p`) and JSON mode, and `true` in interactive and RPC mode. Check it before calling dialog methods whenever the extension may run non-interactively.
|
||||
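A minimal sketch of that guard follows. `UiContextLike` is a stand-in for the two members of `ExtensionContext` used here, and `confirmOrDefault` is a hypothetical helper name; neither is part of pi's API.

```typescript
// Minimal stand-in for the parts of ExtensionContext used here (an assumption).
interface UiContextLike {
  hasUI: boolean;
  ui: { confirm(title: string, message: string): Promise<boolean> };
}

// Ask the user when a UI exists; otherwise fall back to a fixed default.
async function confirmOrDefault(
  ctx: UiContextLike,
  title: string,
  message: string,
  fallback: boolean,
): Promise<boolean> {
  if (!ctx.hasUI) return fallback; // print/JSON mode: nobody to ask
  return ctx.ui.confirm(title, message);
}
```

Choosing the fallback explicitly at each call site keeps non-interactive runs predictable, for example defaulting destructive actions to `false`.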
|
||||
### ctx.cwd
|
||||
|
||||
Current working directory (string).
|
||||
|
||||
### ctx.sessionManager — Session State
|
||||
|
||||
Read-only access to the session:
|
||||
|
||||
```typescript
|
||||
ctx.sessionManager.getEntries() // All entries in session
|
||||
ctx.sessionManager.getBranch() // Current branch entries
|
||||
ctx.sessionManager.getLeafId() // Current leaf entry ID
|
||||
ctx.sessionManager.getSessionFile() // Path to session JSONL file
|
||||
ctx.sessionManager.getLabel(entryId) // Get label on entry
|
||||
```
|
||||
|
||||
### ctx.modelRegistry / ctx.model
|
||||
|
||||
Access to available models and the current model.
|
||||
|
||||
### ctx.isIdle() / ctx.abort() / ctx.hasPendingMessages()
|
||||
|
||||
Control-flow helpers: `ctx.isIdle()` and `ctx.hasPendingMessages()` report agent state, and `ctx.abort()` aborts the current agent run.
|
||||
|
||||
### ctx.shutdown()
|
||||
|
||||
Request graceful shutdown. Deferred until agent is idle. Emits `session_shutdown` before exiting.
|
||||
|
||||
### ctx.getContextUsage()
|
||||
|
||||
Returns current context token usage. Useful for triggering compaction or showing stats.
|
||||
|
||||
```typescript
|
||||
const usage = ctx.getContextUsage();
|
||||
if (usage && usage.tokens > 100_000) {
|
||||
// Context is getting large
|
||||
}
|
||||
```
|
||||
|
||||
### ctx.compact(options?)
|
||||
|
||||
Trigger compaction programmatically:
|
||||
|
||||
```typescript
|
||||
ctx.compact({
|
||||
customInstructions: "Focus on recent changes",
|
||||
onComplete: (result) => ctx.ui.notify("Compacted!", "info"),
|
||||
onError: (error) => ctx.ui.notify(`Failed: ${error.message}`, "error"),
|
||||
});
|
||||
```
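The two calls above combine naturally into a threshold check. The sketch below assumes only what the text shows: `getContextUsage()` may return nothing, its result has a `tokens` field, and `compact()` accepts `customInstructions`. `CompactContextLike` and `compactIfLarge` are hypothetical names.

```typescript
// Stand-in for the two ExtensionContext members used here (an assumption).
interface CompactContextLike {
  getContextUsage(): { tokens: number } | undefined;
  compact(options?: { customInstructions?: string }): void;
}

// Trigger compaction once usage exceeds maxTokens; report whether it was triggered.
function compactIfLarge(ctx: CompactContextLike, maxTokens: number): boolean {
  const usage = ctx.getContextUsage();
  if (!usage || usage.tokens <= maxTokens) return false;
  ctx.compact({ customInstructions: "Focus on recent changes" });
  return true;
}
```

A handler for `turn_end` would be one plausible place to call such a helper, though where to run it is a design choice.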
|
||||
|
||||
### ctx.getSystemPrompt()
|
||||
|
||||
Returns the current effective system prompt (including any `before_agent_start` modifications).
|
||||
|
||||
---
|
||||
77
docs/extending-pi/09-extensionapi-what-you-can-do.md
Normal file
|
|
@ -0,0 +1,77 @@
|
|||
# ExtensionAPI — What You Can Do
|
||||
|
||||
|
||||
The `pi` object (received in your default export function) is your registration interface. It persists for the lifetime of the extension.
|
||||
|
||||
### Core Registration
|
||||
|
||||
| Method | Purpose |
|--------|---------|
| `pi.on(event, handler)` | Subscribe to events |
| `pi.registerTool(definition)` | Register a tool the LLM can call |
| `pi.registerCommand(name, options)` | Register a `/command` |
| `pi.registerShortcut(key, options)` | Register a keyboard shortcut |
| `pi.registerFlag(name, options)` | Register a CLI flag |
| `pi.registerMessageRenderer(customType, renderer)` | Custom message rendering |
| `pi.registerProvider(name, config)` | Register/override a model provider |
| `pi.unregisterProvider(name)` | Remove a provider |
|
||||
|
||||
### Messaging
|
||||
|
||||
| Method | Purpose |
|--------|---------|
| `pi.sendMessage(message, options?)` | Inject a custom message into the session |
| `pi.sendUserMessage(content, options?)` | Send a user message (triggers a turn) |
|
||||
|
||||
**`sendMessage` delivery modes:**
|
||||
- `"steer"` (default) — Interrupts streaming. Delivered after current tool finishes, remaining tools skipped.
|
||||
- `"followUp"` — Waits for agent to finish. Delivered when agent has no more tool calls.
|
||||
- `"nextTurn"` — Queued for next user prompt. Does not interrupt.
|
||||
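One way to choose among the three modes is sketched below. The mode names come from the list above; the `pickDeliveryMode` helper and its policy are assumptions for illustration, and the `deliverAs` option name in the usage comment is likewise assumed rather than confirmed here.

```typescript
type DeliveryMode = "steer" | "followUp" | "nextTurn";

// Pick a mode from how urgent the message is and whether the agent is idle.
function pickDeliveryMode(urgent: boolean, agentIdle: boolean): DeliveryMode {
  if (urgent) return "steer"; // interrupt the current run
  // Not urgent: wait for a busy agent to finish, or queue for the next prompt.
  return agentIdle ? "nextTurn" : "followUp";
}

// Possible usage inside a handler (option name assumed):
//   pi.sendMessage(message, { deliverAs: pickDeliveryMode(false, ctx.isIdle()) });
```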
|
||||
### State & Session
|
||||
|
||||
| Method | Purpose |
|--------|---------|
| `pi.appendEntry(customType, data?)` | Persist extension state (NOT sent to LLM) |
| `pi.setSessionName(name)` | Set display name for session selector |
| `pi.getSessionName()` | Get current session name |
| `pi.setLabel(entryId, label)` | Bookmark an entry for `/tree` navigation |
|
||||
|
||||
### Tool Management
|
||||
|
||||
| Method | Purpose |
|--------|---------|
| `pi.getActiveTools()` | Get currently active tool names |
| `pi.getAllTools()` | Get all registered tools (name + description) |
| `pi.setActiveTools(names)` | Enable/disable tools at runtime |
|
||||
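These methods make it possible to build something like a read-only mode. The sketch below computes the subset of tool names to keep active; which tools count as read-only is this example's assumption, and `readOnlySubset` is a hypothetical helper.

```typescript
// Tool names treated as read-only here; the split is this sketch's assumption.
const READ_ONLY = new Set(["read", "grep", "find", "ls"]);

// Keep only the read-only tools from the full list of registered tool names.
function readOnlySubset(allToolNames: string[]): string[] {
  return allToolNames.filter((name) => READ_ONLY.has(name));
}

// Possible usage in an extension:
//   pi.setActiveTools(readOnlySubset(pi.getAllTools().map((tool) => tool.name)));
```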
|
||||
### Model Management
|
||||
|
||||
| Method | Purpose |
|--------|---------|
| `pi.setModel(model)` | Switch model. Returns `false` if no API key. |
| `pi.getThinkingLevel()` | Get current thinking level |
| `pi.setThinkingLevel(level)` | Set thinking level (`"off"` through `"xhigh"`) |
|
||||
|
||||
### Utilities
|
||||
|
||||
| Method | Purpose |
|--------|---------|
| `pi.exec(command, args, options?)` | Execute a shell command |
| `pi.events` | Shared event bus for inter-extension communication |
| `pi.getFlag(name)` | Get value of a registered CLI flag |
| `pi.getCommands()` | Get all available slash commands |
|
||||
|
||||
### ExtensionCommandContext (commands only)
|
||||
|
||||
Command handlers receive `ExtensionCommandContext`, which adds session control methods not available in regular event handlers (they would deadlock there):
|
||||
|
||||
| Method | Purpose |
|--------|---------|
| `ctx.waitForIdle()` | Wait for agent to finish streaming |
| `ctx.newSession(options?)` | Create a new session |
| `ctx.fork(entryId)` | Fork from an entry |
| `ctx.navigateTree(targetId, options?)` | Navigate the session tree |
| `ctx.reload()` | Hot-reload extensions, skills, prompts, themes |
|
||||
|
||||
---
|
||||
|
|
@ -0,0 +1,148 @@
|
|||
# Custom Tools — Giving the LLM New Abilities
|
||||
|
||||
|
||||
Tools are the most powerful extension capability. They appear in the LLM's system prompt and the LLM calls them autonomously when appropriate.
|
||||
|
||||
### Tool Definition
|
||||
|
||||
```typescript
|
||||
import { Type } from "@sinclair/typebox";
|
||||
import { StringEnum } from "@mariozechner/pi-ai";
|
||||
|
||||
pi.registerTool({
|
||||
name: "my_tool", // Unique identifier
|
||||
label: "My Tool", // Display name in TUI
|
||||
description: "What this does", // Shown to LLM in system prompt
|
||||
|
||||
// Optional: customize the one-liner in the system prompt's "Available tools" section
|
||||
promptSnippet: "List or add items to the project todo list",
|
||||
|
||||
// Optional: add bullets to the system prompt's "Guidelines" section when tool is active
|
||||
promptGuidelines: [
|
||||
"Use this tool for todo planning instead of direct file edits."
|
||||
],
|
||||
|
||||
// Parameter schema (MUST use TypeBox)
|
||||
parameters: Type.Object({
|
||||
action: StringEnum(["list", "add"] as const), // ⚠️ Use StringEnum, NOT Type.Union/Type.Literal
|
||||
text: Type.Optional(Type.String()),
|
||||
}),
|
||||
|
||||
// The execution function
|
||||
async execute(toolCallId, params, signal, onUpdate, ctx) {
|
||||
// Check for cancellation
|
||||
if (signal?.aborted) {
|
||||
return { content: [{ type: "text", text: "Cancelled" }] };
|
||||
}
|
||||
|
||||
// Stream progress updates to the UI
|
||||
onUpdate?.({
|
||||
content: [{ type: "text", text: "Working..." }],
|
||||
details: { progress: 50 },
|
||||
});
|
||||
|
||||
// Do the work
|
||||
const result = await doSomething(params);
|
||||
|
||||
// Return result
|
||||
return {
|
||||
content: [{ type: "text", text: "Done" }], // Sent to LLM as context
|
||||
details: { data: result }, // For rendering & state reconstruction
|
||||
};
|
||||
},
|
||||
|
||||
// Optional: Custom TUI rendering (see Section 14)
|
||||
renderCall(args, theme) { ... },
|
||||
renderResult(result, options, theme) { ... },
|
||||
});
|
||||
```
|
||||
|
||||
### ⚠️ Critical: Use StringEnum
|
||||
|
||||
For string enum parameters, you **must** use `StringEnum` from `@mariozechner/pi-ai`. `Type.Union([Type.Literal("a"), Type.Literal("b")])` does NOT work with Google's API.
|
||||
|
||||
```typescript
|
||||
import { StringEnum } from "@mariozechner/pi-ai";
|
||||
|
||||
// ✅ Correct
|
||||
action: StringEnum(["list", "add", "remove"] as const)
|
||||
|
||||
// ❌ Broken with Google
|
||||
action: Type.Union([Type.Literal("list"), Type.Literal("add")])
|
||||
```
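A likely reason for the difference is the JSON Schema each form produces. The shapes below are written by hand as plain objects to illustrate it; they are an assumption about what each helper generates, not output captured from TypeBox or `@mariozechner/pi-ai`, and `allowedValues` is a hypothetical helper standing in for a provider that only reads `enum`.

```typescript
// Hand-written approximation of a flat string enum schema.
const flatEnum = { type: "string", enum: ["list", "add"] };

// Hand-written approximation of a union-of-literals schema.
const unionOfLiterals = {
  anyOf: [
    { type: "string", const: "list" },
    { type: "string", const: "add" },
  ],
};

// A consumer that only understands `enum` finds no allowed values in the union form.
function allowedValues(schema: Record<string, unknown>): string[] {
  const values = schema["enum"];
  return Array.isArray(values) ? values.map(String) : [];
}
```

Under that assumption, the flat form exposes both values directly while the union form hides them behind `anyOf`.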
|
||||
|
||||
### Dynamic Tool Registration
|
||||
|
||||
Tools can be registered at any time — during load, in `session_start`, in command handlers, etc. New tools are available immediately without `/reload`.
|
||||
|
||||
```typescript
|
||||
pi.on("session_start", async (_event, ctx) => {
|
||||
pi.registerTool({ name: "dynamic_tool", ... });
|
||||
});
|
||||
|
||||
pi.registerCommand("add-tool", {
|
||||
handler: async (args, ctx) => {
|
||||
pi.registerTool({ name: "runtime_tool", ... });
|
||||
ctx.ui.notify("Tool registered!", "info");
|
||||
},
|
||||
});
|
||||
```
|
||||
|
||||
### Output Truncation
|
||||
|
||||
**Tools MUST truncate output** to avoid overwhelming the LLM context. The built-in limit is 50KB or 2000 lines, whichever is reached first.
|
||||
|
||||
```typescript
|
||||
import {
|
||||
truncateHead, truncateTail, formatSize,
|
||||
DEFAULT_MAX_BYTES, DEFAULT_MAX_LINES,
|
||||
} from "@mariozechner/pi-coding-agent";
|
||||
|
||||
async execute(toolCallId, params, signal, onUpdate, ctx) {
|
||||
const output = await runCommand();
|
||||
const truncation = truncateHead(output, {
|
||||
maxLines: DEFAULT_MAX_LINES,
|
||||
maxBytes: DEFAULT_MAX_BYTES,
|
||||
});
|
||||
|
||||
let result = truncation.content;
|
||||
if (truncation.truncated) {
|
||||
result += `\n\n[Output truncated: ${truncation.outputLines}/${truncation.totalLines} lines]`;
|
||||
}
|
||||
return { content: [{ type: "text", text: result }] };
|
||||
}
|
||||
```
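To make the two limits concrete, here is a self-contained sketch of head truncation, assuming it means keeping the beginning of the output until either budget runs out. It is not the library's `truncateHead`; `truncateHeadSketch` and `TruncationSketch` are hypothetical names that only mirror the result fields used above.

```typescript
interface TruncationSketch {
  content: string;
  truncated: boolean;
  totalLines: number;
  outputLines: number;
}

// Keep lines from the start until either the line or the byte budget is used up.
function truncateHeadSketch(text: string, maxLines: number, maxBytes: number): TruncationSketch {
  const encoder = new TextEncoder();
  const lines = text.split("\n");
  const kept: string[] = [];
  let bytes = 0;
  for (const line of lines) {
    const cost = encoder.encode(line).length + 1; // +1 accounts for the newline
    if (kept.length >= maxLines || bytes + cost > maxBytes) break;
    kept.push(line);
    bytes += cost;
  }
  return {
    content: kept.join("\n"),
    truncated: kept.length < lines.length,
    totalLines: lines.length,
    outputLines: kept.length,
  };
}
```

Counting bytes with `TextEncoder` rather than string length matters for non-ASCII output, where one character can take several bytes.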
|
||||
|
||||
### Overriding Built-in Tools
|
||||
|
||||
Register a tool with the same name as a built-in (`read`, `bash`, `edit`, `write`, `grep`, `find`, `ls`) to override it. Your implementation **must match the exact result shape** including the `details` type.
|
||||
|
||||
```bash
|
||||
# Start with no built-in tools, only your extensions
|
||||
pi --no-tools -e ./my-extension.ts
|
||||
```
|
||||
|
||||
### Remote Execution via Pluggable Operations
|
||||
|
||||
Built-in tools support pluggable operations for SSH, containers, etc.:
|
||||
|
||||
```typescript
|
||||
import { createReadTool, createBashTool } from "@mariozechner/pi-coding-agent";
|
||||
|
||||
const remoteBash = createBashTool(cwd, {
|
||||
operations: { execute: (cmd) => sshExec(remote, cmd) }
|
||||
});
|
||||
|
||||
// The bash tool also supports a spawnHook:
|
||||
const bashTool = createBashTool(cwd, {
|
||||
spawnHook: ({ command, cwd, env }) => ({
|
||||
command: `source ~/.profile\n${command}`,
|
||||
cwd: `/mnt/sandbox${cwd}`,
|
||||
env: { ...env, CI: "1" },
|
||||
}),
|
||||
});
|
||||
```
|
||||
|
||||
**Operations interfaces:** `ReadOperations`, `WriteOperations`, `EditOperations`, `BashOperations`, `LsOperations`, `GrepOperations`, `FindOperations`
|
||||
|
||||
---
|
||||
40
docs/extending-pi/11-custom-commands-user-facing-actions.md
Normal file
|
|
@ -0,0 +1,40 @@
|
|||
# Custom Commands — User-Facing Actions
|
||||
|
||||
|
||||
Commands let users invoke your extension directly via `/mycommand`.
|
||||
|
||||
```typescript
|
||||
pi.registerCommand("deploy", {
|
||||
description: "Deploy to an environment",
|
||||
|
||||
// Optional: argument auto-completion
|
||||
getArgumentCompletions: (prefix: string) => {
|
||||
const envs = ["dev", "staging", "prod"];
|
||||
return envs
|
||||
.filter(e => e.startsWith(prefix))
|
||||
.map(e => ({ value: e, label: e }));
|
||||
},
|
||||
|
||||
handler: async (args, ctx) => {
|
||||
// args = everything after "/deploy "
|
||||
// ctx = ExtensionCommandContext (has extra session control methods)
|
||||
|
||||
await ctx.waitForIdle(); // Wait for agent to finish
|
||||
ctx.ui.notify(`Deploying to ${args}`, "info");
|
||||
},
|
||||
});
|
||||
```
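Because `args` arrives as one raw string, a handler usually needs to parse it. The sketch below shows one way to do that for the `/deploy` example; the `--dry-run` flag and the `parseDeployArgs` helper are hypothetical and not part of pi.

```typescript
const ENVIRONMENTS = ["dev", "staging", "prod"] as const;
type Environment = (typeof ENVIRONMENTS)[number];

// Parse "<env> [--dry-run]"; returns an error string when the input is invalid.
function parseDeployArgs(args: string): { env: Environment; dryRun: boolean } | string {
  const tokens = args.trim().split(/\s+/).filter(Boolean);
  const env = tokens.find((token) => !token.startsWith("--"));
  if (!env) return "Usage: /deploy <dev|staging|prod> [--dry-run]";
  if (!(ENVIRONMENTS as readonly string[]).includes(env)) return `Unknown environment: ${env}`;
  return { env: env as Environment, dryRun: tokens.includes("--dry-run") };
}
```

In a handler, a string result could be shown with `ctx.ui.notify(result, "error")` before returning early.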
|
||||
|
||||
### Command Context Extras
|
||||
|
||||
Command handlers get `ExtensionCommandContext` which extends `ExtensionContext` with:
|
||||
|
||||
- `ctx.waitForIdle()` — Wait for agent to finish
|
||||
- `ctx.newSession(options?)` — Create a new session
|
||||
- `ctx.fork(entryId)` — Fork from an entry
|
||||
- `ctx.navigateTree(targetId, options?)` — Navigate the session tree
|
||||
- `ctx.reload()` — Hot-reload everything
|
||||
|
||||
> **Important:** These methods are only available in commands, not in event handlers, because they would deadlock there.
|
||||
|
||||
---
|
||||
195
docs/extending-pi/12-custom-ui-visual-components.md
Normal file
|
|
@ -0,0 +1,195 @@
|
|||
# Custom UI — Visual Components
|
||||
|
||||
|
||||
Pi's extension UI has multiple layers, from simple notifications to full custom components.
|
||||
|
||||
### 12.1 Dialogs (Blocking)
|
||||
|
||||
```typescript
|
||||
// Selection
|
||||
const choice = await ctx.ui.select("Pick one:", ["A", "B", "C"]);
|
||||
|
||||
// Confirmation
|
||||
const ok = await ctx.ui.confirm("Delete?", "This cannot be undone");
|
||||
|
||||
// Text input
|
||||
const name = await ctx.ui.input("Name:", "placeholder");
|
||||
|
||||
// Multi-line editor
|
||||
const text = await ctx.ui.editor("Edit:", "prefilled text");
|
||||
```
|
||||
|
||||
#### Timed Dialogs
|
||||
|
||||
```typescript
|
||||
// Auto-dismiss after 5s with countdown: "Title (5s)" → "Title (4s)" → ...
|
||||
const ok = await ctx.ui.confirm("Auto-confirm?", "Proceeds in 5s", { timeout: 5000 });
|
||||
// Returns false on timeout
|
||||
```
|
||||
|
||||
### 12.2 Persistent UI Elements
|
||||
|
||||
```typescript
|
||||
// Footer status (persistent until cleared)
|
||||
ctx.ui.setStatus("my-ext", "● Active");
|
||||
ctx.ui.setStatus("my-ext", undefined); // Clear
|
||||
|
||||
// Widget above editor (default placement)
|
||||
ctx.ui.setWidget("my-widget", ["Line 1", "Line 2"]);
|
||||
|
||||
// Widget below editor
|
||||
ctx.ui.setWidget("my-widget", ["Below!"], { placement: "belowEditor" });
|
||||
|
||||
// Widget with theme callback
|
||||
ctx.ui.setWidget("my-widget", (_tui, theme) => ({
|
||||
render: () => [theme.fg("accent", "Styled widget")],
|
||||
invalidate: () => {},
|
||||
}));
|
||||
|
||||
// Working message during streaming
|
||||
ctx.ui.setWorkingMessage("Analyzing code...");
|
||||
ctx.ui.setWorkingMessage(); // Restore default
|
||||
|
||||
// Custom footer (replaces built-in entirely)
|
||||
ctx.ui.setFooter((tui, theme, footerData) => ({
|
||||
render(width) { return [theme.fg("dim", `branch: ${footerData.getGitBranch()}`)]; },
|
||||
invalidate() {},
|
||||
dispose: footerData.onBranchChange(() => tui.requestRender()),
|
||||
}));
|
||||
|
||||
// Editor control
|
||||
ctx.ui.setEditorText("Prefill");
|
||||
const current = ctx.ui.getEditorText();
|
||||
ctx.ui.pasteToEditor("pasted content");
|
||||
|
||||
// Tool expansion
|
||||
ctx.ui.setToolsExpanded(true);
|
||||
ctx.ui.setToolsExpanded(false);
|
||||
|
||||
// Theme management
|
||||
const themes = ctx.ui.getAllThemes();
|
||||
ctx.ui.setTheme("light");
|
||||
```
|
||||
|
||||
### 12.3 Custom Components (ctx.ui.custom)
|
||||
|
||||
For complex UI, `ctx.ui.custom()` temporarily replaces the editor with your component:
|
||||
|
||||
```typescript
import { matchesKey, Key } from "@mariozechner/pi-tui";

const result = await ctx.ui.custom<string | null>((tui, theme, keybindings, done) => {
  // Return a component object
  return {
    render(width: number): string[] {
      return ["Press Enter to confirm, Escape to cancel"];
    },
    handleInput(data: string) {
      // Calling done() closes the component and resolves the awaited promise
      if (matchesKey(data, Key.enter)) done("confirmed");
      if (matchesKey(data, Key.escape)) done(null);
    },
    invalidate() {},
  };
});
```
|
||||
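A component with internal state follows the same `render` / `handleInput` / `invalidate` shape. The sketch below is a small list picker that runs without pi; it compares against raw terminal key sequences only to stay self-contained (real code should use `matchesKey` and `Key`), and `createPicker` is a hypothetical name.

```typescript
// Raw key sequences common in terminals; real code should use matchesKey/Key.
const KEY_UP = "\x1b[A";
const KEY_DOWN = "\x1b[B";
const KEY_ENTER = "\r";
const KEY_ESCAPE = "\x1b";

// Build a component object that tracks a cursor over `items`.
function createPicker(items: string[], done: (value: string | null) => void) {
  let index = 0;
  return {
    render(width: number): string[] {
      // Plain slice is only safe for unstyled ASCII lines.
      return items.map((item, i) => `${i === index ? ">" : " "} ${item}`.slice(0, width));
    },
    handleInput(data: string): void {
      if (data === KEY_UP) index = Math.max(0, index - 1);
      else if (data === KEY_DOWN) index = Math.min(items.length - 1, index + 1);
      else if (data === KEY_ENTER) done(items[index] ?? null);
      else if (data === KEY_ESCAPE) done(null);
    },
    invalidate(): void {},
  };
}
```

Inside `ctx.ui.custom`, the factory would return `createPicker(options, done)`, and a real component would likely also request a re-render after the cursor moves.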
|
||||
### 12.4 Overlays (Floating Modals)
|
||||
|
||||
```typescript
|
||||
const result = await ctx.ui.custom<string | null>(
|
||||
(tui, theme, keybindings, done) => new MyDialog({ onClose: done }),
|
||||
{
|
||||
overlay: true,
|
||||
overlayOptions: {
|
||||
anchor: "center", // 9 positions: center, top-left, top-right, etc.
|
||||
width: "50%",
|
||||
maxHeight: "80%",
|
||||
margin: 2,
|
||||
visible: (w, h) => w >= 80, // Hide on narrow terminals
|
||||
},
|
||||
onHandle: (handle) => {
|
||||
// handle.setHidden(true/false)
|
||||
},
|
||||
}
|
||||
);
|
||||
```
|
||||
|
||||
### 12.5 Custom Editor (Replace Main Input)
|
||||
|
||||
```typescript
|
||||
import { CustomEditor } from "@mariozechner/pi-coding-agent";
|
||||
import { matchesKey, truncateToWidth } from "@mariozechner/pi-tui";
|
||||
|
||||
class VimEditor extends CustomEditor {
|
||||
private mode: "normal" | "insert" = "insert";
|
||||
|
||||
handleInput(data: string): void {
|
||||
if (matchesKey(data, "escape") && this.mode === "insert") {
|
||||
this.mode = "normal";
|
||||
return;
|
||||
}
|
||||
if (this.mode === "insert") {
|
||||
super.handleInput(data); // Normal text editing + app keybindings
|
||||
return;
|
||||
}
|
||||
// Vim normal mode keys...
|
||||
if (data === "i") { this.mode = "insert"; return; }
|
||||
super.handleInput(data); // Pass unhandled to parent
|
||||
}
|
||||
}
|
||||
|
||||
// Register:
|
||||
ctx.ui.setEditorComponent((_tui, theme, keybindings) => new VimEditor(theme, keybindings));
|
||||
ctx.ui.setEditorComponent(undefined); // Restore default
|
||||
```
|
||||
|
||||
> **Key point:** Extend `CustomEditor` (not `Editor`) to get app keybindings (escape to abort, ctrl+d, model switching).
|
||||
|
||||
### 12.6 Built-in TUI Components
|
||||
|
||||
Import from `@mariozechner/pi-tui`:
|
||||
|
||||
| Component | Purpose |
|-----------|---------|
| `Text` | Multi-line text with word wrapping |
| `Box` | Container with padding and background |
| `Container` | Groups children vertically |
| `Spacer` | Empty vertical space |
| `Markdown` | Rendered markdown with syntax highlighting |
| `Image` | Image rendering (Kitty, iTerm2, etc.) |
| `SelectList` | Interactive selection from list |
| `SettingsList` | Toggle settings UI |
| `Input` | Text input field |
|
||||
|
||||
Import from `@mariozechner/pi-coding-agent`:
|
||||
|
||||
| Component | Purpose |
|-----------|---------|
| `DynamicBorder` | Border line with theming |
| `BorderedLoader` | Spinner with cancel support |
|
||||
|
||||
### 12.7 Keyboard Input
|
||||
|
||||
```typescript
|
||||
import { matchesKey, Key } from "@mariozechner/pi-tui";
|
||||
|
||||
handleInput(data: string) {
|
||||
if (matchesKey(data, Key.up)) { /* arrow up */ }
|
||||
if (matchesKey(data, Key.enter)) { /* enter */ }
|
||||
if (matchesKey(data, Key.escape)) { /* escape */ }
|
||||
if (matchesKey(data, Key.ctrl("c"))) { /* ctrl+c */ }
|
||||
if (matchesKey(data, Key.shift("tab"))) { /* shift+tab */ }
|
||||
}
|
||||
```
|
||||
|
||||
### 12.8 Line Width Rules
|
||||
|
||||
**Critical:** Each line from `render()` must not exceed the `width` parameter.
|
||||
|
||||
```typescript
|
||||
import { visibleWidth, truncateToWidth, wrapTextWithAnsi } from "@mariozechner/pi-tui";
|
||||
|
||||
render(width: number): string[] {
|
||||
return [truncateToWidth(this.text, width)];
|
||||
}
|
||||
```
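The rule concerns visible width, so color codes do not count toward it. The sketch below illustrates the idea with a simplified measure that strips SGR color sequences; it is not the library's `visibleWidth`, it ignores wide characters and other escape sequences, and both function names are hypothetical.

```typescript
// Matches SGR color/style sequences such as "\x1b[31m"; other escapes are ignored here.
const ANSI_SGR = /\x1b\[[0-9;]*m/g;

// Approximate visible width: string length after removing SGR sequences.
function visibleWidthSketch(line: string): number {
  return line.replace(ANSI_SGR, "").length; // counts UTF-16 units, not terminal cells
}

// True when every rendered line fits within the given width.
function fitsWidth(lines: string[], width: number): boolean {
  return lines.every((line) => visibleWidthSketch(line) <= width);
}
```

A check like `fitsWidth(component.render(40), 40)` can be handy in tests for custom components.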
|
||||
|
||||
---
|
||||
55
docs/extending-pi/13-state-management-persistence.md
Normal file
|
|
@ -0,0 +1,55 @@
|
|||
# State Management & Persistence
|
||||
|
||||
|
||||
### Pattern: State in Tool Result Details
|
||||
|
||||
The recommended approach for stateful tools. State lives in `details` so it works correctly with branching/forking.
|
||||
|
||||
```typescript
import type { ExtensionAPI } from "@mariozechner/pi-coding-agent";

export default function (pi: ExtensionAPI) {
  // In-memory copy of the state; the session remains the source of truth
  let items: string[] = [];

  // Reconstruct from session on load
  pi.on("session_start", async (_event, ctx) => {
    items = [];
    for (const entry of ctx.sessionManager.getBranch()) {
      if (entry.type === "message" && entry.message.role === "toolResult") {
        if (entry.message.toolName === "my_tool") {
          // Later results overwrite earlier ones, so the last snapshot wins
          items = entry.message.details?.items ?? [];
        }
      }
    }
  });

  pi.registerTool({
    name: "my_tool",
    // ...
    async execute(toolCallId, params, signal, onUpdate, ctx) {
      items.push(params.text);
      return {
        content: [{ type: "text", text: "Added" }],
        details: { items: [...items] }, // ← Snapshot state here
      };
    },
  });
}
```
|
||||
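The reason this works with branching is that the state is recomputed from whichever branch is current. The sketch below isolates that reconstruction as a pure function over a simplified entry shape; `EntryLike` and `reconstructItems` are assumptions for illustration, and pi's real session entries carry more fields.

```typescript
// Trimmed-down entry shape for this sketch (not pi's real type).
type EntryLike =
  | { type: "message"; message: { role: string; toolName?: string; details?: { items?: string[] } } }
  | { type: "custom"; customType: string; data?: unknown };

// Walk the branch in order; the last matching tool result wins.
function reconstructItems(branch: EntryLike[], toolName: string): string[] {
  let items: string[] = [];
  for (const entry of branch) {
    if (entry.type !== "message") continue;
    const { role, toolName: name, details } = entry.message;
    if (role === "toolResult" && name === toolName) items = details?.items ?? [];
  }
  return items;
}
```

Two branches that share an early tool result but diverge later yield different item lists, with no extra bookkeeping in the extension.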
|
||||
### Pattern: Extension-Private State (appendEntry)
|
||||
|
||||
For state that doesn't participate in LLM context but needs to survive restarts:
|
||||
|
||||
```typescript
|
||||
pi.appendEntry("my-state", { count: 42, lastRun: Date.now() });
|
||||
|
||||
// Restore on reload
|
||||
pi.on("session_start", async (_event, ctx) => {
|
||||
for (const entry of ctx.sessionManager.getEntries()) {
|
||||
if (entry.type === "custom" && entry.customType === "my-state") {
|
||||
const data = entry.data; // { count: 42, lastRun: ... }
|
||||
}
|
||||
}
|
||||
});
|
||||
```
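If state is appended repeatedly, the most recent entry is usually the one to restore. The sketch below scans entries from the end; `CustomEntryLike` is a simplified shape and `latestCustomData` is a hypothetical helper, both assumptions rather than pi types.

```typescript
// Simplified entry shape: only the fields this sketch reads.
type CustomEntryLike = { type: string; customType?: string; data?: unknown };

// Return the data of the newest custom entry with the given customType, if any.
function latestCustomData<T>(entries: CustomEntryLike[], customType: string): T | undefined {
  for (let i = entries.length - 1; i >= 0; i--) {
    const entry = entries[i];
    if (entry && entry.type === "custom" && entry.customType === customType) {
      return entry.data as T; // caller asserts the stored shape
    }
  }
  return undefined;
}
```

Inside `session_start`, this could be called as `latestCustomData<{ count: number }>(ctx.sessionManager.getEntries(), "my-state")`.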
|
||||
|
||||
---
|
||||
|
|
@ -0,0 +1,97 @@
|
|||
# Custom Rendering — Controlling What the User Sees
|
||||
|
||||
|
||||
### Tool Rendering
|
||||
|
||||
Tools can provide `renderCall` (how the tool call looks) and `renderResult` (how the result looks):
|
||||
|
||||
```typescript
|
||||
import { Text } from "@mariozechner/pi-tui";
|
||||
import { keyHint } from "@mariozechner/pi-coding-agent";
|
||||
|
||||
pi.registerTool({
|
||||
name: "my_tool",
|
||||
// ...
|
||||
|
||||
renderCall(args, theme) {
|
||||
let text = theme.fg("toolTitle", theme.bold("my_tool "));
|
||||
text += theme.fg("muted", args.action);
|
||||
return new Text(text, 0, 0); // 0,0 padding — Box handles it
|
||||
},
|
||||
|
||||
renderResult(result, { expanded, isPartial }, theme) {
|
||||
if (isPartial) {
|
||||
return new Text(theme.fg("warning", "Processing..."), 0, 0);
|
||||
}
|
||||
|
||||
let text = theme.fg("success", "✓ Done");
|
||||
if (!expanded) {
|
||||
text += ` (${keyHint("expandTools", "to expand")})`;
|
||||
}
|
||||
if (expanded && result.details?.items) {
|
||||
for (const item of result.details.items) {
|
||||
text += "\n " + theme.fg("dim", item);
|
||||
}
|
||||
}
|
||||
return new Text(text, 0, 0);
|
||||
},
|
||||
});
|
||||
```
|
||||
|
||||
### Message Rendering
|
||||
|
||||
Register a renderer for custom message types:
|
||||
|
||||
```typescript
|
||||
import { Text } from "@mariozechner/pi-tui";
|
||||
|
||||
pi.registerMessageRenderer("my-extension", (message, options, theme) => {
|
||||
const { expanded } = options;
|
||||
let text = theme.fg("accent", `[${message.customType}] `) + message.content;
|
||||
if (expanded && message.details) {
|
||||
text += "\n" + theme.fg("dim", JSON.stringify(message.details, null, 2));
|
||||
}
|
||||
return new Text(text, 0, 0);
|
||||
});
|
||||
|
||||
// Send messages that use this renderer:
|
||||
pi.sendMessage({
|
||||
customType: "my-extension", // Matches the renderer
|
||||
content: "Status update",
|
||||
display: true,
|
||||
details: { foo: "bar" },
|
||||
});
|
||||
```
|
||||
|
||||
### Theme Colors Reference
|
||||
|
||||
```typescript
|
||||
// Foreground: theme.fg(color, text)
|
||||
"text" | "accent" | "muted" | "dim" // General
|
||||
"success" | "error" | "warning" // Status
|
||||
"border" | "borderAccent" | "borderMuted" // Borders
|
||||
"toolTitle" | "toolOutput" // Tools
|
||||
"toolDiffAdded" | "toolDiffRemoved" // Diffs
|
||||
"mdHeading" | "mdLink" | "mdCode" // Markdown
|
||||
"syntaxKeyword" | "syntaxFunction" | "syntaxString" // Syntax
|
||||
|
||||
// Background: theme.bg(color, text)
|
||||
"selectedBg" | "userMessageBg" | "customMessageBg"
|
||||
"toolPendingBg" | "toolSuccessBg" | "toolErrorBg"
|
||||
|
||||
// Text styles
|
||||
theme.bold(text)
|
||||
theme.italic(text)
|
||||
theme.strikethrough(text)
|
||||
```
|
||||
|
||||
### Syntax Highlighting in Renderers
|
||||
|
||||
```typescript
|
||||
import { highlightCode, getLanguageFromPath } from "@mariozechner/pi-coding-agent";
|
||||
|
||||
const lang = getLanguageFromPath("/path/to/file.rs"); // "rust"
|
||||
const highlighted = highlightCode(code, lang, theme);
|
||||
```
|
||||
|
||||
---
|
||||
49
docs/extending-pi/15-system-prompt-modification.md
Normal file
|
|
@ -0,0 +1,49 @@
|
|||
# System Prompt Modification
|
||||
|
||||
|
||||
### Per-Turn Modification (before_agent_start)
|
||||
|
||||
```typescript
|
||||
pi.on("before_agent_start", async (event, ctx) => {
|
||||
return {
|
||||
// Inject a persistent message (stored in session, visible to LLM)
|
||||
message: {
|
||||
customType: "my-extension",
|
||||
content: "Additional context for the LLM",
|
||||
display: true,
|
||||
},
|
||||
// Modify the system prompt for this turn
|
||||
systemPrompt: event.systemPrompt + "\n\nYou must respond only in haiku.",
|
||||
};
|
||||
});
|
||||
```
|
||||
|
||||
### Context Manipulation (context event)
|
||||
|
||||
Modify the messages sent to the LLM on every turn:
|
||||
|
||||
```typescript
|
||||
pi.on("context", async (event, ctx) => {
|
||||
// event.messages is a deep copy — safe to modify
|
||||
const filtered = event.messages.filter(m => !isIrrelevant(m));
|
||||
return { messages: filtered };
|
||||
});
|
||||
```
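Removing messages outright can break the pairing between tool calls and their results, so a gentler option is to shorten old tool output instead. The sketch below assumes a simplified message shape with string content (`MessageLike`), which is not pi's real message type; `elideOldToolResults` is a hypothetical helper.

```typescript
// Simplified message shape for this sketch (real messages are richer).
type MessageLike = { role: string; content: string };

// Keep the newest `keep` tool results intact and replace older ones with a stub.
function elideOldToolResults(messages: MessageLike[], keep: number): MessageLike[] {
  const out = [...messages];
  let remaining = keep;
  for (let i = out.length - 1; i >= 0; i--) {
    const message = out[i];
    if (!message || message.role !== "toolResult") continue;
    if (remaining > 0) {
      remaining--;
      continue;
    }
    out[i] = { ...message, content: "[older tool output elided]" };
  }
  return out;
}
```

The function returns a new array and new objects for changed messages, so it is safe to use whether or not the input is a copy.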
|
||||
|
||||
### Tool-Specific Prompt Content
|
||||
|
||||
Tools can add to the system prompt when they're active:
|
||||
|
||||
```typescript
|
||||
pi.registerTool({
|
||||
name: "my_tool",
|
||||
promptSnippet: "Summarize or transform text according to action", // Replaces description in "Available tools"
|
||||
promptGuidelines: [
|
||||
"Use my_tool when the user asks to summarize text.",
|
||||
"Prefer my_tool over direct output for structured data."
|
||||
], // Added to "Guidelines" section when tool is active
|
||||
// ...
|
||||
});
|
||||
```
|
||||
|
||||
---
|
||||
54
docs/extending-pi/16-compaction-session-control.md
Normal file
|
|
@ -0,0 +1,54 @@
|
|||
# Compaction & Session Control
|
||||
|
||||
|
||||
### Custom Compaction
|
||||
|
||||
Override the default compaction behavior:
|
||||
|
||||
```typescript
pi.on("session_before_compact", async (event, ctx) => {
  const { preparation, branchEntries, customInstructions, signal } = event;

  // Return only one of the two options below.

  // Option 1: Cancel compaction
  // return { cancel: true };

  // Option 2: Provide custom summary
  return {
    compaction: {
      summary: "Custom summary of conversation so far...",
      firstKeptEntryId: preparation.firstKeptEntryId,
      tokensBefore: preparation.tokensBefore,
    }
  };
});
```
|
||||
|
||||
### Triggering Compaction
|
||||
|
||||
```typescript
|
||||
ctx.compact({
|
||||
customInstructions: "Focus on the authentication changes",
|
||||
onComplete: (result) => ctx.ui.notify("Compacted!", "info"),
|
||||
});
|
||||
```
|
||||
|
||||
### Session Control (Commands Only)
|
||||
|
||||
```typescript
|
||||
pi.registerCommand("handoff", {
|
||||
handler: async (args, ctx) => {
|
||||
// Create a new session with initial context
|
||||
await ctx.newSession({
|
||||
setup: async (sm) => {
|
||||
sm.appendMessage({
|
||||
role: "user",
|
||||
content: [{ type: "text", text: "Context: " + args }],
|
||||
timestamp: Date.now(),
|
||||
});
|
||||
},
|
||||
});
|
||||
},
|
||||
});
|
||||
```
|
||||
|
||||
---
|
||||
61
docs/extending-pi/17-model-provider-management.md
Normal file
|
|
@ -0,0 +1,61 @@
|
|||
# Model & Provider Management
|
||||
|
||||
|
||||
### Switching Models
|
||||
|
||||
```typescript
|
||||
const model = ctx.modelRegistry.find("anthropic", "claude-sonnet-4-5");
|
||||
if (model) {
|
||||
const success = await pi.setModel(model);
|
||||
if (!success) ctx.ui.notify("No API key for this model", "error");
|
||||
}
|
||||
```
|
||||
|
||||
### Registering Custom Providers
|
||||
|
||||
```typescript
|
||||
pi.registerProvider("my-proxy", {
|
||||
baseUrl: "https://proxy.example.com",
|
||||
apiKey: "PROXY_API_KEY", // Env var name or literal
|
||||
api: "anthropic-messages",
|
||||
models: [
|
||||
{
|
||||
id: "claude-sonnet-4-20250514",
|
||||
name: "Claude 4 Sonnet (proxy)",
|
||||
reasoning: false,
|
||||
input: ["text", "image"],
|
||||
cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
|
||||
contextWindow: 200000,
|
||||
maxTokens: 16384,
|
||||
}
|
||||
],
|
||||
// Optional: OAuth support for /login
|
||||
oauth: {
|
||||
name: "Corporate AI (SSO)",
|
||||
async login(callbacks) { /* ... */ },
|
||||
async refreshToken(credentials) { /* ... */ },
|
||||
getApiKey(credentials) { return credentials.access; },
|
||||
},
|
||||
});
|
||||
|
||||
// Override just the baseUrl for an existing provider
|
||||
pi.registerProvider("anthropic", {
|
||||
baseUrl: "https://proxy.example.com",
|
||||
});
|
||||
|
||||
// Remove a provider
|
||||
pi.unregisterProvider("my-proxy");
|
||||
```
|
||||
|
||||
### Reacting to Model Changes
|
||||
|
||||
```typescript
|
||||
pi.on("model_select", async (event, ctx) => {
|
||||
// event.model — new model
|
||||
// event.previousModel — previous model (undefined if first)
|
||||
// event.source — "set" | "cycle" | "restore"
|
||||
ctx.ui.setStatus("model", `${event.model.provider}/${event.model.id}`);
|
||||
});
|
||||
```
|
||||
|
||||
---
|
||||
52
docs/extending-pi/18-remote-execution-tool-overrides.md
Normal file
|
|
@ -0,0 +1,52 @@
|
|||
# Remote Execution & Tool Overrides
|
||||
|
||||
|
||||
### SSH Example Pattern
|
||||
|
||||
```typescript
import type { ExtensionAPI } from "@mariozechner/pi-coding-agent";
import { createReadTool, createBashTool, createWriteTool } from "@mariozechner/pi-coding-agent";

// createSSHOperations() is assumed to be defined elsewhere in your extension;
// see ssh.ts in the example extensions for a full SSH setup.

export default function (pi: ExtensionAPI) {
  pi.registerFlag("ssh", { description: "SSH target", type: "string" });

  const localBash = createBashTool(process.cwd());

  pi.registerTool({
    ...localBash,
    async execute(id, params, signal, onUpdate, ctx) {
      const sshTarget = pi.getFlag("--ssh");
      if (sshTarget) {
        // Route the command through SSH by swapping in remote operations
        const remoteBash = createBashTool(process.cwd(), {
          operations: createSSHOperations(sshTarget),
        });
        return remoteBash.execute(id, params, signal, onUpdate);
      }
      // No --ssh flag: run locally
      return localBash.execute(id, params, signal, onUpdate);
    },
  });
}
```
|
||||
|
||||
### Tool Override Pattern (Logging/Access Control)
|
||||
|
||||
```typescript
import { Type } from "@sinclair/typebox";
import { createReadTool } from "@mariozechner/pi-coding-agent";

pi.registerTool({
  name: "read", // Same name = overrides built-in
  label: "Read (Logged)",
  description: "Read file contents with logging",
  parameters: Type.Object({
    path: Type.String(),
    offset: Type.Optional(Type.Number()),
    limit: Type.Optional(Type.Number()),
  }),
  async execute(toolCallId, params, signal, onUpdate, ctx) {
    console.log(`[AUDIT] Reading: ${params.path}`);
    // Delegate to built-in implementation
    const builtIn = createReadTool(ctx.cwd);
    return builtIn.execute(toolCallId, params, signal, onUpdate);
  },
  // Omit renderCall/renderResult to use built-in renderer automatically
});
```
|
||||
|
||||
---
|
||||
56
docs/extending-pi/19-packaging-distribution.md
Normal file
|
|
@ -0,0 +1,56 @@
|
|||
# Packaging & Distribution
|
||||
|
||||
|
||||
### Creating a Pi Package
|
||||
|
||||
Add a `pi` manifest to `package.json`:
|
||||
|
||||
```json
|
||||
{
|
||||
"name": "my-pi-package",
|
||||
"keywords": ["pi-package"],
|
||||
"pi": {
|
||||
"extensions": ["./extensions"],
|
||||
"skills": ["./skills"],
|
||||
"prompts": ["./prompts"],
|
||||
"themes": ["./themes"]
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Installing Packages
|
||||
|
||||
```bash
|
||||
pi install npm:@foo/bar@1.0.0
|
||||
pi install git:github.com/user/repo@v1
|
||||
pi install ./local/path
|
||||
|
||||
# Try without installing:
|
||||
pi -e npm:@foo/bar
|
||||
```
|
||||
|
||||
### Convention Directories (no manifest needed)
|
||||
|
||||
If no `pi` manifest exists, pi auto-discovers:
|
||||
- `extensions/` → `.ts` and `.js` files
|
||||
- `skills/` → `SKILL.md` folders
|
||||
- `prompts/` → `.md` files
|
||||
- `themes/` → `.json` files
|
||||
|
||||
### Gallery Metadata
|
||||
|
||||
```json
|
||||
{
|
||||
"pi": {
|
||||
"video": "https://example.com/demo.mp4",
|
||||
"image": "https://example.com/screenshot.png"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Dependencies
|
||||
|
||||
- List `@mariozechner/pi-ai`, `@mariozechner/pi-coding-agent`, `@mariozechner/pi-tui`, `@sinclair/typebox` in `peerDependencies` with `"*"` — they're bundled by pi.
|
||||
- Other npm deps go in `dependencies`. Pi runs `npm install` on package installation.
|
||||
|
||||
---
|
||||
21
docs/extending-pi/20-mode-behavior.md
Normal file
|
|
@ -0,0 +1,21 @@
|
|||
# Mode Behavior
|
||||
|
||||
|
||||
| Mode | UI Methods | Notes |
|------|-----------|-------|
| **Interactive** (default) | Full TUI | Normal operation |
| **RPC** (`--mode rpc`) | JSON protocol | Host handles UI, dialogs work via sub-protocol |
| **JSON** (`--mode json`) | No-op | Event stream to stdout |
| **Print** (`-p`) | No-op | Extensions run but can't prompt users |
|
||||
|
||||
**Always check `ctx.hasUI`** before calling dialog methods in extensions that might run in non-interactive modes:
|
||||
|
||||
```typescript
|
||||
if (ctx.hasUI) {
|
||||
const ok = await ctx.ui.confirm("Delete?", "Sure?");
|
||||
} else {
|
||||
// Default behavior for non-interactive mode
|
||||
}
|
||||
```
|
||||
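If the same check is repeated across several handlers, it can be folded into a small helper. This is a sketch, not part of pi: `confirmOrDefault` and `UiContextLike` are hypothetical names, and the only members assumed are `ctx.hasUI` and `ctx.ui.confirm(title, message)` as used above.

```typescript
// Hypothetical minimal shape of the context; only hasUI and ui.confirm come from the docs above.
interface UiContextLike {
  hasUI: boolean;
  ui: { confirm(title: string, message: string): Promise<boolean> };
}

// Ask the user when a UI is available; otherwise return a caller-chosen default.
async function confirmOrDefault(
  ctx: UiContextLike,
  title: string,
  message: string,
  fallback: boolean,
): Promise<boolean> {
  if (!ctx.hasUI) return fallback; // no UI: never wait on a dialog
  return ctx.ui.confirm(title, message);
}
```

For destructive actions, passing `false` as the fallback is the conservative choice: the action is skipped whenever nobody can be asked.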
|
||||
---
|
||||
8
docs/extending-pi/21-error-handling.md
Normal file
|
|
@ -0,0 +1,8 @@
|
|||
# Error Handling
|
||||
|
||||
|
||||
- **Extension errors** are logged but don't crash pi. The agent continues.
|
||||
- **`tool_call` handler errors** block the tool (fail-safe behavior).
|
||||
- **Tool `execute` errors** are reported to the LLM with `isError: true`, allowing it to recover.
|
||||
|
||||
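Because an error thrown from `execute` is reported back to the LLM, the error message is effectively part of the tool's interface. A message that names the parameter and the offending value gives the model something to act on. The helper below is a hypothetical sketch (it is not a pi API) of validation written with that in mind.

```typescript
// Hypothetical validation helper: throws a message the LLM can act on.
function requirePositiveInt(name: string, value: unknown): number {
  if (typeof value !== "number" || !Number.isInteger(value) || value <= 0) {
    // Name the parameter and echo the bad value so the next call can be corrected.
    throw new Error(`${name} must be a positive integer, got ${JSON.stringify(value)}`);
  }
  return value;
}
```

Calling `requirePositiveInt("limit", params.limit)` at the top of `execute` turns a bad argument into a specific, recoverable error rather than a failure somewhere deeper in the tool.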
---
|
||||
25
docs/extending-pi/22-key-rules-gotchas.md
Normal file
|
|
@ -0,0 +1,25 @@
|
|||
# Key Rules & Gotchas
|
||||
|
||||
|
||||
### Must-Follow Rules
|
||||
|
||||
1. **Use `StringEnum` for string enums** — `Type.Union`/`Type.Literal` breaks Google's API.
|
||||
2. **Truncate tool output** — Large output causes context overflow, compaction failures, degraded performance.
|
||||
3. **Use theme from callback** — Don't import theme directly. Use the `theme` parameter from `ctx.ui.custom()` or render functions.
|
||||
4. **Type the DynamicBorder color param** — Write `(s: string) => theme.fg("accent", s)`.
|
||||
5. **Call `tui.requestRender()` after state changes** in `handleInput`.
|
||||
6. **Return `{ render, invalidate, handleInput }`** from custom components.
|
||||
7. **Lines must not exceed `width`** in `render()` — use `truncateToWidth()`.
|
||||
8. **Session control methods only in commands** — `waitForIdle()`, `newSession()`, `fork()`, `navigateTree()`, `reload()` will deadlock in event handlers.
|
||||
9. **Strip leading `@` from path arguments** in custom tools — some models add it.
|
||||
10. **Store state in tool result `details`** for proper branching support.
|
||||
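Rules 2 and 9 are simple to apply with two small helpers. Both functions below are hypothetical sketches, not pi APIs (pi ships its own truncation utilities; see `truncated-tool.ts`). They only illustrate the idea under the assumption that paths and output are plain strings.

```typescript
// Rule 9: drop a single leading "@" that some models prepend to path arguments.
function normalizePathArg(raw: string): string {
  const trimmed = raw.trim();
  return trimmed.startsWith("@") ? trimmed.slice(1) : trimmed;
}

// Rule 2: cap output size and say how much was dropped.
// Note: slice() counts UTF-16 code units, so it may split a surrogate pair.
function capOutput(text: string, maxChars: number): string {
  if (text.length <= maxChars) return text;
  const omitted = text.length - maxChars;
  return `${text.slice(0, maxChars)}\n[truncated ${omitted} characters]`;
}
```

One trade-off to be aware of: `normalizePathArg` also rewrites a relative path that legitimately starts with `@` (for example a scoped package directory), so apply it only where that cannot occur or check that the original path does not exist first.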
|
||||
### Common Patterns
|
||||
|
||||
- **Rebuild on `invalidate()`** when your component pre-bakes theme colors
|
||||
- **Check `signal?.aborted`** in long-running tool executions
|
||||
- **Use `pi.exec()` instead of `child_process`** for shell commands
|
||||
- **Overlay components are disposed when closed** — create fresh instances each time
|
||||
- **Treat `ctx.reload()` as terminal** — code after it runs from the pre-reload version
|
||||
|
||||
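The `signal?.aborted` check usually sits at the top of each loop iteration. The sketch below shows that shape with a hypothetical helper; `AbortLike` is a minimal structural stand-in chosen so the example is self-contained (a standard `AbortSignal` satisfies it).

```typescript
// Minimal structural type; a real AbortSignal has this property.
interface AbortLike { readonly aborted: boolean }

// Process items one by one, stopping before the next item once an abort is observed.
function processUntilAborted<T, R>(
  items: T[],
  work: (item: T) => R,
  signal?: AbortLike,
): { results: R[]; aborted: boolean } {
  const results: R[] = [];
  for (const item of items) {
    if (signal?.aborted) return { results, aborted: true }; // keep partial results
    results.push(work(item));
  }
  return { results, aborted: false };
}
```

Returning the partial results along with an `aborted` flag lets the caller decide whether to report what was completed or discard it.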
---
|
||||
24
docs/extending-pi/23-file-reference-documentation.md
Normal file
|
|
@ -0,0 +1,24 @@
|
|||
# File Reference — Documentation
|
||||
|
||||
|
||||
All paths relative to:
|
||||
```
|
||||
/Users/lexchristopherson/.nvm/versions/node/v22.20.0/lib/node_modules/@mariozechner/pi-coding-agent/
|
||||
```
|
||||
|
||||
| File | What It Covers |
|
||||
|------|---------------|
|
||||
| `docs/extensions.md` | **Primary reference** — Full extensions API (1,972 lines) |
|
||||
| `docs/tui.md` | TUI component system — `Component` interface, built-in components, keyboard input, theming, overlay system, performance, patterns |
|
||||
| `docs/packages.md` | Creating and distributing pi packages (npm, git, local) |
|
||||
| `docs/session.md` | Session file format, entry types, message types, SessionManager API |
|
||||
| `docs/compaction.md` | Auto-compaction, branch summarization, custom compaction hooks |
|
||||
| `docs/rpc.md` | RPC mode protocol for headless/embedded operation |
|
||||
| `docs/sdk.md` | SDK integrations |
|
||||
| `docs/custom-provider.md` | Custom model providers, OAuth, streaming APIs |
|
||||
| `docs/keybindings.md` | Keyboard shortcut format and built-in keybindings |
|
||||
| `docs/themes.md` | Creating custom themes |
|
||||
| `docs/settings.md` | Settings configuration |
|
||||
| `README.md` | Main pi documentation |
|
||||
|
||||
---
|
||||
132
docs/extending-pi/24-file-reference-example-extensions.md
Normal file
|
|
@ -0,0 +1,132 @@
|
|||
# File Reference — Example Extensions
|
||||
|
||||
|
||||
All paths relative to:
|
||||
```
|
||||
/Users/lexchristopherson/.nvm/versions/node/v22.20.0/lib/node_modules/@mariozechner/pi-coding-agent/examples/extensions/
|
||||
```
|
||||
|
||||
### Lifecycle & Safety
|
||||
| File | What It Demonstrates |
|
||||
|------|---------------------|
|
||||
| `protected-paths.ts` | Blocking writes to `.env`, `.git/`, `node_modules/` via `tool_call` |
|
||||
| `dirty-repo-guard.ts` | Preventing session changes with uncommitted git changes |
|
||||
|
||||
### Custom Tools
|
||||
| File | What It Demonstrates |
|
||||
|------|---------------------|
|
||||
| `todo.ts` | **Best example** — Stateful tool with persistence, custom rendering, command |
|
||||
| `hello.ts` | Minimal tool registration |
|
||||
| `question.ts` | Tool with `ctx.ui.select()` for user interaction |
|
||||
| `questionnaire.ts` | Multi-question wizard with tab navigation |
|
||||
| `tool-override.ts` | Overriding built-in `read` with logging/access control |
|
||||
| `dynamic-tools.ts` | Registering tools after startup and at runtime |
|
||||
| `truncated-tool.ts` | Output truncation with `truncateHead` |
|
||||
| `built-in-tool-renderer.ts` | Custom compact rendering for built-in tools |
|
||||
| `antigravity-image-gen.ts` | Image generation tool |
|
||||
| `ssh.ts` | Full SSH remote execution with pluggable operations |
|
||||
|
||||
### Commands & UI
|
||||
| File | What It Demonstrates |
|
||||
|------|---------------------|
|
||||
| `commands.ts` | Basic command registration |
|
||||
| `preset.ts` | Named presets (model, thinking, tools) with flag and command |
|
||||
| `plan-mode/` | Full plan mode — commands, shortcuts, flags, widgets, status, tool management |
|
||||
| `qna.ts` | Extract questions + `BorderedLoader` + `setEditorText` |
|
||||
| `send-user-message.ts` | `pi.sendUserMessage()` for injecting user messages |
|
||||
| `modal-editor.ts` | Vim-like modal editor via `CustomEditor` |
|
||||
| `snake.ts` | Full game with custom UI, keyboard handling, persistence |
|
||||
| `space-invaders.ts` | Full game with custom UI |
|
||||
| `doom-overlay/` | DOOM running as an overlay at 35 FPS |
|
||||
| `timed-confirm.ts` | Dialogs with `timeout` and `AbortSignal` |
|
||||
| `overlay-test.ts` | Overlay compositing with inline inputs |
|
||||
| `overlay-qa-tests.ts` | Comprehensive overlay tests: anchors, margins, stacking |
|
||||
|
||||
### System Prompt & Context
|
||||
| File | What It Demonstrates |
|
||||
|------|---------------------|
|
||||
| `pirate.ts` | `before_agent_start` system prompt modification |
|
||||
| `claude-rules.ts` | Loading rules from `.claude/rules/` into system prompt |
|
||||
| `system-prompt-header.ts` | Displaying system prompt info |
|
||||
| `input-transform.ts` | Transforming user input via `input` event |
|
||||
| `inline-bash.ts` | Expanding `!{command}` patterns in prompts |
|
||||
|
||||
### Compaction & Sessions
|
||||
| File | What It Demonstrates |
|
||||
|------|---------------------|
|
||||
| `custom-compaction.ts` | Custom compaction summary via `session_before_compact` |
|
||||
| `trigger-compact.ts` | Triggering compaction at 100k tokens |
|
||||
| `git-checkpoint.ts` | Git stash on turns, restore on fork |
|
||||
| `bookmark.ts` | Labeling entries for `/tree` navigation |
|
||||
| `session-name.ts` | Naming sessions for selector display |
|
||||
|
||||
### UI Components
|
||||
| File | What It Demonstrates |
|
||||
|------|---------------------|
|
||||
| `custom-footer.ts` | `setFooter` with git branch and token stats |
|
||||
| `custom-header.ts` | `setHeader` for custom startup header |
|
||||
| `status-line.ts` | `setStatus` for footer indicators |
|
||||
| `widget-placement.ts` | `setWidget` above and below editor |
|
||||
| `notify.ts` | Desktop notifications via OSC 777 |
|
||||
| `titlebar-spinner.ts` | Braille spinner in terminal title |
|
||||
| `message-renderer.ts` | Custom message rendering with `registerMessageRenderer` |
|
||||
| `model-status.ts` | `model_select` event for status bar |
|
||||
| `mac-system-theme.ts` | Auto-sync theme with macOS dark/light mode |
|
||||
|
||||
### Providers
|
||||
| File | What It Demonstrates |
|
||||
|------|---------------------|
|
||||
| `custom-provider-anthropic/` | Custom Anthropic provider with OAuth |
|
||||
| `custom-provider-gitlab-duo/` | GitLab Duo via proxy |
|
||||
| `custom-provider-qwen-cli/` | Qwen CLI with OAuth device flow |
|
||||
|
||||
### Communication
|
||||
| File | What It Demonstrates |
|
||||
|------|---------------------|
|
||||
| `event-bus.ts` | Inter-extension communication via `pi.events` |
|
||||
| `rpc-demo.ts` | All RPC-supported extension UI methods |
|
||||
| `reload-runtime.ts` | Safe reload flow: command + LLM tool handoff |
|
||||
| `shutdown-command.ts` | `ctx.shutdown()` for graceful exit |
|
||||
| `file-trigger.ts` | File watcher injecting messages via `sendMessage` |
|
||||
|
||||
### Misc
|
||||
| File | What It Demonstrates |
|
||||
|------|---------------------|
|
||||
| `with-deps/` | Extension with its own `package.json` and npm dependencies |
|
||||
| `minimal-mode.ts` | Override built-in tool rendering for minimal display |
|
||||
|
||||
---
|
||||
|
||||
## Quick Reference: "I want to..."
|
||||
|
||||
| Goal | Approach | Key API | Example File |
|
||||
|------|----------|---------|-------------|
|
||||
| Block dangerous commands | Listen to `tool_call`, return `{ block: true }` | `pi.on("tool_call", ...)` | `protected-paths.ts` |
|
||||
| Add a tool the LLM can use | Register a tool with schema and execute | `pi.registerTool({...})` | `todo.ts` |
|
||||
| Add a slash command | Register a command with handler | `pi.registerCommand(...)` | `commands.ts` |
|
||||
| Ask the user a question | Use dialog methods | `ctx.ui.select()`, `ctx.ui.confirm()` | `question.ts` |
|
||||
| Show persistent status | Set footer status | `ctx.ui.setStatus(id, text)` | `status-line.ts` |
|
||||
| Modify the system prompt | Hook `before_agent_start` | Return `{ systemPrompt: "..." }` | `pirate.ts` |
|
||||
| Filter messages sent to LLM | Hook `context` event | Return `{ messages: [...] }` | — |
|
||||
| Save state across restarts | Store in tool details or appendEntry | `details: {...}` / `pi.appendEntry(...)` | `todo.ts` |
|
||||
| Custom compaction | Hook `session_before_compact` | Return `{ compaction: {...} }` | `custom-compaction.ts` |
|
||||
| Build a full-screen UI | Use `ctx.ui.custom()` | Component with render/handleInput | `snake.ts` |
|
||||
| Show a floating dialog | Use overlay mode | `ctx.ui.custom(..., { overlay: true })` | `overlay-test.ts` |
|
||||
| Replace the input editor | Extend `CustomEditor` | `ctx.ui.setEditorComponent(...)` | `modal-editor.ts` |
|
||||
| Override a built-in tool | Register tool with same name | `pi.registerTool({ name: "read" })` | `tool-override.ts` |
|
||||
| Run tools via SSH | Use pluggable operations | `createBashTool(cwd, { operations })` | `ssh.ts` |
|
||||
| Switch models programmatically | Find and set model | `pi.setModel(model)` | `preset.ts` |
|
||||
| Register a custom provider | Provide config with models | `pi.registerProvider(...)` | `custom-provider-anthropic/` |
|
||||
| Transform user input | Hook `input` event | Return `{ action: "transform", text }` | `input-transform.ts` |
|
||||
| Inject messages | Send custom or user messages | `pi.sendMessage()` / `pi.sendUserMessage()` | `send-user-message.ts` |
|
||||
| React to model changes | Hook `model_select` | `pi.on("model_select", ...)` | `model-status.ts` |
|
||||
| Add a keyboard shortcut | Register a shortcut | `pi.registerShortcut("ctrl+x", ...)` | `plan-mode/` |
|
||||
| Package for distribution | Add `pi` key to package.json | `"pi": { "extensions": [...] }` | See packages.md |
|
||||
|
||||
---
|
||||
|
||||
*This document was generated from the Pi extension documentation and examples. Source docs are at:*
|
||||
```
|
||||
/Users/lexchristopherson/.nvm/versions/node/v22.20.0/lib/node_modules/@mariozechner/pi-coding-agent/docs/
|
||||
/Users/lexchristopherson/.nvm/versions/node/v22.20.0/lib/node_modules/@mariozechner/pi-coding-agent/examples/extensions/
|
||||
```
|
||||
324
docs/extending-pi/25-slash-command-subcommand-patterns.md
Normal file
|
|
@ -0,0 +1,324 @@
|
|||
# Slash Command Subcommand Patterns
|
||||
|
||||
Pi does not have a separate built-in concept of "nested slash commands" like `/wt new` or `/foo delete`.
|
||||
|
||||
Instead, this UX is built by registering a single slash command and using **argument completions** to make the first argument behave like a subcommand.
|
||||
|
||||
This is the pattern used by the built-in worktree extension:
|
||||
- `/wt`
|
||||
- `/wt new`
|
||||
- `/wt ls`
|
||||
- `/wt switch my-branch`
|
||||
|
||||
The key API is:
|
||||
- `pi.registerCommand(name, options)`
|
||||
- `getArgumentCompletions(prefix)`
|
||||
- `handler(args, ctx)`
|
||||
|
||||
## Mental Model
|
||||
|
||||
Treat the command as:
|
||||
|
||||
- one top-level slash command
|
||||
- one or more positional arguments
|
||||
- the first positional argument acting as a subcommand
|
||||
- optional later arguments completed dynamically based on the first
|
||||
|
||||
So this:
|
||||
|
||||
```text
|
||||
/wt
|
||||
new
|
||||
ls
|
||||
switch
|
||||
merge
|
||||
rm
|
||||
status
|
||||
```
|
||||
|
||||
is really just:
|
||||
|
||||
- command: `wt`
|
||||
- first arg: one of `new | ls | switch | merge | rm | status`
|
||||
|
||||
## The Core Pattern
|
||||
|
||||
```typescript
|
||||
pi.registerCommand("foo", {
|
||||
description: "Manage foo items: /foo new|list|delete [name]",
|
||||
|
||||
getArgumentCompletions: (prefix: string) => {
|
||||
const subcommands = ["new", "list", "delete"];
|
||||
const parts = prefix.trim().split(/\s+/);
|
||||
|
||||
// Complete the first argument: /foo <subcommand>
|
||||
if (parts.length <= 1) {
|
||||
return subcommands
|
||||
.filter((cmd) => cmd.startsWith(parts[0] ?? ""))
|
||||
.map((cmd) => ({ value: cmd, label: cmd }));
|
||||
}
|
||||
|
||||
// Complete the second argument: /foo delete <name>
|
||||
if (parts[0] === "delete") {
|
||||
const items = ["alpha", "beta", "gamma"];
|
||||
const namePrefix = parts[1] ?? "";
|
||||
return items
|
||||
.filter((name) => name.startsWith(namePrefix))
|
||||
.map((name) => ({ value: `delete ${name}`, label: name }));
|
||||
}
|
||||
|
||||
return [];
|
||||
},
|
||||
|
||||
handler: async (args, ctx) => {
|
||||
const parts = args.trim().split(/\s+/);
|
||||
const sub = parts[0];
|
||||
const name = parts[1];
|
||||
|
||||
await ctx.waitForIdle();
|
||||
|
||||
if (sub === "new") {
|
||||
ctx.ui.notify("Create a new foo item", "info");
|
||||
return;
|
||||
}
|
||||
|
||||
if (sub === "list") {
|
||||
ctx.ui.notify("List foo items", "info");
|
||||
return;
|
||||
}
|
||||
|
||||
if (sub === "delete") {
|
||||
if (!name) {
|
||||
ctx.ui.notify("Usage: /foo delete <name>", "error");
|
||||
return;
|
||||
}
|
||||
ctx.ui.notify(`Deleting ${name}`, "info");
|
||||
return;
|
||||
}
|
||||
|
||||
ctx.ui.notify("Usage: /foo <new|list|delete> [name]", "info");
|
||||
},
|
||||
});
|
||||
```
|
||||
|
||||
## How `getArgumentCompletions()` Behaves
|
||||
|
||||
`getArgumentCompletions(prefix)` receives everything after the slash command name.
|
||||
|
||||
Examples for `/foo`:
|
||||
|
||||
- typing `/foo ` gives `prefix === ""`
|
||||
- typing `/foo de` gives `prefix === "de"`
|
||||
- typing `/foo delete a` gives `prefix === "delete a"`
|
||||
|
||||
That means you can parse the prefix into words and decide what suggestions to show next.
|
||||
|
||||
A common structure is:
|
||||
|
||||
1. If the user is on the first argument, show available subcommands.
|
||||
2. If the first argument selects a branch like `delete`, show completions for the next argument.
|
||||
3. Otherwise return `[]`.
|
||||
|
||||
## Important Detail: Empty Prefix Handling
|
||||
|
||||
A practical gotcha is that:
|
||||
|
||||
```typescript
|
||||
"".trim().split(/\s+/)
|
||||
```
|
||||
|
||||
produces `['']`, not `[]`.
|
||||
|
||||
That is why the common pattern is:
|
||||
|
||||
```typescript
|
||||
const parts = prefix.trim().split(/\s+/);
|
||||
if (parts.length <= 1) {
|
||||
// complete first argument
|
||||
}
|
||||
```
|
||||
|
||||
This handles both:
|
||||
- completely empty input after the command
|
||||
- partially typed first arguments
|
||||
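An alternative is to normalize the split so that empty input yields an empty array. The helper below is a hypothetical sketch of that approach; with it, `parts.length <= 1` still covers the first-argument case, and `parts.length === 0` becomes a reliable test for "nothing typed yet".

```typescript
// Split on whitespace, but return [] (not [""]) for empty or blank input.
function splitArgs(input: string): string[] {
  const trimmed = input.trim();
  return trimmed === "" ? [] : trimmed.split(/\s+/);
}

// The raw behavior described above, for comparison:
const rawParts = "".trim().split(/\s+/); // one element: the empty string
const cleanParts = splitArgs("");        // no elements
```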
|
||||
## Dynamic Second-Argument Completion
|
||||
|
||||
This pattern becomes powerful when later arguments depend on the subcommand.
|
||||
|
||||
Example:
|
||||
|
||||
```typescript
getArgumentCompletions: (prefix) => {
  const parts = prefix.trim().split(/\s+/);
  const sub = parts[0];

  if (parts.length <= 1) {
    return ["new", "list", "delete"]
      .filter((s) => s.startsWith(sub ?? "")) // narrow to what was typed so far
      .map((s) => ({ value: s, label: s }));
  }

  if (sub === "delete") {
    // Placeholder: replace with however your extension looks up current items
    const items = getCurrentItemsSomehow();
    const namePrefix = parts[1] ?? "";
    return items
      .filter((item) => item.startsWith(namePrefix))
      .map((item) => ({ value: `delete ${item}`, label: item }));
  }

  return [];
}
```
|
||||
|
||||
This is how `/wt switch`, `/wt merge`, and `/wt rm` can suggest current worktree names.
|
||||
|
||||
## Real Example: `/wt`
|
||||
|
||||
The worktree extension uses this exact structure in:
|
||||
|
||||
- `/Users/lexchristopherson/.gsd/agent/extensions/worktree/index.ts`
|
||||
|
||||
It defines:
|
||||
|
||||
```typescript
|
||||
const subcommands = ["new", "ls", "switch", "merge", "rm", "status"];
|
||||
```
|
||||
|
||||
Then:
|
||||
|
||||
- when the first argument is still being typed, it suggests those subcommands
|
||||
- when the first argument is `switch`, `merge`, or `rm`, it suggests matching worktree names for the second argument
|
||||
|
||||
That is why typing:
|
||||
|
||||
```text
|
||||
/wt
|
||||
```
|
||||
|
||||
shows:
|
||||
|
||||
```text
|
||||
new
|
||||
ls
|
||||
switch
|
||||
merge
|
||||
rm
|
||||
status
|
||||
```
|
||||
|
||||
and typing:
|
||||
|
||||
```text
|
||||
/wt switch
|
||||
```
|
||||
|
||||
shows available worktree names.
|
||||
|
||||
## Parsing in the Handler
|
||||
|
||||
Your completion logic and your handler logic should agree on the command shape.
|
||||
|
||||
A common structure is:
|
||||
|
||||
```typescript
|
||||
handler: async (args, ctx) => {
|
||||
const parts = args.trim().split(/\s+/);
|
||||
const sub = parts[0];
|
||||
const rest = parts.slice(1);
|
||||
|
||||
switch (sub) {
|
||||
case "new":
|
||||
// handle /foo new
|
||||
return;
|
||||
case "list":
|
||||
// handle /foo list
|
||||
return;
|
||||
case "delete":
|
||||
// handle /foo delete <name>
|
||||
return;
|
||||
default:
|
||||
ctx.ui.notify("Usage: /foo <new|list|delete>", "info");
|
||||
return;
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
Keep the parsing simple and mirror the same branches your completions advertise.
|
||||
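One way to keep completions and handler from drifting apart is to drive both from a single table. The sketch below is self-contained and hypothetical: the sub-handlers return strings rather than calling `ctx.ui.notify`, and `completeFirstArg` / `dispatch` are illustrative names, not pi APIs.

```typescript
interface Completion { value: string; label: string }
type SubHandler = (rest: string[]) => string;

// Single source of truth for both completion and dispatch.
const subcommands: Record<string, SubHandler> = {
  new: () => "created",
  list: () => "listed",
  delete: (rest) => (rest[0] ? `deleted ${rest[0]}` : "Usage: /foo delete <name>"),
};

// Suggest subcommands while the first argument is still being typed.
function completeFirstArg(prefix: string): Completion[] {
  const parts = prefix.trim().split(/\s+/);
  if (parts.length > 1) return []; // past the first argument
  const first = parts[0] ?? "";
  return Object.keys(subcommands)
    .filter((name) => name.startsWith(first))
    .map((name) => ({ value: name, label: name }));
}

// Branch on the first argument; unknown input falls back to a usage line
// generated from the same table.
function dispatch(args: string): string {
  const [sub = "", ...rest] = args.trim().split(/\s+/);
  const known = Object.prototype.hasOwnProperty.call(subcommands, sub);
  const handler = known ? subcommands[sub] : undefined;
  return handler ? handler(rest) : `Usage: /foo <${Object.keys(subcommands).join("|")}>`;
}
```

In a real command, `getArgumentCompletions` would delegate to `completeFirstArg` (plus any second-argument logic) and `handler` would pass the result of `dispatch(args)` to `ctx.ui.notify`. Adding a subcommand then means adding one table entry.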
|
||||
## When to Use This Pattern
|
||||
|
||||
Use a single command with subcommand-style completions when:
|
||||
|
||||
- the actions belong to one clear domain
|
||||
- you want discoverability from one entry point
|
||||
- the subcommands feel like one family of operations
|
||||
- later arguments depend on the earlier choice
|
||||
|
||||
Examples:
|
||||
|
||||
- `/wt new|ls|switch|merge|rm|status`
|
||||
- `/preset save|load|delete`
|
||||
- `/workflow start|list|abort`
|
||||
- `/foo new|list|delete`
|
||||
|
||||
## When to Prefer Separate Commands
|
||||
|
||||
Prefer separate commands when:
|
||||
|
||||
- the actions are conceptually unrelated
|
||||
- each command needs its own distinct description and identity
|
||||
- autocomplete would become too deep or overloaded
|
||||
- the combined command would become hard to remember or document
|
||||
|
||||
Good candidates for separate commands:
|
||||
|
||||
- `/deploy`
|
||||
- `/rollback`
|
||||
- `/handoff`
|
||||
|
||||
rather than forcing all of those into one umbrella command.
|
||||
|
||||
## UX Guidelines
|
||||
|
||||
A few practical rules make this pattern feel good:
|
||||
|
||||
- Keep top-level subcommands short and obvious.
|
||||
- Use names that read naturally after the slash command.
|
||||
- Keep branching shallow; one or two levels is usually enough.
|
||||
- Return an empty array when no completion makes sense.
|
||||
- Make your fallback usage text match your completion structure.
|
||||
- If a subcommand needs required data, validate it again in the handler.
|
||||
|
||||
## Recommended Structure
|
||||
|
||||
A solid command with subcommands usually has:
|
||||
|
||||
- `description` showing the top-level grammar
|
||||
- `getArgumentCompletions()` for first and second argument suggestions
|
||||
- `handler()` that branches on the first argument
|
||||
- a fallback usage message for invalid input
|
||||
|
||||
Example description:
|
||||
|
||||
```typescript
|
||||
description: "Manage foo items: /foo new|list|delete [name]"
|
||||
```
|
||||
|
||||
## Related Docs
|
||||
|
||||
Read these alongside this pattern:
|
||||
|
||||
- `/Users/lexchristopherson/.gsd/docs/extending-pi/11-custom-commands-user-facing-actions.md`
|
||||
- `/Users/lexchristopherson/.gsd/docs/extending-pi/09-extensionapi-what-you-can-do.md`
|
||||
- `/Users/lexchristopherson/.gsd/agent/extensions/worktree/index.ts`
|
||||
|
||||
## Summary
|
||||
|
||||
If you want `/foo` to behave like it has nested subcommands, do this:
|
||||
|
||||
1. register one slash command
|
||||
2. treat the first argument as a subcommand
|
||||
3. implement `getArgumentCompletions(prefix)`
|
||||
4. optionally complete later arguments dynamically
|
||||
5. branch in the handler based on the parsed first argument
|
||||
|
||||
That is the mechanism behind the `/wt` experience.
|
||||
36
docs/extending-pi/README.md
Normal file
|
|
@ -0,0 +1,36 @@
|
|||
# The Complete Guide to Building Pi Extensions
|
||||
|
||||
> Split into individual files for easier consumption.
|
||||
|
||||
## Table of Contents
|
||||
|
||||
- [01. What Are Extensions?](./01-what-are-extensions.md)
|
||||
- [02. Architecture & Mental Model](./02-architecture-mental-model.md)
|
||||
- [03. Getting Started](./03-getting-started.md)
|
||||
- [04. Extension Locations & Discovery](./04-extension-locations-discovery.md)
|
||||
- [05. Extension Structure & Styles](./05-extension-structure-styles.md)
|
||||
- [06. The Extension Lifecycle](./06-the-extension-lifecycle.md)
|
||||
- [07. Events — The Nervous System](./07-events-the-nervous-system.md)
|
||||
- [08. ExtensionContext — What You Can Access](./08-extensioncontext-what-you-can-access.md)
|
||||
- [09. ExtensionAPI — What You Can Do](./09-extensionapi-what-you-can-do.md)
|
||||
- [10. Custom Tools — Giving the LLM New Abilities](./10-custom-tools-giving-the-llm-new-abilities.md)
|
||||
- [11. Custom Commands — User-Facing Actions](./11-custom-commands-user-facing-actions.md)
|
||||
- [12. Custom UI — Visual Components](./12-custom-ui-visual-components.md)
|
||||
- [13. State Management & Persistence](./13-state-management-persistence.md)
|
||||
- [14. Custom Rendering — Controlling What the User Sees](./14-custom-rendering-controlling-what-the-user-sees.md)
|
||||
- [15. System Prompt Modification](./15-system-prompt-modification.md)
|
||||
- [16. Compaction & Session Control](./16-compaction-session-control.md)
|
||||
- [17. Model & Provider Management](./17-model-provider-management.md)
|
||||
- [18. Remote Execution & Tool Overrides](./18-remote-execution-tool-overrides.md)
|
||||
- [19. Packaging & Distribution](./19-packaging-distribution.md)
|
||||
- [20. Mode Behavior](./20-mode-behavior.md)
|
||||
- [21. Error Handling](./21-error-handling.md)
|
||||
- [22. Key Rules & Gotchas](./22-key-rules-gotchas.md)
|
||||
- [23. File Reference — Documentation](./23-file-reference-documentation.md)
|
||||
- [24. File Reference — Example Extensions](./24-file-reference-example-extensions.md)
|
||||
- [25. Slash Command Subcommand Patterns](./25-slash-command-subcommand-patterns.md)
|
||||
|
||||
---
|
||||
|
||||
*Split into per-section files for surgical context loading.*
|
||||
|
||||
57
docs/pi-ui-tui/01-the-ui-architecture.md
Normal file
|
|
@ -0,0 +1,57 @@
|
|||
# The UI Architecture
|
||||
|
||||
Pi's TUI is a custom terminal rendering system. Understanding its architecture prevents most mistakes:
|
||||
|
||||
```
┌──────────────────────────────────────────────────────────────┐
│ Terminal Window                                              │
│                                                              │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ Custom Header (ctx.ui.setHeader)                         │ │
│ ├──────────────────────────────────────────────────────────┤ │
│ │                                                          │ │
│ │ Message Area                                             │ │
│ │  - User messages                                         │ │
│ │  - Assistant responses                                   │ │
│ │  - Tool calls and results  ◄── renderCall/renderResult   │ │
│ │  - Custom messages         ◄── registerMessageRenderer   │ │
│ │  - Notifications                                         │ │
│ │                                                          │ │
│ ├──────────────────────────────────────────────────────────┤ │
│ │ Widgets (above editor)  ◄── ctx.ui.setWidget             │ │
│ ├──────────────────────────────────────────────────────────┤ │
│ │                                                          │ │
│ │ Editor  ◄── Can be replaced by:                          │ │
│ │   - ctx.ui.custom() (temporary full replacement)         │ │
│ │   - ctx.ui.setEditorComponent() (permanent replace)      │ │
│ │                                                          │ │
│ ├──────────────────────────────────────────────────────────┤ │
│ │ Widgets (below editor)  ◄── ctx.ui.setWidget             │ │
│ ├──────────────────────────────────────────────────────────┤ │
│ │ Footer  ◄── ctx.ui.setFooter / ctx.ui.setStatus          │ │
│ └──────────────────────────────────────────────────────────┘ │
│                                                              │
│ ┌────────────────────────┐                                   │
│ │ Overlay (floating)     │ ◄── ctx.ui.custom({ overlay })    │
│ │ Rendered on top of     │                                   │
│ │ everything             │                                   │
│ └────────────────────────┘                                   │
└──────────────────────────────────────────────────────────────┘
```
|
||||
|
||||
**Key principles:**
|
||||
- Everything renders as **arrays of strings** (one per line)
|
||||
- Each line **must not exceed the `width` parameter** — this is enforced
|
||||
- **ANSI escape codes** are used for styling — they don't count toward visible width
|
||||
- **Styles do NOT carry across lines** — the TUI resets SGR at the end of each line
|
||||
- All **state changes require explicit invalidation** followed by a render request
|
||||
- **Theme is always passed via callbacks** — never import it directly
|
||||
|
||||
### Packages
|
||||
|
||||
| Package | What it provides |
|---------|-----------------|
| `@mariozechner/pi-tui` | Core components (`Text`, `Box`, `Container`, `SelectList`, etc.), keyboard handling, text utilities |
| `@mariozechner/pi-coding-agent` | Higher-level components (`DynamicBorder`, `BorderedLoader`, `CustomEditor`), theming helpers, code highlighting |
|
||||
|
||||
---
|
||||
|
|
@ -0,0 +1,44 @@
|
|||
# The Component Interface — Foundation of Everything
|
||||
|
||||
Every visual element in Pi implements this interface:
|
||||
|
||||
```typescript
|
||||
interface Component {
|
||||
render(width: number): string[];
|
||||
handleInput?(data: string): void;
|
||||
wantsKeyRelease?: boolean;
|
||||
invalidate(): void;
|
||||
}
|
||||
```
|
||||
|
||||
| Method | Purpose | Required? |
|
||||
|--------|---------|-----------|
|
||||
| `render(width)` | Return array of strings (one per line). Each line ≤ `width` visible chars. | **Yes** |
|
||||
| `handleInput(data)` | Receive keyboard input when component has focus. | Optional |
|
||||
| `wantsKeyRelease` | If `true`, receive key release events (Kitty protocol). | Optional, default `false` |
|
||||
| `invalidate()` | Clear cached render state. Called on theme changes. | **Yes** |
|
||||
|
||||
### The Render Contract
|
||||
|
||||
```typescript
|
||||
render(width: number): string[] {
|
||||
// MUST return an array of strings
|
||||
// Each string MUST NOT exceed `width` in visible characters
|
||||
// ANSI escape codes (colors, styles) don't count toward visible width
|
||||
// Styles are reset at end of each line — reapply per line
|
||||
// Return [] for zero-height component
|
||||
}
|
||||
```
|
||||
|
||||
### The Invalidation Contract
|
||||
|
||||
```typescript
|
||||
invalidate(): void {
|
||||
// Clear ALL cached render output
|
||||
// Clear any pre-baked themed strings
|
||||
// Call super.invalidate() if extending a built-in component
|
||||
// After invalidation, next render() must produce fresh output
|
||||
}
|
||||
```
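Taken together, the render and invalidation contracts suggest a cache that is keyed by width and dropped on `invalidate()`. The sketch below assumes a local copy of the `Component` interface so it is self-contained; `CachedLabel` and its `renderCount` field are hypothetical and exist only for illustration.

```typescript
// Local copy of the interface shown above, so the sketch stands alone.
interface Component {
  render(width: number): string[];
  handleInput?(data: string): void;
  wantsKeyRelease?: boolean;
  invalidate(): void;
}

// Hypothetical component: renders one plain-text label, cached per width.
class CachedLabel implements Component {
  private label: string;
  private cachedWidth: number | undefined = undefined;
  private cachedLines: string[] | undefined = undefined;
  renderCount = 0; // counts real (non-cached) renders, for illustration only

  constructor(label: string) {
    this.label = label;
  }

  setLabel(label: string): void {
    this.label = label;
    this.invalidate(); // state changed, so cached output is stale
  }

  render(width: number): string[] {
    if (this.cachedLines && this.cachedWidth === width) return this.cachedLines;
    this.renderCount++;
    // Plain text only, so slicing by length is enough to respect `width`.
    this.cachedLines = width > 0 ? [this.label.slice(0, width)] : [];
    this.cachedWidth = width;
    return this.cachedLines;
  }

  invalidate(): void {
    this.cachedWidth = undefined;
    this.cachedLines = undefined;
  }
}

const label = new CachedLabel("hello world");
label.render(5);
label.render(5); // served from cache
label.setLabel("changed");
const afterChange = label.render(5); // rebuilt after invalidation
```

Real components that emit ANSI styling should truncate with `truncateToWidth` rather than `slice`, since escape codes do not count toward visible width.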
|
||||
|
||||
---
|
||||
19
docs/pi-ui-tui/03-entry-points-how-ui-gets-on-screen.md
Normal file
|
|
@ -0,0 +1,19 @@
|
|||
# Entry Points — How UI Gets on Screen
|
||||
|
||||
There are **several different ways** to put custom UI on screen, each for a different purpose:
|
||||
|
||||
| Method | Purpose | Blocks? | Replaces editor? |
|
||||
|--------|---------|---------|-------------------|
|
||||
| `ctx.ui.select/confirm/input/editor` | Quick dialogs | Yes | Temporarily |
|
||||
| `ctx.ui.notify` | Toast notifications | No | No |
|
||||
| `ctx.ui.setStatus` | Footer status text | No | No |
|
||||
| `ctx.ui.setWidget` | Persistent widget above/below editor | No | No |
|
||||
| `ctx.ui.setFooter` | Replace entire footer | No | No (replaces footer) |
|
||||
| `ctx.ui.custom()` | Full custom component | Yes | Temporarily |
|
||||
| `ctx.ui.custom({overlay})` | Floating overlay | Yes | No (renders on top) |
|
||||
| `ctx.ui.setEditorComponent` | Replace editor permanently | No | Yes (permanently) |
|
||||
| `ctx.ui.setHeader` | Custom startup header | No | No (replaces header) |
|
||||
| `renderCall/renderResult` | Tool display | No | No (inline in messages) |
|
||||
| `registerMessageRenderer` | Custom message display | No | No (inline in messages) |
|
||||
|
||||
---
|
||||
77
docs/pi-ui-tui/04-built-in-dialog-methods.md
Normal file
|
|
@ -0,0 +1,77 @@
|
|||
# Built-in Dialog Methods
|
||||
|
||||
The simplest UI — blocking dialogs that wait for user response:
|
||||
|
||||
### Selection
|
||||
|
||||
```typescript
|
||||
const choice = await ctx.ui.select("Pick a color:", ["Red", "Green", "Blue"]);
|
||||
// Returns: "Red" | "Green" | "Blue" | undefined (if cancelled)
|
||||
```
|
||||
|
||||
### Confirmation
|
||||
|
||||
```typescript
|
||||
const ok = await ctx.ui.confirm("Delete file?", "This action cannot be undone.");
|
||||
// Returns: true | false
|
||||
```
|
||||
|
||||
### Text Input
|
||||
|
||||
```typescript
|
||||
const name = await ctx.ui.input("Project name:", "my-project");
|
||||
// Returns: string | undefined (if cancelled)
|
||||
```
|
||||
|
||||
### Multi-line Editor
|
||||
|
||||
```typescript
|
||||
const text = await ctx.ui.editor("Edit the description:", "Default text here");
|
||||
// Returns: string | undefined (if cancelled)
|
||||
```
|
||||
|
||||
### Timed Dialogs (Auto-Dismiss)
|
||||
|
||||
Dialogs can auto-dismiss with a live countdown:
|
||||
|
||||
```typescript
|
||||
// Shows "Auto-proceed? (5s)" → "Auto-proceed? (4s)" → ... → auto-dismisses
|
||||
const ok = await ctx.ui.confirm(
|
||||
"Auto-proceed?",
|
||||
"Continuing in 5 seconds...",
|
||||
{ timeout: 5000 }
|
||||
);
|
||||
// Returns false on timeout
|
||||
```
|
||||
|
||||
**Timeout return values:**
|
||||
- `select()` → `undefined`
|
||||
- `confirm()` → `false`
|
||||
- `input()` → `undefined`
|
||||
|
||||
### Manual Dismissal with AbortSignal
|
||||
|
||||
For more control, such as distinguishing a timeout from a user cancel, pass an `AbortSignal`:
|
||||
|
||||
```typescript
|
||||
const controller = new AbortController();
|
||||
const timeoutId = setTimeout(() => controller.abort(), 5000);
|
||||
|
||||
const ok = await ctx.ui.confirm(
|
||||
"Timed Confirm",
|
||||
"Auto-cancels in 5s",
|
||||
{ signal: controller.signal }
|
||||
);
|
||||
|
||||
clearTimeout(timeoutId);
|
||||
|
||||
if (ok) {
|
||||
// User confirmed
|
||||
} else if (controller.signal.aborted) {
|
||||
// Timed out
|
||||
} else {
|
||||
// User cancelled (Escape)
|
||||
}
|
||||
```
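The three-way branch above can be factored into a small pure function, which is easier to test than inline `if` chains. This is a sketch; `classifyConfirm` and `ConfirmOutcome` are hypothetical names, and the signal is typed structurally so the example does not depend on DOM typings.

```typescript
type ConfirmOutcome = "confirmed" | "timed-out" | "cancelled";

// Hypothetical helper: `ok` is the resolved value of ctx.ui.confirm(),
// `signal` is the AbortSignal that was passed to it.
function classifyConfirm(ok: boolean, signal: { aborted: boolean }): ConfirmOutcome {
  if (ok) return "confirmed";
  // A false result is ambiguous on its own; the signal tells the two cases apart.
  return signal.aborted ? "timed-out" : "cancelled";
}

const userSaidYes = classifyConfirm(true, { aborted: false });
const timerFired = classifyConfirm(false, { aborted: true });
const userEscaped = classifyConfirm(false, { aborted: false });
```

In real code you would pass `controller.signal` directly, since `AbortSignal` already has an `aborted` property.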
|
||||
|
||||
---
|
||||
120
docs/pi-ui-tui/05-persistent-ui-elements.md
Normal file
|
|
@ -0,0 +1,120 @@
|
|||
# Persistent UI Elements
|
||||
|
||||
These stay on screen until explicitly cleared:
|
||||
|
||||
### Status (Footer)
|
||||
|
||||
```typescript
|
||||
// Set (persists until cleared or overwritten)
|
||||
ctx.ui.setStatus("my-ext", "● Active");
|
||||
ctx.ui.setStatus("my-ext", ctx.ui.theme.fg("accent", "● Mode: Plan"));
|
||||
|
||||
// Clear
|
||||
ctx.ui.setStatus("my-ext", undefined);
|
||||
```
|
||||
|
||||
Multiple extensions can set independent status entries. They appear in the footer.
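The keyed semantics can be pictured as a map from extension key to status text, where `undefined` removes the entry. The class below is an illustrative model only; pi's real footer bookkeeping may differ, and `StatusRegistry` is a hypothetical name.

```typescript
// Illustrative model of keyed status entries; not pi's implementation.
class StatusRegistry {
  private entries = new Map<string, string>();

  // Mirrors the call shape of ctx.ui.setStatus(key, text | undefined).
  set(key: string, text: string | undefined): void {
    if (text === undefined) this.entries.delete(key);
    else this.entries.set(key, text);
  }

  all(): ReadonlyMap<string, string> {
    return this.entries;
  }
}

const registry = new StatusRegistry();
registry.set("ext-a", "● Active");
registry.set("ext-b", "● Mode: Plan");
registry.set("ext-a", undefined); // clearing one key leaves the other intact
```

Using a unique key per extension is what keeps one extension from overwriting another's status.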
|
||||
|
||||
### Widgets (Above/Below Editor)
|
||||
|
||||
```typescript
|
||||
// Simple string array (above editor, default)
|
||||
ctx.ui.setWidget("my-widget", ["Line 1", "Line 2", "Line 3"]);
|
||||
|
||||
// Below editor
|
||||
ctx.ui.setWidget("my-widget", ["Below the editor!"], { placement: "belowEditor" });
|
||||
|
||||
// With theme (component factory)
|
||||
ctx.ui.setWidget("my-widget", (_tui, theme) => {
|
||||
const lines = items.map(item =>
|
||||
item.done
|
||||
? theme.fg("success", "✓ ") + theme.fg("muted", theme.strikethrough(item.text))
|
||||
: theme.fg("dim", "○ ") + item.text
|
||||
);
|
||||
return {
|
||||
render: () => lines,
|
||||
invalidate: () => {},
|
||||
};
|
||||
});
|
||||
|
||||
// Clear
|
||||
ctx.ui.setWidget("my-widget", undefined);
|
||||
```
|
||||
|
||||
### Working Message (During Streaming)
|
||||
|
||||
```typescript
|
||||
ctx.ui.setWorkingMessage("Analyzing code structure...");
|
||||
ctx.ui.setWorkingMessage(); // Restore default
|
||||
```
|
||||
|
||||
### Custom Footer (Full Replacement)
|
||||
|
||||
```typescript
|
||||
ctx.ui.setFooter((tui, theme, footerData) => ({
|
||||
invalidate() {},
|
||||
render(width: number): string[] {
|
||||
const branch = footerData.getGitBranch(); // Not accessible elsewhere!
|
||||
const statuses = footerData.getExtensionStatuses(); // All setStatus values
|
||||
const left = theme.fg("dim", `${ctx.model?.id || "no-model"}`);
|
||||
const right = theme.fg("dim", branch || "no git");
|
||||
const pad = " ".repeat(Math.max(1, width - visibleWidth(left) - visibleWidth(right)));
|
||||
return [truncateToWidth(left + pad + right, width)];
|
||||
},
|
||||
// Reactive: re-render when branch changes
|
||||
dispose: footerData.onBranchChange(() => tui.requestRender()),
|
||||
}));
|
||||
|
||||
// Restore default
|
||||
ctx.ui.setFooter(undefined);
|
||||
```
|
||||
|
||||
**`footerData` provides:**
|
||||
- `getGitBranch(): string | null` — current git branch (not accessible through any other API)
|
||||
- `getExtensionStatuses(): ReadonlyMap<string, string>` — all `setStatus` values
|
||||
- `onBranchChange(callback): () => void` — subscribe to branch changes, returns dispose function
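The footer example above pads between a left and a right segment. A minimal sketch of that layout step as a pure function follows; `plainWidth` is a simplified stand-in for pi-tui's `visibleWidth` (it strips SGR codes and counts UTF-16 code units, so it ignores wide characters), and `justify` is a hypothetical helper.

```typescript
const SGR = /\x1b\[[0-9;]*m/g;

// Simplified stand-in for visibleWidth: strips SGR codes, counts code units.
function plainWidth(s: string): number {
  return s.replace(SGR, "").length;
}

// Hypothetical helper: pads left and right to exactly `width` when both fit
// with at least one space between them. Otherwise it falls back to the
// unstyled left text, cut to `width`.
function justify(left: string, right: string, width: number): string {
  const gap = width - plainWidth(left) - plainWidth(right);
  if (gap >= 1) return left + " ".repeat(gap) + right;
  return left.replace(SGR, "").slice(0, Math.max(0, width));
}

const fits = justify("model", "main", 12);
const styledFits = justify("\x1b[2mmodel\x1b[0m", "main", 12);
const tooNarrow = justify("long-model-name", "main", 8);
```

The fallback drops styling for simplicity; with the real utilities you would keep the styling and call `truncateToWidth` instead.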
|
||||
|
||||
### Custom Header
|
||||
|
||||
```typescript
|
||||
ctx.ui.setHeader((tui, theme) => ({
|
||||
render(width: number): string[] {
|
||||
return [theme.fg("accent", theme.bold("My Custom Header"))];
|
||||
},
|
||||
invalidate() {},
|
||||
}));
|
||||
```
|
||||
|
||||
### Editor Control
|
||||
|
||||
```typescript
|
||||
// Set editor text
|
||||
ctx.ui.setEditorText("Prefilled text for the user");
|
||||
|
||||
// Get current editor text
|
||||
const current = ctx.ui.getEditorText();
|
||||
|
||||
// Paste into editor (triggers paste handling, including collapse for large content)
|
||||
ctx.ui.pasteToEditor("pasted content");
|
||||
|
||||
// Tool output expansion
|
||||
const wasExpanded = ctx.ui.getToolsExpanded();
|
||||
ctx.ui.setToolsExpanded(true); // Expand all
|
||||
ctx.ui.setToolsExpanded(false); // Collapse all
|
||||
|
||||
// Terminal title
|
||||
ctx.ui.setTitle("pi - my project");
|
||||
```
|
||||
|
||||
### Theme Management
|
||||
|
||||
```typescript
|
||||
const themes = ctx.ui.getAllThemes(); // [{ name: "dark", path: ... }, ...]
|
||||
const lightTheme = ctx.ui.getTheme("light"); // Load without switching
|
||||
const result = ctx.ui.setTheme("light"); // Switch by name
|
||||
if (!result.success) ctx.ui.notify(result.error!, "error");
|
||||
ctx.ui.setTheme(lightTheme!); // Switch by Theme object
|
||||
ctx.ui.theme.fg("accent", "styled text"); // Access current theme
|
||||
```
|
||||
|
||||
---
|
||||
130
docs/pi-ui-tui/06-ctx-ui-custom-full-custom-components.md
Normal file
|
|
@ -0,0 +1,130 @@
|
|||
# ctx.ui.custom() — Full Custom Components
|
||||
|
||||
This is the most powerful UI mechanism. It **temporarily replaces the editor** with your component, and the awaited call resolves with the value you pass to `done()`.
|
||||
|
||||
### Basic Pattern
|
||||
|
||||
```typescript
|
||||
const result = await ctx.ui.custom<string | null>((tui, theme, keybindings, done) => {
|
||||
// tui — TUI instance (requestRender, screen dimensions)
|
||||
// theme — Current theme for styling
|
||||
// keybindings — App keybinding manager
|
||||
// done(value) — Call to close component and return value
|
||||
|
||||
return {
|
||||
render(width: number): string[] {
|
||||
return [
|
||||
theme.fg("accent", "─".repeat(width)),
|
||||
" Press Enter to confirm, Escape to cancel",
|
||||
theme.fg("accent", "─".repeat(width)),
|
||||
];
|
||||
},
|
||||
handleInput(data: string) {
|
||||
if (matchesKey(data, Key.enter)) done("confirmed");
|
||||
if (matchesKey(data, Key.escape)) done(null);
|
||||
},
|
||||
invalidate() {},
|
||||
};
|
||||
});
|
||||
|
||||
if (result === "confirmed") {
|
||||
ctx.ui.notify("Confirmed!", "info");
|
||||
}
|
||||
```
|
||||
|
||||
### The Factory Callback
|
||||
|
||||
The factory function receives four arguments:
|
||||
|
||||
| Argument | Type | Purpose |
|
||||
|----------|------|---------|
|
||||
| `tui` | `TUI` | Screen info and render control. `tui.requestRender()` triggers re-render after state changes. |
|
||||
| `theme` | `Theme` | Current theme. Use `theme.fg()`, `theme.bg()`, `theme.bold()`, etc. |
|
||||
| `keybindings` | `KeybindingsManager` | App keybinding config. For checking what keys do what. |
|
||||
| `done` | `(value: T) => void` | Call this to close the component and return a value to the awaiting code. |
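Because the factory only needs `done` to hand back a value, components written this way can be exercised without a terminal. The harness below is a sketch under stated assumptions: `drive`, `Factory`, and `DriveResult` are hypothetical names, the factory takes only `done` (the real one also receives `tui`, `theme`, and `keybindings`), and Enter and Escape are assumed to arrive as the raw bytes `"\r"` and `"\x1b"`.

```typescript
interface Component {
  render(width: number): string[];
  handleInput?(data: string): void;
  invalidate(): void;
}

type Factory<T> = (done: (value: T) => void) => Component;

interface DriveResult<T> {
  closed: boolean;
  value: T | undefined;
}

// Hypothetical test harness: feeds key data to a component until it calls
// done(), then reports what was passed. Not part of pi's API.
function drive<T>(factory: Factory<T>, keys: string[]): DriveResult<T> {
  const state: DriveResult<T> = { closed: false, value: undefined };
  const component = factory((value) => {
    if (state.closed) return; // ignore repeated done() calls
    state.closed = true;
    state.value = value;
  });
  for (const key of keys) {
    if (state.closed) break;
    component.handleInput?.(key);
  }
  return state;
}

// Assumption: raw bytes for Enter/Escape; real code should use matchesKey.
const confirmDialog: Factory<string | null> = (done) => ({
  render: (width) => ["Press Enter to confirm".slice(0, Math.max(0, width))],
  handleInput: (data) => {
    if (data === "\r") done("confirmed");
    else if (data === "\x1b") done(null);
  },
  invalidate: () => {},
});

const confirmed = drive(confirmDialog, ["a", "\r", "\x1b"]);
const stillOpen = drive(confirmDialog, ["a", "b"]);
```

Keeping input handling free of terminal dependencies like this makes the `done` contract easy to verify in unit tests.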
|
||||
|
||||
### Using Existing Components as Children
|
||||
|
||||
```typescript
|
||||
const result = await ctx.ui.custom<string | null>((tui, theme, _kb, done) => {
|
||||
const container = new Container();
|
||||
container.addChild(new DynamicBorder((s: string) => theme.fg("accent", s)));
|
||||
container.addChild(new Text(theme.fg("accent", theme.bold("Title")), 1, 0));
|
||||
|
||||
const selectList = new SelectList(items, 10, {
|
||||
selectedPrefix: (t) => theme.fg("accent", t),
|
||||
selectedText: (t) => theme.fg("accent", t),
|
||||
description: (t) => theme.fg("muted", t),
|
||||
scrollInfo: (t) => theme.fg("dim", t),
|
||||
noMatch: (t) => theme.fg("warning", t),
|
||||
});
|
||||
selectList.onSelect = (item) => done(item.value);
|
||||
selectList.onCancel = () => done(null);
|
||||
container.addChild(selectList);
|
||||
|
||||
container.addChild(new DynamicBorder((s: string) => theme.fg("accent", s)));
|
||||
|
||||
return {
|
||||
render: (w) => container.render(w),
|
||||
invalidate: () => container.invalidate(),
|
||||
handleInput: (data) => { selectList.handleInput(data); tui.requestRender(); },
|
||||
};
|
||||
});
|
||||
```
|
||||
|
||||
### Using a Class
|
||||
|
||||
```typescript
|
||||
class MyComponent {
|
||||
private selected = 0;
|
||||
private items: string[];
|
||||
private done: (value: string | null) => void;
|
||||
private tui: { requestRender: () => void };
|
||||
private cachedWidth?: number;
|
||||
private cachedLines?: string[];
|
||||
|
||||
constructor(tui: TUI, items: string[], done: (value: string | null) => void) {
|
||||
this.tui = tui;
|
||||
this.items = items;
|
||||
this.done = done;
|
||||
}
|
||||
|
||||
handleInput(data: string) {
|
||||
if (matchesKey(data, Key.up) && this.selected > 0) {
|
||||
this.selected--;
|
||||
this.invalidate();
|
||||
this.tui.requestRender();
|
||||
} else if (matchesKey(data, Key.down) && this.selected < this.items.length - 1) {
|
||||
this.selected++;
|
||||
this.invalidate();
|
||||
this.tui.requestRender();
|
||||
} else if (matchesKey(data, Key.enter)) {
|
||||
this.done(this.items[this.selected]);
|
||||
} else if (matchesKey(data, Key.escape)) {
|
||||
this.done(null);
|
||||
}
|
||||
}
|
||||
|
||||
render(width: number): string[] {
|
||||
if (this.cachedLines && this.cachedWidth === width) return this.cachedLines;
|
||||
this.cachedLines = this.items.map((item, i) => {
|
||||
const prefix = i === this.selected ? "> " : " ";
|
||||
return truncateToWidth(prefix + item, width);
|
||||
});
|
||||
this.cachedWidth = width;
|
||||
return this.cachedLines;
|
||||
}
|
||||
|
||||
invalidate() {
|
||||
this.cachedWidth = undefined;
|
||||
this.cachedLines = undefined;
|
||||
}
|
||||
}
|
||||
|
||||
// Usage:
|
||||
const result = await ctx.ui.custom<string | null>((tui, theme, _kb, done) => {
|
||||
return new MyComponent(tui, ["Option A", "Option B", "Option C"], done);
|
||||
});
|
||||
```
|
||||
|
||||
---
|
||||
198
docs/pi-ui-tui/07-built-in-components-the-building-blocks.md
Normal file
|
|
@ -0,0 +1,198 @@
|
|||
# Built-in Components — The Building Blocks
|
||||
|
||||
Import from `@mariozechner/pi-tui`:
|
||||
|
||||
### Text
|
||||
|
||||
Multi-line text with automatic word wrapping and optional background.
|
||||
|
||||
```typescript
|
||||
import { Text } from "@mariozechner/pi-tui";
|
||||
|
||||
const text = new Text(
|
||||
"Hello World\nSecond line", // content (supports \n)
|
||||
1, // paddingX (default: 1)
|
||||
1, // paddingY (default: 1)
|
||||
(s) => bgGray(s) // optional background function
|
||||
);
|
||||
|
||||
text.setText("Updated content"); // Update text dynamically
|
||||
```
|
||||
|
||||
**When to use:** Single or multi-line text blocks, styled labels, error messages.
|
||||
|
||||
### Box
|
||||
|
||||
Container with padding and background color. Add children inside it.
|
||||
|
||||
```typescript
|
||||
import { Box } from "@mariozechner/pi-tui";
|
||||
|
||||
const box = new Box(
|
||||
1, // paddingX
|
||||
1, // paddingY
|
||||
(s) => bgGray(s) // background function
|
||||
);
|
||||
box.addChild(new Text("Content inside a box", 0, 0));
|
||||
box.setBgFn((s) => bgBlue(s)); // Change background dynamically
|
||||
```
|
||||
|
||||
**When to use:** Visually grouping content with a colored background.
|
||||
|
||||
### Container
|
||||
|
||||
Groups child components vertically (stacked). No visual styling of its own.
|
||||
|
||||
```typescript
|
||||
import { Container } from "@mariozechner/pi-tui";
|
||||
|
||||
const container = new Container();
|
||||
container.addChild(component1);
|
||||
container.addChild(component2);
|
||||
container.removeChild(component1);
|
||||
container.clear(); // Remove all children
|
||||
```
|
||||
|
||||
**When to use:** Composing complex layouts from simpler components.
|
||||
|
||||
### Spacer
|
||||
|
||||
Empty vertical space.
|
||||
|
||||
```typescript
|
||||
import { Spacer } from "@mariozechner/pi-tui";
|
||||
|
||||
const spacer = new Spacer(2); // 2 empty lines
|
||||
```
|
||||
|
||||
**When to use:** Visual separation between components.
|
||||
|
||||
### Markdown
|
||||
|
||||
Renders markdown with full formatting and syntax highlighting.
|
||||
|
||||
```typescript
|
||||
import { Markdown } from "@mariozechner/pi-tui";
|
||||
import { getMarkdownTheme } from "@mariozechner/pi-coding-agent";
|
||||
|
||||
const md = new Markdown(
|
||||
"# Title\n\nSome **bold** text\n\n```js\nconst x = 1;\n```",
|
||||
1, // paddingX
|
||||
1, // paddingY
|
||||
getMarkdownTheme() // MarkdownTheme (from pi-coding-agent)
|
||||
);
|
||||
|
||||
md.setText("Updated markdown content");
|
||||
```
|
||||
|
||||
**When to use:** Rendering documentation, help text, formatted content.
|
||||
|
||||
### Image
|
||||
|
||||
Renders images in supported terminals (Kitty, iTerm2, Ghostty, WezTerm).
|
||||
|
||||
```typescript
|
||||
import { Image } from "@mariozechner/pi-tui";
|
||||
|
||||
const image = new Image(
|
||||
base64Data, // base64-encoded image data
|
||||
"image/png", // MIME type
|
||||
theme, // ImageTheme
|
||||
{ maxWidthCells: 80, maxHeightCells: 24 } // Optional size constraints
|
||||
);
|
||||
```
|
||||
|
||||
**When to use:** Displaying generated images, screenshots, diagrams.
|
||||
|
||||
### SelectList
|
||||
|
||||
Interactive selection from a list with search, scrolling, and descriptions.
|
||||
|
||||
```typescript
|
||||
import { SelectList, type SelectItem } from "@mariozechner/pi-tui";
|
||||
|
||||
const items: SelectItem[] = [
|
||||
{ value: "opt1", label: "Option 1", description: "First option" },
|
||||
{ value: "opt2", label: "Option 2", description: "Second option" },
|
||||
{ value: "opt3", label: "Option 3" }, // description is optional
|
||||
];
|
||||
|
||||
const selectList = new SelectList(
|
||||
items,
|
||||
10, // maxVisible (scrollable if more items)
|
||||
{
|
||||
selectedPrefix: (t) => theme.fg("accent", t),
|
||||
selectedText: (t) => theme.fg("accent", t),
|
||||
description: (t) => theme.fg("muted", t),
|
||||
scrollInfo: (t) => theme.fg("dim", t),
|
||||
noMatch: (t) => theme.fg("warning", t),
|
||||
}
|
||||
);
|
||||
|
||||
selectList.onSelect = (item) => { /* item.value */ };
|
||||
selectList.onCancel = () => { /* escape pressed */ };
|
||||
```
|
||||
|
||||
**When to use:** Letting users pick from a list. Handles arrow keys, search filtering, scrolling.
|
||||
|
||||
### SettingsList
|
||||
|
||||
Toggle settings with left/right arrow keys.
|
||||
|
||||
```typescript
|
||||
import { SettingsList, type SettingItem } from "@mariozechner/pi-tui";
|
||||
import { getSettingsListTheme } from "@mariozechner/pi-coding-agent";
|
||||
|
||||
const items: SettingItem[] = [
|
||||
{ id: "verbose", label: "Verbose mode", currentValue: "off", values: ["on", "off"] },
|
||||
{ id: "theme", label: "Theme", currentValue: "dark", values: ["dark", "light", "auto"] },
|
||||
];
|
||||
|
||||
const settingsList = new SettingsList(
|
||||
items,
|
||||
Math.min(items.length + 2, 15), // maxVisible
|
||||
getSettingsListTheme(),
|
||||
(id, newValue) => { /* setting changed */ },
|
||||
() => { /* close requested (escape) */ },
|
||||
{ enableSearch: true }, // Optional: fuzzy search by label
|
||||
);
|
||||
```
|
||||
|
||||
**When to use:** Settings panels, toggle groups, configuration UIs.
|
||||
|
||||
### Input
|
||||
|
||||
Text input field with cursor.
|
||||
|
||||
```typescript
|
||||
import { Input } from "@mariozechner/pi-tui";
|
||||
|
||||
const input = new Input();
|
||||
input.setText("initial value");
|
||||
// Route keyboard input via handleInput
|
||||
```
|
||||
|
||||
### Editor
|
||||
|
||||
Multi-line text editor with undo, word deletion, cursor movement.
|
||||
|
||||
```typescript
|
||||
import { Editor, type EditorTheme } from "@mariozechner/pi-tui";
|
||||
|
||||
const editorTheme: EditorTheme = {
|
||||
borderColor: (s) => theme.fg("accent", s),
|
||||
selectList: {
|
||||
selectedPrefix: (t) => theme.fg("accent", t),
|
||||
selectedText: (t) => theme.fg("accent", t),
|
||||
description: (t) => theme.fg("muted", t),
|
||||
scrollInfo: (t) => theme.fg("dim", t),
|
||||
noMatch: (t) => theme.fg("warning", t),
|
||||
},
|
||||
};
|
||||
|
||||
const editor = new Editor(tui, editorTheme);
|
||||
editor.setText("prefilled");
|
||||
editor.onSubmit = (value) => { /* enter pressed */ };
|
||||
```
|
||||
|
||||
---
|
||||
|
|
@ -0,0 +1,51 @@
|
|||
# High-Level Components from pi-coding-agent
|
||||
|
||||
### DynamicBorder
|
||||
|
||||
A horizontal border line with themed color. Use for framing dialogs.
|
||||
|
||||
```typescript
|
||||
import { DynamicBorder } from "@mariozechner/pi-coding-agent";
|
||||
|
||||
// ⚠️ MUST explicitly type the parameter as string
|
||||
const border = new DynamicBorder((s: string) => theme.fg("accent", s));
|
||||
```
|
||||
|
||||
### BorderedLoader
|
||||
|
||||
Spinner with cancel support. Shows a message and an animated spinner while async work runs.
|
||||
|
||||
```typescript
|
||||
import { BorderedLoader } from "@mariozechner/pi-coding-agent";
|
||||
|
||||
const result = await ctx.ui.custom<string | null>((tui, theme, _kb, done) => {
|
||||
const loader = new BorderedLoader(tui, theme, "Fetching data...");
|
||||
loader.onAbort = () => done(null); // Escape pressed
|
||||
|
||||
// Do async work with the loader's AbortSignal
|
||||
fetchData(loader.signal)
|
||||
.then(data => done(data))
|
||||
.catch(() => done(null));
|
||||
|
||||
return loader;
|
||||
});
|
||||
```
|
||||
|
||||
### CustomEditor
|
||||
|
||||
Base class for custom editors that replace the input. Provides app keybindings (escape to abort, ctrl+d, model switching) automatically.
|
||||
|
||||
```typescript
|
||||
import { CustomEditor } from "@mariozechner/pi-coding-agent";
|
||||
|
||||
class MyEditor extends CustomEditor {
|
||||
handleInput(data: string): void {
|
||||
// Handle your keys first
|
||||
if (data === "x") { /* custom behavior */ return; }
|
||||
// Fall through to CustomEditor for app keybindings + text editing
|
||||
super.handleInput(data);
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
62
docs/pi-ui-tui/09-keyboard-input-how-to-handle-keys.md
Normal file
|
|
@ -0,0 +1,62 @@
|
|||
# Keyboard Input — How to Handle Keys
|
||||
|
||||
### matchesKey — The Key Detection Function
|
||||
|
||||
```typescript
|
||||
import { matchesKey, Key } from "@mariozechner/pi-tui";
|
||||
|
||||
handleInput(data: string) {
|
||||
// Using Key constants (recommended — gives autocomplete)
|
||||
if (matchesKey(data, Key.up)) { /* arrow up */ }
|
||||
if (matchesKey(data, Key.down)) { /* arrow down */ }
|
||||
if (matchesKey(data, Key.left)) { /* arrow left */ }
|
||||
if (matchesKey(data, Key.right)) { /* arrow right */ }
|
||||
if (matchesKey(data, Key.enter)) { /* enter */ }
|
||||
if (matchesKey(data, Key.escape)) { /* escape */ }
|
||||
if (matchesKey(data, Key.tab)) { /* tab */ }
|
||||
if (matchesKey(data, Key.space)) { /* space */ }
|
||||
if (matchesKey(data, Key.backspace)) { /* backspace */ }
|
||||
if (matchesKey(data, Key.delete)) { /* delete */ }
|
||||
if (matchesKey(data, Key.home)) { /* home */ }
|
||||
if (matchesKey(data, Key.end)) { /* end */ }
|
||||
|
||||
// With modifiers
|
||||
if (matchesKey(data, Key.ctrl("c"))) { /* ctrl+c */ }
|
||||
if (matchesKey(data, Key.ctrl("x"))) { /* ctrl+x */ }
|
||||
if (matchesKey(data, Key.shift("tab"))) { /* shift+tab */ }
|
||||
if (matchesKey(data, Key.alt("left"))) { /* alt+left */ }
|
||||
if (matchesKey(data, Key.ctrlShift("p"))) { /* ctrl+shift+p */ }
|
||||
|
||||
// String format also works
|
||||
if (matchesKey(data, "enter")) { }
|
||||
if (matchesKey(data, "ctrl+c")) { }
|
||||
if (matchesKey(data, "shift+tab")) { }
|
||||
|
||||
// Printable character detection
|
||||
if (data.length === 1 && data.charCodeAt(0) >= 32 && data.charCodeAt(0) !== 127) {
|
||||
// It's a printable character (letter, number, symbol)
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Key Identifiers Reference
|
||||
|
||||
| Category | Keys |
|
||||
|----------|------|
|
||||
| Basic | `enter`, `escape`, `tab`, `space`, `backspace`, `delete`, `home`, `end` |
|
||||
| Arrow | `up`, `down`, `left`, `right` |
|
||||
| Modifiers | `ctrl("x")`, `shift("tab")`, `alt("left")`, `ctrlShift("p")` |
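Printable-character detection usually feeds a text buffer. The reducer below is a sketch of that step; `applyKey` is a hypothetical function, and it assumes the terminal sends `"\x7f"` (DEL) or `"\b"` for Backspace, which is common but not universal. In an extension, prefer `matchesKey(data, Key.backspace)` for that check.

```typescript
// Hypothetical reducer for a single-line text buffer.
function applyKey(buffer: string, data: string): string {
  // Assumption: Backspace arrives as DEL (0x7f) or BS (0x08).
  if (data === "\x7f" || data === "\b") return buffer.slice(0, -1);

  const code = data.charCodeAt(0);
  // DEL (127) would pass a bare `>= 32` test, so exclude it explicitly.
  const printable = data.length === 1 && code >= 32 && code !== 127;
  return printable ? buffer + data : buffer;
}

// "h", "i", "!" are appended, DEL removes "!", the arrow-up sequence is ignored.
const typed = ["h", "i", "!", "\x7f", "\x1b[A"].reduce(applyKey, "");
```

Multi-byte escape sequences such as arrow keys fail the `length === 1` test, so they never end up in the buffer.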
|
||||
|
||||
### The handleInput Contract
|
||||
|
||||
```typescript
|
||||
handleInput(data: string): void {
|
||||
// 1. Check for your keys
|
||||
// 2. Update state
|
||||
// 3. Call this.invalidate() if render output changes
|
||||
// 4. Call tui.requestRender() to trigger a re-render
|
||||
// (or if you're the top-level custom component, the TUI does this automatically)
|
||||
}
|
||||
```
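The four steps above can be traced in a small component. This sketch reduces `tui` to the single method it uses and matches plain characters instead of calling `matchesKey`; `Counter` is a hypothetical component.

```typescript
// Hypothetical counter component; `tui` is reduced to the one method used.
class Counter {
  private count = 0;
  private cachedText: string | undefined = undefined;
  private tui: { requestRender(): void };

  constructor(tui: { requestRender(): void }) {
    this.tui = tui;
  }

  handleInput(data: string): void {
    // 1. Check for your keys (plain characters here; use matchesKey in real code).
    if (data !== "+" && data !== "-") return; // unhandled: no state change, no render
    // 2. Update state.
    this.count += data === "+" ? 1 : -1;
    // 3. Drop cached output, then 4. ask for a re-render.
    this.invalidate();
    this.tui.requestRender();
  }

  render(width: number): string[] {
    if (this.cachedText === undefined) this.cachedText = `Count: ${this.count}`;
    return [this.cachedText.slice(0, Math.max(0, width))];
  }

  invalidate(): void {
    this.cachedText = undefined;
  }
}

let renderRequests = 0;
const counter = new Counter({ requestRender: () => { renderRequests++; } });
for (const key of ["+", "+", "x", "-"]) counter.handleInput(key);
const counterLines = counter.render(20);
```

Returning early for unhandled keys avoids needless invalidation and render requests.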
|
||||
|
||||
---
|
||||
46
docs/pi-ui-tui/10-line-width-the-cardinal-rule.md
Normal file
|
|
@ -0,0 +1,46 @@
|
|||
# Line Width — The Cardinal Rule
|
||||
|
||||
**Every line from `render()` MUST NOT exceed the `width` parameter in visible characters.** This is the single most common source of rendering bugs.
|
||||
|
||||
### Utilities
|
||||
|
||||
```typescript
|
||||
import { visibleWidth, truncateToWidth, wrapTextWithAnsi } from "@mariozechner/pi-tui";
|
||||
|
||||
// Get display width (ignores ANSI escape codes)
|
||||
visibleWidth("\x1b[32mHello\x1b[0m"); // Returns 5, not 14
|
||||
|
||||
// Truncate to fit width (preserves ANSI codes)
|
||||
truncateToWidth("Very long text here", 10); // "Very lo..."
|
||||
truncateToWidth("Very long text here", 10, ""); // "Very long " (no ellipsis)
|
||||
truncateToWidth("Very long text here", 10, "→"); // "Very long→"
|
||||
|
||||
// Word wrap preserving ANSI codes
|
||||
wrapTextWithAnsi("\x1b[32mThis is a long green text\x1b[0m", 15);
|
||||
// Returns ["This is a long", "green text"] with ANSI codes preserved per line
|
||||
```
|
||||
|
||||
### The Pattern
|
||||
|
||||
```typescript
|
||||
render(width: number): string[] {
|
||||
const lines: string[] = [];
|
||||
|
||||
// Always truncate any line that could exceed width
|
||||
lines.push(truncateToWidth(` ${prefix}${content}`, width));
|
||||
|
||||
// For dynamic content, calculate available space
|
||||
const labelWidth = visibleWidth(label);
|
||||
const available = width - labelWidth - 4; // Leave room for padding
|
||||
const truncated = truncateToWidth(value, available);
|
||||
lines.push(` ${label}: ${truncated}`);
|
||||
|
||||
return lines;
|
||||
}
|
||||
```
|
||||
|
||||
### Why This Matters
|
||||
|
||||
If a line exceeds `width`, the terminal wraps it, causing visual corruption — lines overlap, the cursor mispositions, and the entire TUI can become garbled. The TUI framework **cannot fix this for you** because it doesn't know how you want lines truncated.
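During development it can help to check render output against the width rule before it reaches the terminal. The sketch below is a debugging aid, not part of pi-tui: `findOverflow` is a hypothetical helper, and `approxVisibleWidth` is a simplified stand-in for `visibleWidth` that strips SGR codes and counts UTF-16 code units (so wide characters and emoji are undercounted).

```typescript
const SGR_PATTERN = /\x1b\[[0-9;]*m/g;

// Simplified stand-in for pi-tui's visibleWidth.
function approxVisibleWidth(line: string): number {
  return line.replace(SGR_PATTERN, "").length;
}

// Hypothetical debug helper: returns the indexes of lines wider than `width`.
function findOverflow(lines: string[], width: number): number[] {
  const tooWide: number[] = [];
  lines.forEach((line, index) => {
    if (approxVisibleWidth(line) > width) tooWide.push(index);
  });
  return tooWide;
}

// Only the third line exceeds five columns; ANSI codes on the second are ignored.
const overflow = findOverflow(["ok", "\x1b[32mgreen\x1b[0m", "much too long"], 5);
```

A check like this belongs in tests or debug builds; production `render()` code should still truncate every line that could exceed `width`.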
|
||||
|
||||
---
|
||||
72
docs/pi-ui-tui/11-theming-colors-and-styles.md
Normal file
|
|
@ -0,0 +1,72 @@
|
|||
# Theming — Colors and Styles
|
||||
|
||||
### Using Theme Colors
|
||||
|
||||
The `theme` object is always passed via callbacks — never import it directly.
|
||||
|
||||
```typescript
|
||||
// Foreground color
|
||||
theme.fg("accent", "Highlighted text") // Apply foreground color
|
||||
theme.fg("success", "✓ Passed")
|
||||
theme.fg("error", "✗ Failed")
|
||||
theme.fg("warning", "⚠ Warning")
|
||||
theme.fg("muted", "Secondary text")
|
||||
theme.fg("dim", "Tertiary text")
|
||||
|
||||
// Background color
|
||||
theme.bg("selectedBg", "Selected item")
|
||||
theme.bg("toolSuccessBg", "Success background")
|
||||
|
||||
// Text styles
|
||||
theme.bold("Bold text")
|
||||
theme.italic("Italic text")
|
||||
theme.strikethrough("Struck through")
|
||||
|
||||
// Combination
|
||||
theme.fg("accent", theme.bold("Bold and colored"))
|
||||
theme.bg("selectedBg", theme.fg("text", " Selected "))
|
||||
```
|
||||
|
||||
### All Foreground Colors
|
||||
|
||||
| Category | Colors |
|
||||
|----------|--------|
|
||||
| **General** | `text`, `accent`, `muted`, `dim` |
|
||||
| **Status** | `success`, `error`, `warning` |
|
||||
| **Borders** | `border`, `borderAccent`, `borderMuted` |
|
||||
| **Messages** | `userMessageText`, `customMessageText`, `customMessageLabel` |
|
||||
| **Tools** | `toolTitle`, `toolOutput` |
|
||||
| **Diffs** | `toolDiffAdded`, `toolDiffRemoved`, `toolDiffContext` |
|
||||
| **Markdown** | `mdHeading`, `mdLink`, `mdLinkUrl`, `mdCode`, `mdCodeBlock`, `mdCodeBlockBorder`, `mdQuote`, `mdQuoteBorder`, `mdHr`, `mdListBullet` |
|
||||
| **Syntax** | `syntaxComment`, `syntaxKeyword`, `syntaxFunction`, `syntaxVariable`, `syntaxString`, `syntaxNumber`, `syntaxType`, `syntaxOperator`, `syntaxPunctuation` |
|
||||
| **Thinking** | `thinkingOff`, `thinkingMinimal`, `thinkingLow`, `thinkingMedium`, `thinkingHigh`, `thinkingXhigh` |
|
||||
| **Modes** | `bashMode` |
|
||||
|
||||
### All Background Colors
|
||||
|
||||
`selectedBg`, `userMessageBg`, `customMessageBg`, `toolPendingBg`, `toolSuccessBg`, `toolErrorBg`
|
||||
|
||||
### Syntax Highlighting
|
||||
|
||||
```typescript
|
||||
import { highlightCode, getLanguageFromPath } from "@mariozechner/pi-coding-agent";
|
||||
|
||||
// Highlight with explicit language
|
||||
const highlighted = highlightCode("const x = 1;", "typescript", theme);
|
||||
|
||||
// Auto-detect from file path
|
||||
const lang = getLanguageFromPath("/path/to/file.rs"); // "rust"
|
||||
const highlightedFile = highlightCode(code, lang, theme); // distinct name: `highlighted` is already declared above
|
||||
```
|
||||
|
||||
### Markdown Theme
|
||||
|
||||
```typescript
|
||||
import { getMarkdownTheme } from "@mariozechner/pi-coding-agent";
|
||||
import { Markdown } from "@mariozechner/pi-tui";
|
||||
|
||||
const mdTheme = getMarkdownTheme();
|
||||
const md = new Markdown(content, 1, 1, mdTheme);
|
||||
```
|
||||
|
||||
---
|
||||
115
docs/pi-ui-tui/12-overlays-floating-modals-and-panels.md
Normal file
|
|
@ -0,0 +1,115 @@
|
|||
# Overlays — Floating Modals and Panels
|
||||
|
||||
Overlays render **on top of existing content** without clearing the screen. Essential for dialogs, side panels, and floating UI.
|
||||
|
||||
### Basic Overlay
|
||||
|
||||
```typescript
|
||||
const result = await ctx.ui.custom<string | null>(
|
||||
(tui, theme, keybindings, done) => new MyDialog({ onClose: done }),
|
||||
{ overlay: true }
|
||||
);
|
||||
```
|
||||
|
||||
### Positioned Overlay
|
||||
|
||||
```typescript
|
||||
const result = await ctx.ui.custom<string | null>(
|
||||
(tui, theme, _kb, done) => new SidePanel({ onClose: done }),
|
||||
{
|
||||
overlay: true,
|
||||
overlayOptions: {
|
||||
// Size (number = columns, string = percentage)
|
||||
width: "50%",
|
||||
minWidth: 40,
|
||||
maxHeight: "80%",
|
||||
|
||||
// Position: anchor-based (9 positions)
|
||||
anchor: "right-center",
|
||||
offsetX: -2,
|
||||
offsetY: 0,
|
||||
|
||||
// Or absolute/percentage positioning
|
||||
row: "25%", // 25% from top
|
||||
col: 10, // column 10
|
||||
|
||||
// Margins
|
||||
margin: 2, // all sides
|
||||
// margin: { top: 2, right: 4, bottom: 2, left: 4 }, // or per side (use instead of the line above)
|
||||
|
||||
// Responsive: hide on narrow terminals
|
||||
visible: (termWidth, termHeight) => termWidth >= 80,
|
||||
},
|
||||
}
|
||||
);
|
||||
```
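The "number = columns, string = percentage" convention used by `width` and `maxHeight` above can be sketched as a small resolver. The rounding (floor) and the clamping to the terminal size are assumptions, not documented pi behavior, and `resolveSize` is a hypothetical name.

```typescript
type Size = number | string; // number = cells, string = percentage such as "50%"

// Hypothetical resolver: turns a Size into a cell count within [min, total].
function resolveSize(size: Size, total: number, min = 0): number {
  const raw =
    typeof size === "number" ? size : Math.floor((parseFloat(size) / 100) * total);
  // Assumption: never exceed the terminal, never go below the minimum.
  return Math.min(total, Math.max(min, raw));
}

const wideTerminal = resolveSize("50%", 120, 40); // half of 120 columns
const narrowTerminal = resolveSize("50%", 60, 40); // half would be 30, minWidth wins
const oversized = resolveSize(200, 100); // fixed size larger than the terminal
```

This also shows why `minWidth` pairs well with a percentage width: the percentage adapts to the terminal, and the minimum keeps the panel usable.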
|
||||
|
||||
### Anchor Positions
|
||||
|
||||
```
|
||||
top-left top-center top-right
|
||||
┌────────────────────────────┐
|
||||
│ │
|
||||
left-center center right-center
|
||||
│ │
|
||||
└────────────────────────────┘
|
||||
bottom-left bottom-center bottom-right
|
||||
```
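
As a rough illustration of how the nine anchors and the `offsetX`/`offsetY` options could map to a top-left cell, consider the sketch below. `resolveAnchor`, `Anchor`, and `Size` are hypothetical names for this example; the clamping to the terminal bounds is an assumption, and pi-tui's real layout code (which also handles margins and size limits) may differ.

```typescript
// Hypothetical helper: maps an anchor name plus offsets to a top-left position.
type Anchor =
  | "top-left" | "top-center" | "top-right"
  | "left-center" | "center" | "right-center"
  | "bottom-left" | "bottom-center" | "bottom-right";

interface Size { width: number; height: number }

function resolveAnchor(
  anchor: Anchor,
  term: Size,
  overlay: Size,
  offsetX = 0,
  offsetY = 0,
): { row: number; col: number } {
  const freeCols = Math.max(0, term.width - overlay.width);
  const freeRows = Math.max(0, term.height - overlay.height);

  // Horizontal placement: centered unless the anchor names a side
  let col = Math.floor(freeCols / 2);
  if (anchor === "left-center" || anchor.endsWith("-left")) col = 0;
  if (anchor === "right-center" || anchor.endsWith("-right")) col = freeCols;

  // Vertical placement: centered unless the anchor starts with top/bottom
  let row = Math.floor(freeRows / 2);
  if (anchor.startsWith("top-")) row = 0;
  if (anchor.startsWith("bottom-")) row = freeRows;

  // Apply offsets, then keep the overlay inside the terminal (assumed behavior)
  const clamp = (v: number, max: number): number => Math.min(Math.max(v, 0), max);
  return { row: clamp(row + offsetY, freeRows), col: clamp(col + offsetX, freeCols) };
}

// 40x10 overlay on an 80x24 terminal, nudged 2 columns left of the right edge
const pos = resolveAnchor("right-center", { width: 80, height: 24 }, { width: 40, height: 10 }, -2, 0);
// pos is { row: 7, col: 38 }
```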
|
||||
|
||||
### Programmatic Visibility Control
|
||||
|
||||
```typescript
|
||||
let overlayHandle: OverlayHandle | null = null;
|
||||
|
||||
const result = await ctx.ui.custom<string | null>(
|
||||
(tui, theme, _kb, done) => new MyPanel({ onClose: done }),
|
||||
{
|
||||
overlay: true,
|
||||
overlayOptions: { anchor: "right-center", width: "40%" },
|
||||
onHandle: (handle) => {
|
||||
overlayHandle = handle;
|
||||
// handle.setHidden(true) — temporarily hide
|
||||
// handle.setHidden(false) — show again
|
||||
// handle.hide() — permanently remove
|
||||
},
|
||||
}
|
||||
);
|
||||
```
|
||||
|
||||
### Stacked Overlays
|
||||
|
||||
Multiple overlays can be shown simultaneously. They stack in order (newest on top). Each one's `done()` closes only that overlay:
|
||||
|
||||
```typescript
|
||||
// Show three stacked overlays
|
||||
const p1 = ctx.ui.custom(/* ... */, { overlay: true, overlayOptions: { offsetX: -5, offsetY: -3 } });
|
||||
const p2 = ctx.ui.custom(/* ... */, { overlay: true, overlayOptions: { offsetX: 0, offsetY: 0 } });
|
||||
const p3 = ctx.ui.custom(/* ... */, { overlay: true, overlayOptions: { offsetX: 5, offsetY: 3 } });
|
||||
|
||||
// Last one shown (p3) receives keyboard input
|
||||
// Closing p3 gives focus to p2, closing p2 gives focus to p1
|
||||
```
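
The focus rule described above (newest overlay receives input, closing it hands focus to the one below) can be modeled with a plain stack. This is a minimal sketch under that assumption; `OverlayStack` is a hypothetical class for illustration and says nothing about how the TUI stores overlays internally.

```typescript
// Hypothetical model of overlay stacking: the newest overlay is on top.
class OverlayStack<T> {
  private stack: T[] = [];

  // Show a new overlay; the returned function closes only this overlay.
  show(overlay: T): () => void {
    this.stack.push(overlay);
    return () => {
      const index = this.stack.indexOf(overlay);
      if (index !== -1) this.stack.splice(index, 1);
    };
  }

  // The overlay that currently receives keyboard input, if any.
  focused(): T | undefined {
    return this.stack[this.stack.length - 1];
  }
}

const stack = new OverlayStack<string>();
const closeP1 = stack.show("p1");
const closeP2 = stack.show("p2");
const closeP3 = stack.show("p3");
```

Each close function plays the role of that overlay's `done()`: calling `closeP3()` leaves `p2` focused, and calling `closeP2()` afterwards leaves `p1`.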
|
||||
|
||||
### ⚠️ Overlay Lifecycle Rule
|
||||
|
||||
**Overlay components are disposed when closed. Never reuse references.**
|
||||
|
||||
```typescript
|
||||
// ❌ WRONG — stale reference
|
||||
let menu: MenuComponent;
|
||||
await ctx.ui.custom((_, __, ___, done) => {
|
||||
menu = new MenuComponent(done);
|
||||
return menu;
|
||||
}, { overlay: true });
|
||||
menu.doSomething(); // DISPOSED — will fail
|
||||
|
||||
// ✅ CORRECT — re-call the factory
|
||||
const showMenu = () => ctx.ui.custom(
|
||||
(_, __, ___, done) => new MenuComponent(done),
|
||||
{ overlay: true }
|
||||
);
|
||||
await showMenu(); // First show
|
||||
await showMenu(); // Re-show with fresh instance
|
||||
```
|
||||
|
||||
---
|
||||
72
docs/pi-ui-tui/13-custom-editors-replacing-the-input.md
Normal file
|
|
@ -0,0 +1,72 @@
|
|||
# Custom Editors — Replacing the Input
|
||||
|
||||
Replace the main input editor with a custom implementation. The editor persists until explicitly removed.
|
||||
|
||||
### The Pattern
|
||||
|
||||
```typescript
|
||||
import { CustomEditor, type ExtensionAPI } from "@mariozechner/pi-coding-agent";
|
||||
import { matchesKey, truncateToWidth } from "@mariozechner/pi-tui";
|
||||
|
||||
class VimEditor extends CustomEditor {
|
||||
private mode: "normal" | "insert" = "insert";
|
||||
|
||||
handleInput(data: string): void {
|
||||
// Escape in insert mode → switch to normal
|
||||
if (matchesKey(data, "escape") && this.mode === "insert") {
|
||||
this.mode = "normal";
|
||||
return;
|
||||
}
|
||||
|
||||
// Insert mode: pass everything to CustomEditor for text editing + app keybindings
|
||||
if (this.mode === "insert") {
|
||||
super.handleInput(data);
|
||||
return;
|
||||
}
|
||||
|
||||
// Normal mode: vim keys
|
||||
switch (data) {
|
||||
case "i": this.mode = "insert"; return;
|
||||
case "h": super.handleInput("\x1b[D"); return; // Left arrow
|
||||
case "j": super.handleInput("\x1b[B"); return; // Down arrow
|
||||
case "k": super.handleInput("\x1b[A"); return; // Up arrow
|
||||
case "l": super.handleInput("\x1b[C"); return; // Right arrow
|
||||
}
|
||||
|
||||
// Filter printable chars in normal mode (don't insert them)
|
||||
if (data.length === 1 && data.charCodeAt(0) >= 32) return;
|
||||
|
||||
// Pass unhandled to super (ctrl+c, ctrl+d, etc.)
|
||||
super.handleInput(data);
|
||||
}
|
||||
|
||||
render(width: number): string[] {
|
||||
const lines = super.render(width);
|
||||
// Add mode indicator to last line
|
||||
if (lines.length > 0) {
|
||||
const label = this.mode === "normal" ? " NORMAL " : " INSERT ";
|
||||
const lastLine = lines[lines.length - 1]!;
|
||||
lines[lines.length - 1] = truncateToWidth(lastLine, width - label.length, "") + label;
|
||||
}
|
||||
return lines;
|
||||
}
|
||||
}
|
||||
|
||||
// Register it:
|
||||
export default function (pi: ExtensionAPI) {
|
||||
pi.on("session_start", (_event, ctx) => {
|
||||
ctx.ui.setEditorComponent((_tui, theme, keybindings) =>
|
||||
new VimEditor(theme, keybindings)
|
||||
);
|
||||
});
|
||||
}
|
||||
```
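
The mode logic in `VimEditor` can also be expressed as a pure function, which makes it easy to test without a terminal. The sketch below is a hypothetical rewrite for illustration: `handleVimKey` and `KeyResult` are not part of pi, and it detects Escape as a lone `\x1b` byte instead of calling `matchesKey`, which is a simplification.

```typescript
// Hypothetical pure helper mirroring the key handling in VimEditor.
type Mode = "normal" | "insert";

interface KeyResult {
  mode: Mode;       // mode after handling the key
  forward?: string; // data to pass on to super.handleInput(), if any
}

const ARROWS = new Map<string, string>([
  ["h", "\x1b[D"], // Left arrow
  ["j", "\x1b[B"], // Down arrow
  ["k", "\x1b[A"], // Up arrow
  ["l", "\x1b[C"], // Right arrow
]);

function handleVimKey(mode: Mode, data: string): KeyResult {
  if (mode === "insert") {
    // Simplification: treat a lone ESC byte as the Escape key
    if (data === "\x1b") return { mode: "normal" };
    return { mode, forward: data };
  }
  if (data === "i") return { mode: "insert" };
  const arrow = ARROWS.get(data);
  if (arrow !== undefined) return { mode, forward: arrow };
  // Swallow other printable characters in normal mode
  if (data.length === 1 && data.charCodeAt(0) >= 32) return { mode };
  // Forward everything else (ctrl+c, ctrl+d, ...) unchanged
  return { mode, forward: data };
}
```

A class-based editor could then call this helper from `handleInput`, store the returned `mode`, and pass `forward` to `super.handleInput()` when it is defined.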
|
||||
|
||||
### Critical Rules
|
||||
|
||||
1. **Extend `CustomEditor`**, not `Editor`. `CustomEditor` provides app keybindings (escape to abort, ctrl+d to exit, model switching) that must not be lost.
|
||||
2. **Call `super.handleInput(data)`** for any key you don't handle.
|
||||
3. **Use the factory pattern**: `setEditorComponent` receives a factory `(tui, theme, keybindings) => CustomEditor`.
|
||||
4. **Pass `undefined` to restore default**: `ctx.ui.setEditorComponent(undefined)`.
|
||||
|
||||
---
|
||||
95
docs/pi-ui-tui/14-tool-rendering-custom-tool-display.md
Normal file
|
|
@ -0,0 +1,95 @@
|
|||
# Tool Rendering — Custom Tool Display
|
||||
|
||||
Tools can control how their calls and results appear in the message area.
|
||||
|
||||
### renderCall — How the Tool Call Looks
|
||||
|
||||
```typescript
|
||||
import { Text } from "@mariozechner/pi-tui";
|
||||
|
||||
pi.registerTool({
|
||||
name: "my_tool",
|
||||
// ...
|
||||
|
||||
renderCall(args, theme) {
|
||||
// args = the tool call arguments
|
||||
let text = theme.fg("toolTitle", theme.bold("my_tool "));
|
||||
text += theme.fg("muted", args.action);
|
||||
if (args.text) text += " " + theme.fg("dim", `"${args.text}"`);
|
||||
return new Text(text, 0, 0); // 0,0 padding — the wrapping Box handles padding
|
||||
},
|
||||
});
|
||||
```
|
||||
|
||||
### renderResult — How the Tool Result Looks
|
||||
|
||||
```typescript
|
||||
import { Text } from "@mariozechner/pi-tui";
|
||||
import { keyHint } from "@mariozechner/pi-coding-agent";
|
||||
|
||||
pi.registerTool({
|
||||
name: "my_tool",
|
||||
// ...
|
||||
|
||||
renderResult(result, { expanded, isPartial }, theme) {
|
||||
// result.content — the content array sent to the LLM
|
||||
// result.details — your custom details object
|
||||
// expanded — whether user toggled expand (Ctrl+O)
|
||||
// isPartial — streaming in progress (onUpdate was called)
|
||||
|
||||
// Handle streaming state
|
||||
if (isPartial) {
|
||||
return new Text(theme.fg("warning", "Processing..."), 0, 0);
|
||||
}
|
||||
|
||||
// Handle errors
|
||||
if (result.details?.error) {
|
||||
return new Text(theme.fg("error", `Error: ${result.details.error}`), 0, 0);
|
||||
}
|
||||
|
||||
// Default view (collapsed)
|
||||
let text = theme.fg("success", "✓ Done");
|
||||
if (!expanded) {
|
||||
text += ` (${keyHint("expandTools", "to expand")})`;
|
||||
}
|
||||
|
||||
// Expanded view — show details
|
||||
if (expanded && result.details?.items) {
|
||||
for (const item of result.details.items) {
|
||||
text += "\n " + theme.fg("dim", item);
|
||||
}
|
||||
}
|
||||
|
||||
return new Text(text, 0, 0);
|
||||
},
|
||||
});
|
||||
```
|
||||
|
||||
### Key Hints for Keybindings
|
||||
|
||||
```typescript
|
||||
import { keyHint, appKeyHint, editorKey, rawKeyHint } from "@mariozechner/pi-coding-agent";
|
||||
|
||||
// Editor action hint (respects user's keybinding config)
|
||||
keyHint("expandTools", "to expand") // e.g., "Ctrl+O to expand"
|
||||
keyHint("selectConfirm", "to select") // e.g., "Enter to select"
|
||||
|
||||
// Raw key hint
|
||||
rawKeyHint("Ctrl+O", "to expand") // Always shows "Ctrl+O to expand"
|
||||
```
|
||||
|
||||
### Fallback Behavior
|
||||
|
||||
If `renderCall` or `renderResult` is not defined or throws:
|
||||
- `renderCall` → shows tool name
|
||||
- `renderResult` → shows raw text from `content`
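
Conceptually, the fallback amounts to "use the custom renderer if it exists and succeeds, otherwise use a default". The sketch below shows that shape only; `renderWithFallback` is a hypothetical helper, it returns strings where real renderers return components, and it is not the host's actual implementation.

```typescript
// Hypothetical sketch of the fallback rule for optional, possibly-throwing renderers.
function renderWithFallback<A>(
  render: ((arg: A) => string) | undefined,
  arg: A,
  fallback: (arg: A) => string,
): string {
  if (!render) return fallback(arg); // renderer not defined
  try {
    return render(arg);
  } catch {
    return fallback(arg); // renderer threw
  }
}

interface Call { name: string }

const showName = (call: Call): string => call.name;
const broken = (_call: Call): string => {
  throw new Error("render failed");
};

const a = renderWithFallback(undefined, { name: "my_tool" }, showName);
const b = renderWithFallback(broken, { name: "my_tool" }, showName);
const c = renderWithFallback((call: Call) => `custom ${call.name}`, { name: "my_tool" }, showName);
```

Here `a` and `b` both end up as the plain tool name, while `c` uses the custom output.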
|
||||
|
||||
### Best Practices
|
||||
|
||||
- Return `Text` with padding `(0, 0)` — the wrapping `Box` handles padding
|
||||
- Support `expanded` for detail on demand
|
||||
- Handle `isPartial` for streaming progress
|
||||
- Keep the default (collapsed) view compact
|
||||
- Use `\n` for multi-line content within a single `Text`
|
||||
|
||||
---
|
||||
|
|
@ -0,0 +1,33 @@
|
|||
# Message Rendering — Custom Message Display
|
||||
|
||||
Register a renderer for messages with your `customType`:
|
||||
|
||||
```typescript
|
||||
import { Text } from "@mariozechner/pi-tui";
|
||||
|
||||
pi.registerMessageRenderer("my-extension", (message, options, theme) => {
|
||||
const { expanded } = options;
|
||||
|
||||
let text = theme.fg("accent", `[${message.customType}] `);
|
||||
text += message.content;
|
||||
|
||||
if (expanded && message.details) {
|
||||
text += "\n" + theme.fg("dim", JSON.stringify(message.details, null, 2));
|
||||
}
|
||||
|
||||
return new Text(text, 0, 0);
|
||||
});
|
||||
```
|
||||
|
||||
Send messages that use this renderer:
|
||||
|
||||
```typescript
|
||||
pi.sendMessage({
|
||||
customType: "my-extension", // Must match registerMessageRenderer
|
||||
content: "Status update",
|
||||
display: true, // Show in TUI
|
||||
details: { progress: 50 }, // Available in renderer, NOT sent to LLM
|
||||
});
|
||||
```
|
||||
|
||||
---
|
||||
78
docs/pi-ui-tui/16-performance-caching-and-invalidation.md
Normal file
|
|
@ -0,0 +1,78 @@
|
|||
# Performance — Caching and Invalidation
|
||||
|
||||
### The Caching Pattern
|
||||
|
||||
Always cache `render()` output and recompute only when state changes:
|
||||
|
||||
```typescript
|
||||
class CachedComponent {
|
||||
private cachedWidth?: number;
|
||||
private cachedLines?: string[];
|
||||
|
||||
render(width: number): string[] {
|
||||
if (this.cachedLines && this.cachedWidth === width) {
|
||||
return this.cachedLines;
|
||||
}
|
||||
|
||||
// Expensive computation here
|
||||
const lines = this.computeLines(width);
|
||||
|
||||
this.cachedWidth = width;
|
||||
this.cachedLines = lines;
|
||||
return lines;
|
||||
}
|
||||
|
||||
invalidate(): void {
|
||||
this.cachedWidth = undefined;
|
||||
this.cachedLines = undefined;
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### The Update Cycle
|
||||
|
||||
```
|
||||
State changes → invalidate() → tui.requestRender() → render(width) called
|
||||
```
|
||||
|
||||
1. Something changes your component's state (user input, timer, async result)
|
||||
2. Call `this.invalidate()` to clear caches
|
||||
3. Call `tui.requestRender()` to schedule a re-render
|
||||
4. The TUI calls `render(width)` on the next frame
|
||||
5. Your component recomputes its output (since cache was cleared)
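
The five steps can be traced with a small standalone component. This is a minimal sketch: `FakeTui` is a stand-in that only counts render requests (the real TUI schedules a frame), and `Counter` is a hypothetical component written for this example.

```typescript
// Stand-in for the TUI: records render requests instead of scheduling a frame.
class FakeTui {
  requests = 0;
  requestRender(): void {
    this.requests++;
  }
}

class Counter {
  private count = 0;
  private cachedWidth: number | undefined;
  private cachedLines: string[] | undefined;
  computations = 0; // how many times render() actually recomputed

  constructor(private tui: { requestRender: () => void }) {}

  increment(): void {
    this.count++;             // 1. state changes
    this.invalidate();        // 2. clear caches
    this.tui.requestRender(); // 3. schedule a re-render
  }

  render(width: number): string[] {
    // 4. called by the TUI on the next frame
    if (this.cachedLines && this.cachedWidth === width) return this.cachedLines;
    this.computations++;      // 5. recompute because the cache was cleared
    this.cachedWidth = width;
    this.cachedLines = [`Count: ${this.count}`.slice(0, width)];
    return this.cachedLines;
  }

  invalidate(): void {
    this.cachedWidth = undefined;
    this.cachedLines = undefined;
  }
}

const tui = new FakeTui();
const counter = new Counter(tui);
counter.render(40);  // computes
counter.render(40);  // served from cache
counter.increment(); // invalidates and requests a render
counter.render(40);  // computes again
```

Three `render()` calls result in only two computations, and exactly one render request reaches the TUI.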
|
||||
|
||||
### Game Loop Pattern (Real-Time Updates)
|
||||
|
||||
```typescript
class GameComponent {
  private interval: ReturnType<typeof setInterval> | null = null;
  private version = 0;        // Bumped on every tick
  private cachedVersion = -1; // Version the cache was built from
  private cachedWidth?: number;
  private cachedLines?: string[];

  constructor(private tui: { requestRender: () => void }) {
    this.interval = setInterval(() => {
      this.tick();
      this.version++;
      this.tui.requestRender();
    }, 100); // 10 FPS
  }

  // Placeholder: advance your game state here
  private tick(): void {}

  render(width: number): string[] {
    // Reuse the cache only if neither the state nor the width changed
    if (this.cachedLines && this.cachedVersion === this.version && this.cachedWidth === width) {
      return this.cachedLines;
    }
    // ... build the frame for the current state ...
    const lines: string[] = [];
    this.cachedVersion = this.version;
    this.cachedWidth = width;
    this.cachedLines = lines;
    return lines;
  }

  invalidate(): void {
    this.cachedLines = undefined;
  }

  dispose(): void {
    if (this.interval) {
      clearInterval(this.interval);
      this.interval = null;
    }
  }
}
```
|
||||
|
||||
---
|
||||
49
docs/pi-ui-tui/17-theme-changes-and-invalidation.md
Normal file
|
|
@ -0,0 +1,49 @@
|
|||
# Theme Changes and Invalidation
|
||||
|
||||
When the user switches themes, the TUI calls `invalidate()` on all components. If your component pre-bakes theme colors, you must rebuild them.
|
||||
|
||||
### ❌ Wrong — Theme Colors Won't Update
|
||||
|
||||
```typescript
|
||||
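import type { Theme } from "@mariozechner/pi-coding-agent";
import { Container, Text } from "@mariozechner/pi-tui";

class BadComponent extends Container {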
|
||||
constructor(message: string, theme: Theme) {
|
||||
super();
|
||||
// Pre-baked theme colors — stuck with old theme forever!
|
||||
this.addChild(new Text(theme.fg("accent", message), 1, 0));
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### ✅ Correct — Rebuild on Invalidate
|
||||
|
||||
```typescript
|
||||
class GoodComponent extends Container {
|
||||
private message: string;
|
||||
private theme: Theme;
|
||||
|
||||
constructor(message: string, theme: Theme) {
|
||||
super();
|
||||
this.message = message;
|
||||
this.theme = theme;
|
||||
this.rebuild();
|
||||
}
|
||||
|
||||
private rebuild(): void {
|
||||
this.clear(); // Remove all children
|
||||
this.addChild(new Text(this.theme.fg("accent", this.message), 1, 0));
|
||||
}
|
||||
|
||||
override invalidate(): void {
|
||||
super.invalidate();
|
||||
this.rebuild(); // Rebuild with current theme
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### When You Need This Pattern
|
||||
|
||||
**NEED to rebuild:** Pre-baked `theme.fg()`/`theme.bg()` strings, `highlightCode()` results, complex child trees with embedded colors.
|
||||
|
||||
**DON'T need to rebuild:** Theme callbacks `(text) => theme.fg("accent", text)`, stateless renders that compute fresh each time, simple containers without themed content.
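
The difference between a pre-baked string and a theme callback can be seen without any TUI at all. The sketch below uses stand-ins throughout: `MiniTheme` is not the real `Theme` type, the two palettes are made up, and a mutable `current` variable simulates a theme switch, since the source does not show how the active theme is swapped.

```typescript
// Stand-in theme: NOT the real Theme type, just enough to show the difference.
interface MiniTheme {
  fg(color: string, text: string): string;
}

const dark: MiniTheme = { fg: (_color, text) => `\x1b[36m${text}\x1b[39m` };
const light: MiniTheme = { fg: (_color, text) => `\x1b[34m${text}\x1b[39m` };

let current: MiniTheme = dark;

// Pre-baked: the escape codes are frozen into the string at this point.
const prebaked = current.fg("accent", "Title");

// Callback: looks up the current theme every time it is called.
const themed = (text: string): string => current.fg("accent", text);

current = light; // simulate the user switching themes

const prebakedIsStale = prebaked === dark.fg("accent", "Title");
const callbackIsFresh = themed("Title") === light.fg("accent", "Title");
```

After the switch, `prebaked` still carries the old palette's codes, which is why such strings have to be rebuilt in `invalidate()`, while the callback picks up the new palette on its own.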
|
||||
|
||||
---
|
||||
43
docs/pi-ui-tui/18-ime-support-the-focusable-interface.md
Normal file
|
|
@ -0,0 +1,43 @@
|
|||
# IME Support — The Focusable Interface
|
||||
|
||||
For components that display a text cursor and need IME (Input Method Editor) support for CJK languages:
|
||||
|
||||
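```typescript
import { CURSOR_MARKER, type Component, type Focusable } from "@mariozechner/pi-tui";

class MyInput implements Component, Focusable {
  focused: boolean = false; // Set by TUI when focus changes
  private text = "";  // Current input value
  private cursor = 0; // Cursor index into `text`

  render(width: number): string[] {
    // Split the text around the cursor; show a space when the cursor is at the end
    const beforeCursor = this.text.slice(0, this.cursor);
    const atCursor = this.text[this.cursor] ?? " ";
    const afterCursor = this.text.slice(this.cursor + 1);
    // Emit the marker only while focused
    const marker = this.focused ? CURSOR_MARKER : "";
    return [`> ${beforeCursor}${marker}\x1b[7m${atCursor}\x1b[27m${afterCursor}`];
  }

  invalidate(): void {} // Nothing cached
}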
```
|
||||
|
||||
### Container with Embedded Input
|
||||
|
||||
If your container contains an `Input` or `Editor` child, propagate focus:
|
||||
|
||||
```typescript
|
||||
import { Container, Input, type Focusable } from "@mariozechner/pi-tui";

class SearchDialog extends Container implements Focusable {
|
||||
private searchInput: Input;
|
||||
private _focused = false;
|
||||
|
||||
get focused(): boolean { return this._focused; }
|
||||
set focused(value: boolean) {
|
||||
this._focused = value;
|
||||
this.searchInput.focused = value; // Propagate!
|
||||
}
|
||||
|
||||
constructor() {
|
||||
super();
|
||||
this.searchInput = new Input();
|
||||
this.addChild(this.searchInput);
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
Without this, IME candidate windows (Chinese, Japanese, Korean input) appear in the wrong position.
|
||||
|
||||
---
|
||||
133
docs/pi-ui-tui/19-building-a-complete-component-step-by-step.md
Normal file
|
|
@ -0,0 +1,133 @@
|
|||
# Building a Complete Component — Step by Step
|
||||
|
||||
Let's build a real component: an interactive todo list displayed via a command.
|
||||
|
||||
```typescript
|
||||
import type { ExtensionAPI, Theme } from "@mariozechner/pi-coding-agent";
|
||||
import { Key, matchesKey, truncateToWidth } from "@mariozechner/pi-tui";
|
||||
|
||||
interface TodoItem {
|
||||
text: string;
|
||||
done: boolean;
|
||||
}
|
||||
|
||||
class TodoListUI {
|
||||
private items: TodoItem[];
|
||||
private selected = 0;
|
||||
private theme: Theme;
|
||||
private done: (items: TodoItem[]) => void;
|
||||
private tui: { requestRender: () => void };
|
||||
private cachedWidth?: number;
|
||||
private cachedLines?: string[];
|
||||
|
||||
constructor(
|
||||
tui: { requestRender: () => void },
|
||||
theme: Theme,
|
||||
items: TodoItem[],
|
||||
done: (items: TodoItem[]) => void,
|
||||
) {
|
||||
this.tui = tui;
|
||||
this.theme = theme;
|
||||
this.items = items.map((item) => ({ ...item })); // Copy each item so toggling doesn't mutate the caller's objects
|
||||
this.done = done;
|
||||
}
|
||||
|
||||
handleInput(data: string): void {
|
||||
if (matchesKey(data, Key.up) && this.selected > 0) {
|
||||
this.selected--;
|
||||
} else if (matchesKey(data, Key.down) && this.selected < this.items.length - 1) {
|
||||
this.selected++;
|
||||
} else if (matchesKey(data, Key.space)) {
|
||||
// Toggle current item (guard against an empty list)
const item = this.items[this.selected];
if (item) item.done = !item.done;
|
||||
} else if (matchesKey(data, Key.enter)) {
|
||||
this.done(this.items);
|
||||
return;
|
||||
} else if (matchesKey(data, Key.escape)) {
|
||||
this.done(this.items);
|
||||
return;
|
||||
} else {
|
||||
return; // Don't invalidate for unhandled keys
|
||||
}
|
||||
|
||||
this.invalidate();
|
||||
this.tui.requestRender();
|
||||
}
|
||||
|
||||
render(width: number): string[] {
|
||||
if (this.cachedLines && this.cachedWidth === width) {
|
||||
return this.cachedLines;
|
||||
}
|
||||
|
||||
const th = this.theme;
|
||||
const lines: string[] = [];
|
||||
|
||||
// Border
|
||||
lines.push(truncateToWidth(th.fg("accent", "─".repeat(width)), width));
|
||||
|
||||
// Title
|
||||
const done = this.items.filter(i => i.done).length;
|
||||
lines.push(truncateToWidth(
|
||||
` ${th.fg("accent", th.bold("Todos"))} ${th.fg("muted", `${done}/${this.items.length}`)}`,
|
||||
width
|
||||
));
|
||||
lines.push("");
|
||||
|
||||
// Items
|
||||
for (let i = 0; i < this.items.length; i++) {
|
||||
const item = this.items[i];
|
||||
const isSelected = i === this.selected;
|
||||
const prefix = isSelected ? th.fg("accent", "> ") : " ";
|
||||
const check = item.done ? th.fg("success", "✓ ") : th.fg("dim", "○ ");
|
||||
const text = item.done
|
||||
? th.fg("muted", th.strikethrough(item.text))
|
||||
: th.fg("text", item.text);
|
||||
|
||||
lines.push(truncateToWidth(`${prefix}${check}${text}`, width));
|
||||
}
|
||||
|
||||
if (this.items.length === 0) {
|
||||
lines.push(truncateToWidth(` ${th.fg("dim", "No items")}`, width));
|
||||
}
|
||||
|
||||
// Help
|
||||
lines.push("");
|
||||
lines.push(truncateToWidth(
|
||||
` ${th.fg("dim", "↑↓ navigate • Space toggle • Enter/Esc close")}`,
|
||||
width
|
||||
));
|
||||
lines.push(truncateToWidth(th.fg("accent", "─".repeat(width)), width));
|
||||
|
||||
this.cachedWidth = width;
|
||||
this.cachedLines = lines;
|
||||
return lines;
|
||||
}
|
||||
|
||||
invalidate(): void {
|
||||
this.cachedWidth = undefined;
|
||||
this.cachedLines = undefined;
|
||||
}
|
||||
}
|
||||
|
||||
// Usage in an extension:
|
||||
export default function (pi: ExtensionAPI) {
|
||||
let items: TodoItem[] = [
|
||||
{ text: "First task", done: false },
|
||||
{ text: "Second task", done: true },
|
||||
{ text: "Third task", done: false },
|
||||
];
|
||||
|
||||
pi.registerCommand("todos", {
|
||||
description: "Interactive todo list",
|
||||
handler: async (_args, ctx) => {
|
||||
const result = await ctx.ui.custom<TodoItem[]>((tui, theme, _kb, done) => {
|
||||
return new TodoListUI(tui, theme, items, done);
|
||||
});
|
||||
items = result; // Save updated state
|
||||
},
|
||||
});
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
147
docs/pi-ui-tui/20-real-world-patterns-from-examples.md
Normal file
|
|
@ -0,0 +1,147 @@
|
|||
# Real-World Patterns from Examples
|
||||
|
||||
### Pattern: Selection Dialog with Borders
|
||||
|
||||
From `preset.ts` and `tools.ts`:
|
||||
|
||||
```typescript
|
||||
const result = await ctx.ui.custom<string | null>((tui, theme, _kb, done) => {
|
||||
const container = new Container();
|
||||
container.addChild(new DynamicBorder((s: string) => theme.fg("accent", s)));
|
||||
container.addChild(new Text(theme.fg("accent", theme.bold("Title")), 1, 0));
|
||||
|
||||
const selectList = new SelectList(items, maxVisible, {
|
||||
selectedPrefix: (t) => theme.fg("accent", t),
|
||||
selectedText: (t) => theme.fg("accent", t),
|
||||
description: (t) => theme.fg("muted", t),
|
||||
scrollInfo: (t) => theme.fg("dim", t),
|
||||
noMatch: (t) => theme.fg("warning", t),
|
||||
});
|
||||
selectList.onSelect = (item) => done(item.value);
|
||||
selectList.onCancel = () => done(null);
|
||||
container.addChild(selectList);
|
||||
container.addChild(new Text(theme.fg("dim", "↑↓ navigate • enter select • esc cancel"), 1, 0));
|
||||
container.addChild(new DynamicBorder((s: string) => theme.fg("accent", s)));
|
||||
|
||||
return {
|
||||
render: (w) => container.render(w),
|
||||
invalidate: () => container.invalidate(),
|
||||
handleInput: (data) => { selectList.handleInput(data); tui.requestRender(); },
|
||||
};
|
||||
});
|
||||
```
|
||||
|
||||
### Pattern: Game with Timer Loop
|
||||
|
||||
From `snake.ts`:
|
||||
|
||||
```typescript
import { Key, matchesKey } from "@mariozechner/pi-tui";

class SnakeComponent {
  private interval: ReturnType<typeof setInterval> | null = null;
  private version = 0; // Render version, bumped on every tick

  constructor(tui: { requestRender: () => void }, private done: () => void) {
    this.interval = setInterval(() => {
      this.tick();         // Update game state
      this.version++;      // Bump render version
      tui.requestRender(); // Request re-render
    }, 100);
  }

  // Placeholder: move the snake, check collisions, etc.
  private tick(): void {}

  dispose(): void {
    if (this.interval) {
      clearInterval(this.interval);
      this.interval = null;
    }
  }

  // Call dispose() before calling done() to stop the timer
  handleInput(data: string): void {
    if (matchesKey(data, Key.escape)) {
      this.dispose();
      this.done();
    }
  }

  // render(width) and invalidate() omitted for brevity
}
```
|
||||
|
||||
### Pattern: Async Operation with Spinner
|
||||
|
||||
From `qna.ts`:
|
||||
|
||||
```typescript
|
||||
const result = await ctx.ui.custom<string | null>((tui, theme, _kb, done) => {
|
||||
const loader = new BorderedLoader(tui, theme, "Processing...");
|
||||
loader.onAbort = () => done(null);
|
||||
|
||||
doAsyncWork(loader.signal)
|
||||
.then(data => done(data))
|
||||
.catch(() => done(null));
|
||||
|
||||
return loader;
|
||||
});
|
||||
```
|
||||
|
||||
### Pattern: Persistent Widget with Live Updates
|
||||
|
||||
From `plan-mode/index.ts`:
|
||||
|
||||
```typescript
|
||||
function updateUI(ctx: ExtensionContext): void {
|
||||
if (todoItems.length > 0) {
|
||||
const lines = todoItems.map(item => {
|
||||
if (item.completed) {
|
||||
return ctx.ui.theme.fg("success", "☑ ") +
|
||||
ctx.ui.theme.fg("muted", ctx.ui.theme.strikethrough(item.text));
|
||||
}
|
||||
return ctx.ui.theme.fg("muted", "☐ ") + item.text;
|
||||
});
|
||||
ctx.ui.setWidget("plan-todos", lines);
|
||||
const completed = todoItems.filter((item) => item.completed).length;
ctx.ui.setStatus("plan-mode", ctx.ui.theme.fg("accent", `📋 ${completed}/${todoItems.length}`));
|
||||
} else {
|
||||
ctx.ui.setWidget("plan-todos", undefined);
|
||||
ctx.ui.setStatus("plan-mode", undefined);
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Pattern: Multi-Tab Questionnaire
|
||||
|
||||
From `questionnaire.ts`:
|
||||
|
||||
```typescript
|
||||
// State: currentTab, optionIndex, inputMode, answers map
|
||||
// Tab navigation with shift+tab / tab
|
||||
// Option selection with up/down + enter
|
||||
// "Type something" option that switches to embedded Editor
|
||||
// Submit tab that shows summary of all answers
|
||||
// Full renderCall and renderResult for LLM context display
|
||||
```
|
||||
|
||||
### Pattern: Custom Footer with Reactive Data
|
||||
|
||||
From `custom-footer.ts`:
|
||||
|
||||
```typescript
|
||||
ctx.ui.setFooter((tui, theme, footerData) => ({
|
||||
invalidate() {},
|
||||
render(width: number): string[] {
|
||||
let input = 0, output = 0, cost = 0;
|
||||
for (const e of ctx.sessionManager.getBranch()) {
|
||||
if (e.type === "message" && e.message.role === "assistant") {
|
||||
const m = e.message as AssistantMessage;
|
||||
input += m.usage.input;
|
||||
output += m.usage.output;
|
||||
cost += m.usage.cost.total;
|
||||
}
|
||||
}
|
||||
const branch = footerData.getGitBranch();
|
||||
const left = theme.fg("dim", `↑${fmt(input)} ↓${fmt(output)} $${cost.toFixed(3)}`);
|
||||
const right = theme.fg("dim", `${ctx.model?.id}${branch ? ` (${branch})` : ""}`);
|
||||
const pad = " ".repeat(Math.max(1, width - visibleWidth(left) - visibleWidth(right)));
|
||||
return [truncateToWidth(left + pad + right, width)];
|
||||
},
|
||||
dispose: footerData.onBranchChange(() => tui.requestRender()),
|
||||
}));
|
||||
```
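
The footer excerpt calls a `fmt` helper that is not shown. A minimal sketch of what such a helper might look like follows, together with the left/right padding step. Both functions are hypothetical: the real `fmt` in `custom-footer.ts` may format differently, and `justify` uses `.length`, which is only correct for plain text (colored strings need `visibleWidth()` as in the excerpt).

```typescript
// Hypothetical stand-in for the `fmt` helper: compact token counts.
function fmt(n: number): string {
  if (n < 1000) return String(n);
  if (n < 1000000) return `${(n / 1000).toFixed(1)}k`;
  return `${(n / 1000000).toFixed(1)}M`;
}

// Pads between a left and a right segment so the right one sits at the edge.
// Assumes plain text without ANSI escape codes.
function justify(left: string, right: string, width: number): string {
  const pad = " ".repeat(Math.max(1, width - left.length - right.length));
  return (left + pad + right).slice(0, width);
}

const line = justify(`in ${fmt(1500)} out ${fmt(320)}`, "main", 30);
```

The `Math.max(1, ...)` keeps at least one space between the two segments even when the terminal is too narrow, and the final slice keeps the line within `width`.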
|
||||
|
||||
---
|
||||
53
docs/pi-ui-tui/21-common-mistakes-and-how-to-avoid-them.md
Normal file
|
|
@ -0,0 +1,53 @@
|
|||
# Common Mistakes and How to Avoid Them
|
||||
|
||||
### 1. Lines exceed width
|
||||
|
||||
**Symptom:** Visual corruption, overlapping lines, garbled display.
|
||||
**Fix:** Use `truncateToWidth()` on every line.
|
||||
|
||||
### 2. Forgetting `tui.requestRender()`
|
||||
|
||||
**Symptom:** UI doesn't update after state changes.
|
||||
**Fix:** Call `this.invalidate()` then `tui.requestRender()` after any state change in `handleInput`.
|
||||
|
||||
### 3. Importing theme directly
|
||||
|
||||
**Symptom:** Wrong colors, crashes, or stale theme after switching.
|
||||
**Fix:** Always use `theme` from the callback: `ctx.ui.custom((tui, theme, kb, done) => ...)`.
|
||||
|
||||
### 4. Not typing DynamicBorder color param
|
||||
|
||||
**Symptom:** TypeScript error or runtime crash.
|
||||
**Fix:** `new DynamicBorder((s: string) => theme.fg("accent", s))` — always add `s: string`.
|
||||
|
||||
### 5. Reusing disposed overlay components
|
||||
|
||||
**Symptom:** Component doesn't render, events don't fire.
|
||||
**Fix:** Create fresh instances each time. Never save references to overlay components.
|
||||
|
||||
### 6. Styles bleeding across lines
|
||||
|
||||
**Symptom:** Colors from one line appear on the next.
|
||||
**Fix:** The TUI resets styles at the end of each line. Reapply styles per line, or use `wrapTextWithAnsi()`.
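
One way to reapply styles per line is to split the text first and wrap each line separately. The sketch below uses raw SGR codes and a hypothetical `styleLines` helper for illustration; in an extension you would normally pass each line through `theme.fg()` instead.

```typescript
// Hypothetical helper: style each line on its own so that a per-line reset
// cannot cut the style off after the first line.
function styleLines(text: string, open: string, close: string): string[] {
  return text.split("\n").map((line) => `${open}${line}${close}`);
}

const red = "\x1b[31m";     // SGR: red foreground
const resetFg = "\x1b[39m"; // SGR: default foreground
const styled = styleLines("first\nsecond", red, resetFg);
```

Every element of `styled` opens and closes its own color, so each rendered line is self-contained.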
|
||||
|
||||
### 7. Not implementing invalidate()
|
||||
|
||||
**Symptom:** Theme changes don't take effect, stale rendering.
|
||||
**Fix:** Clear all caches in `invalidate()`. If you pre-bake theme colors, rebuild them.
|
||||
|
||||
### 8. Forgetting to call `super.invalidate()`
|
||||
|
||||
**Symptom:** Child components don't update when extending Container/Box.
|
||||
**Fix:** `override invalidate() { super.invalidate(); /* your cleanup */ }`
|
||||
|
||||
### 9. Timer not cleaned up
|
||||
|
||||
**Symptom:** Errors after component closes, memory leaks, phantom updates.
|
||||
**Fix:** Call `clearInterval` in a `dispose()` method before calling `done()`.
|
||||
|
||||
### 10. Using `ctx.ui` methods in non-interactive mode
|
||||
|
||||
**Symptom:** Hangs (dialogs waiting for input that will never come) or silent failures.
|
||||
**Fix:** Check `ctx.hasUI` before calling dialog methods.
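
A small wrapper keeps the `hasUI` check in one place. This is a sketch under stated assumptions: `MiniContext` is a stand-in declaring only the members used here, `confirm` is assumed to return a promise, and `confirmOrDefault` is a hypothetical helper, not part of pi.

```typescript
// Minimal stand-in for the parts of the extension context used here.
interface MiniContext {
  hasUI: boolean;
  ui: { confirm(title: string, message: string): Promise<boolean> };
}

// Ask the user when a UI exists; otherwise return a caller-chosen default
// instead of waiting for input that will never arrive.
async function confirmOrDefault(
  ctx: MiniContext,
  title: string,
  message: string,
  fallback: boolean,
): Promise<boolean> {
  if (!ctx.hasUI) return fallback;
  return ctx.ui.confirm(title, message);
}
```

In a command handler this might be called as `await confirmOrDefault(ctx, "Delete?", "This cannot be undone", false)`, so that a non-interactive run takes the safe default.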
|
||||
|
||||
---
|
||||
68
docs/pi-ui-tui/22-quick-reference-all-ui-apis.md
Normal file
|
|
@ -0,0 +1,68 @@
|
|||
# Quick Reference — All UI APIs
|
||||
|
||||
### ctx.ui Dialog Methods (Blocking)
|
||||
|
||||
| Method | Returns | Description |
|--------|---------|-------------|
| `select(title, options)` | `string \| undefined` | Selection dialog |
| `confirm(title, message, opts?)` | `boolean` | Yes/No confirmation |
| `input(label, placeholder?, opts?)` | `string \| undefined` | Single-line text input |
| `editor(label, prefill?, opts?)` | `string \| undefined` | Multi-line text editor |
|
||||
|
||||
### ctx.ui Persistent Methods (Non-Blocking)
|
||||
|
||||
| Method | Description |
|--------|-------------|
| `notify(message, level)` | Toast notification (`"info"`, `"warning"`, `"error"`) |
| `setStatus(id, text?)` | Footer status (clear with `undefined`) |
| `setWidget(id, content?, opts?)` | Widget above/below editor |
| `setWorkingMessage(text?)` | Working message during streaming |
| `setFooter(factory?)` | Replace footer (restore with `undefined`) |
| `setHeader(factory?)` | Replace header (restore with `undefined`) |
| `setTitle(title)` | Terminal title |
| `setEditorText(text)` | Set editor content |
| `getEditorText()` | Get editor content |
| `pasteToEditor(text)` | Paste into editor |
| `setToolsExpanded(bool)` | Expand/collapse tool output |
| `getToolsExpanded()` | Get expansion state |
| `setEditorComponent(factory?)` | Replace editor (restore with `undefined`) |
| `custom(factory, opts?)` | Full custom component / overlay |
| `setTheme(name \| Theme)` | Switch theme |
| `getTheme(name)` | Load theme without switching |
| `getAllThemes()` | List available themes |
| `theme` | Current theme object |
|
||||
|
||||
### Component Interface
|
||||
|
||||
| Method | Required | Description |
|--------|----------|-------------|
| `render(width): string[]` | Yes | Render to lines (each ≤ width) |
| `handleInput(data): void` | No | Receive keyboard input |
| `invalidate(): void` | Yes | Clear caches |
| `wantsKeyRelease?: boolean` | No | Receive key release events |
|
||||
|
||||
### Key Imports
|
||||
|
||||
```typescript
|
||||
// From @mariozechner/pi-tui
|
||||
import {
|
||||
Text, Box, Container, Spacer, Markdown, Image,
|
||||
SelectList, SettingsList, Input, Editor,
|
||||
matchesKey, Key,
|
||||
visibleWidth, truncateToWidth, wrapTextWithAnsi,
|
||||
CURSOR_MARKER,
|
||||
type Component, type Focusable, type SelectItem, type SettingItem,
|
||||
type EditorTheme, type OverlayAnchor, type OverlayOptions, type OverlayHandle,
|
||||
} from "@mariozechner/pi-tui";
|
||||
|
||||
// From @mariozechner/pi-coding-agent
|
||||
import {
|
||||
DynamicBorder, BorderedLoader, CustomEditor,
|
||||
getMarkdownTheme, getSettingsListTheme,
|
||||
highlightCode, getLanguageFromPath,
|
||||
keyHint, appKeyHint, editorKey, rawKeyHint,
|
||||
type ExtensionAPI, type ExtensionContext, type Theme,
|
||||
} from "@mariozechner/pi-coding-agent";
|
||||
```
|
||||
|
||||
---
|
||||
|
|
@ -0,0 +1,93 @@
|
|||
# File Reference — Example Extensions with UI
|
||||
|
||||
All paths relative to:
|
||||
```
|
||||
/Users/lexchristopherson/.nvm/versions/node/v22.20.0/lib/node_modules/@mariozechner/pi-coding-agent/examples/extensions/
|
||||
```
|
||||
|
||||
### Full Custom Components
|
||||
| File | What It Demonstrates |
|------|---------------------|
| `snake.ts` | **Game** — Timer loop, keyboard handling, WASD + arrows, render caching, session persistence, pause/resume |
| `space-invaders.ts` | **Game** — Similar patterns to snake with more complex rendering |
| `doom-overlay/` | **Game as overlay** — DOOM running at 35 FPS in a floating overlay, real-time rendering |
| `questionnaire.ts` | **Multi-tab wizard** — Tab navigation, embedded `Editor` for free-text, option selection, submission flow |
| `modal-editor.ts` | **Custom editor** — Vim-like modal editing with mode indicator |
| `rainbow-editor.ts` | **Custom editor** — Animated text effects |
|
||||
|
||||
### Dialogs and Selection
|
||||
| File | What It Demonstrates |
|------|---------------------|
| `preset.ts` | `SelectList` with `DynamicBorder`, complex multi-value presets |
| `tools.ts` | `SettingsList` for toggling tools on/off |
| `question.ts` | `ctx.ui.select()` inside a tool |
| `timed-confirm.ts` | Dialogs with `timeout` and `AbortSignal` |
|
||||
|
||||
### Overlays
|
||||
| File | What It Demonstrates |
|------|---------------------|
| `overlay-test.ts` | Basic overlay compositing with inline inputs |
| `overlay-qa-tests.ts` | **Comprehensive** — All 9 anchors, margins, offsets, stacking, responsive visibility, animation at ~30 FPS, percentage sizing, max-height |
|
||||
|
||||
### Persistent UI
|
||||
| File | What It Demonstrates |
|
||||
|------|---------------------|
|
||||
| `plan-mode/` | `setStatus` + `setWidget` for progress tracking, reactive updates |
|
||||
| `status-line.ts` | `setStatus` with themed colors |
|
||||
| `widget-placement.ts` | `setWidget` above and below editor |
|
||||
| `custom-footer.ts` | `setFooter` with git branch, token stats, reactive branch changes |
|
||||
| `custom-header.ts` | `setHeader` for custom startup header |
|
||||
|
||||
### Tool Rendering
|
||||
| File | What It Demonstrates |
|
||||
|------|---------------------|
|
||||
| `todo.ts` | **Complete example** — `renderCall` and `renderResult` with expanded/collapsed views, state in details |
|
||||
| `built-in-tool-renderer.ts` | Custom compact rendering for built-in tools |
|
||||
| `minimal-mode.ts` | Override rendering for minimal display |
|
||||
|
||||
### Message Rendering
|
||||
| File | What It Demonstrates |
|
||||
|------|---------------------|
|
||||
| `message-renderer.ts` | `registerMessageRenderer` with colors and expandable details |
|
||||
|
||||
### Async Operations
|
||||
| File | What It Demonstrates |
|
||||
|------|---------------------|
|
||||
| `qna.ts` | `BorderedLoader` for async LLM calls with cancel |
|
||||
| `summarize.ts` | Summarize conversation with transient UI |
|
||||
|
||||
### Notifications and Status
|
||||
| File | What It Demonstrates |
|
||||
|------|---------------------|
|
||||
| `notify.ts` | Desktop notifications via OSC 777 (Ghostty, iTerm2, WezTerm) |
|
||||
| `titlebar-spinner.ts` | Braille spinner animation in terminal title |
|
||||
| `model-status.ts` | React to model changes with `setStatus` |
|
||||
|
||||
### Documentation References
|
||||
| File | What It Covers |
|
||||
|------|---------------|
|
||||
| `docs/tui.md` | Full TUI component API, all patterns, performance, theming |
|
||||
| `docs/extensions.md` | Custom UI section, custom components, overlays, rendering |
|
||||
| `docs/themes.md` | Creating custom themes, full color palette |
|
||||
| `docs/keybindings.md` | Keyboard shortcut format, customization |
|
||||
|
||||
### Debug Logging
|
||||
|
||||
```bash
|
||||
PI_TUI_WRITE_LOG=/tmp/tui-ansi.log pi
|
||||
```
|
||||
|
||||
Captures the raw ANSI stream for debugging rendering issues.
|
||||
|
||||
---
|
||||
|
||||
*This document was generated from Pi's TUI and extension documentation. Source files:*
|
||||
```
|
||||
/Users/lexchristopherson/.nvm/versions/node/v22.20.0/lib/node_modules/@mariozechner/pi-coding-agent/docs/tui.md
|
||||
/Users/lexchristopherson/.nvm/versions/node/v22.20.0/lib/node_modules/@mariozechner/pi-coding-agent/docs/extensions.md
|
||||
/Users/lexchristopherson/.nvm/versions/node/v22.20.0/lib/node_modules/@mariozechner/pi-coding-agent/examples/extensions/
|
||||
```
|
||||
|
||||
*Companion documents on Desktop:*
|
||||
- **Pi-What-It-Is-And-How-It-Works.md** — What Pi is and how it works
|
||||
- **Pi-Extensions-Complete-Guide.md** — Full extensions API reference
|
||||
34
docs/pi-ui-tui/README.md
Normal file
|
|
@ -0,0 +1,34 @@
|
|||
# Pi Custom UI & TUI Component System
|
||||
|
||||
> Split into individual files for easier consumption.
|
||||
|
||||
## Table of Contents
|
||||
|
||||
- [01. The UI Architecture](./01-the-ui-architecture.md)
|
||||
- [02. The Component Interface — Foundation of Everything](./02-the-component-interface-foundation-of-everything.md)
|
||||
- [03. Entry Points — How UI Gets on Screen](./03-entry-points-how-ui-gets-on-screen.md)
|
||||
- [04. Built-in Dialog Methods](./04-built-in-dialog-methods.md)
|
||||
- [05. Persistent UI Elements](./05-persistent-ui-elements.md)
|
||||
- [06. ctx.ui.custom() — Full Custom Components](./06-ctx-ui-custom-full-custom-components.md)
|
||||
- [07. Built-in Components — The Building Blocks](./07-built-in-components-the-building-blocks.md)
|
||||
- [08. High-Level Components from pi-coding-agent](./08-high-level-components-from-pi-coding-agent.md)
|
||||
- [09. Keyboard Input — How to Handle Keys](./09-keyboard-input-how-to-handle-keys.md)
|
||||
- [10. Line Width — The Cardinal Rule](./10-line-width-the-cardinal-rule.md)
|
||||
- [11. Theming — Colors and Styles](./11-theming-colors-and-styles.md)
|
||||
- [12. Overlays — Floating Modals and Panels](./12-overlays-floating-modals-and-panels.md)
|
||||
- [13. Custom Editors — Replacing the Input](./13-custom-editors-replacing-the-input.md)
|
||||
- [14. Tool Rendering — Custom Tool Display](./14-tool-rendering-custom-tool-display.md)
|
||||
- [15. Message Rendering — Custom Message Display](./15-message-rendering-custom-message-display.md)
|
||||
- [16. Performance — Caching and Invalidation](./16-performance-caching-and-invalidation.md)
|
||||
- [17. Theme Changes and Invalidation](./17-theme-changes-and-invalidation.md)
|
||||
- [18. IME Support — The Focusable Interface](./18-ime-support-the-focusable-interface.md)
|
||||
- [19. Building a Complete Component — Step by Step](./19-building-a-complete-component-step-by-step.md)
|
||||
- [20. Real-World Patterns from Examples](./20-real-world-patterns-from-examples.md)
|
||||
- [21. Common Mistakes and How to Avoid Them](./21-common-mistakes-and-how-to-avoid-them.md)
|
||||
- [22. Quick Reference — All UI APIs](./22-quick-reference-all-ui-apis.md)
|
||||
- [23. File Reference — Example Extensions with UI](./23-file-reference-example-extensions-with-ui.md)
|
||||
|
||||
---
|
||||
|
||||
*Split into per-section files for surgical context loading.*
|
||||
|
||||
16
docs/what-is-pi/01-what-pi-is.md
Normal file
|
|
@ -0,0 +1,16 @@
|
|||
# What Pi Is
|
||||
|
||||
Pi is a **terminal-native coding agent harness**. It sits between you and an LLM, giving the model tools to read, write, edit, and execute code on your machine, while giving you a rich terminal UI with session management, branching, and a deep customization system.
|
||||
|
||||
**What it is not:**
|
||||
- It's not a thin wrapper around an API — it has a full session system, branching, compaction, and event architecture
|
||||
- It's not a locked-down product — nearly everything can be replaced, extended, or overridden via TypeScript extensions
|
||||
- It's not tied to one model — it supports 20+ providers and you can switch models mid-conversation
|
||||
|
||||
**The one-liner:** Pi is a minimal, aggressively extensible coding agent that runs in your terminal, works with any major LLM provider, and lets you adapt it to your workflow instead of adapting to it.
|
||||
|
||||
**Repository:** [github.com/badlogic/pi-mono](https://github.com/badlogic/pi-mono)
|
||||
**Package:** `@mariozechner/pi-coding-agent`
|
||||
**Website:** [shittycodingagent.ai](https://shittycodingagent.ai)
|
||||
|
||||
---
|
||||
34
docs/what-is-pi/02-design-philosophy.md
Normal file
|
|
@ -0,0 +1,34 @@
|
|||
# Design Philosophy
|
||||
|
||||
Pi has a very specific philosophy that explains almost every architectural decision:
|
||||
|
||||
### "Extend, don't fork"
|
||||
|
||||
Other coding agents bake features in. If you want sub-agents, plan mode, permission gates, or custom tools, you either use what they built or you fork the project. Pi takes the opposite approach: the core is deliberately minimal, and everything beyond the basics is built through the extension system.
|
||||
|
||||
### What Pi ships without (on purpose)
|
||||
|
||||
| Feature | Pi's approach |
|
||||
|---------|--------------|
|
||||
| Sub-agents | Build with extensions, or install a package |
|
||||
| Plan mode | Build with extensions, or install a package |
|
||||
| Permission popups | Build with extensions — design your own security model |
|
||||
| MCP support | Build with extensions — or use Skills instead |
|
||||
| Background bash | Use tmux — full observability, direct interaction |
|
||||
| Built-in todos | They confuse models. Use a TODO.md, or build with extensions |
|
||||
|
||||
This isn't missing features — it's a deliberate architectural choice. Every baked-in feature is an opinion that might not match your workflow. Pi gives you the primitives to build exactly what you need.
|
||||
|
||||
### The extension system as a first-class citizen
|
||||
|
||||
Extensions aren't an afterthought. The entire event system, tool registration, command system, custom UI, and session persistence were designed from the ground up to make extensions as powerful as built-in features. An extension can:
|
||||
- Override any built-in tool
|
||||
- Replace the system prompt
|
||||
- Modify every message sent to the LLM
|
||||
- Replace the input editor entirely
|
||||
- Register new model providers
|
||||
- Control the agent's tool set at runtime
|
||||
|
||||
This is the core value proposition: **pi is a platform, not just a tool.**
|
||||
|
||||
---
|
||||
42
docs/what-is-pi/03-the-four-modes-of-operation.md
Normal file
|
|
@ -0,0 +1,42 @@
|
|||
# The Four Modes of Operation
|
||||
|
||||
Pi runs in four modes, each serving a different use case:
|
||||
|
||||
### Interactive Mode (default)
|
||||
|
||||
The full TUI experience. You type prompts, see responses stream, watch tool calls execute, and interact with the agent in real-time. This is how most people use pi day-to-day.
|
||||
|
||||
```bash
|
||||
pi # Start interactive
|
||||
pi "List all TypeScript files" # Start with initial prompt
|
||||
pi -c # Continue last session
|
||||
pi -r # Browse and resume a session
|
||||
```
|
||||
|
||||
### Print Mode (`-p`)
|
||||
|
||||
Non-interactive. Sends a prompt, prints the response, exits. Perfect for scripting and pipelines.
|
||||
|
||||
```bash
|
||||
pi -p "Summarize this codebase"
|
||||
pi -p @screenshot.png "What's in this image?"
|
||||
pi -p --tools read,grep "Review the code in src/"
|
||||
```
|
||||
|
||||
### JSON Mode (`--mode json`)
|
||||
|
||||
Streams all events as JSON lines to stdout. For building tools that consume pi's output programmatically.
|
||||
|
||||
```bash
|
||||
pi --mode json "Fix the bug in auth.ts"
|
||||
```
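A consumer of this mode has to split stdout into complete lines before parsing, because a chunk read from a pipe can end in the middle of a line. The following is a minimal sketch of that buffering, assuming each line is one JSON object. The `JsonLineParser` class is hypothetical, and the `type` field on the sample events is an illustrative assumption rather than pi's documented event schema.

```typescript
// Hypothetical helper: buffers stdout chunks and returns one parsed object per complete line.
class JsonLineParser {
  private buffer = "";

  // Feed a raw chunk; returns every event completed by this chunk.
  push(chunk: string): unknown[] {
    this.buffer += chunk;
    const lines = this.buffer.split("\n");
    // The last element is an incomplete line (or ""); keep it for the next chunk.
    this.buffer = lines.pop() ?? "";
    return lines
      .map((line) => line.trim())
      .filter((line) => line.length > 0)
      .map((line) => JSON.parse(line) as unknown);
  }
}

// Simulated stdout: the second event is split across two chunks.
const parser = new JsonLineParser();
const first = parser.push('{"type":"agent_start"}\n{"type":"turn_');
const second = parser.push('start"}\n');
console.log(first.length, second.length);
```

In a real integration the chunks would come from the child process's stdout stream; the buffering logic stays the same.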
|
||||
|
||||
### RPC Mode (`--mode rpc`)
|
||||
|
||||
Full bidirectional JSON protocol over stdin/stdout. For embedding pi in IDEs, custom UIs, or other applications. The host sends commands, pi streams events back.
|
||||
|
||||
```bash
|
||||
pi --mode rpc --provider anthropic
|
||||
```
|
||||
|
||||
---
|
||||
|
|
@ -0,0 +1,75 @@
|
|||
# The Architecture — How Everything Fits Together
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────────────────────────────────────┐
|
||||
│ Pi Runtime │
|
||||
│ │
|
||||
│ ┌─────────────────┐ ┌─────────────────┐ ┌──────────────────┐ │
|
||||
│ │ Model Registry │ │ Auth Storage │ │ Settings │ │
|
||||
│ │ (all providers) │ │ (API keys, │ │ (global + │ │
|
||||
│ │ │ │ OAuth tokens) │ │ project) │ │
|
||||
│ └────────┬─────────┘ └────────┬────────┘ └────────┬─────────┘ │
|
||||
│ │ │ │ │
|
||||
│ ┌────────▼────────────────────────▼────────────────────────▼─────────┐ │
|
||||
│ │ Agent Session │ │
|
||||
│ │ │ │
|
||||
│ │ ┌──────────────┐ ┌──────────────┐ ┌──────────────────────────┐ │ │
|
||||
│ │ │ Session │ │ Agent Loop │ │ Tool Executor │ │ │
|
||||
│ │ │ Manager │ │ │ │ │ │ │
|
||||
│ │ │ ┌───────────┐ │ │ user prompt │ │ read │ bash │ edit │ │ │ │
|
||||
│ │ │ │ JSONL Tree│ │ │ ↓ │ │ write│ grep │ find │ │ │ │
|
||||
│ │ │ │ (entries, │ │ │ LLM call │ │ ls │ custom tools │ │ │ │
|
||||
│ │ │ │ branches) │ │ │ ↓ │ │ │ │ │
|
||||
│ │ │ └───────────┘ │ │ tool calls │ └──────────────────────────┘ │ │
|
||||
│ │ │ │ │ ↓ │ │ │
|
||||
│ │ │ Compaction │ │ tool results │ │ │
|
||||
│ │ │ Engine │ │ ↓ │ │ │
|
||||
│ │ │ │ │ (loop until │ │ │
|
||||
│ │ │ Branch │ │ LLM stops) │ │ │
|
||||
│ │ │ Summarizer │ │ │ │ │
|
||||
│ │ └──────────────┘ └──────────────┘ │ │
|
||||
│ │ │ │
|
||||
│ │ ┌──────────────────────────────────────────────────────────────┐ │ │
|
||||
│ │ │ Event System │ │ │
|
||||
│ │ │ session_start → input → before_agent_start → agent_start │ │ │
|
||||
│ │ │ → turn_start → context → tool_call → tool_result → │ │ │
|
||||
│ │ │ turn_end → agent_end → session_shutdown │ │ │
|
||||
│ │ └──────────────────────────┬───────────────────────────────────┘ │ │
|
||||
│ │ │ │ │
|
||||
│ └─────────────────────────────┼──────────────────────────────────────┘ │
|
||||
│ │ │
|
||||
│ ┌─────────────────────────────▼───────────────────────────────────────┐│
|
||||
│ │ Extension Runtime ││
|
||||
│ │ Extension A Extension B Extension C ... ││
|
||||
│ │ (tools, cmds, (event gates, (custom UI, ││
|
||||
│ │ events) tool mods) providers) ││
|
||||
│ └─────────────────────────────────────────────────────────────────────┘│
|
||||
│ │
|
||||
│ ┌─────────────────────────────────────────────────────────────────────┐│
|
||||
│ │ Resource Loader ││
|
||||
│ │ Skills │ Prompt Templates │ Themes │ Context Files (AGENTS.md) ││
|
||||
│ └─────────────────────────────────────────────────────────────────────┘│
|
||||
│ │
|
||||
│ ┌─────────────────────────────────────────────────────────────────────┐│
|
||||
│ │ Mode Layer (TUI / RPC / JSON / Print) ││
|
||||
│ └─────────────────────────────────────────────────────────────────────┘│
|
||||
└─────────────────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
### The key subsystems:
|
||||
|
||||
| Subsystem | What it does |
|
||||
|-----------|-------------|
|
||||
| **Model Registry** | Tracks all available models across all providers, handles API key lookup |
|
||||
| **Auth Storage** | Stores API keys and OAuth tokens securely |
|
||||
| **Agent Session** | The main orchestrator — manages the agent loop, session, tools, and events |
|
||||
| **Session Manager** | Reads/writes JSONL session files, manages the entry tree, handles branching |
|
||||
| **Agent Loop** | The core cycle: send messages to LLM → execute tool calls → repeat until LLM stops |
|
||||
| **Tool Executor** | Runs tools (built-in and custom) with cancellation support |
|
||||
| **Compaction Engine** | Summarizes old messages when context gets too large |
|
||||
| **Event System** | Every action emits events that extensions can observe and modify |
|
||||
| **Extension Runtime** | Loads and manages extensions, routes events, handles tool/command registration |
|
||||
| **Resource Loader** | Discovers and loads skills, prompts, themes, and context files |
|
||||
| **Mode Layer** | Handles I/O for the current mode (TUI rendering, RPC protocol, JSON streaming, print) |
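To make the Event System row more concrete, here is a minimal sketch of an event bus in which a `tool_call` handler can veto a call before it runs. The `EventBus` class, its `on`/`emit` methods, and the `{ block: true }` return shape are all hypothetical; they illustrate the observe-and-modify idea only and are not pi's extension API.

```typescript
type ToolCallEvent = { toolName: string; input: Record<string, unknown> };
type HandlerResult = { block: true; reason: string } | undefined;
type Handler = (event: ToolCallEvent) => HandlerResult;

// Hypothetical event bus for a single event kind (tool_call).
class EventBus {
  private handlers: Handler[] = [];

  on(handler: Handler): void {
    this.handlers.push(handler);
  }

  // Runs handlers in registration order; the first one that blocks wins.
  emit(event: ToolCallEvent): HandlerResult {
    for (const handler of this.handlers) {
      const result = handler(event);
      if (result?.block) return result;
    }
    return undefined;
  }
}

const bus = new EventBus();
// An "extension" that gates destructive shell commands.
bus.on((event) =>
  event.toolName === "bash" && String(event.input["command"]).includes("rm -rf")
    ? { block: true, reason: "destructive command" }
    : undefined,
);

const allowed = bus.emit({ toolName: "read", input: { path: "README.md" } });
const blocked = bus.emit({ toolName: "bash", input: { command: "rm -rf /tmp/x" } });
console.log(allowed, blocked);
```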
|
||||
|
||||
---
|
||||
44
docs/what-is-pi/05-the-agent-loop-how-pi-thinks.md
Normal file
|
|
@ -0,0 +1,44 @@
|
|||
# The Agent Loop — How Pi Thinks
|
||||
|
||||
The agent loop is the heartbeat of pi. It's what happens between you sending a prompt and getting a response:
|
||||
|
||||
```
|
||||
User sends prompt
|
||||
│
|
||||
▼
|
||||
┌─ TURN START ──────────────────────────────────────┐
|
||||
│ │
|
||||
│ 1. Assemble context │
|
||||
│ - System prompt (+ modifications from hooks) │
|
||||
│ - Previous messages (or compaction summary) │
|
||||
│ - The new user message │
|
||||
│ │
|
||||
│ 2. Send to LLM │
|
||||
│ - Stream response tokens │
|
||||
│ - Parse any tool calls in the response │
|
||||
│ │
|
||||
│ 3. If tool calls present: │
|
||||
│ - For each tool call: │
|
||||
│ a. Fire tool_call event (can be blocked) │
|
||||
│ b. Execute the tool │
|
||||
│ c. Fire tool_result event (can be modified) │
|
||||
│ d. Append result to messages │
|
||||
│ - Go back to step 1 (new turn with results) │
|
||||
│ │
|
||||
│ 4. If no tool calls (LLM just responded): │
|
||||
│ - Save messages to session │
|
||||
│ - Done │
|
||||
│ │
|
||||
└───────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
**Key insight:** The loop keeps going until the LLM decides to stop calling tools. A single user prompt might trigger 1 turn or 50 turns depending on the task complexity. Each turn is a complete LLM call → response → tool execution cycle.
|
||||
|
||||
**Stop reasons the LLM can produce:**
|
||||
- `stop` — Normal completion, the LLM is done
|
||||
- `toolUse` — The LLM wants to call tools (triggers another turn)
|
||||
- `length` — Hit the output token limit
|
||||
- `error` — Something went wrong
|
||||
- `aborted` — User cancelled (Escape)
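The turn cycle and stop reasons above can be sketched as a small loop. This is an illustration under stated assumptions, not pi's implementation: `makeFakeLlm`, `runAgentLoop`, and the message shapes are hypothetical, the model is replaced by a scripted stand-in, and streaming, events, and session persistence are left out.

```typescript
type StopReason = "stop" | "toolUse" | "length" | "error" | "aborted";
type ToolCall = { name: string; args: string };
type LlmResponse = { text: string; toolCalls: ToolCall[]; stopReason: StopReason };
type Message = { role: "user" | "assistant" | "toolResult"; content: string };

// Stand-in for a real model: replays scripted responses, one per turn.
function makeFakeLlm(script: LlmResponse[]): (messages: Message[]) => LlmResponse {
  const done: LlmResponse = { text: "", toolCalls: [], stopReason: "stop" };
  let turn = 0;
  return () => script[turn++] ?? done;
}

function runAgentLoop(
  prompt: string,
  callLlm: (messages: Message[]) => LlmResponse,
  runTool: (call: ToolCall) => string,
): { messages: Message[]; turns: number; stopReason: StopReason } {
  const messages: Message[] = [{ role: "user", content: prompt }];
  let turns = 0;
  while (true) {
    turns++;
    const response = callLlm(messages);
    messages.push({ role: "assistant", content: response.text });
    // Any stop reason other than toolUse ends the loop.
    if (response.stopReason !== "toolUse") {
      return { messages, turns, stopReason: response.stopReason };
    }
    // Execute each requested tool and append its result for the next turn.
    for (const call of response.toolCalls) {
      messages.push({ role: "toolResult", content: runTool(call) });
    }
  }
}

const llm = makeFakeLlm([
  { text: "Reading the file", toolCalls: [{ name: "read", args: "auth.ts" }], stopReason: "toolUse" },
  { text: "Fixed.", toolCalls: [], stopReason: "stop" },
]);
const result = runAgentLoop("Fix the bug in auth.ts", llm, (call) => `ran ${call.name}(${call.args})`);
console.log(result.turns, result.stopReason, result.messages.length);
```

Here one prompt takes two turns: the first ends with `toolUse`, so the tool result is appended and the model is called again; the second ends with `stop`.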
|
||||
|
||||
---
|
||||
41
docs/what-is-pi/06-tools-how-pi-acts-on-the-world.md
Normal file
|
|
@ -0,0 +1,41 @@
|
|||
# Tools — How Pi Acts on the World
|
||||
|
||||
Tools are functions the LLM can call to interact with your system. The LLM sees tool descriptions in its system prompt and decides when to use them.
|
||||
|
||||
### Built-in Tools
|
||||
|
||||
Pi ships with 7 built-in tools (4 active by default):
|
||||
|
||||
| Tool | Default | What it does |
|
||||
|------|---------|-------------|
|
||||
| `read` | ✅ | Read file contents (text and images). Supports offset/limit for large files. Truncates to 2000 lines / 50KB. |
|
||||
| `bash` | ✅ | Execute shell commands. Returns stdout, stderr, exit code. Truncates to 2000 lines / 50KB. |
|
||||
| `edit` | ✅ | Surgical text replacement — find exact text and replace it. |
|
||||
| `write` | ✅ | Create or overwrite files. Auto-creates parent directories. |
|
||||
| `grep` | ❌ | Search file contents with regex patterns. |
|
||||
| `find` | ❌ | Find files by name/pattern. |
|
||||
| `ls` | ❌ | List directory contents. |
|
||||
|
||||
### Tool Control
|
||||
|
||||
```bash
|
||||
pi --tools read,bash,edit,write # Specify active tools (default)
|
||||
pi --tools read,grep,find,ls # Read-only exploration
|
||||
pi --no-tools # No built-in tools (extensions only)
|
||||
```
|
||||
|
||||
Extensions can also manage tools at runtime:
|
||||
```typescript
|
||||
pi.setActiveTools(["read", "bash"]);           // Restrict the agent to read + bash
|
||||
pi.setActiveTools(pi.getAllTools().map(t => t.name)); // Enable all
|
||||
```
|
||||
|
||||
### How Tools Appear to the LLM
|
||||
|
||||
The system prompt includes an "Available tools" section listing each active tool with its description and parameter schema. The LLM reads this and decides when to call which tool. This is standard LLM tool-calling — the model outputs a structured tool call, pi executes it, and feeds the result back.
|
||||
|
||||
### Output Truncation
|
||||
|
||||
**All tools truncate output** to 50KB / 2000 lines (whichever is hit first). This prevents a single tool call from consuming the entire context window. When truncated, the full output is saved to a temp file and the LLM is told where to find it.
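A minimal sketch of the "whichever limit is hit first" rule follows. It assumes truncation keeps the beginning of the output and counts UTF-8 bytes including newlines; both are assumptions, as are the names `truncateOutput` and `utf8Length`. Saving the full output to a temp file is omitted.

```typescript
const MAX_LINES = 2000;
const MAX_BYTES = 50 * 1024;

// UTF-8 length of a string, computed per code point.
function utf8Length(text: string): number {
  let bytes = 0;
  for (const ch of text) {
    const cp = ch.codePointAt(0) ?? 0;
    bytes += cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
  }
  return bytes;
}

// Keeps whole lines from the start until either limit would be exceeded.
// Note: in this sketch a single line larger than MAX_BYTES is dropped entirely.
function truncateOutput(output: string): { text: string; truncated: boolean } {
  const kept: string[] = [];
  let bytes = 0;
  for (const line of output.split("\n")) {
    const lineBytes = utf8Length(line) + 1; // +1 for the newline
    if (kept.length >= MAX_LINES || bytes + lineBytes > MAX_BYTES) {
      return { text: kept.join("\n"), truncated: true };
    }
    kept.push(line);
    bytes += lineBytes;
  }
  return { text: output, truncated: false };
}

// 2500 short lines stay under 50KB, so the line limit is what triggers here.
const big = Array.from({ length: 2500 }, (_, i) => `line ${i}`).join("\n");
const truncatedResult = truncateOutput(big);
console.log(truncatedResult.truncated, truncatedResult.text.split("\n").length);
```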
|
||||
|
||||
---
|
||||
81
docs/what-is-pi/07-sessions-memory-that-branches.md
Normal file
|
|
@ -0,0 +1,81 @@
|
|||
# Sessions — Memory That Branches
|
||||
|
||||
Sessions are pi's memory system. Rather than a flat conversation history, each session is stored as a tree of entries, which is what makes branching possible.
|
||||
|
||||
### Storage Format
|
||||
|
||||
Sessions are **JSONL files** (one JSON object per line). Each line is an "entry" with a `type`, `id`, and `parentId`. Session files are stored per project at:
|
||||
|
||||
```
|
||||
~/.gsd/agent/sessions/--path--to--project--/<timestamp>_<uuid>.jsonl
|
||||
```
|
||||
|
||||
### The Entry Tree
|
||||
|
||||
Entries form a **tree structure**, not a linear list. This is the key architectural insight:
|
||||
|
||||
```
|
||||
┌─ [user] ─ [assistant] ─ [tool] ─ [assistant] ← Branch A
|
||||
[header] ─ [user] ─┤
|
||||
└─ [user] ─ [assistant] ← Branch B (via /tree)
|
||||
```
|
||||
|
||||
Every entry has an `id` and `parentId`. When you navigate to a previous point with `/tree` and continue from there, a new branch is created from that point. **All branches coexist in the same file.** Nothing is deleted.
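The append-only behavior can be illustrated with a small in-memory tree. This is a sketch, assuming only the `id`/`parentId` linkage described above; the `SessionTree` class, its methods, and the `text` field are hypothetical and do not reflect pi's actual Session Manager.

```typescript
type Entry = { id: string; parentId: string | null; type: string; text: string };

// Append-only store: branching never rewrites or deletes earlier entries.
class SessionTree {
  private entries = new Map<string, Entry>();
  private nextId = 1;

  append(parentId: string | null, type: string, text: string): Entry {
    const entry: Entry = { id: String(this.nextId++), parentId, type, text };
    this.entries.set(entry.id, entry);
    return entry;
  }

  // Follow parentId links from a leaf back to the root, then return reading order.
  pathTo(leafId: string): Entry[] {
    const path: Entry[] = [];
    let current = this.entries.get(leafId);
    while (current) {
      path.unshift(current);
      current = current.parentId === null ? undefined : this.entries.get(current.parentId);
    }
    return path;
  }

  get size(): number {
    return this.entries.size;
  }
}

const tree = new SessionTree();
const header = tree.append(null, "session", "header");
const root = tree.append(header.id, "message", "user: add login");
const a1 = tree.append(root.id, "message", "user: use OAuth"); // branch A
const a2 = tree.append(a1.id, "message", "assistant: OAuth added");
// Going back to `root` and continuing starts branch B; branch A is untouched.
const b1 = tree.append(root.id, "message", "user: use passwords");
const b2 = tree.append(b1.id, "message", "assistant: passwords added");

console.log(tree.pathTo(a2.id).map((e) => e.text).join(" > "));
console.log(tree.pathTo(b2.id).map((e) => e.text).join(" > "));
console.log(tree.size);
```

Both branches share the first two entries and all six entries remain in the store, which mirrors how branches coexist in one JSONL file.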
|
||||
|
||||
### Entry Types
|
||||
|
||||
| Type | Purpose |
|
||||
|------|---------|
|
||||
| `session` | Header — file metadata, version, working directory |
|
||||
| `message` | A conversation message (user, assistant, tool result, custom) |
|
||||
| `compaction` | Summary of older messages (created by compaction) |
|
||||
| `branch_summary` | Summary of an abandoned branch (created by `/tree`) |
|
||||
| `model_change` | Records when the user switched models |
|
||||
| `thinking_level_change` | Records when the user changed thinking level |
|
||||
| `custom` | Extension state (NOT sent to LLM) |
|
||||
| `custom_message` | Extension-injected message (IS sent to LLM) |
|
||||
| `label` | User bookmark on an entry |
|
||||
| `session_info` | Session metadata (display name) |
|
||||
|
||||
### Message Types Within Entries
|
||||
|
||||
Message entries contain typed message objects:
|
||||
|
||||
| Role | What it is |
|
||||
|------|-----------|
|
||||
| `user` | User's prompt (text and/or images) |
|
||||
| `assistant` | LLM's response (text, thinking, tool calls) — includes model, provider, usage stats |
|
||||
| `toolResult` | Output from a tool execution — includes `details` for rendering and state |
|
||||
| `bashExecution` | Output from user's `!command` (not from LLM tool calls) |
|
||||
| `custom` | Extension-injected message |
|
||||
| `branchSummary` | Summary of an abandoned branch |
|
||||
| `compactionSummary` | Summary from compaction |
|
||||
|
||||
### Context Building
|
||||
|
||||
When pi needs to send messages to the LLM, it walks the tree from the current leaf to the root:
|
||||
|
||||
1. If there's a compaction entry on the path → emit the summary first, then messages from `firstKeptEntryId` onward
|
||||
2. If there's a branch summary → include it as context
|
||||
3. Custom message entries → included in LLM context
|
||||
4. Custom entries (extension state) → NOT included in LLM context
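The four rules above can be sketched as a single function. The entry shapes below are simplified assumptions (only `type`, `id`, `parentId`, and `firstKeptEntryId` come from this document; `text`, `summary`, `data`, and the `buildContext` name are hypothetical), and the output is a list of strings rather than real message objects.

```typescript
type Base = { id: string; parentId: string | null };
type SessionEntry = Base &
  (
    | { type: "message" | "custom_message"; text: string }
    | { type: "compaction"; summary: string; firstKeptEntryId: string }
    | { type: "branch_summary"; summary: string }
    | { type: "custom"; data: unknown }
  );

function buildContext(entries: Map<string, SessionEntry>, leafId: string): string[] {
  // Walk leaf -> root, building the path in chronological order.
  const path: SessionEntry[] = [];
  let current = entries.get(leafId);
  while (current) {
    path.unshift(current);
    current = current.parentId === null ? undefined : entries.get(current.parentId);
  }

  // The latest compaction on the path decides where the kept messages start.
  let summary: string | undefined;
  let start = 0;
  for (const entry of path) {
    if (entry.type === "compaction") {
      const firstKeptId = entry.firstKeptEntryId;
      summary = entry.summary;
      start = Math.max(path.findIndex((e) => e.id === firstKeptId), 0);
    }
  }

  const context: string[] = summary === undefined ? [] : [`[summary] ${summary}`];
  for (const entry of path.slice(start)) {
    if (entry.type === "message" || entry.type === "custom_message") {
      context.push(entry.text);
    } else if (entry.type === "branch_summary") {
      context.push(`[branch summary] ${entry.summary}`);
    }
    // "custom" entries (extension state) and the compaction entry itself are skipped.
  }
  return context;
}

const list: SessionEntry[] = [
  { id: "1", parentId: null, type: "message", text: "user: start" },
  { id: "2", parentId: "1", type: "message", text: "assistant: old work" },
  { id: "3", parentId: "2", type: "message", text: "user: recent question" },
  { id: "4", parentId: "3", type: "custom", data: { counter: 1 } },
  { id: "5", parentId: "4", type: "compaction", summary: "Goal: ship login", firstKeptEntryId: "3" },
  { id: "6", parentId: "5", type: "custom_message", text: "extension note" },
  { id: "7", parentId: "6", type: "message", text: "assistant: answer" },
];
const byId = new Map(list.map((e): [string, SessionEntry] => [e.id, e]));
const context = buildContext(byId, "7");
console.log(context);
```

In this example entries 1 and 2 are replaced by the summary, the `custom` entry is dropped, and the `custom_message` entry is kept.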
|
||||
|
||||
### Session Commands
|
||||
|
||||
| Command | What it does |
|
||||
|---------|-------------|
|
||||
| `/tree` | Navigate to any point in the session tree and continue from there |
|
||||
| `/fork` | Create a new session file from the current branch |
|
||||
| `/resume` | Browse and switch to a previous session |
|
||||
| `/new` | Start a fresh session |
|
||||
| `/name <name>` | Set a display name for the session |
|
||||
| `/session` | Show session info (path, tokens, cost) |
|
||||
| `/compact` | Manually trigger compaction |
|
||||
|
||||
### Branching in Practice
|
||||
|
||||
**`/tree`** — In-place branching. You select a previous point, the conversation continues from there. The old branch is preserved and can be revisited. Pi optionally generates a summary of the branch you're leaving so context isn't lost.
|
||||
|
||||
**`/fork`** — Creates a new session file from the current branch. Opens a selector, copies history up to the selected point, and puts that message in the editor for modification. Good for "start fresh but keep the context."
|
||||
|
||||
---
|
||||
|
|
@ -0,0 +1,87 @@
|
|||
# Compaction — How Pi Manages Context Limits
|
||||
|
||||
LLMs have finite context windows. Pi's compaction system keeps conversations going beyond those limits.
|
||||
|
||||
### When Compaction Triggers
|
||||
|
||||
**Automatic:** When `contextTokens > contextWindow - reserveTokens` (default reserve: 16,384 tokens). Also triggers proactively as you approach the limit.
|
||||
|
||||
**Manual:** `/compact [custom instructions]`
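As a worked example of the automatic trigger condition, the sketch below evaluates the inequality directly. The 200,000-token context window is an assumed figure for illustration, and `shouldCompact` is a hypothetical name.

```typescript
// Direct transcription of: contextTokens > contextWindow - reserveTokens
function shouldCompact(contextTokens: number, contextWindow: number, reserveTokens = 16384): boolean {
  return contextTokens > contextWindow - reserveTokens;
}

// With a 200,000-token window the threshold is 200000 - 16384 = 183616.
const below = shouldCompact(180000, 200000); // 180000 > 183616 is false
const above = shouldCompact(185000, 200000); // 185000 > 183616 is true
console.log(below, above);
```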
|
||||
|
||||
### How It Works
|
||||
|
||||
```
|
||||
Before compaction:
|
||||
|
||||
Messages: [user][assistant][tool][user][assistant][tool][tool][assistant][tool]
|
||||
└──────── summarize these ────────┘ └──── keep these (recent) ────┘
|
||||
↑
|
||||
keepRecentTokens (default: 20k)
|
||||
|
||||
After compaction (new entry appended):
|
||||
|
||||
What the LLM sees: [system prompt] [summary] [kept messages...]
|
||||
```
|
||||
|
||||
1. Pi walks backward from the newest message, counting tokens until it reaches `keepRecentTokens` (default 20k)
|
||||
2. Everything before that point gets summarized by the LLM using a structured format
|
||||
3. A `CompactionEntry` is appended with the summary and a pointer to the first kept message
|
||||
4. On reload, the LLM sees: system prompt → summary → recent messages
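Step 1 can be sketched as a backward walk over per-message token counts. This is a simplified illustration: it assumes a message is kept only if it fits entirely within the budget, it ignores split turns, and `findFirstKeptIndex` and the `Msg` shape are hypothetical.

```typescript
type Msg = { role: string; tokens: number };

// Returns the index of the first message to keep; everything before it is summarized.
function findFirstKeptIndex(messages: Msg[], keepRecentTokens = 20000): number {
  let kept = 0;
  let index = messages.length;
  while (index > 0) {
    const next = messages[index - 1];
    // Stop as soon as adding the next older message would exceed the budget.
    if (next === undefined || kept + next.tokens > keepRecentTokens) break;
    kept += next.tokens;
    index--;
  }
  return index;
}

const history: Msg[] = [
  { role: "user", tokens: 8000 },
  { role: "assistant", tokens: 6000 },
  { role: "toolResult", tokens: 9000 },
  { role: "assistant", tokens: 7000 },
  { role: "toolResult", tokens: 5000 },
];
// Walking backward: 5000, then 12000, then 21000 would exceed 20000, so the cut is at index 3.
const firstKept = findFirstKeptIndex(history);
console.log(firstKept, history.slice(0, firstKept).length, history.slice(firstKept).length);
```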
|
||||
|
||||
### Split Turns
|
||||
|
||||
Sometimes a single turn (one user prompt + all its tool calls) exceeds the `keepRecentTokens` budget. Pi handles this by cutting mid-turn and generating two summaries: one for the history before the turn, and one for the early part of the split turn.
|
||||
|
||||
### The Summary Format
|
||||
|
||||
Both compaction and branch summarization produce structured summaries:
|
||||
|
||||
```markdown
|
||||
## Goal
|
||||
[What the user is trying to accomplish]
|
||||
|
||||
## Constraints & Preferences
|
||||
- [Requirements mentioned by user]
|
||||
|
||||
## Progress
|
||||
### Done
|
||||
- [x] Completed tasks
|
||||
### In Progress
|
||||
- [ ] Current work
|
||||
### Blocked
|
||||
- Issues, if any
|
||||
|
||||
## Key Decisions
|
||||
- **Decision**: Rationale
|
||||
|
||||
## Next Steps
|
||||
1. What should happen next
|
||||
|
||||
## Critical Context
|
||||
- Data needed to continue
|
||||
|
||||
<read-files>
|
||||
path/to/file1.ts
|
||||
</read-files>
|
||||
|
||||
<modified-files>
|
||||
path/to/changed.ts
|
||||
</modified-files>
|
||||
```
|
||||
|
||||
### Why This Matters
|
||||
|
||||
Compaction is lossy: the summary drops detail from the messages it replaces. The full history still remains in the JSONL file, so you can always use `/tree` to revisit the pre-compaction state. The tradeoff is to continue working with a summary of earlier context, or to start fresh. Extensions can customize compaction to produce better summaries for your specific use case.
|
||||
|
||||
**Settings:**
|
||||
```json
|
||||
{
|
||||
"compaction": {
|
||||
"enabled": true,
|
||||
"reserveTokens": 16384,
|
||||
"keepRecentTokens": 20000
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
96
docs/what-is-pi/09-the-customization-stack.md
Normal file
|
|
@ -0,0 +1,96 @@
|
|||
# The Customization Stack
|
||||
|
||||
Pi has four layers of customization, each serving a different purpose:
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────┐
|
||||
│ Extensions │ ← TypeScript code. Full runtime access.
|
||||
│ Custom tools, events, UI, │ Can do anything.
|
||||
│ commands, providers │
|
||||
├─────────────────────────────────────┤
|
||||
│ Skills │ ← Markdown instructions + scripts.
|
||||
│ On-demand capability packages │ Loaded when the task matches.
|
||||
│ loaded by the agent │
|
||||
├─────────────────────────────────────┤
|
||||
│ Prompt Templates │ ← Markdown snippets.
|
||||
│ Reusable prompts expanded │ Quick text expansion via /name.
|
||||
│ via /templatename │
|
||||
├─────────────────────────────────────┤
|
||||
│ Themes │ ← JSON color definitions.
|
||||
│ Visual appearance │ Hot-reload on change.
|
||||
└─────────────────────────────────────┘
|
||||
```
|
||||
|
||||
### Extensions
|
||||
|
||||
TypeScript modules with full runtime access. They can hook into every event, register tools the LLM can call, add commands, render custom UI, override built-in behavior, and register model providers. Extensions are the most powerful customization mechanism.
|
||||
|
||||
**Placement:**
|
||||
- `~/.gsd/agent/extensions/` (global)
|
||||
- `.gsd/extensions/` (project-local)
|
||||
|
||||
See the companion doc **Pi-Extensions-Complete-Guide.md** for the full 50KB reference.
|
||||
|
||||
### Skills
|
||||
|
||||
On-demand capability packages following the [Agent Skills standard](https://agentskills.io). A skill is a directory with a `SKILL.md` file containing instructions the agent follows. Skills use progressive disclosure: only their names and descriptions are placed in the system prompt, and the agent reads the full SKILL.md only when the task matches.
|
||||
|
||||
**How skills work:**
|
||||
1. At startup, pi scans for skills and extracts names + descriptions
|
||||
2. Descriptions are listed in the system prompt
|
||||
3. When a task matches, the agent uses `read` to load the full SKILL.md
|
||||
4. The agent follows the instructions, using relative paths for scripts/assets
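Steps 1 and 2 can be sketched as follows. The sketch assumes the frontmatter carries `name` and `description` fields and uses a simple line-based parser instead of a real YAML library; `parseSkillFrontmatter`, `renderSkillIndex`, and the index line format are hypothetical.

```typescript
type Skill = { name: string; description: string; path: string };

// Parses only the frontmatter of a SKILL.md; the body is left for the agent to read later.
function parseSkillFrontmatter(path: string, content: string): Skill | undefined {
  const match = /^---\n([\s\S]*?)\n---/.exec(content);
  if (!match) return undefined;
  const fields = new Map<string, string>();
  for (const line of (match[1] ?? "").split("\n")) {
    const colon = line.indexOf(":");
    if (colon > 0) fields.set(line.slice(0, colon).trim(), line.slice(colon + 1).trim());
  }
  const name = fields.get("name");
  const description = fields.get("description");
  return name && description ? { name, description, path } : undefined;
}

// Only this short index would go into the system prompt, not the skill bodies.
function renderSkillIndex(skills: Skill[]): string {
  return skills.map((s) => `- ${s.name}: ${s.description} (${s.path})`).join("\n");
}

const skillFile =
  "---\nname: pdf-tools\ndescription: Extract text from PDFs\n---\n# Instructions\nRun scripts/process.sh";
const skill = parseSkillFrontmatter("my-skill/SKILL.md", skillFile);
const index = skill ? renderSkillIndex([skill]) : "no frontmatter";
console.log(index);
```

The design point is cost: the system prompt grows by one line per skill, and the full instructions are paid for only when a task needs them.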
|
||||
|
||||
**Invocation:**
|
||||
```
|
||||
/skill:brave-search # Explicit invocation
|
||||
/skill:pdf-tools extract file.pdf # With arguments
|
||||
```
|
||||
|
||||
**Placement:**
|
||||
- `~/.gsd/agent/skills/` or `~/.agents/skills/` (global)
|
||||
- `.gsd/skills/` or `.agents/skills/` (project, searched up to git root)
|
||||
|
||||
**Skill structure:**
|
||||
```
|
||||
my-skill/
|
||||
├── SKILL.md # Required: frontmatter + instructions
|
||||
├── scripts/ # Helper scripts (optional)
|
||||
│ └── process.sh
|
||||
└── references/ # Reference docs (optional)
|
||||
└── api-guide.md
|
||||
```
|
||||
|
||||
### Prompt Templates
|
||||
|
||||
Markdown files that expand into prompts via `/name`. Simple text expansion with positional argument support (`$1`, `$2`, `$@`).
|
||||
|
||||
```markdown
|
||||
<!-- ~/.gsd/agent/prompts/review.md -->
|
||||
---
|
||||
description: Review staged git changes
|
||||
---
|
||||
Review the staged changes (`git diff --cached`). Focus on:
|
||||
- Bugs and logic errors
|
||||
- Security issues
|
||||
- Performance problems
|
||||
Focus area: $1
|
||||
```
|
||||
|
||||
Usage: `/review "error handling"` → expands with `$1` = "error handling"
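The positional expansion can be sketched in a few lines. This is an illustration, not pi's parser: it assumes double quotes group words into one argument, that `$@` expands to all arguments joined by spaces, and that a missing argument expands to an empty string. `splitArgs` and `expandTemplate` are hypothetical names.

```typescript
// Splits an argument string on whitespace, treating "double-quoted text" as one argument.
function splitArgs(input: string): string[] {
  const args: string[] = [];
  const pattern = /"([^"]*)"|(\S+)/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(input)) !== null) {
    args.push(match[1] ?? match[2] ?? "");
  }
  return args;
}

// Replaces $1, $2, ... and $@ in a single pass so argument text is never re-expanded.
function expandTemplate(template: string, args: string[]): string {
  return template.replace(/\$(@|\d+)/g, (_whole: string, key: string) =>
    key === "@" ? args.join(" ") : (args[Number(key) - 1] ?? ""),
  );
}

const expanded = expandTemplate("Focus area: $1", splitArgs('"error handling"'));
const all = expandTemplate("$2 then $@", splitArgs("a b"));
console.log(expanded);
console.log(all);
```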
|
||||
|
||||
**Placement:**
|
||||
- `~/.gsd/agent/prompts/` (global)
|
||||
- `.gsd/prompts/` (project-local)
|
||||
|
||||
### Themes
|
||||
|
||||
JSON files defining the color palette for the TUI. Hot-reload: edit the file and pi applies changes immediately.
|
||||
|
||||
**Built-in:** `dark`, `light`
|
||||
|
||||
**Placement:**
|
||||
- `~/.gsd/agent/themes/` (global)
|
||||
- `.gsd/themes/` (project-local)
|
||||
|
||||
---
|
||||
|
|
@ -0,0 +1,58 @@
|
|||
# Providers & Models — Multi-Model by Default
|
||||
|
||||
Pi isn't locked to one provider. It supports 20+ providers out of the box and lets you add more.
|
||||
|
||||
### Authentication Methods
|
||||
|
||||
**OAuth subscriptions (via `/login`):**
|
||||
- Anthropic Claude Pro/Max
|
||||
- OpenAI ChatGPT Plus/Pro (Codex)
|
||||
- GitHub Copilot
|
||||
- Google Gemini CLI
|
||||
- Google Antigravity
|
||||
|
||||
**API keys (via environment variables):**
|
||||
- Anthropic, OpenAI, Azure OpenAI, Google Gemini, Google Vertex, Amazon Bedrock
|
||||
- Mistral, Groq, Cerebras, xAI, OpenRouter, Vercel AI Gateway
|
||||
- ZAI, OpenCode Zen, OpenCode Go, Hugging Face, Kimi, MiniMax
|
||||
|
||||
### Model Switching
|
||||
|
||||
You can switch models at any time during a conversation:
|
||||
|
||||
- `/model` — Open the model selector
|
||||
- `Ctrl+L` — Same as `/model`
|
||||
- `Ctrl+P` / `Shift+Ctrl+P` — Cycle through scoped models
|
||||
- `Shift+Tab` — Cycle thinking level
|
||||
|
||||
Model changes are recorded in the session as `model_change` entries, so when you resume a session, pi knows which model you were using.
|
||||
|
||||
### CLI Model Selection
|
||||
|
||||
```bash
|
||||
pi --model sonnet # Fuzzy match
|
||||
pi --model openai/gpt-4o # Provider/model
|
||||
pi --model sonnet:high # With thinking level
|
||||
pi --models "claude-*,gpt-4o" # Scope models for Ctrl+P cycling
|
||||
pi --list-models # List all available
|
||||
pi --list-models gemini # Search by name
|
||||
```
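The flag syntax above combines up to three parts: an optional provider, a model name, and an optional thinking level. The sketch below parses only that surface syntax and checks a model id against a `--models` style pattern list. It does not reproduce pi's fuzzy matching, it assumes model ids contain no colon, and `parseModelSpec` and `matchesScope` are hypothetical helpers.

```typescript
type ModelSpec = { provider: string | undefined; model: string; thinking: string | undefined };

// Parses "provider/model:thinking", where provider and thinking are optional.
function parseModelSpec(spec: string): ModelSpec {
  const colon = spec.lastIndexOf(":");
  const thinking = colon >= 0 ? spec.slice(colon + 1) : undefined;
  const rest = colon >= 0 ? spec.slice(0, colon) : spec;
  const slash = rest.indexOf("/");
  return slash >= 0
    ? { provider: rest.slice(0, slash), model: rest.slice(slash + 1), thinking }
    : { provider: undefined, model: rest, thinking };
}

// True if the model id matches any comma-separated pattern; "*" matches any run of characters.
function matchesScope(modelId: string, patterns: string): boolean {
  return patterns.split(",").some((pattern) => {
    const escaped = pattern
      .trim()
      .replace(/[.+?^${}()|[\]\\]/g, "\\$&")
      .replace(/\*/g, ".*");
    return new RegExp(`^${escaped}$`).test(modelId);
  });
}

const withThinking = parseModelSpec("sonnet:high");
const withProvider = parseModelSpec("openai/gpt-4o");
console.log(withThinking, withProvider);
console.log(matchesScope("claude-sonnet-4", "claude-*,gpt-4o"));
console.log(matchesScope("gpt-4o-mini", "claude-*,gpt-4o"));
```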

### Custom Providers

Add providers through `~/.gsd/agent/models.json` (the simple route) or through extensions (the advanced route, with OAuth and custom streaming):

```json
// ~/.gsd/agent/models.json
{
  "providers": [{
    "name": "my-proxy",
    "baseUrl": "https://proxy.example.com",
    "apiKey": "PROXY_API_KEY",
    "api": "anthropic-messages",
    "models": [{ "id": "claude-sonnet-4", "name": "Sonnet via Proxy", ... }]
  }]
}
```
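When writing such a file by hand, it can help to sanity-check it before starting pi. The sketch below types only the fields visible in the example above and flags a few obvious mistakes. The interface names, `findProblems`, and the specific checks (including the http(s) requirement on `baseUrl`) are assumptions made for illustration; pi's own validation and the full model schema are not shown in the source and may differ.

```typescript
// Shape inferred from the example above. Real model entries carry more
// fields (elided as "..." in the source), which this sketch does not model.
interface ModelEntry {
  id: string;
  name: string;
}

interface ProviderEntry {
  name: string;
  baseUrl: string;
  apiKey: string;
  api: string;
  models: ModelEntry[];
}

interface ModelsFile {
  providers: ProviderEntry[];
}

// Hypothetical helper: returns human-readable problems, empty when none found.
function findProblems(file: ModelsFile): string[] {
  const problems: string[] = [];
  const seen: string[] = [];
  for (const provider of file.providers) {
    if (provider.name.trim() === "") {
      problems.push("provider with empty name");
    }
    if (seen.indexOf(provider.name) !== -1) {
      problems.push("duplicate provider: " + provider.name);
    }
    seen.push(provider.name);
    // Assumption: a proxy endpoint is always an http(s) URL.
    if (!/^https?:\/\//.test(provider.baseUrl)) {
      problems.push(provider.name + ": baseUrl should be an http(s) URL");
    }
    if (provider.models.length === 0) {
      problems.push(provider.name + ": no models listed");
    }
  }
  return problems;
}
```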

---
50
docs/what-is-pi/11-the-interactive-tui.md
Normal file
@ -0,0 +1,50 @@
# The Interactive TUI

Pi's terminal interface is built with a custom TUI framework (`@mariozechner/pi-tui`).

### Layout (top to bottom)

```
┌─────────────────────────────────────────────────────────────┐
│ Startup Header                                              │
│ Shows: shortcuts, loaded AGENTS.md files, prompts,          │
│ skills, extensions                                          │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│ Messages Area                                               │
│ User messages, assistant responses, tool calls/results,     │
│ notifications, errors, extension UI                         │
│                                                             │
├─────────────────────────────────────────────────────────────┤
│ [Widgets above editor - from extensions]                    │
├─────────────────────────────────────────────────────────────┤
│ Editor (input area)                                         │
│ Border color = thinking level                               │
├─────────────────────────────────────────────────────────────┤
│ [Widgets below editor - from extensions]                    │
├─────────────────────────────────────────────────────────────┤
│ Footer: cwd │ session name │ tokens │ cost │ context │ model│
│ [Extension status indicators]                               │
└─────────────────────────────────────────────────────────────┘
```

### Editor Features

| Feature | How |
|---------|-----|
| File reference | Type `@` to fuzzy-search project files |
| Path completion | Tab to complete paths |
| Multi-line | Shift+Enter |
| Images | Ctrl+V to paste, or drag onto terminal |
| Bash commands | `!command` (sends output to LLM), `!!command` (runs without sending) |
| External editor | Ctrl+G opens `$VISUAL` or `$EDITOR` |
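The two bash prefixes in the table differ only in whether the command's output reaches the model. A minimal sketch of that distinction follows. `classifyInput` and the `EditorInput` type are hypothetical names chosen for illustration, and the real editor may handle whitespace and edge cases differently.

```typescript
// Hypothetical classification of a submitted editor line; not pi's code.
type EditorInput =
  | { kind: "prompt"; text: string }
  | { kind: "bash"; command: string; sendToModel: boolean };

function classifyInput(line: string): EditorInput {
  // Check the longer prefix first, otherwise "!!cmd" would match "!".
  if (line.slice(0, 2) === "!!") {
    return { kind: "bash", command: line.slice(2).trim(), sendToModel: false };
  }
  if (line.slice(0, 1) === "!") {
    return { kind: "bash", command: line.slice(1).trim(), sendToModel: true };
  }
  return { kind: "prompt", text: line };
}
```

With this reading, `!git status` runs the command and forwards its output to the LLM, while `!!git status` runs it for your eyes only.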

### Tool Output Display

Tool calls and results are rendered inline with collapsible output:
- `Ctrl+O` — Toggle expand/collapse all tool output
- `Ctrl+T` — Toggle expand/collapse thinking blocks

Extensions can provide custom renderers for their tools, controlling exactly how tool calls and results appear.

---
@ -0,0 +1,20 @@
# The Message Queue — Talking While Pi Thinks

Pi doesn't make you wait for the agent to finish before sending more instructions. You can queue messages while the agent is streaming:

| Key | Behavior |
|-----|----------|
| **Enter** | Queue a **steering** message — delivered after current tool, interrupts remaining tools |
| **Alt+Enter** | Queue a **follow-up** message — delivered after agent finishes all work |
| **Escape** | Abort the agent and restore queued messages to editor |
| **Alt+Up** | Retrieve queued messages back to editor |

**Steering** is for course-correction: "Stop, do this instead." The message is delivered after the current tool finishes, but remaining tool calls in the LLM's response are skipped.

**Follow-up** is for chaining: "After you're done with that, also do this." The message waits until the agent has no more tool calls to make.

**Settings:**
- `steeringMode`: `"one-at-a-time"` (default) or `"all"` (deliver all queued at once)
- `followUpMode`: same options
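The two queues and their delivery modes can be pictured with a small model. This is a sketch under stated assumptions, not pi's implementation: the `MessageQueue` class, its method names, and the order in which restored messages come back (steering first, then follow-up) are all hypothetical. The caller is assumed to invoke `takeSteering` after each tool finishes and `takeFollowUp` once the agent has no more tool calls to make.

```typescript
type DeliveryMode = "one-at-a-time" | "all";

// Hypothetical model of the steering and follow-up queues.
class MessageQueue {
  private steering: string[] = [];
  private followUp: string[] = [];
  private steeringMode: DeliveryMode;
  private followUpMode: DeliveryMode;

  constructor(
    steeringMode: DeliveryMode = "one-at-a-time",
    followUpMode: DeliveryMode = "one-at-a-time",
  ) {
    this.steeringMode = steeringMode;
    this.followUpMode = followUpMode;
  }

  // Enter while the agent is streaming.
  queueSteering(text: string): void {
    this.steering.push(text);
  }

  // Alt+Enter while the agent is streaming.
  queueFollowUp(text: string): void {
    this.followUp.push(text);
  }

  // Called after the current tool finishes.
  takeSteering(): string[] {
    return drain(this.steering, this.steeringMode);
  }

  // Called when the agent has no more tool calls to make.
  takeFollowUp(): string[] {
    return drain(this.followUp, this.followUpMode);
  }

  // Escape or Alt+Up: hand everything back to the editor and empty both queues.
  restoreToEditor(): string[] {
    return [...this.steering.splice(0), ...this.followUp.splice(0)];
  }
}

// "all" removes and returns every queued message; otherwise only the oldest.
function drain(queue: string[], mode: DeliveryMode): string[] {
  return mode === "all" ? queue.splice(0) : queue.splice(0, 1);
}
```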

---
34
docs/what-is-pi/13-context-files-project-instructions.md
Normal file
@ -0,0 +1,34 @@
# Context Files — Project Instructions

Pi loads instruction files automatically at startup:

### AGENTS.md (or CLAUDE.md)

Pi looks for `AGENTS.md` or `CLAUDE.md` in:
1. `~/.gsd/agent/AGENTS.md` (global)
2. Every parent directory from cwd up to filesystem root
3. Current directory

All matching files are concatenated and included in the system prompt. Use these for project conventions, common commands, architectural notes.
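The lookup locations listed above can be enumerated with a short helper. This is only an illustration: `parentDirs` and `contextFileCandidates` are hypothetical names, the sketch assumes POSIX-style absolute paths, and it lists candidates in the same order as the list above. The source does not say in which order pi concatenates the files it finds, so the ordering here should not be read as pi's behavior.

```typescript
const CONTEXT_FILE_NAMES = ["AGENTS.md", "CLAUDE.md"];

// Parent directories of cwd, nearest first, ending at the filesystem root.
function parentDirs(cwd: string): string[] {
  const parts = cwd.split("/").filter((p) => p.length > 0);
  const dirs: string[] = [];
  for (let n = parts.length - 1; n >= 0; n--) {
    dirs.push("/" + parts.slice(0, n).join("/"));
  }
  return dirs;
}

// Candidate paths to check: the global file, then parents, then cwd itself.
function contextFileCandidates(home: string, cwd: string): string[] {
  const candidates: string[] = [home + "/.gsd/agent/AGENTS.md"];
  const dirs = parentDirs(cwd).concat([cwd]);
  for (const dir of dirs) {
    for (const name of CONTEXT_FILE_NAMES) {
      // Avoid a double slash when the directory is the root.
      candidates.push(dir === "/" ? "/" + name : dir + "/" + name);
    }
  }
  return candidates;
}
```

For a `cwd` of `/home/me/proj`, the helper lists the global file first, then `/home/me`, `/home`, and `/`, and finally `/home/me/proj`, with both file names tried in each directory.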

### System Prompt Override

Replace the default system prompt entirely:
- `.gsd/SYSTEM.md` (project)
- `~/.gsd/agent/SYSTEM.md` (global)

Append to it instead:
- `.gsd/APPEND_SYSTEM.md` (project)
- `~/.gsd/agent/APPEND_SYSTEM.md` (global)

### File Arguments

Include files directly in prompts from the CLI:

```bash
pi @prompt.md "Answer this"
pi -p @screenshot.png "What's in this image?"
pi @code.ts @test.ts "Review these files"
```

---
Some files were not shown because too many files have changed in this diff.