# Repository Guidelines

## Setup Checklist for New Contributors

- [ ] Install dev dependencies: `npm install`
- [ ] Install pre-commit hooks: `npm run secret-scan:install-hook`
- [ ] Apply GitHub labels: `gh label create priority/P0 --color B60205 --description "Critical"` (see .github/labels.yml for full list)
- [ ] Verify devcontainer: `devcontainer build --workspace-folder .`
- [ ] Run first tech-debt scan: `node scripts/tech-debt-scan.mjs`

## Purpose-First Doctrine

sf follows **spec-first TDD**: see [`docs/SPEC_FIRST_TDD.md`](docs/SPEC_FIRST_TDD.md) for the full constitution.

Iron Law:

```
THE TEST IS THE SPEC.  THE JSDOC IS THE PURPOSE.  CODE EXISTS TO FULFILL PURPOSE.

NO BEHAVIOR CHANGE WITHOUT A FAILING TEST FIRST.
NO COMPLETION WITHOUT A REAL CONSUMER.
NO JUDGMENT CALL WITHOUT A CONFIDENCE AND FALSIFIER.
```

Every artifact (slice plan, task plan, function, test, ADR) must answer:

- **why** this behaviour exists
- **what value** it creates or protects
- **who** uses it in production (real consumer, not just tests)
- **what breaks** if it returns the wrong answer

If any answer is missing: `BLOCKED: purpose unclear — [field]`. Surfacing the gap beats rationalising past it.
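
The four questions above can be checked mechanically. The following is a minimal sketch, not the project's tooling: `PurposeRecord` and `checkPurpose` are hypothetical names, and it assumes the `[field]` placeholder in the BLOCKED message is replaced by the name of the first unanswered field.

```typescript
// Hypothetical shape: one optional field per question in the list above.
interface PurposeRecord {
  why?: string;      // why this behaviour exists
  value?: string;    // what value it creates or protects
  consumer?: string; // who uses it in production
  breaks?: string;   // what breaks if it returns the wrong answer
}

const PURPOSE_FIELDS = ["why", "value", "consumer", "breaks"] as const;

/** Returns the BLOCKED message for the first unanswered field, or null when all four are answered. */
function checkPurpose(record: PurposeRecord): string | null {
  for (const field of PURPOSE_FIELDS) {
    const answer = record[field];
    if (answer === undefined || answer.trim() === "") {
      // Message format taken from the guideline; the field substitution is an assumption.
      return `BLOCKED: purpose unclear — ${field}`;
    }
  }
  return null;
}
```

A record with every field filled returns `null`; anything else surfaces the gap instead of letting the artifact proceed.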

## Project Structure

This is a TypeScript monorepo with npm workspaces. The main entry point is `dist/loader.js` (bin: `sf`).

- `src/` — Main CLI source (sf-run core, extensions, agents)
- `packages/` — Workspace packages (8 total): pi-tui, pi-ai, pi-agent-core, pi-coding-agent, daemon, mcp-server, native, rpc-client
- `web/` — Next.js web frontend (optional web host mode)
- `rust-engine/` — Rust N-API bindings for performance-critical operations
- `scripts/` — Build, dev, release, and CI helper scripts
- `tests/` — Fixtures, smoke tests, live tests, live-regression tests
- `docs/` — User guides and developer documentation
- `docker/` — Docker sandbox and builder configurations

## Build, Test, and Development Commands

```bash
# Full build (core + web)
npm run build

# Build core only (packages + tsc + resources)
npm run build:core

# Dev mode with hot reload
npm run dev

# Run all tests (unit + integration)
npm test

# Unit tests only
npm run test:unit

# Integration tests only
npm run test:integration

# Coverage check (Vitest V8 provider; thresholds: statements 40%, lines 40%, branches 20%, functions 20%)
npm run test:coverage

# Type check extensions (no emit)
npm run typecheck:extensions

# Native Rust build
npm run build:native

# Root lint checks (Biome over src/)
npm run lint
npm run lint:fix

# Web lint (Next.js ESLint; separate package)
npm --prefix web run lint

# Release workflow (changelog + version bump)
npm run release:changelog
npm run release:bump
```

## Coding Style & Naming Conventions

- **Language**: TypeScript with `"strict": true` enabled in all packages
- **Module resolution**: NodeNext
- **Target**: ES2022
- **Package manager**: npm (canonical; do not commit `bun.lock` or `pnpm-lock.yaml`)
- **Commit format**: Conventional Commits enforced via commit-msg hook
- **Branch naming**: `<type>/<short-description>` — e.g. `feat/new-command`, `fix/login-bug`
  - Types: `feat`, `fix`, `docs`, `chore`, `refactor`, `test`, `infra`, `ci`, `perf`, `build`, `revert`
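
The branch naming rule can be expressed as a pattern. This is a sketch rather than an existing hook: `isValidBranchName` is a hypothetical helper, and it assumes the short description is lowercase kebab-case, as in the two examples above.

```typescript
// Branch types copied from the list above.
const BRANCH_TYPES = [
  "feat", "fix", "docs", "chore", "refactor", "test",
  "infra", "ci", "perf", "build", "revert",
] as const;

// Assumption: the description is lowercase words or digits joined by single hyphens.
const BRANCH_PATTERN = new RegExp(
  `^(${BRANCH_TYPES.join("|")})/[a-z0-9]+(-[a-z0-9]+)*$`,
);

/** Returns true when the name has the form <type>/<short-description>. */
function isValidBranchName(name: string): boolean {
  return BRANCH_PATTERN.test(name);
}
```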

### JSDoc Purpose Convention

Every exported function, type, class, and module-level constant opens with a JSDoc block that states its **purpose**: the consumer-facing reason it exists. A summary of what it does is not enough (the signature shows most of that); the block must also say **why**.

```ts
/**
 * Acquire a unit claim atomically. Returns true on success, false if another worker
 * already holds an unexpired lease.
 *
 * Purpose: prevent two workers from dispatching the same unit when the run-lock is
 * unavailable (shared NFS, broken filesystem semantics) — the conditional UPDATE in
 * SQLite is the safety net.
 *
 * Consumer: auto-dispatch.ts when picking the next eligible unit per poll tick.
 */
export function claimUnit(unitId: string, leaseMs: number): boolean { ... }
```

Required for every exported symbol whose behaviour is non-trivial:

- **First line** — what it returns / does, in the present tense.
- **Purpose:** — why it exists; the value it protects.
- **Consumer:** — who calls it in production. If you can't name a consumer, the symbol shouldn't exist yet.

A bare `/** Helper. */` is a code smell. Either write the purpose or delete the symbol.

For module-level JSDoc (file headers): keep the existing `module-name.ts — short description` opening, then a `Purpose:` line stating why the module exists as a separable unit.
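
As an illustration of the file-header form, here is a sketch for a hypothetical module. The file name `unit-claims.ts`, the constant `DEFAULT_LEASE_MS`, its value, and the wording of each line are all invented for the example.

```typescript
/**
 * unit-claims.ts — lease-based claims on dispatchable units
 *
 * Purpose: keep claim acquisition and expiry in one module so that every
 * dispatcher shares the same lease semantics.
 */

/**
 * Default lease length in milliseconds.
 *
 * Purpose: bound how long a crashed worker can hold a unit before another
 * worker may claim it.
 *
 * Consumer: claimUnit callers that do not pass an explicit lease (hypothetical).
 */
export const DEFAULT_LEASE_MS = 30_000;
```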

## Testing Guidelines

- **Primary test runner**: Vitest via `npm run test:unit`, `npm run test:integration`, and `npm test`
- **Node test runner**: used only by specific package/native/browser-tool scripts where `package.json` says `node --test`
- **Coverage tool**: Vitest coverage with `@vitest/coverage-v8`; thresholds are enforced in CI
- **Naming**: `*.test.ts` and `*.test.mjs` patterns
- **Smoke tests**: `npm run test:smoke`
- **Live tests**: `npm run test:live` (requires environment variables)
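
For orientation, the stated thresholds would look roughly like this in a Vitest config. This is a sketch assuming Vitest 1.x or later, where thresholds sit under `coverage.thresholds`; the repository's actual config file may be named or organised differently.

```typescript
// vitest.config.ts (sketch; check the repository's real config before relying on it)
import { defineConfig } from "vitest/config";

export default defineConfig({
  test: {
    coverage: {
      provider: "v8", // backed by @vitest/coverage-v8
      thresholds: {
        statements: 40,
        lines: 40,
        branches: 20,
        functions: 20,
      },
    },
  },
});
```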

### Purposeful Tests

Test names are contract claims. Use the form `<what>_<when>_<expected>`:

| Good | Bad |
|---|---|
| `claim_when_lease_expired_returns_true` | `test claim` |
| `dispatch_when_blocker_unresolved_skips_unit` | `test dispatch logic` |
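
The naming form can be checked with a simple pattern. This is a heuristic sketch, not a rule the repository enforces: `isContractTestName` is a hypothetical helper, and it assumes snake_case names with a literal `when` between the subject and the condition, as in both "Good" examples.

```typescript
// <what>_when_<condition...>_<expected...>: at least one word before `when`
// and at least two words after it (a condition and an expectation).
const TEST_NAME_PATTERN =
  /^[a-z0-9]+(?:_[a-z0-9]+)*_when_[a-z0-9]+(?:_[a-z0-9]+)+$/;

/** Returns true when a test name reads as a contract claim in the table's form. */
function isContractTestName(name: string): boolean {
  return TEST_NAME_PATTERN.test(name);
}
```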

Three-tier organisation:

1. **Behaviour contracts** (primary) — what the consumer receives. The spec. A different implementation that passes these is equally correct.
2. **Degradation contracts** — what happens when dependencies fail. Consumer must always get a useful response; failure must degrade, not crash.
3. **Implementation guards** (secondary, labelled `// guard:`) — protect specific failure modes (resource leaks, infinite loops). Refactors update guards, not behaviour contracts.

Write behaviour contracts first. They are the work order.

A test that asserts call counts or mock interactions is **mechanical**, not purposeful — it should be a labelled implementation guard, not a primary contract test. A test that breaks on a refactor without behaviour change is mechanical too. Fix the test or relabel it.
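
The three tiers can be sketched against a toy lease store. Everything here is hypothetical: `LeaseStore`, `memoryStore`, this four-argument `claimUnit`, and the choice to return `false` when the store fails. In the repository these cases would be Vitest `it(...)` blocks; plain functions keep the sketch dependency-free.

```typescript
// Hypothetical store interface; the real claim logic is described as SQLite-backed.
interface LeaseStore {
  getExpiry(unitId: string): number | undefined;
  setExpiry(unitId: string, expiresAt: number): void;
}

function memoryStore(initial: Record<string, number> = {}): LeaseStore {
  const leases = new Map<string, number>(Object.entries(initial));
  return {
    getExpiry: (unitId) => leases.get(unitId),
    setExpiry: (unitId, expiresAt) => { leases.set(unitId, expiresAt); },
  };
}

function claimUnit(store: LeaseStore, unitId: string, leaseMs: number, now: number): boolean {
  try {
    const expiresAt = store.getExpiry(unitId);
    if (expiresAt !== undefined && expiresAt > now) return false;
    store.setExpiry(unitId, now + leaseMs);
    return true;
  } catch {
    return false; // assumed degradation: an unreachable store means "no claim"
  }
}

function expectEqual<T>(actual: T, expected: T): void {
  if (actual !== expected) throw new Error(`expected ${String(expected)}, got ${String(actual)}`);
}

// 1. Behaviour contract: what the consumer receives.
function claim_when_lease_expired_returns_true(): void {
  expectEqual(claimUnit(memoryStore({ u1: 1_000 }), "u1", 500, 2_000), true);
}

// 2. Degradation contract: a failing dependency degrades instead of crashing.
function claim_when_store_unreachable_returns_false(): void {
  const broken: LeaseStore = {
    getExpiry: () => { throw new Error("store offline"); },
    setExpiry: () => { throw new Error("store offline"); },
  };
  expectEqual(claimUnit(broken, "u1", 500, 2_000), false);
}

// 3. guard: a rejected claim must not extend the existing lease.
function claim_when_lease_active_keeps_expiry(): void {
  const store = memoryStore({ u1: 5_000 });
  expectEqual(claimUnit(store, "u1", 500, 2_000), false);
  expectEqual(store.getExpiry("u1"), 5_000);
}
```

Only the third case looks inside the store, which is why it carries the `// guard:` label.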

**Bug = missing correct-behaviour test.** When fixing a bug, write a test for the *correct* behaviour first — it must fail (RED) because the bug exists. If it passes immediately, the test is testing the broken behaviour; fix the test, not the code.

## Extension Development

Extensions live in `src/resources/extensions/`. Each extension should:
- Export a manifest with `name`, `version`, `tools[]`, and `agents[]`
- Include tests in `src/resources/extensions/<name>/tests/`
- Register tools via the extension API
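
A manifest with the four listed fields might be typed as follows. This is a sketch under assumptions: the element shapes of `tools[]` and `agents[]` are not specified here, so `ToolDefinition` and `AgentDefinition` are placeholders, and `missingManifestFields` is a hypothetical helper rather than part of the extension API.

```typescript
// Placeholder element types; the real extension API may require more fields.
interface ToolDefinition { name: string }
interface AgentDefinition { name: string }

interface ExtensionManifest {
  name: string;
  version: string;
  tools: ToolDefinition[];
  agents: AgentDefinition[];
}

/** Lists the manifest fields that are absent or have the wrong basic type. */
function missingManifestFields(candidate: Partial<ExtensionManifest>): string[] {
  const missing: string[] = [];
  if (typeof candidate.name !== "string" || candidate.name === "") missing.push("name");
  if (typeof candidate.version !== "string" || candidate.version === "") missing.push("version");
  if (!Array.isArray(candidate.tools)) missing.push("tools");
  if (!Array.isArray(candidate.agents)) missing.push("agents");
  return missing;
}

// Example manifest for an invented extension.
const exampleManifest: ExtensionManifest = {
  name: "example-extension",
  version: "0.1.0",
  tools: [{ name: "example_tool" }],
  agents: [],
};
```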

## Pull Request Guidelines

1. **Link an issue** — PRs without a linked issue will be closed without review
2. **One concern per PR** — don't bundle unrelated changes
3. **No drive-by formatting** — don't reformat code you didn't touch
4. **CI must pass** — fix failing tests before requesting review
5. **Rebase onto main** — do not merge main into your feature branch
6. Use the PR template at `.github/PULL_REQUEST_TEMPLATE.md`

## Environment Setup

Copy `docker/.env.example` to `.env` and fill in API keys. At minimum you need one LLM provider key (Anthropic, OpenAI, Google, or OpenRouter).
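
A startup check for the "at least one provider key" requirement could look like this. The variable names below are assumptions based on common provider conventions; `docker/.env.example` is the authoritative list, and `configuredProviders` is a hypothetical helper.

```typescript
// Assumed variable names; confirm against docker/.env.example.
const PROVIDER_KEY_VARS = [
  "ANTHROPIC_API_KEY",
  "OPENAI_API_KEY",
  "GOOGLE_API_KEY",
  "OPENROUTER_API_KEY",
] as const;

/** Returns the provider key variables that are set to a non-blank value. */
function configuredProviders(env: Record<string, string | undefined>): string[] {
  return PROVIDER_KEY_VARS.filter((name) => (env[name] ?? "").trim() !== "");
}
```

In a Node process this would typically be called as `configuredProviders(process.env)`, failing fast when the result is empty.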

## Architecture Notes

- State lives on disk in `.sf/` — no in-memory state survives across sessions
- Bundled extensions/agents sync to `~/.sf/agent/` on every launch
- LLM providers are lazy-loaded on first use to reduce cold-start time
- Native Rust engine handles grep, glob, ps, highlight, ast, diff
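
Lazy loading on first use, as described for LLM providers, is commonly done by memoising a loader. The sketch below shows the general technique and is not the repository's code: `lazy`, `Provider`, and the stub loader are hypothetical, and a real loader would more likely wrap a dynamic `import()` of a provider module.

```typescript
/** Wraps a loader so it runs at most once; later calls reuse the same promise. */
function lazy<T>(load: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | undefined;
  return () => {
    if (cached === undefined) {
      cached = load(); // first use pays the load cost
    }
    return cached;
  };
}

// Hypothetical provider shape and stub loader, for illustration only.
interface Provider { name: string }

let loadCount = 0;
const getProvider = lazy<Provider>(async () => {
  loadCount += 1;
  return { name: "stub-provider" };
});
```

One design consequence: a rejected promise stays cached, so a failed load is not retried unless the cache is cleared on error.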

## SF Planning State

`.sf/` is the canonical home for SF agent state. It holds milestone plans, slice plans, task plans, and ephemeral working files, stored in `.sf/milestones/`, `.sf/STATE.md`, `.sf/QUEUE.md`, and related artifacts.

**Promote-only rule:** Agent state (the `.sf/` directory under `~/.sf/projects/<hash>/`) is transient and gitignored — never committed directly. Project state (`.sf/` tracked in the repo root) contains only human-authored artifacts such as `DECISIONS.md`, `KNOWLEDGE.md`, `REQUIREMENTS.md`, `ROADMAP.md`, and `STATE.md`.

Promoted artifacts — milestone summaries, architecture decision records (ADRs), and durable specifications — belong in tracked documentation directories:

- `docs/plans/` — reviewed implementation plans promoted from `.sf/` milestone planning
- `docs/adr/` — accepted architectural decisions promoted from `.sf/DECISIONS.md`
- `docs/specs/` — long-lived behaviour contracts and API specifications

**Naming conventions:**
- Milestone IDs: `M001`, `M002`, …
- Slice IDs: `S01`, `S02`, …
- Task IDs: `T01`, `T02`, …
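
The ID shapes above can be generated with a small formatter. This is a sketch: `formatPlanId` is a hypothetical helper, and it assumes numbering starts at 1 with zero-padding to three digits for milestones and two for slices and tasks. Behaviour past the padded width is not specified here.

```typescript
type PlanLevel = "milestone" | "slice" | "task";

// Prefix and padding width inferred from the examples M001, S01, T01.
const ID_FORMAT: Record<PlanLevel, { prefix: string; width: number }> = {
  milestone: { prefix: "M", width: 3 },
  slice: { prefix: "S", width: 2 },
  task: { prefix: "T", width: 2 },
};

/** Formats a 1-based sequence number as a milestone, slice, or task ID. */
function formatPlanId(level: PlanLevel, sequence: number): string {
  if (!Number.isInteger(sequence) || sequence < 1) {
    throw new RangeError(`sequence must be a positive integer, got ${sequence}`);
  }
  const { prefix, width } = ID_FORMAT[level];
  return prefix + String(sequence).padStart(width, "0");
}
```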

**Commands:**
- `sf plan promote <source>` — copy a file from `.sf/` to `docs/plans/`, `docs/adr/`, or `docs/specs/`
- `sf plan list` — list milestone and slice files in `.sf/`
- `sf plan diff` — compare `.sf/` state with promoted `docs/` artifacts

See [`docs/plans/README.md`](docs/plans/README.md), [`docs/adr/README.md`](docs/adr/README.md), and [`docs/specs/README.md`](docs/specs/README.md) for directory-specific conventions.

## Eval Dump Inbox

SF/Pi automatically loads `AGENTS.md` and `CLAUDE.md` from the repo tree at
startup. It does not automatically load `TODO.md`, but this repo uses root
`TODO.md` as a temporary human dump inbox for eval and self-evolution ideas.

When a repo contains a root `TODO.md`, treat it as a temporary dump inbox and
read it before planning substantive work in that repo. This applies even when
the user does not explicitly mention evals. Treat the `Raw Dump Inbox` section
as untriaged source material, not as durable instructions. Triage it into
reviewable artifacts: concrete eval cases, harness gaps, memory extraction
requirements, docs, tests, or follow-up implementation tasks. After triage,
remove the processed dump notes from `TODO.md` so the file returns to an empty
inbox/template state. Do not treat dumped notes as runtime memory or approved
behaviour until they are converted into tested, versioned project artifacts.

## CI/CD

- `ci.yml` — builds, tests, gates merges to main
- `pipeline.yml` — three-stage release (dev → test → prod)
- `pr-risk.yml` — PR risk classification
- `ai-triage.yml` — AI-based issue/PR triage

## Code Quality Tooling

The repository uses the following quality tools:

- **Biome** — root source linting via `npm run lint` and autofix via `npm run lint:fix`
  - Scope: `src/` plus versioned JSON checks
  - Config: `biome.json`
  - Format touched files with `npx biome check --write <paths>`; full-repo formatting is not the current CI gate.
- **ESLint** — web app linting via `npm --prefix web run lint`
  - Scope: `web/`
  - Config: `web/eslint.config.mjs`
- **TypeScript** — Strict mode enabled; run `npm run typecheck:extensions`
- **Knip** — Detect unused code and dependencies: `npx knip` (config at `knip.json`)
- **jscpd** — Detect duplicate code: `npx jscpd` (config at `.jscpd.json`)
- **Tech Debt Scanner** — `node scripts/tech-debt-scan.mjs`
  - Tracks TODO/FIXME/HACK/XXX counts against thresholds
- **Secret Scan** — `npm run secret-scan` (pre-commit hook available via `npm run secret-scan:install-hook`)
- **Coverage** — `npm run test:coverage` (Vitest V8 coverage; thresholds: statements 40%, lines 40%, branches 20%, functions 20%)
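
To make the Tech Debt Scanner entry concrete, marker counting against thresholds can be sketched as follows. This is an illustration, not the contents of `scripts/tech-debt-scan.mjs`: the function names are hypothetical, and the threshold values in any call are invented.

```typescript
const DEBT_MARKERS = ["TODO", "FIXME", "HACK", "XXX"] as const;
type DebtMarker = (typeof DEBT_MARKERS)[number];

/** Counts whole-word, uppercase occurrences of each marker in a source string. */
function countDebtMarkers(source: string): Record<DebtMarker, number> {
  const counts: Record<DebtMarker, number> = { TODO: 0, FIXME: 0, HACK: 0, XXX: 0 };
  for (const marker of DEBT_MARKERS) {
    const matches = source.match(new RegExp(`\\b${marker}\\b`, "g"));
    counts[marker] = (matches ?? []).length;
  }
  return counts;
}

/** Returns the markers whose count exceeds its threshold; markers without a threshold never fail. */
function overThreshold(
  counts: Record<DebtMarker, number>,
  thresholds: Partial<Record<DebtMarker, number>>,
): DebtMarker[] {
  return DEBT_MARKERS.filter((marker) => counts[marker] > (thresholds[marker] ?? Infinity));
}
```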

## Dev Container

A Dev Container configuration is available at `.devcontainer/devcontainer.json`.
Open the repository in VS Code with the Dev Containers extension, or run:

```bash
devcontainer up --workspace-folder .
```

The container includes Node 24, Rust, GitHub CLI, Docker-in-Docker, and recommended VS Code extensions.

## Dependency Updates

Dependabot is configured at `.github/dependabot.yml` for:
- Root npm dependencies (weekly, grouped by ecosystem)
- Web app dependencies (weekly)
- GitHub Actions (weekly)

## Issue Labels

Label definitions are at `.github/labels.yml`. Apply labels using:

```bash
# Create a single label
gh label create priority/P0 --color B60205 --description "Critical — blocks release"

# Or use a label management action in CI
```