singularity-forge/AGENTS.md
Mikael Hugo 6f6ace3da6 chore: Node 24.15 floor + modernization round-up
- engines.node: >=24.15.0 across all 23 package.json (root + 8
  workspace + studio + web + pkg + vscode-extension + 11 SF
  extension manifests)
- CI workflows pinned to node-version: '24.15' (16 sites)
- Dockerfile -> node:24.15-slim
- .nvmrc / .node-version -> 24.15.0
- Refactored worktree-cli.ts and headless-query.ts to use
  import.meta.filename instead of fileURLToPath(import.meta.url)
- exec.ts simplified with AbortSignal.any + spawn signal/killSignal
- Picks up Crush's biome.json + AGENTS.md doc cleanup in same pass

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 06:37:36 +02:00

# Repository Guidelines
## Setup Checklist for New Contributors
- [ ] Install dev dependencies: `npm install`
- [ ] Install pre-commit hooks: `npm run secret-scan:install-hook`
- [ ] Apply GitHub labels: `gh label create priority/P0 --color B60205 --description "Critical"` (see .github/labels.yml for full list)
- [ ] Verify devcontainer: `devcontainer build --workspace-folder .`
- [ ] Run first tech-debt scan: `node scripts/tech-debt-scan.mjs`
## Purpose-First Doctrine
sf follows **spec-first TDD**: see [`docs/SPEC_FIRST_TDD.md`](docs/SPEC_FIRST_TDD.md) for the full constitution.
Iron Law:
```
THE TEST IS THE SPEC. THE JSDOC IS THE PURPOSE. CODE EXISTS TO FULFILL PURPOSE.
NO BEHAVIOR CHANGE WITHOUT A FAILING TEST FIRST.
NO COMPLETION WITHOUT A REAL CONSUMER.
NO JUDGMENT CALL WITHOUT A CONFIDENCE AND FALSIFIER.
```
Every artifact (slice plan, task plan, function, test, ADR) must answer:
- **why** this behaviour exists
- **what value** it creates or protects
- **who** uses it in production (real consumer, not just tests)
- **what breaks** if it returns the wrong answer

If any answer is missing: `BLOCKED: purpose unclear — [field]`. Surfacing the gap beats rationalising past it.
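The four questions can be pictured as a small gate. This is a minimal sketch, not part of sf: `PurposeRecord`, its field names, and `checkPurpose` are hypothetical, and only the four questions and the `BLOCKED` message format come from the doctrine above.

```typescript
// Hypothetical shape: one optional field per question every artifact must answer.
interface PurposeRecord {
  why?: string;           // why this behaviour exists
  value?: string;         // what value it creates or protects
  consumer?: string;      // who uses it in production
  failureImpact?: string; // what breaks if it returns the wrong answer
}

const REQUIRED_FIELDS = ["why", "value", "consumer", "failureImpact"] as const;

// Returns "OK" when every question is answered, otherwise a BLOCKED line
// naming the first missing field.
function checkPurpose(record: PurposeRecord): string {
  for (const field of REQUIRED_FIELDS) {
    const answer = record[field];
    if (answer === undefined || answer.trim() === "") {
      return `BLOCKED: purpose unclear — ${field}`;
    }
  }
  return "OK";
}
```

Under these assumptions, a record that names no consumer yields `BLOCKED: purpose unclear — consumer` instead of passing silently.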
## Project Structure
This is a TypeScript monorepo with npm workspaces. The main entry point is `dist/loader.js` (bin: `sf`).
- `src/` — Main CLI source (sf-run core, extensions, agents)
- `packages/` — Workspace packages (8 total): pi-tui, pi-ai, pi-agent-core, pi-coding-agent, daemon, mcp-server, native, rpc-client
- `web/` — Next.js web frontend (optional web host mode)
- `rust-engine/` — Rust N-API bindings for performance-critical operations
- `scripts/` — Build, dev, release, and CI helper scripts
- `tests/` — Fixtures, smoke tests, live tests, live-regression tests
- `docs/` — User guides and developer documentation
- `docker/` — Docker sandbox and builder configurations
## Build, Test, and Development Commands
```bash
# Full build (core + web)
npm run build
# Build core only (packages + tsc + resources)
npm run build:core
# Dev mode with hot reload
npm run dev
# Run all tests (unit + integration)
npm test
# Unit tests only
npm run test:unit
# Integration tests only
npm run test:integration
# Coverage check (Vitest V8 provider; thresholds: statements 40%, lines 40%, branches 20%, functions 20%)
npm run test:coverage
# Type check extensions (no emit)
npm run typecheck:extensions
# Native Rust build
npm run build:native
# Root lint checks (Biome over src/)
npm run lint
npm run lint:fix
# Web lint (Next.js ESLint; separate package)
npm --prefix web run lint
# Release workflow (changelog + version bump)
npm run release:changelog
npm run release:bump
```
## Coding Style & Naming Conventions
- **Language**: TypeScript with `"strict": true` enabled in all packages
- **Module resolution**: NodeNext
- **Target**: ES2022
- **Package manager**: npm (canonical; do not commit `bun.lock` or `pnpm-lock.yaml`)
- **Commit format**: Conventional Commits enforced via commit-msg hook
- **Branch naming**: `<type>/<short-description>` — e.g. `feat/new-command`, `fix/login-bug`
  - Allowed `<type>` values: `feat`, `fix`, `docs`, `chore`, `refactor`, `test`, `infra`, `ci`, `perf`, `build`, `revert`
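The branch naming rule can be expressed as a single pattern. The sketch below is illustrative only: `isValidBranchName` is a hypothetical helper, and it assumes the short description is lowercase kebab-case, which the two examples suggest but the guidelines do not state.

```typescript
// Allowed type prefixes, copied from the list above.
const BRANCH_TYPES = [
  "feat", "fix", "docs", "chore", "refactor", "test",
  "infra", "ci", "perf", "build", "revert",
] as const;

// Assumption: the description is lowercase kebab-case, as in
// `feat/new-command` and `fix/login-bug`.
const BRANCH_PATTERN = new RegExp(
  `^(${BRANCH_TYPES.join("|")})/[a-z0-9]+(-[a-z0-9]+)*$`,
);

// Returns true when the name follows `<type>/<short-description>`.
function isValidBranchName(name: string): boolean {
  return BRANCH_PATTERN.test(name);
}
```

A check like this could run in a local hook or in CI; whether the repository enforces branch names automatically is not stated here.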
### JSDoc Purpose Convention
Every exported function, type, class, and module-level constant opens with a JSDoc block that states its **purpose**: the consumer-facing reason it exists. A summary of what it does is not enough (the signature already shows most of that); the block must also say **why**.
```ts
/**
 * Acquire a unit claim atomically. Returns true on success, false if another worker
 * already holds an unexpired lease.
 *
 * Purpose: prevent two workers from dispatching the same unit when the run-lock is
 * unavailable (shared NFS, broken filesystem semantics) — the conditional UPDATE in
 * SQLite is the safety net.
 *
 * Consumer: auto-dispatch.ts when picking the next eligible unit per poll tick.
 */
export function claimUnit(unitId: string, leaseMs: number): boolean { ... }
```
Required for every exported symbol whose behaviour is non-trivial:
- **First line** — what it returns / does, in the present tense.
- **Purpose:** — why it exists; the value it protects.
- **Consumer:** — who calls it in production. If you can't name a consumer, the symbol shouldn't exist yet.

A bare `/** Helper. */` is a code smell. Either write the purpose or delete the symbol.

For module-level JSDoc (file headers): keep the existing `module-name.ts — short description` opening, then a `Purpose:` line stating why the module exists as a separable unit.
## Testing Guidelines
- **Primary test runner**: Vitest via `npm run test:unit`, `npm run test:integration`, and `npm test`
- **Node test runner**: used only by specific package/native/browser-tool scripts where `package.json` says `node --test`
- **Coverage tool**: Vitest coverage with `@vitest/coverage-v8`; thresholds are enforced in CI
- **Naming**: `*.test.ts` and `*.test.mjs` patterns
- **Smoke tests**: `npm run test:smoke`
- **Live tests**: `npm run test:live` (requires environment variables)
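For orientation, the coverage thresholds quoted in this file (statements 40%, lines 40%, branches 20%, functions 20%) would look roughly like this in a Vitest config. This is a sketch under assumptions: the file name and layout are hypothetical, the repository's real config may be organised differently, and it presumes a Vitest version that supports `coverage.thresholds`.

```typescript
// vitest.config.ts (hypothetical location; check the repo for the real file)
import { defineConfig } from "vitest/config";

export default defineConfig({
  test: {
    coverage: {
      provider: "v8", // provided by @vitest/coverage-v8
      thresholds: {
        statements: 40,
        lines: 40,
        branches: 20,
        functions: 20,
      },
    },
  },
});
```

With thresholds set this way, a coverage run fails when any metric drops below its floor, which is how a CI gate can be built on `npm run test:coverage`.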
### Purposeful Tests
Test names are contract claims. Use the form `<what>_<when>_<expected>`:

| Good | Bad |
|---|---|
| `claim_when_lease_expired_returns_true` | `test claim` |
| `dispatch_when_blocker_unresolved_skips_unit` | `test dispatch logic` |

Three-tier organisation:
1. **Behaviour contracts** (primary) — what the consumer receives. The spec. A different implementation that passes these is equally correct.
2. **Degradation contracts** — what happens when dependencies fail. Consumer must always get a useful response; failure must degrade, not crash.
3. **Implementation guards** (secondary, labelled `// guard:`) — protect specific failure modes (resource leaks, infinite loops). Refactors update guards, not behaviour contracts.

Write behaviour contracts first. They are the work order.

A test that asserts call counts or mock interactions is **mechanical**, not purposeful — it should be a labelled implementation guard, not a primary contract test. A test that breaks on a refactor without behaviour change is mechanical too. Fix the test or relabel it.

**Bug = missing correct-behaviour test.** When fixing a bug, write a test for the *correct* behaviour first — it must fail (RED) because the bug exists. If it passes immediately, the test is testing the broken behaviour; fix the test, not the code.
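To make the naming form and the tiers concrete, here is a framework-free sketch. Everything in it is hypothetical: `claim` is an in-memory stand-in rather than the real lease logic, and `check` replaces a test runner's assertions so the example runs on its own. It shows two behaviour contracts and one labelled guard; a degradation contract is omitted for brevity.

```typescript
// Hypothetical in-memory lease table standing in for real storage.
type Lease = { holder: string; expiresAt: number };
const leases = new Map<string, Lease>();

// Claim succeeds unless another lease on the unit is still unexpired at `now`.
function claim(unitId: string, worker: string, leaseMs: number, now: number): boolean {
  const current = leases.get(unitId);
  if (current !== undefined && current.expiresAt > now) return false;
  leases.set(unitId, { holder: worker, expiresAt: now + leaseMs });
  return true;
}

// Minimal assertion helper so the sketch needs no test framework.
function check(condition: boolean, message: string): void {
  if (!condition) throw new Error(message);
}

// Behaviour contract: asserts only what the consumer receives.
function claim_when_lease_expired_returns_true(): void {
  leases.clear();
  claim("unit-1", "worker-a", 1000, 0);
  check(claim("unit-1", "worker-b", 1000, 1000), "expired lease should be claimable");
}

// Behaviour contract: an active lease blocks a second worker.
function claim_when_lease_active_returns_false(): void {
  leases.clear();
  claim("unit-1", "worker-a", 1000, 0);
  check(!claim("unit-1", "worker-b", 1000, 500), "active lease should block the claim");
}

// guard: repeated claims on one unit must not grow the lease table.
function claim_when_repeated_keeps_single_lease_entry(): void {
  leases.clear();
  for (let tick = 0; tick < 5; tick++) claim("unit-1", "worker-a", 1, tick * 10);
  check(leases.size === 1, "lease table should hold one entry per unit");
}

claim_when_lease_expired_returns_true();
claim_when_lease_active_returns_false();
claim_when_repeated_keeps_single_lease_entry();
```

The two contracts would survive a rewrite of `claim` on top of a database; the guard is tied to the `Map` and is the one a refactor would be expected to update.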
## Extension Development
Extensions live in `src/resources/extensions/`. Each extension should:
- Export a manifest with `name`, `version`, `tools[]`, and `agents[]`
- Include tests in `src/resources/extensions/<name>/tests/`
- Register tools via the extension API
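As a rough illustration of the manifest fields listed above, the sketch below checks that a candidate object has `name`, `version`, `tools[]`, and `agents[]`. The `ExtensionManifest` type and `validateManifest` function are hypothetical; the real extension API may type tools and agents more strictly or require further fields.

```typescript
// Hypothetical manifest shape: only the four field names come from the guidelines.
// Element types of `tools` and `agents` are left as `unknown` because they are not specified here.
interface ExtensionManifest {
  name: string;
  version: string;
  tools: unknown[];
  agents: unknown[];
}

// Returns a list of problems; an empty list means the four fields are present
// with the expected basic types.
function validateManifest(candidate: unknown): string[] {
  if (typeof candidate !== "object" || candidate === null) {
    return ["manifest must be an object"];
  }
  const fields = candidate as Record<string, unknown>;
  const problems: string[] = [];
  if (typeof fields["name"] !== "string" || fields["name"] === "") {
    problems.push("name must be a non-empty string");
  }
  if (typeof fields["version"] !== "string" || fields["version"] === "") {
    problems.push("version must be a non-empty string");
  }
  if (!Array.isArray(fields["tools"])) problems.push("tools must be an array");
  if (!Array.isArray(fields["agents"])) problems.push("agents must be an array");
  return problems;
}

// Example manifest with placeholder values.
const exampleManifest: ExtensionManifest = {
  name: "example-extension",
  version: "0.1.0",
  tools: [],
  agents: [],
};
```

A check of this kind fits naturally in the extension's own `tests/` directory, so a malformed manifest fails before the extension is loaded.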
## Pull Request Guidelines
1. **Link an issue** — PRs without a linked issue will be closed without review
2. **One concern per PR** — don't bundle unrelated changes
3. **No drive-by formatting** — don't reformat code you didn't touch
4. **CI must pass** — fix failing tests before requesting review
5. **Rebase onto main** — do not merge main into your feature branch
6. Use the PR template at `.github/PULL_REQUEST_TEMPLATE.md`
## Environment Setup
Copy `docker/.env.example` to `.env` and fill in API keys. At minimum you need one LLM provider key (Anthropic, OpenAI, Google, or OpenRouter).
## Architecture Notes
- State lives on disk in `.sf/` — no in-memory state survives across sessions
- Bundled extensions/agents sync to `~/.sf/agent/` on every launch
- LLM providers are lazy-loaded on first use to reduce cold-start time
- Native Rust engine handles grep, glob, ps, highlight, ast, diff
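The lazy-loading note can be illustrated with a small memoising wrapper. This is a sketch of the general technique, not sf's loader: `lazy`, `Provider`, and `getProvider` are hypothetical names, and the factory is synchronous to keep the example short. A real loader would more likely cache the promise from a dynamic `import()`.

```typescript
// Wraps a factory so it runs at most once, on first access.
function lazy<T>(factory: () => T): () => T {
  let cached: { value: T } | undefined;
  return () => {
    if (cached === undefined) cached = { value: factory() };
    return cached.value;
  };
}

// Hypothetical provider shape, for illustration only.
interface Provider {
  id: string;
}

let loads = 0;

// Nothing is constructed here; the factory runs on the first getProvider() call.
const getProvider = lazy<Provider>(() => {
  loads += 1; // stands in for an expensive import or client setup
  return { id: "example-provider" };
});
```

Startup pays nothing for providers that are never used, and repeated calls to `getProvider()` return the same instance without rerunning the factory.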
## Eval Dump Inbox
SF/Pi automatically loads `AGENTS.md` and `CLAUDE.md` from the repo tree at
startup. It does not automatically load `TODO.md`, but this repo uses root
`TODO.md` as a temporary human dump inbox for eval and self-evolution ideas.
When a repo contains a root `TODO.md`, treat it as a temporary dump inbox and
read it before planning substantive work in that repo. This applies even when
the user does not explicitly mention evals. Treat the `Raw Dump Inbox` section
as untriaged source material, not as durable instructions. Triage it into
reviewable artifacts: concrete eval cases, harness gaps, memory extraction
requirements, docs, tests, or follow-up implementation tasks. After triage,
remove the processed dump notes from `TODO.md` so the file returns to an empty
inbox/template state. Do not treat dumped notes as runtime memory or approved
behavior until they are converted into tested, versioned project artifacts.
## CI/CD
- `ci.yml` — builds, tests, gates merges to main
- `pipeline.yml` — three-stage release (dev → test → prod)
- `pr-risk.yml` — PR risk classification
- `ai-triage.yml` — AI-based issue/PR triage
## Code Quality Tooling
The repository uses the following quality tools:
- **Biome** — root source linting via `npm run lint` and autofix via `npm run lint:fix`
  - Scope: `src/` plus versioned JSON checks
  - Config: `biome.json`
  - Format touched files with `npx biome check --write <paths>`; full-repo formatting is not the current CI gate.
- **ESLint** — web app linting via `npm --prefix web run lint`
  - Scope: `web/`
  - Config: `web/eslint.config.mjs`
- **TypeScript** — Strict mode enabled; run `npm run typecheck:extensions`
- **Knip** — Detect unused code and dependencies: `npx knip` (config at `knip.json`)
- **jscpd** — Detect duplicate code: `npx jscpd` (config at `.jscpd.json`)
- **Tech Debt Scanner** — `node scripts/tech-debt-scan.mjs`
  - Tracks TODO/FIXME/HACK/XXX counts against thresholds
- **Secret Scan** — `npm run secret-scan` (pre-commit hook available via `npm run secret-scan:install-hook`)
- **Coverage** — `npm run test:coverage` (Vitest V8 coverage with 40/40/20/20 thresholds)
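The idea behind the tech-debt scanner, counting TODO/FIXME/HACK/XXX markers and comparing them with thresholds, can be sketched as follows. This is not the code in `scripts/tech-debt-scan.mjs`: `countMarkers`, `overThreshold`, and the threshold values used with them are hypothetical.

```typescript
const MARKERS = ["TODO", "FIXME", "HACK", "XXX"] as const;
type Marker = (typeof MARKERS)[number];

// Counts whole-word occurrences of each marker in a block of source text.
function countMarkers(source: string): Record<Marker, number> {
  const counts: Record<Marker, number> = { TODO: 0, FIXME: 0, HACK: 0, XXX: 0 };
  for (const marker of MARKERS) {
    const matches = source.match(new RegExp(`\\b${marker}\\b`, "g"));
    counts[marker] = matches === null ? 0 : matches.length;
  }
  return counts;
}

// Returns the markers whose count exceeds the allowed threshold.
function overThreshold(
  counts: Record<Marker, number>,
  thresholds: Record<Marker, number>,
): Marker[] {
  return MARKERS.filter((marker) => counts[marker] > thresholds[marker]);
}
```

A scan would typically sum these counts across files and exit non-zero when `overThreshold` returns anything, so the debt budget can only shrink or be raised deliberately.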
## Dev Container
A Dev Container configuration is available at `.devcontainer/devcontainer.json`.
Open the repository in VS Code with the Dev Containers extension, or run:
```bash
devcontainer up --workspace-folder .
```
The container includes Node 24, Rust, GitHub CLI, Docker-in-Docker, and recommended VS Code extensions.
## Dependency Updates
Dependabot is configured at `.github/dependabot.yml` for:
- Root npm dependencies (weekly, grouped by ecosystem)
- Web app dependencies (weekly)
- GitHub Actions (weekly)
## Issue Labels
Label definitions are at `.github/labels.yml`. Apply labels using:
```bash
# Create a single label
gh label create priority/P0 --color B60205 --description "Critical — blocks release"
# Or use a label management action in CI
```