# Quality Score ## Principles - Make code legible to agents with semantic names and explicit boundaries. - Prefer small, testable modules over files that require broad context to edit. - Enforce style, architecture, and reliability rules mechanically where possible. - Keep a cleanup loop for stale docs, generated artifacts, and accumulated implementation debt. ## Fast Checks (run on every change) ```bash just typecheck # tsc --project tsconfig.resources.json, no emit just lint # eslint across src/ ``` Both must pass before any commit. Typecheck catches type drift early. Lint enforces import rules that enforce the Pi clean seam (ADR-010). ## Slow Checks (run before shipping) ```bash just test # full unit suite — node --test runner, no coverage overhead just test-smoke # sf --version, sf --help, sf --print — all three must pass ``` Coverage thresholds (enforced by `npm run test:coverage`): - Statements: **40%** minimum - Lines: **40%** minimum - Branches: **20%** minimum - Functions: **20%** minimum These are floors, not targets. The real quality bar is purposeful tests that assert behavior contracts (see `docs/SPEC_FIRST_TDD.md`). ## Evals (ad-hoc, not yet automated) No automated eval suite exists yet. ADR-018 Phase 3 defines the eval runner contract. Until then, quality for autonomous behavior is measured by: - Smoke test pass rate across providers - Manual milestone runs with trace inspection (`.sf/traces/`) - Decision register review at milestone close ## Known Blind Spots | Area | Gap | Risk | |------|-----|------| | `headless.ts` | RPC lifecycle (spawn → event stream → restart) is not covered by unit tests; only integration-tested manually | High: crash recovery correctness | | Parallel milestone orchestration | No tests for concurrent STATE.md mutations | Medium: data loss under parallelism | | Notification routing | Text-matching classification has no per-pattern unit tests | Low: wrong exit code on wording change | | Stuck detection | Sliding-window logic tested, but real-loop replay is not | Medium: false positives under unusual patterns | | Provider fallback | Model routing under simulated provider failure not covered | Medium: silent routing to wrong tier | ## Doc Quality Signal ```bash grep -r "TODO\|placeholder\|Describe the\|Document.*here\|Record.*here\|Use this as\|Capture.*here\|Track cleanup" \ docs/ --include="*.md" ``` This should return empty. Any match is a placeholder doc that needs real content.