Gitignore (core change):
- Remove stale blanket .sf/ entries from .gitignore (migrated to .git/info/exclude on 2026-04-29, never cleaned up)
- gitignore.ts: split SF_RUNTIME_EXCLUSION_PATTERNS into two modes: SF_SYMLINK_EXCLUSION_PATTERNS (blanket .sf for symlink repos where git cannot traverse the symlink) and SF_RUNTIME_EXCLUSION_PATTERNS (granular runtime-only patterns for directory repos, enabling .sf/milestones/ and other durable planning artifacts to be tracked)
- ensureGitInfoExclude() now detects symlink vs directory and writes the correct patterns, handling transitions between modes cleanly
- ADR-001 status: Proposed → Accepted

Docs:
- Fill 11 placeholder scaffold docs with real SF-specific content: PLANS, DESIGN, PRODUCT_SENSE, QUALITY_SCORE, RELIABILITY, SECURITY, design-docs/index.md, exec-plans/active, exec-plans/completed, exec-plans/tech-debt-tracker, records/index
- Add records note: docs/records/2026-05-01-repo-vcs-and-notifications.md
- ADR-008 status: Accepted → Proposed (deferred; not applicable to current usage model where Claude Code assists externally, not as a Pi provider inside SF's dispatch loop)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

59 lines
2.5 KiB
Markdown
# Quality Score

## Principles

- Make code legible to agents with semantic names and explicit boundaries.
- Prefer small, testable modules over files that require broad context to edit.
- Enforce style, architecture, and reliability rules mechanically where possible.
- Keep a cleanup loop for stale docs, generated artifacts, and accumulated implementation debt.

## Fast Checks (run on every change)

```bash
just typecheck   # tsc --project tsconfig.resources.json, no emit
just lint        # eslint across src/
```

Both must pass before any commit. Typecheck catches type drift early. Lint enforces the import rules that protect the Pi clean seam (ADR-010).

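
ADR-010 is not reproduced in this document, so the following is only a sketch of how a lint-level import rule could guard a seam like this one. It uses ESLint's built-in `no-restricted-imports` rule in a flat config. The globs `**/pi-internal/**` and `src/pi-adapter/**` are hypothetical placeholders, not the project's actual layout, and the TypeScript parser setup is omitted.

```typescript
// Flat ESLint config sketch. All paths below are hypothetical placeholders;
// they are NOT taken from ADR-010 or from the real repository layout.
export default [
  {
    // Apply the restriction to application source files.
    files: ["src/**/*.ts"],
    // Hypothetical adapter directory: the one place allowed to cross the seam.
    ignores: ["src/pi-adapter/**"],
    rules: {
      "no-restricted-imports": [
        "error",
        {
          patterns: [
            {
              // Hypothetical glob for modules that sit behind the seam.
              group: ["**/pi-internal/**"],
              message: "Cross the seam only through the adapter module.",
            },
          ],
        },
      ],
    },
  },
];
```

With a rule of this shape, a direct import from behind the seam fails `just lint` instead of relying on review to catch it.
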
## Slow Checks (run before shipping)

```bash
just test        # full unit suite: node --test runner, no coverage overhead
just test-smoke  # sf --version, sf --help, sf --print; all three must pass
```

Coverage thresholds (enforced by `npm run test:coverage`):
- Statements: **40%** minimum
- Lines: **40%** minimum
- Branches: **20%** minimum
- Functions: **20%** minimum

These are floors, not targets. The real quality bar is purposeful tests that assert behavior contracts (see `docs/SPEC_FIRST_TDD.md`).

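
The floors above can be written down as data and checked mechanically. The sketch below is illustrative only: `CoverageSummary`, `FLOORS`, and `belowFloor` are hypothetical names, the summary shape is an assumption, and the real enforcement is whatever `npm run test:coverage` is configured to do.

```typescript
// Hypothetical summary shape: one percentage per metric.
type Metric = "statements" | "lines" | "branches" | "functions";
type CoverageSummary = Record<Metric, number>;

// Floors copied from the threshold list in this section.
const FLOORS: CoverageSummary = {
  statements: 40,
  lines: 40,
  branches: 20,
  functions: 20,
};

// Returns one message per metric that is below its floor.
// An empty array means every floor is met.
function belowFloor(
  summary: CoverageSummary,
  floors: CoverageSummary = FLOORS,
): string[] {
  const metrics = Object.keys(floors) as Metric[];
  return metrics
    .filter((m) => summary[m] < floors[m])
    .map((m) => `${m}: ${summary[m]}% is below the ${floors[m]}% floor`);
}

// Made-up numbers for illustration; only branches misses its floor here.
const failures = belowFloor({
  statements: 52.1,
  lines: 51.8,
  branches: 18.4,
  functions: 33,
});
for (const message of failures) console.log(message);
```

A value exactly at the floor passes, which matches reading the percentages as minimums.
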
## Evals (ad-hoc, not yet automated)

No automated eval suite exists yet. ADR-018 Phase 3 defines the eval runner contract. Until then, quality for autonomous behavior is measured by:

- Smoke test pass rate across providers
- Manual milestone runs with trace inspection (`.sf/traces/`)
- Decision register review at milestone close

## Known Blind Spots

| Area | Gap | Risk |
|------|-----|------|
| `headless.ts` | RPC lifecycle (spawn → event stream → restart) is not covered by unit tests; only integration-tested manually | High: crash recovery correctness |
| Parallel milestone orchestration | No tests for concurrent STATE.md mutations | Medium: data loss under parallelism |
| Notification routing | Text-matching classification has no per-pattern unit tests | Low: wrong exit code on wording change |
| Stuck detection | Sliding-window logic tested, but real-loop replay is not | Medium: false positives under unusual patterns |
| Provider fallback | Model routing under simulated provider failure not covered | Medium: silent routing to wrong tier |

## Doc Quality Signal

```bash
grep -r "TODO\|placeholder\|Describe the\|Document.*here\|Record.*here\|Use this as\|Capture.*here\|Track cleanup" \
  docs/ --include="*.md"
```

This should return nothing. Any match marks a placeholder doc that still needs real content. The one expected exception is this file: it contains the search terms itself, so if it lives under `docs/` it will match its own pattern.

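
The same scan can be sketched as a pure function, which makes it easy to unit test or wire into a check that fails on any hit. This is a minimal sketch, not the project's implementation: `findPlaceholders` and `PlaceholderHit` are hypothetical names, and the file names and contents in the example are made up.

```typescript
// Same alternatives as the grep pattern in this section, as a JavaScript regex.
const PLACEHOLDER_PATTERN =
  /TODO|placeholder|Describe the|Document.*here|Record.*here|Use this as|Capture.*here|Track cleanup/;

interface PlaceholderHit {
  file: string;
  line: number; // 1-based line number
  text: string;
}

// Scans in-memory docs (path -> contents) and reports every matching line.
function findPlaceholders(docs: Record<string, string>): PlaceholderHit[] {
  const hits: PlaceholderHit[] = [];
  for (const [file, contents] of Object.entries(docs)) {
    contents.split("\n").forEach((text, index) => {
      if (PLACEHOLDER_PATTERN.test(text)) {
        hits.push({ file, line: index + 1, text });
      }
    });
  }
  return hits;
}

// Illustrative input only: these files do not exist in the repository.
const exampleHits = findPlaceholders({
  "docs/EXAMPLE_A.md": "# Title\nDescribe the system here.\n",
  "docs/EXAMPLE_B.md": "# Title\nReal content.\n",
});
for (const hit of exampleHits) {
  console.log(`${hit.file}:${hit.line}: ${hit.text}`);
}
```

In a real check the map would be filled by reading the Markdown files under `docs/` from disk. Keeping the scan separate from file I/O means the pattern list can be tested line by line.
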