portal/.sf/REQUIREMENTS.md

2026-05-13 01:58:23 +02:00
# Requirements
## Active
### R001 — Ops-engine incident adapter
- Class: core-capability
- Status: active
- Description: A `CentralcloudCore.Incidents` module calls the ops-engine `/api/incidents` endpoints and returns data in the OnCall `alert_group` shape, so existing templates require minimal changes.
- Why it matters: Decouples the LiveViews from Grafana OnCall and provides a clean migration boundary.
- Source: spec
- Primary owning slice: M001/S01
- Supporting slices: none
- Validation: unmapped
- Notes: Must send `Host: ops.centralcloud.com` header on every request to avoid 301 redirects.
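
The header requirement in the note above can be illustrated with a small, language-neutral sketch. It is written in Python purely for illustration and is not the adapter itself. The base URL `http://ops-engine.internal:8080` and the function name `build_incidents_request` are hypothetical; only the `/api/incidents` path and the `Host` value come from this requirement. The request is built but never sent.

```python
from urllib.request import Request

# Host value taken from the R001 note; everything else here is illustrative.
OPS_HOST = "ops.centralcloud.com"


def build_incidents_request(base_url: str, path: str = "/api/incidents") -> Request:
    """Build (but do not send) a GET request that carries the explicit Host header."""
    url = base_url.rstrip("/") + path
    return Request(url, headers={"Host": OPS_HOST}, method="GET")


# Hypothetical internal address, used only to show the resulting request.
req = build_incidents_request("http://ops-engine.internal:8080/")
print(req.get_method(), req.full_url, req.get_header("Host"))
```

Setting the header in one shared builder, rather than at each call site, is one way to make "on every request" hard to forget.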
### R002 — DashboardLive uses ops-engine incidents
- Class: primary-user-loop
- Status: active
- Description: The dashboard shows firing, acknowledged, and resolved incident counts and tables sourced from the ops engine instead of Grafana OnCall.
- Why it matters: This is the primary entry point for staff; it must reflect the authoritative incident source.
- Source: spec
- Primary owning slice: M001/S01
- Supporting slices: none
- Validation: unmapped
- Notes: Status mapping: ops `open`/`firing` → UI `firing`; ops `acknowledged` → UI `acked`; ops `resolved` → UI `resolved`.
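
The status mapping in the note above, together with the dashboard counts, could look roughly like the following sketch (Python, for illustration only). It assumes each incident is a dict with a `status` key, which this document does not confirm, and the `"unknown"` fallback for unlisted statuses is likewise an assumption.

```python
from collections import Counter

# Mapping copied from the R002 note: ops-engine status -> UI status.
STATUS_MAP = {
    "open": "firing",
    "firing": "firing",
    "acknowledged": "acked",
    "resolved": "resolved",
}


def ui_status(ops_status):
    """Translate an ops-engine status; unlisted values become 'unknown' (assumption)."""
    return STATUS_MAP.get(ops_status, "unknown")


def status_counts(incidents):
    """Count incidents per UI status, always reporting all three dashboard buckets."""
    counts = Counter(ui_status(i.get("status")) for i in incidents)
    return {s: counts.get(s, 0) for s in ("firing", "acked", "resolved")}


sample = [{"status": "open"}, {"status": "firing"}, {"status": "acknowledged"}]
print(status_counts(sample))
```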
### R003 — IncidentsLive uses ops-engine incidents
- Class: primary-user-loop
- Status: active
- Description: The `/incidents` list view displays incidents from the ops engine with status filtering.
- Why it matters: Staff browse and triage incidents through this view.
- Source: spec
- Primary owning slice: M001/S01
- Supporting slices: none
- Validation: unmapped
- Notes: Remove `silenced` filter since ops engine does not support silence.
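
A minimal sketch of the status filter once `silenced` is gone, in Python for illustration only. It assumes incidents already carry a UI-level status under a hypothetical `ui_status` key. Treating an unsupported filter value (for example a stale `silenced` bookmark) as "show everything" is a design assumption, not something the spec states.

```python
# Filters that remain after the migration; `silenced` is intentionally absent.
ALLOWED_FILTERS = ("firing", "acked", "resolved")


def normalize_filter(param):
    """Return a supported filter value, or None meaning 'no filter'."""
    return param if param in ALLOWED_FILTERS else None


def filter_incidents(incidents, param):
    """Keep incidents matching the filter; unsupported filters return the full list."""
    wanted = normalize_filter(param)
    if wanted is None:
        return list(incidents)
    return [i for i in incidents if i.get("ui_status") == wanted]


rows = [{"id": 1, "ui_status": "firing"}, {"id": 2, "ui_status": "resolved"}]
print(filter_incidents(rows, "firing"))
```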
### R004 — IncidentLive detail uses ops-engine incidents
- Class: primary-user-loop
- Status: active
- Description: The `/incidents/:id` detail view displays a single incident from the ops engine.
- Why it matters: Staff investigate individual incidents here.
- Source: spec
- Primary owning slice: M001/S02
- Supporting slices: none
- Validation: unmapped
- Notes: Ops engine returns `services`, `timeline`, and `notifications` instead of `alerts`. The detail section of the template may need adjustment.
### R005 — Incident ack/resolve actions hit ops engine
- Class: primary-user-loop
- Status: active
- Description: Acknowledge and resolve buttons in the detail view call the ops engine endpoints (`POST /api/incidents/:id/acknowledge` and `POST /api/incidents/:id/resolve`).
- Why it matters: Actions must mutate the authoritative incident source.
- Source: spec
- Primary owning slice: M001/S02
- Supporting slices: none
- Validation: unmapped
- Notes: Assign action is available in the API but not currently exposed in the UI.
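
The two action endpoints above can be sketched as a single request builder (Python, for illustration only; nothing is sent). The base URL, the function name `build_action_request`, and the empty request body are assumptions; the paths come from the description, and the `Host` header follows the R001 note. Assign is left out because the UI does not expose it.

```python
from urllib.parse import quote
from urllib.request import Request

# Only the actions the UI exposes per R005.
ACTIONS = ("acknowledge", "resolve")


def build_action_request(base_url, incident_id, action):
    """Build (but do not send) a POST to /api/incidents/:id/<action>."""
    if action not in ACTIONS:
        raise ValueError(f"unsupported incident action: {action!r}")
    # Percent-encode the id so it cannot alter the path.
    safe_id = quote(str(incident_id), safe="")
    url = f"{base_url.rstrip('/')}/api/incidents/{safe_id}/{action}"
    # Empty body is an assumption; the spec does not describe a payload.
    return Request(url, data=b"", headers={"Host": "ops.centralcloud.com"}, method="POST")


req = build_action_request("http://ops-engine.internal:8080", 42, "acknowledge")
print(req.get_method(), req.full_url)
```

Rejecting unknown actions up front means a leftover silence button would fail locally rather than call a nonexistent endpoint.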
### R006 — OnCall schedules remain untouched
- Class: constraint
- Status: active
- Description: `OnCallLive`, `CentralcloudCore.OnCall`, and all schedule/shift/user/escalation chain functionality continue to use Grafana OnCall.
- Why it matters: Scope boundary — replacing OnCall scheduling is a separate, larger project.
- Source: spec
- Primary owning slice: M001/S01
- Supporting slices: M001/S02, M001/S03
- Validation: unmapped
- Notes: Verify `/oncall` page still loads correctly after migration.
### R007 — Build and deploy staff app
- Class: operability
- Status: active
- Description: The staff app is built as a new Docker image, pushed to the registry, and the deployment manifest is updated so Flux reconciles the new version.
- Why it matters: Code changes must reach production to be valuable.
- Source: spec
- Primary owning slice: M001/S03
- Supporting slices: none
- Validation: unmapped
- Notes: Current deployment has `replicas: 0`. Scaling strategy must be determined at deploy time.
### R008 — Remove silence feature from UI
- Class: core-capability
- Status: active
- Description: Silence buttons and the `silenced` status filter are removed from the UI because the ops engine does not expose a silence endpoint.
- Why it matters: Prevents broken actions and confusing UI states.
- Source: inferred
- Primary owning slice: M001/S02
- Supporting slices: M001/S01
- Validation: unmapped
- Notes: Also remove `silenced` status badge and filter from IncidentsLive.
## Out of Scope
### R030 — Replace Grafana OnCall entirely
- Class: anti-feature
- Status: out-of-scope
- Description: Removing or replacing Grafana OnCall for schedules, users, and escalation chains.
- Why it matters: Prevents scope creep into a much larger project.
- Source: spec
- Primary owning slice: none
- Supporting slices: none
- Validation: n/a
- Notes: Explicitly excluded in spec.
### R031 — Modify centralcloud-ops engine
- Class: constraint
- Status: out-of-scope
- Description: Any changes to the ops engine codebase or API.
- Why it matters: Work is constrained to the portal repo and deployment manifests only.
- Source: spec
- Primary owning slice: none
- Supporting slices: none
- Validation: n/a
- Notes: Engine is treated as an external API contract.
### R032 — Add comprehensive test suite
- Class: quality-attribute
- Status: out-of-scope
- Description: Adding unit or integration tests for the migrated functionality.
- Why it matters: No tests exist currently; adding them would expand scope beyond the migration.
- Source: inferred
- Primary owning slice: none
- Supporting slices: none
- Validation: n/a
- Notes: Verification is manual/browser-based per spec.
## Traceability
| ID | Class | Status | Primary owner | Supporting | Proof |
|---|---|---|---|---|---|
| R001 | core-capability | active | M001/S01 | none | unmapped |
| R002 | primary-user-loop | active | M001/S01 | none | unmapped |
| R003 | primary-user-loop | active | M001/S01 | none | unmapped |
| R004 | primary-user-loop | active | M001/S02 | none | unmapped |
| R005 | primary-user-loop | active | M001/S02 | none | unmapped |
| R006 | constraint | active | M001/S01 | M001/S02, M001/S03 | unmapped |
| R007 | operability | active | M001/S03 | none | unmapped |
| R008 | core-capability | active | M001/S02 | M001/S01 | unmapped |
| R030 | anti-feature | out-of-scope | none | none | n/a |
| R031 | constraint | out-of-scope | none | none | n/a |
| R032 | quality-attribute | out-of-scope | none | none | n/a |
## Coverage Summary
- Active requirements: 8
- Mapped to slices: 8
- Validated: 0
- Unmapped active requirements: 0
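
The figures above can be recomputed from the Traceability table with a short script. This is a sketch in Python under stated assumptions: a requirement counts as "mapped" when its primary owner is not `none`, and as "validated" when its proof cell is neither `unmapped` nor `n/a`. These rules are inferred from the table's values, not defined anywhere in this document.

```python
def coverage_summary(table_rows):
    """Summarize active requirements from Markdown table body rows.

    Expected column order: ID, Class, Status, Primary owner, Supporting, Proof.
    """
    active = mapped = validated = 0
    for line in table_rows:
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        _id, _cls, status, owner, _supporting, proof = cells
        if status != "active":
            continue  # out-of-scope rows do not count toward coverage
        active += 1
        if owner != "none":
            mapped += 1
        if proof not in ("unmapped", "n/a"):
            validated += 1
    return {
        "active": active,
        "mapped": mapped,
        "validated": validated,
        "unmapped_active": active - mapped,
    }


rows = [
    "| R001 | core-capability | active | M001/S01 | none | unmapped |",
    "| R006 | constraint | active | M001/S01 | M001/S02, M001/S03 | unmapped |",
    "| R030 | anti-feature | out-of-scope | none | none | n/a |",
]
print(coverage_summary(rows))
```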