Grafana OnCall engine fork — self-hosted on-call scheduler and alert router
Joey Orlando aaae31a232
PagerDuty Migrator: Add filtering capabilities and fix user notification rule preservation (#5454)
This PR adds filtering capabilities to the PagerDuty migrator tool and
fixes user notification rule preservation behavior.

Closes https://github.com/grafana/irm/issues/612

## Changes

### 1. Added Resource Filtering
Added the ability to filter PagerDuty resources during migration based
on:
- Team membership
- User association
- Name patterns (using regex)

New environment variables for filtering:
```
PAGERDUTY_FILTER_TEAM
PAGERDUTY_FILTER_USERS
PAGERDUTY_FILTER_SCHEDULE_REGEX
PAGERDUTY_FILTER_ESCALATION_POLICY_REGEX
PAGERDUTY_FILTER_INTEGRATION_REGEX
```
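
As a rough illustration of how these variables might be consumed, the sketch below reads them into a small settings object. The `MigrationFilters` class and `load_filters` function are hypothetical names made up for this example, not the migrator's real code, and it assumes `PAGERDUTY_FILTER_USERS` is a comma-separated list of PagerDuty user IDs, as the `P123ABC,P456DEF` example suggests.

```python
import os
import re
from dataclasses import dataclass
from typing import Mapping, Optional, Tuple


@dataclass(frozen=True)
class MigrationFilters:
    """Hypothetical container for filter settings (not the migrator's real type)."""

    team: Optional[str]
    users: Tuple[str, ...]
    schedule_regex: Optional[re.Pattern]
    escalation_policy_regex: Optional[re.Pattern]
    integration_regex: Optional[re.Pattern]


def _compile(value: Optional[str]) -> Optional[re.Pattern]:
    # An unset or empty variable means "no filter of this kind".
    # An invalid pattern raises re.error here, before any migration work starts.
    return re.compile(value) if value else None


def load_filters(env: Mapping[str, str] = os.environ) -> MigrationFilters:
    # Assumption: user IDs are comma-separated; blanks and stray commas are ignored.
    raw_users = env.get("PAGERDUTY_FILTER_USERS", "")
    users = tuple(u.strip() for u in raw_users.split(",") if u.strip())
    return MigrationFilters(
        team=env.get("PAGERDUTY_FILTER_TEAM") or None,
        users=users,
        schedule_regex=_compile(env.get("PAGERDUTY_FILTER_SCHEDULE_REGEX")),
        escalation_policy_regex=_compile(
            env.get("PAGERDUTY_FILTER_ESCALATION_POLICY_REGEX")
        ),
        integration_regex=_compile(env.get("PAGERDUTY_FILTER_INTEGRATION_REGEX")),
    )
```

Compiling the patterns up front is a design choice in this sketch: a typo in a regex fails fast instead of surfacing halfway through a run.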

#### Example Usage

Filter by team:
```bash
docker run --rm \
-e MIGRATING_FROM="pagerduty" \
-e MODE="plan" \
-e ONCALL_API_URL="<your-oncall-api-url>" \
-e ONCALL_API_TOKEN="<your-oncall-api-token>" \
-e PAGERDUTY_API_TOKEN="<your-pd-api-token>" \
-e PAGERDUTY_FILTER_TEAM="SRE Team" \
oncall-migrator
```

Filter by specific users:
```bash
docker run --rm \
-e MIGRATING_FROM="pagerduty" \
-e MODE="plan" \
-e ONCALL_API_URL="<your-oncall-api-url>" \
-e ONCALL_API_TOKEN="<your-oncall-api-token>" \
-e PAGERDUTY_API_TOKEN="<your-pd-api-token>" \
-e PAGERDUTY_FILTER_USERS="P123ABC,P456DEF" \
oncall-migrator
```

Filter schedules by name pattern:
```bash
docker run --rm \
-e MIGRATING_FROM="pagerduty" \
-e MODE="plan" \
-e ONCALL_API_URL="<your-oncall-api-url>" \
-e ONCALL_API_TOKEN="<your-oncall-api-token>" \
-e PAGERDUTY_API_TOKEN="<your-pd-api-token>" \
-e PAGERDUTY_FILTER_SCHEDULE_REGEX="^(Primary|Secondary)" \
oncall-migrator
```

Filter escalation policies by name pattern:
```bash
docker run --rm \
-e MIGRATING_FROM="pagerduty" \
-e MODE="plan" \
-e ONCALL_API_URL="<your-oncall-api-url>" \
-e ONCALL_API_TOKEN="<your-oncall-api-token>" \
-e PAGERDUTY_API_TOKEN="<your-pd-api-token>" \
-e PAGERDUTY_FILTER_ESCALATION_POLICY_REGEX="^Prod" \
oncall-migrator
```

Filter integrations by name pattern:
```bash
docker run --rm \
-e MIGRATING_FROM="pagerduty" \
-e MODE="plan" \
-e ONCALL_API_URL="<your-oncall-api-url>" \
-e ONCALL_API_TOKEN="<your-oncall-api-token>" \
-e PAGERDUTY_API_TOKEN="<your-pd-api-token>" \
-e PAGERDUTY_FILTER_INTEGRATION_REGEX="Prometheus$" \
oncall-migrator
```
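
To make the name-pattern examples concrete, here is a minimal sketch of regex filtering over resource names. The `filter_by_name` helper and the sample data are hypothetical, and the sketch assumes `re.search` semantics (the pattern may match anywhere unless anchored with `^` or `$`); the migrator itself may anchor patterns differently, so it is worth checking the output of `MODE="plan"` before migrating.

```python
import re


def filter_by_name(resources, pattern):
    """Keep resources whose "name" matches pattern; keep all if no pattern is set."""
    if not pattern:
        return list(resources)
    compiled = re.compile(pattern)
    # Assumption: search semantics, so "^" and "$" decide where the match is anchored.
    return [r for r in resources if compiled.search(r["name"])]


# Made-up sample data for illustration only.
schedules = [
    {"name": "Primary SRE"},
    {"name": "Secondary SRE"},
    {"name": "Backup Primary"},
]
integrations = [
    {"name": "Prod Prometheus"},
    {"name": "Prometheus staging"},
]

print([s["name"] for s in filter_by_name(schedules, "^(Primary|Secondary)")])
# prints ['Primary SRE', 'Secondary SRE']
print([i["name"] for i in filter_by_name(integrations, "Prometheus$")])
# prints ['Prod Prometheus']
```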

### 2. Fixed User Notification Rule Preservation

Introduces a `PRESERVE_EXISTING_USER_NOTIFICATION_RULES` config option
(default: `true`). The migrator now:
- does not delete user notification rules in Grafana OnCall if the
  Grafana user already has some defined AND
  `PRESERVE_EXISTING_USER_NOTIFICATION_RULES` is `true`
  - if the Grafana user has no personal notification rules defined in
    OnCall, the migrator creates them
- deletes existing user notification rules and creates new ones in
  Grafana OnCall if `PRESERVE_EXISTING_USER_NOTIFICATION_RULES` is
  `false`
  - in other words, the state in Grafana OnCall is made to match the
    _latest_ state in PagerDuty
- improves logging to clearly indicate when rules are being preserved
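
The behavior in the bullets above can be summarized as a small decision function. This is only a sketch of the described rules; `RuleAction` and `decide_rule_action` are hypothetical names and do not come from the migrator's code.

```python
from enum import Enum


class RuleAction(Enum):
    PRESERVE = "preserve"  # leave the user's existing OnCall rules untouched
    CREATE = "create"      # user has no rules yet, so create them from PagerDuty
    REPLACE = "replace"    # delete existing rules, then create them from PagerDuty


def decide_rule_action(existing_rule_count: int, preserve_existing: bool = True) -> RuleAction:
    """Map the PRESERVE_EXISTING_USER_NOTIFICATION_RULES setting to an action."""
    if preserve_existing:
        if existing_rule_count > 0:
            return RuleAction.PRESERVE
        return RuleAction.CREATE
    # With preservation disabled, OnCall is made to match the latest PagerDuty state.
    return RuleAction.REPLACE
```

With the default setting, a user who already has two rules in OnCall yields `RuleAction.PRESERVE`, while a user with none yields `RuleAction.CREATE`.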

#### Example Usage

Preserve existing notification policies (default):
```bash
docker run --rm \
-e MIGRATING_FROM="pagerduty" \
-e MODE="migrate" \
-e ONCALL_API_URL="<your-oncall-api-url>" \
-e ONCALL_API_TOKEN="<your-oncall-api-token>" \
-e PAGERDUTY_API_TOKEN="<your-pd-api-token>" \
oncall-migrator
```

Replace existing notification policies:
```bash
docker run --rm \
-e MIGRATING_FROM="pagerduty" \
-e MODE="migrate" \
-e ONCALL_API_URL="<your-oncall-api-url>" \
-e ONCALL_API_TOKEN="<your-oncall-api-token>" \
-e PAGERDUTY_API_TOKEN="<your-pd-api-token>" \
-e PRESERVE_EXISTING_USER_NOTIFICATION_RULES="false" \
oncall-migrator
```
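
Because environment variables are always strings, a value such as `"false"` has to be parsed explicitly; in Python, `bool("false")` is `True`. The sketch below shows one way to do that. The `env_flag` helper and the set of accepted spellings are assumptions for illustration, and the migrator may accept a different set of values.

```python
import os

# Assumed spellings; the real tool may be stricter or more lenient.
_TRUE_VALUES = {"1", "true", "yes", "on"}
_FALSE_VALUES = {"0", "false", "no", "off"}


def env_flag(name, default, env=os.environ):
    """Read a boolean flag from the environment, falling back to default if unset."""
    raw = env.get(name)
    if raw is None or raw.strip() == "":
        return default
    value = raw.strip().lower()
    if value in _TRUE_VALUES:
        return True
    if value in _FALSE_VALUES:
        return False
    raise ValueError(f"{name} must be a boolean-like value, got {raw!r}")


# Defaults to True, matching the documented default of the option.
preserve = env_flag("PRESERVE_EXISTING_USER_NOTIFICATION_RULES", default=True)
```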

### 3. Improved Testing
Added test coverage for the filtering functionality and updated the
user notification rule preservation tests.

## Testing Done
- Manual testing of filtering capabilities in both plan and migrate
modes
- Verified notification policy preservation behavior
2025-02-18 08:12:05 -05:00
.github Use a different GH secret to sign plugin (#5447) 2025-02-12 17:48:37 +00:00
.tilt chore: serve local dist (#5021) 2024-09-13 12:44:20 +00:00
dev chore(deps): bump aiohttp from 3.10.2 to 3.10.11 in /dev/scripts/generate-fake-data (#5275) 2024-11-20 19:47:55 +00:00
docs Add important version of round-robin escalation step (#5418) 2025-01-21 16:29:36 +00:00
engine chore: remove reference to recaptcha site (#5443) 2025-02-07 19:34:10 +00:00
grafana-plugin fix: disable recaptcha when site key is not set (#5451) 2025-02-14 14:30:08 +00:00
helm Release oncall Helm chart 1.14.4 2025-02-14 16:13:13 +00:00
terraform Remove unnecessary team checks (#2606) 2023-07-21 15:55:57 +01:00
tools PagerDuty Migrator: Add filtering capabilities and fix user notification rule preservation (#5454) 2025-02-18 08:12:05 -05:00
.dockerignore WIP: Direct paging improvements (#3064) 2023-09-28 03:57:49 +00:00
.gitattributes bump uwsgi to 2.0.26 + Python to 3.12.3 (#4495) 2024-06-10 15:33:37 -04:00
.gitignore chore: Switch to pnpm + adjust to IRM (#4969) 2024-09-02 12:48:23 +00:00
.markdownlint.json don't enforce line-length rule for markdownlint for code-blocks or tables (#2145) 2023-06-09 06:57:19 +00:00
.markdownlintignore Back merge irm (#5098) 2024-10-01 12:59:24 +00:00
.nvmrc fix apps.telegram.tasks.send_log_and_actions_message retrying tasks (#4851) 2024-08-19 14:05:40 -04:00
.pre-commit-config.yaml ci: update oss plugin release process (#5051) 2024-09-23 11:56:16 -04:00
.prettierignore Brojd/update insights docs (#3692) 2024-01-22 11:26:07 +00:00
.prettierrc.js Brojd/improve e2e tests dx (#3516) 2023-12-15 08:58:25 +00:00
.yamllint.yml chore: Switch to pnpm + adjust to IRM (#4969) 2024-09-02 12:48:23 +00:00
CHANGELOG.md Update CHANGELOG.md (#4041) 2024-03-12 07:47:08 +00:00
docker-compose-developer.yml security: Update docker redis image to v7.0.15 (#5063) 2024-11-20 17:04:14 +00:00
docker-compose-mysql-rabbitmq.yml security: Update docker redis image to v7.0.15 (#5063) 2024-11-20 17:04:14 +00:00
docker-compose.yml security: Update docker redis image to v7.0.15 (#5063) 2024-11-20 17:04:14 +00:00
LICENSE scratch that.. add back LICENSE 2024-06-06 09:58:26 -04:00
Makefile chore: Switch to pnpm + adjust to IRM (#4969) 2024-09-02 12:48:23 +00:00
README.md Update helm chart for newer grafana + enable externalServiceAccounts (#4876) 2024-09-05 12:18:07 -06:00
Tiltfile fix: disable accessControlOnCall for Grafana 11.3 (#5245) 2024-11-12 15:48:47 +00:00

Grafana OnCall


Developer-friendly incident response with brilliant Slack integration.

  • Collect and analyze alerts from multiple monitoring systems
  • On-call rotations based on schedules
  • Automatic escalations
  • Phone calls, SMS, Slack, Telegram notifications

Getting Started

Important

These instructions are for Grafana 11 or newer. You must enable the externalServiceAccounts feature toggle. This is already done in the included docker files and helm charts. If you are running Grafana separately, see the Grafana documentation on how to enable it.

We prepared multiple environments. The steps below set up the hobby environment with Docker Compose:

  1. Download docker-compose.yml:

    curl -fsSL https://raw.githubusercontent.com/grafana/oncall/dev/docker-compose.yml -o docker-compose.yml
    
  2. Set variables:

    echo "DOMAIN=http://localhost:8080
    # Remove 'with_grafana' below if you want to use existing grafana
    # Add 'with_prometheus' below to optionally enable a local prometheus for oncall metrics
    # e.g. COMPOSE_PROFILES=with_grafana,with_prometheus
    COMPOSE_PROFILES=with_grafana
    # to setup an auth token for prometheus exporter metrics:
    # PROMETHEUS_EXPORTER_SECRET=my_random_prometheus_secret
    # also, make sure to enable the /metrics endpoint:
    # FEATURE_PROMETHEUS_EXPORTER_ENABLED=True
    SECRET_KEY=my_random_secret_must_be_more_than_32_characters_long" > .env
    
  3. (Optional) To enable and set up the Prometheus metrics exporter (in addition to the changes above), create a prometheus.yml file next to your docker-compose.yml, replacing my_random_prometheus_secret accordingly:

    echo "global:
      scrape_interval:     15s
      evaluation_interval: 15s
    
    scrape_configs:
      - job_name: prometheus
        metrics_path: /metrics/
        authorization:
          credentials: my_random_prometheus_secret
        static_configs:
          - targets: [\"host.docker.internal:8080\"]" > prometheus.yml
    

    NOTE: you will need to set up a Prometheus datasource in the Grafana UI, using http://prometheus:9090 as the URL.

  4. Launch services:

    docker-compose pull && docker-compose up -d
    
  5. Provision the plugin (if you run Grafana outside the included docker files, install the plugin before these steps):

    If you are using the included docker compose file, use admin/admin credentials and localhost:3000 to perform this task. If you have configured Grafana differently, adjust your credentials and hostnames accordingly.

    # Note: onCallApiUrl 'engine' and grafanaUrl 'grafana' use the service names from the docker compose file. If you are
    # running your Grafana or OnCall engine instance under another hostname, adjust accordingly.
    curl -X POST 'http://admin:admin@localhost:3000/api/plugins/grafana-oncall-app/settings' -H "Content-Type: application/json" -d '{"enabled":true, "jsonData":{"stackId":5, "orgId":100, "onCallApiUrl":"http://engine:8080", "grafanaUrl":"http://grafana:3000"}}'
    curl -X POST 'http://admin:admin@localhost:3000/api/plugins/grafana-oncall-app/resources/plugin/install'
    
  6. Start using OnCall: log in to Grafana with the credentials defined above (admin/admin).

  7. Enjoy! Check our OSS docs if you want to set up Slack, Telegram, Twilio or SMS/calls through Grafana Cloud.
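
The placeholder secrets in step 2 should be replaced with random values. As a minimal sketch, the snippet below generates a `SECRET_KEY` longer than 32 characters with Python's standard `secrets` module and renders the same `.env` content as step 2. The `generate_secret` and `render_env` helpers are hypothetical names for this example.

```python
import secrets


def generate_secret(nbytes: int = 48) -> str:
    """Return a URL-safe random string; 48 random bytes encode to 64 characters."""
    return secrets.token_urlsafe(nbytes)


def render_env(domain: str = "http://localhost:8080", profiles: str = "with_grafana") -> str:
    """Build .env content equivalent to step 2, with a freshly generated SECRET_KEY."""
    lines = [
        f"DOMAIN={domain}",
        f"COMPOSE_PROFILES={profiles}",
        f"SECRET_KEY={generate_secret()}",
    ]
    return "\n".join(lines) + "\n"


# Usage: write the result next to docker-compose.yml, for example
#   with open(".env", "w") as fh:
#       fh.write(render_env())
```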

Troubleshooting

If you are having difficulty connecting Grafana and OnCall, the following API calls can help diagnose the problem. Modify the parameters to match your credentials and environment.

    # Use this to get more information about the connection between Grafana and OnCall
    curl -X GET 'http://admin:admin@localhost:3000/api/plugins/grafana-oncall-app/resources/plugin/status'
    # If you added a user or changed permissions and don't see it show up in OnCall you can manually trigger sync.
    # Note: This is called automatically when the app is loaded (page load/refresh) but there is a 5 min timeout so
    # that it does not generate unnecessary activity.
    curl -X POST 'http://admin:admin@localhost:3000/api/plugins/grafana-oncall-app/resources/plugin/sync'
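
For scripted checks, the same two calls can be made from Python's standard library. The sketch below only builds the requests; `plugin_request` is a hypothetical helper, the default host and admin/admin credentials mirror the docker compose setup above, and nothing is assumed about the shape of the response body.

```python
import base64
import urllib.request

# HTTP method for each endpoint, matching the curl commands above.
_METHODS = {"status": "GET", "sync": "POST"}


def plugin_request(action, base_url="http://localhost:3000", user="admin", password="admin"):
    """Build a request for the plugin's status or sync endpoint using basic auth."""
    if action not in _METHODS:
        raise ValueError(f"unknown action: {action!r}")
    url = f"{base_url}/api/plugins/grafana-oncall-app/resources/plugin/{action}"
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        url,
        method=_METHODS[action],
        headers={"Authorization": f"Basic {token}"},
    )


def send(request, timeout=10):
    """Send the request and return (HTTP status, raw body text)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, response.read().decode()
```

Calling `send(plugin_request("status"))` against a running stack returns the HTTP status code and the raw response text, which can then be inspected or logged.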

Update version

To update your Grafana OnCall hobby environment:

    # Update Docker image
    docker-compose pull engine

    # Re-deploy
    docker-compose up -d

After updating the engine, you'll also need to click the "Update" button on the plugin version page. See Grafana docs for more info on updating Grafana plugins.

Join community

Have a question, comment or feedback? Don't be afraid to open an issue!

Stargazers over time


Further Reading