Grafana OnCall engine fork — self-hosted on-call scheduler and alert router
Find a file
Vadim Stepanov eada4a4355
Fix duplicate orders for user notification policies (#2278)
# What this PR does

Fixes an issue when multiple user notification policies have duplicated
order values, leading to the following unexpected behaviours:
1. Not possible to rearrange notification policies that have duplicated
orders.
2. The notification system only executes the first policy from each
order group. For example, if there are policies with orders `[0, 0, 0,
0]`, only the first policy will be executed, and all others will be
skipped. So the user will see four policies in the UI, while only one of
them will be actually executed.

This PR fixes the issue by adding a unique index on `(user_id,
important, order)` for `UserNotificationPolicy` model. However, it's not
possible to add that unique index using the ordering library that we use
due to it's implementation details.
I added a new abstract Django model `OrderedModel` that's able to work
with such unique indices + under concurrent load.

Important info on this new `OrderedModel` abstract model:
- Orders are unique on the DB level
- Orders are allowed to be non-consecutive, for example order sequence
`[100, 150, 400]` is valid
- When deleting an instance, orders of other instances don't change.
This is a notable difference from the library we use. I think it's
better to only delete the instance without changing any other orders,
because it reduces the number of dependencies between instances (e.g.
Terraform drift will be much smaller this way if a policy is deleted via
the web UI).

## Which issue(s) this PR fixes

Related to https://github.com/grafana/oncall-private/issues/1680

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2023-06-21 11:13:56 +00:00
.github adjust stale PR bot to use pr:stale label instead (#2298) 2023-06-21 04:28:07 -04:00
dev add mypy static type checker to backend codebase (#2151) 2023-06-12 12:50:33 -04:00
docs Metrics doc (#2149) 2023-06-21 11:00:13 +00:00
engine Fix duplicate orders for user notification policies (#2278) 2023-06-21 11:13:56 +00:00
examples/terraform Terraform examples 2022-08-11 14:32:39 +05:00
grafana-plugin Fixes for templates&grouping (#2300) 2023-06-21 13:42:01 +03:00
helm fix existingSecret for RabbitMQ to default value (see #761) (#864) 2023-06-16 03:14:36 +00:00
provisioning/datasources Add initial setup for local dev prometheus exporter (#2039) 2023-06-01 12:31:33 +00:00
tools don't enforce line-length rule for markdownlint for code-blocks or tables (#2145) 2023-06-09 06:57:19 +00:00
.drone.yml Sign drone build 2023-06-12 15:32:12 -06:00
.gitignore ignore .http file extensions (#1762) 2023-04-17 10:52:03 +02:00
.markdownlint.json don't enforce line-length rule for markdownlint for code-blocks or tables (#2145) 2023-06-09 06:57:19 +00:00
.markdownlintignore Add tracing support 2022-12-19 17:15:06 +08:00
.pre-commit-config.yaml fix failing lint github actions job due to issue w/ isort version (#1249) 2023-01-30 11:43:15 +01:00
CHANGELOG.md Fix duplicate orders for user notification policies (#2278) 2023-06-21 11:13:56 +00:00
CODE_OF_CONDUCT.md add precommit rules for markdown/json files (#915) 2022-12-01 14:26:54 +01:00
docker-compose-developer.yml Fix duplicate orders for user notification policies (#2278) 2023-06-21 11:13:56 +00:00
docker-compose-mysql-rabbitmq.yml bump mysql from 5.7 to 8.0.32 (#1790) 2023-05-10 17:53:27 +00:00
docker-compose.yml Add instructions for prometheus exporter setup (#2103) 2023-06-12 13:04:07 +00:00
GOVERNANCE.md add precommit rules for markdown/json files (#915) 2022-12-01 14:26:54 +01:00
LICENSE World, meet OnCall! 2022-06-03 08:09:47 -06:00
LICENSING.md add precommit rules for markdown/json files (#915) 2022-12-01 14:26:54 +01:00
MAINTAINERS.md add precommit rules for markdown/json files (#915) 2022-12-01 14:26:54 +01:00
Makefile add mypy static type checker to backend codebase (#2151) 2023-06-12 12:50:33 -04:00
README.md Add instructions for prometheus exporter setup (#2103) 2023-06-12 13:04:07 +00:00
screenshot.png Merge dev to main (#54) 2022-06-13 16:39:58 -06:00
screenshot_mobile.png Readme updates 2023-04-11 15:43:52 +03:00

Grafana OnCall

Latest Release License Docker Pulls Slack Discussion Build Status

Developer-friendly incident response with brilliant Slack integration.

  • Collect and analyze alerts from multiple monitoring systems
  • On-call rotations based on schedules
  • Automatic escalations
  • Phone calls, SMS, Slack, Telegram notifications

Getting Started

We prepared multiple environments:

  1. Download docker-compose.yml:

    curl -fsSL https://raw.githubusercontent.com/grafana/oncall/dev/docker-compose.yml -o docker-compose.yml
    
  2. Set variables:

    echo "DOMAIN=http://localhost:8080
    # Remove 'with_grafana' below if you want to use existing grafana
    # Add 'with_prometheus' below to optionally enable a local prometheus for oncall metrics
    # e.g. COMPOSE_PROFILES=with_grafana,with_prometheus
    COMPOSE_PROFILES=with_grafana
    # to setup an auth token for prometheus exporter metrics:
    # PROMETHEUS_EXPORTER_SECRET=my_random_prometheus_secret
    # also, make sure to enable the /metrics endpoint:
    # FEATURE_PROMETHEUS_EXPORTER_ENABLED=True
    SECRET_KEY=my_random_secret_must_be_more_than_32_characters_long" > .env
    
  3. (Optional) If you want to enable/setup the prometheus metrics exporter (besides the changes above), create a prometheus.yml file (replacing my_random_prometheus_secret accordingly), next to your docker-compose.yml:

    echo "global:
      scrape_interval:     15s
      evaluation_interval: 15s
    
    scrape_configs:
      - job_name: prometheus
        metrics_path: /metrics/
        authorization:
          - credentials: my_random_prometheus_secret
        static_configs:
          - targets: [\"host.docker.internal:8080\"]" > prometheus.yml
    

    NOTE: you will need to setup a Prometheus datasource using http://prometheus:9090 as the URL in the Grafana UI.

  4. Launch services:

    docker-compose pull && docker-compose up -d
    
  5. Go to OnCall Plugin Configuration, using log in credentials as defined above: admin/admin (or find OnCall plugin in configuration->plugins) and connect OnCall plugin with OnCall backend:

    OnCall backend URL: http://engine:8080
    
  6. Enjoy! Check our OSS docs if you want to set up Slack, Telegram, Twilio or SMS/calls through Grafana Cloud.

Update version

To update your Grafana OnCall hobby environment:

# Update Docker image
docker-compose pull engine

# Re-deploy
docker-compose up -d

After updating the engine, you'll also need to click the "Update" button on the plugin version page. See Grafana docs for more info on updating Grafana plugins.

Join community

Stargazers over time

Stargazers over time

Further Reading