Grafana OnCall engine fork — self-hosted on-call scheduler and alert router
Find a file
Matias Bordese ab700fd4a1
Update OrderedModel serializer not to save previously updated order (#4706)
We are noticing some
[issues](https://ops.grafana-ops.net/goto/sQFLv4XSg?orgId=1) when
updating routes via Terraform because when receiving multiple concurrent
requests updating position, the order is updated in a transaction but it
is then tried to save again in the `.save` call, which could lead to
`IntegrityError`s, because the order may have been updated in a
concurrent request meanwhile.

Test run with the reproduced error (before the serializer update):

```
_____________________________ test_ordered_model_swap_all_to_zero_via_serializer _____________________________

    @pytest.mark.django_db(transaction=True)
    def test_ordered_model_swap_all_to_zero_via_serializer():
        THREADS = 300
        exceptions = []
    
        TestOrderedModel.objects.all().delete()  # clear table
        instances = [TestOrderedModel.objects.create(test_field="test") for _ in range(THREADS)]
    
        # generate random non-unique orders
        random.seed(42)
        positions = [random.randint(0, THREADS - 1) for _ in range(THREADS)]
    
        def update_order_to_zero(idx):
            try:
                instance = instances[idx]
                serializer = TestOrderedModelSerializer(
                    instance, data={"order": 0, "extra_field": idx}, partial=True
                )
                serializer.is_valid(raise_exception=True)
                serializer.save()
                instance.swap(positions[idx])
            except Exception as e:
                exceptions.append(e)
    
        threads = [threading.Thread(target=update_order_to_zero, args=(0,)) for _ in range(THREADS)]
        for thread in threads:
            thread.start()
        for thread in threads:
            thread.join()
    
        # can only check that orders are still sequential and that there are no exceptions
        # can't check the exact order because it changes depending on the order of execution
>       assert not exceptions
E       assert not [IntegrityError(1062, "Duplicate entry 'test-0' for key 'base_testorderedmodel.unique_test_field_order'"), IntegrityEr...rder'"), IntegrityError(1062, "Duplicate entry 'test-0' for key 'base_testorderedmodel.unique_test_field_order'"), ...]

THREADS    = 300
exceptions = [IntegrityError(1062, "Duplicate entry 'test-0' for key 'base_testorderedmodel.unique_test_field_order'"), IntegrityEr...rder'"), IntegrityError(1062, "Duplicate entry 'test-0' for key 'base_testorderedmodel.unique_test_field_order'"), ...]
instances  = [<TestOrderedModel: TestOrderedModel object (841)>, <TestOrderedModel: TestOrderedModel object (842)>, <TestOrderedMod...ject (844)>, <TestOrderedModel: TestOrderedModel object (845)>, <TestOrderedModel: TestOrderedModel object (846)>, ...]
positions  = [57, 12, 140, 125, 114, 71, ...]
thread     = <Thread(Thread-302 (update_order_to_zero), stopped 139636013323840)>
threads    = [<Thread(Thread-3 (update_order_to_zero), stopped 139646722418240)>, <Thread(Thread-4 (update_order_to_zero), stopped ...ate_order_to_zero), stopped 139646608590400)>, <Thread(Thread-8 (update_order_to_zero), stopped 139646600197696)>, ...]
update_order_to_zero = <function test_ordered_model_swap_all_to_zero_via_serializer.<locals>.update_order_to_zero at 0x7f02086274c0>

common/tests/test_ordered_model.py:481: AssertionError
```
2024-07-19 11:56:26 +00:00
.github Cleanup and split tiltfile by profiles (#4691) 2024-07-19 05:47:34 +00:00
.tilt Cleanup and split tiltfile by profiles (#4691) 2024-07-19 05:47:34 +00:00
dev Cleanup and split tiltfile by profiles (#4691) 2024-07-19 05:47:34 +00:00
docs Make docs clearer for "Notify users one by one" policy (#4662) 2024-07-11 16:47:52 +00:00
engine Update OrderedModel serializer not to save previously updated order (#4706) 2024-07-19 11:56:26 +00:00
grafana-plugin Unified Slack app reinstall (#4682) 2024-07-19 11:53:06 +00:00
helm Release oncall Helm chart 1.7.2 2024-06-20 17:54:24 +00:00
terraform Remove unnecessary team checks (#2606) 2023-07-21 15:55:57 +01:00
tools Add additional information to the reports script (#4665) 2024-07-15 19:36:41 +00:00
.dockerignore WIP: Direct paging improvements (#3064) 2023-09-28 03:57:49 +00:00
.gitattributes bump uwsgi to 2.0.26 + Python to 3.12.3 (#4495) 2024-06-10 15:33:37 -04:00
.gitignore Brojd/improve e2e tests dx (#3516) 2023-12-15 08:58:25 +00:00
.markdownlint.json don't enforce line-length rule for markdownlint for code-blocks or tables (#2145) 2023-06-09 06:57:19 +00:00
.markdownlintignore Add tracing support 2022-12-19 17:15:06 +08:00
.pre-commit-config.yaml Splunk OnCall migration tool (#4267) 2024-05-14 13:53:59 +00:00
.prettierignore Brojd/update insights docs (#3692) 2024-01-22 11:26:07 +00:00
.prettierrc.js Brojd/improve e2e tests dx (#3516) 2023-12-15 08:58:25 +00:00
.yamllint.yml configure yamllint pre-commit step (#2728) 2023-08-03 02:35:08 -04:00
CHANGELOG.md Update CHANGELOG.md (#4041) 2024-03-12 07:47:08 +00:00
docker-compose-developer.yml bump uwsgi to 2.0.26 + Python to 3.12.3 (#4495) 2024-06-10 15:33:37 -04:00
docker-compose-mysql-rabbitmq.yml bump uwsgi to 2.0.26 + Python to 3.12.3 (#4495) 2024-06-10 15:33:37 -04:00
docker-compose.yml bump uwsgi to 2.0.26 + Python to 3.12.3 (#4495) 2024-06-10 15:33:37 -04:00
LICENSE scratch that.. add back LICENSE 2024-06-06 09:58:26 -04:00
Makefile make make cleanup prune volumes (#4600) 2024-06-27 13:59:12 +00:00
README.md Update README.md 2024-06-20 11:12:26 -06:00
Tiltfile Cleanup and split tiltfile by profiles (#4691) 2024-07-19 05:47:34 +00:00

Grafana OnCall

Latest Release License Docker Pulls Slack Build Status

Developer-friendly incident response with brilliant Slack integration.

  • Collect and analyze alerts from multiple monitoring systems
  • On-call rotations based on schedules
  • Automatic escalations
  • Phone calls, SMS, Slack, Telegram notifications

Getting Started

We prepared multiple environments:

  1. Download docker-compose.yml:

    curl -fsSL https://raw.githubusercontent.com/grafana/oncall/dev/docker-compose.yml -o docker-compose.yml
    
  2. Set variables:

    echo "DOMAIN=http://localhost:8080
    # Remove 'with_grafana' below if you want to use existing grafana
    # Add 'with_prometheus' below to optionally enable a local prometheus for oncall metrics
    # e.g. COMPOSE_PROFILES=with_grafana,with_prometheus
    COMPOSE_PROFILES=with_grafana
    # to setup an auth token for prometheus exporter metrics:
    # PROMETHEUS_EXPORTER_SECRET=my_random_prometheus_secret
    # also, make sure to enable the /metrics endpoint:
    # FEATURE_PROMETHEUS_EXPORTER_ENABLED=True
    SECRET_KEY=my_random_secret_must_be_more_than_32_characters_long" > .env
    
  3. (Optional) If you want to enable/setup the prometheus metrics exporter (besides the changes above), create a prometheus.yml file (replacing my_random_prometheus_secret accordingly), next to your docker-compose.yml:

    echo "global:
      scrape_interval:     15s
      evaluation_interval: 15s
    
    scrape_configs:
      - job_name: prometheus
        metrics_path: /metrics/
        authorization:
          credentials: my_random_prometheus_secret
        static_configs:
          - targets: [\"host.docker.internal:8080\"]" > prometheus.yml
    

    NOTE: you will need to setup a Prometheus datasource using http://prometheus:9090 as the URL in the Grafana UI.

  4. Launch services:

    docker-compose pull && docker-compose up -d
    
  5. Go to OnCall Plugin Configuration, using log in credentials as defined above: admin/admin (or find OnCall plugin in configuration->plugins) and connect OnCall plugin with OnCall backend:

    OnCall backend URL: http://engine:8080
    
  6. Enjoy! Check our OSS docs if you want to set up Slack, Telegram, Twilio or SMS/calls through Grafana Cloud.

Update version

To update your Grafana OnCall hobby environment:

# Update Docker image
docker-compose pull engine

# Re-deploy
docker-compose up -d

After updating the engine, you'll also need to click the "Update" button on the plugin version page. See Grafana docs for more info on updating Grafana plugins.

Join community

Have a question, comment or feedback? Don't be afraid to open an issue!

Stargazers over time

Stargazers over time

Further Reading