# What this PR does
Add a system similar to how we select integrations when creating
webhooks so that the user has a description of what webhookds do and
does not have to write complex templates for common webhook use cases.
Presets allow us to create the contents of the webhooks in code and
define which fields are controlled by the preset. Some specifics:
- Newly created webhooks must choose between Simple, Advanced or another
predefined system
- Simple is always an escalation step and will post the entire payload
to the given URL
- Advanced is the same as no preset which is our current view where all
fields are available
- There are no changes for all existing webhooks with empty preset
fields
- Once a webhook is created with a preset the preset cannot be changed
- Fields in the webhook that are populated by code will give a
validation error if they are modified
- In the public API webhooks with presets are returned for viewing but
cannot be created or modified. This restriction is in place because the
Web UI provides the context for which fields to use with a preset. The
public API is for interacting with webhooks where all fields are
defined.
To define a preset create a file with metadata and an override function.
The metadata drives validation and what to display in the UI. There are
two functions one is connected to the pre_save hook of the Webhook model
for persistent changes, the other replaces parameters at execution time
for ephemeral changes. See the simple and advanced presets as an
example. The file must be listed in settings in
`INSTALLED_WEBHOOK_PRESETS` to be enabled at runtime..
## Which issue(s) this PR fixes
## Checklist
- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
---------
Co-authored-by: Joey Orlando <joey.orlando@grafana.com>
Add composite indexes based on existing queries/usage, ensuring partial
index prefixes are useful too.
- `is_active` filtering is set in the default `User` manager
- most of our user queries are per `organization`
- multiple cases filter by `username` or `email` (most notably schedule
related queries, given the low-level backend ical representation)
Also rework how users are fetched from DB when getting users from
schedules ical representation (which was particularly slow when regex
filtering by required permission).
Related to https://github.com/grafana/oncall-private/issues/2163
# What this PR does
Fixes a regression that was introduced when actions were remapped to new
webhooks. Validation in the serializer was making the user field no
longer accept blank and None values. This PR makes those values accepted
again.
## Which issue(s) this PR fixes
https://github.com/grafana/terraform-provider-grafana/issues/1025
## Checklist
- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
# What this PR does
- gracefully retry
`apps.alerts.tasks.delete_alert_group.delete_alert_group` when hitting
Slack ratelimits
- remove Slack messages from the DB as soon as they are deleted from
Slack, so the tasks are not retrying perpetually
## Checklist
- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
# What this PR does
Fixes https://github.com/grafana/oncall-private/issues/2154
## Checklist
- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
# What this PR does
- Rename `SlackClientWithErrorHandling` to just `SlackClient`
- Add more error classes + improve the way errors are raised based on
the Slack error code
- Add API call retries on Slack server errors (e.g. when Slack returns
`5xx` errors)
- Refactor some methods working with Slack API + add tests
## Which issue(s) this PR fixes
- https://github.com/grafana/oncall-private/issues/1837
- https://github.com/grafana/oncall-private/issues/1840
- https://github.com/grafana/oncall-private/issues/1842
## Checklist
- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
# What this PR does
Improve Slack disconnect cleanup:
- Reset Slack user group for schedules
- Reset Slack user identity for deleted users
## Which issue(s) this PR fixes
https://github.com/grafana/oncall-private/issues/1962
## Checklist
- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
# What this PR does
Make sure that Slack user groups are not getting updated for deleted
orgs.
Without this change, there could be issues with backends in multiple
clusters trying to update a single Slack user group after migrating an
org to another cluster (organizations get soft deleted from the original
cluster after migration).
## Which issue(s) this PR fixes
https://github.com/grafana/support-escalations/issues/6936
## Checklist
- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
# What this PR does
Fix escalation step "Notify if num alerts in time window" when
escalation policy was deleted during the escalation
## Which issue(s) this PR fixes
https://github.com/grafana/oncall-private/issues/2017
## Checklist
- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
# What this PR does
## Checklist
- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
# What this PR does
- update `slackclient` dependency to latest version. The version we were
using was 5 years old 😲
- first followed the v2 migration guide
[here](https://github.com/slackapi/python-slack-sdk/wiki/Migrating-to-2.x)
followed by the v3 migration guide
[here](https://slack.dev/python-slack-sdk/v3-migration/). The main
changes were:
- The PyPI project was renamed from `slackclient` to `slack_sdk`
- it is discouraged/harder to call `api_call` and encouraged to call the
helper methods (ex. `chat_postMessage`;
[note](https://github.com/slackapi/python-slack-sdk/wiki/Migrating-to-2.x#web-client-api-changes)
in migration guide docs)
- In 1.x, a failed api call would return the error payload to you and
have you handle the error. In 2.x, a failed api call will throw an
exception. To handle this in your code, you will have to wrap api calls
with a try except block. Since we overload `WebClient.api_call` this was
an easy change and only required a one line change
- remove `apps.slack.slack_client.slack_server.SlackClientServer` class.
The new version of `slack_sdk` handles the case that we needed to
overload for in the first place.
- merged `apps/slack/slack_client/slack_client.py` and
`apps/slack/slack_client/exceptions.py` into `apps/slack/client.py`
## Checklist
- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
# What this PR does
## Which issue(s) this PR fixes
https://github.com/grafana/oncall/issues/2915
## Checklist
- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
# What this PR does
Refactors the foreign key relationship between `AlertGroup` and
`SlackMessage` models (see
https://github.com/grafana/oncall-private/issues/1867 for more info).
## Which issue(s) this PR fixes
https://github.com/grafana/oncall-private/issues/1867
## Checklist
- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
# What this PR does
Fix escalation snapshot building if last notified user in escalation
step "Notify users one by one (round-robin)" was deleted
## Which issue(s) this PR fixes
https://github.com/grafana/oncall-private/issues/2148
## Checklist
- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
# What this PR does
Fixes https://github.com/grafana/oncall-private/issues/2091
## Checklist
- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
# What this PR does
- add `common.oncall_gateway.tasks.create_slack_connector_async_v2`
celery task to `CELERY_TASK_ROUTES`
- remove two deprecated tasks in `common.oncall_gateway.tasks`
## Checklist
- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
# What this PR does
These celery tasks have not been used for more than one week (since
[v1.3.25](https://github.com/grafana/oncall/releases/tag/v1.3.25) which
released an improvement for Grafana Alerting integration)
## Checklist
- [ ] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
# What this PR does
This task was renamed in a v1.3.30. It is no longer referenced and there
are no pending tasks in production queues:

## Checklist
- [ ] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
# What this PR does
Clean up check escalation finished task, update description
## Checklist
- [ ] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
# What this PR does
## Which issue(s) this PR fixes
Closes https://github.com/grafana/oncall-private/issues/1822
## Checklist
- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
# What this PR does
- Add a test to ensure celery tasks are assigned to a queue
- Move CELERY_TASK_ROUTES out of settings into its own file for easier
reuse and reference.
## Which issue(s) this PR fixes
## Checklist
- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
When calculating on-call load, if an involved shift extends outside the
requested time span you can get unexpected results. Truncate the
returned events times to keep the details for the specified start/end
dates.
# What this PR does
Fix silence for paused escalations or alert groups with empty escalation
chain
## Which issue(s) this PR fixes
https://github.com/grafana/oncall/issues/2912
## Checklist
- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
# What this PR does
Make descriptions consistent for the `STEP_NOTIFY_IF_TIME` escalation
step (my [initial
assumption](https://github.com/grafana/oncall/pull/2729#discussion_r1281929984)
seems to be wrong and this can confuse users).
## Which issue(s) this PR fixes
https://github.com/grafana/support-escalations/issues/7164
## Checklist
- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
# What this PR does
Exclude the latest alert groups from escalation finished check to give
them time for building escalation snapshot
## Which issue(s) this PR fixes
https://github.com/grafana/oncall-private/issues/2028
## Checklist
- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)