centralcloud/oncall-engine

Author	SHA1	Message	Date
Vadim Stepanov	8188dd5dd2	Create missing direct paging integrations (#3468 ) # What this PR does Makes organization sync create direct paging integrations for Grafana teams that don't have one. ## Which issue(s) this PR fixes Related to https://github.com/grafana/oncall-private/issues/2302 ## Checklist - [x] Unit, integration, and e2e (if applicable) tests updated - [x] Documentation added (or `pr:no public docs` PR label added if not required) - [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not required)	2023-11-30 17:18:18 +00:00
Joey Orlando	7c4b40a046	upgrade to Python 3.12 (#3456 ) # What this PR does Upgrade to Python 3.12 + fix several invalid test assertions that lead to test failures in the latest version of `pytest`: ``` AttributeError: 'called_once_with' is not a valid assertion. Use a spec for the mock if 'called_once_with' is meant to be an attribute.. Did you mean: 'assert_called_once_with'? ``` ## Checklist - [ ] Unit, integration, and e2e (if applicable) tests updated - [x] Documentation added (or `pr:no public docs` PR label added if not required) - [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not required)	2023-11-30 13:47:41 +00:00
Vadim Stepanov	381a9ecf54	Delete duplicate direct paging integrations (#3412 ) # What this PR does Deletes duplicate direct paging integrations (i.e. keeps only the first direct paging integration per team). Also adds a unique constraint that will make such duplicates impossible at the DB level. ## Which issue(s) this PR fixes Related to https://github.com/grafana/oncall-private/issues/2302 ## Checklist - [x] Unit, integration, and e2e (if applicable) tests updated - [x] Documentation added (or `pr:no public docs` PR label added if not required) - [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not required)	2023-11-30 11:19:12 +00:00
Matias Bordese	7aa78f5f73	Enable flake8-bugbear, fix issues (#3454 ) Enables [flake8-bugbear](https://github.com/PyCQA/flake8-bugbear), checking for bugs/design problems, and [fixes the issues found](https://pastebin.com/fEDBz6Ta) (some interesting ones, particularly with mutable args). Related to https://github.com/grafana/oncall/pull/3448	2023-11-29 15:04:48 +00:00
Matias Bordese	d730f6b2bf	Trigger distribute task after alert is committed (#3420 ) Fix issue triggering task retries because alert is not yet committed to the DB. Similar to https://github.com/grafana/oncall/pull/3001.	2023-11-24 12:02:32 +00:00
Vadim Stepanov	cb2d4fa76b	Fix deleting integrations with duplicate names (#3397 ) # What this PR does Fixes a bug when it's not possible to delete two or more integrations having the same name at once. ## Which issue(s) this PR fixes https://github.com/grafana/oncall-private/issues/2313 ## Checklist - [x] Unit, integration, and e2e (if applicable) tests updated - [x] Documentation added (or `pr:no public docs` PR label added if not required) - [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not required)	2023-11-21 12:44:21 +00:00
Yulya Artyukhina	d7d5c3aa28	Fix acknowledge reminder (#3345 ) # What this PR does Fix acknowledge reminder: - check if organization was deleted - improve logging ## Which issue(s) this PR fixes ## Checklist - [x] Unit, integration, and e2e (if applicable) tests updated - [x] Documentation added (or `pr:no public docs` PR label added if not required) - [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not required)	2023-11-14 13:39:27 +00:00
Ildar Iskhakov	784c5ee7c1	Add notifications success ratio log to auditor (#3312 ) # What this PR does This PR adds alert groups success ratio over last 48 hours ## Which issue(s) this PR fixes ## Checklist - [ ] Unit, integration, and e2e (if applicable) tests updated - [ ] Documentation added (or `pr:no public docs` PR label added if not required) - [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not required)	2023-11-10 16:39:13 +08:00
Vadim Stepanov	456829f768	Pass all integration labels down to alert groups (#3302 ) Reverts grafana/oncall#3301	2023-11-08 14:04:58 +00:00
Yulya Artyukhina	7552de13e5	Add a command to continue escalations for alert groups (#3283 ) # What this PR does Add the ability to continue escalations for alert groups from the point where they were stopped ## Checklist - [x] Unit, integration, and e2e (if applicable) tests updated - [x] Documentation added (or `pr:no public docs` PR label added if not required) - [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not required)	2023-11-07 13:44:23 +00:00
Matias Bordese	cc9dc66437	Move cache clear to fixtures, fix some deprecation notices (#3269 )	2023-11-06 16:52:50 +00:00
Joey Orlando	2cbb20601e	Improve performance of GET /users and GET /teams endpoints used by add responders popup (#3241 ) # What this PR does - Improve performance of the specific `GET /users` and `GET /teams` calls that're made by the Add Responders dropdown in the UI - Add `GET /team/{teamId}` internal API route (needed by Grafana Incident team for their Add Responders changes) - Some UI improvements to the Add Responders popup (loading state + pre-fetch users and teams when the drawer is opened) - Re-enable django-admin only if `settings.SILK_PROFILER_ENABLED == True` (need to be able to log into django admin to auth to silk routes) Closes #3231 Closes https://github.com/grafana/oncall-private/issues/2252 ## Checklist - [x] Unit, integration, and e2e (if applicable) tests updated - [x] Documentation added (or `pr:no public docs` PR label added if not required) - [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not required)	2023-11-03 12:40:54 -04:00
Joey Orlando	d49ccef8cb	address some minor direct paging backend issues (#3208 ) # Which issue(s) this PR fixes - Fixes an issue where if the user does not appear in the `UserHasNotification` query, we don't actually unpage the user and therefore they still show up in the `paged_users` array. (unpaging == creating an `AlertGroupLogRecord.TYPE_UNPAGE_USER` log record) - Fixes an issue where if a user is paged multiple times, they would show up in `paged_users` more than once ## Checklist - [x] Unit, integration, and e2e (if applicable) tests updated - [x] Documentation added (or `pr:no public docs` PR label added if not required) - [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not required)	2023-10-27 20:47:00 +00:00
Joey Orlando	697248dc75	Add responders improvements (#3128 ) # What this PR does https://www.loom.com/share/c5e10b5ec51343d0954c6f41cfd6a5fb ## Summary of backend changes - Add `AlertReceiveChannel.get_orgs_direct_paging_integrations` method and `AlertReceiveChannel.is_contactable` property. These are needed to be able to (optionally) filter down teams, in the `GET /teams` internal API endpoint ([here](https://github.com/grafana/oncall/pull/3128/files#diff-a4bd76e557f7e11dafb28a52c1034c075028c693b3c12d702d53c07fc6f24c05R55-R63)), to just teams that have a "contactable" Direct Paging integration - `engine/apps/alerts/paging.py` - update these functions to support new UX. In short `direct_paging` no longer takes a list of `ScheduleNotifications` or an `EscalationChain` object - add `user_is_oncall` helper function - add `_construct_title` helper function. In short if no `title` is provided, which is the case for Direct Pages originating from OnCall (either UI or Slack), then the format is `f"{from_user.username} is paging <team.name (if team is specified> <comma separated list of user.usernames> to join escalation"` - `engine/apps/api/serializers/team.py` - add `number_of_users_currently_oncall` attribute to response schema ([code](https://github.com/grafana/oncall/pull/3128/files#diff-26af48f796c9e987a76447586dd0f92349783d6ea6a0b6039a2f0f28bd58c2ebR45-R52)) - `engine/apps/api/serializers/user.py` - add `is_currently_oncall` attribute to response schema ([code](https://github.com/grafana/oncall/pull/3128/files#diff-6744b5544ebb120437af98a996da5ad7d48ee1139a6112c7e3904010ab98f232R157-R162)) - `engine/apps/api/views/team.py` - add support for two new optional query params `only_include_notifiable_teams` and `include_no_team` ([code](https://github.com/grafana/oncall/pull/3128/files#diff-a4bd76e557f7e11dafb28a52c1034c075028c693b3c12d702d53c07fc6f24c05R55-R70)) - `engine/apps/api/views/user.py` - in the `GET /users` internal API endpoint, when specifying the `search` 
query param now also search on `teams__name` ([code](https://github.com/grafana/oncall/pull/3128/files#diff-30309629484ad28e6fe09816e1bd226226d652ea977b6f3b6775976c729bf4b5R223); this is a new UX requirement) - add support for a new optional query param, `is_currently_oncall`, to allow filtering users based on whether they are currently on call or not ([code](https://github.com/grafana/oncall/pull/3128/files#diff-30309629484ad28e6fe09816e1bd226226d652ea977b6f3b6775976c729bf4b5R272-R282)) - remove `check_availability` endpoint (no longer used with new UX; also removed references in frontend code) - `engine/apps/slack/scenarios/paging.py` and `engine/apps/slack/scenarios/manage_responders.py` - update Slack workflows to support new UX. Schedules are no longer a concept here. When creating a new alert group via `/escalate` the user either specifies a team and/or user(s) (they must specify at least one of the two and validation is done here to check this). When adding responders to an existing alert group it's simply a list of users that they can add, no more schedules. - add `Organization.slack_is_configured` and `Organization.telegram_is_configured` properties. These are needed to support [this new functionality](https://github.com/grafana/oncall/pull/3128/files#diff-9d96504027309f2bd1e95352bac1433b09b60eb4fafb611b52a6c15ed16cbc48R271-R272) in the `AlertReceiveChannel` model. 
## Summary of frontend changes - Refactor/rename `EscalationVariants` component to `AddResponders` + remove `grafana-plugin/src/containers/UserWarningModal` (no longer needed with new UX) - Remove `grafana-plugin/src/models/user.ts` as it seemed to be a duplicate of `grafana-plugin/src/models/user/user.types.ts` Related to https://github.com/grafana/incident/issues/4278 - Closes #3115 - Closes #3116 - Closes #3117 - Closes #3118 - Closes #3177 ## TODO - [x] make frontend changes - [x] update Slack backend functionality - [x] update public documentation - [x] add/update e2e tests ## Post-deploy To-dos - [ ] update dev/ops/production Slack bots to update `/escalate` command description (should now say "Direct page a team or user(s)") ## Checklist - [x] Unit, integration, and e2e (if applicable) tests updated - [x] Documentation added (or `pr:no public docs` PR label added if not required) - [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not required)	2023-10-27 12:12:07 -04:00
Vadim Stepanov	1acb7018d0	Improve alert group deletion API (#3124 ) # What this PR does - Invalidate alert group cache on wipe - Improve public API docs on alert group deletion - Add / improve tests ## Which issue(s) this PR fixes Related to https://github.com/grafana/oncall/issues/3051 ## Checklist - [x] Unit, integration, and e2e (if applicable) tests updated - [x] Documentation added (or `pr:no public docs` PR label added if not required) - [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not required)	2023-10-05 14:32:40 +01:00
Vadim Stepanov	a727450d49	Public API: Acknowledge & Resolve actions (#3108 ) # What this PR does Makes it possible to acknowledge/unacknowledge and resolve/unresolve alert groups via public API, and makes sure these actions are reflected properly in the alert group timeline. ## Demo
```bash
curl --request POST \
  --header "Authorization: TOKEN" \
  http://localhost:8080/api/v1/alert_groups/IQMHLV8INB24N/resolve
```
<img width="651" alt="Screenshot 2023-10-04 at 16 05 27" src="https://github.com/grafana/oncall/assets/20116910/d4e66868-0132-4b6b-95c7-8424fced7c0b"> ## Which issue(s) this PR fixes https://github.com/grafana/oncall/issues/3051 ## Checklist - [x] Unit, integration, and e2e (if applicable) tests updated - [x] Documentation added (or `pr:no public docs` PR label added if not required) - [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not required)	2023-10-05 09:46:48 +01:00
Matias Bordese	29eae9b2c6	Fix slack notification for shift end affected by taken swap (#3092 ) Related to https://github.com/grafana/oncall/issues/3096	2023-10-02 12:56:07 +00:00
Vadim Stepanov	6caacf4048	Handle Slack ratelimit on alert group deletion (#3038 ) # What this PR does - gracefully retry `apps.alerts.tasks.delete_alert_group.delete_alert_group` when hitting Slack ratelimits - remove Slack messages from the DB as soon as they are deleted from Slack, so the tasks are not retrying perpetually ## Checklist - [x] Unit, integration, and e2e (if applicable) tests updated - [x] Documentation added (or `pr:no public docs` PR label added if not required) - [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not required)	2023-09-19 08:41:47 +00:00
Vadim Stepanov	8b2212c7dc	Improve Slack error handling (#3000 ) # What this PR does - Rename `SlackClientWithErrorHandling` to just `SlackClient` - Add more error classes + improve the way errors are raised based on the Slack error code - Add API call retries on Slack server errors (e.g. when Slack returns `5xx` errors) - Refactor some methods working with Slack API + add tests ## Which issue(s) this PR fixes - https://github.com/grafana/oncall-private/issues/1837 - https://github.com/grafana/oncall-private/issues/1840 - https://github.com/grafana/oncall-private/issues/1842 ## Checklist - [x] Unit, integration, and e2e (if applicable) tests updated - [x] Documentation added (or `pr:no public docs` PR label added if not required) - [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not required)	2023-09-12 09:49:16 +00:00
Matias Bordese	bf4d948449	Update slack schedule shift change notification (#2949 ) Related to https://github.com/grafana/oncall/issues/2916 Updated notification: ![slack-shift-notification](https://github.com/grafana/oncall/assets/260710/825fda59-6636-44c1-9740-8976e7c109a7)	2023-09-07 13:00:12 +00:00
Yulya Artyukhina	6ff61ad172	Fix escalation step "Notify if num alerts in time window" (#2965 ) # What this PR does Fix escalation step "Notify if num alerts in time window" when escalation policy was deleted during the escalation ## Which issue(s) this PR fixes https://github.com/grafana/oncall-private/issues/2017 ## Checklist - [x] Unit, integration, and e2e (if applicable) tests updated - [x] Documentation added (or `pr:no public docs` PR label added if not required) - [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not required)	2023-09-05 10:32:59 +00:00
Joey Orlando	a9155130df	update slack_sdk dependency to latest version (#2947 ) # What this PR does - update `slackclient` dependency to latest version. The version we were using was 5 years old 😲 - first followed the v2 migration guide [here](https://github.com/slackapi/python-slack-sdk/wiki/Migrating-to-2.x) followed by the v3 migration guide [here](https://slack.dev/python-slack-sdk/v3-migration/). The main changes were: - The PyPI project was renamed from `slackclient` to `slack_sdk` - it is discouraged/harder to call `api_call` and encouraged to call the helper methods (ex. `chat_postMessage`; [note](https://github.com/slackapi/python-slack-sdk/wiki/Migrating-to-2.x#web-client-api-changes) in migration guide docs) - In 1.x, a failed api call would return the error payload to you and have you handle the error. In 2.x, a failed api call will throw an exception. To handle this in your code, you will have to wrap api calls with a try except block. Since we overload `WebClient.api_call` this was an easy change and only required a one line change - remove `apps.slack.slack_client.slack_server.SlackClientServer` class. The new version of `slack_sdk` handles the case that we needed to overload for in the first place. - merged `apps/slack/slack_client/slack_client.py` and `apps/slack/slack_client/exceptions.py` into `apps/slack/client.py` ## Checklist - [x] Unit, integration, and e2e (if applicable) tests updated - [x] Documentation added (or `pr:no public docs` PR label added if not required) - [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not required)	2023-09-05 11:31:59 +02:00
Yulya Artyukhina	cc92c53f84	Fix build escalation snapshot (#2954 ) # What this PR does Fix escalation snapshot building if last notified user in escalation step "Notify users one by one (round-robin)" was deleted ## Which issue(s) this PR fixes https://github.com/grafana/oncall-private/issues/2148 ## Checklist - [x] Unit, integration, and e2e (if applicable) tests updated - [x] Documentation added (or `pr:no public docs` PR label added if not required) - [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not required)	2023-09-04 11:10:28 +00:00
Yulya Artyukhina	a1a5adf891	Fix silence for paused escalations or alert groups with empty escalation chain (#2929 ) # What this PR does Fix silence for paused escalations or alert groups with empty escalation chain ## Which issue(s) this PR fixes https://github.com/grafana/oncall/issues/2912 ## Checklist - [x] Unit, integration, and e2e (if applicable) tests updated - [x] Documentation added (or `pr:no public docs` PR label added if not required) - [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not required)	2023-08-31 11:47:13 +00:00
Yulya Artyukhina	1ff6ac3380	Set next step eta to None if escalation was paused (#2901 ) # What this PR does Set next step eta to None if escalation was paused (escalation step "Continue escalation if num alerts > X in time window") ## Which issue(s) this PR fixes https://github.com/grafana/oncall-private/issues/2028 ## Checklist - [x] Unit, integration, and e2e (if applicable) tests updated - [x] Documentation added (or `pr:no public docs` PR label added if not required) - [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not required)	2023-08-29 13:46:41 +00:00
Yulya Artyukhina	5bc7351671	Fix next step eta for silenced alert groups (#2887 ) # What this PR does Update `next_step_eta` in alert group escalation snapshot when alert group is silenced for period ## Which issue(s) this PR fixes Fixes the issue related to [this one](https://github.com/grafana/oncall-private/issues/2028) ## Checklist - [x] Unit, integration, and e2e (if applicable) tests updated - [x] Documentation added (or `pr:no public docs` PR label added if not required) - [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not required) --------- Co-authored-by: Joey Orlando <joey.orlando@grafana.com> Co-authored-by: Joey Orlando <joseph.t.orlando@gmail.com>	2023-08-28 12:13:01 +00:00
Yulya Artyukhina	58a9a39efe	Improve getting/updating contact points for Grafana Alerting integration (#2742 ) This PR improves Grafana Alerting integration: - get alerting contact points on the fly instead of keeping them in the DB - add ability to connect more than one contact point - add ability to create a new contact point when creating a Grafana Alerting integration - show warnings in integration settings for non-active contact points - remove creation of alerting notification policies when creating a Grafana Alerting integration ## Checklist - [x] Unit, integration, and e2e (if applicable) tests updated - [x] Documentation added (or `pr:no public docs` PR label added if not required) - [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not required) --------- Co-authored-by: Rares Mardare <rares.mardare@grafana.com>	2023-08-18 12:12:29 +02:00
Matias Bordese	179a1db471	Add alertmanager integration for heartbeat support (#2807 ) Related to https://github.com/grafana/oncall/issues/2801 and https://github.com/grafana/support-escalations/issues/7081. --------- Co-authored-by: Innokentii Konstantinov <innokenty.konstantinov@grafana.com>	2023-08-17 13:22:37 +00:00
Vadim Stepanov	6f0921a3e4	Fix Slack acknowledgment reminders (#2769 ) # What this PR does Fixes a bug with Slack acknowledgment reminders not being sent (+ some refactoring and unit tests). ## Which issue(s) this PR fixes https://github.com/grafana/oncall/issues/2756 ## Checklist - [x] Unit, integration, and e2e (if applicable) tests updated - [x] Documentation added (or `pr:no public docs` PR label added if not required) - [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not required)	2023-08-11 09:41:56 +00:00
Matias Bordese	bb9f647608	Filter out untaken swaps from final schedule and shift notifications (#2748 ) Avoid creating (or notifying) about potential event splits resulting from untaken swap requests.	2023-08-04 17:43:54 +00:00
Yulya Artyukhina	0494afac85	Update schedule slack notifications (#2710 ) # What this PR does Update schedule slack notifications to use schedule final events instead of getting events from iCal ## Checklist - [x] Unit, integration, and e2e (if applicable) tests updated - [x] Documentation added (or `pr:no public docs` PR label added if not required) - [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not required)	2023-08-03 12:38:01 +00:00
Joey Orlando	8db1ea5235	remove some references to amixr (#2698 ) # What this PR does Update references to amixr in various spots in the docs/code + some `.md` IDE autoformatter changes ## Checklist - [ ] Unit, integration, and e2e (if applicable) tests updated (N/A) - [x] Documentation added (or `pr:no public docs` PR label added if not required) - [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not required)	2023-08-01 14:22:42 -04:00
Innokentii Konstantinov	1ccb9d6979	AlertManager v2 (#2643 ) Introduce AlertManager v2 integration with improved internal behaviour: it uses grouping from AlertManager instead of trying to re-group alerts on the OnCall side. Existing AlertManager and Grafana Alerting integrations are marked as Legacy with options to migrate them manually now or be migrated automatically after DEPRECATION DATE(TBD). Integration urls and public api responses stay the same both for legacy and new integrations. --------- Co-authored-by: Rares Mardare <rares.mardare@grafana.com> Co-authored-by: Joey Orlando <joey.orlando@grafana.com>	2023-08-01 12:18:52 +08:00
Joey Orlando	f77a54b518	Shift Swap Requests in Slack + improve typing for Slack django app (#2653 ) # What this PR does Shift Swap Requests https://www.loom.com/share/860c3337b338412cbd2ac4024260f3e8?sid=3d91b558-b4de-4351-8b45-8a99b7302346 Other - Drastically improve the typing in the `slack` Django app, and several other models/functions that were consumed by logic within the `slack` Django app (ex. setting `RelatedManager` type hints on various models) https://www.loom.com/share/da6b9984519c48d59a45d3c93c08d7dc ## Checklist - [x] Unit, integration, and e2e (if applicable) tests updated - [x] Documentation added (or `pr:no public docs` PR label added if not required) - [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not required)	2023-07-28 15:11:38 +00:00
Matias Bordese	3ff6e0e492	Refactoring schedule final events (#2651 ) Update `list_users_to_notify_from_ical` to use schedule final events	2023-07-28 11:59:33 +00:00
Matvey Kukuy	abc1c94355	Incident -> Alert Group (#2090 ) Co-authored-by: Vadim Stepanov <vadimkerr@gmail.com>	2023-07-26 15:08:07 +00:00
Vadim Stepanov	55299995f7	Fix "Continue escalation if >X alerts per Y minutes" escalation step (#2636 ) # What this PR does Fixes a faulty escalation step "Continue escalation if >X alerts per Y minutes". ## Which issue(s) this PR fixes https://github.com/grafana/oncall/issues/895 ## Checklist - [x] Unit, integration, and e2e (if applicable) tests updated - [x] Documentation added (or `pr:no public docs` PR label added if not required) - [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not required)	2023-07-26 13:33:24 +01:00
Vadim Stepanov	faa7099297	Direct paging: page if acked or silenced, show warning when resolved (#2639 ) # What this PR does The current implementation of the direct paging feature doesn't page additional responders if the alert group is acknowledged, silenced, or resolved, and doesn't show any warnings for such cases. This PR makes it so that adding responders for silenced & acknowledged alert groups actually pages the selected user / schedule. For resolved alert groups, a warning message will be shown both in web UI and Slack. ## Which issue(s) this PR fixes Related to https://github.com/grafana/oncall/issues/2442 ## Checklist - [x] Unit, integration, and e2e (if applicable) tests updated - [x] Documentation added (or `pr:no public docs` PR label added if not required) - [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not required)	2023-07-26 13:25:26 +01:00
Matias Bordese	5341a7ea5b	Revert "Refactoring schedule final events for reusability" (#2642 ) Reverts grafana/oncall#2625 Found a small issue; reverting for now, until the problem is fixed, so as not to block other changes	2023-07-25 17:37:33 -03:00
Matias Bordese	95723583aa	Refactoring schedule final events for reusability (#2625 ) First step towards reusing `schedule.final_events` where current / upcoming schedule shifts information is needed (eg. escalations, shift notifications, etc)	2023-07-25 18:45:50 +00:00
Vadim Stepanov	602ed535e3	Fix duplicate orders on routes and escalation policies (#2568 ) # What this PR does Fix duplicate `order` values for models `EscalationPolicy` and `ChannelFilter` using the same approach as https://github.com/grafana/oncall/pull/2278. - Make internal API action `move_to_position` a part of [OrderedModelViewSet](https://github.com/grafana/oncall/pull/2568/files#diff-eb62521ccbcb072d1bd2156adeadae3b5895bce6d0d54432a23db3948b0ada54R11-R34), so all ordered model views use the same logic. - Make public API serializers for ordered models inherit from [OrderedModelSerializer](https://github.com/grafana/oncall/pull/2568/files#diff-d749f94af5a49adaf5074325cdfad10ddd5a52dbfd78b49582867ebb9c92fae5R6-R38), so ordered model views are consistent with each other in public API. - Remove `order` from plugin frontend, since it's not being used anywhere. The frontend uses step indices & `move_to_position` action instead. - Make escalation snapshot use step indices instead of orders, since orders are not guaranteed to be sequential (+fix a minor off-by-one bug) ## Which issue(s) this PR fixes https://github.com/grafana/oncall-private/issues/1680 ## Checklist - [x] Unit, integration, and e2e (if applicable) tests updated - [x] Documentation added (or `pr:no public docs` PR label added if not required) - [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not required)	2023-07-18 17:17:53 +00:00
Joey Orlando	9cc74e5b67	remove references to AlertGroup.is_archived and AlertGroup.unarchived_objects (#2524 ) # What this PR does This is a follow up to #2502 which started to remove logic related to "archiving" alert groups. This PR: - removes all references to `AlertGroup.is_archived` and marks the column as deprecated. We will remove it in the next release - removes the `AlertGroup.unarchived_objects` `Manager` - renames the `AlertGroup.all_objects` `Manager` to `AlertGroup.objects` ## Checklist - [x] Unit, integration, and e2e (if applicable) tests updated - [ ] Documentation added (or `pr:no public docs` PR label added if not required) - [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not required)	2023-07-18 11:48:34 +00:00
Joey Orlando	d5b43b0439	minor improvements for check_escalation_finished celery task (#2554 ) # What this PR does This PR adds some enhancements to the `check_escalation_finished` celery task. It short-circuits auditing of an alert group if it does not have an escalation chain associated with it. In `EscalationSnapshotMixin.start_escalation_if_needed` we will not set `raw_escalation_snapshot` ([here](https://github.com/grafana/oncall/blob/dev/engine/apps/alerts/escalation_snapshot/escalation_snapshot_mixin.py#L262)) in this case:
```python3
def start_escalation_if_needed(self, countdown=START_ESCALATION_DELAY, eta=None):
    """
    :type self:AlertGroup
    """
    AlertGroup = apps.get_model("alerts", "AlertGroup")

    is_on_maintenace_or_debug_mode = self.channel.maintenance_mode is not None

    if (
        self.is_restricted
        or is_on_maintenace_or_debug_mode
        or self.pause_escalation
        or not self.escalation_chain_exists  # <-- here
    ):
        logger.debug(
            f"Not escalating alert group w/ pk: {self.pk}\n"
            f"is_restricted: {self.is_restricted}\n"
            f"is_on_maintenace_or_debug_mode: {is_on_maintenace_or_debug_mode}\n"
            f"pause_escalation: {self.pause_escalation}\n"
            f"escalation_chain_exists: {self.escalation_chain_exists}"
        )
        return

    logger.debug(f"Start escalation for alert group with pk: {self.pk}")

    # take raw escalation snapshot from db if escalation is paused
    raw_escalation_snapshot = (
        self.build_raw_escalation_snapshot() if not self.pause_escalation else self.raw_escalation_snapshot
    )
    task_id = celery_uuid()

    AlertGroup.all_objects.filter(pk=self.pk,).update(
        active_escalation_id=task_id,
        is_escalation_finished=False,
        raw_escalation_snapshot=raw_escalation_snapshot,
    )
```
`EscalationSnapshotMixin.escalation_chain_exists` is as such:
```python3
@property
def escalation_chain_exists(self) -> bool:
    if self.pause_escalation:
        return False
    elif not self.channel_filter:
        return False
    return self.channel_filter.escalation_chain is not None
```
## Checklist - [x] Unit, integration, and e2e (if applicable) tests updated - [ ] Documentation added (or `pr:no public docs` PR label added if not required) (N/A) - [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not required) (N/A)	2023-07-17 14:04:53 +00:00
Vadim Stepanov	69bafb61f1	Direct paging improvements (#2537 ) # What this PR does - Deprecates `/oncall` Slack command in favour of `/escalate` (direct paging) + fixes a regression bug in both commands - Unifies direct paging UX across Slack & Web UI (or at least makes an attempt to make things more similar). Kudos to @iskhakov for all the great work on this recently! - A bunch of minor changes that hopefully make direct paging more usable - TODO: documentation updates will be added in a separate PR ## Screenshots ### No issues scenario Slack: <img width="522" alt="Screenshot 2023-07-14 at 23 53 11" src="https://github.com/grafana/oncall/assets/20116910/ec15a18f-d817-4177-b1f2-6b89d79bb361"> Web UI: <img width="1172" alt="Screenshot 2023-07-14 at 23 52 25" src="https://github.com/grafana/oncall/assets/20116910/813f967c-2fdd-4868-9287-487dbfa7cea6"> ### Not configured scenario Slack: <img width="519" alt="Screenshot 2023-07-14 at 23 45 22" src="https://github.com/grafana/oncall/assets/20116910/932fa05c-81ea-42ca-be80-41b05f767d3e"> Web UI: <img width="1172" alt="Screenshot 2023-07-14 at 23 47 31" src="https://github.com/grafana/oncall/assets/20116910/6bcb07e4-2e50-4120-9fac-be8b0277e181"> ### `/oncall` deprecation warning <img width="521" alt="Screenshot 2023-07-17 at 10 31 56" src="https://github.com/grafana/oncall/assets/20116910/4ff28337-1693-4af0-81d9-9eda90099c1b"> ## Which issue(s) this PR fixes https://github.com/grafana/oncall/issues/2442 ## Checklist - [x] Unit, integration, and e2e (if applicable) tests updated - [x] Documentation added (or `pr:no public docs` PR label added if not required) - [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not required)	2023-07-17 14:21:56 +01:00
Innokentii Konstantinov	aa4edad4a7	Remove INTEGRATIONS_TO_REVERSE_URL_MAP (#2533 ) This PR is a step towards migrating to AlertManager v2. I want to simplify the config a little to be able to move forward. In INTEGRATIONS_TO_REVERSE_URL_MAP every key equals its value, so in all cases we just don't need it. Public API is different, since I need this mapping for the migration to AlertManager v2 to provide backward compatibility. It will be done in the next PRs.	2023-07-17 04:43:24 +00:00
Joey Orlando	767c5352fa	augment API response pagination attributes (#2471 ) # What this PR does This PR: - adds a few attributes to paginated API responses - removes channel filter "send demo alert" internal API endpoint + tests (this endpoint was marked as deprecated + not consumed by the web UI) With the new paginated API response schema, the web UI will no longer need to: - hardcode `ITEMS_PER_PAGE` for each table - manually calculate total number of pages (these two things ☝️ will be done in https://github.com/grafana/oncall/issues/2476) For `GET /api/internal/v1/alertgroups` the response will now look like this:
```diff
{
  "next": <url> | None,
  "previous": <url> | None,
  "results": [],
++ "page_size": <int>
}
```
For all other paginated API responses, the response will now look like:
```diff
{
  "count": <int>,
  "next": <url> | None,
  "previous": <url> | None,
  "results": [],
++ "page_size": <int>,
++ "current_page_number": <int>,
++ "total_pages": <int>
}
```
## TODO - [x] update public API docs to include these new attributes ## Checklist - [x] Unit, integration, and e2e (if applicable) tests updated - [x] Documentation added (or `pr:no public docs` PR label added if not required) - [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not required)	2023-07-14 11:19:40 -04:00
Joey Orlando	681087117c	remove deprecated gitops views (#2530 ) # What this PR does I don't see any traffic to these endpoint URLs over the past week. The terraform "renderers" in `engine/apps/alerts/terraform_renderer` make a lot of references to amixr, making me think these are legacy and can be removed. ## Checklist - [ ] Unit, integration, and e2e (if applicable) tests updated (N/A) - [ ] Documentation added (or `pr:no public docs` PR label added if not required) (N/A) - [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not required) (N/A)	2023-07-14 05:13:52 -04:00
Joey Orlando	77f6dedce5	add index on started_at column in alert groups (#2516 ) # What this PR does Adds an index on the `started_at` column in the `alerts_alertgroup` table. For the alert groups query used by the `check_escalation_finished_task`, this resulted in a huge performance boost, taking the query time from 89mins to 4secs (on our largest production dataset). ## Which issue(s) this PR fixes closes #724 closes https://github.com/grafana/oncall-private/issues/1713 ## Checklist - [x] Unit, integration, and e2e (if applicable) tests updated - [x] Documentation added (or `pr:no public docs` PR label added if not required) - [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not required)	2023-07-13 05:22:59 -04:00
Ildar Iskhakov	0b28815d46	Unhide direct paging integration (#2483 ) # What this PR does Fixes https://github.com/grafana/oncall/issues/2442 https://github.com/grafana/oncall/assets/2262529/08bb8e5f-acc6-4f2d-9e38-717c9f37e3da ## Which issue(s) this PR fixes ## Checklist - [ ] Unit, integration, and e2e (if applicable) tests updated - [ ] Documentation added (or `pr:no public docs` PR label added if not required) - [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not required)	2023-07-13 13:41:31 +08:00
Joey Orlando	d24dc4b630	remove organization maintenance mode + fix integration maintenance mode (#2511 )	2023-07-12 16:41:44 -04:00
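
Commit 7c4b40a046 (#3456) in the log above fixes test assertions that used `called_once_with` where `assert_called_once_with` was meant. A minimal sketch of the difference, using only the standard library's `unittest.mock`; the `notify` mock and its arguments are hypothetical and are not taken from the OnCall test suite:

```python
from unittest import mock

# Hypothetical collaborator; stands in for any mocked function in a test.
notify = mock.Mock()
notify("alert-group-1", urgent=True)

# Correct: a real assertion method that raises AssertionError on mismatch.
notify.assert_called_once_with("alert-group-1", urgent=True)


def mismatch_is_detected() -> bool:
    """Return True if the real assertion rejects wrong arguments."""
    try:
        notify.assert_called_once_with("alert-group-2", urgent=True)
    except AssertionError:
        return True
    return False


def misspelled_name_is_rejected() -> bool:
    """Return True if accessing the misspelled name raises AttributeError.

    The commit message quotes this AttributeError. On interpreters that do
    not raise it, the attribute is an auto-created child mock, so calling it
    asserts nothing and a broken test passes silently.
    """
    try:
        notify.called_once_with("anything")
    except AttributeError:
        return True
    return False


print("mismatch detected:", mismatch_is_detected())
print("misspelled name rejected:", misspelled_name_is_rejected())
```

The second helper deliberately reports its result instead of asserting it, since whether the misspelled name is rejected depends on the interpreter version in use.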


90 commits
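
The paginated response schema described in commit 767c5352fa (#2471) adds `page_size`, `current_page_number`, and `total_pages` alongside `count`. A rough sketch of how such fields could be derived, assuming 1-indexed pages and `total_pages` equal to the ceiling of `count / page_size`; the `paginate` helper is hypothetical, works on a plain list, and omits the `next`/`previous` URLs:

```python
import math
from typing import Any


def paginate(items: list[Any], page_number: int, page_size: int) -> dict[str, Any]:
    """Build a response dict shaped like the schema in the commit message.

    Assumption: pages are 1-indexed and total_pages = ceil(count / page_size).
    """
    if page_size < 1 or page_number < 1:
        raise ValueError("page_size and page_number must be positive")
    count = len(items)
    start = (page_number - 1) * page_size
    return {
        "count": count,
        "results": items[start : start + page_size],
        "page_size": page_size,
        "current_page_number": page_number,
        "total_pages": math.ceil(count / page_size),
    }


page = paginate(list(range(45)), page_number=2, page_size=20)
print(page["total_pages"], page["results"][0])
```

With these fields in the response, a client no longer has to hardcode the page size or compute the page count itself, which is the motivation the commit message gives.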