centralcloud/oncall-engine

Author	SHA1	Message	Date
Ildar Iskhakov	335c8fe65b	Optimize alert and alert group public api endpoints, add filter by id (#1274 ) # What this PR does ## Which issue(s) this PR fixes ## Checklist - [ ] Tests updated - [ ] Documentation added - [ ] `CHANGELOG.md` updated --------- Co-authored-by: Joey Orlando <joey.orlando@grafana.com>	2023-02-03 17:05:08 +08:00
Matvey Kukuy	038310829b	Mobile app documentation draft. (#1207 ) # What this PR does First draft of documentation. @alyssawada please use it as a starting point :) ## Which issue(s) this PR fixes ## Checklist - [ ] Tests updated - [x] Documentation added - [ ] `CHANGELOG.md` updated --------- Co-authored-by: alyssa wada <alyssa.wada@grafana.com> Co-authored-by: Joey Orlando <joey.orlando@grafana.com> Co-authored-by: Alyssa Wada <101596687+alyssawada@users.noreply.github.com> Co-authored-by: Vadim Stepanov <vadimkerr@gmail.com>	2023-02-02 15:06:28 +00:00
Vadim Stepanov	2218161069	Fix test	2023-02-02 14:28:37 +00:00
Vadim Stepanov	08dbab73d2	Remove mobile_app_settings DynamicSetting (#1268 ) # What this PR does Remove checks for `mobile_app_settings` DynamicSetting, so changing `FEATURE_MOBILE_APP_INTEGRATION_ENABLED` is enough for toggling the mobile app backend (aka remove per-org feature flag) Co-authored-by: Joey Orlando <joey.orlando@grafana.com>	2023-02-02 13:21:04 +00:00
Matias Bordese	bc0276fb22	Keep track of direct paging schedule/importance in logs (#1269 ) This will eventually allow to improve responders information in an alert group detail page	2023-02-02 09:21:31 -03:00
Vadim Stepanov	9b709e86c9	Fix local dev setup slowness (#1270 ) # What this PR does Fixes an issue when a local dev setup becomes extremely slow. - Set `DEBUG` and `SILK_PROFILER_ENABLED` to `False` by default + add utility make commands to toggle it - Use `uwsgi` instead of Django's built-in `runserver` for local dev setup - Limit Celery concurrency to 3 for local dev setup (previously was 20, used >1GB RAM on my machine) --------- Co-authored-by: Joey Orlando <joey.orlando@grafana.com>	2023-02-02 09:08:48 +00:00
Ildar Iskhakov	df1517573e	Cache web template rendered fields for alert and alertgroup endpoints (#1261 ) # What this PR does This PR adds same approach as introduced [here](https://github.com/grafana/oncall/pull/1236) to all alert and alertgroup endpoints ## Which issue(s) this PR fixes ## Checklist - [ ] Tests updated - [ ] Documentation added - [ ] `CHANGELOG.md` updated --------- Co-authored-by: Joey Orlando <joey.orlando@grafana.com>	2023-02-02 11:37:52 +08:00
Vadim Stepanov	b7176888ed	Better FCM error handling / retries (#1267 ) # What this PR does Raise `FirebaseError` in celery tasks contacting FCM instead of just logging it + add tests ## Checklist - [x] Tests updated	2023-02-01 14:45:32 +00:00
Matias Bordese	3e15b8cd85	Add default slack channel info to direct paging dialog (#1263 )	2023-02-01 10:03:54 -03:00
Joey Orlando	16196822de	Add utility function to get readonly db key if defined (#1264 ) # What this PR does This is a minor refactor before implementing https://github.com/grafana/oncall-private/issues/1558. Additionally, it cleans up a few spots where we do this: ``` # Re-take in case we are in the readonly db context. ``` We currently don't read anything from a read-only database, so this should be not necessary. ## Checklist - [x] Tests updated - [ ] Documentation added (N/A) - [ ] `CHANGELOG.md` updated (N/A)	2023-02-01 12:07:32 +01:00
Joey Orlando	94fe7979cf	add django-dbconn-retry library (#1262 )	2023-01-31 20:17:54 +01:00
Matias Bordese	b1fc123d9f	Add a filter by involved users to alert groups page (#1240 ) Related to #1119 It also adds a shortcut to filter current user's related alert groups (alert groups user was notified by, or in which user participated). Make the filter visible by default, with a false value.	2023-01-30 14:08:18 +02:00
Vadim Stepanov	f80271a1f4	Return alert group ID in direct paging API (#1241 ) # What this PR does Make direct paging internal API endpoint return an alert group ID. ## Which issue(s) this PR fixes Related to https://github.com/grafana/oncall/issues/823 ## Checklist - [x] Tests updated	2023-01-30 11:48:25 +00:00
Ildar Iskhakov	ae44ee5652	Cache render_for_web field for alertgroups list serializer (#1236 ) # What this PR does This PR caches the field `render_for_web` with lifetime 1 day and cache becomes invalid if it was created before * last alert received * template changed ## Which issue(s) this PR fixes ## Checklist - [ ] Tests updated - [ ] Documentation added - [ ] `CHANGELOG.md` updated	2023-01-28 12:50:41 +08:00
Matias Bordese	e0ae9919c7	Add paging for direct paging users in slack dialog (#1232 ) Fixes issue when there are more than 100 users to be listed in the direct pagination responders select. Alternatively we should consider moving to an `external_select` block later.	2023-01-27 14:10:44 -03:00
Ildar Iskhakov	4a8011d236	Add silk setting to store .prof files in the specific folder and share it between uwsgi workers (#1228 ) # What this PR does ## Which issue(s) this PR fixes ## Checklist - [ ] Tests updated - [ ] Documentation added - [ ] `CHANGELOG.md` updated	2023-01-26 20:33:04 +08:00
Ildar Iskhakov	a6a781320d	Set SILKY_PYTHON_PROFILER_BINARY setting to False by default (#1218 ) # What this PR does Here is the example of the visualisation with `snakeviz` <img width="1126" alt="Screenshot 2023-01-25 at 22 15 49" src="https://user-images.githubusercontent.com/2262529/214586753-ad49a002-27e1-4e44-82f2-4ad5f4e40101.png"> ## Which issue(s) this PR fixes ## Checklist - [ ] Tests updated - [ ] Documentation added - [ ] `CHANGELOG.md` updated	2023-01-25 22:17:17 +08:00
Matias Bordese	dd27b3f2c5	Add schedules support for slack direct paging (#1183 ) Related to #823	2023-01-25 09:10:50 -03:00
Joey Orlando	3cf2fcf660	optimize GET /schedules internal API endpoint (#1169 ) # What this PR does Fixes slow internal`GET /schedules` endpoints. Using the fake-data generation script in #1128, I generated 65 calendar schedules in my local setup. This resulted in the following endpoint performance: ![Screenshot 2023-01-24 at 12 03 16](https://user-images.githubusercontent.com/9406895/214276618-1a9848ba-eb84-49ec-a099-fdd96beac93f.png) The responses which show ~76 queries were run on the latest `dev` branch. Responses w/ ~26 queries were run on this branch. Additionally: - add typing to a few methods in `apps/schedules/ical_utils.py` - document `apps/api/permissions/__init__.py:user_is_authorized` function ## Which issue(s) this PR fixes https://github.com/grafana/oncall-private/issues/1552 ## Checklist - [ ] Tests updated - [ ] Documentation added - [ ] `CHANGELOG.md` updated Co-authored-by: Vadim Stepanov <vadimkerr@gmail.com>	2023-01-25 11:08:09 +01:00
Yulya Artyukhina	de5d876d27	Refactor create/update contact points for Alerting integration (#872 ) What this PR does: - Keep grafana version on create/update contact points to avoid multiple requests to alerting - Add retry limit on create contact point async - Fix bugs related on create contact point - Update logs on create/update contact point, make them more clear - Avoid unnecessary requests to Grafana Alerting	2023-01-25 09:42:42 +01:00
Ildar Iskhakov	1fc3f6d301	Refactor plugin sync (#1200 ) # What this PR does This PR adds a shortcut in the plugin synchronisation process, so existing users will be able to log in without waiting for the sync task. Every request still starts the background synchronisation task, so that organisation changes propagate faster than with the periodic task. It means that we don't necessarily need a "force reload" button in the interface. For all the other cases (user does not exist, organisation token "not ok", etc.) the process remains the same: the plugin will show "Initialising plugin..." until the background task is successfully completed Co-authored-by: Joey Orlando <joey.orlando@grafana.com>	2023-01-25 09:12:08 +08:00
Vadim Stepanov	cf1a1cd7f3	Remove DynamicSetting usage for mobile app backend on OSS (#1204 ) # What this PR does Make so there's no need to populate `mobile_app_settings` DynamicSetting when using the OSS license to turn on the mobile app backend.	2023-01-24 13:53:54 +00:00
Ildar Iskhakov	46b39b2c87	Remove resolved and acknowledged filters as we switched to status (#1201 ) # What this PR does ## Which issue(s) this PR fixes ## Checklist - [ ] Tests updated - [ ] Documentation added - [ ] `CHANGELOG.md` updated	2023-01-24 18:13:21 +08:00
Innokentii Konstantinov	cfa7fb816c	Sync users and teams on tf requests (#1180 ) # What this PR does This PR adds a sync with Grafana on requests from Terraform ## Which issue(s) this PR fixes It's needed to fix the case where customers want to create a team via the Grafana Terraform provider and use it in the OnCall provider without having to log into Grafana Cloud. Co-authored-by: Joey Orlando <joey.orlando@grafana.com>	2023-01-24 13:44:07 +08:00
Vadim Stepanov	ae5949aa7e	Allow viewers fetch cloud connection status (#1181 ) # What this PR does Fixes the issue when users with the viewer role can't fetch the cloud connection status, which makes the plugin fail to load for viewers. This PR makes the cloud connection endpoint use `OTHER_SETTINGS_READ` for fetching the cloud connection status instead of `OTHER_SETTINGS_WRITE`. ## Checklist - [x] Tests updated - [x] `CHANGELOG.md` updated	2023-01-23 11:17:57 +00:00
Ildar Iskhakov	37d25b5b31	Optimize alert group filtering queries (#1191 ) # What this PR does ## Which issue(s) this PR fixes ## Checklist - [ ] Tests updated - [ ] Documentation added - [ ] `CHANGELOG.md` updated	2023-01-23 16:07:55 +08:00
Dan Cech	639fd81644	Update message when user needs to connect their profile (#1190 ) # What this PR does This just tweaks the message users get when they try to interact via slack but haven't connected their profile, it fixes a typo and streamlines the text.	2023-01-23 08:44:33 +01:00
Ildar Iskhakov	b90fe433c9	Optimize alertgroups endpoint (#1189 ) # What this PR does Changing query to retrieve alert group in two completely different queries instead of one with `join` new queries ``` SELECT alerts_alertreceivechannel.id FROM alerts_alertreceivechannel WHERE (alerts_alertreceivechannel.deleted_at IS NULL AND alerts_alertreceivechannel.organization_id = 8 AND alerts_alertreceivechannel.team_id IS NULL) SELECT `alerts_alertgroup`.`id` FROM `alerts_alertgroup` WHERE (`alerts_alertgroup`.`channel_id` IN (2,33,34,35,36,40,52,59,61,62,63,70,76,89,93,94,03,08,09,10,12,13,16,18,20,22,23,24,26,27,28,30,31,33,34,35,36,40,41,42,43,45,48,53,56,57,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,86,87,88,89,91,93,23,27,29,31,32,33,55,56,57,58,65,69,72,75,81,13,17,20,22,33,34,38,39,41,44,45,46,51,52,55,56,58,59,60,63,68,70,71) AND NOT `alerts_alertgroup`.`is_archived` AND NOT `alerts_alertgroup`.`is_archived` AND `alerts_alertgroup`.`root_alert_group_id` IS NULL AND ((NOT `alerts_alertgroup`.`silenced` AND NOT `alerts_alertgroup`.`acknowledged` AND NOT `alerts_alertgroup`.`resolved`) OR (`alerts_alertgroup`.`acknowledged` AND NOT `alerts_alertgroup`.`resolved`)) AND NOT `alerts_alertgroup`.`is_archived`) ORDER BY `alerts_alertgroup`.`id` DESC LIMIT 26 ``` ## Which issue(s) this PR fixes ## Checklist - [ ] Tests updated - [ ] Documentation added - [ ] `CHANGELOG.md` updated	2023-01-22 00:53:11 +08:00
Ildar Iskhakov	c9b83906a0	Optimize alertgroups endpoint (#1188 ) # What this PR does Changing query to retrieve alert group in two requests instead of one with `join` old query: ``` SELECT `alerts_alertgroup`.`id` FROM `alerts_alertgroup` INNER JOIN `alerts_alertreceivechannel` ON (`alerts_alertgroup`.`channel_id` = `alerts_alertreceivechannel`.`id`) WHERE (`alerts_alertreceivechannel`.`organization_id` = 1 AND `alerts_alertreceivechannel`.`team_id` IS NULL AND NOT `alerts_alertgroup`.`is_archived` AND NOT `alerts_alertgroup`.`is_archived` AND `alerts_alertgroup`.`root_alert_group_id` IS NULL AND ((NOT `alerts_alertgroup`.`silenced` AND NOT `alerts_alertgroup`.`acknowledged` AND NOT `alerts_alertgroup`.`resolved`) OR (`alerts_alertgroup`.`acknowledged` AND NOT `alerts_alertgroup`.`resolved`)) AND NOT `alerts_alertgroup`.`is_archived`) ORDER BY `alerts_alertgroup`.`id` DESC LIMIT 26 ``` new query: ``` SELECT "alerts_alertgroup"."id" FROM "alerts_alertgroup" WHERE ("alerts_alertgroup"."channel_id" IN (SELECT U0."id" FROM "alerts_alertreceivechannel" U0 WHERE (NOT (U0."integration" = maintenance) AND U0."deleted_at" IS NULL AND U0."organization_id" = 1 AND U0."team_id" IS NULL)) AND NOT "alerts_alertgroup"."is_archived" AND NOT "alerts_alertgroup"."is_archived" AND "alerts_alertgroup"."root_alert_group_id" IS NULL AND ((NOT "alerts_alertgroup"."silenced" AND NOT "alerts_alertgroup"."acknowledged" AND NOT "alerts_alertgroup"."resolved") OR ("alerts_alertgroup"."acknowledged" AND NOT "alerts_alertgroup"."resolved")) AND NOT "alerts_alertgroup"."is_archived") ORDER BY "alerts_alertgroup"."id" DESC LIMIT 26 ``` ## Which issue(s) this PR fixes ## Checklist - [ ] Tests updated - [ ] Documentation added - [ ] `CHANGELOG.md` updated	2023-01-22 00:14:48 +08:00
Ildar Iskhakov	83b1f069d0	Optimize alertgroups endpoint (#1186 ) # What this PR does ## Which issue(s) this PR fixes ## Checklist - [ ] Tests updated - [ ] Documentation added - [ ] `CHANGELOG.md` updated	2023-01-21 21:59:20 +08:00
Vadim Stepanov	2b0abf018c	Hide direct paging integrations (#1162 ) # What this PR does Hide direct paging integrations from the web UI. Related to https://github.com/grafana/oncall/issues/823 ## Checklist - [x] Tests updated - [ ] Documentation added (N/A) - [ ] `CHANGELOG.md` updated (N/A)	2023-01-20 13:29:57 +00:00
Ildar Iskhakov	0a00d3e2c1	Update base.py	2023-01-20 20:20:51 +08:00
Matias Bordese	693b5a41c4	Add slack command to trigger direct paging (#1154 ) Slash command needs to be added to slack app manifest: ``` slash_commands: - command: /escalate url: https://<oncall-public-url>/slack/interactive_api_endpoint/ description: Create a new alert group escalation should_escape: false ```	2023-01-20 09:06:27 -03:00
Ildar Iskhakov	aec54707ec	Add pyroscope integration (#1176 ) # What this PR does ## Which issue(s) this PR fixes ## Checklist - [ ] Tests updated - [ ] Documentation added - [ ] `CHANGELOG.md` updated	2023-01-20 18:47:16 +08:00
Ildar Iskhakov	c06709fdb6	Move silk profiler under env variable setting (#1175 ) # What this PR does This PR moves silk profiler under the settings flag which can be configured with env vars. It will allow us to enable silk on the clusters, e.g. dev ## Which issue(s) this PR fixes ## Checklist - [ ] Tests updated - [ ] Documentation added - [ ] `CHANGELOG.md` updated	2023-01-20 18:19:31 +08:00
Joey Orlando	98241b9a10	fake-data generation script + fixes for django-silk and django-debug-toolbar (#1128 ) # What this PR does ## Main stuff - add Python script to populate local Grafana/OnCall setup w/ large amounts of fake data. Right now the data types that can be generated are: - teams and Admin users via the Grafana API (must be synced manually by going into the UI before going onto the next step) - Calendar Schedules which have three 8h oncall-shifts, via the OnCall public API - fixes `django-debug-toolbar` when being run in `docker-compose` locally ## Other stuff - documents how to easily modify the Grafana `docker-compose` container provisioning configuration - document solutions for two backend setup related issues encountered when running the engine/celery workers locally, outside of `docker-compose`, on an Apple silicon Mac - fixes small bug in `grafana_plugin.helpers.client.APIClient.call_api` where it would call `response.json()` for all requests, regardless of whether or not the response actually contained data or not - in `engine/settings/dev.py`, properly setup `django-silk` and document the steps to use it locally - make it possible to log out debug SQL queries by specifying `DEV_DEBUG_VIEW_SQL_QUERIES` env var, rather than having to uncomment out a section of `settings/dev.py` ## Which issue(s) this PR fixes - Some local setup issues when trying to use `django-silk` and `django-debug-toolbar` - Makes it much easier to populate your local setup with a lot of fake data - Makes it possible to easily modify your local grafana's provisioning configuration ## Checklist - [ ] Tests updated (N/A) - [ ] Documentation added (N/A) - [ ] `CHANGELOG.md` updated (N/A)	2023-01-20 09:19:41 +01:00
Michael Derynck	cc3fdab8fb	Fix UnboundLocalError in webhooks (#1165 ) Fix error where rendered_data was being used without being defined.	2023-01-19 15:50:22 -07:00
Vadim Stepanov	ccae9d86b3	Add an ability to use an escalation chain for direct paging (#1161 ) # What this PR does Adds an ability to page an escalation chain for a newly created direct paging alert group using the internal API. Also [adds a forgotten migration](`32fc44e744`) related to the direct paging backend. Related to https://github.com/grafana/oncall/issues/823 ## Checklist - [x] Tests updated - [ ] Documentation added (N/A) - [ ] `CHANGELOG.md` updated (N/A)	2023-01-19 18:51:57 +00:00
Yulya Artyukhina	d5461866d1	Add a dummy step for declare incident button in slack (#1157 ) Add a dummy step for declare incident button to prevent raising 'Step is undefined' exception because Slack sends a POST request to the backend upon clicking a button with a redirect link to Incident. This pr doesn't change any functionality	2023-01-19 14:50:02 +01:00
Vadim Stepanov	29f67dc2f3	Fix circular import	2023-01-19 11:53:05 +00:00
Vadim Stepanov	6b87ad74e9	Enforce cloud connection to send push notifications on OSS (#1132 ) This PR modifies how OSS instances send mobile app push notifications. It also adds frontend warnings when user is trying to use the mobile app without connecting to cloud. - [x] Add public API authentication to `FCMRelayView` and throttle the view to 300 push notifications per instance per minute. This is similar to how SMS and phone call notifications work on OSS instances. - [x] Add frontend warnings based on cloud connectivity - [x] Fix/add frontend tests - [x] Add tests for FCMRelayView and mobile app backend ## Screenshots When a user tries to connect the mobile app in his settings and cloud is not connected (clicking "Connect Cloud OnCall" redirects to the "Cloud" tab): <img width="1088" alt="Screenshot 2023-01-12 at 18 48 58" src="https://user-images.githubusercontent.com/20116910/212156591-86906020-eddf-43f1-9402-7ebb7547c7e6.png"> When a user tries to use mobile push notifications as a personal notification step and cloud is not connected: <img width="764" alt="Screenshot 2023-01-12 at 19 01 10" src="https://user-images.githubusercontent.com/20116910/212157580-9abb0758-79ad-4316-b8cd-15b4fff01502.png"> Now on the "Cloud" tab there's some info about the mobile app (the last section at the bottom of the page): <img width="1245" alt="Screenshot 2023-01-12 at 18 49 10" src="https://user-images.githubusercontent.com/20116910/212156997-c8b70dd5-bf15-4bc7-8eb8-9decdb8ecc80.png"> After connecting to the cloud instance, everything goes back to active and it's now possible to connect the mobile app: <img width="1091" alt="Screenshot 2023-01-12 at 19 08 27" src="https://user-images.githubusercontent.com/20116910/212158811-60d49888-4714-4c0e-850f-3ff6a11a117a.png"> After connecting the app the warning is gone: <img width="764" alt="Screenshot 2023-01-12 at 19 07 00" src="https://user-images.githubusercontent.com/20116910/212158614-677ab889-127f-4d64-bacc-0c26887f3097.png">	2023-01-19 11:15:56 +00:00
Vadim Stepanov	c93ee5c554	Send a Slack DM when user is not in channel (#1144 ) # What this PR does Currently, when a user gets mentioned in an alert group thread and the user is not in the Slack channel, the Slack bot sends the following to the channel: > ⚠️ Tried to ask USER to look at incident. Unfortunately USER is not in this channel. Please, invite. This PR changes this behaviour to instead send a direct message to the user. The message contains a link to the main alert group message in Slack. <img width="806" alt="Screenshot 2023-01-17 at 19 25 36" src="https://user-images.githubusercontent.com/20116910/212996457-02db183f-2041-4998-b743-bd5b6c84b7b5.png"> ## Checklist - [ ] Tests updated (N/A) - [ ] Documentation added (N/A) - [x] `CHANGELOG.md` updated	2023-01-18 16:08:15 +00:00
Matias Bordese	90def88752	Add escalation chain option when creating a direct page alert group (#1143 ) Also changes the default integration used when creating an alert group for a direct page to a custom manual integration to avoid conflicts/unexpected behaviors with existing manual alerts.	2023-01-18 12:58:26 -03:00
Vadim Stepanov	b8d78fd6bb	Allow messaging backends to be enabled/disabled per organization (#1151 ) # What this PR does Allows messaging backends to be enabled/disabled per organization when getting a list of available personal notification channels. ## Checklist - [x] Tests updated - [ ] Documentation added (N/A) - [x] `CHANGELOG.md` updated	2023-01-18 15:52:25 +00:00
Matias Bordese	d3062b56fd	Draft initial logic for user/schedule paging (#1098 ) Co-authored-by: Vadim Stepanov <vadimkerr@gmail.com>	2023-01-17 12:19:08 -03:00
Yulya Artyukhina	9129a720ef	Integration with grafana incident (#1081 ) Check if Grafana Incident is enabled. If it is, add a button with a link to declare Grafana Incident from Alert group in Slack and on Web. Co-authored-by: Yulia Shanyrova <yulia.shanyrova@grafana.com>	2023-01-17 13:04:50 +01:00
Tommy	5bd8fbdef8	Add alert groups state filter (#1133 ) # What this PR does This PR added a new parameter (state) into the alert_group public API to filter the state of the alert groups ## Which issue(s) this PR fixes https://github.com/grafana/oncall/issues/684 ## Checklist - [x] Tests updated - [x] Documentation added - [x] `CHANGELOG.md` updated Co-authored-by: Vadim Stepanov <vadimkerr@gmail.com>	2023-01-17 10:28:29 +00:00
Vadim Stepanov	59f2c293e7	Move FCM relay logic into a celery task (#1137 )	2023-01-13 19:28:34 +00:00
Matias Bordese	0d38fe2a7f	Web schedules overrides are the higher priority level (#1115 ) Related to https://github.com/grafana/oncall-private/issues/1550	2023-01-13 08:58:35 -03:00
Innokentii Konstantinov	9a3b53ff34	Delete slack_connector on org soft-delete (#1127 )	2023-01-12 17:37:05 +08:00

607 commits