centralcloud/oncall-engine

Author	SHA1	Message	Date
Ildar Iskhakov	d3c6621dae	Teams redesign (#1528 ) # What this PR does * api returns all the resources available to the user by default * substitutes `team switcher` with `multi-select team filter` * allow referencing between integrations - escalations chains - [schedules, outgoing webhooks] across teams https://user-images.githubusercontent.com/2262529/225634581-2d2e8af2-15ce-4c01-a90e-8267d98f5a23.mov ## Which issue(s) this PR fixes ## Checklist - [ ] Tests updated - [ ] Documentation added - [ ] `CHANGELOG.md` updated --------- Co-authored-by: Maxim <maxim.mordasov@grafana.com> Co-authored-by: Joey Orlando <joey.orlando@grafana.com>	2023-03-22 00:57:20 +08:00
Joey Orlando	4d655dff60	modify check_escalation_finished_task task (#1266 ) # What this PR does This PR: - modifies the `check_escalation_finished_task` celery task to: - do stricter escalation validation based on the alert group's escalation snapshot (see the `audit_alert_group_escalation` method in `engine/apps/alerts/tasks/check_escalation_finished.py` for the validation logic) - use a read-only database for querying alert-groups if one is configured, otherwise use the "default" one - ping a configurable heartbeat (new env var `ALERT_GROUP_ESCALATION_AUDITOR_CELERY_TASK_HEARTBEAT_URL` added) - increase the task frequency from every 10 to every 13 minutes (this can be configured via an env variable) - adds public documentation on how to configure this auditor task - modifies the local celery startup command to properly take into consideration all celery related env vars (similar to the ones we use in `engine/celery_with_exporter.sh`; this made it easier to enable `celery beat` locally for testing) - removes the following code: - removes references to `AlertGroup.estimate_escalation_finish_time` and marks the model field as deprecated using the [`django-deprecate-fields` library](https://pypi.org/project/django-deprecate-fields/). This field was only used for the previous version of this validation task - `EscalationSnapshotMixin.calculate_eta_for_finish_escalation` was only used to calculate the value for `AlertGroup.estimate_escalation_finish_time` - `calculate_escalation_finish_time` celery task ## Which issue(s) this PR fixes https://github.com/grafana/oncall-private/issues/1558 ## Checklist - [x] Tests updated - [x] Documentation added - [x] `CHANGELOG.md` updated	2023-03-17 10:14:08 +00:00
Vadim Stepanov	ea60c0d247	Inbound email integration (#837) This PR adds the Inbound Email integration. It is designed to support a variety of ESPs, but in prod we will use Mailgun, so locally I tested it only with the Mailgun ESP. Important: To make it work on different clusters I'm planning to provide different email domains for different regions, like ....@us.oncall.grafana.net, ...@eu.oncall.grafana.net --------- Co-authored-by: Innokentii Konstantinov <innokenty.konstantinov@grafana.com>	2023-03-16 13:59:21 +08:00
Innokentii Konstantinov	747a2b2bc0	Fix insight_logs for mobile app backend (#1498)	2023-03-08 13:38:59 +00:00
Ildar Iskhakov	2e63a9ff08	Jinja2 based routes (#1319 ) # What this PR does This PR adds the new way to set up routes using jinja2 templating language <img width="1174" alt="Screenshot 2023-03-06 at 22 11 13" src="https://user-images.githubusercontent.com/2262529/223134053-69d43c47-bb2a-4790-a16d-767425017a76.png"> <img width="1175" alt="Screenshot 2023-03-06 at 22 11 34" src="https://user-images.githubusercontent.com/2262529/223134070-1e5ef82f-021c-4d5d-b255-b19bb3445641.png"> ## Which issue(s) this PR fixes ## Checklist - [ ] Tests updated - [ ] Documentation added - [ ] `CHANGELOG.md` updated	2023-03-08 16:42:18 +08:00
Innokentii Konstantinov	a50ec8fed2	Refactor get_user_verbal_for_team_for_slack. (#809 ) Remove unused params from signature, rename	2023-03-07 10:09:37 +00:00
Innokentii Konstantinov	249e4067c4	Remove unused def render_resolution_notes_for_csv_report	2023-03-07 13:47:49 +08:00
Innokentii Konstantinov	6a5e75e083	Fix of templates api behaviour for public and private api (#1408) # What this PR does This PR fixes templates behaviour for public and private api. It fixes "reset to default" for templates from messaging backends and some minor bugs. It also adds acknowledge signal and source link templates ## Checklist - [x] Tests updated - [x] Documentation added - [x] `CHANGELOG.md` updated	2023-03-01 16:32:15 +08:00
Matias Bordese	04c42e2796	Matiasb/fix task refresh ical when empty value (#1401 ) This should fix task error as seen in logs, trying to parse an empty string as ical value: ``` Task apps.schedules.tasks.refresh_ical_files.refresh_ical_file[] raised unexpected: ValueError("Found no components where exactly one is required: ''") ```	2023-02-24 21:16:09 +00:00
Yulya Artyukhina	53af4783de	Fix the cause of retry of notify_all and notify_group tasks (#1376 ) Fix the cause of retry of notify_all and notify_group tasks that was related to an incorrect step order.	2023-02-23 09:28:13 +00:00
Innokentii Konstantinov	26a2bd9c91	Refactor maintenance (#1340) # What this PR does This PR simplifies the maintenance mode code. 1. Perform distribution/escalation maintenance checks in send_signal... tasks. 2. Use the usual alert distribution flow for the maintenance incident. 3. Decouple maintenance mode from slack (all except the notify_about_maintenance_action methods; I don't want to make this PR too big) As a bonus from these changes, maintenance mode now mutes alert group delivery in all chatops integrations, not only in slack. (Before, incidents that happened during maintenance were posted to telegram and msteams anyway) ## Checklist - [ ] Tests updated - [ ] Documentation added - [ ] `CHANGELOG.md` updated	2023-02-23 07:13:03 +00:00
Innokentii Konstantinov	c733d8b9f2	Cleanup ScenarioStep (#1213) # What this PR does This PR cleans up ScenarioStep. It's needed to simplify moving Slack to the messaging backends in future. 1. Introduce AlertGroupSlackService to move logic from ScenarioStep. This also removes the need to import ScenarioSteps in code not related to processing of slack callbacks. 2. Remove tags from ScenarioSteps, they are unused. 3. Remove ScenarioStep.dispatch method. It was just calling ScenarioStep.process_scenario. 4. Remove "action" param from process_scenario, it was unused. 5. Remove creation of SlackActionRecord on handling SlackEvents. We are not using it, but it generates an INSERT query on most of the user-slack interactions. 6. Remove "random_prefix_for_routing" from ScenarioStep, it was unused. ## Which issue(s) this PR fixes ## Checklist - [ ] Tests updated - [ ] Documentation added - [ ] `CHANGELOG.md` updated --------- Co-authored-by: Joey Orlando <joey.orlando@grafana.com>	2023-02-21 20:22:11 +01:00
Yulya Artyukhina	058665b8a8	Fix too long declare incident link (#1342 ) # What this PR does ## Which issue(s) this PR fixes Issue with too long declare incident link in Slack ## Checklist - [x] `CHANGELOG.md` updated	2023-02-20 18:42:44 +08:00
Ildar Iskhakov	1b7ada4315	Add database migrations linter (#1020 ) # What this PR does This PR adds [django-migration-linter](https://github.com/3YOURMIND/django-migration-linter) to keep database migrations backwards compatible - we can automatically run migrations and they are zero-downtime, e.g. old code can work with the migrated database - we can run and rollback migrations without worrying about data safety - OnCall is deployed to the multiple environments core team is not able to control See [django-migration-linter checklist](https://github.com/3YOURMIND/django-migration-linter/blob/main/docs/incompatibilities.md) for the common mistakes and best practices ## Which issue(s) this PR fixes ## Checklist - [ ] Tests updated - [ ] Documentation added - [ ] `CHANGELOG.md` updated --------- Co-authored-by: Joey Orlando <joey.orlando@grafana.com>	2023-02-06 16:01:37 +08:00
Matias Bordese	bc0276fb22	Keep track of direct paging schedule/importance in logs (#1269) This will eventually allow improving the responders information in an alert group detail page	2023-02-02 09:21:31 -03:00
Vadim Stepanov	f80271a1f4	Return alert group ID in direct paging API (#1241 ) # What this PR does Make direct paging internal API endpoint return an alert group ID. ## Which issue(s) this PR fixes Related to https://github.com/grafana/oncall/issues/823 ## Checklist - [x] Tests updated	2023-01-30 11:48:25 +00:00
Ildar Iskhakov	ae44ee5652	Cache render_for_web field for alertgroups list serializer (#1236 ) # What this PR does This PR caches the field `render_for_web` with lifetime 1 day and cache becomes invalid if it was created before * last alert received * template changed ## Which issue(s) this PR fixes ## Checklist - [ ] Tests updated - [ ] Documentation added - [ ] `CHANGELOG.md` updated	2023-01-28 12:50:41 +08:00
Matias Bordese	dd27b3f2c5	Add schedules support for slack direct paging (#1183 ) Related to #823	2023-01-25 09:10:50 -03:00
Yulya Artyukhina	de5d876d27	Refactor create/update contact points for Alerting integration (#872) What this PR does: - Keep grafana version on create/update contact points to avoid multiple requests to alerting - Add retry limit on create contact point async - Fix bugs related to creating contact points - Update logs on create/update contact point, make them clearer - Avoid unnecessary requests to Grafana Alerting	2023-01-25 09:42:42 +01:00
Ildar Iskhakov	37d25b5b31	Optimize alert group filtering queries (#1191 ) # What this PR does ## Which issue(s) this PR fixes ## Checklist - [ ] Tests updated - [ ] Documentation added - [ ] `CHANGELOG.md` updated	2023-01-23 16:07:55 +08:00
Michael Derynck	cc3fdab8fb	Fix UnboundLocalError in webhooks (#1165 ) Fix error where rendered_data was being used without being defined.	2023-01-19 15:50:22 -07:00
Vadim Stepanov	ccae9d86b3	Add an ability to use an escalation chain for direct paging (#1161 ) # What this PR does Adds an ability to page an escalation chain for a newly created direct paging alert group using the internal API. Also [adds a forgotten migration](`32fc44e744`) related to the direct paging backend. Related to https://github.com/grafana/oncall/issues/823 ## Checklist - [x] Tests updated - [ ] Documentation added (N/A) - [ ] `CHANGELOG.md` updated (N/A)	2023-01-19 18:51:57 +00:00
Yulya Artyukhina	d5461866d1	Add a dummy step for declare incident button in slack (#1157 ) Add a dummy step for declare incident button to prevent raising 'Step is undefined' exception because Slack sends a POST request to the backend upon clicking a button with a redirect link to Incident. This pr doesn't change any functionality	2023-01-19 14:50:02 +01:00
Matias Bordese	90def88752	Add escalation chain option when creating a direct page alert group (#1143 ) Also changes the default integration used when creating an alert group for a direct page to a custom manual integration to avoid conflicts/unexpected behaviors with existing manual alerts.	2023-01-18 12:58:26 -03:00
Matias Bordese	d3062b56fd	Draft initial logic for user/schedule paging (#1098 ) Co-authored-by: Vadim Stepanov <vadimkerr@gmail.com>	2023-01-17 12:19:08 -03:00
Yulya Artyukhina	9129a720ef	Integration with grafana incident (#1081 ) Check if Grafana Incident is enabled. If it is, add a button with a link to declare Grafana Incident from Alert group in Slack and on Web. Co-authored-by: Yulia Shanyrova <yulia.shanyrova@grafana.com>	2023-01-17 13:04:50 +01:00
Tommy	5bd8fbdef8	Add alert groups state filter (#1133) # What this PR does This PR adds a new parameter (state) to the alert_group public API to filter alert groups by state ## Which issue(s) this PR fixes https://github.com/grafana/oncall/issues/684 ## Checklist - [x] Tests updated - [x] Documentation added - [x] `CHANGELOG.md` updated Co-authored-by: Vadim Stepanov <vadimkerr@gmail.com>	2023-01-17 10:28:29 +00:00
Innokentii Konstantinov	fa6906a606	Simplify and speed up slack rendering (#1105 ) Simplify and speed up slack rendering.	2023-01-10 15:41:38 +08:00
Joey Orlando	802e3964e9	update mobile app push notification text + make telegram alert verbiage consistent ("Firing" instead of "Alerting") (#1089)	2023-01-05 16:16:43 +01:00
Michael Derynck	7c26eb559b	Improve handling of template exceptions during group data creation (#1068) # What this PR does With the addition of tighter controls on jinja templates, handle exceptions while rendering group data as follows: - Title will cache the error message as the title and display it to the user, and the error will be logged - Group distinction will be left as None and the error will be logged - Is resolve signal will be treated as False and the error will be logged - Is acknowledge signal will be treated as False and the error will be logged ## Which issue(s) this PR fixes https://github.com/grafana/oncall-private/issues/1542	2023-01-03 12:30:59 -07:00
Matias Bordese	05524ab698	Merge pull request #1059 from grafana/matiasb/truncate-slack-title-block Truncate slack alert group title block below max size	2023-01-03 08:50:57 -03:00
Innokentii Konstantinov	5e297847ae	Speedup alert group search	2023-01-03 11:04:16 +08:00
Matias Bordese	75aaeef3f2	Truncate slack alert group title block below max size	2023-01-02 10:07:53 -03:00
Innokentii Konstantinov	41f886b31e	Speedup search alertgroup	2022-12-17 19:34:13 +08:00
Innokentii Konstantinov	7341641b3f	Introduce org uuid (#947 ) * Introduce org uuid * Rename uuid_with_org_id to uuid_with_org_uuid Co-authored-by: Joey Orlando <joey.orlando@grafana.com>	2022-12-06 22:42:58 +08:00
Joey Orlando	ffda80ae34	add permalinks.web attribute to alert group internal/public api response (#953 )	2022-12-06 11:06:05 +01:00
Joey Orlando	9e598385f4	Add RBAC Support (#777) * Modify plugin.json to support RBAC role registration * defines 26 new custom roles in plugin.json. The main roles are: - Admin: read/write access to everything in OnCall - Reader: read access to everything in OnCall - OnCaller: read access to everything in OnCall + edit access to Alert Groups and Schedules - <object-type> Editor: read/write access to everything related to <object-type> - <object-type> Reader: read access for <object-type> - User Settings Admin: read/write access to all user's settings, not just own settings. This is in comparison to User Settings Editor which can only read/write own settings * update changelog and documentation (#686) * implement RBAC for OnCall backend This commit refactors backend authorization. It tries to use RBAC authorization if the org's grafana instance supports it, otherwise it falls back to basic role authorization. * update RBAC backend tests * add tests for RBAC changes - run backend tests as matrix where RBAC is enabled/disabled. When RBAC is enabled, the permissions granted are read from the role grants in the frontend's plugin.json file (instead of relying on what we specify in RBACPermission.Permissions) - remove --reuse-db --nomigrations flags from engine/tox.ini - minor autoformatting changes to docker-compose-developer.yml * remove --ds=settings.ci-test from pytest CI command DJANGO_SETTINGS_MODULE is already specified as an env var so this is just unnecessary duplication * update gitignore * update github action job name for "test" * RBAC frontend changes * refactors the use of basic roles (ex. Viewer, Editor, Admin) to use RBAC permissions (when supported), falling back to basic roles when RBAC is not supported. - updates the UserAction enum in grafana-plugin/src/state/userAction.ts. Previously this was hardcoded to a list of strings that were being returned by the OnCall API. Now the values here correspond to the permissions in plugin.json (plus a fallback role) * changes per Gabriel's comments: - get rid of group attribute in rbac roles - remove displayName role attribute - remove hidden role attribute - add back role to includes section * don't try to update user timezone if they don't have permission	2022-11-29 09:41:56 +01:00
Michael Derynck	3582f9b08f	Improve Jinja Template feedback and error handling (#884 ) * Improve feedback so template errors are given to user * Add security error logging * Add limits for templates, payloads, results * Show popup error notification for webhook errors and template errors that don't have a result * Update tests * Split exceptions into warnings/errors to give more control when previewing, rendering, saving templates * Limit title lengths * Make TypeError a warning * Adjust title length limit * Remove length limiting on urlize since it is being done on template render * Fix tests * Add KeyError and ValueError to warnings * No longer enforcing json result when saving webhook in case it is dependent on payload * Add tests for expected exceptions coming from apply_jinja_template * Update changelog * Send raw post if template result is not JSON	2022-11-28 09:46:51 -07:00
Vadim Stepanov	dc6fcf5c05	Add internal API fields for the mobile app (#910 ) * add permalinks list to internal API alertgroup view * add user's name and full avatar URL to the user view * make avatar_full_url a property * fix tests * fix user connection criteria	2022-11-28 15:52:31 +00:00
Vadim Stepanov	255964ceaf	Mobile app messaging backend (#874 ) * move mobile notifications to a separate backend, remove critical notification * remove outdated mobile app code * MOBILE_APP_PUSH_NOTIFICATIONS_ENABLED -> FEATURE_MOBILE_APP_INTEGRATION_ENABLED * create error log if no devices are set up * move mobile auth related code to the mobile_app Django app * move mobile auth related code to the mobile_app Django app * move mobile auth related code to the mobile_app Django app * fix typing * add GCMDevice todos * add user connection capabilities * add user connect/disconnect to the messaging backend * move APNS endpoint to mobile_app Django app * restore critical notifications * support hackathon app * tweak migrations so mobile app auth tokens are preserved * reuse notify_by IDs * use mobile app template to render push notification * add GCM/FCM (Android) support * fix unlink user * logger.error -> logger.info	2022-11-23 15:56:43 +00:00
Innokentii Konstantinov	0816813237	Handle 404 for get_alerting_config	2022-11-18 17:07:39 +08:00
Innokentii Konstantinov	f9a9c1d978	Cleanup on deletion/archiving of slack channel (#822) * Cleanup on deletion/archiving of slack channel * Bulk update of organizations, filter channel filters by org * Optimize org bulk update	2022-11-16 17:56:05 +08:00
Michael Derynck	25826690a8	Use common environment for templates	2022-11-05 00:31:51 -06:00
Joey Orlando	627afe37e1	Remove references to Alert.migrator_lock attribute This commit patches issue related to #708. #708 forgot to remove attributes on models outside of the migration_tool django app that were referencing model attributes from migration_tool. The only attribute that referenced a field in migration_tool was migrator_lock on the Alert model. This commit removes any references to that attribute.	2022-10-27 13:52:03 +02:00
Matias Bordese	2c8c66a8c8	Not previously handled backends (eg. mobile) could end here without a messaging backend	2022-10-26 09:30:13 -03:00
Michael Derynck	a37df38930	Merge branch 'dev' into mderynck/add-check-notify-group-task	2022-10-25 12:50:12 -06:00
Matias Bordese	8e2bcf5274	Fix failing test related to users org caching	2022-10-25 14:27:27 -03:00
Michael Derynck	ef097fcdd9	Add check for usergroup to notify group task	2022-10-25 10:23:19 -06:00
Innokentii Konstantinov	2c6a27154f	Support multiregion telegram (#676) * Support multiregion telegram * Fix test_personal_message * Fix tg verification code tests * Simplify /start cmd handler * Comment about link with org_id in tg msg	2022-10-25 14:53:07 +08:00
Matias Bordese	eb32fa7ba0	Handle scenario when multiple general team manual integrations are available	2022-10-21 14:23:45 -03:00

123 commits