centralcloud/oncall-engine

Author	SHA1	Message	Date
Ildar Iskhakov	14b692674a	Fix bugs in web title and message templates rendering and visual representation (#1747 ) # What this PR does ## Which issue(s) this PR fixes ## Checklist - [ ] Unit, integration, and e2e (if applicable) tests updated - [ ] Documentation added (or `pr:no public docs` PR label added if not required) - [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not required)	2023-05-02 16:57:52 +08:00
Ildar Iskhakov	071e3c6b1b	Remove template editor from Slack (#1847 ) # What this PR does <img width="521" alt="Screenshot 2023-04-28 at 5 36 10 PM" src="https://user-images.githubusercontent.com/2262529/235112636-56fe0b48-1cda-4ba7-8a09-1cfb0ced2222.png"> ## Which issue(s) this PR fixes ## Checklist - [ ] Unit, integration, and e2e (if applicable) tests updated - [ ] Documentation added (or `pr:no public docs` PR label added if not required) - [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not required)	2023-05-02 14:06:20 +08:00
Joey Orlando	4967ce8208	update web UI, Slack, and Telegram to allow silencing an acknowledged alert group (#1831 ) # What this PR does https://www.loom.com/share/1a6ef0d00c3b46ca80c120579d512dcc ## Checklist - [ ] Unit, integration, and e2e (if applicable) tests updated (N/A) - [x] Documentation added (or `pr:no public docs` PR label added if not required) - [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not required)	2023-04-27 14:52:35 +00:00
Tommy	23c7a6f682	Add 2, 3 and 6 hours silence options (#1822 ) # What this PR does This PR adds additional silence options in the UI. Currently we only have 1 hour, 4 hours and 12 hours silence options. I think it's worth it to have finer silence options. ## Which issue(s) this PR fixes No issue ticket but I can create one. ## Checklist - [ ] Unit, integration, and e2e (if applicable) tests updated - [x] Documentation added (or `pr:no public docs` PR label added if not required) - [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not required)	2023-04-26 02:22:18 +00:00
Ildar Iskhakov	e6ebec1a17	Reuse web templates in other templates (#1786 ) # What this PR does ## Which issue(s) this PR fixes ## Checklist - [ ] Unit, integration, and e2e (if applicable) tests updated - [ ] Documentation added (or `pr:no public docs` PR label added if not required) - [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not required)	2023-04-24 07:44:39 +00:00
Ildar Iskhakov	6e61643750	Limit number of alertmanager alerts in alert group to autoresolve (#1779 ) # What this PR does This PR sets a limit so that workers won't attempt to autoresolve alertmanager alert groups that are too large. ## Which issue(s) this PR fixes ## Checklist - [ ] Unit, integration, and e2e (if applicable) tests updated - [ ] Documentation added (or `pr:no public docs` PR label added if not required) - [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not required)	2023-04-24 05:38:21 +00:00
Michael Derynck	cef748ed4c	Add users_to_be_notified to new webhooks payload (#1798 ) - Change FIRING trigger for webhooks to be sent after escalation snapshot has been computed - Extract users from `notify_to_users_queue` and `notify_schedule` from escalation snapshot to populate `users_to_be_notified` in webhook payload	2023-04-20 16:13:48 +00:00
Joey Orlando	5f9e79d50f	move alert_group.is_restricted to alert_receive_channel.restricted_at (#1770 )	2023-04-18 12:02:56 +02:00
Ildar Iskhakov	cee0fdccd7	Add new field description_short to private and public api (#1698 ) # What this PR does Required for new Integrations page <img width="674" alt="Screenshot 2023-04-04 at 20 32 03" src="https://user-images.githubusercontent.com/2262529/229792240-60783f30-00ba-4dfc-bebd-75d6c2c232e3.png"> ## Which issue(s) this PR fixes ## Checklist - [ ] Unit, integration, and e2e (if applicable) tests updated - [ ] Documentation added (or `pr:no public docs` PR label added if not required) - [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not required) --------- Co-authored-by: Joey Orlando <joey.orlando@grafana.com>	2023-04-18 04:55:55 +00:00
Ildar Iskhakov	f825fdf1a3	Send demo alert with dynamic payload and get demo payload example on private api (#1700 ) # What this PR does ## Which issue(s) this PR fixes ## Checklist - [ ] Unit, integration, and e2e (if applicable) tests updated - [ ] Documentation added (or `pr:no public docs` PR label added if not required) - [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not required)	2023-04-18 02:48:11 +00:00
Joey Orlando	e93347e75b	temporarily disable is_restricted column + migration (#1765 )	2023-04-17 19:30:31 +02:00
Joey Orlando	3b274f45f4	add several new database columns + emit two new Django signals (#1522 ) # What this PR does - add new columns `gcom_org_contract_type`, `gcom_org_irm_sku_subscription_start_date`, and `gcom_org_oldest_admin_with_billing_privileges_user_id` to `user_management_organization` table + `is_restricted` column to `alerts_alertgroup` table - emit two new Django signals - `org_sync_signal` at the end of the `engine/apps/user_management/sync.py::sync_organization` method - `alert_group_created_signal` when a new Alert Group is created ## Checklist - [ ] Tests updated (N/A) - [ ] Documentation added (N/A) - [x] `CHANGELOG.md` updated --------- Co-authored-by: Rares Mardare <rares.mardare@grafana.com>	2023-04-14 09:15:57 +02:00
Ben Sully	303670947b	Fix `web_link` property of AlertGroup (#1738 ) The routing of the OnCall plugin has changed and no longer uses URL params but instead uses paths. This link is used when declaring an Incident from the OnCall Slack alert and needs to match the correct pattern in order for Incident to correctly detect it.	2023-04-12 16:44:27 +00:00
Matias Bordese	2a89374adf	Add escalation chain support for new webhooks (#1654 ) Allow setting a webhook as escalation chain policy step.	2023-04-05 12:03:55 +00:00
Maxim Mordasov	061123e124	Allow changing team for escalation chains (#1658 ) # What this PR does Allows changing team for escalation chains ## Which issue(s) this PR fixes ## Checklist - [ ] Unit, integration, and e2e (if applicable) tests updated - [ ] Documentation added (or `pr:no public docs` PR label added if not required) - [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not required) --------- Co-authored-by: Ildar Iskhakov <ildar.iskhakov@grafana.com> Co-authored-by: Vadim Stepanov <vadimkerr@gmail.com>	2023-03-30 10:43:00 +01:00
Ildar Iskhakov	d3c6621dae	Teams redesign (#1528 ) # What this PR does * api returns all the resources available to the user by default * substitutes `team switcher` with `multi-select team filter` * allow referencing between integrations - escalations chains - [schedules, outgoing webhooks] across teams https://user-images.githubusercontent.com/2262529/225634581-2d2e8af2-15ce-4c01-a90e-8267d98f5a23.mov ## Which issue(s) this PR fixes ## Checklist - [ ] Tests updated - [ ] Documentation added - [ ] `CHANGELOG.md` updated --------- Co-authored-by: Maxim <maxim.mordasov@grafana.com> Co-authored-by: Joey Orlando <joey.orlando@grafana.com>	2023-03-22 00:57:20 +08:00
Joey Orlando	4d655dff60	modify check_escalation_finished_task task (#1266 ) # What this PR does This PR: - modifies the `check_escalation_finished_task` celery task to: - do stricter escalation validation based on the alert group's escalation snapshot (see the `audit_alert_group_escalation` method in `engine/apps/alerts/tasks/check_escalation_finished.py` for the validation logic) - use a read-only database for querying alert-groups if one is configured, otherwise use the "default" one - ping a configurable heartbeat (new env var `ALERT_GROUP_ESCALATION_AUDITOR_CELERY_TASK_HEARTBEAT_URL` added) - increase the task frequency from every 10 to every 13 minutes (this can be configured via an env variable) - adds public documentation on how to configure this auditor task - modifies the local celery startup command to properly take into consideration all celery related env vars (similar to the ones we use in `engine/celery_with_exporter.sh`; this made it easier to enable `celery beat` locally for testing) - removes the following code: - removes references to `AlertGroup.estimate_escalation_finish_time` and marks the model field as deprecated using the [`django-deprecate-fields` library](https://pypi.org/project/django-deprecate-fields/). This field was only used for the previous version of this validation task - `EscalationSnapshotMixin.calculate_eta_for_finish_escalation` was only used to calculate the value for `AlertGroup.estimate_escalation_finish_time` - `calculate_escalation_finish_time` celery task ## Which issue(s) this PR fixes https://github.com/grafana/oncall-private/issues/1558 ## Checklist - [x] Tests updated - [x] Documentation added - [x] `CHANGELOG.md` updated	2023-03-17 10:14:08 +00:00
Vadim Stepanov	ea60c0d247	Inbound email integration (#837 ) This PR adds the Inbound Email integration. It is designed to support a variety of ESPs, but in prod we will use Mailgun, so locally I tested it only with the Mailgun ESP. Important: To make it work on different clusters I'm planning to provide different email domains for different regions, like ....@us.oncall.grafana.net, ...@eu.oncall.grafana.net --------- Co-authored-by: Innokentii Konstantinov <innokenty.konstantinov@grafana.com>	2023-03-16 13:59:21 +08:00
Innokentii Konstantinov	747a2b2bc0	Fix insight_logs for mobile app backend (#1498 )	2023-03-08 13:38:59 +00:00
Ildar Iskhakov	2e63a9ff08	Jinja2 based routes (#1319 ) # What this PR does This PR adds the new way to set up routes using jinja2 templating language <img width="1174" alt="Screenshot 2023-03-06 at 22 11 13" src="https://user-images.githubusercontent.com/2262529/223134053-69d43c47-bb2a-4790-a16d-767425017a76.png"> <img width="1175" alt="Screenshot 2023-03-06 at 22 11 34" src="https://user-images.githubusercontent.com/2262529/223134070-1e5ef82f-021c-4d5d-b255-b19bb3445641.png"> ## Which issue(s) this PR fixes ## Checklist - [ ] Tests updated - [ ] Documentation added - [ ] `CHANGELOG.md` updated	2023-03-08 16:42:18 +08:00
Innokentii Konstantinov	a50ec8fed2	Refactor get_user_verbal_for_team_for_slack. (#809 ) Remove unused params from signature, rename	2023-03-07 10:09:37 +00:00
Innokentii Konstantinov	249e4067c4	Remove unused def render_resolution_notes_for_csv_report	2023-03-07 13:47:49 +08:00
Innokentii Konstantinov	6a5e75e083	Fix of templates api behaviour for public and private api (#1408 ) # What this PR does This PR fixes templates behaviour for public and private api. It fixes "reset to default" for templates from messaging backends and some minor bugs. It also adds acknowledge signal and source link templates ## Checklist - [x] Tests updated - [x] Documentation added - [x] `CHANGELOG.md` updated	2023-03-01 16:32:15 +08:00
Matias Bordese	04c42e2796	Matiasb/fix task refresh ical when empty value (#1401 ) This should fix task error as seen in logs, trying to parse an empty string as ical value: ``` Task apps.schedules.tasks.refresh_ical_files.refresh_ical_file[] raised unexpected: ValueError("Found no components where exactly one is required: ''") ```	2023-02-24 21:16:09 +00:00
Yulya Artyukhina	53af4783de	Fix the cause of retry of notify_all and notify_group tasks (#1376 ) Fix the cause of retry of notify_all and notify_group tasks that was related to an incorrect step order.	2023-02-23 09:28:13 +00:00
Innokentii Konstantinov	26a2bd9c91	Refactor maintenance (#1340 ) # What this PR does This PR simplifies code of maintenance mode. 1. Perform distribution/escalation maintenance checks in send_signal... tasks. 2. Use usual alert distribution flow for the maintenance incident. 3. Decouple maintenance mode from slack (all, except notify_about_maintenance_action methods, I don't want to make this PR too big) As a bonus from these changes, maintenance mode now mute alert group delivery in all chatops integrations, not only in slack. (Before, incidents happened while maintenance were posted to telegram and msteams anyway) ## Checklist - [ ] Tests updated - [ ] Documentation added - [ ] `CHANGELOG.md` updated	2023-02-23 07:13:03 +00:00
Innokentii Konstantinov	c733d8b9f2	Cleanup ScenarioStep (#1213 ) # What this PR does This PR cleanup ScenarioStep. It's needed to simplify moving Slack to the messaging backends in future. 1. Introduce AlertGroupSlackService to move logic from ScenarioStep. Also it allowed to get rid of importing ScenarioSteps in the code not related to processing of slack callbacks. 2. Remove tags from ScenarioSteps, they are unused. 3. Remove ScenarioStep.dispatch method. It just was calling ScenarioStep.process_scenario. 4. Remove "action" param from process_scenario, it was unused. 5. Remove creation of SlackActionRecord on handling SlackEvents. We are not using it, but it generates INSERT query on most of the user-slack interactions. 6. Remove "random_prefix_for_routing" from ScenarioStep, it was unused. ## Which issue(s) this PR fixes ## Checklist - [ ] Tests updated - [ ] Documentation added - [ ] `CHANGELOG.md` updated --------- Co-authored-by: Joey Orlando <joey.orlando@grafana.com>	2023-02-21 20:22:11 +01:00
Yulya Artyukhina	058665b8a8	Fix too long declare incident link (#1342 ) # What this PR does ## Which issue(s) this PR fixes Issue with too long declare incident link in Slack ## Checklist - [x] `CHANGELOG.md` updated	2023-02-20 18:42:44 +08:00
Ildar Iskhakov	1b7ada4315	Add database migrations linter (#1020 ) # What this PR does This PR adds [django-migration-linter](https://github.com/3YOURMIND/django-migration-linter) to keep database migrations backwards compatible - we can automatically run migrations and they are zero-downtime, e.g. old code can work with the migrated database - we can run and rollback migrations without worrying about data safety - OnCall is deployed to the multiple environments core team is not able to control See [django-migration-linter checklist](https://github.com/3YOURMIND/django-migration-linter/blob/main/docs/incompatibilities.md) for the common mistakes and best practices ## Which issue(s) this PR fixes ## Checklist - [ ] Tests updated - [ ] Documentation added - [ ] `CHANGELOG.md` updated --------- Co-authored-by: Joey Orlando <joey.orlando@grafana.com>	2023-02-06 16:01:37 +08:00
Matias Bordese	bc0276fb22	Keep track of direct paging schedule/importance in logs (#1269 ) This will eventually allow to improve responders information in an alert group detail page	2023-02-02 09:21:31 -03:00
Vadim Stepanov	f80271a1f4	Return alert group ID in direct paging API (#1241 ) # What this PR does Make direct paging internal API endpoint return an alert group ID. ## Which issue(s) this PR fixes Related to https://github.com/grafana/oncall/issues/823 ## Checklist - [x] Tests updated	2023-01-30 11:48:25 +00:00
Ildar Iskhakov	ae44ee5652	Cache render_for_web field for alertgroups list serializer (#1236 ) # What this PR does This PR caches the field `render_for_web` with lifetime 1 day and cache becomes invalid if it was created before * last alert received * template changed ## Which issue(s) this PR fixes ## Checklist - [ ] Tests updated - [ ] Documentation added - [ ] `CHANGELOG.md` updated	2023-01-28 12:50:41 +08:00
Matias Bordese	dd27b3f2c5	Add schedules support for slack direct paging (#1183 ) Related to #823	2023-01-25 09:10:50 -03:00
Yulya Artyukhina	de5d876d27	Refactor create/update contact points for Alerting integration (#872 ) What this PR does: - Keep grafana version on create/update contact points to avoid multiple requests to alerting - Add retry limit on create contact point async - Fix bugs related to creating contact points - Update logs on create/update contact point to make them clearer - Avoid unnecessary requests to Grafana Alerting	2023-01-25 09:42:42 +01:00
Ildar Iskhakov	37d25b5b31	Optimize alert group filtering queries (#1191 ) # What this PR does ## Which issue(s) this PR fixes ## Checklist - [ ] Tests updated - [ ] Documentation added - [ ] `CHANGELOG.md` updated	2023-01-23 16:07:55 +08:00
Michael Derynck	cc3fdab8fb	Fix UnboundLocalError in webhooks (#1165 ) Fix error where rendered_data was being used without being defined.	2023-01-19 15:50:22 -07:00
Vadim Stepanov	ccae9d86b3	Add an ability to use an escalation chain for direct paging (#1161 ) # What this PR does Adds an ability to page an escalation chain for a newly created direct paging alert group using the internal API. Also [adds a forgotten migration](`32fc44e744`) related to the direct paging backend. Related to https://github.com/grafana/oncall/issues/823 ## Checklist - [x] Tests updated - [ ] Documentation added (N/A) - [ ] `CHANGELOG.md` updated (N/A)	2023-01-19 18:51:57 +00:00
Yulya Artyukhina	d5461866d1	Add a dummy step for declare incident button in slack (#1157 ) Add a dummy step for declare incident button to prevent raising 'Step is undefined' exception because Slack sends a POST request to the backend upon clicking a button with a redirect link to Incident. This pr doesn't change any functionality	2023-01-19 14:50:02 +01:00
Matias Bordese	90def88752	Add escalation chain option when creating a direct page alert group (#1143 ) Also changes the default integration used when creating an alert group for a direct page to a custom manual integration to avoid conflicts/unexpected behaviors with existing manual alerts.	2023-01-18 12:58:26 -03:00
Matias Bordese	d3062b56fd	Draft initial logic for user/schedule paging (#1098 ) Co-authored-by: Vadim Stepanov <vadimkerr@gmail.com>	2023-01-17 12:19:08 -03:00
Yulya Artyukhina	9129a720ef	Integration with grafana incident (#1081 ) Check if Grafana Incident is enabled. If it is, add a button with a link to declare Grafana Incident from Alert group in Slack and on Web. Co-authored-by: Yulia Shanyrova <yulia.shanyrova@grafana.com>	2023-01-17 13:04:50 +01:00
Tommy	5bd8fbdef8	Add alert groups state filter (#1133 ) # What this PR does This PR added a new parameter (state) into the alert_group public API to filter the state of the alert groups ## Which issue(s) this PR fixes https://github.com/grafana/oncall/issues/684 ## Checklist - [x] Tests updated - [x] Documentation added - [x] `CHANGELOG.md` updated Co-authored-by: Vadim Stepanov <vadimkerr@gmail.com>	2023-01-17 10:28:29 +00:00
Innokentii Konstantinov	fa6906a606	Simplify and speed up slack rendering (#1105 ) Simplify and speed up slack rendering.	2023-01-10 15:41:38 +08:00
Joey Orlando	802e3964e9	update mobile app push notification text + make telegram alert verbiage consistent ("Firing" instead of "Alerting") (#1089 )	2023-01-05 16:16:43 +01:00
Michael Derynck	7c26eb559b	Improve handling of template exceptions during group data creation (#1068 ) # What this PR does With the addition of tighter controls on jinja templates handle exceptions while rendering group data as follows: - Title will cache error message as title and display to user and the error will be logged - Group distinction will be left as None and the error will be logged - Is resolve signal will be treated as False and the error will be logged - Is acknowledge signal will be treated as False and the error will be logged ## Which issue(s) this PR fixes https://github.com/grafana/oncall-private/issues/1542	2023-01-03 12:30:59 -07:00
Matias Bordese	05524ab698	Merge pull request #1059 from grafana/matiasb/truncate-slack-title-block Truncate slack alert group title block below max size	2023-01-03 08:50:57 -03:00
Innokentii Konstantinov	5e297847ae	Speedup alert group search	2023-01-03 11:04:16 +08:00
Matias Bordese	75aaeef3f2	Truncate slack alert group title block below max size	2023-01-02 10:07:53 -03:00
Innokentii Konstantinov	41f886b31e	Speedup search alertgroup	2022-12-17 19:34:13 +08:00
Innokentii Konstantinov	7341641b3f	Introduce org uuid (#947 ) * Introduce org uuid * Rename uuid_with_org_id to uuid_with_org_uuid Co-authored-by: Joey Orlando <joey.orlando@grafana.com>	2022-12-06 22:42:58 +08:00

138 commits