Commit graph

141 commits

Author SHA1 Message Date
Matias Bordese
cc356c9d54
chore: capitalize Slack name references (#5421)
Related to https://github.com/grafana/irm/issues/425
2025-01-21 17:05:39 +00:00
Joey Orlando
5227ee3798
chore: random slack code cleanup (#5307)
# What this PR does

Related to https://github.com/grafana/oncall/pull/5287

A few random cleanups, type improvements, etc.

Additionally, fixes a change made in #5292: we should wait to read from
`slack_message.channel.slack_id` until we've performed the
data migration mentioned in that PR (in the meantime we should continue
to use `slack_message._channel_id`).

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] Added the relevant release notes label (see labels prefixed w/
`release:`). These labels dictate how your PR will
    show up in the autogenerated release notes.
2024-11-29 13:21:29 +00:00
Joey Orlando
a29e35c25a
refactor SlackMessage.channel_id (CHAR field) to SlackMessage.channel (foreign key relationship) (#5292)
# What this PR does

Related to https://github.com/grafana/oncall-private/issues/2947

**NOTE**

This PR introduces steps 1 and 2 of the three-part migration proposed
[here](https://raintank-corp.slack.com/archives/C06K1MQ07GS/p1732555465144099).
Step 3, swapping reads to the new column and dropping
dual writes, will be done in a future PR/release.
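
The dual-write phase described above can be sketched roughly as follows. This is a framework-free illustration, not the actual model code: the `SlackMessageSketch` and `SlackChannelSketch` classes, the `set_channel` helper, and the `channel_slack_id` property are hypothetical names. Only `_channel_id` and `channel` come from this PR series.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SlackChannelSketch:
    """Hypothetical stand-in for a SlackChannel row."""
    slack_id: str


@dataclass
class SlackMessageSketch:
    """Hypothetical stand-in for SlackMessage during the migration window."""
    _channel_id: Optional[str] = None              # legacy CHAR column
    channel: Optional[SlackChannelSketch] = None   # new foreign key

    def set_channel(self, channel: SlackChannelSketch) -> None:
        # Dual write: populate both columns so either can be read later.
        self._channel_id = channel.slack_id
        self.channel = channel

    @property
    def channel_slack_id(self) -> Optional[str]:
        # Reads stay on the legacy column until old rows are backfilled;
        # the final step would switch this to self.channel.slack_id.
        return self._channel_id


message = SlackMessageSketch()
message.set_channel(SlackChannelSketch(slack_id="C043LL6RTS7"))
```

Rows written before the dual write started have no `channel` yet, which is why reads in this sketch stay on the legacy column.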

---

I'm tackling this work now because _ultimately_ I want to move
`AlertReceiveChannel.rate_limited_in_slack_at` to
`SlackChannel.rate_limited_at`, but first I need to refactor
`SlackMessage.channel_id` from a `CHAR` field to a foreign key
relationship. In the spots where we touch Slack rate limiting,
like
[here](https://github.com/grafana/oncall/blob/dev/engine/apps/slack/alert_group_slack_service.py#L42-L50)
for example, we only have `slack_message.channel_id`, which means I need
to do extra queries to fetch the appropriate `SlackChannel` to then be
able to get/set `SlackChannel.rate_limited_at`.

Other minor changes:
- it also prepares us to drop `SlackMessage._slack_team_identity`. We
already have a `@property` of `SlackMessage.slack_team_identity` (which
[previously had some hacky
logic](https://github.com/grafana/oncall/blob/dev/engine/apps/slack/models/slack_message.py#L74-L84)).
I've refactored `SlackMessage.slack_team_identity` to simply point to
`self.organization.slack_team_identity` + updated our code to _stop_
setting `SlackMessage._slack_team_identity` (will drop this column in
future release)

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] Added the relevant release notes label (see labels prefixed w/
`release:`). These labels dictate how your PR will
    show up in the autogenerated release notes.
2024-11-26 11:03:38 +00:00
Michael Derynck
757f0d1ce0
fix: remove notification failure policy log record when prevent posting is set (#5260)
# What this PR does
Changes `UserNotificationPolicyLogRecord` to success when
`slack_prevent_posting` is set, since the user has already been notified in
Slack or by another method in their personal notification preferences.
These entries are also filtered out of the alert group history
timeline, because they led users to think notifications
had failed when in fact they had already been sent.

## Which issue(s) this PR closes

https://github.com/grafana/support-escalations/issues/13236

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] Added the relevant release notes label (see labels prefixed w/
`release:`). These labels dictate how your PR will
    show up in the autogenerated release notes.
2024-11-20 17:14:14 +00:00
Michael Derynck
2024ee7f78
feat: Auto retry escalation on failed audit (#5265)
# What this PR does
Automatically retries escalation when alert groups fail auditing. This
is the same effect as the continue_escalation command without any of the
extra arguments.

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] Added the relevant release notes label (see labels prefixed w/
`release:`). These labels dictate how your PR will
    show up in the autogenerated release notes.
2024-11-19 22:23:15 +00:00
Joey Orlando
4a5c4263e0
feat: convert schedule.channel (char field) to schedule.slack_channel (foreign key) (#5199)
# What this PR does

`OnCallSchedule` equivalent of
https://github.com/grafana/oncall/pull/5191.

**NOTE**: merge after https://github.com/grafana/oncall/pull/5224 (so
that I can use some of the new serializer fields defined in there)

### Migration
```bash
Running migrations:
source=engine:app google_trace_id=none logger=apps.schedules.migrations.0019_auto_20241021_1735 Starting migration to populate slack_channel field.
source=engine:app google_trace_id=none logger=apps.schedules.migrations.0019_auto_20241021_1735 Total schedules to process: 1
source=engine:app google_trace_id=none logger=apps.schedules.migrations.0019_auto_20241021_1735 Schedule 26 updated with SlackChannel 2 (slack_id: C043LL6RTS7).
source=engine:app google_trace_id=none logger=apps.schedules.migrations.0019_auto_20241021_1735 Bulk updated 1 OnCallSchedules with their Slack channel.
source=engine:app google_trace_id=none logger=apps.schedules.migrations.0019_auto_20241021_1735 Finished migration. Total schedules processed: 1. Schedules updated: 1. Missing SlackChannels: 0.
  Applying schedules.0019_auto_20241021_1735... OK
```

### Tested Public API
```txt
POST {{oncall_host}}/api/v1/schedules/
Authorization: {{oncall_api_key}}
Content-Type: application/json

{
    "name": "Demo testy testy2",
    "type": "web",
    "time_zone": "America/Los_Angeles",
    "slack": {
        "channel_id": "C05PPLYN1U1"
    }
}

HTTP/1.1 201 Created
Content-Type: application/json
Vary: Accept, Origin
Allow: GET, POST, HEAD, OPTIONS
X-Frame-Options: DENY
Content-Length: 198
X-Content-Type-Options: nosniff
Referrer-Policy: same-origin
Cross-Origin-Opener-Policy: same-origin

{
  "id": "SBBN73UTUTVCE",
  "team_id": null,
  "name": "Demo testy testy2",
  "time_zone": "America/Los_Angeles",
  "on_call_now": [],
  "shifts": [],
  "slack": {
    "channel_id": "C05PPLYN1U1",
    "user_group_id": null
  },
  "type": "web"
}
```

### Tested via UI (i.e., internal API)

https://www.loom.com/share/e66bf3468b144dd782da5eb6e0bfd0af

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] Added the relevant release notes label (see labels prefixed w/
`release:`). These labels dictate how your PR will
    show up in the autogenerated release notes.
2024-11-04 14:27:21 -05:00
Joey Orlando
deb6a45588
chore: convert two slack channel ID char fields to foreign keys (#5224)
# What this PR does

Similar to https://github.com/grafana/oncall/pull/5199

Converts the following char fields to foreign key relationships on the
`SlackChannel` table:
- `ResolutionNoteSlackMessage.channel_id` ->
`ResolutionNoteSlackMessage.slack_channel`
- `ChannelFilter.slack_channel_id` -> `ChannelFilter.slack_channel`

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] Added the relevant release notes label (see labels prefixed w/
`release:`). These labels dictate how your PR will
    show up in the autogenerated release notes.
2024-11-04 13:34:06 -05:00
Matias Bordese
8dc90230d7
Update shift change notification to consider microsecond timestamps (#5196)
Related to https://github.com/grafana/support-escalations/issues/12893
2024-10-21 16:48:07 +00:00
Yulya Artyukhina
8420cfd822
Fix acknowledge reminder task (#5179)
# What this PR does
- Adds a 10-minute lock for the acknowledge reminder task to prevent
duplicate tasks, which cause multiple reminder messages to be posted,
flooding Slack threads.
- Adds a new signal for the acknowledge reminder task instead of using
`alert_group_action_triggered_signal`, since it is used only to post the
reminder message in the Slack thread and does not need to be processed by
other representatives.
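
The locking idea in the first bullet can be illustrated with a small in-memory sketch. Everything here is an assumption made for the example: the `InMemoryLock` class, the `acknowledge_reminder_sketch` function, and the key format are hypothetical, and a real deployment would need a lock store shared between workers rather than a per-process dictionary.

```python
import time
from typing import Dict, Optional

LOCK_TTL_SECONDS = 10 * 60  # the 10-minute window mentioned above


class InMemoryLock:
    """Toy lock store; only works within a single process."""

    def __init__(self) -> None:
        self._expires_at: Dict[str, float] = {}

    def acquire(self, key: str, ttl: float, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        expires_at = self._expires_at.get(key)
        if expires_at is not None and expires_at > now:
            return False  # lock still held: treat this run as a duplicate
        self._expires_at[key] = now + ttl
        return True


def acknowledge_reminder_sketch(locks: InMemoryLock, alert_group_id: int,
                                now: Optional[float] = None) -> str:
    # Hypothetical key format, chosen only for this example.
    key = f"acknowledge_reminder_{alert_group_id}"
    if not locks.acquire(key, LOCK_TTL_SECONDS, now=now):
        return "skipped"
    return "reminder posted"
```

A second call for the same alert group inside the window returns `"skipped"`, so only one reminder is posted per window.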

## Which issue(s) this PR closes

Related to https://github.com/grafana/oncall-private/issues/2953

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] Added the relevant release notes label (see labels prefixed w/
`release:`). These labels dictate how your PR will
    show up in the autogenerated release notes.
2024-10-16 12:13:28 +00:00
Matias Bordese
46679606ac
Reschedule rate limited telegram task instead of retry (#5178)
2024-10-15 13:35:54 +00:00
Yulya Artyukhina
f35f66a6ea
Fix retrying tasks (#5160)
# What this PR does
Small fixes for some retrying tasks

## Which issue(s) this PR closes
https://github.com/grafana/oncall-private/issues/2965
https://github.com/grafana/oncall-private/issues/2964

## Checklist

- [ ] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] Added the relevant release notes label (see labels prefixed w/
`release:`). These labels dictate how your PR will
    show up in the autogenerated release notes.
2024-10-10 15:37:30 +00:00
Matias Bordese
b79c5d0c1c
Skip user notification if alert group is already resolved (#5145)
Sometimes a task is queued and scheduled for later (or for a retry too),
but the alert group is resolved by then. Skip notification in that case.
2024-10-09 18:10:39 +00:00
Matias Bordese
fa815b7ecd
Reworked declare incident escalation step (#5130)
Reworked https://github.com/grafana/oncall/pull/5047. Main update is the
switch from FK to a [M2M
relation](https://docs.google.com/document/d/1HeulqxoFShSHtInQrZNJLL5MDlHPNT50rVGaK3zZWvw/edit?disco=AAABVLjV4W8)
(which doesn't really change the original/intended behavior, besides not
needing to alter the alert group table, and it is a bit more flexible;
the extra table shouldn't introduce issues because this is used only for
tracking purposes and the information needed in the log record is
already there).

Avoid a db migration involving alert group table:

```
--
-- Create model RelatedIncident
--
CREATE TABLE `alerts_relatedincident` (`id` bigint AUTO_INCREMENT NOT NULL PRIMARY KEY, `incident_id` varchar(50) NOT NULL, `created_at` datetime(6) NOT NULL, `is_active` bool NOT NULL, `channel_filter_id` bigint NULL, `organization_id` bigint NOT NULL);
CREATE TABLE `alerts_relatedincident_attached_alert_groups` (`id` bigint AUTO_INCREMENT NOT NULL PRIMARY KEY, `relatedincident_id` bigint NOT NULL, `alertgroup_id` bigint NOT NULL);
ALTER TABLE `alerts_relatedincident` ADD CONSTRAINT `alerts_relatedincident_organization_id_incident_id_d7fc9a4f_uniq` UNIQUE (`organization_id`, `incident_id`);
ALTER TABLE `alerts_relatedincident` ADD CONSTRAINT `alerts_relatedincide_channel_filter_id_9556c836_fk_alerts_ch` FOREIGN KEY (`channel_filter_id`) REFERENCES `alerts_channelfilter` (`id`);
ALTER TABLE `alerts_relatedincident` ADD CONSTRAINT `alerts_relatedincide_organization_id_74ed6bed_fk_user_mana` FOREIGN KEY (`organization_id`) REFERENCES `user_management_organization` (`id`);
CREATE INDEX `alerts_relatedincident_incident_id_8356a799` ON `alerts_relatedincident` (`incident_id`);
ALTER TABLE `alerts_relatedincident_attached_alert_groups` ADD CONSTRAINT `alerts_relatedincident_a_relatedincident_id_alert_3d683baa_uniq` UNIQUE (`relatedincident_id`, `alertgroup_id`);
ALTER TABLE `alerts_relatedincident_attached_alert_groups` ADD CONSTRAINT `alerts_relatedincide_relatedincident_id_3e5e7a23_fk_alerts_re` FOREIGN KEY (`relatedincident_id`) REFERENCES `alerts_relatedincident` (`id`);
ALTER TABLE `alerts_relatedincident_attached_alert_groups` ADD CONSTRAINT `alerts_relatedincide_alertgroup_id_0125deca_fk_alerts_al` FOREIGN KEY (`alertgroup_id`) REFERENCES `alerts_alertgroup` (`id`);
```
2024-10-07 19:26:10 +00:00
Matias Bordese
62ab3f1f62
Revert declared incident model related changes (#5116)
2024-10-02 21:34:20 +00:00
Yulya Artyukhina
70b7273078
Add declare incident step and model (#5047)
Related to https://github.com/grafana/oncall-private/issues/2831

## Checklist

- [ ] Unit, integration, and e2e (if applicable) tests updated
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required)
- [ ] Added the relevant release notes label (see labels prefixed w/
`release:`). These labels dictate how your PR will
    show up in the autogenerated release notes.

---------

Co-authored-by: Matias Bordese <mbordese@gmail.com>
Co-authored-by: Dominik <dominik.broj@grafana.com>
2024-10-02 13:38:33 +00:00
Vadim Stepanov
7b74b65168
Fix direct paging ack (#4957)
# What this PR does

Fixes https://github.com/grafana/oncall/issues/4760 (also provides a
workaround for https://github.com/grafana/oncall/issues/4761)

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] Added the relevant release notes label (see labels prefixed w/
`release:`). These labels dictate how your PR will
    show up in the autogenerated release notes.
2024-08-30 09:17:09 +00:00
Yulya Artyukhina
64bf1e5096
Speed up internal api endpoints (#4830)
# What this PR does
Reduces the number of database calls for the `/schedules`, `/alertgroups` and
`/users` endpoints.
Fixes an issue where an additional database call was made to get the
organization URL when building the full user avatar link.

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] Added the relevant release notes label (see labels prefixed w/
`release:`). These labels dictate how your PR will
    show up in the autogenerated release notes.
2024-08-15 14:20:55 +00:00
Yulya Artyukhina
1a6d77888e
Fix notification plan builder (#4726)
# What this PR does
Fixes building notification plan if one or more notifications were
bundled

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] Added the relevant release notes label (see labels prefixed w/
`release:`). These labels dictate how your PR will
    show up in the autogenerated release notes.
2024-07-24 15:49:03 +00:00
Yulya Artyukhina
49d1127698
Fix send_bundled_notification task (#4696)
# What this PR does
Fix scheduling `perform_notification` from `send_bundled_notification`
task - leftover after resolving merge conflict

## Checklist

- [ ] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] Added the relevant release notes label (see labels prefixed w/
`release:`). These labels dictate how your PR will
    show up in the autogenerated release notes.
2024-07-17 17:00:26 +00:00
Yulya Artyukhina
191814b25e
User notifications bundle (#4457)
# What this PR does
This PR adds two new models: UserNotificationBundle and
BundledNotification (proposals for naming are welcome).

`UserNotificationBundle` manages the information about last notification
time and scheduled notification task for bundled notifications. It is
unique per user + notification_channel + notification importance.

`BundledNotification` contains notification policy and alert group, that
triggered the notification. The BundledNotification instance is created
in `notify_user_task` for every notification, that should be bundled,
and is attached to UserNotificationBundle by ForeignKey connection.

How it works:
If the user was notified recently (within the last two minutes) by the
current notification channel, and this channel is bundlable, a
`BundledNotification` instance is created and attached to the
`UserNotificationBundle` instance, and the `send_bundled_notification` task
is scheduled to execute in 2 minutes.
In the `send_bundled_notification` task we get all `BundledNotification`
instances attached to the current `UserNotificationBundle` instance and check
whether their alert groups are still active. If there is only one
notification, we perform a regular notification by calling the
`perform_notification` task; otherwise we call the
`notify_by_<channel>_bundle` method for the current notification channel.
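
The two decisions described above can be sketched as plain functions. This is an illustration under stated assumptions, not the code from this PR: `should_bundle`, `pick_send_action`, and the `"nothing_to_send"` outcome for an empty bundle are hypothetical; the two-minute window and the `perform_notification` / `notify_by_<channel>_bundle` names come from the description.

```python
from datetime import datetime, timedelta
from typing import List, Optional

BUNDLE_WINDOW = timedelta(minutes=2)  # "within the last two minutes"


def should_bundle(last_notified_at: Optional[datetime], now: datetime,
                  channel_is_bundlable: bool) -> bool:
    """Return True when a notification should be queued instead of sent right away."""
    if not channel_is_bundlable or last_notified_at is None:
        return False
    return now - last_notified_at <= BUNDLE_WINDOW


def pick_send_action(active_alert_group_ids: List[int], channel: str) -> str:
    """Decide what the scheduled bundle task does with the still-active notifications."""
    if not active_alert_group_ids:
        return "nothing_to_send"  # assumed outcome; not stated in the description
    if len(active_alert_group_ids) == 1:
        return "perform_notification"
    return f"notify_by_{channel}_bundle"
```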

PR with method to send notification bundle by SMS -
https://github.com/grafana/oncall/pull/4624

**This feature is disabled by default by feature flag. Public docs will
be added in a separate PR with enabling this feature.**
## Which issue(s) this PR closes
related to https://github.com/grafana/oncall-private/issues/2712

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] Added the relevant release notes label (see labels prefixed w/
`release:`). These labels dictate how your PR will
    show up in the autogenerated release notes.
2024-07-16 11:24:08 +00:00
Joey Orlando
34a90134fb
patch default user notification policy changes + fix failing e2e test (#4635)
# What this PR does

This is a follow-up PR to https://github.com/grafana/oncall/pull/4628.
As @Ferril pointed out, there was a slight issue in
`apps.alerts.tasks.notify_user.perform_notification` method when using a
"fallback"/default user notification policy. This is because the
`log_record_pk` arg passed into `perform_notification` will fetch the
`UserNotificationPolicyLogRecord` object, but that object will have a
`notification_policy` set to `None` (because there's no persistent
`UserNotificationPolicy` object to refer to).

Instead, we now pass a second argument to `perform_notification`,
`use_default_notification_policy_fallback`. If it is true, we simply grab
the transient/in-memory `UserNotificationPolicy` and use that inside
this task.
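
A rough sketch of the fallback selection, using plain dataclasses instead of the real models. `PolicySketch`, `LogRecordSketch`, `default_policy_sketch`, and `resolve_policy` are hypothetical names, and the `"email"` default is an arbitrary placeholder; only the `use_default_notification_policy_fallback` flag and the `notification_policy` attribute come from the description above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PolicySketch:
    """Hypothetical stand-in for UserNotificationPolicy."""
    notify_by: str
    persisted: bool = True


@dataclass
class LogRecordSketch:
    """Hypothetical stand-in for UserNotificationPolicyLogRecord."""
    notification_policy: Optional[PolicySketch]


def default_policy_sketch() -> PolicySketch:
    # Transient object built in memory; it is never saved, so a log record
    # cannot reference it. The "email" value is a placeholder.
    return PolicySketch(notify_by="email", persisted=False)


def resolve_policy(log_record: LogRecordSketch,
                   use_default_notification_policy_fallback: bool) -> Optional[PolicySketch]:
    if use_default_notification_policy_fallback:
        return default_policy_sketch()
    return log_record.notification_policy
```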

Related to https://github.com/grafana/oncall/issues/4410

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] Added the relevant release notes label (see labels prefixed w/
`release:`). These labels dictate how your PR will
    show up in the autogenerated release notes.
2024-07-09 11:23:53 -04:00
Joey Orlando
af99d62a32
fix failing e2e tests
2024-07-08 13:04:16 -04:00
Joey Orlando
0163b58399
notify user task patch + small update to user notification rules public API docs (#4628)
# What this PR does

Patches a small bug noticed (locally) by @Ferril 🙏 + updates our user
notification rules public API docs to include `notify_by_msteams` as a
valid `type` value (cloud only)

## Checklist

- [ ] Unit, integration, and e2e (if applicable) tests updated
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] Added the relevant release notes label (see labels prefixed w/
`release:`). These labels dictate how your PR will
    show up in the autogenerated release notes.
2024-07-08 11:52:20 -04:00
Joey Orlando
abedea72bf
don't force create default user notification policies (#4608)
# What this PR does

Related to https://github.com/grafana/oncall/issues/4410

The changes in this PR are a prerequisite to
https://github.com/grafana/terraform-provider-grafana/pull/1653. See the
conversation
[here](https://raintank-corp.slack.com/archives/C04JCU51NF8/p1719806995902499?thread_ts=1719520920.744319&cid=C04JCU51NF8)
for more context on why we decided to move away from always creating
default personal notification rules for users.

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] Added the relevant release notes label (see labels prefixed w/
`release:`). These labels dictate how your PR will
    show up in the autogenerated release notes.
2024-07-05 15:08:17 -04:00
Matias Bordese
2aa8639e2a
Update escalation auditor logs to expose succeeding count (#4431)
Related to https://github.com/grafana/oncall-private/issues/2619
(we need the succeeding number to make the SLO query happy with
cluster/namespace filtering)
2024-05-31 18:29:31 +00:00
Matias Bordese
08d1e00430
Update escalation auditor to log total and failed escalations info (#4425)
Related to https://github.com/grafana/oncall-private/issues/2619
2024-05-30 18:53:53 +00:00
Matias Bordese
d316c9121e
Fix order filtering when executing notify all/group steps from snapshot (#4381)
Fixes https://github.com/grafana/oncall-private/issues/2708
2024-05-23 12:36:28 +00:00
Matias Bordese
65ee57f563
Ignore uncompleted notifications if policy is deleted (#4260)
Related to https://github.com/grafana/oncall-private/issues/2637
2024-04-23 11:40:24 +00:00
Michael Derynck
d75590b943
Handle alert group deleted when task is already queued (#4230)
# What this PR does
- Since send_alert_create_signal is inside transaction on_commit we can
conclude that if it does not exist it was intentionally deleted before
the task could run and the task can exit instead of retrying
- Improve logging when send_alert_create_signal is called so both alert
and alert group are in the same line so you don't need to search the
logs as much
- Improve logging on public api delete alert group so we can know what
the alert group belonged to and the responsible user/org
- Remove distribute_alerts (Stopped using a while back, code should be
safe to remove now, no tasks running in system)

## Which issue(s) this PR closes

Closes https://github.com/grafana/oncall-private/issues/2640

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] Added the relevant release notes label (see labels prefixed w/
`release:`). These labels dictate how your PR will
    show up in the autogenerated release notes.
2024-04-16 14:39:00 +00:00
Joey Orlando
c5cd675738
cleanup CustomButton backend code + add ngrok/express outgoing webhook e2e test (#2544)
# What this PR does

- removes unused "custom button" backend code now that we've migrated to
outgoing webhooks
- adds new e2e test for webhooks asserting that an `ngrok`/`express`
webhook handler receives the call as expected + payload is as expected
(related to https://github.com/grafana/oncall/issues/2691) - skipped for
now, the test passes locally but fails on GitHub Actions CI, seems to be
networking related
 
## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)

---------

Co-authored-by: Michael Derynck <michael.derynck@grafana.com>
2024-03-28 15:37:22 +00:00
Yulya Artyukhina
ba122ec6ef
Update notification checker (#3818)
# What this PR does
Count SMS messages with status "accepted" as delivered in the notification checker
## Which issue(s) this PR fixes

https://raintank-corp.slack.com/archives/C025VMT6SPK/p1706799009342889?thread_ts=1706786822.083149&cid=C025VMT6SPK
## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2024-02-01 15:42:43 +00:00
Michael Derynck
2a466a0c4f
Add transaction on_commit before signals for alert group actions (#3731)
# What this PR does
Add transactions around log record creation and use transaction
`on_commit` before sending signals that pass the DB ID of alert group log
records. For deletes, we can then assume any missing IDs in tasks
are from intentionally deleted alert groups, and we can stop tasks from
retrying endlessly.
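
The ordering guarantee this relies on can be shown with a toy model. `TransactionSketch` below is a hypothetical stand-in written for this example and is not Django's implementation; it only mimics the behaviour that callbacks registered with `on_commit` run after a successful commit and are dropped on rollback.

```python
from functools import partial
from typing import Callable, List


class TransactionSketch:
    """Toy transaction that runs queued callbacks only after a successful commit."""

    def __init__(self) -> None:
        self._callbacks: List[Callable[[], None]] = []

    def on_commit(self, callback: Callable[[], None]) -> None:
        self._callbacks.append(callback)

    def __enter__(self) -> "TransactionSketch":
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        callbacks, self._callbacks = self._callbacks, []
        if exc_type is None:
            for callback in callbacks:
                callback()
        return False  # on error the callbacks are dropped and the exception propagates


sent_signals: List[int] = []


def send_signal(log_record_id: int) -> None:
    sent_signals.append(log_record_id)


with TransactionSketch() as tx:
    log_record_id = 42  # pretend the log record row was just created
    tx.on_commit(partial(send_signal, log_record_id))
    signals_seen_inside_block = list(sent_signals)  # still empty at this point
```

Because the signal fires only after the commit, a task that later cannot find the row can treat it as deleted rather than not yet written.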

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2024-01-31 15:54:50 -07:00
Matias Bordese
3795c836d1
Add transaction block and callbacks when triggering tasks (#3779)
Related to https://github.com/grafana/oncall/issues/3729
2024-01-31 09:26:14 -05:00
Ildar Iskhakov
401d279d54
Refactor create_alert task (#3759)
# What this PR does

This PR simplifies alert group/alert creation, so that the alert is created
and escalation is started in the same task.

## Checklist

- [ ] Unit, integration, and e2e (if applicable) tests updated
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required)
- [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2024-01-30 08:39:04 +00:00
Yulya Artyukhina
19cae8086e
Retry perform_notification with Telegram ratelimit countdown on RetryAfter error (#3744)
# What this PR does
Use the Telegram rate-limit countdown when retrying the `perform_notification`
task on a `RetryAfter` error
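
A minimal sketch of the idea: when the error carries a server-provided wait time, use it as the retry delay. `RetryAfterSketch`, `retry_countdown`, and the 10-second default are assumptions made for this example, not the code in this PR.

```python
class RetryAfterSketch(Exception):
    """Hypothetical stand-in for a rate-limit error carrying a wait time in seconds."""

    def __init__(self, retry_after: int) -> None:
        super().__init__(f"rate limited, retry in {retry_after}s")
        self.retry_after = retry_after


DEFAULT_COUNTDOWN_SECONDS = 10  # assumed fallback delay for other errors


def retry_countdown(error: Exception) -> int:
    """Pick the delay, in seconds, before the task is retried."""
    if isinstance(error, RetryAfterSketch):
        return error.retry_after
    return DEFAULT_COUNTDOWN_SECONDS
```
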
## Which issue(s) this PR fixes
https://github.com/grafana/oncall-private/issues/2451

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2024-01-24 15:31:56 +00:00
Joey Orlando
f27aa48dcb
address typo in partial passed to transaction.on_commit
2024-01-18 09:09:48 -05:00
Joey Orlando
909aacd8b8
change perform_notification.apply_async
transaction.on_commit back to using partial
2024-01-18 07:58:21 -05:00
Joey Orlando
16b648bd15
fix infinitely retrying apps.alerts.tasks.notify_user.perform_notification task (#3708)
# Which issue(s) this PR fixes

Closes https://github.com/grafana/oncall-private/issues/2318

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2024-01-18 07:07:01 -05:00
Matias Bordese
2fd456fc77
Update alert group personal notifications checker to check sent SMS (#3698)
Sent SMS messages are considered completed for our purpose here (ie. do
not wait for Twilio delivered confirmation).
2024-01-17 17:46:18 +00:00
Joey Orlando
4036ced9b9
add LogExceptionOnFailureTask celery task class (#3677)
# What this PR does

Closes https://github.com/grafana/oncall-private/issues/2449

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2024-01-12 21:31:01 +00:00
Matias Bordese
4e2e7e0a15
Add task logging personal notifications triggered/completed counts (#3638)
Related to https://github.com/grafana/oncall-private/issues/2347
2024-01-10 18:54:27 +00:00
Matias Bordese
f68b9dd004
Update auditor to check personal notifications (#3563)
Requires https://github.com/grafana/oncall/pull/3557

Related to https://github.com/grafana/oncall-private/issues/2347
2023-12-18 16:13:18 +00:00
Yulya Artyukhina
36227418ed
Speed up escalation auditor (#3578)
# What this PR does
Speed up escalation auditor
- use raw escalation snapshot instead of serialized one

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2023-12-18 12:28:55 +00:00
Michael Derynck
e7f3eff72c
Limit how long acknowledge reminders can run for (#3571)
# What this PR does
Stops rescheduling `acknowledge_reminder_task` after 2 weeks.
The assumption is that if an alert group has been sitting in the acknowledged
state for that long, it is unlikely to need more reminders that it is still
acknowledged. Notifications for the thread were probably muted a long time
ago.
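
The cutoff can be expressed as a simple age check. This sketch assumes the two weeks are measured from the time of acknowledgement, which the description does not state; `should_reschedule_reminder` and `MAX_REMINDER_AGE` are hypothetical names.

```python
from datetime import datetime, timedelta

MAX_REMINDER_AGE = timedelta(weeks=2)


def should_reschedule_reminder(acknowledged_at: datetime, now: datetime) -> bool:
    """Keep rescheduling only while the acknowledgement is younger than the cutoff."""
    return now - acknowledged_at < MAX_REMINDER_AGE
```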

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2023-12-15 16:50:01 +00:00
Yulya Artyukhina
2b62da77b7
Check if escalation was skipped in Slack before trying to notify user (#3562)
# What this PR does
Updates the check for whether escalation was skipped in Slack before trying
to notify the user via Slack.

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2023-12-15 09:33:01 +00:00
Matias Bordese
e260e23715
Add missing success log entries for personal notifications (#3557)
2023-12-14 18:32:26 +00:00
Matias Bordese
6dada51133
Remove unneeded filter making query slower (#3570)
There is no index for the `received_at` column, and the filter isn't
really needed (aggregation will work in any case, considering only the
entries for which we have data).
2023-12-14 18:25:34 +00:00
Yulya Artyukhina
8a6510badd
Fix task retries for deleted alert groups (#3553)
## Checklist

- [ ] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2023-12-12 12:01:47 +00:00
Joey Orlando
16ba87bff6
Don't update alert group metrics when deleting an alert group (#3544)
# Which issue(s) this PR fixes

Fixes https://github.com/grafana/oncall-private/issues/2376

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2023-12-11 12:16:00 -05:00
Matias Bordese
3feba3675b
Log average/max delta between alert ingestion and alert group creation (#3526)
Related to https://github.com/grafana/oncall-private/issues/2347
2023-12-07 16:03:41 +00:00