Commit graph

268 commits

Author SHA1 Message Date
Vadim Stepanov
8188dd5dd2
Create missing direct paging integrations (#3468)
# What this PR does

Makes organization sync create direct paging integrations for Grafana
teams that don't have one.

## Which issue(s) this PR fixes

Related to https://github.com/grafana/oncall-private/issues/2302

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2023-11-30 17:18:18 +00:00
Joey Orlando
7c4b40a046
upgrade to Python 3.12 (#3456)
# What this PR does

Upgrade to Python 3.12 + fix several invalid test assertions that lead
to test failures in the latest version of `pytest`:
```
AttributeError: 'called_once_with' is not a valid assertion. Use a spec for the mock if 'called_once_with' is meant to be an attribute.. Did you mean: 'assert_called_once_with'?
```

## Checklist

- [ ] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2023-11-30 13:47:41 +00:00
Matias Bordese
aa8a904a8d
Update when slack client ratelimit retry handler is enabled (#3447) 2023-11-30 12:35:46 +00:00
Vadim Stepanov
381a9ecf54
Delete duplicate direct paging integrations (#3412)
# What this PR does

Deletes duplicate direct paging integrations (i.e. keeps only the first
direct paging integration per team).
Also adds a unique constraint that will make such duplicates impossible
at the DB level.

## Which issue(s) this PR fixes

Related to https://github.com/grafana/oncall-private/issues/2302

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2023-11-30 11:19:12 +00:00
Matias Bordese
7aa78f5f73
Enable flake8-bugbear, fix issues (#3454)
Enables [flake8-bugbear](https://github.com/PyCQA/flake8-bugbear),
checking for bugs/design problems, and [fixes the issues
found](https://pastebin.com/fEDBz6Ta) (some interesting ones,
particularly with mutable args).

Related to https://github.com/grafana/oncall/pull/3448
2023-11-29 15:04:48 +00:00
Ildar Iskhakov
2fdd885abe
Move new alert group metric creation into async task (#3451)
# What this PR does

Moving metrics creation into separate task to make alert ingestion more
robust

## Which issue(s) this PR fixes

## Checklist

- [ ] Unit, integration, and e2e (if applicable) tests updated
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required)
- [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)

---------

Co-authored-by: Julia <ferril.darkdiver@gmail.com>
2023-11-29 12:45:36 +00:00
Matias Bordese
ec1f120d9c
Update transaction.on_commit to use partial instead of lambda (#3448)
For reference,
https://adamj.eu/tech/2022/08/22/use-partial-with-djangos-transaction-on-commit/.

I think this may help with this one too:
https://github.com/grafana/oncall-private/issues/2318
2023-11-29 12:01:30 +00:00
Vadim Stepanov
9e889403f2
Alert group payload labels (#3434)
https://github.com/grafana/oncall/pull/3385 + handle null values
2023-11-27 17:53:54 +00:00
Vadim Stepanov
e09422a07d
Revert "Alert group payload labels" (#3433)
Reverts grafana/oncall#3385
2023-11-27 17:28:34 +00:00
Vadim Stepanov
5fac6aeac5
Alert group payload labels (#3385)
# What this PR does

Adds an ability to extract labels from alert group payload. See
[demo](https://www.loom.com/share/cf2b746eea974547b76f44298e32a54f?sid=67ed1e58-40ed-4136-a201-6482fb7773d3).

## Which issue(s) this PR fixes

https://github.com/grafana/oncall-private/issues/2304

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)

---------

Co-authored-by: Maxim Mordasov <maxim.mordasov@grafana.com>
Co-authored-by: Rares Mardare <rares.mardare@grafana.com>
2023-11-27 16:55:31 +00:00
Matias Bordese
d730f6b2bf
Trigger distribute task after alert is committed (#3420)
Fix issue triggering task retries because alert is not yet committed to
the DB.
Similar to https://github.com/grafana/oncall/pull/3001.
2023-11-24 12:02:32 +00:00
Joey Orlando
f70c439334
add more logging in apps.alerts.tasks.notify_user.perform_notification task (#3403)
# What this PR does

add more logging in `apps.alerts.tasks.notify_user.perform_notification`
task (related to https://github.com/grafana/oncall-private/issues/2318;
won't solve that issue but will help with the investigation)
2023-11-21 13:35:23 -05:00
Vadim Stepanov
cb2d4fa76b
Fix deleting integrations with duplicate names (#3397)
# What this PR does

Fixes a bug when it's not possible to delete two or more integrations
having the same name at once.

## Which issue(s) this PR fixes

https://github.com/grafana/oncall-private/issues/2313

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2023-11-21 12:44:21 +00:00
Vadim Stepanov
93badbd638
Delete direct paging integration on team delete (#3367)
# What this PR does

- Fix a bug when it's not possible to delete duplicate DP integrations
for "No team"
- Make so that when a team is deleted, its DP integration is deleted
automatically

## Which issue(s) this PR fixes

https://github.com/grafana/oncall-private/issues/2296

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2023-11-16 13:54:15 +00:00
Matias Bordese
e1e56fc414
Truncate resolution note text in slack message to satisfy block limits (#3351)
This should help with some retrying tasks.
2023-11-16 13:15:04 +00:00
Joey Orlando
77cb381366
Fix broken openapi schema + add integration test (#3364)
# Which issue(s) this PR fixes

- Fix issue that was causing our openapi schema to return HTTP 500 + add
an integration test which fetches the `.yaml` schema and validates that
the endpoint returns HTTP 200 (should hopefully prevent this from
happening again).
- add a few more type hints along the way

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2023-11-16 12:15:05 +00:00
Vadim Stepanov
b797672626
Inheritable integration labels (#3307)
# What this PR does

Allows selecting integration labels that will be passed down to alert
groups

## Which issue(s) this PR fixes

Related to https://github.com/grafana/oncall-private/issues/2179

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)

---------

Co-authored-by: Maxim <maxim.mordasov@grafana.com>
2023-11-15 12:10:34 +00:00
Yulya Artyukhina
d7d5c3aa28
Fix acknowledge reminder (#3345)
# What this PR does
Fix acknowledge reminder:
- check if organization was deleted
- improve logging

## Which issue(s) this PR fixes

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2023-11-14 13:39:27 +00:00
Ildar Iskhakov
784c5ee7c1
Add notifications success ratio log to auditor (#3312)
# What this PR does

This PR adds alert groups success ratio over last 48 hours

## Which issue(s) this PR fixes

## Checklist

- [ ] Unit, integration, and e2e (if applicable) tests updated
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required)
- [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2023-11-10 16:39:13 +08:00
Vadim Stepanov
456829f768
Pass all integration labels down to alert groups (#3302)
Reverts grafana/oncall#3301
2023-11-08 14:04:58 +00:00
Vadim Stepanov
53aae00f7c
Revert "Pass all integration labels down to alert groups" (#3301)
Reverts grafana/oncall#3299
2023-11-08 13:32:10 +00:00
Vadim Stepanov
367e3c9c1d
Pass all integration labels down to alert groups (#3299)
# What this PR does

Passes ALL integration labels down to alert groups, so it's easier to
create labels for alert groups locally.

## Which issue(s) this PR fixes

Related to https://github.com/grafana/oncall-private/issues/2179

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2023-11-08 13:17:05 +00:00
Joey Orlando
00b8d1b821
remove AlertGroup.is_restricted column (#3287)
# What this PR does

Second part of https://github.com/grafana/oncall/pull/3256 to remove the
deprecated `AlertGroup.is_restricted` column. That change was released
in v1.3.54.

## Checklist

- [ ] Unit, integration, and e2e (if applicable) tests updated
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required)
- [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2023-11-07 16:58:53 +00:00
Yulya Artyukhina
7552de13e5
Add a command to continue escalations for alert groups (#3283)
# What this PR does
Add an ability to continue escalations for alert groups from the point
it was in case if it was stopped

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2023-11-07 13:44:23 +00:00
Joey Orlando
eb0f465970
add get_cached_oncall_users_for_multiple_schedules method + add DELETE /alertgroups/<id> internal API endpoint (#3280)
# What this PR does

This PR:
- adds a new method
`apps.schedules.ical_utils.get_cached_oncall_users_for_multiple_schedules`.
In short this method basically adds a layer of caching on top of
`apps.schedules.ical_utils.get_oncall_users_for_multiple_schedules`. We
store one cache value for each schedule. Cache results are stored for 15
minutes. To me this feels like a good balance between improving
performance + returning stale results. Cache values are stored as:
  - key = `f"schedule_{schedule.public_primary_key}_oncall_users"`
- value = `[user.public_primary_key for user in
schedule.currently_oncall_users]`
- adds a `DELETE /alertgroups/<id>` internal API endpoint (needed by
Grafana Incident for the Add Responders integration)
- updates the `is_currently_oncall` query parameter for the `GET /users`
internal API endpoint to return ALL users when the query param value `==
"all"`

## Which issue(s) this PR fixes

https://github.com/grafana/oncall-private/issues/2264

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2023-11-06 15:30:32 -05:00
Matias Bordese
cc9dc66437
Move cache clear to fixtures, fix some deprecation notices (#3269) 2023-11-06 16:52:50 +00:00
Joey Orlando
eef27b35d1
remove AlertGroup.is_restricted column (#3256)
# What this PR does

Removes deprecated/not-referenced `AlertGroup.is_restricted` column. Per
our dev docs on "[Removing a nullable field from a
model](https://github.com/grafana/oncall/blob/dev/dev/README.md#removing-a-nullable-field-from-a-model)"
this is the 1st of two migration files. The [second migration
file](https://github.com/grafana/oncall/files/13255276/0038_remove_alertgroup_is_restricted_db.py.zip)
will be added in a PR in a subsequent PR/release.
2023-11-06 12:48:06 +00:00
Joey Orlando
2cbb20601e
Improve performance of GET /users and GET /teams endpoints used by add responders popup (#3241)
# What this PR does

- Improve performance of the specific `GET /users` and `GET /teams`
calls that're made by the Add Responders dropdown in the UI
- Add `GET /team/{teamId}` internal API route (needed by Grafana
Incident team for their Add Responders changes)
- Some UI improvements to the Add Responders popup (loading state +
pre-fetch users and teams when the drawer is opened)
- Re-enable django-admin only if `settings.SILK_PROFILER_ENABLED ==
True` (need to be able to log into django admin to auth to silk routes)

Closes #3231 
Closes https://github.com/grafana/oncall-private/issues/2252

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2023-11-03 12:40:54 -04:00
Joey Orlando
f905ac5246
Updates to POST /direct_paging internal API endpoint to support Grafana Incident use-cases (#3232)
# What this PR does

- Add a new column, `grafana_incident_id`, to the `AlertGroup` model.
For now this is really just needed to determine if the Alert Group was
created, via a Direct Page, that originated from Grafana Incident.
- I understand that these IDs may be cluster specific. For now we will
not need to make OnCall backend -> Incident API calls. Should we need to
start doing this we will likely need to start syncing the Incident
plugin's provisioned API url into the `organization` in OnCall, such
that we make the API call to the right Incident backend.
- Add two new optional request body parameters to `POST /direct_paging`,
`source_url` and `grafana_incident_id`
- `source_url` - will easily allow Grafana Incident to specify the URL
to the Incident and have this populate the "Source" button
- `grafana_incident_id` - Grafana Incident can specify this such that we
have a link back to Incident (+ we know that the Alert Group was
generated from Incident)
- Hide the "Declare Incident" button in the UI if the Alert Group was
generated from Grafana Incident

## Checklist

- [ ] Unit, integration, and e2e (if applicable) tests updated
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required)
- [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2023-11-01 17:19:44 -04:00
Joey Orlando
b8ad7bf99b
remove all references to deprecated AlertGroup.is_restricted field (#3228)
# What this PR does

remove all references to deprecated `AlertGroup.is_restricted` field +
leave a note to remove the column in a future release
2023-10-31 20:10:45 +00:00
Joey Orlando
1b05b60738
add missing db migrations + add additional backend db migration check to CI (#3234)
# What this PR does

- add missing db migration files generated via `python manage.py
makemigrations`
- fail the `lint-migrations-backend-mysql-rabbitmq` GitHub Actions CI
job if there are missing Django database migration files
2023-10-31 16:00:55 -04:00
Joey Orlando
d49ccef8cb
address some minor direct paging backend issues (#3208)
# Which issue(s) this PR fixes

- Fixes an issue where if the user does not appear in the
`UserHasNotification` query, we don't actually unpage the user and
therefore they still show up in the `paged_users` array. (unpaging ==
creating a `AlertGroupLogRecord.TYPE_UNPAGE_USER` log record)
- Fixes an issue where if a user is paged multiple times, they would
currently show up in `paged_users` > 1

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2023-10-27 20:47:00 +00:00
Joey Orlando
697248dc75
Add responders improvements (#3128)
# What this PR does

https://www.loom.com/share/c5e10b5ec51343d0954c6f41cfd6a5fb

## Summary of backend changes
- Add `AlertReceiveChannel.get_orgs_direct_paging_integrations` method
and `AlertReceiveChannel.is_contactable` property. These are needed to
be able to (optionally) filter down teams, in the `GET /teams` internal
API endpoint
([here](https://github.com/grafana/oncall/pull/3128/files#diff-a4bd76e557f7e11dafb28a52c1034c075028c693b3c12d702d53c07fc6f24c05R55-R63)),
to just teams that have a "contactable" Direct Paging integration
- `engine/apps/alerts/paging.py`
- update these functions to support new UX. In short `direct_paging` no
longer takes a list of `ScheduleNotifications` or an `EscalationChain`
object
  - add `user_is_oncall` helper function
- add `_construct_title` helper function. In short if no `title` is
provided, which is the case for Direct Pages originating from OnCall
(either UI or Slack), then the format is `f"{from_user.username} is
paging <team.name (if team is specified> <comma separated list of
user.usernames> to join escalation"`
- `engine/apps/api/serializers/team.py` - add
`number_of_users_currently_oncall` attribute to response schema
([code](https://github.com/grafana/oncall/pull/3128/files#diff-26af48f796c9e987a76447586dd0f92349783d6ea6a0b6039a2f0f28bd58c2ebR45-R52))
- `engine/apps/api/serializers/user.py` - add `is_currently_oncall`
attribute to response schema
([code](https://github.com/grafana/oncall/pull/3128/files#diff-6744b5544ebb120437af98a996da5ad7d48ee1139a6112c7e3904010ab98f232R157-R162))
- `engine/apps/api/views/team.py` - add support for two new optional
query params `only_include_notifiable_teams` and `include_no_team`
([code](https://github.com/grafana/oncall/pull/3128/files#diff-a4bd76e557f7e11dafb28a52c1034c075028c693b3c12d702d53c07fc6f24c05R55-R70))
- `engine/apps/api/views/user.py`
- in the `GET /users` internal API endpoint, when specifying the
`search` query param now also search on `teams__name`
([code](https://github.com/grafana/oncall/pull/3128/files#diff-30309629484ad28e6fe09816e1bd226226d652ea977b6f3b6775976c729bf4b5R223);
this is a new UX requirement)
- add support for a new optional query param, `is_currently_oncall`, to
allow filtering users based on.. whether they are currently on call or
not
([code](https://github.com/grafana/oncall/pull/3128/files#diff-30309629484ad28e6fe09816e1bd226226d652ea977b6f3b6775976c729bf4b5R272-R282))
- remove `check_availability` endpoint (no longer used with new UX; also
removed references in frontend code)
- `engine/apps/slack/scenarios/paging.py` and
`engine/apps/slack/scenarios/manage_responders.py` - update Slack
workflows to support new UX. Schedules are no longer a concept here.
When creating a new alert group via `/escalate` the user either
specifies a team and/or user(s) (they must specify at least one of the
two and validation is done here to check this). When adding responders
to an existing alert group it's simply a list of users that they can
add, no more schedules.
- add `Organization.slack_is_configured` and
`Organization.telegram_is_configured` properties. These are needed to
support [this new functionality
](https://github.com/grafana/oncall/pull/3128/files#diff-9d96504027309f2bd1e95352bac1433b09b60eb4fafb611b52a6c15ed16cbc48R271-R272)
in the `AlertReceiveChannel` model.

## Summary of frontend changes
- Refactor/rename `EscalationVariants` component to `AddResponders` +
remove `grafana-plugin/src/containers/UserWarningModal` (no longer
needed with new UX)
- Remove `grafana-plugin/src/models/user.ts` as it seemed to be a
duplicate of `grafana-plugin/src/models/user/user.types.ts`

Related to https://github.com/grafana/incident/issues/4278

- Closes #3115
- Closes #3116
- Closes #3117
- Closes #3118 
- Closes #3177 

## TODO
- [x] make frontend changes
- [x] update Slack backend functionality
- [x] update public documentation
- [x] add/update e2e tests

## Post-deploy To-dos
- [ ] update dev/ops/production Slack bots to update `/escalate` command
description (should now say "Direct page a team or user(s)")

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2023-10-27 12:12:07 -04:00
Yulya Artyukhina
f239fd7798
Quickfix for getting alert group from slack message payload (#3195)
# What this PR does

## Which issue(s) this PR fixes
https://github.com/grafana/oncall-private/issues/2217

## Checklist

- [ ] Unit, integration, and e2e (if applicable) tests updated
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required)
- [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2023-10-25 16:32:55 +00:00
Yulya Artyukhina
a3aeb0ecb5
Fix channel filter endpoint (#3192)
# What this PR does
Add filtering term length check for channel filter endpoints

## Which issue(s) this PR fixes

Closes https://github.com/grafana/oncall-private/issues/2114

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2023-10-25 12:33:05 +00:00
Vadim Stepanov
2179e7a1c9
Resolution note source mobile app (#3174)
# What this PR does

Fixes https://github.com/grafana/oncall/issues/2320

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2023-10-20 15:22:45 +01:00
Matias Bordese
88f9f118a3
Use partial instead of lambda when queuing ack task (#3134)
Prefer partial since it is the what the [docs
suggests](https://docs.djangoproject.com/en/4.2/topics/db/transactions/#performing-actions-after-commit).
Also because partial is evaluated immediately while lambda is evaluated
at runtime (which may be causing some issues):

```
>>> from functools import partial
>>> def foo(a, b, c):
...   print(a, b, c)
... 
>>> x = 10
>>> bar_partial = partial(foo, 1, 2, x)
>>> bar_lambda = lambda: foo(1, 2, x)
>>> x = 20
>>> bar_partial()
1 2 10
>>> bar_lambda()
1 2 20
```
2023-10-06 18:31:11 +00:00
Vadim Stepanov
1acb7018d0
Improve alert group deletion API (#3124)
# What this PR does

- Invalidate alert group cache on wipe
- Improve public API docs on alert group deletion
- Add / improve tests

## Which issue(s) this PR fixes

Related to https://github.com/grafana/oncall/issues/3051

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2023-10-05 14:32:40 +01:00
Vadim Stepanov
a727450d49
Public API: Acknowledge & Resolve actions (#3108)
# What this PR does

Makes it possible to acknowledge/unacknowledge and resolve/unresolve
alert groups via public API, and makes sure these actions are reflected
properly in the alert group timeline.

## Demo

```bash
curl --request POST \
     --header "Authorization: TOKEN" \
     http://localhost:8080/api/v1/alert_groups/IQMHLV8INB24N/resolve
```

<img width="651" alt="Screenshot 2023-10-04 at 16 05 27"
src="https://github.com/grafana/oncall/assets/20116910/d4e66868-0132-4b6b-95c7-8424fced7c0b">

## Which issue(s) this PR fixes

https://github.com/grafana/oncall/issues/3051

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2023-10-05 09:46:48 +01:00
Matias Bordese
29eae9b2c6
Fix slack notification for shift end affected by taken swap (#3092)
Related to https://github.com/grafana/oncall/issues/3096
2023-10-02 12:56:07 +00:00
Ildar Iskhakov
51014735aa
WIP: Direct paging improvements (#3064)
# What this PR does
* Create Direct Paging integration (with default route) when team is
created with bulk_update
* Create notification policies when user is created with bulk_update
* If user notification policies are empty change it to Email
* Minor markup and wording improvements
* Add grafana queue to helm chart
* Remove disabled commands for redis helm chart
* Improve Dockerfile caching

## Which issue(s) this PR fixes

## Checklist

- [ ] Unit, integration, and e2e (if applicable) tests updated
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required)
- [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2023-09-28 03:57:49 +00:00
Vadim Stepanov
6caacf4048
Handle Slack ratelimit on alert group deletion (#3038)
# What this PR does

- gracefully retry
`apps.alerts.tasks.delete_alert_group.delete_alert_group` when hitting
Slack ratelimits
- remove Slack messages from the DB as soon as they are deleted from
Slack, so the tasks are not retrying perpetually

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2023-09-19 08:41:47 +00:00
Vadim Stepanov
8b2212c7dc
Improve Slack error handling (#3000)
# What this PR does

- Rename `SlackClientWithErrorHandling` to just `SlackClient`
- Add more error classes + improve the way errors are raised based on
the Slack error code
- Add API call retries on Slack server errors (e.g. when Slack returns
`5xx` errors)
- Refactor some methods working with Slack API + add tests

## Which issue(s) this PR fixes

- https://github.com/grafana/oncall-private/issues/1837
- https://github.com/grafana/oncall-private/issues/1840
- https://github.com/grafana/oncall-private/issues/1842

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2023-09-12 09:49:16 +00:00
Matias Bordese
14c32a74bf
Trigger alert group signal after transaction commit (#3001)
See
https://docs.djangoproject.com/en/4.2/topics/db/transactions/#performing-actions-after-commit

Related to https://github.com/grafana/oncall-private/issues/2015
2023-09-11 14:28:37 +00:00
Matias Bordese
971384b50e
Update internal alert group details API docs (#2995)
Related to https://github.com/grafana/oncall/issues/2982
2023-09-11 13:25:00 +00:00
Matias Bordese
bf4d948449
Update slack schedule shift change notification (#2949)
Related to https://github.com/grafana/oncall/issues/2916

Updated notification:

![slack-shift-notification](https://github.com/grafana/oncall/assets/260710/825fda59-6636-44c1-9740-8976e7c109a7)
2023-09-07 13:00:12 +00:00
Yulya Artyukhina
6ff61ad172
Fix escalation step "Notify if num alerts in time window" (#2965)
# What this PR does
Fix escalation step "Notify if num alerts in time window" when
escalation policy was deleted during the escalation

## Which issue(s) this PR fixes
https://github.com/grafana/oncall-private/issues/2017

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2023-09-05 10:32:59 +00:00
Joey Orlando
a9155130df
update slack_sdk dependency to latest version (#2947)
# What this PR does

- update `slackclient` dependency to latest version. The version we were
using was 5 years old 😲
- first followed the v2 migration guide
[here](https://github.com/slackapi/python-slack-sdk/wiki/Migrating-to-2.x)
followed by the v3 migration guide
[here](https://slack.dev/python-slack-sdk/v3-migration/). The main
changes were:
    - The PyPI project was renamed from `slackclient` to `slack_sdk`
- it is discouraged/harder to call `api_call` and encouraged to call the
helper methods (ex. `chat_postMessage`;
[note](https://github.com/slackapi/python-slack-sdk/wiki/Migrating-to-2.x#web-client-api-changes)
in migration guide docs)
- In 1.x, a failed api call would return the error payload to you and
have you handle the error. In 2.x, a failed api call will throw an
exception. To handle this in your code, you will have to wrap api calls
with a try except block. Since we overload `WebClient.api_call` this was
an easy change and only required a one line change
- remove `apps.slack.slack_client.slack_server.SlackClientServer` class.
The new version of `slack_sdk` handles the case that we needed to
overload for in the first place.
- merged `apps/slack/slack_client/slack_client.py` and
`apps/slack/slack_client/exceptions.py` into `apps/slack/client.py`

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2023-09-05 11:31:59 +02:00
Vadim Stepanov
a2851d3f81
Refactor AlertGroup.slack_message (#2957)
# What this PR does

Refactors the foreign key relationship between `AlertGroup` and
`SlackMessage` models (see
https://github.com/grafana/oncall-private/issues/1867 for more info).

## Which issue(s) this PR fixes

https://github.com/grafana/oncall-private/issues/1867

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2023-09-04 18:03:18 +00:00
Yulya Artyukhina
cc92c53f84
Fix build escalation snapshot (#2954)
# What this PR does
Fix escalation snapshot building if last notified user in escalation
step "Notify users one by one (round-robin)" was deleted
 
## Which issue(s) this PR fixes
https://github.com/grafana/oncall-private/issues/2148

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2023-09-04 11:10:28 +00:00