Commit graph

196 commits

Author SHA1 Message Date
Joey Orlando
49d20f1a7e
bump uwsgi to 2.0.26 + Python to 3.12.3 (#4495)
# What this PR does

- bumps `uwsgi` to latest version (`2.0.26`), which unblocks us from
bumping Python to 3.12
- bumps Python to 3.12.3
- refactors the Snyk GitHub Actions workflow to use the composable
actions for installing frontend and backend dependencies
- fixes several `AttributeError`s in our tests that went from a warning
to an error in Python 3.12 (see
https://github.com/python/cpython/issues/100690)

# Which issue(s) this PR closes

Closes #4358
Closes https://github.com/grafana/oncall/issues/4387
2024-06-10 15:33:37 -04:00
Ildar Iskhakov
a9ff1cbc33
Handle SlackAPIRatelimitError in perform_notification (#4486)
# What this PR does

Fixes retrying perform_notification task when Slack API returns 429
(Ratelimited)
See
[thread](https://raintank-corp.slack.com/archives/C025VMT6SPK/p1717725432075029)
for more details

## Checklist

- [ ] Unit, integration, and e2e (if applicable) tests updated
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required)
- [ ] Added the relevant release notes label (see labels prefixed w/
`release:`). These labels dictate how your PR will
    show up in the autogenerated release notes.
2024-06-07 13:54:45 +00:00
Vadim Stepanov
f47f9c29fd
Update resolution note message shortcut instruction (#4482)
# What this PR does

Updates the instruction that pops up when clicking the `Add Resolution
notes` button in Slack with an up-to-date GIF.

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] Added the relevant release notes label (see labels prefixed w/
`release:`). These labels dictate how your PR will
    show up in the autogenerated release notes.
2024-06-07 13:54:45 +00:00
Innokentii Konstantinov
805d4421bd
Support grafana escalate (#4453)
# What this PR does
This PR adds support for the **/grafana escalate** command alongside
**/escalate**.
2024-06-05 05:51:26 +00:00
Vadim Stepanov
b3a56cdffc
Reduce size of payload on /escalate Slack command (#4458) 2024-06-04 18:11:15 +00:00
Innokentii Konstantinov
17f448c506
Prepare OnCall for Unified Slack App (#4232)
This PR makes a number of changes to prepare OnCall for the Unified Slack App:
1. Install Slack via chatops-proxy. This change has two parts:
getting a Slack install link from chatops-proxy
([code](https://github.com/grafana/oncall/pull/4232/files#diff-437a77d49fc04b92d315651b3df5991000b1ab74cf60aabb21aa77cb2823bf52R46))
and receiving a "slack installed" event from chatops-proxy
([code](https://github.com/grafana/oncall/pull/4232/files#diff-976d106f0962be5c1de5e35582193f68435ed0c17f2defd6bd2857bf6e27f65d)).
It also means that OnCall no longer needs to register slack_links
when Slack is connected/disconnected. These changes are behind the
UNIFIED_SLACK_APP_ENABLED flag and should be a no-op if the flag is not
enabled.
2. Get rid of multi-regionality restrictions: instrument all Slack
interactions with a ProxyMeta, JSON data telling chatops-proxy where to
route the interaction. Note that this doesn't apply to the "Add to
resolution notes" message action, which will be handled differently in
a following PR.
3. Move all chatops-proxy related code from common/oncall-gateway to
apps/chatops-proxy

Minor changes:
1. Remove usage of the **CHATOPS_V3** flag. Chatops v3 is already released
(it was a refactoring from the previous quarter)

---------

Co-authored-by: Vadim Stepanov <vadimkerr@gmail.com>
Co-authored-by: Rares Mardare <rares.mardare@grafana.com>
2024-06-03 09:07:10 +00:00
Matias Bordese
5291feeb51
Fix update slack group to not raise if group is not found (#4423)
Fixes https://github.com/grafana/oncall-private/issues/2664
2024-05-30 11:27:25 +00:00
Matias Bordese
6acbb71fad
Do not retry on SlackAPICantUpdateMessageError errors (#4405)
Related to https://github.com/grafana/oncall/pull/4329
2024-05-28 17:46:15 +00:00
Matias Bordese
d4ba57b68b
Avoid retrying to update Slack log message if cant_update_message (#4329)
Do not retry updating a message if Slack returns `cant_update_message`
API [error](https://api.slack.com/methods/chat.update#errors) (meaning
bot user has no permission to update the message).
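
The behavior described above can be sketched roughly as follows. This is a minimal illustration, not the OnCall implementation: `SlackAPIError`, `NON_RETRYABLE_ERRORS`, and `update_log_message` are hypothetical names, and the retry loop stands in for whatever task-level retry mechanism is actually used.

```python
class SlackAPIError(Exception):
    """Hypothetical stand-in for an error returned by the Slack API."""

    def __init__(self, error_code):
        super().__init__(error_code)
        self.error_code = error_code


# Error codes that retrying cannot fix (assumption: only this one is listed here).
NON_RETRYABLE_ERRORS = {"cant_update_message"}


def update_log_message(call_chat_update, max_attempts=3):
    """Call `call_chat_update`, retrying on errors that might be transient.

    Returns True if the update succeeded and False if it was skipped because
    Slack reported a non-retryable error. Any other error is re-raised once
    `max_attempts` (assumed to be at least 1) is exhausted.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            call_chat_update()
            return True
        except SlackAPIError as exc:
            if exc.error_code in NON_RETRYABLE_ERRORS:
                # The bot user is not allowed to update this message; give up.
                return False
            if attempt == max_attempts:
                raise
    return False
```

The key design choice is that the permission error is treated as a final answer rather than a failure, so it does not count against the retry budget.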
2024-05-09 16:16:53 +00:00
Salvatore Giordano
720bcf983a
Update deep link URL for Slack messages (#4317)
# What this PR does

Updates the Slack deep link URL to follow the [correct
format](https://api.slack.com/reference/deep-linking#app_or_bot)
requested [here](https://github.com/grafana/oncall/issues/4122)

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] Added the relevant release notes label (see labels prefixed w/
`release:`). These labels dictate how your PR will
    show up in the autogenerated release notes.

---------

Co-authored-by: Ildar Iskhakov <Ildar.iskhakov@grafana.com>
2024-05-09 10:18:48 +00:00
Matias Bordese
b3c1800f87
Add Slack deep link entry to alert group permalinks (#4205)
Related to https://github.com/grafana/oncall/issues/4122
2024-04-12 13:25:48 +00:00
Matias Bordese
4ac2df19b5
Update xdist load to use loadscope setting (#4187)
Changed xdist dist setting to use `loadscope` value
See
[docs](https://pytest-xdist.readthedocs.io/en/latest/distribution.html#running-tests-across-multiple-cpus)
2024-04-08 19:03:58 +00:00
Matias Bordese
398b09a85b
Allow getting details from connected integration webhooks (#4153)
Related to https://github.com/grafana/oncall-private/issues/2615
2024-04-08 14:13:17 +00:00
Matias Bordese
dc9dc9a57f
Update backsync method to take source channel as param (#4159)
Updating by backsync now expects the source alert receive channel
that triggered the transition (and updates the log record using this
information).

Related to https://github.com/grafana/oncall-private/issues/2615
2024-04-05 16:04:13 +00:00
Innokentii Konstantinov
8294ab5639
Add more logs for updating slack user group members (#4146) 2024-04-03 08:28:22 +00:00
Joey Orlando
afc688feda
upgrade flake8 to v7 (#4141)
# Which issue(s) this PR closes

Fixes [this
issue](https://github.com/grafana/oncall-private/pull/2620/files#diff-0144920543fd191db13f76c9fb797116e26eda2bdd2b79332b61bfbf5846208eR193-R197)
(https://github.com/PyCQA/pycodestyle/issues/334#issuecomment-2027394413)
in `grafana/oncall-private`

## Checklist

- [ ] Unit, integration, and e2e (if applicable) tests updated (N/A)
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] Added the relevant release notes label (see labels prefixed w/
`release:`). These labels dictate how your PR will
    show up in the autogenerated release notes.
2024-04-02 14:26:19 +00:00
Joey Orlando
c5cd675738
cleanup CustomButton backend code + add ngrok/express outgoing webhook e2e test (#2544)
# What this PR does

- removes unused "custom button" backend code now that we've migrated to
outgoing webhooks
- adds new e2e test for webhooks asserting that an `ngrok`/`express`
webhook handler receives the call as expected + payload is as expected
(related to https://github.com/grafana/oncall/issues/2691) - skipped for
now, the test passes locally but fails on GitHub Actions CI, seems to be
networking related
 
## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)

---------

Co-authored-by: Michael Derynck <michael.derynck@grafana.com>
2024-03-28 15:37:22 +00:00
Yulya Artyukhina
3c93375244
Update alert group state by backsync (#4089)
# What this PR does
Adds a method to update alert group state by backsync
Related to https://github.com/grafana/oncall-private/issues/2542
Should be merged with
https://github.com/grafana/oncall-private/pull/2606

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] Added the relevant release notes label (see labels prefixed w/
`release:`). These labels dictate how your PR will
    show up in the autogenerated release notes.
2024-03-27 12:37:01 +00:00
Matias Bordese
adaab1c6ad
Check for permissions on Slack escalate command (#3891)
Related to https://github.com/grafana/oncall/issues/3109

Fixes issue from https://github.com/grafana/oncall/pull/3881 (problem
was that there is no organization set in the Slack request, making it
impossible to check for user permissions; check permission once an
organization is set in the form instead).
2024-02-14 19:02:09 +00:00
Matias Bordese
5ecdc26b0a
Revert requiring permission on Slack direct paging (#3881)
Reverting part of https://github.com/grafana/oncall/pull/3861
2024-02-12 18:48:43 +00:00
Matias Bordese
160d501bbe
Add permission checks for Slack paging and shift swaps actions (#3861)
Fixes https://github.com/grafana/oncall/issues/3109
2024-02-09 12:30:05 +00:00
Joey Orlando
7db7b09c55
attempt to address some SlackAPIRatelimitError exceptions (#3820)
# Which issue(s) this PR fixes

Closes https://github.com/grafana/oncall-private/issues/2515

Attempts to address some `SlackAPIRatelimitError` exceptions seen in the
following tasks:
- `apps.slack.tasks.post_slack_rate_limit_message`
([logs](https://ops.grafana-ops.net/explore?schemaVersion=1&panes=%7B%22qhs%22:%7B%22datasource%22:%22000000193%22,%22queries%22:%5B%7B%22refId%22:%22A%22,%22expr%22:%22%23%20%7Bcluster%3D~%5C%22prod-%28us-central-0%7Ceu-west-0%29%5C%22,%20namespace%3D%5C%22amixr-prod%5C%22,%20job%3D~%5C%22amixr-prod%2Famixr-engine-celery-retry%2A%5C%22%7D%5Cn%7Bcluster%3D~%5C%22prod-%28us-central-0%7Ceu-west-0%29%5C%22,%20namespace%3D%5C%22amixr-prod%5C%22%7D%20%7C%3D%20%5C%22apps.slack.tasks.post_slack_rate_limit_message%5C%22%20%7C%3D%20%5C%22retry%5C%22%22,%22queryType%22:%22range%22,%22datasource%22:%7B%22type%22:%22loki%22,%22uid%22:%22000000193%22%7D,%22editorMode%22:%22code%22%7D%5D,%22range%22:%7B%22from%22:%22now-7d%22,%22to%22:%22now%22%7D%7D%7D&orgId=1))
- `alerts.tasks.notify_user.perform_notification`
([logs](https://ops.grafana-ops.net/explore?schemaVersion=1&panes=%7B%22qhs%22:%7B%22datasource%22:%22000000193%22,%22queries%22:%5B%7B%22refId%22:%22A%22,%22expr%22:%22%7Bcluster%3D~%5C%22prod-%28us-central-0%7Ceu-west-0%29%5C%22,%20namespace%3D%5C%22amixr-prod%5C%22%7D%20%7C%3D%20%5C%22SlackAPIRatelimitError%5C%22%22,%22queryType%22:%22range%22,%22datasource%22:%7B%22type%22:%22loki%22,%22uid%22:%22000000193%22%7D,%22editorMode%22:%22code%22%7D%5D,%22range%22:%7B%22from%22:%22now-7d%22,%22to%22:%22now%22%7D%7D%7D&orgId=1))

## Checklist

- [ ] Unit, integration, and e2e (if applicable) tests updated
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required)
- [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2024-02-01 14:47:12 -05:00
Michael Derynck
2a466a0c4f
Add transaction on_commit before signals for alert group actions (#3731)
# What this PR does
Adds transactions around log record creation and waits for the transaction's
on_commit before sending signals that pass the DB ID of alert group log
records. In delete cases we can then assume that any IDs missing in tasks
belong to intentionally deleted alert groups, so those tasks can stop
retrying endlessly.

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2024-01-31 15:54:50 -07:00
Joey Orlando
758c12790d
fix slack API rate limit errors in send_message_to_thread_if_bot_not_in_channel task (#3803)
# What this PR does

See [this
conversation](https://raintank-corp.slack.com/archives/C04JCU51NF8/p1706722752735009)
for more context.

Additionally, improves logging for this task + adds unit tests.

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2024-01-31 13:42:52 -05:00
Joey Orlando
3833d8de56
remove manual alert group (/oncall) slack slash command + force_route_id (#3790)
# What this PR does

Related to [this
discussion](https://raintank-corp.slack.com/archives/C04JCU51NF8/p1706550226831949)

Removes the `/oncall` Slack slash command + the concept of
`force_route_id` (as this Slack slash command was the last piece of code
to use this concept
[here](https://github.com/grafana/oncall/blob/dev/engine/apps/slack/scenarios/manual_incident.py#L146))

## TODO before merging
- [x] update the various env's Slack apps to remove the slash command
from the app manifests

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2024-01-30 17:28:23 -05:00
Innokentii Konstantinov
f7df1ad5e7
Slack and telegram routes to test chatops-proxy v3 (#3723) 2024-01-22 13:48:19 +08:00
Innokentii Konstantinov
4a02d83fd1
Chatops api v3 (#3721)
This PR makes OnCall compatible with chatops-proxy v3. When CHATOPS_V3
is enabled, OnCall will use the new API client to register tenants and Slack
installations. It also adds v3 routes for Slack and Telegram, so it's
possible to test the new chatops-proxy.

Currently two versions of the chatops-proxy API are deployed, but they are
not compatible. They do the same thing using different DB models and
tables. Once only the v3 version is left in prod, I'll remove the
CHATOPS_V3 env var, all leftovers of the previous API client, and the v3 Slack
and Telegram routes.

---------

Co-authored-by: Vadim Stepanov <vadimkerr@gmail.com>
2024-01-20 06:56:17 +00:00
Yulya Artyukhina
c7895c2308
Fix post message to slack channel (#3701)
# What this PR does
Extends the list of exceptions to ignore when posting a message to a Slack channel

## Which issue(s) this PR fixes
https://github.com/grafana/oncall/issues/3694

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2024-01-17 13:05:36 +00:00
Vadim Stepanov
6c248ed1c8
Fix posting Slack message when route is deleted (#3702)
# What this PR does

Fixes https://github.com/grafana/oncall/issues/3646

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2024-01-17 13:00:25 +00:00
Vadim Stepanov
80f85cf4b4
Fix updating a shift swap with no Slack message (#3686)
# What this PR does

Fixes https://github.com/grafana/oncall/issues/3648

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)

---------

Co-authored-by: Joey Orlando <joey.orlando@grafana.com>
2024-01-15 17:36:01 +00:00
Vadim Stepanov
d0904ca405
Improve OpenAPI schema coverage (#3629)
# What this PR does

Improves OpenAPI schema coverage for internal API:

- Fixes/Improves `alert group` and `feature` endpoints
- Adds `integration` and `user` endpoints

## Which issue(s) this PR fixes

https://github.com/grafana/oncall/issues/3444

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2024-01-12 15:11:22 +00:00
Joey Orlando
9657533b5b
fix duplicate teams showing up in teams dropdown for /escalate slack command (#3590)
# Which issue(s) this PR fixes
- Closes https://github.com/grafana/support-escalations/issues/8763
- Closes https://github.com/grafana/oncall/issues/3388

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2023-12-22 12:36:54 +00:00
Matias Bordese
e260e23715
Add missing success log entries for personal notifications (#3557) 2023-12-14 18:32:26 +00:00
Joey Orlando
76a88bc0c1
Revert "upgrade to Python 3.12 (#3456)" and "bump uwsgi version to latest #3466" (#3483)
# What this PR does

This reverts commits 7c4b40a046 and
cdb22285db.

See https://github.com/grafana/oncall-private/pull/2361 for more
details.
2023-12-01 09:56:26 -05:00
Joey Orlando
7c4b40a046
upgrade to Python 3.12 (#3456)
# What this PR does

Upgrade to Python 3.12 + fix several invalid test assertions that lead
to test failures in the latest version of `pytest`:
```
AttributeError: 'called_once_with' is not a valid assertion. Use a spec for the mock if 'called_once_with' is meant to be an attribute.. Did you mean: 'assert_called_once_with'?
```
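
For context, the error above comes from a misspelled mock assertion. A minimal sketch of the difference, using only `unittest.mock` from the standard library; the mock name `notify` and both helper functions are made up for this illustration. On Python 3.12 the misspelled name raises `AttributeError`, while on older versions it may silently return a child mock, so the helper reports which case applies instead of assuming one.

```python
from unittest import mock

notify = mock.Mock()
notify("alert-1")

# Correct: a real assertion method, which raises AssertionError on a mismatch.
notify.assert_called_once_with("alert-1")


def misspelled_assertion_is_rejected(m):
    """Return True if the misspelled assertion name raises AttributeError."""
    try:
        # Typo: missing the `assert_` prefix, so this is not an assertion.
        m.called_once_with("anything")
    except AttributeError:
        return True
    # Older Python: the call just created a child mock and checked nothing.
    return False


def correct_assertion_detects_mismatch(m):
    """Return True if the real assertion notices the wrong arguments."""
    try:
        m.assert_called_once_with("some-other-alert")
    except AssertionError:
        return True
    return False
```

Tests that relied on the misspelled form were passing without verifying anything, which is why fixing them means switching to `assert_called_once_with`.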

## Checklist

- [ ] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2023-11-30 13:47:41 +00:00
Matias Bordese
aa8a904a8d
Update when slack client ratelimit retry handler is enabled (#3447) 2023-11-30 12:35:46 +00:00
Matias Bordese
7aa78f5f73
Enable flake8-bugbear, fix issues (#3454)
Enables [flake8-bugbear](https://github.com/PyCQA/flake8-bugbear),
checking for bugs/design problems, and [fixes the issues
found](https://pastebin.com/fEDBz6Ta) (some interesting ones,
particularly with mutable args).
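
As an illustration of the mutable-argument class of issue (reported by flake8-bugbear as B006), here is a minimal sketch; the function names are hypothetical and not taken from the OnCall codebase.

```python
def add_label_buggy(label, labels=[]):
    # The default list is created once, at function definition time,
    # so every call without `labels` appends to the same shared list.
    labels.append(label)
    return labels


def add_label_fixed(label, labels=None):
    # Use None as the sentinel and build a fresh list on each call.
    if labels is None:
        labels = []
    labels.append(label)
    return labels
```

Calling `add_label_buggy` twice without a second argument returns a list that still contains the label from the first call, whereas `add_label_fixed` starts from an empty list each time.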

Related to https://github.com/grafana/oncall/pull/3448
2023-11-29 15:04:48 +00:00
Innokentii Konstantinov
ccc64e6b90 Fix 2023-11-28 13:22:18 +08:00
Innokentii Konstantinov
8f17784c49 Fix 2023-11-28 13:16:55 +08:00
Innokentii Konstantinov
22cfba0163 Remove call to bots_info in the slack message handler 2023-11-28 13:14:33 +08:00
Ildar Iskhakov
393c8e06a7
Merge branch 'main' into dev 2023-11-28 09:59:07 +08:00
Innokentii Konstantinov
85f9b0f168
Log slack bot_id and bot_user_id (#3429)
Log the Slack bot ID and bot user ID to check whether we can avoid a request to
the Slack API
2023-11-27 18:36:53 +08:00
Yulya Artyukhina
863af25994
Fix alert group rendering (#3424)
# What this PR does
Fixes alert group rendering where some links were broken by
replacing `-` with `_`.

## Which issue(s) this PR fixes
https://github.com/grafana/support-escalations/issues/8119
https://github.com/grafana/support-escalations/issues/8468

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2023-11-24 15:39:37 +00:00
Matias Bordese
56d1b529e9
Add builtin slack retry on ratelimited error (#3401)
Fixes https://github.com/grafana/oncall-private/issues/2293

Enable Slack client retries on `ratelimited` errors: it will check the
`Retry-After` header before trying again. After 3 attempts it will raise
the error (and we will fall back to the usual error/task retry handling).
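
The retry behavior described above could look roughly like the sketch below. It is a standalone illustration and not the Slack client's built-in handler: `RateLimited` and `call_with_ratelimit_retry` are hypothetical names, and "3 attempts" is assumed here to mean three calls in total.

```python
import time


class RateLimited(Exception):
    """Hypothetical error carrying the value of the Retry-After header."""

    def __init__(self, retry_after):
        super().__init__("ratelimited")
        self.retry_after = retry_after  # seconds to wait before retrying


def call_with_ratelimit_retry(request, max_attempts=3, sleep=time.sleep):
    """Call `request`, waiting `retry_after` seconds between rate-limited tries.

    Re-raises RateLimited after `max_attempts` calls, so the caller can fall
    back to its usual error handling (for example, a task-level retry).
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return request()
        except RateLimited as exc:
            if attempt == max_attempts:
                raise
            sleep(exc.retry_after)
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting.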
2023-11-21 17:32:29 +00:00
Joey Orlando
05ec0f97b5
fix issue in /escalate Slack command when selecting a team (#3381)
# Which issue(s) this PR fixes

Closes https://github.com/grafana/support-escalations/issues/8380

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2023-11-20 15:27:01 -05:00
Matias Bordese
3b90c6544b
Avoid msg_too_long errors when posting/updating slack resolution note (#3372) 2023-11-20 12:17:07 +00:00
Matias Bordese
eb849678a6
Update slack user group update not to retry on some errors (#3363) 2023-11-16 13:41:42 +00:00
Matias Bordese
e1e56fc414
Truncate resolution note text in slack message to satisfy block limits (#3351)
This should help with some retrying tasks.
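
A minimal sketch of this kind of truncation, assuming the 3000-character limit that Slack documents for the text of a section block; the helper name `truncate_for_block` and the `...` suffix are illustrative choices, not the code from this PR.

```python
SECTION_TEXT_LIMIT = 3000  # Slack's documented maximum for section block text


def truncate_for_block(text, limit=SECTION_TEXT_LIMIT, suffix="..."):
    """Shorten `text` so that it fits within `limit` characters.

    Text that already fits is returned unchanged; longer text is cut and
    marked with `suffix` so readers can tell it was shortened.
    Assumes `limit` is larger than the length of `suffix`.
    """
    if len(text) <= limit:
        return text
    return text[: limit - len(suffix)] + suffix
```

Truncating before posting avoids a request that Slack would reject every time, which is the kind of failure that otherwise keeps a task retrying.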
2023-11-16 13:15:04 +00:00
Vadim Stepanov
456829f768
Pass all integration labels down to alert groups (#3302)
Reverts grafana/oncall#3301
2023-11-08 14:04:58 +00:00
Matias Bordese
ac01dd173d
Improve user permissions query (#3291)
The query for checking a user's permissions (used, for example, to get users
from a Slack user group) is timing out (generating retries and
affecting some use cases;
[logs](https://ops.grafana-ops.net/explore?panes=%7B%22FCQ%22:%7B%22datasource%22:%22c-R8UWvVk%22,%22queries%22:%5B%7B%22refId%22:%22A%22,%22expr%22:%22%7Bnamespace%3D%5C%22amixr-prod%5C%22,%20cluster%3D%5C%22prod-us-central-0%5C%22%7D%20%7C%3D%20%5C%22Timeout%20exceeded%20in%20regular%20expression%20match.%5C%22%22,%22queryType%22:%22range%22,%22datasource%22:%7B%22type%22:%22loki%22,%22uid%22:%22c-R8UWvVk%22%7D,%22editorMode%22:%22code%22%7D%5D,%22range%22:%7B%22from%22:%22now-6h%22,%22to%22:%22now%22%7D%7D%7D&schemaVersion=1&orgId=1)):

`django.db.utils.OperationalError: (3699, 'Timeout exceeded in regular
expression match.')`

Change to a `contains` query except for SQLite (not supported), where a
simplified version of the original regex query is used.
2023-11-07 16:58:16 +00:00