Commit graph

1337 commits

Author SHA1 Message Date
Matias Bordese
0a077ccfdb
Update and refactor users API team filter (#3703)
This should hopefully fix the lint issue
[here](https://drone.grafana.net/grafana/oncall/3361/1/7)
2024-01-17 15:18:08 +00:00
Innokentii Konstantinov
36d2c3bdb7
Adds new templates cheatsheets (#3643)
Co-authored-by: Maxim Mordasov <maxim.mordasov@grafana.com>
2024-01-17 13:49:36 +00:00
Yulya Artyukhina
c7895c2308
Fix post message to slack channel (#3701)
# What this PR does
Extend the list of exceptions to ignore when posting a message to a Slack channel

## Which issue(s) this PR fixes
https://github.com/grafana/oncall/issues/3694

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2024-01-17 13:05:36 +00:00
Vadim Stepanov
6c248ed1c8
Fix posting Slack message when route is deleted (#3702)
# What this PR does

Fixes https://github.com/grafana/oncall/issues/3646

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2024-01-17 13:00:25 +00:00
Joey Orlando
f85cc6d33b
add more logging on celery task retry (#3695)
# What this PR does

This is a follow up to https://github.com/grafana/oncall/pull/3677.

It appears that when a task uses the [`autoretry_for`
kwarg](https://docs.celeryq.dev/en/stable/userguide/tasks.html#automatic-retry-for-known-exceptions)
in the task decorator, it doesn't log the exception in `on_failure` as
would be expected. Now, when retrying, we log a message plus any
exception/stack trace information.
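
A minimal sketch of the idea, written without importing Celery so that it runs standalone. The hook names and argument order mirror Celery's `Task.on_retry` and `Task.on_failure` handlers; the class name `LoggingTaskSketch`, the task name, and the logger name are illustrative assumptions, not the code from this PR.

```python
import logging

logger = logging.getLogger("task_retry_sketch")  # hypothetical logger name


class LoggingTaskSketch:
    """Stand-in for a celery.Task subclass; only the two hooks are shown."""

    name = "example.task"  # hypothetical task name

    def on_failure(self, exc, task_id, args, kwargs, einfo):
        # Runs when the task has failed for good.
        logger.error("task %s (%s) failed: %r", self.name, task_id, exc, exc_info=exc)

    def on_retry(self, exc, task_id, args, kwargs, einfo):
        # Runs when the task is about to be retried, so the exception
        # and its traceback are logged here as well.
        logger.warning("task %s (%s) retrying: %r", self.name, task_id, exc, exc_info=exc)
```

In a real project the class would presumably inherit from `celery.Task` and be passed to the task decorator through its `base` argument.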

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2024-01-16 07:13:16 -05:00
Vadim Stepanov
80f85cf4b4
Fix updating a shift swap with no Slack message (#3686)
# What this PR does

Fixes https://github.com/grafana/oncall/issues/3648

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)

---------

Co-authored-by: Joey Orlando <joey.orlando@grafana.com>
2024-01-15 17:36:01 +00:00
Joey Orlando
da7f07ffd6
Fix occasional AttributeError in apps.grafana_plugin.tasks.sync.sync_organization_async task (#3687)
# Which issue(s) this PR fixes

Fix this issue I came across in a celery task retry exception log:
![Screenshot 2024-01-15 at 11 21
13](https://github.com/grafana/oncall/assets/9406895/ed08f2f1-dc7d-4ad3-88a0-dc02cd740582)


## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2024-01-15 11:34:40 -05:00
Vadim Stepanov
cc071806f3
disable DRF_SPECTACULAR_ENABLED by default
2024-01-15 16:06:46 +00:00
Joey Orlando
4036ced9b9
add LogExceptionOnFailureTask celery task class (#3677)
# What this PR does

Closes https://github.com/grafana/oncall-private/issues/2449

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2024-01-12 21:31:01 +00:00
Vadim Stepanov
d0904ca405
Improve OpenAPI schema coverage (#3629)
# What this PR does

Improves OpenAPI schema coverage for internal API:

- Fixes/Improves `alert group` and `feature` endpoints
- Adds `integration` and `user` endpoints

## Which issue(s) this PR fixes

https://github.com/grafana/oncall/issues/3444

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2024-01-12 15:11:22 +00:00
Matias Bordese
8656404598
Fix oncall_now for a schedule in orgs with multiple entries (#3671)
Fixes https://github.com/grafana/oncall/issues/3626
2024-01-12 14:46:13 +00:00
Yulya Artyukhina
d6a232ba8b
Add missing notification log records (#3664)
Related to https://github.com/grafana/oncall-private/issues/2347
2024-01-12 14:02:44 +00:00
Michael Derynck
d49af63d75
Fix unicode character encoding in JSON for webhooks (#3670)
# What this PR does
Fixes escaping for unicode characters in webhooks.
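
For context, a small sketch of how escaping behaves in Python's standard `json` module. This illustrates the general mechanism only; whether the PR uses `ensure_ascii` is an assumption, and the payload is made up.

```python
import json

payload = {"title": "Späti ☕"}  # hypothetical webhook field

escaped = json.dumps(payload)                       # non-ASCII becomes \uXXXX escapes
readable = json.dumps(payload, ensure_ascii=False)  # characters are kept as-is

# Both forms decode to the same data; only the wire representation differs.
assert json.loads(escaped) == json.loads(readable) == payload
```

When `ensure_ascii=False` is used, the resulting string still has to be encoded (typically as UTF-8) before it is sent.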

## Which issue(s) this PR fixes
#3149 

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2024-01-11 19:35:23 +00:00
Vadim Stepanov
8b7ffad598
Add team filter for users endpoint (#3666)
# What this PR does

Adds `team` filter for `users` endpoint

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2024-01-11 15:03:54 +00:00
Matias Bordese
4e2e7e0a15
Add task logging personal notifications triggered/completed counts (#3638)
Related to https://github.com/grafana/oncall-private/issues/2347
2024-01-10 18:54:27 +00:00
Yulya Artyukhina
c947f8992e
Add endpoint for alert group escalation snapshot (#3615)
# What this PR does
Adds endpoint for alert group escalation snapshot

## Which issue(s) this PR fixes
https://github.com/grafana/oncall/issues/3277

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2024-01-10 14:52:59 +00:00
Yulya Artyukhina
a7d441647e
Add stack slug to /organization endpoint response (#3644)
# What this PR does
Add stack slug to /organization endpoint response

## Which issue(s) this PR fixes
https://github.com/grafana/oncall-private/issues/2444
## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2024-01-10 12:29:43 +00:00
Joey Orlando
f20aa75869
Fix module 'apps.schedules.tasks.notify_about_empty_shifts_in_schedule' has no attribute 'apply_async' AttributeError (#3640)
# Which issue(s) this PR fixes

We've been seeing this `AttributeError` quite frequently for quite some
time
([logs](https://ops.grafana-ops.net/explore?schemaVersion=1&panes=%7B%22oPl%22:%7B%22datasource%22:%22000000193%22,%22queries%22:%5B%7B%22refId%22:%22A%22,%22expr%22:%22%7Bcluster%3D~%5C%22prod-%28eu-west-0%7Cus-central-0%29%5C%22,%20namespace%3D%5C%22amixr-prod%5C%22%7D%20%7C%3D%20%60AttributeError%28%5C%22module%20%27apps.schedules.tasks.notify_about_empty_shifts_in_schedule%27%20has%20no%20attribute%20%27apply_async%27%5C%22%60%22,%22queryType%22:%22range%22,%22datasource%22:%7B%22type%22:%22loki%22,%22uid%22:%22000000193%22%7D,%22editorMode%22:%22code%22%7D%5D,%22range%22:%7B%22from%22:%22now-7d%22,%22to%22:%22now%22%7D%7D%7D&orgId=1))

## Checklist

- [ ] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2024-01-10 06:22:44 -05:00
Joey Orlando
006ee4b860
Decrease outgoing webhook timeouts from 10secs to 4secs (#3639)
# Which issue(s) this PR fixes

See all the context
[here](https://raintank-corp.slack.com/archives/C025VMT6SPK/p1704802171131009?thread_ts=1704762857.043879&cid=C025VMT6SPK)

<img width="690" alt="Screenshot 2024-01-09 at 15 26 33"
src="https://github.com/grafana/oncall/assets/9406895/e4c794a3-508d-4f24-af22-0f800828271d">


## Checklist

- [ ] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2024-01-09 19:55:39 -05:00
Joey Orlando
4cc4099710
Address Telegram HTTP 500s when receiving message from Telegram in discussion group (#3622)
# Which issue(s) this PR fixes

Closes https://github.com/grafana/oncall/issues/3621

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2024-01-09 08:31:56 -05:00
Joey Orlando
72e7224ad3
do not retry firebase.messaging.UnregisteredError exceptions for FCM relay tasks (#3637)
# What this PR does

_tl;dr_: we had a lengthy discussion about this
[here](https://raintank-corp.slack.com/archives/C04JCU51NF8/p1701893410542629?thread_ts=1701690117.016909&cid=C04JCU51NF8).
`firebase.messaging.UnregisteredError` errors occur because of events
outside of our control and retrying will never fix them, therefore we
should simply skip retrying in this case.

We retry these fairly often
([logs](https://ops.grafana-ops.net/explore?schemaVersion=1&panes=%7B%22iWZ%22:%7B%22datasource%22:%22000000193%22,%22queries%22:%5B%7B%22refId%22:%22A%22,%22expr%22:%22%23%20%7Bcluster%3D~%5C%22prod-%28eu-west-0%7Cus-central-0%29%5C%22,%20namespace%3D%5C%22amixr-prod%5C%22%7D%20%7C%3D%20%5C%22task_name%3Dapps.webhooks.tasks.trigger_webhook.execute_webhook%5C%22%20%7C%3D%20%5C%22retry%5C%22%5Cn%7Bcluster%3D~%5C%22prod-%28eu-west-0%7Cus-central-0%29%5C%22,%20namespace%3D%5C%22amixr-prod%5C%22%7D%20%7C%3D%20%5C%22apps.mobile_app.fcm_relay.fcm_relay_async%5C%22%20%7C%3D%20%5C%22UnregisteredError%5C%22%22,%22queryType%22:%22range%22,%22datasource%22:%7B%22type%22:%22loki%22,%22uid%22:%22000000193%22%7D,%22editorMode%22:%22code%22%7D%5D,%22range%22:%7B%22from%22:%22now-7d%22,%22to%22:%22now%22%7D%7D%7D&orgId=1))
which needlessly eats up celery worker resources.
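
The behaviour described above can be sketched as a retry loop that gives up immediately on a known non-retryable error. `UnregisteredErrorSketch` and `run_with_retries` are hypothetical names used for illustration; the real task relies on Celery's retry machinery rather than a hand-written loop.

```python
class UnregisteredErrorSketch(Exception):
    """Hypothetical stand-in for the UnregisteredError named in this commit."""


def run_with_retries(send, max_retries=3, no_retry=(UnregisteredErrorSketch,)):
    """Call send(); retry on errors except those listed in no_retry.

    Returns a tuple (succeeded, attempts_made).
    """
    attempts = 0
    while attempts <= max_retries:
        attempts += 1
        try:
            send()
            return True, attempts
        except no_retry:
            # Retrying cannot fix this class of error, so stop right away.
            return False, attempts
        except Exception:
            continue  # possibly transient: try again until retries run out
    return False, attempts
```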

Related to https://github.com/grafana/oncall-private/issues/1820

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2024-01-09 08:14:20 -05:00
Joey Orlando
3bcf5efc24
manually retry for requests.exceptions.Timeout exceptions when sending outgoing webhooks (#3632)
# Which issue(s) this PR fixes

Fixes https://github.com/grafana/oncall-private/issues/2439

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2024-01-08 19:13:15 -05:00
Matias Bordese
d57b41b758
Create log record for telegram formatting error in notification (#3628)
2024-01-08 20:12:28 +00:00
Salvatore Giordano
139df23911
Let mobile use paging endpoint (#3619)
# What this PR does

## Which issue(s) this PR fixes

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2024-01-08 12:18:39 +00:00
Matias Bordese
181d5d5712
Setup one-at-a-time lock for sync_organization tasks (#3612)
Related to https://github.com/grafana/support-escalations/issues/8844

Queuing multiple sync_organization tasks for the same org could cause the
sync task to run in parallel for that organization, potentially creating
duplicate entries and/or making multiple unneeded API calls. This change
prevents an organization sync from running while another sync for the
same org is in progress.
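
A rough sketch of a one-at-a-time lock keyed by organization. The in-memory store stands in for a shared cache with an atomic "add if absent" operation; the class, function, and key format are assumptions made for illustration, not the implementation in this commit.

```python
import threading


class InMemoryLockStore:
    """Hypothetical stand-in for a shared cache with an atomic 'add if absent'."""

    def __init__(self):
        self._keys = set()
        self._guard = threading.Lock()

    def add(self, key):
        # Returns True only for the caller that created the key.
        with self._guard:
            if key in self._keys:
                return False
            self._keys.add(key)
            return True

    def delete(self, key):
        with self._guard:
            self._keys.discard(key)


def sync_organization_once(store, org_id, do_sync):
    """Run do_sync(org_id) unless a sync for this org is already in progress."""
    key = f"sync-organization-lock-{org_id}"  # hypothetical key format
    if not store.add(key):
        return False  # another sync holds the lock; skip this run
    try:
        do_sync(org_id)
        return True
    finally:
        store.delete(key)  # always release, even if the sync raised
```

A production lock would usually also carry a timeout so that a crashed worker cannot hold it forever.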
2024-01-04 15:34:28 +00:00
Joey Orlando
0a39f90979
revert forked redis lib change (#3600) (#3609)
# What this PR does

Reverts #3600 (related to
https://github.com/grafana/oncall-private/issues/2406)
2024-01-03 09:27:53 -05:00
Matias Bordese
1fccef65fd
Handle telegram message to reply to not found on send log task (#3587)
Similar to https://github.com/grafana/oncall/pull/404
2024-01-02 16:42:19 +00:00
Matias Bordese
4c8870f974
Add msteams feature flag (#3606)
Related to https://github.com/grafana/oncall-private/issues/2144
2024-01-02 15:55:44 +00:00
Joey Orlando
6c7bc4d20c
bump pinned commit for redis-py forked repo
2023-12-28 15:06:10 -05:00
Joey Orlando
7e8ff0790f
bump redis-py dependency commit
Use commit 124c4b2 (improves debug logging)
2023-12-28 13:49:48 -05:00
Joey Orlando
da47c02990
use forked version of redis-py which adds extra debug logging (#3600)
# Which issue(s) this PR fixes

This helps with debugging
https://github.com/grafana/oncall-private/issues/2406 (**note**: it
doesn't fix it)

## Checklist

- [ ] Unit, integration, and e2e (if applicable) tests updated (N/A)
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required)
- [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2023-12-28 09:14:43 -05:00
Joey Orlando
9657533b5b
fix duplicate teams showing up in teams dropdown for /escalate slack command (#3590)
# Which issue(s) this PR fixes
- Closes https://github.com/grafana/support-escalations/issues/8763
- Closes https://github.com/grafana/oncall/issues/3388

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2023-12-22 12:36:54 +00:00
Ravishankar
bcbca9d3b9
fix(3564) Support PATCH Method In Outgoing webhook (#3580)
# What this PR does
Adds PATCH method support for outgoing webhooks

## Which issue(s) this PR fixes
Fixes #3564 

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)

---------

Co-authored-by: Joey Orlando <joey.orlando@grafana.com>
2023-12-20 08:49:50 -05:00
Yulya Artyukhina
647d46294c
Fix inbound email integration endpoint (#3586)
# What this PR does
Handle the exception raised when parsing the sender's email address from
an email message in the inbound email integration endpoint
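
A small sketch of defensive sender parsing using only the standard library. The helper name `sender_address` and the choice to return `None` on failure are assumptions; the actual endpoint may parse and report errors differently.

```python
from email.utils import parseaddr


def sender_address(from_header):
    """Return the bare address from a From header, or None if it cannot be parsed."""
    try:
        _name, address = parseaddr(from_header)
    except (TypeError, ValueError):
        # Malformed or missing header: treat it as "no usable sender".
        return None
    return address if "@" in address else None
```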

## Which issue(s) this PR fixes
https://github.com/grafana/oncall-private/issues/2398
## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2023-12-20 08:40:47 -05:00
Joey Orlando
006682d0b7
fix PUT /api/v1/escalation_policies/<id> issue related to updating from_time and to_time (#3581)
# Which issue(s) this PR fixes

Closes https://github.com/grafana/oncall-private/issues/2373

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2023-12-19 09:13:07 -05:00
Yulya Artyukhina
0421bc472a
Fix posting slack message about ratelimits (#3582)
# What this PR does

## Which issue(s) this PR fixes
https://github.com/grafana/oncall-private/issues/2374
## Checklist

- [ ] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2023-12-19 06:05:57 +00:00
Matias Bordese
f68b9dd004
Update auditor to check personal notifications (#3563)
Requires https://github.com/grafana/oncall/pull/3557

Related to https://github.com/grafana/oncall-private/issues/2347
2023-12-18 16:13:18 +00:00
Yulya Artyukhina
36227418ed
Speed up escalation auditor (#3578)
# What this PR does
Speed up the escalation auditor:

- use the raw escalation snapshot instead of the serialized one

## Which issue(s) this PR fixes

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2023-12-18 12:28:55 +00:00
Yulya Artyukhina
8ade7d65e8
Fix alert group columns validation (#3577)
# What this PR does
Fix alert group columns validation:

- validate column ids by each type separately

## Which issue(s) this PR fixes
validation check from this issue -
https://github.com/grafana/oncall-private/issues/2378
## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2023-12-15 17:15:50 +00:00
Michael Derynck
e7f3eff72c
Limit how long acknowledge reminders can run for (#3571)
# What this PR does
Stops rescheduling `acknowledge_reminder_task` after 2 weeks. The
assumption is that if it has been sitting in the acknowledged state for
that long, it is unlikely to need more reminders that it is still
acknowledged. Notifications for the thread were probably muted a long
time ago.
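
The cutoff can be sketched as a simple age check. Only the 2-week limit comes from the description above; the function name, and the assumption that the age is measured from the moment of acknowledgement, are illustrative.

```python
from datetime import datetime, timedelta, timezone

MAX_REMINDER_AGE = timedelta(weeks=2)  # cutoff stated in the PR description


def should_reschedule_reminder(acknowledged_at, now=None):
    """Return True while less than 2 weeks have passed since acknowledgement."""
    now = now or datetime.now(timezone.utc)
    return now - acknowledged_at < MAX_REMINDER_AGE
```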

## Which issue(s) this PR fixes

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2023-12-15 16:50:01 +00:00
Yulya Artyukhina
2b62da77b7
Check if escalation was skipped in Slack before trying to notify user (#3562)
# What this PR does
Updates the check for whether escalation was skipped in Slack before
trying to notify the user via Slack.

## Which issue(s) this PR fixes

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2023-12-15 09:33:01 +00:00
Matias Bordese
e260e23715
Add missing success log entries for personal notifications (#3557)
2023-12-14 18:32:26 +00:00
Matias Bordese
6dada51133
Remove unneeded filter making query slower (#3570)
There is no index for the `received_at` column, and the filter isn't
really needed (aggregation will work in any case, considering only the
entries for which we have data).
2023-12-14 18:25:34 +00:00
Yulya Artyukhina
088414c4d3
Add multi-stack support for mobile app (#3500)
# What this PR does
Allow creating multiple mobile devices with the same `registration_id`
for different users (multi-stack support)
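
One way to picture the change is uniqueness per (user, `registration_id`) pair instead of per `registration_id` alone. The in-memory registry below is a hypothetical illustration of that rule, not the model or constraint used in the PR.

```python
class DeviceRegistrySketch:
    """Hypothetical registry; uniqueness is per (user_id, registration_id)."""

    def __init__(self):
        self._devices = {}

    def register(self, user_id, registration_id):
        key = (user_id, registration_id)
        if key in self._devices:
            raise ValueError("device already registered for this user")
        self._devices[key] = {"user_id": user_id, "registration_id": registration_id}
        return self._devices[key]

    def users_for(self, registration_id):
        # All users that share this physical device registration.
        return sorted(u for (u, r) in self._devices if r == registration_id)
```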

## Which issue(s) this PR fixes
https://github.com/grafana/oncall/issues/3452

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2023-12-13 09:00:18 +00:00
Yulya Artyukhina
e003e8a0b8
Fix message is too big exception for mobile push notification (#3556)
# What this PR does
Adds a limit on alert title length in mobile app push notifications
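
A minimal sketch of such a limit. The value of `MAX_TITLE_LENGTH` and the trailing `...` marker are assumptions; the description does not state the actual limit or whether it is counted in characters or bytes.

```python
MAX_TITLE_LENGTH = 200  # hypothetical limit, not taken from the PR


def truncate_title(title, limit=MAX_TITLE_LENGTH):
    """Shorten title to at most `limit` characters (limit >= 3), ending in '...'."""
    if len(title) <= limit:
        return title
    return title[: limit - 3] + "..."
```
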
## Which issue(s) this PR fixes
https://github.com/grafana/oncall-private/issues/2375
## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2023-12-12 16:46:08 +00:00
Yulya Artyukhina
0861113ed5
Add error code for mobile notification logs (#3554)
# What this PR does
Adds error code for mobile notification logs
## Which issue(s) this PR fixes

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2023-12-12 12:02:26 +00:00
Yulya Artyukhina
8a6510badd
Fix task retries for deleted alert groups (#3553)
# What this PR does

## Which issue(s) this PR fixes

## Checklist

- [ ] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2023-12-12 12:01:47 +00:00
Yulya Artyukhina
8a56b2273b
Fix telegram retrying task after alert group was deleted (#3546)
# What this PR does

## Which issue(s) this PR fixes
https://github.com/grafana/oncall-private/issues/2379

## Checklist

- [ ] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)

---------

Co-authored-by: Joey Orlando <joey.orlando@grafana.com>
2023-12-11 18:06:04 +00:00
Stanislav Lutsenko
0d959a5c20
Fix amazon_ses inbound email ESP provider (#3509)
# What this PR does
Fixes django-anymail[amazon-ses] issues according to [anymail
docs](https://anymail.dev/en/stable/esps/amazon_ses/)

## Which issue(s) this PR fixes
[#3508](https://github.com/grafana/oncall/issues/3508)

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required)
- [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)

---------

Co-authored-by: Joey Orlando <joey.orlando@grafana.com>
Co-authored-by: Joey Orlando <joseph.t.orlando@gmail.com>
2023-12-11 12:33:54 -05:00
Joey Orlando
16ba87bff6
Don't update alert group metrics when deleting an alert group (#3544)
# Which issue(s) this PR fixes

Fixes https://github.com/grafana/oncall-private/issues/2376

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2023-12-11 12:16:00 -05:00