Commit graph

539 commits

Author SHA1 Message Date
Vadim Stepanov
4c31ede558
Add "used in escalation" filter for schedules internal API (#1425)
# What this PR does
Adds a `used` filter on schedules endpoint for internal API.

Usage:
- `?used=true` returns schedules that are referenced by at least one
escalation policy
- `?used=false` returns schedules that are NOT referenced
- `?used=null` or not providing the query param at all will return all
schedules
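
The tri-state behaviour described above can be sketched roughly as follows. This is only an illustration, not the actual OnCall implementation: `parse_used_param`, `filter_schedules`, and the `escalation_policy_count` key are hypothetical names chosen for the example.

```python
from typing import Optional


def parse_used_param(raw: Optional[str]) -> Optional[bool]:
    """Map the raw `used` query value to True, False, or None (no filtering)."""
    if raw is None or raw.lower() == "null":
        return None
    if raw.lower() == "true":
        return True
    if raw.lower() == "false":
        return False
    raise ValueError(f"unsupported value for 'used': {raw!r}")


def filter_schedules(schedules, used: Optional[bool]):
    """Filter schedules by whether any escalation policy references them.

    `escalation_policy_count` is a hypothetical field used for illustration.
    """
    if used is None:
        # Missing param or `null`: return everything.
        return list(schedules)
    return [s for s in schedules if (s["escalation_policy_count"] > 0) == used]
```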
## Which issue(s) this PR fixes
https://github.com/grafana/oncall/issues/1423

## Checklist

- [x] Tests updated
2023-03-01 10:09:07 +00:00
Innokentii Konstantinov
6a5e75e083
Fix of templates api behaviour for public and private api (#1408)
# What this PR does

This PR fixes templates behaviour for the public and private APIs. It fixes
"reset to default" for templates from messaging backends and some minor
bugs. It also adds acknowledge signal and source link templates.

## Checklist

- [x] Tests updated
- [x] Documentation added
- [x] `CHANGELOG.md` updated
2023-03-01 16:32:15 +08:00
Vadim Stepanov
a25fd429da
Show 100 latest alerts on alert group page (#1417)
# What this PR does
Make internal API return 100 latest alerts for alert group.

## Which issue(s) this PR fixes
https://github.com/grafana/oncall/issues/857

## Checklist

- [x] Tests updated
- [x] `CHANGELOG.md` updated
2023-02-28 14:12:56 +00:00
Matias Bordese
04c42e2796
Matiasb/fix task refresh ical when empty value (#1401)
This should fix task error as seen in logs, trying to parse an empty
string as ical value:
```
Task apps.schedules.tasks.refresh_ical_files.refresh_ical_file[] raised unexpected: ValueError("Found no components where exactly one is required: ''")
```
2023-02-24 21:16:09 +00:00
Matias Bordese
721ab9fbb9
Use UTC instead of Etc/UTC when passing tz to dateutil rrule (#1414)
Fixes https://github.com/grafana/oncall-private/issues/1648
2023-02-24 20:54:20 +00:00
Michael Derynck
b3659872a7
Get reCAPTCHA site key from backend env (#1400)
# What this PR does
Move reCAPTCHA site key to backend environment for easier management to
support multiple environments.

## Which issue(s) this PR fixes

## Checklist

- [ ] Tests updated
- [ ] Documentation added
- [x] `CHANGELOG.md` updated
2023-02-24 15:53:35 +00:00
Matias Bordese
98b3b918a5
Add schedule pagination to plugin API (#1309)
Related to #1289

---------

Co-authored-by: Yulia Shanyrova <yulia.shanyrova@grafana.com>
2023-02-24 14:59:03 +00:00
Michael Derynck
49946e6a4e
Change Organization Deleted/Moved Precedence (#1402)
# What this PR does
When an organization is migrated to a different cluster, it has its
`migration_destination_slug` set for redirection purposes, but it also
needs to be deleted so scheduled tasks for it do not run in the old
cluster. By changing the order so that moved has precedence over deleted,
API calls will be correctly redirected for moved organizations while the
organization is still considered deleted to suppress tasks that are no
longer needed in the old cluster.
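
A minimal sketch of the precedence described above, assuming a simplified organization record. The `Org` dataclass and `resolve_org_status` function are hypothetical and only illustrate the ordering of the checks.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Org:
    # Simplified, hypothetical stand-in for the real organization model.
    deleted: bool = False
    migration_destination_slug: Optional[str] = None


def resolve_org_status(org: Org) -> str:
    # "moved" is checked first, so a migrated organization is redirected
    # even though it is also marked as deleted in the old cluster.
    if org.migration_destination_slug:
        return "moved"
    if org.deleted:
        return "deleted"
    return "active"
```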

## Which issue(s) this PR fixes

## Checklist

- [ ] Tests updated
- [ ] Documentation added
- [ ] `CHANGELOG.md` updated
2023-02-24 11:45:21 +00:00
Matias Bordese
b6ce63e2a9
Fix/rewrite flaky schedule tests (#1397)
2023-02-23 18:20:51 +00:00
Joey Orlando
b61f2ce41f
patch minor sync issue when HTTP 302 is received from Grafana API instance (#1393)
# What this PR does

This PR refactors the `sync_organization` and
`GrafanaAPIClient.is_rbac_enabled_for_organization` methods to check the
connected response bool rather than explicitly checking for HTTP 200. This
handles the legitimate case where the Grafana instance may return an
HTTP 302 (redirect) rather than an HTTP 200.

## Which issue(s) this PR fixes

See
[this](https://grafana.slack.com/archives/C02LSUUSE2G/p1677136582890269)
Slack thread in the community channel for more context

## Checklist

- [x] Tests updated
- [ ] Documentation added (N/A)
- [x] `CHANGELOG.md` updated
2023-02-23 13:23:57 +00:00
Vadim Stepanov
a2eed312f9
PD migrator: migrate on-call shifts using public API (#1317)
Allow PD migrator tool to migrate on-call shifts when migrating
schedules (currently it migrates schedules using PD ICal file):
https://github.com/grafana/oncall/issues/1283.

This PR allows selecting the schedule migration mode via the
`SCHEDULE_MIGRATION_MODE_WEB` env variable (`ical` or `web`). Due to
differences in the scheduling systems of PD and OnCall, it's not always
possible to migrate shifts automatically (migration plan will show any
schedules and layers that can't be migrated).
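
As a rough illustration, reading and validating such an env variable could look like the sketch below. The function name and the `ical` default are assumptions made for the example; the migrator tool's actual handling may differ.

```python
import os
from typing import Mapping

VALID_MODES = ("ical", "web")


def get_schedule_migration_mode(environ: Mapping[str, str] = os.environ) -> str:
    # Falling back to "ical" is an assumption, not confirmed by the PR text.
    mode = environ.get("SCHEDULE_MIGRATION_MODE_WEB", "ical")
    if mode not in VALID_MODES:
        raise ValueError(f"expected one of {VALID_MODES}, got {mode!r}")
    return mode
```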

PD rotations that can be migrated:
- Any rotation without restrictions ("restriction" is a PD term for
describing active periods for rotation)
- Daily rotations with daily restrictions
- Weekly rotations with weekly restrictions
- Some weekly rotations with daily restrictions
- Some daily rotations with weekly restrictions

There will be a separate PR to update the
[instruction](https://github.com/grafana/oncall/tree/dev/tools/pagerduty-migrator#readme)
since this one is pretty huge already.
2023-02-23 11:34:03 +00:00
Yulya Artyukhina
53af4783de
Fix the cause of retry of notify_all and notify_group tasks (#1376)
Fix the cause of retry of notify_all and notify_group tasks that was
related to an incorrect step order.
2023-02-23 09:28:13 +00:00
Innokentii Konstantinov
26a2bd9c91
Refactor maintenance (#1340)
# What this PR does
This PR simplifies the maintenance mode code.
1. Perform distribution/escalation maintenance checks in send_signal...
tasks.
2. Use usual alert distribution flow for the maintenance incident.
3. Decouple maintenance mode from slack (all, except
**notify_about_maintenance_action** methods, I don't want to make this
PR too big)

As a bonus from these changes, maintenance mode now mutes alert group
delivery in all chatops integrations, not only in Slack. (Before,
incidents that happened during maintenance were posted to Telegram and
MS Teams anyway.)

## Checklist

- [ ] Tests updated
- [ ] Documentation added
- [ ] `CHANGELOG.md` updated
2023-02-23 07:13:03 +00:00
Innokentii Konstantinov
59f83ed331
Revert "Revert "Rework schedules cached ical file values"" (#1382)
Reverts grafana/oncall#1377
2023-02-22 07:30:19 +01:00
Innokentii Konstantinov
f4ee99eb7b
Fix load of ical file from google (#1381)
2023-02-22 07:29:59 +01:00
Matias Bordese
b02dc6bd36
Revert "Rework schedules cached ical file values" (#1377)
Reverts grafana/oncall#1312

This change seems to have introduced some unexpected behavior with slack
user groups. Reverting to reproduce locally and push an improved update.
2023-02-22 10:16:49 +08:00
Innokentii Konstantinov
c733d8b9f2
Cleanup ScenarioStep (#1213)
# What this PR does
This PR cleans up ScenarioStep. This is needed to simplify moving Slack to
the messaging backends in the future.

1. Introduce AlertGroupSlackService to move logic out of ScenarioStep.
This also made it possible to stop importing ScenarioSteps in code not
related to processing Slack callbacks.
2. Remove tags from ScenarioSteps; they are unused.
3. Remove the ScenarioStep.dispatch method. It was just calling
ScenarioStep.process_scenario.
4. Remove the "action" param from process_scenario; it was unused.
5. Remove creation of SlackActionRecord on handling SlackEvents. We are
not using it, but it generates an INSERT query on most user-Slack
interactions.
6. Remove "random_prefix_for_routing" from ScenarioStep; it was unused.
## Which issue(s) this PR fixes

## Checklist

- [ ] Tests updated
- [ ] Documentation added
- [ ] `CHANGELOG.md` updated

---------

Co-authored-by: Joey Orlando <joey.orlando@grafana.com>
2023-02-21 20:22:11 +01:00
Joey Orlando
c55a9010f7
Add Google reCAPTCHA for mobile app phone verification (#1373)
# What this PR does

Adds reCAPTCHA validation to the get mobile verification code endpoint

## Which issue(s) this PR fixes

## Checklist

- [x] Tests updated
- [ ] Documentation added (N/A)
- [x] `CHANGELOG.md` updated

---------

Co-authored-by: Maxim <maxim.mordasov@grafana.com>
2023-02-21 20:17:06 +01:00
Innokentii Konstantinov
61fdcfdc72
Add ratelimit for phone number verification (#1354)
# What this PR does

## Which issue(s) this PR fixes

## Checklist

- [x] Tests updated
- [x] `CHANGELOG.md` updated

---------

Co-authored-by: Joey Orlando <joey.orlando@grafana.com>
2023-02-21 16:47:52 +08:00
Yulya Artyukhina
058665b8a8
Fix too long declare incident link (#1342)
# What this PR does

## Which issue(s) this PR fixes
Issue with too long declare incident link in Slack

## Checklist

- [x] `CHANGELOG.md` updated
2023-02-20 18:42:44 +08:00
Vadim Stepanov
507c9a5b92
Show direct paging integrations in list filter options (#1329)
# What this PR does
Changes the internal API to show direct paging integrations in list
filter options.

<img width="1203" alt="Screenshot 2023-02-15 at 16 27 31"
src="https://user-images.githubusercontent.com/20116910/219090234-d40d471c-ffcc-48e9-8799-e48f03681b72.png">

## Checklist

- [x] Tests updated
2023-02-16 11:38:45 +00:00
Matias Bordese
b8f15904a8
Rework schedules cached ical file values (#1312)
Related to #1216 

Set default cached empty value as `""`, while keeping `None` to indicate
a refresh is needed.
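
The distinction between the two sentinel values can be sketched as below. The helper names are hypothetical and only illustrate the intended semantics: `None` means the cache must be refreshed, `""` means it was refreshed and is empty.

```python
from typing import Optional


def needs_refresh(cached_ical: Optional[str]) -> bool:
    # None: never fetched (or invalidated), so a refresh is needed.
    # "":   fetched, but the calendar is empty, so no refresh is needed.
    return cached_ical is None


def has_calendar_data(cached_ical: Optional[str]) -> bool:
    # Both None and "" mean there is nothing to parse.
    return bool(cached_ical)
```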
2023-02-09 08:45:20 -03:00
Matias Bordese
a121f84a89
Rework slack login to check backend before redirecting (#1306)
Also:
- Remove unused slack login views in `social_auth` app
- Disable unlink actions in the profile if user is not owner (otherwise
it will disconnect the logged in user, not the one being shown on
screen)
2023-02-08 09:08:18 -03:00
Vadim Stepanov
d40db5a352
Make emails case-insensitive for ICal schedules (#1297)
# What this PR does
Fixes a bug where the current on-call users shown in the web UI and Slack
user group are incorrect.

This happens when both conditions below are true:
- Using an ICal schedule
- Email of a user in Grafana has a different case than an event in ICal
(e.g. `User@gmail.com` in Grafana, `user@gmail.com` in ICal event)
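
A minimal sketch of case-insensitive matching between ICal event emails and users, assuming users are plain dicts with an `email` key. `match_users_by_email` is a hypothetical helper, not the function changed in this PR.

```python
def match_users_by_email(event_emails, users):
    """Return users whose email matches an event email, ignoring case."""
    # Normalize both sides to lowercase before comparing.
    by_email = {u["email"].lower(): u for u in users}
    return [by_email[e.lower()] for e in event_emails if e.lower() in by_email]
```

With this approach, `User@gmail.com` in Grafana and `user@gmail.com` in the ICal event resolve to the same user.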

The bug was introduced by https://github.com/grafana/oncall/pull/1169

## Which issue(s) this PR fixes
https://github.com/grafana/oncall/issues/1296

## Checklist

- [x] Tests updated
- [x] `CHANGELOG.md` updated
2023-02-06 12:11:13 +00:00
Ildar Iskhakov
1b7ada4315
Add database migrations linter (#1020)
# What this PR does

This PR adds
[django-migration-linter](https://github.com/3YOURMIND/django-migration-linter)
to keep database migrations backwards compatible:

- we can automatically run migrations and they are zero-downtime, e.g.
old code can work with the migrated database
- we can run and roll back migrations without worrying about data safety
- OnCall is deployed to multiple environments that the core team is not
able to control

See [django-migration-linter
checklist](https://github.com/3YOURMIND/django-migration-linter/blob/main/docs/incompatibilities.md)
for the common mistakes and best practices


## Which issue(s) this PR fixes

## Checklist

- [ ] Tests updated
- [ ] Documentation added
- [ ] `CHANGELOG.md` updated

---------

Co-authored-by: Joey Orlando <joey.orlando@grafana.com>
2023-02-06 16:01:37 +08:00
Ben Sully
f930e3687e
Include alert details in Grafana Incident alert-group endpoint (#1280)
This will be used by Grafana Incident to infer details about the
incident if alert groups are present.

Depends on #1279
2023-02-03 13:43:21 +00:00
Vadim Stepanov
070eb6e538
Enable mobile app backend by default on OSS (#1286)
# What this PR does
Enables mobile app backend by default on OSS.

## Checklist
- [x] `CHANGELOG.md` updated
2023-02-03 12:44:22 +00:00
Ben Sully
cb98751c0f
Disable listing alert groups using Grafana Incident API (#1279)
Grafana Incident API only ever accesses individual alert groups by ID.
Using the /api/gi/v1/alert-groups endpoint without an ID would list every
single alert group in the database (from what I can tell), since the auth
isn't org-specific.

This is a precursor to another PR which adds more details to the alert
group serializer, which makes the query more resource intensive, so I
thought it best to disable listing first.
2023-02-03 12:26:03 +00:00
Ildar Iskhakov
930f1f9edf
Revert count query in alert groups internal api endpoint (#1285)
# What this PR does

## Which issue(s) this PR fixes

## Checklist

- [ ] Tests updated
- [ ] Documentation added
- [ ] `CHANGELOG.md` updated
2023-02-03 20:01:46 +08:00
Ildar Iskhakov
710f7755c0
Fix bug with root/dependant alert groups list api endpoint (#1284)
# What this PR does

## Which issue(s) this PR fixes

## Checklist

- [ ] Tests updated
- [ ] Documentation added
- [ ] `CHANGELOG.md` updated
2023-02-03 19:53:35 +08:00
Ildar Iskhakov
335c8fe65b
Optimize alert and alert group public api endpoints, add filter by id (#1274)
# What this PR does

## Which issue(s) this PR fixes

## Checklist

- [ ] Tests updated
- [ ] Documentation added
- [ ] `CHANGELOG.md` updated

---------

Co-authored-by: Joey Orlando <joey.orlando@grafana.com>
2023-02-03 17:05:08 +08:00
Matvey Kukuy
038310829b
Mobile app documentation draft. (#1207)
# What this PR does

First draft of documentation. @alyssawada please use it as a starting
point :)

## Which issue(s) this PR fixes

## Checklist

- [ ] Tests updated
- [x] Documentation added
- [ ] `CHANGELOG.md` updated

---------

Co-authored-by: alyssa wada <alyssa.wada@grafana.com>
Co-authored-by: Joey Orlando <joey.orlando@grafana.com>
Co-authored-by: Alyssa Wada <101596687+alyssawada@users.noreply.github.com>
Co-authored-by: Vadim Stepanov <vadimkerr@gmail.com>
2023-02-02 15:06:28 +00:00
Vadim Stepanov
2218161069
Fix test
2023-02-02 14:28:37 +00:00
Vadim Stepanov
08dbab73d2
Remove mobile_app_settings DynamicSetting (#1268)
# What this PR does
Remove checks for `mobile_app_settings` DynamicSetting, so changing
`FEATURE_MOBILE_APP_INTEGRATION_ENABLED` is enough for toggling the
mobile app backend (aka remove per-org feature flag)

Co-authored-by: Joey Orlando <joey.orlando@grafana.com>
2023-02-02 13:21:04 +00:00
Matias Bordese
bc0276fb22
Keep track of direct paging schedule/importance in logs (#1269)
This will eventually allow improving the responders information on an
alert group detail page
2023-02-02 09:21:31 -03:00
Ildar Iskhakov
df1517573e
Cache web template rendered fields for alert and alertgroup endpoints (#1261)
# What this PR does
This PR applies the same approach as introduced
[here](https://github.com/grafana/oncall/pull/1236) to all alert and
alertgroup endpoints

## Which issue(s) this PR fixes

## Checklist

- [ ] Tests updated
- [ ] Documentation added
- [ ] `CHANGELOG.md` updated

---------

Co-authored-by: Joey Orlando <joey.orlando@grafana.com>
2023-02-02 11:37:52 +08:00
Vadim Stepanov
b7176888ed
Better FCM error handling / retries (#1267)
# What this PR does
Raise `FirebaseError` in celery tasks contacting FCM instead of just
logging it + add tests

## Checklist

- [x] Tests updated
2023-02-01 14:45:32 +00:00
Matias Bordese
3e15b8cd85
Add default slack channel info to direct paging dialog (#1263)
2023-02-01 10:03:54 -03:00
Joey Orlando
16196822de
Add utility function to get readonly db key if defined (#1264)
# What this PR does

This is a minor refactor before implementing
https://github.com/grafana/oncall-private/issues/1558.

Additionally, it cleans up a few spots where we do this:
```
# Re-take in case we are in the readonly db context.
```
We currently don't read anything from a read-only database, so this
should not be necessary.

## Checklist

- [x] Tests updated
- [ ] Documentation added (N/A)
- [ ] `CHANGELOG.md` updated (N/A)
2023-02-01 12:07:32 +01:00
Matias Bordese
b1fc123d9f
Add a filter by involved users to alert groups page (#1240)
Related to #1119 

It also adds a shortcut to filter the current user's related alert groups
(alert groups the user was notified about, or in which the user
participated). Make the filter visible by default, with a false value.
2023-01-30 14:08:18 +02:00
Vadim Stepanov
f80271a1f4
Return alert group ID in direct paging API (#1241)
# What this PR does
Make direct paging internal API endpoint return an alert group ID.

## Which issue(s) this PR fixes
Related to https://github.com/grafana/oncall/issues/823

## Checklist

- [x] Tests updated
2023-01-30 11:48:25 +00:00
Ildar Iskhakov
ae44ee5652
Cache render_for_web field for alertgroups list serializer (#1236)
# What this PR does
This PR caches the `render_for_web` field with a lifetime of 1 day. The
cache becomes invalid if it was created before:
* the last alert was received
* the template was changed
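
The invalidation rule above can be sketched as a small predicate. This is an illustration under assumed inputs (plain timestamps passed in by the caller); `is_cache_valid` and its parameters are hypothetical and not taken from the serializer code.

```python
from datetime import datetime, timedelta
from typing import Optional

CACHE_LIFETIME = timedelta(days=1)


def is_cache_valid(
    cached_at: datetime,
    last_alert_at: Optional[datetime],
    template_changed_at: Optional[datetime],
    now: datetime,
) -> bool:
    # Expired: the cache entry is older than its one-day lifetime.
    if now - cached_at > CACHE_LIFETIME:
        return False
    # Stale: the entry was created before the last alert arrived
    # or before the template was last changed.
    for event_time in (last_alert_at, template_changed_at):
        if event_time is not None and cached_at < event_time:
            return False
    return True
```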


## Which issue(s) this PR fixes

## Checklist

- [ ] Tests updated
- [ ] Documentation added
- [ ] `CHANGELOG.md` updated
2023-01-28 12:50:41 +08:00
Matias Bordese
e0ae9919c7
Add paging for direct paging users in slack dialog (#1232)
Fixes an issue when there are more than 100 users to be listed in the
direct paging responders select. Alternatively, we should consider
moving to an `external_select` block later.
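
Splitting a long user list into pages that fit the 100-option limit could look roughly like this. `paginate` is a hypothetical helper used for illustration; the PR's actual paging logic may be structured differently.

```python
def paginate(items, page_size=100):
    """Split items into consecutive pages of at most page_size entries."""
    return [items[i:i + page_size] for i in range(0, len(items), page_size)]
```

Each page can then be rendered as its own set of select options.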
2023-01-27 14:10:44 -03:00
Matias Bordese
dd27b3f2c5
Add schedules support for slack direct paging (#1183)
Related to #823
2023-01-25 09:10:50 -03:00
Joey Orlando
3cf2fcf660
optimize GET /schedules internal API endpoint (#1169)
# What this PR does

Fixes slow internal `GET /schedules` endpoints. Using the fake-data
generation script in #1128, I generated 65 calendar schedules in my
local setup. This resulted in the following endpoint performance:
![Screenshot 2023-01-24 at 12 03
16](https://user-images.githubusercontent.com/9406895/214276618-1a9848ba-eb84-49ec-a099-fdd96beac93f.png)

The responses which show ~76 queries were run on the latest `dev`
branch. Responses w/ ~26 queries were run on this branch.

Additionally:
- add typing to a few methods in `apps/schedules/ical_utils.py`
- document `apps/api/permissions/__init__.py:user_is_authorized`
function

## Which issue(s) this PR fixes

https://github.com/grafana/oncall-private/issues/1552

## Checklist

- [ ] Tests updated
- [ ] Documentation added
- [ ] `CHANGELOG.md` updated

Co-authored-by: Vadim Stepanov <vadimkerr@gmail.com>
2023-01-25 11:08:09 +01:00
Yulya Artyukhina
de5d876d27
Refactor create/update contact points for Alerting integration (#872)
**What this PR does**:
- Keep grafana version on create/update contact points to avoid multiple
requests to alerting
- Add retry limit on create contact point async
- Fix bugs related to creating contact points
- Update logs on create/update contact point to make them clearer
- Avoid unnecessary requests to Grafana Alerting
2023-01-25 09:42:42 +01:00
Ildar Iskhakov
1fc3f6d301
Refactor plugin sync (#1200)
# What this PR does

This PR adds a shortcut in the plugin synchronisation process, so
existing users will be able to log in without waiting for the sync task.
Every request still starts the background synchronisation task, to be
able to propagate organisation changes faster than the periodic task. It
means that we don't necessarily need a "force reload" button in the
interface.
For all the other cases (user does not exist, organisation token "not
ok", etc.) the process remains the same: the plugin will show "Initialising
plugin..." until the background task is successfully completed.

Co-authored-by: Joey Orlando <joey.orlando@grafana.com>
2023-01-25 09:12:08 +08:00
Vadim Stepanov
cf1a1cd7f3
Remove DynamicSetting usage for mobile app backend on OSS (#1204)
# What this PR does
Make it so there's no need to populate the `mobile_app_settings`
DynamicSetting when using the OSS license to turn on the mobile app backend.
2023-01-24 13:53:54 +00:00
Ildar Iskhakov
46b39b2c87
Remove resolved and acknowledged filters as we switched to status (#1201)
# What this PR does

## Which issue(s) this PR fixes

## Checklist

- [ ] Tests updated
- [ ] Documentation added
- [ ] `CHANGELOG.md` updated
2023-01-24 18:13:21 +08:00
Innokentii Konstantinov
cfa7fb816c
Sync users and teams on tf requests (#1180)
# What this PR does
This PR adds sync with Grafana on requests from Terraform.

## Which issue(s) this PR fixes
This is needed to fix the case where customers want to create a team via
the Grafana Terraform provider and use it in the OnCall provider without
having to log into Grafana Cloud.

Co-authored-by: Joey Orlando <joey.orlando@grafana.com>
2023-01-24 13:44:07 +08:00