Commit graph

185 commits

Author SHA1 Message Date
Michael Derynck
f7e406ca6f
Only run cleanup for integrations deleted recently (#4677)
# What this PR does
Changes operations to cleanup deleted empty integrations so that they
are only performed on organizations that have deleted integrations
recently. The previous task checked everything because we were not
performing the cleanup on a regular basis, now that it has been
scheduled regularly we can operate on recently deleted integrations
instead.

The existing task has been removed from the schedule, the code for it
can be removed after the other tasks have cleared.

## Which issue(s) this PR closes

Closes [issue link here]

<!--
*Note*: if you have more than one GitHub issue that this PR closes, be
sure to preface
each issue link with a [closing
keyword](https://docs.github.com/en/get-started/writing-on-github/working-with-advanced-formatting/using-keywords-in-issues-and-pull-requests#linking-a-pull-request-to-an-issue).
This ensures that the issue(s) are auto-closed once the PR has been
merged.
-->

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] Added the relevant release notes label (see labels prefixed w/
`release:`). These labels dictate how your PR will
    show up in the autogenerated release notes.
2024-07-16 13:59:08 +00:00
Yulya Artyukhina
191814b25e
User notifications bundle (#4457)
# What this PR does
This PR adds two new models: UserNotificationBundle and
BundledNotification (proposals for naming are welcome).

`UserNotificationBundle` manages the information about last notification
time and scheduled notification task for bundled notifications. It is
unique per user + notification_channel + notification importance.

`BundledNotification` contains notification policy and alert group, that
triggered the notification. The BundledNotification instance is created
in `notify_user_task` for every notification, that should be bundled,
and is attached to UserNotificationBundle by ForeignKey connection.

How it works:
If the user was notified recently (within the last two minutes) by the
current notification channel, and this channel is bundlable,
BundledNotification instance will be created and attached to the
UserNotificationBundle instance, and `send_bundled_notification` task
will be scheduled to execute in 2 min.
In `send_bundled_notification` task we get all BundledNotification
attached to the current UserNotificationBundle instance, check if alert
groups are still active and if there is only one notification - perform
regular notification by calling `perform_notification` task, otherwise
call "notify_by_<channel>_bundle" method for the current notification
channel.

PR with method to send notification bundle by SMS -
https://github.com/grafana/oncall/pull/4624

**This feature is disabled by default by feature flag. Public docs will
be added in a separate PR with enabling this feature.**
## Which issue(s) this PR closes
related to https://github.com/grafana/oncall-private/issues/2712

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] Added the relevant release notes label (see labels prefixed w/
`release:`). These labels dictate how your PR will
    show up in the autogenerated release notes.
2024-07-16 11:24:08 +00:00
Vadim Stepanov
b421296e13
Fix start_sync_org_with_chatops_proxy crontab (#4661)
Related to
https://raintank-corp.slack.com/archives/C04JCU51NF8/p1720695995646029
2024-07-11 16:28:21 +00:00
Ildar Iskhakov
acf8e396cd
Add START_SYNC_ORG_WITH_CHATOPS_PROXY_ENABLED (#4596)
# What this PR does

## Which issue(s) this PR closes

Closes [issue link here]

<!--
*Note*: if you have more than one GitHub issue that this PR closes, be
sure to preface
each issue link with a [closing
keyword](https://docs.github.com/en/get-started/writing-on-github/working-with-advanced-formatting/using-keywords-in-issues-and-pull-requests#linking-a-pull-request-to-an-issue).
This ensures that the issue(s) are auto-closed once the PR has been
merged.
-->

## Checklist

- [ ] Unit, integration, and e2e (if applicable) tests updated
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required)
- [ ] Added the relevant release notes label (see labels prefixed w/
`release:`). These labels dictate how your PR will
    show up in the autogenerated release notes.
2024-06-27 16:33:13 +08:00
Innokentii Konstantinov
48b7eca26d
Peridic chatops proxy sync (#4565)
Adds a pediodic job to sync tenants with chatops-proxy. Register request
will behave as upsert, allowing us to backfill new tenant's columns:
stack_id and stack_slug. Upsert is not merged on chatops-proxy, so
that's why task handling 409 status on /tenants/register request.
On top of that, I did small refactoring and introduced a new
register_oncall_tenant func, which receives org as an argument to not to
write `register_tenant(org.uuid, org.stack_id, org.stack_slug,.....)`
every time.
Part of https://github.com/grafana/oncall-gateway/issues/247
Need to be merged after https://github.com/grafana/oncall/pull/4559.
2024-06-24 04:35:59 +00:00
clemthom
28190fe6b7
add exotel call provider (#4433)
# What this PR does

Added support for [Exotel](https://exotel.com/) call provider. 

Features:

- Sending verification code through SMS
- Making test call
- Making notification call


## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] Added the relevant release notes label (see labels prefixed w/
`release:`). These labels dictate how your PR will
    show up in the autogenerated release notes.
2024-06-06 06:19:02 +00:00
Innokentii Konstantinov
805d4421bd
Support grafana escalate (#4453)
# What this PR does
This PR adds support for **/grafana escalate** command alongside with
**/escalate.**
2024-06-05 05:51:26 +00:00
Andrey Oleynik
0acb66bd1a
conditionally enable searching for alert groups via env var (#4287)
# What this PR does

re-enables the search for alert groups

## Which issue(s) this PR closes

Closes #2232

<!--
*Note*: if you have more than one GitHub issue that this PR closes, be
sure to preface
each issue link with a [closing
keyword](https://docs.github.com/en/get-started/writing-on-github/working-with-advanced-formatting/using-keywords-in-issues-and-pull-requests#linking-a-pull-request-to-an-issue).
This ensures that the issue(s) are auto-closed once the PR has been
merged.
-->

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] Added the relevant release notes label (see labels prefixed w/
`release:`). These labels dictate how your PR will
    show up in the autogenerated release notes.

---------

Co-authored-by: Joey Orlando <joey.orlando@grafana.com>
Co-authored-by: Joey Orlando <joseph.t.orlando@gmail.com>
2024-06-04 11:51:02 +00:00
Andrey Oleynik
d0dd15453e
change zvonok call verification (#4393)
# Change zvonok call verification

After May 27, the Zvonok service will block number verification that was
not set up as part of the phone number verification campaign. This PR
modifies the number verification process.

<!--
*Note*: if you have more than one GitHub issue that this PR closes, be
sure to preface
each issue link with a [closing
keyword](https://docs.github.com/en/get-started/writing-on-github/working-with-advanced-formatting/using-keywords-in-issues-and-pull-requests#linking-a-pull-request-to-an-issue).
This ensures that the issue(s) are auto-closed once the PR has been
merged.
-->

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] Added the relevant release notes label (see labels prefixed w/
`release:`). These labels dictate how your PR will
    show up in the autogenerated release notes.

Co-authored-by: Innokentii Konstantinov <innokenty.konstantinov@grafana.com>
2024-06-04 05:34:57 +00:00
Innokentii Konstantinov
17f448c506
Prepare OnCall for Unified Slack App (#4232)
This PR does a bunch of changes to prepare OnCall for Unified Slack App:
1. Install Slack via Chatops-Proxy. This change contains two parts:
getting a Slack install link from chatops-proxy
([code](https://github.com/grafana/oncall/pull/4232/files#diff-437a77d49fc04b92d315651b3df5991000b1ab74cf60aabb21aa77cb2823bf52R46))
and receiving a "slack installed" event from chatops-proxy
([code](https://github.com/grafana/oncall/pull/4232/files#diff-976d106f0962be5c1de5e35582193f68435ed0c17f2defd6bd2857bf6e27f65d)).
Also it means that OnCall doesn't need to register slack_links anymore
when slack is connected/disconnected. These changes are behind
UNIFIED_SLACK_APP_ENABLED flag and should be no-op if flag is not
enabled.
2. Get rid of Multiregionatily restrictions - instrument all slack
interactions with a ProxyMeta - json data telling chatops-proxy where to
route the interaction. Note, that it doesn't apply for "Add to
resolution notes" message action - it will be handled differently in
following PR.
3. Move all chatops-proxy related stuff from common/oncall-gateway to
apps/chatops-proxy

Minor changes:
1. Remove usage of **CHATOPS_V3** flag. Chatops v3 is already released
(It's a refactoring from previous quarter)

---------

Co-authored-by: Vadim Stepanov <vadimkerr@gmail.com>
Co-authored-by: Rares Mardare <rares.mardare@grafana.com>
2024-06-03 09:07:10 +00:00
Joey Orlando
a3187953ec
remove deprecated rbac workaround (#4377)
## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] Added the relevant release notes label (see labels prefixed w/
`release:`). These labels dictate how your PR will
    show up in the autogenerated release notes.
2024-05-22 15:27:16 +00:00
Joey Orlando
2582a1b1dc
Refactor how RBAC enabled/disabled status is determined for Grafana Cloud stacks (#4279)
# What this PR does

In cloud we are currently (somewhat) improperly determining whether or
not a Grafana stack had the `accessControlOnCall` feature flag enabled.
At first things worked fine. We would enable this feature toggle via the
Grafana Admin UI, and then the OnCall backend would read this value from
GCOM's `GET /instance/<stack_id>` endpoint (via
`config.feature_toggles`), and everything worked as expected.

There was a recent change made in `grafana/deployment_tools` to set this
feature flag to True for all stacks. However, for some reason, the GCOM
endpoint above doesn't return the `accessControlOnCall` feature toggle
value in `config.feature_toggles` if it is set in this manner (it only
returns the value if it is set via the Grafana Admin UI).

So what we should instead be doing is such instead of asking GCOM for
this feature toggle, infer whether RBAC is enabled on the stack by doing
a `HEAD /api/access-control/users/permissions/search` (this endpoint _is
only_ available on a Grafana stack if `accessControlOnCall` is enabled).

**Few caveats to this ☝️**
1. we first have to make sure that the cloud stack is in an `active`
state (ie. not paused). This is because, no matter if the
`accessControlOnCall` is enabled or not, if the stack is in a `paused`
state it will ALWAYS return `HTTP 200` which can be misleading and lead
to bugs (this feels like a bug on the Grafana API, will follow up with
core grafana team)
2. Once we roll out this change we will effectively **actually** be
enabling RBAC for OnCall for all orgs. The Identity Access team would
prefer a progressive rollout, which is why I decided to introduce the
concept of
[`settings.CLOUD_RBAC_ROLLOUT_PERCENTAGE`](https://github.com/grafana/oncall/pull/4279/files#diff-3383aef931e41e44d95829ad971641eeb98fe001be2f5da92217446d300ea1b3R918)
(see also [`Organization.
should_be_considered_for_rbac_permissioning`](https://github.com/grafana/oncall/pull/4279/files#diff-2ca9917f4f56349be39545ee8abd459be5076295d02ca3a7ec545152fcddccdfR348-R362))

## Which issue(s) this PR closes

Related to https://github.com/grafana/identity-access-team/issues/667

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] Added the relevant release notes label (see labels prefixed w/
`release:`). These labels dictate how your PR will
    show up in the autogenerated release notes.
2024-05-14 16:30:16 +00:00
Michael Derynck
477477e4f7
Add logging of arguments for celery tasks (#4314)
# What this PR does
Enables logging of arguments for celery tasks. Currently it can be
difficult to troubleshoot issues as many tasks give no context to what
they are operating on in the logs.

## Which issue(s) this PR closes

<!--
*Note*: if you have more than one GitHub issue that this PR closes, be
sure to preface
each issue link with a [closing
keyword](https://docs.github.com/en/get-started/writing-on-github/working-with-advanced-formatting/using-keywords-in-issues-and-pull-requests#linking-a-pull-request-to-an-issue).
This ensures that the issue(s) are auto-closed once the PR has been
merged.
-->

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] Added the relevant release notes label (see labels prefixed w/
`release:`). These labels dictate how your PR will
    show up in the autogenerated release notes.
2024-05-06 21:55:09 +00:00
Innokentii Konstantinov
9afbcfc063
Fix docs and UI for connecting Grafana Alerting from other stack (#4243)
This PR fixes docs and UI to avoid usage of Grafana (Other) integration
which is using old mechanism for alert creation.
1. Rename Grafana (Other) integration to Grafana Alerting Legacy
2. Remove its mentions from docs and correct docs for connection Grafana
Alerting
3. Make AlertManager featured integration and upgrade its description.

![image](https://github.com/grafana/oncall/assets/20221722/6e84403e-c293-4791-9905-4d06c69775e9)

---------

Co-authored-by: Rares Mardare <rares.mardare@grafana.com>
2024-04-24 08:02:51 +00:00
Ravishankar
6f3f4e3f14
Allow webhook modification by API for advanced webhook (#4175)
# What this PR does

Enables the API to perform updates on the advanced webhooks created via
the UI

## Which issue(s) this PR closes

Closes #3958 

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] Added the relevant release notes label (see labels prefixed w/
`release:`). These labels dictate how your PR will
    show up in the autogenerated release notes.
2024-04-23 19:18:12 +00:00
Joey Orlando
25f8963749
Google Calendar Integration + automatic Shift Swap Request generation (#4220)
# What this PR does

## Which issue(s) this PR closes

Closes https://github.com/grafana/oncall-private/issues/2591

## Checklist

- [ ] Unit, integration, and e2e (if applicable) tests updated (N/A)
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] Added the relevant release notes label (see labels prefixed w/
`release:`). These labels dictate how your PR will
    show up in the autogenerated release notes.
2024-04-15 12:56:28 +00:00
Michael Derynck
d14b1c8e28
Improve performance for rate-limited, banned and deleted integrations (#4137)
# What this PR does
- Remove BanAlertConsumptionBasedOnSettingsMiddleware under high traffic
scenarios this causes too much DB load and the same effect can be
achieved by soft-deleting the integration
- Change lookups for integrations so that non-existent and deleted
integrations are also cached to reduce DB load

## Which issue(s) this PR closes

Closes https://github.com/grafana/oncall-private/issues/2608

<!--
*Note*: if you have more than one GitHub issue that this PR closes, be
sure to preface
each issue link with a [closing
keyword](https://docs.github.com/en/get-started/writing-on-github/working-with-advanced-formatting/using-keywords-in-issues-and-pull-requests#linking-a-pull-request-to-an-issue).
This ensures that the issue(s) are auto-closed once the PR has been
merged.
-->

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] Added the relevant release notes label (see labels prefixed w/
`release:`). These labels dictate how your PR will
    show up in the autogenerated release notes.
2024-04-04 18:05:34 +00:00
Joey Orlando
59f727d4f5
Google OAuth2 flow + fetch Google Calendar OOO events (#4067)
# What this PR does

The following is deployed under a feature flag.

**How it works**
1. The user clicks on the "Connect using your Google account" button in
the user profile settings modal
2. The UI makes a call to `GET /api/internal/v1/login/google-oauth2`.
The backend has now been configured to add
`apps.social_auth.backends.GoogleOAuth2` as a "`social_auth` backend".
3. The backend will respond w/ a URL which points to the Google OAuth2
consent screen. The frontend then proceeds by sending the user to this
page. This URL includes the following query parameters (amongst others):
- `redirect_uri` - this will send the user back to
`/api/internal/v1/complete/google-oauth2` (ie. make another API call to
the OnCall backend to finalize the Google OAuth2 flow)
- `state` - this represents an
`apps.auth_token.models.GoogleOAuth2Token` token. This allows us to
identify the OnCall user once they've linked their Google account.
4. Once redirected back to `/api/internal/v1/complete/google-oauth2`,
this will complete the OAuth2 flow. At this point, the backend has
access to several pieces of information about the Google user, including
their `access_token` and `refresh_token`. We persist these (encrypted)
for future use to fetch the user's out-of-office calendar events
5. The response from the API call in 4 above ☝️ is HTTP 302 (redirect)
to `/a/grafana-oncall-app/users/me` (ie. open the user profile settings
modal). At this point the user will see that their account has been
connected and they can further configure the settings

![image](https://github.com/grafana/oncall/assets/9406895/c7673055-8485-4f9a-98df-b4f7347229ce)


## Which issue(s) this PR closes

Closes https://github.com/grafana/oncall-private/issues/2584

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required) - will be done in
https://github.com/grafana/oncall-private/issues/2591
- [x] Added the relevant release notes label (see labels prefixed w/
`release:`). These labels dictate how your PR will
show up in the autogenerated release notes. - will be done in
https://github.com/grafana/oncall-private/issues/2591

---------

Co-authored-by: Dominik <dominik.broj@grafana.com>
Co-authored-by: Maxim Mordasov <maxim.mordasov@grafana.com>
2024-04-02 14:59:03 -04:00
Alex Burnett
9118ccfe58
Allow custom values for Self Hosted Clusters (Slugs, Titles, Region) (#4121)
# What this PR does
Allows for environment variables to be set on Grafana OnCall Engine for
Self Hosted users, giving them the ability to set values for Stack Slug,
Org Slug/Title, Region & Cluster Slugs.

This will mean then using the Grafana OnCall App, when adding multiple
OSS Stacks, you can set the correct value of 'stack_slug' so you can
differentiate between the stacks in the App.

## Which issue(s) this PR closes

Closes [#4119](https://github.com/grafana/oncall/issues/4119)

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required)
- [ ] Added the relevant release notes label (see labels prefixed w/
`release:`). These labels dictate how your PR will
    show up in the autogenerated release notes.

Co-authored-by: Joey Orlando <joey.orlando@grafana.com>
2024-03-27 20:23:32 +00:00
Michael Derynck
d938b52d80
Add scheduled task to start cleanup tasks (#3976)
# What this PR does
Add scheduled task to start cleanup tasks. Currently purpose is to run
the task every 12 hours and for all active orgs cleanup empty & deleted
integrations. For deleted orgs we can run this manually. It will also
run if an org moves from active to deleted. It is expected to add more
to cleanup_organization task over time.

## Which issue(s) this PR fixes

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2024-03-04 19:45:01 +00:00
Innokentii Konstantinov
16f2e6ecee
Use uwsgi instrumentation and log request headers (#3997)
# What this PR does
Use uwsgi instead of TracingMiddleware to not to mess up with span
attrs. Also it's an attempt to fix different trace IDs between
cortex-gw, oncall and labels plugin. Last, but not least - add a
middleware to log request headers if OTEL is enabled. It's needed to
debug tracing headers
2024-03-04 09:25:39 +00:00
Innokentii Konstantinov
f6d0441411
Add trace_id to log lines (#3982)
# What this PR does
This PR set up tracing to propagates trace_id to log lines.
There are two different tracers: local one in manage.py - it's used when
app is started via manage.py runserver. In this case spans will be just
written in console. Second traces is confugured in wsgi.py. It will be
used when app is runned via uwsgi and it will export traces via grpc.
Feature is hidden behind the feature flag.
2024-03-04 06:42:43 +00:00
Vadim Stepanov
2b5554c079
Remove explicit request size limits (#3878)
# What this PR does

Remove explicit request size limits both for uwsgi & Django. After this
change, the effective request size limit will be 2.5MB as the default
Django value for
[DATA_UPLOAD_MAX_MEMORY_SIZE](https://docs.djangoproject.com/en/4.2/ref/settings/#data-upload-max-memory-size).

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2024-02-22 15:00:33 +00:00
Sean Wood
61a657b0cd
Allow setting email app to use SSL instead of TLS (#3911)
# What this PR does
Adds flexibility of the method of encryption in the SMTP email app. Some
email servers are configured to use port 465 (intrinsic TLS) which
requires `EMAIL_USE_SSL` instead of `EMAIL_USE_TLS`.

## Which issue(s) this PR fixes
Fixes https://github.com/grafana/oncall/issues/1044

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)

---------

Co-authored-by: Joey Orlando <joey.orlando@grafana.com>
Co-authored-by: Joey Orlando <joseph.t.orlando@gmail.com>
2024-02-20 03:38:09 -05:00
Andrey Oleynik
7cd814507e
Improve zvonok verification call (#3768)
# What this PR does

Added support for message template during phone number verification for
the Zvonok service. Template message support allows the use of
[SSML](https://www.w3.org/TR/2010/REC-speech-synthesis11-20100907/)
markup depending on the speaker used for speech synthesis.

Example: `Your verification code is <prosody
rate="x-slow">$verification_code</prosody>`

## Which issue(s) this PR fixes

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)

---------

Co-authored-by: Innokentii Konstantinov <innokenty.konstantinov@grafana.com>
Co-authored-by: Joey Orlando <joey.orlando@grafana.com>
Co-authored-by: Joey Orlando <joseph.t.orlando@gmail.com>
2024-02-08 14:05:11 -05:00
KevinDW-Fluxys
4e3194c106
expose OUTGOING_WEBHOOK_TIMEOUT as env var (#3801)
# What this PR does
It adds functionality to be able to configure the outgoing webhook
timeout from an environment variable.

## Which issue(s) this PR fixes
Running into timeouts when outgoing webhooks take longer than 4 seconds
(which is exceptional, but can happen) the webhook reports failure,
while it still might have succeeded on the webhook side.

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)

---------

Co-authored-by: Joey Orlando <joey.orlando@grafana.com>
Co-authored-by: Joey Orlando <joseph.t.orlando@gmail.com>
2024-02-01 10:48:09 -05:00
Innokentii Konstantinov
eb3f41c80f
Enable Grafana Alerting v2 integration (#3808)
Enables Grafana Alerting v2 feature. It will enable integration dropdown
in OnCall Contact point in the Alerting UI.
In OSS if user has grafana version which supports this feature it will
work as well. If user has old grafana/oncall - nothing will happen.

---------

Co-authored-by: Ildar Iskhakov <Ildar.iskhakov@grafana.com>
2024-02-01 16:20:47 +08:00
Yulya Artyukhina
16ce0136f3
Refactor gaps and empty shift checks (#3785)
Refactor gaps and empty shift checks:
- Increase checking gaps and empty shifts frequency
- Unify gaps and empty shift checks
2024-01-31 15:25:06 +01:00
Joey Orlando
3833d8de56
remove manual alert group (/oncall) slack slash command + force_route_id (#3790)
# What this PR does

Related to [this
discussion](https://raintank-corp.slack.com/archives/C04JCU51NF8/p1706550226831949)

Removes the `/oncall` Slack slash command + the concept of
`force_route_id` (as this Slack slash command was the last piece of code
to use this concept
[here](https://github.com/grafana/oncall/blob/dev/engine/apps/slack/scenarios/manual_incident.py#L146))

## TODO before merging
- [x] update the various env's Slack apps to remove the slash command
from the app manifests

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2024-01-30 17:28:23 -05:00
Joey Orlando
2abcc4563a
mobile app proxy - request auth token from cloud auth api (#3748)
# What this PR does

Related to https://github.com/grafana/oncall-private/issues/2071

See [this
conversation](https://raintank-corp.slack.com/archives/C064R17Q1A8/p1706125615995019)
for all the context

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2024-01-25 13:46:55 -05:00
Innokentii Konstantinov
4a02d83fd1
Chatops api v3 (#3721)
This PR makes OnCall compatible with chatops-proxy v3. When CHATOPS_V3
is enabled, oncall will use new api client to register tenants and slack
installations. Also I added v3 routes for slack and telegram, so it's
possible to test new chatops proxy.

Currently two versions of chatops-proxy api are deployed, but they are
not compatible. They are doing same thing, using different db model and
tables. Once only v3 version will be left in prod, I'll remove
CHATOPS_V3 env var, all leftovers of previous api client and v3 slack
and telegram routes.

---------

Co-authored-by: Vadim Stepanov <vadimkerr@gmail.com>
2024-01-20 06:56:17 +00:00
Vadim Stepanov
cc071806f3
disable DRF_SPECTACULAR_ENABLED by default 2024-01-15 16:06:46 +00:00
Vadim Stepanov
d0904ca405
Improve OpenAPI schema coverage (#3629)
# What this PR does

Improves OpenAPI schema coverage for internal API:

- Fixes/Improves `alert group` and `feature` endpoints
- Adds `integration` and `user` endpoints

## Which issue(s) this PR fixes

https://github.com/grafana/oncall/issues/3444

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2024-01-12 15:11:22 +00:00
Matias Bordese
4e2e7e0a15
Add task logging personal notifications triggered/completed counts (#3638)
Related to https://github.com/grafana/oncall-private/issues/2347
2024-01-10 18:54:27 +00:00
Michael Derynck
e7f3eff72c
Limit how long acknowledge reminders can run for (#3571)
# What this PR does
Stops rescheduling of `acknowledge_reminder_task` after 2 weeks.
Assumption being if it has been sitting for that long in acknowledged
state it is likely to not need more reminders that it is still
acknowledged. Notifications for thread were probably muted a long time
ago.

## Which issue(s) this PR fixes

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2023-12-15 16:50:01 +00:00
Joey Orlando
382b18b052
Mobile app proxy gateway (#3449)
# What this PR does

Closes https://github.com/grafana/oncall-private/issues/2324

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2023-12-05 19:58:05 +00:00
Matias Bordese
45200c33a1
Update beat schedule to use crontab schedule types (#3497)
Update celery beat schedule to use crontab schedule types, since
otherwise the timedelta is relative to the celery start and when we have
a restart we have some bigger than expected gaps between task runs
(alternatively it seems we could also use the `relative` option
described
[here](https://docs.celeryq.dev/en/main/userguide/periodic-tasks.html#available-fields))

Related to https://github.com/grafana/oncall-private/issues/2347
2023-12-04 18:42:12 +00:00
Joey Orlando
1df1b1eaa0
patch redis cluster multi-key operations (#3496)
# Which issue(s) this PR fixes

Related to https://github.com/grafana/oncall-private/issues/2363

Addresses this issue that arises when using
`cache.get_many`/`cache.set_many` operations with a Redis Cluster:
```python3
File "/usr/local/lib/python3.11/site-packages/redis/cluster.py", line 1006, in determine_slot
    raise RedisClusterException(
redis.exceptions.RedisClusterException: MGET - all keys must map to the same key slot
```

From the Redis Cluster
[docs](https://redis.io/docs/reference/cluster-spec/#hash-tags), this
can be addressed with this 👇 . Basically this will ensure that keys in
multi-key operations will resolve to the same hash slot (read: node):

> Hash tags
> There is an exception for the computation of the hash slot that is
used in order to implement hash tags. Hash tags are a way to ensure that
multiple keys are allocated in the same hash slot. This is used in order
to implement multi-key operations in Redis Cluster.
> 
> To implement hash tags, the hash slot for a key is computed in a
slightly different way in certain conditions. If the key contains a
"{...}" pattern only the substring between { and } is hashed in order to
obtain the hash slot. However since it is possible that there are
multiple occurrences of { or } the algorithm is well specified by the
following rules:
> 
> IF the key contains a { character.
> AND IF there is a } character to the right of {.
> AND IF there are one or more characters between the first occurrence
of { and the first occurrence of }.
> Then instead of hashing the key, only what is between the first
occurrence of { and the following first occurrence of } is hashed.

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required)
- [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2023-12-04 13:08:57 -05:00
Innokentii Konstantinov
a3e3d8bc9d
Change labels feature flag to work per oncall org (#3493)
It's needed because anyway labels plugin provisioned per stack, not per
org

---------

Co-authored-by: Yulya Artyukhina <Ferril.darkdiver@gmail.com>
2023-12-04 12:45:07 +00:00
Dominik Broj
2c4b34d3af
generate types, create http client and add exemplary usage (#3384)
# What this PR does

https://github.com/grafana/oncall/issues/3330

- add a script that generates TS type definitions based on OnCall API
OpenAPI schemas
- support adding custom properties on the frontend if needed
- add simple example of usage (for `'/labels/keys/'` endpoint)

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2023-11-29 05:16:13 +00:00
Ildar Iskhakov
95a3ab3b75
Revert "Cache independent ingestion" (#3417)
Reverts grafana/oncall#3415
2023-11-23 21:38:06 +08:00
Ildar Iskhakov
0d5ef785bf Make alert ingestion cache independent 2023-11-23 11:27:47 +08:00
Matias Bordese
e62fd3b122
Switch to django-redis, update redis deps (#3387)
Moving to django-redis and updating related deps.
django-redis has better support for [sentinel
setup](https://github.com/jazzband/django-redis#use-the-sentinel-connection-factory),
besides improved cache helpers (and being actively maintained).

---------

Co-authored-by: Joey Orlando <joey.orlando@grafana.com>
2023-11-20 20:27:13 +00:00
Joey Orlando
5678d79927
allow specifying more than one redis server URI in the REDIS_URI env var (#3368)
# What this PR does

Modifies the Django `settings/base.py` such that `REDIS_URI` can now be
a comma (or semicolon) separated list of URIs. From [Django
docs](https://docs.djangoproject.com/en/4.2/topics/cache/#:~:text=If%20you%20have%20multiple%20Redis%20servers%20set%20up%20in%20the%20replication%20mode%2C%20you%20can%20specify%20the%20servers%20either%20as%20a%20semicolon%20or%20comma%20delimited%20string%2C%20or%20as%20a%20list):

> If you have multiple Redis servers set up in the replication mode, you
can specify the servers either as a semicolon or comma delimited string,
or as a list. While using multiple servers, write operations are
performed on the first server (leader). Read operations are performed on
the other servers (replicas) chosen at random:
> ```python3
> CACHES = {
>     "default": {
>         "BACKEND": "django.core.cache.backends.redis.RedisCache",
>         "LOCATION": [
>             "redis://127.0.0.1:6379",  # leader
>             "redis://127.0.0.1:6378",  # read-replica 1
>             "redis://127.0.0.1:6377",  # read-replica 2
>         ],
>     }
> }
> ```

## Checklist

- [ ] Unit, integration, and e2e (if applicable) tests updated (N/A)
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2023-11-16 10:48:36 -05:00
Matias Bordese
ac01dd173d
Improve user permissions query (#3291)
The query for checking a user permission (used to get users from a Slack
usergroup, for example) is timing out (and generating retries, besides
affecting some use cases:
[logs](https://ops.grafana-ops.net/explore?panes=%7B%22FCQ%22:%7B%22datasource%22:%22c-R8UWvVk%22,%22queries%22:%5B%7B%22refId%22:%22A%22,%22expr%22:%22%7Bnamespace%3D%5C%22amixr-prod%5C%22,%20cluster%3D%5C%22prod-us-central-0%5C%22%7D%20%7C%3D%20%5C%22Timeout%20exceeded%20in%20regular%20expression%20match.%5C%22%22,%22queryType%22:%22range%22,%22datasource%22:%7B%22type%22:%22loki%22,%22uid%22:%22c-R8UWvVk%22%7D,%22editorMode%22:%22code%22%7D%5D,%22range%22:%7B%22from%22:%22now-6h%22,%22to%22:%22now%22%7D%7D%7D&schemaVersion=1&orgId=1)):

`django.db.utils.OperationalError: (3699, 'Timeout exceeded in regular
expression match.')`

Change to a `contains` query except for SQLite (not supported), where a
simplified version of the original regex query is used.
2023-11-07 16:58:16 +00:00
Joey Orlando
2cbb20601e
Improve performance of GET /users and GET /teams endpoints used by add responders popup (#3241)
# What this PR does

- Improve performance of the specific `GET /users` and `GET /teams`
calls that're made by the Add Responders dropdown in the UI
- Add `GET /team/{teamId}` internal API route (needed by Grafana
Incident team for their Add Responders changes)
- Some UI improvements to the Add Responders popup (loading state +
pre-fetch users and teams when the drawer is opened)
- Re-enable django-admin only if `settings.SILK_PROFILER_ENABLED ==
True` (need to be able to log into django admin to auth to silk routes)

Closes #3231 
Closes https://github.com/grafana/oncall-private/issues/2252

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2023-11-03 12:40:54 -04:00
Yulya Artyukhina
66a898df7d
Add labels feature flag for list of organizations (#3246)
# What this PR does
Adds a flag that allows to enable labels feature for the list of
organizations

## Which issue(s) this PR fixes
https://github.com/grafana/oncall-private/issues/2226

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2023-11-02 09:52:32 +00:00
Matias Bordese
11259de8e0
Add settings to allow detaching integrations server (#3203)
To run a detached integrations server:
1. Set env var `DETACHED_INTEGRATIONS_SERVER=True`
2. Run engine with the `integrations_urls.py` root url conf
(e.g. `ROOT_URLCONF=engine.integrations_urls python manage.py runserver
0.0.0.0:8081`)
2023-10-26 12:55:02 +00:00
Yulya Artyukhina
24f4969f61
Add labels implementation for integration (#3014)
# What this PR does
Adds labels implementation for integrations:
- ability to create/update labels on creating/updating integration
- ability to associate labels to integrations
- cache for label reprs on OnCall side
- feature flag to enable/disable labels

## Which issue(s) this PR fixes
https://github.com/grafana/oncall-private/issues/2157

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)

---------

Co-authored-by: Maxim <maxim.mordasov@grafana.com>
Co-authored-by: Rares Mardare <rares.mardare@grafana.com>
2023-10-20 07:30:11 +00:00
Andre Buryndin
d9c3d084be
feature: Hardening the Helm deployment with Redis and Postgres TLS (#3029)
# What this PR does

Short summary: this PR improves security and configuration management
for Helm deployment. Please take a look at the details below.

## Which issue(s) this PR fixes

Issues:
- Cannot explicitly define redis database (only 0 and 1 numbers are
used)
- Cannot securely use TLS for Redis (cannot set CA certificate; cannot
set client certificates)
- Cannot securely use TLS for Postgres (cannot set CA certificate;
cannot set client certificates; cannot set `verify-full` validation)
- ~~Chart option `securityContext.readOnlyRootFilesystem: true` issues
CrashLoopBack pod state~~ will be moved to new PR

## Checklist

- [x] ~~Unit, integration, and e2e (if applicable) tests updated~~ (not
required)
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)

- [x] Helm tests are fixed and updated
- [x] Manually verified the features:
  - [x] postgres TLS connection with `verify-full` validation
  - [x] redis TLS connection with `cert_required` validation
  - [x] redis protocol and database number controls
  - [x] all containers properly work in read-only root filesystem
- [x] all changes are backward compatible (doesn't break old
deployments)

## Changelog

- Fixed helm tests
- Added configuration options for secure TLS communication with
dependencies like Redis, MySQL, and Postgres
- ~~Added configuration option for relocating `celerybeat` database file
(read-only root filesystem issue)~~ will be moved to new PR
- Improved redis database configuration options
- Now only single redis database is used
- Added ability to mount custom volumes (with CA certificates, for
example) into Helm chart
- ~~Fixed issue with read-only root filesystem for Helm chart~~ will be
moved to new PR
- Add ability to work with Redis ACL (and AWS ElastiCache)
2023-10-03 09:25:28 -04:00