This fixes scenario described
[here](https://github.com/grafana/oncall/issues/2118#issuecomment-1580499754).
When a rotation is setup in UTC+1, and the shift starts at 00:00 with
Monday as active day and a weekly frequency, the values are translated
to UTC when submitting to the backend, so the shift data becomes
something like: shift starting at 23:00 on Sunday, but since week_start
is on Monday, the "first event" in the week belongs to the "previous
week". This can be addressed by moving the week_start, so a weekly shift
that was starting on a Monday but in UTC tz starts on Sunday,
"translates" to a UTC week_start on Sunday:

(this is with the proposed changes; otherwise you get the same issue
linked above where the first event in the week is assigned to the other
user group).
About selected week days changed when editing a rotation, see inline
comment (related to
[this](https://github.com/grafana/oncall/issues/1322#issuecomment-1521787786))
# What this PR does
Closes#2169
## Checklist
- [x] Unit, integration, and e2e (if applicable) tests updated
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
# What this PR does
Introduces AlertManagerV2 integration with better grouping and
autoresolving, not intended for production use yet.
---------
Co-authored-by: Ildar Iskhakov <Ildar.iskhakov@grafana.com>
# What this PR does
- Adds [`mypy` static type checking](https://mypy-lang.org/) to our CI
pipeline. Currently there is still a **ton** of errors being returned by
the tool, as we'll need to fix pre-existing errors. I think we can
slowly chip away at these errors in small PRs, doing them all in one
large PR is likely very risky.
- Also, this PR starts chipping away at one of the main type errors that
we have which is accessing the `datetime` class (from the `datetime`
library) or `timedelta` function on the `django.utils.timezone` module.
Basically we should be instead accessing these two objects from the
native `datetime` module. This makes sense because the [`__all__`
attribute](https://github.com/django/django/blob/main/django/utils/timezone.py#L14-L30)
in `django.utils.timezone` does not re-export `datetime` or `timedelta`.
- splits `engine` dependencies out into `requirements.txt` and
`requirements-dev.txt`
## Checklist
- [ ] Unit, integration, and e2e (if applicable) tests updated (N/A)
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required) (N/A)
- [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required) (N/A)
# What this PR does
## Which issue(s) this PR fixes
## Checklist
- [ ] Unit, integration, and e2e (if applicable) tests updated
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required)
- [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
We were noticing some discrepancies in alert groups counts.
From a real example:
```
# queryset is the alert groups initial queryset
>>> qs = queryset.filter(channel_filter__alert_receive_channel=arc)
>>> qs.filter(resolved_at__isnull=False).count()
1318
>>> qs = queryset.filter(channel=arc)
>>> qs.filter(resolved_at__isnull=False).count()
1356
>>> qs.filter(resolved_at__isnull=False, channel_filter__isnull=True).count()
38
```
# What this PR does
## Which issue(s) this PR fixes
## Checklist
- [ ] Unit, integration, and e2e (if applicable) tests updated
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required)
- [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
# What this PR does
Create a custom non-root user and use it to start an app. So uwsgi does
not require to use `setUid` and `setGid` system calls.
It handles errors while starting in Kubernetes with `runAsNonRoot: true`
check.
## Which issue(s) this PR fixes
closes https://github.com/grafana/oncall/issues/445
## Checklist
- [x] Unit, integration, and e2e (if applicable) tests updated
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
---------
Co-authored-by: Joey Orlando <joey.orlando@grafana.com>
Co-authored-by: Joey Orlando <joseph.t.orlando@gmail.com>
# What this PR does
Introduces BaseFailed exception for phone_notificator.
# Why
We need to somehow distinguish errors we want to be notified - like
network errors or invalid twilio credentials (I will call them "real"
errors) and errors we want to share with user, but don't want to be
paged ( I will call them "fake" errors).
To do that I added "graceful_msg" to all Failed... exceptions. If
details field is present - it mean we can return 400 code with the
message, if not - 500 code. So, "real" errors will raise Failed...
exception, while "fake" will add "graceful_msg".
# TODO
handle exceptions handled here
https://github.com/grafana/oncall/pull/2065
## Checklist
- [ ] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
---------
Co-authored-by: Michael Derynck <michael.derynck@grafana.com>
# What this PR does
This PR does three improvements in twilio_phone_provider:
1.
[Speed-up](https://github.com/grafana/oncall/pull/2034/files#diff-7a311767169c024e60e2b4e35fd531dd6e2f1ea785cfc84263e11e7932d622af)
query which calculates amount of phone_calls/sms left.
2. Remove code which was needed only for backward compatibility during
the release of PhoneProvider refactoring and improves logging for
handling status/gather updates.
3. Add db_index to twilio_sid. We are doing lot of lookups by sid and
with increasing amount of data it became resource consuming.
# What this PR does
Add first internal api error code and add "resolution_note" param to
resolve endpoint
## Which issue(s) this PR fixes
It fixes resolving alert groups via mobile app when resolution note is
required
## Checklist
- [x] Unit, integration, and e2e (if applicable) tests updated
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required)
- [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
# What this PR does
Makes sure that all viewset actions with `detail=True` use
`self.get_object()` to retrieve an instance that's being acted upon.
## Checklist
- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
# What this PR does
- update `make test` to always use `settings.ci-test`. Right now it will
use whatever the value of `DJANGO_SETTINGS_MODULE` is in
`./dev/.env.dev`, which causes ~45 tests to fail
- Fix several Python warnings that we see when running the tests
```bash
RemovedInDjango40Warning: The providing_args argument is deprecated. As it is purely documentational, it has no replacement. If you rely on this argument as documentation, you can move the text to a code comment or docstring.
alert_create_signal = django.dispatch.Signal(
```
```bash
PytestCollectionWarning: cannot collect test class 'TestOnlyBackend' because it has a __init__ constructor (from: apps/api/tests/test_alert_receive_channel_template.py)
class TestOnlyBackend(BaseMessagingBackend):
```
```bash
DeprecationWarning: The parameter 'use_aliases' in emoji.emojize() is deprecated and will be removed in version 2.0.0. Use language='alias' instead.
To hide this warning, pin/downgrade the package to 'emoji~=1.6.3'
return emoji.emojize(self.verbal_name, use_aliases=True)
```
```bash
DateTimeField CustomOnCallShift.start received a naive datetime (2023-06-01 12:53:12) while time zone support is active.
warnings.warn("DateTimeField %s received a naive datetime (%s)"
```
```bash
apps/twilioapp/tests/test_phone_calls.py::test_resolve_by_phone
/etc/app/apps/twilioapp/tests/test_phone_calls.py:173: DeprecationWarning: The 'text' argument to find()-type methods is deprecated. Use 'string' instead.
content = BeautifulSoup(content, features="html.parser").findAll(text=True)
```
```bash
apps/twilioapp/tests/test_phone_calls.py::test_resolve_by_phone
apps/twilioapp/tests/test_phone_calls.py::test_wrong_pressed_digit
/usr/local/lib/python3.11/site-packages/bs4/builder/__init__.py:545: XMLParsedAsHTMLWarning: It looks like you're parsing an XML document using an HTML parser. If this really is an HTML document (maybe it's XHTML?), you can ignore or filter this warning. If it's XML, you should know that using an XML parser will be more reliable. To parse this document as XML, make sure you have the lxml package installed, and pass the keyword argument `features="xml"` into the BeautifulSoup constructor.
```
```bash
apps/twilioapp/tests/test_phone_calls.py::test_forbidden_requests
/usr/local/lib/python3.11/site-packages/social_django/urls.py:15: RemovedInDjango40Warning: django.conf.urls.url() is deprecated in favor of django.urls.re_path().
url(r'^login/(?P<backend>[^/]+){0}$'.format(extra), views.auth,
```
```bash
apps/twilioapp/tests/test_phone_calls.py: 66 warnings
/usr/local/lib/python3.11/site-packages/debug_toolbar/utils.py:255: DeprecationWarning: currentThread() is deprecated, use current_thread() instead
thread = threading.currentThread()
```
## Checklist
- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
# What this PR does
Fixes https://github.com/grafana/oncall/issues/1103, inspired by
https://github.com/grafana/oncall/pull/1934.
Makes sure that:
1. `LiveSettings.validate_settings` is only called once per update
request and not called for any individual LiveSetting instance save.
2. `telegram.Bot.set_webhook` is only called once per request when
changing `TELEGRAM_WEBHOOK_HOST`.
## Which issue(s) this PR fixes
https://github.com/grafana/oncall/issues/1103
## Checklist
- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
# What this PR does
Allows fetching inbound email address for an integration using public
API. There's a new field `inbound_email` that's only defined for inbound
email integration and `null` for any other integration types.
## Which issue(s) this PR fixes
https://github.com/grafana/oncall/issues/1989
## Checklist
- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
# What this PR does
Closes https://github.com/grafana/oncall-private/issues/1836
- Revert "Revert slack org does not exist changes breaking escalate
command (#2057)" + add some unit tests to ensure we don't break the
`/escalate` command in the future
- cleanup how we destructure the `payload` dict in the endpoint handler
## Checklist
- [x] Unit, integration, and e2e (if applicable) tests updated
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
# What this PR does
allow mobile app to consume internal schedules api endpoints
## Checklist
- [ ] Unit, integration, and e2e (if applicable) tests updated
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
Replace password and authorization header fields with placeholders when
returning data to the UI. Mask the authorization header field when
editing and in the status logs.
# What this PR does
Closes https://github.com/grafana/oncall-private/issues/1632
## Checklist
- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
# What this PR does
## Which issue(s) this PR fixes
## Checklist
- [ ] Unit, integration, and e2e (if applicable) tests updated
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required)
- [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
# What this PR does
RequestBodyReadingMiddleware is excess as [post-buffering is
enabled](https://github.com/grafana/oncall/blob/dev/engine/uwsgi.ini#L17):
If an HTTP request has a body (like a POST request generated by a form),
you have to read (consume) it in your application. If you do not do
this, the communication socket with your webserver may be clobbered. If
you are lazy you can use the post-buffering option that will
automatically read data for you. For Rack applications this is
automatically enabled.
(https://uwsgi-docs.readthedocs.io/en/latest/ThingsToKnow.html)
## Which issue(s) this PR fixes
## Checklist
- [ ] Unit, integration, and e2e (if applicable) tests updated
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required)
- [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
When checking current shifts to trigger slack notifications, a stable
value for timezone should be used (to avoid triggering multiple
duplicated notifications; using the first `VTIMEZONE` value available
may change on refresh, besides being potentially incorrect).
Also, `VTIMEZONE` shouldn't be used as the calendar timezone, it just
describes a [timezone
definition](https://icalendar.org/iCalendar-RFC-5545/3-6-5-time-zone-component.html)
which can later be [used when specifying a start/end
date/datetime](https://icalendar.org/iCalendar-RFC-5545/3-2-19-time-zone-identifier.html)
("The parameter MUST be specified on properties with a DATE-TIME value
if the DATE-TIME is not either a UTC or a "floating" time.").
OTOH, Google uses the non-standard[ `X-WR-TIMEZONE`
](https://github.com/collective/icalendar/issues/343)field as fallback
TZ for date-times without a `TZID`. Try to use this when available,
otherwise fallback to `UTC`.
Fixes an issue where if a sender is defined for a certain country code
and that sender belongs to a different account than the default from the
environment the status callback would get a 443 response from OnCall
back to Twilio. Checking permission on status callbacks will now use the
same account as was used to send the original message, verify, or make
the call.
This will vastly improve troubleshooting when an unexpected error
occurs.
# What this PR does
Enables logging of the exception that might be thrown by
SocialAuthExceptionMiddleware. It vastly improves the speed and accuracy
of troubleshooting requests to external applications with the aim to
authenticate, since there is usually no response to look at.
## Which issue(s) this PR fixes
Closes [#2035](https://github.com/grafana/oncall/issues/2035)
## Checklist
- [ ] Unit, integration, and e2e (if applicable) tests updated
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required)
- [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
---------
Co-authored-by: Innokentii Konstantinov <innokenty.konstantinov@grafana.com>
# What this PR does
Fix HTTP 500 when sending demo alert for inbound integration email.
- Make all integration configs have consistent `is_demo_alert_enabled`
and `example_payload` values
- Add tests
## Checklist
- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
Before this change, a diff ical check (which happens with frequency with
imported ical), particularly with overrides in an API/terraform schedule
would trigger unexpected slack notifications because the prev vs current
ical comparison will flag a diff, but when comparing current and
previous shifts, `current_shifts` will have the shift in progress while
the `prev_shifts` calculated from the overrides-only diff will most of
the time be empty (unless you set/change an override at current time).
Simplified the checks to always compare previous current shifts (ie. the
ones in the schedule from the DB) vs the recalculated ones using the
(refreshed) ical data from the schedule.
# What this PR does
Reworks Slack handlers for buttons and select menus for AG Slack
messages.
<img width="602" alt="Screenshot 2023-05-31 at 19 34 05"
src="https://github.com/grafana/oncall/assets/20116910/857bf096-7bdd-427b-94b6-15aad873a8ac">
## Current implementation
- It's possible to end up with orphaned Slack messages that are posted
to Slack but have no `SlackMessage` instance in the DB. For such
messages, clicking buttons will result in an exception and HTTP 500. See
private repo
[issue](https://github.com/grafana/oncall-private/issues/1841) for more
info.
- Bug in authorization system, which effectively bypasses any permission
checks. For example, it's possible to resolve an alert group while being
a Viewer.
- No tests covering most buttons.
## Changes in this PR
- Make the system more robust, don't use `SlackMessage` model to figure
out the alert group being interacted on, instead embed `alert_group_pk`
to every button and use it when receiving interaction requests from
Slack.
- Existing orphaned Slack messages will be repaired. Clicking buttons
under orphaned messages will work (and missing `SlackMessage` instance
will be created on interaction). This is possible because some buttons
already have `alert_group_pk` embedded, and it's possible to get this
data on button clicks (even if the clicked button itself doesn't have
`alert_group_pk` embedded).
- Fix authorization. Show warning window when unauthorized:
<img width="511" alt="Screenshot 2023-05-31 at 19 40 02"
src="https://github.com/grafana/oncall/assets/20116910/5abeeaa7-1b61-4a47-b3af-0e21d5cd1907">
- Added tests for all the buttons under AG message. Add tests checking
authorization, actual execution of scenario steps, orphan message
repairing, backward compatibility, etc. Also add tests on
`AlertGroupSlackRenderer` checking that correct data is embedded into
buttons.
- Cosmetic changes such as renaming `incident` to `Alert Group`.
## Which issue(s) this PR fixes
Related to https://github.com/grafana/oncall-private/issues/1841
## Checklist
- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
# What this PR does
Reduce number of alert groups returned for grouping on slack request to
20 to avoid event trigger expiry.
## Which issue(s) this PR fixes
https://github.com/grafana/oncall-private/issues/1835
## Checklist
- [ ] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
---------
Co-authored-by: Joey Orlando <joey.orlando@grafana.com>
# What this PR does
## Which issue(s) this PR fixes
## Checklist
- [ ] Unit, integration, and e2e (if applicable) tests updated
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required)
- [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
---------
Co-authored-by: Yulia Shanyrova <yulia.shanyrova@grafana.com>
Fixes issue when syncing contact points and there are receiver configs
with no `grafana_managed_receiver_configs` key.
(eg. `{"name": "autogen-contact-point-default"}`)
# What this PR does
## Which issue(s) this PR fixes
## Checklist
- [ ] Unit, integration, and e2e (if applicable) tests updated
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required)
- [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
---------
Co-authored-by: Joey Orlando <joey.orlando@grafana.com>
Adds a ratelimit for AmazonSNS.
AlertChannelDefining mixin is now injecting alert_receive_channel only
in request, not in kwargs to not to break AmazonSNS.
Many countries are introducing different requirements for SMS senders to
register and/or use alpha numeric ids, short codes or regional numbers
or face being blocked. The changes in this PR will give us more
flexibility by allowing us to map to different resources in Twilio based
on the phone number we are trying to reach. For this first
implementation the selection is made based on country code of the
recipient. Verification and phone calls were given the same treatment
although the immediate need is for SMS. Senders with no country code set
can be used as catch-all defaults. This also falls back to the
configured live settings/environment variables if not configured.
Possible future additions:
- Move through list of trying multiple senders before failing
notification
- Easily expanded to allow per-organization or per-user resources to let
users and tenants configure their own Twilio
- Add UI + replace live settings so users can configure their own
settings
- More selection criteria if needed
TODO:
- [x] Add+Fix Tests
- [x] Verify changes are compatible with #1713
# What this PR does
In trying to solve
https://github.com/grafana/oncall-private/issues/1836, it is very
difficult to understand the root cause without seeing the event payload.
This PR will log this out.
## Checklist
- [ ] Unit, integration, and e2e (if applicable) tests updated (N/A)
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required) (N/A)
- [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required) (N/A)