# What this PR does
* api returns all the resources available to the user by default
* substitutes `team switcher` with `multi-select team filter`
* allow referencing between integrations - escalations chains -
[schedules, outgoing webhooks] across teams
https://user-images.githubusercontent.com/2262529/225634581-2d2e8af2-15ce-4c01-a90e-8267d98f5a23.mov
## Which issue(s) this PR fixes
## Checklist
- [ ] Tests updated
- [ ] Documentation added
- [ ] `CHANGELOG.md` updated
---------
Co-authored-by: Maxim <maxim.mordasov@grafana.com>
Co-authored-by: Joey Orlando <joey.orlando@grafana.com>
# What this PR does
This PR:
- modifies the `check_escalation_finished_task` celery task to:
- do stricter escalation validation based on the alert group's
escalation snapshot (see the `audit_alert_group_escalation` method in
`engine/apps/alerts/tasks/check_escalation_finished.py` for the
validation logic)
- use a read-only database for querying alert-groups if one is
configured, otherwise use the "default" one
- ping a configurable heartbeat (new env var
`ALERT_GROUP_ESCALATION_AUDITOR_CELERY_TASK_HEARTBEAT_URL` added)
- increase the task frequency from every 10 to every 13 minutes (this
can be configured via an env variable)
- adds public documentation on how to configure this auditor task
- modifies the local celery startup command to properly take into
consideration all celery related env vars (similar to the ones we use in
`engine/celery_with_exporter.sh`; this made it easier to enable `celery
beat` locally for testing)
- removes the following code:
- removes references to `AlertGroup.estimate_escalation_finish_time` and
marks the model field as deprecated using the [`django-deprecate-fields`
library](https://pypi.org/project/django-deprecate-fields/). This field
was only used for the previous version of this validation task
- `EscalationSnapshotMixin.calculate_eta_for_finish_escalation` was only
used to calculate the value for
`AlertGroup.estimate_escalation_finish_time`
- `calculate_escalation_finish_time` celery task
## Which issue(s) this PR fixes
https://github.com/grafana/oncall-private/issues/1558
## Checklist
- [x] Tests updated
- [x] Documentation added
- [x] `CHANGELOG.md` updated
This PR add Inbound Email integration.
It designed to support some variety of ESPs, but in prod we will use
Mailgun, so locally I tested it only with mailgun ESP.
**Important:**
To make it work on different clusters I'm planning to provide different
email domains for different regions, like ....@us.oncall.grafana.net,
...@eu.oncall.grafana.net
---------
Co-authored-by: Innokentii Konstantinov <innokenty.konstantinov@grafana.com>
# What this PR does
This PR fixes templates behaviour for public and private api. It fix
"reset to default" for templates from messaging backends and some minor
bugs. Also added acknowledge signal and source link templates
## Checklist
- [x] Tests updated
- [x] Documentation added
- [x] `CHANGELOG.md` updated
This should fix task error as seen in logs, trying to parse an empty
string as ical value:
```
Task apps.schedules.tasks.refresh_ical_files.refresh_ical_file[] raised unexpected: ValueError("Found no components where exactly one is required: ''")
```
# What this PR does
This PR simplifies code of maintenance mode.
1. Perform distribution/escalation maintenance checks in send_signal...
tasks.
2. Use usual alert distribution flow for the maintenance incident.
3. Decouple maintenance mode from slack (all, except
**notify_about_maintenance_action** methods, I don't want to make this
PR too big)
As a bonus from these changes, maintenance mode now mute alert group
delivery in all chatops integrations, not only in slack. (Before,
incidents happened while maintenance were posted to telegram and msteams
anyway)
## Checklist
- [ ] Tests updated
- [ ] Documentation added
- [ ] `CHANGELOG.md` updated
# What this PR does
This PR cleanup ScenarioStep. It's needed to simplify moving Slack to
the messaging backends in future.
1. Introduce AlertGroupSlackService to move logic from ScenarioStep.
Also it allowed to get rid of importing ScenarioSteps in the code not
related to processing of slack callbacks.
2. Remove tags from ScenarioSteps, they are unused.
3. Remove ScenarioStep.dispatch method. It just was calling
ScenarioStep.process_scenario.
4. Remove "action" param from process_scenario, it was unused.
5. Remove creation of SlackActionRecord on handling SlackEvents. We are
not using it, but it generates INSERT query on most of the user-slack
interactions.
6. Remove "random_prefix_for_routing" from ScenarioStep, it was unused.
## Which issue(s) this PR fixes
## Checklist
- [ ] Tests updated
- [ ] Documentation added
- [ ] `CHANGELOG.md` updated
---------
Co-authored-by: Joey Orlando <joey.orlando@grafana.com>
# What this PR does
This PR adds
[django-migration-linter](https://github.com/3YOURMIND/django-migration-linter)
to keep database migrations
backwards compatible
- we can automatically run migrations and they are zero-downtime, e.g.
old code can work with the migrated database
- we can run and rollback migrations without worrying about data safety
- OnCall is deployed to the multiple environments core team is not able
to control
See [django-migration-linter
checklist](https://github.com/3YOURMIND/django-migration-linter/blob/main/docs/incompatibilities.md)
for the common mistakes and best practices
## Which issue(s) this PR fixes
## Checklist
- [ ] Tests updated
- [ ] Documentation added
- [ ] `CHANGELOG.md` updated
---------
Co-authored-by: Joey Orlando <joey.orlando@grafana.com>
# What this PR does
Make direct paging internal API endpoint return an alert group ID.
## Which issue(s) this PR fixes
Related to https://github.com/grafana/oncall/issues/823
## Checklist
- [x] Tests updated
# What this PR does
This PR caches the field `render_for_web` with lifetime 1 day and cache
becomes invalid if it was created before
* last alert received
* template changed
## Which issue(s) this PR fixes
## Checklist
- [ ] Tests updated
- [ ] Documentation added
- [ ] `CHANGELOG.md` updated
**What this PR does**:
- Keep grafana version on create/update contact points to avoid multiple
requests to alerting
- Add retry limit on create contact point async
- Fix bugs related on create contact point
- Update logs on create/update contact point, make them more clear
- Avoid unnecessary requests to Grafana Alerting
# What this PR does
Adds an ability to page an escalation chain for a newly created direct
paging alert group using the internal API. Also [adds a forgotten
migration](32fc44e744)
related to the direct paging backend.
Related to https://github.com/grafana/oncall/issues/823
## Checklist
- [x] Tests updated
- [ ] Documentation added (N/A)
- [ ] `CHANGELOG.md` updated (N/A)
Add a dummy step for declare incident button to prevent raising 'Step is
undefined' exception because Slack sends a POST request to the backend
upon clicking a button with a redirect link to Incident.
This pr doesn't change any functionality
Also changes the default integration used when creating an alert group
for a direct page to a custom manual integration to avoid
conflicts/unexpected behaviors with existing manual alerts.
Check if Grafana Incident is enabled. If it is, add a button with a link
to declare Grafana Incident from Alert group in Slack and on Web.
Co-authored-by: Yulia Shanyrova <yulia.shanyrova@grafana.com>
# What this PR does
This PR added a new parameter (state) into the alert_group public API to
filter the state of the alert groups
## Which issue(s) this PR fixes
https://github.com/grafana/oncall/issues/684
## Checklist
- [x] Tests updated
- [x] Documentation added
- [x] `CHANGELOG.md` updated
Co-authored-by: Vadim Stepanov <vadimkerr@gmail.com>
# What this PR does
With the addition of tighter controls on jinja templates handle
exceptions while rendering group data as follows:
- Title will cache error message as title and display to user and the
error will be logged
- Group distinction will be left as None and the error will be logged
- Is resolve signal will be treated as False and the error will be
logged
- Is acknowledge signal will be treated as False and the error will be
logged
## Which issue(s) this PR fixes
https://github.com/grafana/oncall-private/issues/1542
* Modify plugin.json to support RBAC role registration
* defines 26 new custom roles in plugin.json. The main roles are:
- Admin: read/write access to everything in OnCall
- Reader: read access to everything in OnCall
- OnCaller : read access to everything in OnCall + edit access to Alert Groups and Schedules
- <object-type> Editor: read/write access to everything related to <object-type>
- <object-type> Reader: read access for <object-type>
- User Settings Admin: read/write access to all user's settings, not just own settings. This is in comparison to User Settings Editor which can only read/write own settings
* update changelog and documentation (#686)
* implement RBAC for OnCall backend
This commit refactors backend authorization. It trys to use RBAC authorization if the org's grafana instance supports it, otherwise it falls back to basic role authorization.
* update RBAC backend tests
* add tests for RBAC changes
- run backend tests as matrix where RBAC is enabled/disabled. When RBAC is enabled, the permissions granted are read from the role grants in the frontend's plugin.json file (instead of relying what we specify in RBACPermission.Permissions)
- remove --reuse-db --nomigrations flags from engine/tox.ini
- minor autoformatting changes to docker-compose-developer.yml
* remove --ds=settings.ci-test from pytest CI command
DJANGO_SETTINGS_MODULE is already specified as an env var so this is just unecessary duplication
* update gitignore
* update github action job name for "test"
* RBAC frontend changes
* refactors the use of basic roles (ex. Viewer, Editor, Admin) use RBAC permissions (when supported), or falling back to basic roles when RBAC is not supported.
- updates the UserAction enum in grafana-plugin/src/state/userAction.ts. Previously this was hardcoded to a list of strings that were being returned by the OnCall API. Now the values here correspond to the permissions in plugin.json (plus a fallback role)
* changes per Gabriel's comments:
- get rid of group attribute in rbac roles
- remove displayName role attribute
- remove hidden role attribute
- add back role to includes section
* don't try to update user timezone if they don't have permission
* Improve feedback so template errors are given to user
* Add security error logging
* Add limits for templates, payloads, results
* Show popup error notification for webhook errors and template errors that don't have a result
* Update tests
* Split exceptions into warnings/errors to give more control when previewing, rendering, saving templates
* Limit title lengths
* Make TypeError a warning
* Adjust title length limit
* Remove length limiting on urlize since it is being done on template render
* Fix tests
* Add KeyError and ValueError to warnings
* No longer enforcing json result when saving webhook in case it is dependent on payload
* Add tests for expected exceptions coming from apply_jinja_template
* Update changelog
* Send raw post if template result is not JSON
* add permalinks list to internal API alertgroup view
* add user's name and full avatar URL to the user view
* make avatar_full_url a property
* fix tests
* fix user connection criteria
* move mobile notifications to a separate backend, remove critical notification
* remove outdated mobile app code
* MOBILE_APP_PUSH_NOTIFICATIONS_ENABLED -> FEATURE_MOBILE_APP_INTEGRATION_ENABLED
* create error log if no devices are set up
* move mobile auth related code to the mobile_app Django app
* move mobile auth related code to the mobile_app Django app
* move mobile auth related code to the mobile_app Django app
* fix typing
* add GCMDevice todos
* add user connection capabilities
* add user connect/disconnect to the messaging backend
* move APNS endpoint to mobile_app Django app
* restore critical notifications
* support hackathon app
* tweak migrations so mobile app auth tokens are preserved
* reuse notify_by IDs
* use mobile app template to render push notification
* add GCM/FCM (Android) support
* fix unlink user
* logger.error -> logger.info
This commit patches issue related to #708.
#708 forgot to remove attributes on models outside of the migration_tool django app that were referencing model attributes from migration_tool.
The only attribute that referenced a field in migration_tool was migrator_lock on the Alert model. This commit removes any references to that attribute.