oncall-engine/engine/apps
Joey Orlando 2582a1b1dc
Refactor how RBAC enabled/disabled status is determined for Grafana Cloud stacks (#4279)
# What this PR does

In cloud, we are currently determining (somewhat) improperly whether or
not a Grafana stack has the `accessControlOnCall` feature flag enabled.
At first things worked fine: we would enable this feature toggle via the
Grafana Admin UI, the OnCall backend would read its value from GCOM's
`GET /instance/<stack_id>` endpoint (via `config.feature_toggles`), and
everything worked as expected.

There was a recent change made in `grafana/deployment_tools` to set this
feature flag to True for all stacks. However, for some reason, the GCOM
endpoint above doesn't return the `accessControlOnCall` feature toggle
value in `config.feature_toggles` if it is set in this manner (it only
returns the value if it is set via the Grafana Admin UI).

So instead of asking GCOM for this feature toggle, we should infer
whether RBAC is enabled on the stack by sending a
`HEAD /api/access-control/users/permissions/search` request (this
endpoint is _only_ available on a Grafana stack if `accessControlOnCall`
is enabled).
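
A minimal sketch of that probe, not the code in this PR. It assumes that
`HTTP 200` means the endpoint is available and that any other status means
it is not; the names `head_status` and `rbac_enabled_for_stack`, the bearer
token header, and the injectable `send_head` parameter are hypothetical and
exist only for illustration.

```python
import urllib.error
import urllib.request
from typing import Callable

# Path taken from the PR description above.
RBAC_PROBE_PATH = "/api/access-control/users/permissions/search"


def head_status(url: str, token: str, timeout: float = 5.0) -> int:
    """Send a HEAD request and return the HTTP status code.

    Assumption: the stack accepts a bearer token in the Authorization header.
    Connection-level failures (urllib.error.URLError) are not handled here.
    """
    request = urllib.request.Request(
        url,
        method="HEAD",
        headers={"Authorization": f"Bearer {token}"},
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        # 4xx/5xx responses are raised as HTTPError; we only need the code.
        return exc.code


def rbac_enabled_for_stack(
    stack_url: str,
    token: str,
    send_head: Callable[[str, str], int] = head_status,
) -> bool:
    """Infer RBAC status from whether the probe endpoint answers with 200.

    Assumption: 200 means "endpoint available"; everything else means
    "not available". `send_head` is injectable so the logic can be
    exercised without a network call.
    """
    url = stack_url.rstrip("/") + RBAC_PROBE_PATH
    return send_head(url, token) == 200
```

Passing a stub for `send_head` (for example `lambda url, token: 404`) is
enough to exercise the decision logic in a unit test without touching a
real stack.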

**A few caveats to this ☝️**
1. We first have to make sure that the cloud stack is in an `active`
state (i.e. not paused). This is because, regardless of whether
`accessControlOnCall` is enabled, a stack in a `paused` state will
ALWAYS return `HTTP 200`, which can be misleading and lead to bugs (this
feels like a bug in the Grafana API; I will follow up with the core
Grafana team).
2. Once we roll out this change, we will **actually** be enabling RBAC
for OnCall for all orgs. The Identity Access team would prefer a
progressive rollout, which is why I decided to introduce
[`settings.CLOUD_RBAC_ROLLOUT_PERCENTAGE`](https://github.com/grafana/oncall/pull/4279/files#diff-3383aef931e41e44d95829ad971641eeb98fe001be2f5da92217446d300ea1b3R918)
(see also
[`Organization.should_be_considered_for_rbac_permissioning`](https://github.com/grafana/oncall/pull/4279/files#diff-2ca9917f4f56349be39545ee8abd459be5076295d02ca3a7ec545152fcddccdfR348-R362)).
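
The two caveats can be combined into a single gate, sketched below. This
is an illustration only, not the logic in this PR: the bucketing rule
(`org_id % 100`), the function names, and returning `None` for a
non-active stack are all assumptions made for the example.

```python
from typing import Callable, Optional


def in_rollout(org_id: int, rollout_percentage: int) -> bool:
    """Hypothetical bucketing: map each org to a stable bucket 0..99.

    0 selects no orgs and 100 selects all of them. The real
    `should_be_considered_for_rbac_permissioning` may work differently.
    """
    return org_id % 100 < rollout_percentage


def resolve_rbac_status(
    stack_status: str,
    org_id: int,
    rollout_percentage: int,
    probe: Callable[[], bool],
) -> Optional[bool]:
    """Decide whether RBAC should be treated as enabled for an org.

    Returns None when the stack is not active, because a paused stack
    answers the probe with HTTP 200 regardless of the feature toggle, so
    the result would be meaningless. What a caller does with None (for
    example, keep the previously stored value) is left open here.
    """
    if stack_status != "active":
        return None
    if not in_rollout(org_id, rollout_percentage):
        # Assumption: orgs outside the rollout are treated as RBAC-disabled.
        return False
    # Only now is the HEAD probe worth running.
    return probe()
```

Checking the stack state and the rollout bucket before calling `probe`
also avoids an unnecessary HTTP request for orgs that would be skipped
anyway.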

## Which issue(s) this PR closes

Related to https://github.com/grafana/identity-access-team/issues/667

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] Added the relevant release notes label (see labels prefixed w/
`release:`). These labels dictate how your PR will show up in the
autogenerated release notes.
2024-05-14 16:30:16 +00:00

| Name | Last commit | Date |
| --- | --- | --- |
| alerts | Minor query optimizations (#4325) | 2024-05-08 17:27:36 +00:00 |
| api | Improve query performance when fetching alert payload for templating (#4338) | 2024-05-13 17:07:06 +00:00 |
| api_for_grafana_incident | Add render_for_web information to alert group incident API (#3005) | 2023-09-12 12:02:42 +00:00 |
| auth_token | Google OAuth2 flow + fetch Google Calendar OOO events (#4067) | 2024-04-02 14:59:03 -04:00 |
| base | Add endpoint for organization config checks (#4204) | 2024-04-11 14:51:56 +00:00 |
| email | Allow setting email app to use SSL instead of TLS (#3911) | 2024-02-20 03:38:09 -05:00 |
| google | Update out of office task to not retry on HttpError (#4328) | 2024-05-09 16:16:46 +00:00 |
| grafana_plugin | Refactor how RBAC enabled/disabled status is determined for Grafana Cloud stacks (#4279) | 2024-05-14 16:30:16 +00:00 |
| heartbeat | Improve OpenAPI schema coverage (#3629) | 2024-01-12 15:11:22 +00:00 |
| integrations | Add test for caching deleted integration, fix test wrap methods (#4173) | 2024-04-05 20:35:38 +00:00 |
| labels | Support prescribed labels (#3848) | 2024-02-20 14:42:51 +08:00 |
| metrics_exporter | Avoid generating response time value metrics for empty integrations (#4339) | 2024-05-13 17:23:22 +00:00 |
| mobile_app | Set a timeout for mobile app incident proxy requests (#4306) | 2024-05-03 13:00:06 +00:00 |
| oss_installation | Improve OpenAPI schema coverage (#3629) | 2024-01-12 15:11:22 +00:00 |
| phone_notifications | Revert "upgrade to Python 3.12 (#3456)" and "bump uwsgi version to latest #3466" (#3483) | 2023-12-01 09:56:26 -05:00 |
| public_api | Minor query optimizations (#4325) | 2024-05-08 17:27:36 +00:00 |
| schedules | Update cached schedule users to consider deleted users (#4246) | 2024-04-23 11:40:02 +00:00 |
| slack | Avoid retrying to update Slack log message if cant_update_message (#4329) | 2024-05-09 16:16:53 +00:00 |
| social_auth | GCal autogenerated shift swap requests - don't recreate if one was previously created and deleted (#4281) | 2024-04-25 18:16:42 +00:00 |
| telegram | Fix docs and UI for connecting Grafana Alerting from other stack (#4243) | 2024-04-24 08:02:51 +00:00 |
| twilioapp | Update alert group state by backsync (#4089) | 2024-03-27 12:37:01 +00:00 |
| user_management | Refactor how RBAC enabled/disabled status is determined for Grafana Cloud stacks (#4279) | 2024-05-14 16:30:16 +00:00 |
| webhooks | Add acknowledged, resolved user information on webhook payload (#4176) | 2024-04-26 21:50:08 +00:00 |
| zvonok | Update alert group state by backsync (#4089) | 2024-03-27 12:37:01 +00:00 |
| `__init__.py` | World, meet OnCall! | 2022-06-03 08:09:47 -06:00 |