oncall-engine/engine
Joey Orlando 2582a1b1dc
Refactor how RBAC enabled/disabled status is determined for Grafana Cloud stacks (#4279)
# What this PR does

In cloud we are currently determining (somewhat improperly) whether or
not a Grafana stack has the `accessControlOnCall` feature flag enabled.
At first things worked fine. We would enable this feature toggle via the
Grafana Admin UI, and then the OnCall backend would read this value from
GCOM's `GET /instance/<stack_id>` endpoint (via
`config.feature_toggles`), and everything worked as expected.

There was a recent change made in `grafana/deployment_tools` to set this
feature flag to True for all stacks. However, for some reason, the GCOM
endpoint above doesn't return the `accessControlOnCall` feature toggle
value in `config.feature_toggles` if it is set in this manner (it only
returns the value if it is set via the Grafana Admin UI).

So instead of asking GCOM for this feature toggle, we should infer
whether RBAC is enabled on the stack by doing a
`HEAD /api/access-control/users/permissions/search` (this endpoint _is
only_ available on a Grafana stack if `accessControlOnCall` is enabled).
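
The probe described above could be sketched roughly as follows. This is a minimal illustration using only the standard library, not the code from this PR: the names `head_status` and `is_rbac_enabled`, the bearer-token header, and the rule "status 200 means enabled" are all assumptions made for the example, and it deliberately ignores the caveats listed in this description.

```python
import urllib.error
import urllib.request
from typing import Callable

# Endpoint named in the PR description; per the description it is only
# available when `accessControlOnCall` is enabled on the stack.
PERMISSIONS_SEARCH_PATH = "/api/access-control/users/permissions/search"


def head_status(url: str, token: str, timeout: float = 5.0) -> int:
    """Send a HEAD request and return the HTTP status code (hypothetical helper)."""
    request = urllib.request.Request(
        url,
        method="HEAD",
        headers={"Authorization": f"Bearer {token}"},  # assumed auth scheme
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        # Non-2xx responses are raised as HTTPError; the code is what we want.
        return exc.code


def is_rbac_enabled(
    stack_url: str,
    token: str,
    head: Callable[[str, str], int] = head_status,
) -> bool:
    """Infer RBAC status from the probe (assumption: only HTTP 200 means enabled)."""
    url = stack_url.rstrip("/") + PERMISSIONS_SEARCH_PATH
    return head(url, token) == 200
```

Passing `head` as a parameter keeps the inference logic testable without a network call, for example `is_rbac_enabled("https://example.grafana.net", "token", head=lambda url, token: 404)`.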

**A few caveats to this ☝️**
1. We first have to make sure that the cloud stack is in an `active`
state (i.e. not paused). This is because, regardless of whether
`accessControlOnCall` is enabled, a stack in a `paused` state will
ALWAYS return `HTTP 200`, which can be misleading and lead to bugs
(this feels like a bug in the Grafana API; will follow up with the
core Grafana team).
2. Once we roll out this change we will, in effect, **actually** be
enabling RBAC for OnCall for all orgs. The Identity Access team would
prefer a progressive rollout, which is why I decided to introduce the
concept of
[`settings.CLOUD_RBAC_ROLLOUT_PERCENTAGE`](https://github.com/grafana/oncall/pull/4279/files#diff-3383aef931e41e44d95829ad971641eeb98fe001be2f5da92217446d300ea1b3R918)
(see also
[`Organization.should_be_considered_for_rbac_permissioning`](https://github.com/grafana/oncall/pull/4279/files#diff-2ca9917f4f56349be39545ee8abd459be5076295d02ca3a7ec545152fcddccdfR348-R362)).
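
The two caveats above can be combined into a single gate in front of the probe. The sketch below is illustrative only: `in_rollout`, `resolve_rbac_enabled`, the modulo-100 bucketing, and the choice to return `None` ("leave the stored value alone") are hypothetical and are not taken from the linked implementation of `should_be_considered_for_rbac_permissioning`.

```python
from typing import Callable, Optional


def in_rollout(org_id: int, rollout_percentage: int) -> bool:
    """Hypothetical bucketing: spread orgs over 100 buckets by ID."""
    return (org_id % 100) < rollout_percentage


def resolve_rbac_enabled(
    org_id: int,
    stack_status: str,
    rollout_percentage: int,
    probe: Callable[[], bool],
) -> Optional[bool]:
    """Return the probed RBAC status, or None if the probe should not be used.

    Assumptions for this sketch:
    - orgs outside the rollout are skipped entirely (None);
    - stacks that are not `active` are skipped (None), because a paused
      stack answers HTTP 200 regardless of the feature toggle.
    """
    if not in_rollout(org_id, rollout_percentage):
        return None
    if stack_status != "active":
        return None
    # Only now is the result of the HEAD probe meaningful.
    return probe()
```

With a rollout percentage of `0` no organization is probed, and with `100` every organization on an active stack is.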

## Which issue(s) this PR closes

Related to https://github.com/grafana/identity-access-team/issues/667

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] Added the relevant release notes label (see labels prefixed w/
`release:`). These labels dictate how your PR will
    show up in the autogenerated release notes.
2024-05-14 16:30:16 +00:00

| Name | Last commit | Date |
| --- | --- | --- |
| apps | Refactor how RBAC enabled/disabled status is determined for Grafana Cloud stacks (#4279) | 2024-05-14 16:30:16 +00:00 |
| common | Refactor how RBAC enabled/disabled status is determined for Grafana Cloud stacks (#4279) | 2024-05-14 16:30:16 +00:00 |
| config_integrations | Fix docs and UI for connecting Grafana Alerting from other stack (#4243) | 2024-04-24 08:02:51 +00:00 |
| engine | Remove kwargs from celery logging (#4316) | 2024-05-07 14:29:49 +00:00 |
| settings | Refactor how RBAC enabled/disabled status is determined for Grafana Cloud stacks (#4279) | 2024-05-14 16:30:16 +00:00 |
| static/images | remove django admin panel (#2731) | 2023-08-02 14:26:50 -04:00 |
| type_stubs/icalendar | continue addressing mypy violations (#2170) | 2023-06-27 10:23:08 +00:00 |
| .dockerignore | One startup command to rule them all (#760) | 2022-11-07 16:34:43 +01:00 |
| .gitignore | modify push notification settings + use fcm-django library (#998) | 2022-12-20 12:41:34 +01:00 |
| celery_with_exporter.sh | Add flag to debug logs (#912) | 2022-11-29 11:16:42 +08:00 |
| conftest.py | Refactor how RBAC enabled/disabled status is determined for Grafana Cloud stacks (#4279) | 2024-05-14 16:30:16 +00:00 |
| Dockerfile | Switch to uv Python package installer/resolver (#4005) | 2024-04-26 14:30:38 +00:00 |
| grpcio-1.57.0-cp311-cp311-linux_aarch64.whl | Use local arm64 grpcio wheel to make local builds on arm64 faster (#4000) | 2024-03-05 06:31:58 +00:00 |
| manage.py | Instrument requests lib (#4008) | 2024-03-05 05:22:34 +00:00 |
| pyproject.toml | Update out of office task to not retry on HttpError (#4328) | 2024-05-09 16:16:46 +00:00 |
| requirements-dev.in | Use pip-tools to handle Python deps (#3892) | 2024-02-20 17:44:15 +00:00 |
| requirements-dev.txt | Bump django from 4.2.10 to 4.2.11 in /engine (#4079) | 2024-03-19 21:14:44 +00:00 |
| requirements.in | Bump social-auth-app-django from 5.3.0 to 5.4.1 in /engine (#4274) | 2024-04-24 19:45:48 +00:00 |
| requirements.txt | Bump werkzeug from 3.0.1 to 3.0.3 in /engine (#4313) | 2024-05-13 16:39:27 +00:00 |
| tox.ini | Update xdist load to use loadscope setting (#4187) | 2024-04-08 19:03:58 +00:00 |
| uwsgi.ini | Remove explicit request size limits (#3878) | 2024-02-22 15:00:33 +00:00 |
| wait_for_test_mysql_start.sh | Revert "Revert "speed up ci builds from 15 to <7 minutes"" (#1643) | 2023-03-28 09:34:03 +02:00 |