oncall-engine/engine/settings
Joey Orlando 2582a1b1dc
Refactor how RBAC enabled/disabled status is determined for Grafana Cloud stacks (#4279)
# What this PR does

In cloud we are currently (somewhat) improperly determining whether or
not a Grafana stack had the `accessControlOnCall` feature flag enabled.
At first things worked fine. We would enable this feature toggle via the
Grafana Admin UI, and then the OnCall backend would read this value from
GCOM's `GET /instance/<stack_id>` endpoint (via
`config.feature_toggles`), and everything worked as expected.

There was a recent change made in `grafana/deployment_tools` to set this
feature flag to True for all stacks. However, for some reason, the GCOM
endpoint above doesn't return the `accessControlOnCall` feature toggle
value in `config.feature_toggles` if it is set in this manner (it only
returns the value if it is set via the Grafana Admin UI).

So what we should instead be doing is such instead of asking GCOM for
this feature toggle, infer whether RBAC is enabled on the stack by doing
a `HEAD /api/access-control/users/permissions/search` (this endpoint _is
only_ available on a Grafana stack if `accessControlOnCall` is enabled).

**Few caveats to this ☝️**
1. we first have to make sure that the cloud stack is in an `active`
state (ie. not paused). This is because, no matter if the
`accessControlOnCall` is enabled or not, if the stack is in a `paused`
state it will ALWAYS return `HTTP 200` which can be misleading and lead
to bugs (this feels like a bug on the Grafana API, will follow up with
core grafana team)
2. Once we roll out this change we will effectively **actually** be
enabling RBAC for OnCall for all orgs. The Identity Access team would
prefer a progressive rollout, which is why I decided to introduce the
concept of
[`settings.CLOUD_RBAC_ROLLOUT_PERCENTAGE`](https://github.com/grafana/oncall/pull/4279/files#diff-3383aef931e41e44d95829ad971641eeb98fe001be2f5da92217446d300ea1b3R918)
(see also [`Organization.
should_be_considered_for_rbac_permissioning`](https://github.com/grafana/oncall/pull/4279/files#diff-2ca9917f4f56349be39545ee8abd459be5076295d02ca3a7ec545152fcddccdfR348-R362))

## Which issue(s) this PR closes

Related to https://github.com/grafana/identity-access-team/issues/667

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] Added the relevant release notes label (see labels prefixed w/
`release:`). These labels dictate how your PR will
    show up in the autogenerated release notes.
2024-05-14 16:30:16 +00:00
..
__init__.py World, meet OnCall! 2022-06-03 08:09:47 -06:00
base.py Refactor how RBAC enabled/disabled status is determined for Grafana Cloud stacks (#4279) 2024-05-14 16:30:16 +00:00
celery_task_routes.py Handle alert group deleted when task is already queued (#4230) 2024-04-16 14:39:00 +00:00
ci-test.py address occasional failing tests when run w/ pytest-xdist (#3840) 2024-02-06 11:57:54 -05:00
dev.py continue addressing mypy violations (#2170) 2023-06-27 10:23:08 +00:00
helm.py Allow multiple database and celery broker types (#582) 2022-10-04 09:25:53 +01:00
hobby.py Allow multiple database and celery broker types (#582) 2022-10-04 09:25:53 +01:00
prod_without_db.py Improve OpenAPI schema coverage (#3629) 2024-01-12 15:11:22 +00:00