Commit graph

35 commits

Author SHA1 Message Date
Matias Bordese
cd5e9955b9
Make sure organization token is valid before sync (#4904)
Since we will be triggering sync for orgs without a `last_time_synced`
set, we need to make sure the token is valid (previously both,
`last_time_synced` and the token, were updated from the frontend plugin)
2024-08-22 17:49:22 +00:00
Dominik Broj
06d19bf6e9
New OnCall plugin initialization process (#4657)
# What this PR does

New OnCall plugin initialization process

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] Added the relevant release notes label (see labels prefixed w/
`release:`). These labels dictate how your PR will
    show up in the autogenerated release notes.

---------

Co-authored-by: Michael Derynck <michael.derynck@grafana.com>
Co-authored-by: Matias Bordese <mbordese@gmail.com>
2024-08-16 16:43:52 +00:00
Matias Bordese
7d6da1e157
Handle None values from sync data (#4775)
This is required to support the install v2 endpoint (to be used by
backend plugin) which could be pushing null permissions, teams, or team
memberships.
2024-08-02 18:08:45 +00:00
Matias Bordese
85c63e7ba2
Fix refactored permissions sync (#4771) 2024-07-31 20:27:42 +00:00
Matias Bordese
35f23cdcc6
Rework organization sync and grafana plugin engine backend (#4756)
Related to
https://github.com/grafana/oncall-private/issues/2806#issuecomment-2246286918.

Prepare engine for the backend plugin enablement/migration:

 - Refactor sync code
- Improve plugin user authentication to set up user on-the-fly (when
missing)
- Implement v2 endpoints for install, sync and status (to be used via
the backend plugin)

(most of the changes come from
https://github.com/grafana/oncall/pull/4657; backport all engine changes
that keep backwards compatibility)
2024-07-31 16:12:56 +00:00
Vadim Stepanov
7a2fc923df
Don't update RBAC status on Grafana server error (#4753)
# What this PR does

Fixes a bug when RBAC permissions are getting erased when Grafana's API
returns a 5xx server error on organization sync.

## Which issue(s) this PR closes

Closes https://github.com/grafana/oncall-private/issues/2834

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] Added the relevant release notes label (see labels prefixed w/
`release:`). These labels dictate how your PR will
    show up in the autogenerated release notes.
2024-07-29 16:28:35 +00:00
Joey Orlando
a3187953ec
remove deprecated rbac workaround (#4377)
## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] Added the relevant release notes label (see labels prefixed w/
`release:`). These labels dictate how your PR will
    show up in the autogenerated release notes.
2024-05-22 15:27:16 +00:00
Joey Orlando
2582a1b1dc
Refactor how RBAC enabled/disabled status is determined for Grafana Cloud stacks (#4279)
# What this PR does

In cloud we are currently (somewhat) improperly determining whether or
not a Grafana stack had the `accessControlOnCall` feature flag enabled.
At first things worked fine. We would enable this feature toggle via the
Grafana Admin UI, and then the OnCall backend would read this value from
GCOM's `GET /instance/<stack_id>` endpoint (via
`config.feature_toggles`), and everything worked as expected.

There was a recent change made in `grafana/deployment_tools` to set this
feature flag to True for all stacks. However, for some reason, the GCOM
endpoint above doesn't return the `accessControlOnCall` feature toggle
value in `config.feature_toggles` if it is set in this manner (it only
returns the value if it is set via the Grafana Admin UI).

So what we should instead be doing is such instead of asking GCOM for
this feature toggle, infer whether RBAC is enabled on the stack by doing
a `HEAD /api/access-control/users/permissions/search` (this endpoint _is
only_ available on a Grafana stack if `accessControlOnCall` is enabled).

**Few caveats to this ☝️**
1. we first have to make sure that the cloud stack is in an `active`
state (ie. not paused). This is because, no matter if the
`accessControlOnCall` is enabled or not, if the stack is in a `paused`
state it will ALWAYS return `HTTP 200` which can be misleading and lead
to bugs (this feels like a bug on the Grafana API, will follow up with
core grafana team)
2. Once we roll out this change we will effectively **actually** be
enabling RBAC for OnCall for all orgs. The Identity Access team would
prefer a progressive rollout, which is why I decided to introduce the
concept of
[`settings.CLOUD_RBAC_ROLLOUT_PERCENTAGE`](https://github.com/grafana/oncall/pull/4279/files#diff-3383aef931e41e44d95829ad971641eeb98fe001be2f5da92217446d300ea1b3R918)
(see also [`Organization.
should_be_considered_for_rbac_permissioning`](https://github.com/grafana/oncall/pull/4279/files#diff-2ca9917f4f56349be39545ee8abd459be5076295d02ca3a7ec545152fcddccdfR348-R362))

## Which issue(s) this PR closes

Related to https://github.com/grafana/identity-access-team/issues/667

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] Added the relevant release notes label (see labels prefixed w/
`release:`). These labels dictate how your PR will
    show up in the autogenerated release notes.
2024-05-14 16:30:16 +00:00
Michael Derynck
d938b52d80
Add scheduled task to start cleanup tasks (#3976)
# What this PR does
Add scheduled task to start cleanup tasks. Currently purpose is to run
the task every 12 hours and for all active orgs cleanup empty & deleted
integrations. For deleted orgs we can run this manually. It will also
run if an org moves from active to deleted. It is expected to add more
to cleanup_organization task over time.

## Which issue(s) this PR fixes

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2024-03-04 19:45:01 +00:00
Innokentii Konstantinov
dc355dbf0f Fix grafana_labels sync 2024-02-01 13:43:32 +08:00
Michael Derynck
8427953fad
Fix Incident plugin status sync (#3802)
# What this PR does
- Handle case where key exists for jsonData but explicitly set to None
- Disable incident if plugin disabled after or in the case it was
removed completely from the Grafana instance

## Which issue(s) this PR fixes

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2024-01-31 11:52:20 -07:00
Innokentii Konstantinov
c58a81bbdf
Enable labels feature only if labels plugin is enabled (#3769)
# What this PR does
Adds a check to enable labels feature only if plugin provisioned. It's
needed to be protected from reconciliation delays and etc.

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2024-01-30 07:29:16 +00:00
Michael Derynck
032ced6fd0
Add more logging to plugin sync and install (#3730)
# What this PR does
Add logging to process for syncing OnCall backend with Grafana to help
troubleshoot issues in self-hosted setups.


## Which issue(s) this PR fixes

## Checklist

- [ ] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2024-01-23 22:59:33 +00:00
Joey Orlando
da7f07ffd6
Fix occasional AttributeError in apps.grafana_plugin.tasks.sync.sync_organization_async task (#3687)
# Which issue(s) this PR fixes

Fix this issue I came across in a celery task retry exception log:
![Screenshot 2024-01-15 at 11 21
13](https://github.com/grafana/oncall/assets/9406895/ed08f2f1-dc7d-4ad3-88a0-dc02cd740582)


## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2024-01-15 11:34:40 -05:00
Matias Bordese
181d5d5712
Setup one-at-a-time lock for sync_organization tasks (#3612)
Related to https://github.com/grafana/support-escalations/issues/8844

Queuing multiple sync_organization tasks for the same org could lead to
parallel running of the sync task for the same organization, potentially
creating duplicated entries and/or generating multiple unneeded API
calls. This prevents running an organization sync while there is a sync
for that same org in progress.
2024-01-04 15:34:28 +00:00
Joey Orlando
382b18b052
Mobile app proxy gateway (#3449)
# What this PR does

Closes https://github.com/grafana/oncall-private/issues/2324

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2023-12-05 19:58:05 +00:00
Matias Bordese
e39baa6bbe
Revert "Refactor gcom api calls when syncing org" (#3498)
Reverts grafana/oncall#3489

Reviewing logs, it seems something broke related to [token
auth](https://ops.grafana-ops.net/explore?schemaVersion=1&panes=%7B%22ffS%22:%7B%22datasource%22:%22OP27Xzxnk%22,%22queries%22:%5B%7B%22refId%22:%22A%22,%22expr%22:%22%7Bcluster%3D~%5C%22dev-.%2A%5C%22,%20namespace%3D%5C%22grafana-com%5C%22,%20job%3D%5C%22grafana-com%2Fgrafana-com-api%5C%22%7D%20%7C~%20%5C%22%2Finstances%2F%5Ba-z0-9%5D%2B.config%3Dtrue%5C%22%20%7C%3D%20%5C%22Grafana%20OnCall%5C%22%22,%22editorMode%22:%22code%22,%22queryType%22:%22range%22,%22datasource%22:%7B%22type%22:%22loki%22,%22uid%22:%22OP27Xzxnk%22%7D%7D%5D,%22range%22:%7B%22from%22:%22now-1h%22,%22to%22:%22now%22%7D%7D%7D&orgId=1).
Reverting for now, will revisit in a later PR.
2023-12-04 18:02:58 +00:00
Matias Bordese
9eb09c0272
Refactor gcom api calls when syncing org (#3489)
Make one API call instead of two.
2023-12-04 13:08:59 +00:00
Joey Orlando
d6140cbe8d
Re-enable a few mypy rules + fix existing errors (#2725)
# What this PR does

Related to https://github.com/grafana/oncall/issues/2392

- Re-enable the following `mypy` rules + fix their pre-existing errors:
  - `no-redef`
  - `valid-type`
  - `var-annotated`
- Add stronger return typing to the `GrafanaAPIClient` by use of
generics + add some links to documentation in the method docstrings

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2023-08-03 09:43:03 +00:00
Vadim Stepanov
b2f4ffb98a
apps.get_model -> import (#2619)
# What this PR does

Remove
[`apps.get_model`](https://docs.djangoproject.com/en/3.2/ref/applications/#django.apps.apps.get_model)
invocations and use inline `import` statements in places where models
are imported within functions/methods to avoid circular imports.

I believe `import` statements are more appropriate for most use cases as
they allow for better static code analysis & formatting, and solve the
issue of circular imports without being unnecessarily dynamic as
`apps.get_model`. With `import` statements, it's possible to:

- Jump to model definitions in most IDEs
- Automatically sort inline imports with `isort`
- Find import errors faster/easier (most IDEs highlight broken imports)
- Have more consistency across regular & inline imports when importing
models

This PR also adds a flake8 rule to ban imports of `django.apps.apps`, so
it's harder to use `apps.get_model` by mistake (it's possible to ignore
this rule by using `# noqa: I251`). The rule is not enforced on
directories with migration files, because `apps.get_model` is often used
to get a historical state of a model, which is useful when writing
migrations ([see this SO answer for more
details](https://stackoverflow.com/a/37769213)). So `apps.get_model` is
considered OK in migrations (even necessary in some cases).

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2023-07-25 09:43:23 +00:00
Michael Derynck
397f961486
Fix organizations not being deleted by start_cleanup_deleted_organizations (#1950)
Organizations that have been deleted outside OnCall were not being
cleaned up by this task as expected.

- Use PluginAuthToken instead of GCOM token == None to determine if the
oncall organization should be matched in GCOM
- Fix how delete was being checked for the instance, the previous method
does not work.
2023-05-17 12:56:57 +00:00
Joey Orlando
3b274f45f4
add several new database columns + emit two new Django signals (#1522)
# What this PR does

- add new columns `gcom_org_contract_type`,
`gcom_org_irm_sku_subscription_start_date`, and
`gcom_org_oldest_admin_with_billing_privileges_user_id` to
`user_management_organization` table + `is_restricted` column to
`alerts_alertgroup` table
- emit two new Django signals
- `org_sync_signal` at the end of the
`engine/apps/user_management/sync.py::sync_organization` method
  - `alert_group_created_signal` when a new Alert Group is created

## Checklist

- [ ] Tests updated (N/A)
- [ ] Documentation added (N/A)
- [x] `CHANGELOG.md` updated

---------

Co-authored-by: Rares Mardare <rares.mardare@grafana.com>
2023-04-14 09:15:57 +02:00
Innokentii Konstantinov
fbb83daf21
Store org cluster_slug (#1480)
# What this PR does
Store org cluster slug to write insight logs
2023-03-09 04:10:19 +00:00
Joey Orlando
b61f2ce41f
patch minor sync issue when HTTP 302 is received from Grafana API instance (#1393)
# What this PR does

this PR refactors the `sync_organization` and
`GrafanaAPIClient.is_rbac_enabled_for_organization` methods to check the
connected response bool rather than explicit check on HTTP 200. This
handles the legitimate case where the Grafana instance may return an
HTTP 302 (redirect) rather than an HTTP 200.

## Which issue(s) this PR fixes

See
[this](https://grafana.slack.com/archives/C02LSUUSE2G/p1677136582890269)
Slack thread in the community channel for more context

## Checklist

- [x] Tests updated
- [ ] Documentation added (N/A)
- [x] `CHANGELOG.md` updated
2023-02-23 13:23:57 +00:00
Innokentii Konstantinov
cfa7fb816c
Sync users and teams on tf requests (#1180)
# What this PR does
This PR add sync with grafana on requests from terraform 

## Which issue(s) this PR fixes
It's needed to fix case when customers want to create team via grafana
terraform provider and use it in the oncall provider without having to
log into Grafana Cloud.

Co-authored-by: Joey Orlando <joey.orlando@grafana.com>
2023-01-24 13:44:07 +08:00
Yulya Artyukhina
9129a720ef
Integration with grafana incident (#1081)
Check if Grafana Incident is enabled. If it is, add a button with a link
to declare Grafana Incident from Alert group in Slack and on Web.

Co-authored-by: Yulia Shanyrova <yulia.shanyrova@grafana.com>
2023-01-17 13:04:50 +01:00
Joey Orlando
babacf4da8
refactor the is_rbac_permissions_enabled check to be more robust (#1099)
# What this PR does
Checks the `is_rbac_permissions_enabled` flag differently based on
whether we are dealing with an open-source, or cloud installation:
- for open-source installations, simply continue making a `HEAD` request
to the list RBAC permissions Grafana API endpoint.
- for cloud installations, use the `config` object returned from `GET
/instances/{instance_id}?config=true` and check whether
`instance_info["config"]["feature_toggles"]["accessControlOnCall"] ==
"true"`

## Which issue(s) this PR fixes
Resolves the issue in hosted grafana where when a stack is inactive, the
hosted grafana gateway, returns 200 to the `HEAD` request (which
erroneously sets the `is_rbac_permissions_enabled` flag to `true`)

## Checklist

- [x] Tests updated (N/A)
- [ ] Documentation added
- [x] `CHANGELOG.md` updated
2023-01-11 12:48:30 +01:00
Innokentii Konstantinov
8abbcee050
Org soft-delete (#1073)
# What this PR does
It introduces soft-delete of organization, since grafana stacks are
soft-deleted too. Also, we had a problem with deleting orgs with large
amounts of alerts, so soft-deletion will fix this problem. I think, that
problem of cleaning alerts of deleted orgs should be solved as a part of
alert retention
2023-01-05 12:42:55 +08:00
Joey Orlando
b66dd1a30c
fix sync.. again (#978) 2022-12-12 18:48:26 +01:00
Joey Orlando
3625b75791
fix cloud sync related issue (#977)
this PR reverts [this
change](9e598385f4 (diff-a74aa8f07a8fdc31af66559390f0fc77b66692d43e6d3c5f94311ef7eed5efabL19-L55))
and removes the `str` casting that was done on the `orgId` field
returned from the Grafana COM API
2022-12-12 18:25:56 +01:00
Joey Orlando
9e598385f4
Add RBAC Support (#777)
* Modify plugin.json to support RBAC role registration

* defines 26 new custom roles in plugin.json. The main roles are:

- Admin: read/write access to everything in OnCall
- Reader: read access to everything in OnCall
- OnCaller : read access to everything in OnCall + edit access to Alert Groups and Schedules
- <object-type> Editor: read/write access to everything related to <object-type>
- <object-type> Reader: read access for <object-type>
- User Settings Admin: read/write access to all user's settings, not just own settings. This is in comparison to User Settings Editor which can only read/write own settings

* update changelog and documentation (#686)

* implement RBAC for OnCall backend

This commit refactors backend authorization. It trys to use RBAC authorization if the org's grafana instance supports it, otherwise it falls back to basic role authorization.

* update RBAC backend tests

* add tests for RBAC changes
- run backend tests as matrix where RBAC is enabled/disabled. When RBAC is enabled, the permissions granted are read from the role grants in the frontend's plugin.json file (instead of relying what we specify in RBACPermission.Permissions)
- remove --reuse-db --nomigrations flags from engine/tox.ini
- minor autoformatting changes to docker-compose-developer.yml

* remove --ds=settings.ci-test from pytest CI command

DJANGO_SETTINGS_MODULE is already specified as an env var so this is just unecessary duplication

* update gitignore

* update github action job name for "test"

* RBAC frontend changes

* refactors the use of basic roles (ex. Viewer, Editor, Admin) use RBAC permissions (when supported), or falling back to basic roles when RBAC is not supported.

- updates the UserAction enum in grafana-plugin/src/state/userAction.ts. Previously this was hardcoded to a list of strings that were being returned by the OnCall API. Now the values here correspond to the permissions in plugin.json (plus a fallback role)

* changes per Gabriel's comments:
- get rid of group attribute in rbac roles
- remove displayName role attribute
- remove hidden role attribute
- add back role to includes section

* don't try to update user timezone if they don't have permission
2022-11-29 09:41:56 +01:00
Michael Derynck
fa5d4f2674 Add region_slug column to organization 2022-10-11 12:04:33 -06:00
Michael Derynck
fa0d854ee5 Fix log message not displaying primary key for organization being deleted 2022-09-08 11:56:04 -06:00
Michael Derynck
20d3ff2fff Add task to delete organizations if their stack has been deleted in gcom 2022-09-02 14:06:42 -06:00
Michael Derynck
6b40f95033 World, meet OnCall!
Co-authored-by: Eve832 <eve.meelan@grafana.com>
    Co-authored-by: Francisco Montes de Oca <nevermind89x@gmail.com>
    Co-authored-by: Ildar Iskhakov <ildar.iskhakov@grafana.com>
    Co-authored-by: Innokentii Konstantinov <innokenty.konstantinov@grafana.com>
    Co-authored-by: Julia <ferril.darkdiver@gmail.com>
    Co-authored-by: maskin25 <kengurek@gmail.com>
    Co-authored-by: Matias Bordese <mbordese@gmail.com>
    Co-authored-by: Matvey Kukuy <motakuk@gmail.com>
    Co-authored-by: Michael Derynck <michael.derynck@grafana.com>
    Co-authored-by: Richard Hartmann <richih@richih.org>
    Co-authored-by: Robby Milo <robbymilo@fastmail.com>
    Co-authored-by: Timur Olzhabayev <timur.olzhabayev@grafana.com>
    Co-authored-by: Vadim Stepanov <vadimkerr@gmail.com>
    Co-authored-by: Yulia Shanyrova <yulia.shanyrova@grafana.com>
2022-06-03 08:09:47 -06:00