centralcloud/oncall-engine

Author	SHA1	Message	Date
Vadim Stepanov	f977f9faee	Minor formatting changes (#2641 ) # What this PR does - Updates `black` and `flake8` to latest - Removes `F541` from flake8 ignore (`F541 f-string is missing placeholders`) - Enables ["float to top" option](https://pycqa.github.io/isort/docs/configuration/options.html#float-to-top) for `isort`	2023-07-26 14:45:44 +01:00
Ildar Iskhakov	c73d0f385a	Remove checks that slow down plugin load and cause "Initializing plugin..." (#2624 ) # What this PR does * Removes "Initializing plugin.." message during load * Removes black screen when plugin loads * Removes wait for syncs between OnCall and Grafana * Deprecates GET /status, POST /sync, GET /sync in favour of single POST /status ## Which issue(s) this PR fixes ## Checklist - [ ] Unit, integration, and e2e (if applicable) tests updated - [ ] Documentation added (or `pr:no public docs` PR label added if not required) - [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not required)	2023-07-26 10:57:57 +00:00
Vadim Stepanov	b2f4ffb98a	`apps.get_model` -> `import` (#2619 ) # What this PR does Remove [`apps.get_model`](https://docs.djangoproject.com/en/3.2/ref/applications/#django.apps.apps.get_model) invocations and use inline `import` statements in places where models are imported within functions/methods to avoid circular imports. I believe `import` statements are more appropriate for most use cases as they allow for better static code analysis & formatting, and solve the issue of circular imports without being unnecessarily dynamic as `apps.get_model`. With `import` statements, it's possible to: - Jump to model definitions in most IDEs - Automatically sort inline imports with `isort` - Find import errors faster/easier (most IDEs highlight broken imports) - Have more consistency across regular & inline imports when importing models This PR also adds a flake8 rule to ban imports of `django.apps.apps`, so it's harder to use `apps.get_model` by mistake (it's possible to ignore this rule by using `# noqa: I251`). The rule is not enforced on directories with migration files, because `apps.get_model` is often used to get a historical state of a model, which is useful when writing migrations ([see this SO answer for more details](https://stackoverflow.com/a/37769213)). So `apps.get_model` is considered OK in migrations (even necessary in some cases). ## Checklist - [x] Unit, integration, and e2e (if applicable) tests updated - [x] Documentation added (or `pr:no public docs` PR label added if not required) - [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not required)	2023-07-25 09:43:23 +00:00
Joey Orlando	f16df03279	support comma and space delimited grafana feature_toggles (#2623 ) ## Which issue(s) this PR fixes See the following threads for more context on the issue this PR addresses: - https://raintank-corp.slack.com/archives/CRUKW54N5/p1690068395450819 - https://raintank-corp.slack.com/archives/C036J5B39/p1690183217162019 ## Checklist - [x] Unit, integration, and e2e (if applicable) tests updated - [x] Documentation added (or `pr:no public docs` PR label added if not required) - [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not required)	2023-07-25 02:39:39 -04:00
Joey Orlando	182f18de2c	fix parsing of grafana feature flags that're enabled via the feature_toggles.enabled syntax (#2477 ) # What this PR does Grafana provides two methods for enabling feature flags: - `feature_toggles.enabled` - `feature_toggles.<name_of_feature_flag>` For example, to enable `accessControlOnCall` you could either do: ```json { "feature_toggles": { "enabled": "accessControlOnCall,someOtherCoolFeatureFlag" } } ``` or: ```json { "feature_toggles": { "accessControlOnCall": true, "someOtherCoolFeatureFlag": true } } ``` In method 1, if there are multiple feature flags present, they are _comma separated, not space separated_. This PR fixes this parsing issue. ## Checklist - [x] Unit, integration, and e2e (if applicable) tests updated - [ ] Documentation added (or `pr:no public docs` PR label added if not required) (N/A) - [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not required)	2023-07-10 04:59:15 -04:00
Joey Orlando	75028d0427	continue addressing mypy violations (#2170 ) # What this PR does See #2173 Also, closes #2187 . All of the new files under `type_stubs/icalendar` were autogenerated by running: ```bash stubgen -p icalendar -o type_stubs ``` ## Checklist - [ ] Unit, integration, and e2e (if applicable) tests updated - [ ] Documentation added (or `pr:no public docs` PR label added if not required) - [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not required)	2023-06-27 10:23:08 +00:00
Joey Orlando	9dde1805aa	add mypy static type checker to backend codebase (#2151 ) # What this PR does - Adds [`mypy` static type checking](https://mypy-lang.org/) to our CI pipeline. Currently there is still a ton of errors being returned by the tool, as we'll need to fix pre-existing errors. I think we can slowly chip away at these errors in small PRs, doing them all in one large PR is likely very risky. - Also, this PR starts chipping away at one of the main type errors that we have which is accessing the `datetime` class (from the `datetime` library) or `timedelta` function on the `django.utils.timezone` module. Basically we should be instead accessing these two objects from the native `datetime` module. This makes sense because the [`__all__` attribute](https://github.com/django/django/blob/main/django/utils/timezone.py#L14-L30) in `django.utils.timezone` does not re-export `datetime` or `timedelta`. - splits `engine` dependencies out into `requirements.txt` and `requirements-dev.txt` ## Checklist - [ ] Unit, integration, and e2e (if applicable) tests updated (N/A) - [ ] Documentation added (or `pr:no public docs` PR label added if not required) (N/A) - [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not required) (N/A)	2023-06-12 12:50:33 -04:00
Vadim Stepanov	c921674471	Improve plugin authentication (#1995 ) # What this PR does Handle different failing authentication scenarios (e.g. when token is invalid or instance context is not a valid JSON) so endpoints return appropriate response code (401 instead of 500). ## Which issue(s) this PR fixes Related to https://github.com/grafana/oncall-private/issues/1633 ## Checklist - [x] Unit, integration, and e2e (if applicable) tests updated - [x] Documentation added (or `pr:no public docs` PR label added if not required) - [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not required)	2023-05-23 16:13:25 +00:00
Michael Derynck	397f961486	Fix organizations not being deleted by start_cleanup_deleted_organizations (#1950 ) Organizations that have been deleted outside OnCall were not being cleaned up by this task as expected. - Use PluginAuthToken instead of GCOM token == None to determine if the oncall organization should be matched in GCOM - Fix how delete was being checked for the instance, the previous method does not work.	2023-05-17 12:56:57 +00:00
Joey Orlando	014a9c2ec2	allow the POST incoming alert endpoints to queue create_alert tasks independent of the database status (#1896 ) # What this PR does https://www.loom.com/share/18cc445117de4895a10892d56c7d3699 In preparation to upgrade our cloud databases, this PR makes some minor changes which, after testing locally, allowed the `POST /<integration_type>/<alert_channel_key>` endpoints to successfully receive incoming alerts and queue the celery tasks. I've tested all of the defined `POST /integrations/v1/<integration_type>/<alert_channel_key>` endpoints by sending `POST` requests to an integrations' URL while the MySQL database was down, bringing the database back up, and ensuring the alerts were created. ## Some other findings - the integration heartbeat endpoints will not work as we interact w/ the database to persist the incoming heartbeat instance - if the integration was created in the last 180 seconds, incoming alerts will fail due to the way we cache the integration IDs ([code](https://github.com/grafana/oncall/blob/dev/engine/apps/integrations/mixins/alert_channel_defining_mixin.py#L47-L50)) - The `create_alert` celery task is set to `max_retries=None` and `retry_backoff=True`. This means that the queued tasks will continue retrying forever w/ an exponential backoff, until the alerts can be created in the database (ie. when the database is back online). ## Checklist - [ ] Unit, integration, and e2e (if applicable) tests updated (N/A) - [ ] Documentation added (or `pr:no public docs` PR label added if not required) (N/A) - [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not required) (N/A)	2023-05-10 12:36:23 +00:00
Joey Orlando	2879537c30	properly parse grafana cloud feature toggles (#1880 ) # What this PR does ## Which issue(s) this PR fixes ## Checklist - [x] Unit, integration, and e2e (if applicable) tests updated - [ ] Documentation added (or `pr:no public docs` PR label added if not required) (N/A) - [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not required)	2023-05-04 16:38:26 +00:00
Michael Derynck	6a08fdc145	Do not retry start_sync_organization (#1856 ) start_sync_organization is scheduled to run every 30 mins. Countdown is not specified so the default countdown with exponential backoff will result in retries happening after the next 30 min trigger. If it is in a state where it is retrying for a long period of time (>30 mins) it will stack up too many redundant sync_organization_async tasks when it finally does succeed.	2023-05-02 02:33:26 +00:00
Innokentii Konstantinov	f0ce08bd67	Check stack cluster for insight_logs (#1469 ) # What this PR does This PR modifies is_insight_logs_enabled to check for a stack cluster instead of DynamicSetting ## Checklist - [ ] Tests updated - [ ] Documentation added - [ ] `CHANGELOG.md` updated	2023-03-09 06:30:54 +00:00
Innokentii Konstantinov	fbb83daf21	Store org cluster_slug (#1480 ) # What this PR does Store org cluster slug to write insight logs	2023-03-09 04:10:19 +00:00
Innokentii Konstantinov	7bad073626	Remove OSS_INSTALLATION env var (#881 ) It's a duplicate of LICENSE env var What this PR does: Remove OSS_INSTALLATION env var in favour of LICENSE env var. Also, I refactored features tests a little. From my point of view it makes little sense to test if all features are disabled or enabled. Better to test a specific use case (e.g. oss installation). Also, to test that all features are disabled, LICENSE has to be set to the cloud license, which makes the test confusing. Checklist - [x] Tests updated - [ ] Documentation added - [ ] `CHANGELOG.md` updated	2023-03-07 11:07:42 +00:00
Innokentii Konstantinov	4b91203eca	Add validation of hostname for reCAPTCHA (#1445 ) # What this PR does - Implement reCAPTCHA v3 check. DRF_RECAPTCHA didn't support hostname validation and it's too complicated to add it. - Add validation of verification code on the OnCall side to avoid calling Twilio with obviously invalid codes ## Checklist - [x] Tests updated - [ ] Documentation added - [ ] `CHANGELOG.md` updated	2023-03-06 08:59:48 +00:00
Michael Derynck	b3659872a7	Get reCAPTCHA site key from backend env (#1400 ) # What this PR does Move reCAPTCHA site key to backend environment for easier management to support multiple environments. ## Which issue(s) this PR fixes ## Checklist - [ ] Tests updated - [ ] Documentation added - [x] `CHANGELOG.md` updated	2023-02-24 15:53:35 +00:00
Joey Orlando	b61f2ce41f	patch minor sync issue when HTTP 302 is received from Grafana API instance (#1393 ) # What this PR does this PR refactors the `sync_organization` and `GrafanaAPIClient.is_rbac_enabled_for_organization` methods to check the connected response bool rather than explicit check on HTTP 200. This handles the legitimate case where the Grafana instance may return an HTTP 302 (redirect) rather than an HTTP 200. ## Which issue(s) this PR fixes See [this](https://grafana.slack.com/archives/C02LSUUSE2G/p1677136582890269) Slack thread in the community channel for more context ## Checklist - [x] Tests updated - [ ] Documentation added (N/A) - [x] `CHANGELOG.md` updated	2023-02-23 13:23:57 +00:00
Ildar Iskhakov	1fc3f6d301	Refactor plugin sync (#1200 ) # What this PR does This PR adds a shortcut in the plugin synchronisation process, so existing users will be able to log in without waiting for the sync task. Every request still starts the background synchronisation task, to be able to propagate the organisation changes faster than the periodic task. It means that we don't necessarily need a "force reload" button in the interface. For all the other cases (user does not exist, organisation token "not ok", etc) the process remains the same - the plugin will show "Initialising plugin..." until the background task is successfully completed Co-authored-by: Joey Orlando <joey.orlando@grafana.com>	2023-01-25 09:12:08 +08:00
Innokentii Konstantinov	cfa7fb816c	Sync users and teams on tf requests (#1180 ) # What this PR does This PR adds a sync with Grafana on requests from Terraform ## Which issue(s) this PR fixes It's needed to fix the case when customers want to create a team via the Grafana Terraform provider and use it in the OnCall provider without having to log into Grafana Cloud. Co-authored-by: Joey Orlando <joey.orlando@grafana.com>	2023-01-24 13:44:07 +08:00
Joey Orlando	98241b9a10	fake-data generation script + fixes for django-silk and django-debug-toolbar (#1128 ) # What this PR does ## Main stuff - add Python script to populate local Grafana/OnCall setup w/ large amounts of fake data. Right now the data types that can be generated are: - teams and Admin users via the Grafana API (must be synced manually by going into the UI before going onto the next step) - Calendar Schedules which have three 8h oncall-shifts, via the OnCall public API - fixes `django-debug-toolbar` when being run in `docker-compose` locally ## Other stuff - documents how to easily modify the Grafana `docker-compose` container provisioning configuration - document solutions for two backend setup related issues encountered when running the engine/celery workers locally, outside of `docker-compose`, on an Apple silicon Mac - fixes small bug in `grafana_plugin.helpers.client.APIClient.call_api` where it would call `response.json()` for all requests, regardless of whether or not the response actually contained data or not - in `engine/settings/dev.py`, properly setup `django-silk` and document the steps to use it locally - make it possible to log out debug SQL queries by specifying `DEV_DEBUG_VIEW_SQL_QUERIES` env var, rather than having to uncomment out a section of `settings/dev.py` ## Which issue(s) this PR fixes - Some local setup issues when trying to use `django-silk` and `django-debug-toolbar` - Makes it much easier to populate your local setup with a lot of fake data - Makes it possible to easily modify your local grafana's provisioning configuration ## Checklist - [ ] Tests updated (N/A) - [ ] Documentation added (N/A) - [ ] `CHANGELOG.md` updated (N/A)	2023-01-20 09:19:41 +01:00
Yulya Artyukhina	9129a720ef	Integration with grafana incident (#1081 ) Check if Grafana Incident is enabled. If it is, add a button with a link to declare Grafana Incident from Alert group in Slack and on Web. Co-authored-by: Yulia Shanyrova <yulia.shanyrova@grafana.com>	2023-01-17 13:04:50 +01:00
Joey Orlando	babacf4da8	refactor the is_rbac_permissions_enabled check to be more robust (#1099 ) # What this PR does Checks the `is_rbac_permissions_enabled` flag differently based on whether we are dealing with an open-source, or cloud installation: - for open-source installations, simply continue making a `HEAD` request to the list RBAC permissions Grafana API endpoint. - for cloud installations, use the `config` object returned from `GET /instances/{instance_id}?config=true` and check whether `instance_info["config"]["feature_toggles"]["accessControlOnCall"] == "true"` ## Which issue(s) this PR fixes Resolves the issue in hosted grafana where when a stack is inactive, the hosted grafana gateway, returns 200 to the `HEAD` request (which erroneously sets the `is_rbac_permissions_enabled` flag to `true`) ## Checklist - [x] Tests updated (N/A) - [ ] Documentation added - [x] `CHANGELOG.md` updated	2023-01-11 12:48:30 +01:00
Innokentii Konstantinov	8abbcee050	Org soft-delete (#1073 ) # What this PR does It introduces soft-delete of organization, since grafana stacks are soft-deleted too. Also, we had a problem with deleting orgs with large amounts of alerts, so soft-deletion will fix this problem. I think, that problem of cleaning alerts of deleted orgs should be solved as a part of alert retention	2023-01-05 12:42:55 +08:00
Joey Orlando	8c0eba46b9	remove is_rbac_permissions_enabled logic from check_gcom_permission function (#976 ) this field will be addressed in a subsequent sync call it is not necessary here	2022-12-12 17:05:44 +01:00
Michael Derynck	52d6009c2a	Remove unused parameter from gcom call (#975 ) # What this PR does Remove unused parameter from gcom call ## Which issue(s) this PR fixes ## Checklist - [ ] Tests updated - [ ] Documentation added - [ ] `CHANGELOG.md` updated	2022-12-12 08:40:32 -07:00
Michael Derynck	ad3cd8f5dd	Remove unused call for checking gcom api keys (#931 )	2022-12-01 10:35:53 -07:00
Joey Orlando	a9ac7e82df	update grafana API RBAC permissions endpoint	2022-11-30 09:03:10 +01:00
Joey Orlando	9e598385f4	Add RBAC Support (#777 ) * Modify plugin.json to support RBAC role registration * defines 26 new custom roles in plugin.json. The main roles are: - Admin: read/write access to everything in OnCall - Reader: read access to everything in OnCall - OnCaller: read access to everything in OnCall + edit access to Alert Groups and Schedules - <object-type> Editor: read/write access to everything related to <object-type> - <object-type> Reader: read access for <object-type> - User Settings Admin: read/write access to all user's settings, not just own settings. This is in comparison to User Settings Editor which can only read/write own settings * update changelog and documentation (#686) * implement RBAC for OnCall backend This commit refactors backend authorization. It tries to use RBAC authorization if the org's grafana instance supports it, otherwise it falls back to basic role authorization. * update RBAC backend tests * add tests for RBAC changes - run backend tests as matrix where RBAC is enabled/disabled. When RBAC is enabled, the permissions granted are read from the role grants in the frontend's plugin.json file (instead of relying on what we specify in RBACPermission.Permissions) - remove --reuse-db --nomigrations flags from engine/tox.ini - minor autoformatting changes to docker-compose-developer.yml * remove --ds=settings.ci-test from pytest CI command DJANGO_SETTINGS_MODULE is already specified as an env var so this is just unnecessary duplication * update gitignore * update github action job name for "test" * RBAC frontend changes * refactors the use of basic roles (ex. Viewer, Editor, Admin) to use RBAC permissions (when supported), or falling back to basic roles when RBAC is not supported. - updates the UserAction enum in grafana-plugin/src/state/userAction.ts. Previously this was hardcoded to a list of strings that were being returned by the OnCall API. Now the values here correspond to the permissions in plugin.json (plus a fallback role) * changes per Gabriel's comments: - get rid of group attribute in rbac roles - remove displayName role attribute - remove hidden role attribute - add back role to includes section * don't try to update user timezone if they don't have permission	2022-11-29 09:41:56 +01:00
Yulya Artyukhina	381520ee13	Get rid of installation token + add a bunch of tests (#624 ) * Get rid of installation token (for OSS installations) This is done by being required to supply the grafana API URL as an environment variable on the backend. Additionally, optionally an OnCall API URL environment variable can be passed in to the frontend (this basically allows completely skipping the need to configure anything). - deduplicated a lot of the sync logic on the frontend + made error message more useful and consistent - Split PluginConfigPage component into several subcomponents (making it easier to test each individual component) - Moved RootWithLoader (from plugin/GrafanaPluginRootPage) into its own subcomponent (making it easier to test) - Added tests for pre-existing components that were touched: - PluginConfigPage component (and its new subcomponents) - state/plugin and state/rootBaseStore functions - apps.grafana_plugin django app Helm changes: - add GRAFANA_API_URL to oncall.env - some yaml autoformatting changes - remove reference to python manage.py issue_invite_for_the_frontend --override Co-authored-by: Joey Orlando <joseph.t.orlando@gmail.com>	2022-11-21 16:26:00 +01:00
Michael Derynck	37825059ff	Add region sync and reverse proxy for migration	2022-10-24 21:25:32 -06:00
Michael Derynck	fa5d4f2674	Add region_slug column to organization	2022-10-11 12:04:33 -06:00
Michael Derynck	b79aa95d42	Tweaks from code review, add schedule and queue assignment	2022-09-07 07:58:44 -06:00
Michael Derynck	7acaf77d0f	Query deleted instances directly instead of checking all that have not sync'd recently	2022-09-06 10:21:05 -06:00
Michael Derynck	20d3ff2fff	Add task to delete organizations if their stack has been deleted in gcom	2022-09-02 14:06:42 -06:00
Julia	d7492bb943	Fix creation contact points for grafana alerting integration	2022-06-16 17:16:31 +03:00
Matias Bordese	0e26568857	Remove demo token related code/logic	2022-06-09 09:16:10 -03:00
Michael Derynck	6b40f95033	World, meet OnCall! Co-authored-by: Eve832 <eve.meelan@grafana.com> Co-authored-by: Francisco Montes de Oca <nevermind89x@gmail.com> Co-authored-by: Ildar Iskhakov <ildar.iskhakov@grafana.com> Co-authored-by: Innokentii Konstantinov <innokenty.konstantinov@grafana.com> Co-authored-by: Julia <ferril.darkdiver@gmail.com> Co-authored-by: maskin25 <kengurek@gmail.com> Co-authored-by: Matias Bordese <mbordese@gmail.com> Co-authored-by: Matvey Kukuy <motakuk@gmail.com> Co-authored-by: Michael Derynck <michael.derynck@grafana.com> Co-authored-by: Richard Hartmann <richih@richih.org> Co-authored-by: Robby Milo <robbymilo@fastmail.com> Co-authored-by: Timur Olzhabayev <timur.olzhabayev@grafana.com> Co-authored-by: Vadim Stepanov <vadimkerr@gmail.com> Co-authored-by: Yulia Shanyrova <yulia.shanyrova@grafana.com>	2022-06-03 08:09:47 -06:00

38 commits
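
Several commits above (#1099, #1880, #2477, #2623) concern how Grafana feature toggles are read: a flag can be turned on either through `feature_toggles.enabled` (a delimited list of names, comma separated per #2477, with space delimiting also supported per #2623) or through `feature_toggles.<name_of_feature_flag>`. The following is a minimal sketch of that idea only, not the code from those PRs. The function name `is_feature_toggle_enabled` is hypothetical, and accepting both the boolean `true` and the string `"true"` is an assumption based on the two forms quoted in the commit messages.

```python
import re
from typing import Any, Mapping


def is_feature_toggle_enabled(feature_toggles: Mapping[str, Any], name: str) -> bool:
    """Hypothetical helper: True if `name` is enabled via either syntax."""
    # Method 2: feature_toggles.<name> is set directly.
    # Assumption: the value may arrive as a boolean or as the string "true".
    value = feature_toggles.get(name)
    if value is True:
        return True
    if isinstance(value, str) and value.strip().lower() == "true":
        return True

    # Method 1: feature_toggles.enabled holds a list of flag names,
    # delimited by commas and/or whitespace.
    enabled = feature_toggles.get("enabled", "")
    if not isinstance(enabled, str):
        return False
    names = [part for part in re.split(r"[,\s]+", enabled) if part]
    return name in names
```

Splitting on the delimiters and comparing whole names, rather than doing a substring check on the raw `enabled` string, avoids matching a flag whose name merely contains the one being looked up.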