centralcloud/oncall-engine

Author	SHA1	Message	Date
Joey Orlando	f5495ed702	add first multi-role e2e tests (#2417 ) # What this PR does Lays ground work for #1586. Adds three new fixtures, `adminRolePage`, `editorRolePage`, and `viewerRolePage`. These fixtures can be easily accessed in a `test` context and allow the test to be run as a user authenticated with one of these Grafana basic roles. The bulk of the changes in the PR are to the "global setup" step. There is a bit of logic + communication with the Grafana instance's API, in order to create all the necessary authentication credentials. Lastly, adds the first basic role authorization test, asserting that Admin/Editors can view the list of OnCall users, whereas Viewers cannot. ## Checklist - [x] Unit, integration, and e2e (if applicable) tests updated - [ ] Documentation added (or `pr:no public docs` PR label added if not required) - [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not required)	2023-07-04 09:19:14 +00:00
Joey Orlando	9dde1805aa	add mypy static type checker to backend codebase (#2151 ) # What this PR does - Adds [`mypy` static type checking](https://mypy-lang.org/) to our CI pipeline. Currently there is still a ton of errors being returned by the tool, as we'll need to fix pre-existing errors. I think we can slowly chip away at these errors in small PRs, doing them all in one large PR is likely very risky. - Also, this PR starts chipping away at one of the main type errors that we have which is accessing the `datetime` class (from the `datetime` library) or `timedelta` function on the `django.utils.timezone` module. Basically we should be instead accessing these two objects from the native `datetime` module. This makes sense because the [`__all__` attribute](https://github.com/django/django/blob/main/django/utils/timezone.py#L14-L30) in `django.utils.timezone` does not re-export `datetime` or `timedelta`. - splits `engine` dependencies out into `requirements.txt` and `requirements-dev.txt` ## Checklist - [ ] Unit, integration, and e2e (if applicable) tests updated (N/A) - [ ] Documentation added (or `pr:no public docs` PR label added if not required) (N/A) - [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not required) (N/A)	2023-06-12 12:50:33 -04:00
Joey Orlando	47042decb7	don't enforce line-length rule for markdownlint for code-blocks or tables (#2145 ) # What this PR does ## Which issue(s) this PR fixes ## Checklist - [ ] Unit, integration, and e2e (if applicable) tests updated - [ ] Documentation added (or `pr:no public docs` PR label added if not required) - [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not required)	2023-06-09 06:57:19 +00:00
Matias Bordese	eee5065e74	Add initial setup for local dev prometheus exporter (#2039 )	2023-06-01 12:31:33 +00:00
Roman Pertl	39770c2266	Feat(Dev): Improve Building of Grafana Plugin in Development Env + update node version (#1890 ) # What this PR does - Improvement to the local development environment for the grafana plugin - Run initial yarn build inside the docker container with the same version that is later used for periodic rebuilds - Removes the requirement for having yarn/nodejs installed locally - Using a named volume for storing the node_modules, so they are only stored once - Remove the yarn install step from the Dockerfile - Ideally we store the node_modules only once inside the named volumes. Currently they are stored three times - on the host system outside of docker in grafana-plugins/node_modules - inside the docker image - inside the anonymous docker volume created at the start of a container - update `node` to 18.16.0 (14.17.0 has reached end-of-life as of 3 weeks ago) ## Which issue(s) this PR fixes ## Checklist - [X] ~Unit, integration, and e2e (if applicable) tests updated~ N/A - [ ] Documentation added (or `pr:no public docs` PR label added if not required) - [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not required) --------- Co-authored-by: Joey Orlando <joseph.t.orlando@gmail.com>	2023-05-17 16:12:51 -04:00
Joey Orlando	bb3521b879	upgrade to python 3.11.3 (#1849 ) # What this PR does Upgrades the backend to Python 3.11.3 (latest stable release) + update linting step on Drone builds to run all the linting steps, not just the Python ones. ## Checklist - [ ] Unit, integration, and e2e (if applicable) tests updated (N/A) - [ ] Documentation added (or `pr:no public docs` PR label added if not required) (N/A) - [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not required)	2023-05-05 15:32:40 +00:00
Joey Orlando	0d4db59137	Add "Notifications Receiver" RBAC role (#1853 ) # What this PR does Closes #1651 Plus, add developer instructions on how to run `grafana-enterprise` with RBAC for OnCall, enabled locally. ## Todo - [x] add API integration test for new `permission` query param filter ## Checklist - [x] Unit, integration, and e2e (if applicable) tests updated - [x] Documentation added (or `pr:no public docs` PR label added if not required) - [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not required)	2023-05-02 12:19:34 +00:00
Michael Derynck	fdac018948	Update dev directory .gitignore (#1850 )	2023-04-28 15:51:58 +00:00
Shantanu Alsi	e806ad32f1	Fix documentation links (#1766 ) # What this PR does ## Which issue(s) this PR fixes ## Checklist - [x] Unit, integration, and e2e (if applicable) tests updated - [x] Documentation added (or `pr:no public docs` PR label added if not required) - [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not required) --------- Co-authored-by: Joey Orlando <joey.orlando@grafana.com>	2023-04-19 10:12:16 +01:00
Yulya Artyukhina	f61af74411	Add mobile app auth for `notification_policies` endpoint (#1725 ) Add mobile app authentication for `notification_policies` endpoint	2023-04-11 16:36:46 +00:00
Matvey Kukuy	e14cf8f269	Readme updates	2023-04-11 15:43:52 +03:00
Ildar Iskhakov	b2b7237bc4	Update README.md (#1587 ) # What this PR does ## Which issue(s) this PR fixes ## Checklist - [ ] Tests updated - [ ] Documentation added - [ ] `CHANGELOG.md` updated	2023-03-21 23:45:22 +08:00
Ildar Iskhakov	3f9dec6a68	Add "make help" command (#1583 ) # What this PR does Moved part of dev/README.md into `make help` command: ``` make help start start all of the docker containers init build the frontend plugin code then run make start restart restart all docker containers build rebuild images (e.g. when changing requirements.txt) cleanup this will remove all of the images, containers, volumes, and networks lint run both frontend and backend linters test run backend tests start-celery-beat start celery beat purge-queues purge celery queues shell starts an OnCall engine Django shell dbshell opens a DB shell engine-manage run Django's `manage.py` script, inside of a docker container, passing `$CMD` as arguments. exec-engine exec into engine container's bash _backend-debug-enable enable Django's debug mode and Silk profiling (this is disabled by default for performance reasons) _backend-debug-disable disable Django's debug mode and Silk profiling backend-manage-command run Django's `manage.py` script, passing `$CMD` as arguments. ``` ## Which issue(s) this PR fixes ## Checklist - [ ] Tests updated - [ ] Documentation added - [ ] `CHANGELOG.md` updated	2023-03-21 08:12:13 +00:00
Joey Orlando	4d655dff60	modify check_escalation_finished_task task (#1266 ) # What this PR does This PR: - modifies the `check_escalation_finished_task` celery task to: - do stricter escalation validation based on the alert group's escalation snapshot (see the `audit_alert_group_escalation` method in `engine/apps/alerts/tasks/check_escalation_finished.py` for the validation logic) - use a read-only database for querying alert-groups if one is configured, otherwise use the "default" one - ping a configurable heartbeat (new env var `ALERT_GROUP_ESCALATION_AUDITOR_CELERY_TASK_HEARTBEAT_URL` added) - change the task interval from every 10 minutes to every 13 minutes (this can be configured via an env variable) - adds public documentation on how to configure this auditor task - modifies the local celery startup command to properly take into consideration all celery related env vars (similar to the ones we use in `engine/celery_with_exporter.sh`; this made it easier to enable `celery beat` locally for testing) - removes the following code: - removes references to `AlertGroup.estimate_escalation_finish_time` and marks the model field as deprecated using the [`django-deprecate-fields` library](https://pypi.org/project/django-deprecate-fields/). This field was only used for the previous version of this validation task - `EscalationSnapshotMixin.calculate_eta_for_finish_escalation` was only used to calculate the value for `AlertGroup.estimate_escalation_finish_time` - `calculate_escalation_finish_time` celery task ## Which issue(s) this PR fixes https://github.com/grafana/oncall-private/issues/1558 ## Checklist - [x] Tests updated - [x] Documentation added - [x] `CHANGELOG.md` updated	2023-03-17 10:14:08 +00:00
Vadim Stepanov	302586792f	Update dev/README.md (#1516 ) # What this PR does Fix `cp` command usage in `dev/README.md` + add `npx playwright install` step to integration tests instruction	2023-03-09 17:09:02 +00:00
Innokentii Konstantinov	7bad073626	Remove OSS_INSTALLATION env var (#881 ) It's a duplicate of LICENSE env var What this PR does: Remove OSS_INSTALLATION env var in favour of LICENSE env var. Also, I refactored features tests a little. From my point of view it makes little sense to test if all features are disabled or enabled. Better to test a specific use-case (e.g. oss installation). Also, to test that all features are disabled, LICENSE needs to be set to the cloud license, which makes the test confusing. Checklist - [x] Tests updated - [ ] Documentation added - [ ] `CHANGELOG.md` updated	2023-03-07 11:07:42 +00:00
Joey Orlando	8f22b2fd74	first UI integration test - phone verification + receive SMS alert flow (#900 ) What this PR does: Adds our first UI integration test using [Playwright](https://playwright.dev/) and runs the test on CI. Right now the test: - logs into Grafana - configures the plugin (if it isn't already) - creates an OnCall schedule, where the current user will be OnCall - creates an escalation chain to notify based on the newly created OnCall schedule - creates a webhook integration, attached to the created escalation chain - sends a demo alert for the new integration - goes to the alert groups page and validates that the escalation step to alert the OnCall user actually happened Currently the Playwright tests are run against the 3 default headless browsers, chromium, Firefox, and webkit. The CI job that runs these tests is run as a matrix against 3 tagged versions of `grafana`; `main`, `latest`, and `9.2.6`. Secondly, it adds most of the logic for a second test which: - logs into Grafana - configures the plugin (if it isn't already) - goes to the user's settings, verifies their phone number (using a tool called [MailSlurp](https://www.mailslurp.com/)) - configures the current user's default escalation policy to send alerts via SMS - creates an escalation policy and configures it to send alerts to our current user - creates an integration and assigns the created escalation policy - triggers a test alert + verifies that we receive the SMS alert text (again, using MailSlurp) Which issue(s) this PR fixes: Closes #873 Checklist - [x] Tests updated - [ ] Documentation added (N/A) - [ ] `CHANGELOG.md` updated (N/A)	2023-03-06 16:28:52 +00:00
Ildar Iskhakov	1b7ada4315	Add database migrations linter (#1020 ) # What this PR does This PR adds [django-migration-linter](https://github.com/3YOURMIND/django-migration-linter) to keep database migrations backwards compatible - we can automatically run migrations and they are zero-downtime, e.g. old code can work with the migrated database - we can run and rollback migrations without worrying about data safety - OnCall is deployed to the multiple environments core team is not able to control See [django-migration-linter checklist](https://github.com/3YOURMIND/django-migration-linter/blob/main/docs/incompatibilities.md) for the common mistakes and best practices ## Which issue(s) this PR fixes ## Checklist - [ ] Tests updated - [ ] Documentation added - [ ] `CHANGELOG.md` updated --------- Co-authored-by: Joey Orlando <joey.orlando@grafana.com>	2023-02-06 16:01:37 +08:00
Vadim Stepanov	08dbab73d2	Remove mobile_app_settings DynamicSetting (#1268 ) # What this PR does Remove checks for `mobile_app_settings` DynamicSetting, so changing `FEATURE_MOBILE_APP_INTEGRATION_ENABLED` is enough for toggling the mobile app backend (aka remove per-org feature flag) Co-authored-by: Joey Orlando <joey.orlando@grafana.com>	2023-02-02 13:21:04 +00:00
Vadim Stepanov	9b709e86c9	Fix local dev setup slowness (#1270 ) # What this PR does Fixes an issue when a local dev setup becomes extremely slow. - Set `DEBUG` and `SILK_PROFILER_ENABLED` to `False` by default + add utility make commands to toggle it - Use `uwsgi` instead of Django's built-in `runserver` for local dev setup - Limit Celery concurrency to 3 for local dev setup (previously was 20, used >1GB RAM on my machine) --------- Co-authored-by: Joey Orlando <joey.orlando@grafana.com>	2023-02-02 09:08:48 +00:00
Joey Orlando	98241b9a10	fake-data generation script + fixes for django-silk and django-debug-toolbar (#1128 ) # What this PR does ## Main stuff - add Python script to populate local Grafana/OnCall setup w/ large amounts of fake data. Right now the data types that can be generated are: - teams and Admin users via the Grafana API (must be synced manually by going into the UI before going onto the next step) - Calendar Schedules which have three 8h oncall-shifts, via the OnCall public API - fixes `django-debug-toolbar` when being run in `docker-compose` locally ## Other stuff - documents how to easily modify the Grafana `docker-compose` container provisioning configuration - document solutions for two backend setup related issues encountered when running the engine/celery workers locally, outside of `docker-compose`, on an Apple silicon Mac - fixes small bug in `grafana_plugin.helpers.client.APIClient.call_api` where it would call `response.json()` for all requests, regardless of whether or not the response actually contained data or not - in `engine/settings/dev.py`, properly setup `django-silk` and document the steps to use it locally - make it possible to log out debug SQL queries by specifying `DEV_DEBUG_VIEW_SQL_QUERIES` env var, rather than having to uncomment out a section of `settings/dev.py` ## Which issue(s) this PR fixes - Some local setup issues when trying to use `django-silk` and `django-debug-toolbar` - Makes it much easier to populate your local setup with a lot of fake data - Makes it possible to easily modify your local grafana's provisioning configuration ## Checklist - [ ] Tests updated (N/A) - [ ] Documentation added (N/A) - [ ] `CHANGELOG.md` updated (N/A)	2023-01-20 09:19:41 +01:00
Dieter Plaetinck	6e4d877f93	dev quickstart: no need to configure, safe to ignore warnings (#1131 )	2023-01-13 12:19:42 +01:00
Ildar Iskhakov	15256bc4cf	Remove local uwsgi instrumentation and local development tempo and agent	2023-01-04 22:25:17 +08:00
Ildar Iskhakov	846497ddc7	Move env vars to docker-compose-dev.yml and clean up	2023-01-04 10:49:42 +08:00
Ildar Iskhakov	2b0e4e1d14	Merge branch 'dev' into iskhakov/add-tracing	2023-01-04 10:46:49 +08:00
Joey Orlando	7ebc9cbbf7	modify push notification settings + use fcm-django library (#998 ) - swaps out `django-push-notifications` for [`fcm-django`](https://github.com/grafana/fcm-django). Again.. this is a fork of the parent repo for exactly the same reason.. the migrations point to `auth_user` without letting us use our own user model, this has been patched in the `grafana` fork. The reason why we are using `fcm-django` vs `django-push-notifications` is that the latter does not support the new FCM API, only the "legacy" API. The legacy FCM API does not support certain push notification settings that we would like to use. - modifies the iOS/Android specific push notification settings - adds a `flower` pod in the `docker-compose-developer.yml`, useful for debugging tasks locally - sets the mobile app verification token TTL to 5 minutes when developing locally. The default of 1 minute makes working with device emulators really tricky.. This PR also swaps out the base image in `engine/Dockerfile` from `python:3.9-alpine3.16` to `python:3.9-slim-buster`. As to why.. in short, with the introduction of the `fcm-django` library there is now a peer-dependency on [`grpcio`](https://github.com/grpc/grpc) (which is used by `firebase_admin`.. which I am using in this PR to interact directly with Firebase Cloud Messaging (FCM)). `grpcio` does not publish wheels (read: compiled binaries) for the Alpine distro. It does publish wheels for Debian and hence `pip install -r requirements.txt` does not need to build this library from the source distribution. This is a [known "issue"](https://github.com/grpc/grpc/issues/22815#issuecomment-1107874367) and the recommended solution in the community is to.. not use alpine. 
These were the numbers, when building the image locally, in terms of image size and build time:
| | Local image size (uncompressed) | Build time (may differ based on your network speed) |
| ------------------------- | -------------------------------------- | ---------- |
| `python:3.9-alpine3.16` | 785MB | 320s |
| `python:3.9-slim-buster` | 1.05GB | 90s |
Co-authored-by: Salvatore Giordano <salvatoregiordanoo@gmail.com>	2022-12-20 12:41:34 +01:00
Ildar Iskhakov	fa3413d2a9	Add tracing support	2022-12-19 17:15:06 +08:00
Joey Orlando	ed4be171f6	add make command to configure mobile app feature (#988 ) Adds a make command, `enable-mobile-app-feature-flags`, which sets the backend feature flag in `./dev/.env.dev`, and updates a record in the `base_dynamicsetting` database table, which are needed to enable the mobile app backend features.	2022-12-14 09:36:35 +01:00
Joey Orlando	c08eeb72a3	add precommit rules for markdown/json files (#915 ) * add markdownlint precommit steps + fix existing errors * prettier json linting	2022-12-01 14:26:54 +01:00
Joey Orlando	9a7b8acd5a	centralize timezone validation + add serializer validation for on call shifts and schedules (#924 ) * Centralize timezone validation into one spot + add serializer validation for schedules and oncall shifts (both public and internal API) * add engine-manage make command	2022-12-01 14:13:39 +01:00
Ildar Iskhakov	3198612c65	Add flag to debug logs (#912 )	2022-11-29 11:16:42 +08:00
Jack Jackson	c517569108	Create generic Make command to run Django `manage.py` (#886 ) See https://github.com/grafana/oncall/pull/868	2022-11-22 22:15:22 +01:00
Jack Jackson	491093100a	Note that Docker Compose >=v2.10 is required (#885 ) See https://github.com/grafana/oncall/issues/846	2022-11-22 22:13:52 +01:00
Yulya Artyukhina	381520ee13	Get rid of installation token + add a bunch of tests (#624 ) * Get rid of installation token (for OSS installations) This is done by being required to supply the grafana API URL as an environment variable on the backend. Additionally, optionally an OnCall API URL environment variable can be passed in to the frontend (this basically allows completely skipping the need to configure anything). - deduplicated a lot of the sync logic on the frontend + made error message more useful and consistent - Split PluginConfigPage component into several subcomponents (making it easier to test each individual component) - Moved RootWithLoader (from plugin/GrafanaPluginRootPage) into its own subcomponent (making it easier to test) - Added tests for pre-existing components that were touched: - PluginConfigPage component (and its new subcomponents) - state/plugin and state/rootBaseStore functions - apps.grafana_plugin django app Helm changes: - add GRAFANA_API_URL to oncall.env - some yaml autoformatting changes - remove reference to python manage.py issue_invite_for_the_frontend --override Co-authored-by: Joey Orlando <joseph.t.orlando@gmail.com>	2022-11-21 16:26:00 +01:00
Yulia Shanyrova	a53dc71883	developer readme has been updated: troubleshooting case (#859 ) * developer readme has been updated: troubleshooting case Co-authored-by: Joey Orlando <joey.orlando@grafana.com>	2022-11-21 14:19:43 +01:00
Jack Jackson	0bb1a792ea	Add documentation for issue make when CDPATH is set unusually Note - you may be wondering "how could your system possibly function if `.` is not part of `CDPATH`? Testing suggests that behaviour is inconsistent between shells - `sh` will entirely ignore the current directory if `.` is absent, but `zsh` will still attempt to search the current directory even if `.` is not in `CDPATH`	2022-11-14 08:07:43 +01:00
Jack Jackson	8cdc1a76e3	Typo-fix	2022-11-14 08:06:17 +01:00
Jack Jackson	8b6bd3a32d	Correct typos regarding `COMPOSE_PROFILES` Proof: ``` $ grep 'COMPOSE' dev/.env.dev COMPOSE_PROFILE=overriden_value_1 COMPOSE_PROFILES=overriden_value_2 $ make start COMPOSE_PROFILES=overriden_value_2 [...] ```	2022-11-14 08:05:32 +01:00
Vadim Stepanov	d2243ba09b	Add Makefile command for rebuilding images (#817 )	2022-11-09 15:43:12 +00:00
Joey Orlando	1177e44cc7	enterprise dev changes + few other small changes (#802 ) * support enterprise development in docker * fix flaky mysql healthcheck command I was getting the mysql_to_create_grafana_db and oncall_db_migration prematurely starting up this commit changes the healthcheck used here to be the same as what is used in docker-compose-mysql-rabbitmq.yml * upgrade docker-compose config files to 3.9 3.8 does not actually support the "long form" version of depends_on see here for more info https://stackoverflow.com/a/54249757 https://docs.docker.com/compose/compose-file/compose-file-v3/#depends_on * add make init command and update documentation * cleanup gitignore files	2022-11-09 07:21:33 +01:00
Joey Orlando	78d01df864	One startup command to rule them all (#760 ) * Modify `docker-compose-developer` configuration files, and `Makefile` to support running everything in containers for local development - Make use of the COMPOSE_PROFILES env var that is supported by docker-compose to allow swapping-out/turning off certain docker-compose services. - add makefile cleanup command. Will remove all docker resources related to running the project locally - The "restart grafana container" issue, where users would need to restart their grafana container when setting up the project for the first time, is now fixed (make command now runs yarn build:dev before docker-compose startup; this ensures grafana-plugin/dist is available for grafana container before it starts up) - The DEVELOPER.md has been updated as well to reflect these new changes. It has been moved to ./dev/README.md (and references to the old file have been updated). - The redis image that is referenced in the docker-compose files has been pinned to v7.0.5 (latest version as of this commit) to avoid any surprises w/ future releases. - remove root .dockerignore in favour of individual .dockerignore files in ./engine and ./grafana-plugin	2022-11-07 16:34:43 +01:00

41 commits