# Developer quickstart

Related: How to develop integrations

## Quick Start using Kubernetes and Tilt (beta)
If you are experiencing issues, please check "Running the project with docker-compose".
### Install dependencies
- Tilt | Kubernetes for Prod, Tilt for Dev
- tilt-dev/ctlptl: Making local Kubernetes clusters fun and easy to set up
- Kind
- Yarn
### Launch the environment
- Create local k8s cluster:

  ```bash
  make cluster/up
  ```

- Deploy the project:

  ```bash
  tilt up
  ```

  You can set local environment variables using the `dev/helm-local.dev.yml` file, e.g.:

  ```yaml
  env:
    - name: FEATURE_LABELS_ENABLED_FOR_ALL
      value: "True"
  ```

- Wait until all resources are green and open http://localhost:3000/a/grafana-oncall-app (user: oncall, password: oncall)

- Modify source code; backend and frontend will be hot reloaded

- Clean up the project by deleting the local k8s cluster:

  ```bash
  make cluster/down
  ```
## Running the project with docker-compose
By default everything runs inside Docker. These options can be modified via the `COMPOSE_PROFILES`
environment variable.
- First, ensure that you have `docker` installed and running on your machine.

  **NOTE**: the `docker-compose-developer.yml` file uses some syntax/features that are only supported by Docker
  Compose v2. For instructions on how to enable this (if you haven't already done so), see here. Ensure you have
  Docker Compose version 2.20.2 or above installed - update instructions are here.

- Run `make init start`.

  By default this will run everything in Docker, using SQLite as the database and Redis as the message
  broker/cache. See `COMPOSE_PROFILES` below for more details on how to swap out/disable which components are run
  in Docker.

- Open Grafana in a browser at http://localhost:3000 (login: `oncall`, password: `oncall`).

- You should now see the OnCall plugin configuration page. You may safely ignore the warning about the invalid
  plugin signature. Set "OnCall backend URL" to "http://host.docker.internal:8080". When opening the main plugin
  page, you may also ignore warnings about version mismatch and lack of communication channels.

- Enjoy! Check our OSS docs if you want to set up Slack, Telegram, Twilio or SMS/calls through Grafana Cloud.

- (Optional) Install `pre-commit` hooks by running `make install-precommit-hook`.
**Note**: on subsequent startups you can simply run `make start`; this is a bit faster because it skips the
frontend build step.
### COMPOSE_PROFILES
This configuration option represents a comma-separated list of docker-compose profiles.
It allows you to swap out, or disable, certain components in Docker.

This option can be configured in two ways:

- Setting a `COMPOSE_PROFILES` environment variable in `dev/.env.dev`. This allows you to avoid having to set
  `COMPOSE_PROFILES` for each `make` command you execute afterwards.
- Passing in a `COMPOSE_PROFILES` argument when running `make` commands. For example:
```bash
make start COMPOSE_PROFILES=postgres,engine,grafana,rabbitmq
```
The possible profile values are:

- `grafana`
- `prometheus`
- `engine`
- `oncall_ui`
- `redis`
- `rabbitmq`
- `postgres`
- `mysql`
- `telegram_polling`
The default is `engine,oncall_ui,redis,grafana`. This runs:
- all OnCall components (using SQLite as the database)
- Redis as the Celery message broker/cache
- a Grafana container
### GRAFANA_IMAGE
If you would like to change the image or version of Grafana being run, simply pass in a `GRAFANA_IMAGE` environment
variable to `make start` (or alternatively set it in your root `.env` file). The value of this environment variable
should be a valid Grafana image/tag combination (ex. `grafana:main` or `grafana-enterprise:latest`).
### Configuring Grafana
This section is applicable for when you are running a Grafana container inside of docker-compose and you would like
to modify your Grafana instance's provisioning configuration.
The following commands assume you run them from the root of the project:
```bash
touch ./dev/grafana.dev.ini
# make desired changes to ./dev/grafana.dev.ini then run
touch .env && ./dev/add_env_var.sh GRAFANA_DEV_PROVISIONING ./dev/grafana/grafana.dev.ini .env
```
For example, if you would like to enable the `topnav` feature toggle, you can modify your `./dev/grafana.dev.ini`
as such:
```ini
[feature_toggles]
enable = top_nav
```
The next time you start the project via docker-compose, the grafana container will have the
`./dev/grafana/grafana.dev.ini` file volume mounted inside the container.
#### Modifying Provisioning Configuration
Files under `./dev/grafana/provisioning` are volume mounted into your Grafana container and allow you to easily
modify the instance's provisioning configuration. See the Grafana docs here
for more information.
### Enabling RBAC for OnCall for local development
To run the project locally with RBAC for OnCall enabled, you will first need to run a `grafana-enterprise`
container instead of a `grafana` container. See the instructions here on how to do so.

Next, you will need to follow the steps here on setting up/downloading a Grafana Enterprise license.

Lastly, you will need to modify the instance's configuration. Follow the instructions here on how to do so. You
can modify your configuration file (`./dev/grafana.dev.ini`) as such:
```ini
[rbac]
enabled = true

[feature_toggles]
enable = accessControlOnCall

[server]
root_url = https://<your-stack-slug>.grafana.net/

[enterprise]
license_text = <content-of-the-license-jwt-that-you-downloaded>
```
(Note: you may need to restart your grafana container after modifying its configuration)
### Enabling OnCall prometheus exporter for local development
Add `prometheus` to your `COMPOSE_PROFILES` and set `FEATURE_PROMETHEUS_EXPORTER_ENABLED=True` in your
`dev/.env.dev` file. You may need to restart your grafana container to make sure the new datasource is added
(or add it manually using the UI; Prometheus will be running at `host.docker.internal:9090` by default, using
default settings).
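For example (an illustrative fragment; adjust the profile list to match your own setup), your `dev/.env.dev` might contain:

```ini
COMPOSE_PROFILES=engine,oncall_ui,redis,grafana,prometheus
FEATURE_PROMETHEUS_EXPORTER_ENABLED=True
```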
### Django Silk Profiling
In order to set up django-silk for local profiling, perform the following steps:
1. `make backend-debug-enable`
2. `make engine-manage CMD="createsuperuser"` - follow the CLI prompts to create a Django superuser
3. Visit http://localhost:8080/django-admin and log in using the credentials you created in step #2
You should now be able to visit http://localhost:8080/silk/ and see the Django Silk UI.
See the django-silk documentation here for more information.
### Running backend services outside Docker
By default everything runs inside Docker. If you would like to run the backend services outside of Docker (for integrating w/ PyCharm for example), follow these instructions:
- Create a Python 3.11 virtual environment using a method of your choosing (ex. venv or pyenv-virtualenv). Make
  sure the virtualenv is "activated".

- `postgres` is a dependency of some of our Python dependencies (notably `psycopg2` (docs)). Please visit here
  for installation instructions.

- `make backend-bootstrap` - installs all backend dependencies

- Modify your `.env.dev` by copying the contents of one of `.env.mysql.dev`, `.env.postgres.dev`, or
  `.env.sqlite.dev` into `.env.dev` (you should exclude the `GF_` prefixed environment variables).

  In most cases where you are running stateful services via `docker-compose`, and backend services outside of
  Docker, you will simply need to change the database host to `localhost` (or in the case of `sqlite` update the
  file path to your `sqlite` database file). You will need to change the broker host to `localhost` as well.

- `make backend-migrate` - runs necessary database migrations

- Open two separate shells and then run the following:

  - `make run-backend-server` - runs the HTTP server
  - `make run-backend-celery` - runs Celery workers
## UI E2E Tests
We've developed a suite of "end-to-end" integration tests using Playwright. These tests are run on pull request CI builds. New features should ideally include a new/modified integration test.
To run these tests locally simply do the following:
- Install Playwright dependencies with `npx playwright install`
- Launch the environment
- Then you can interact with the tests in 2 different ways:
  - Using Tilt: open the E2eTests section, where you will find 4 buttons:
    - Restart headless run (you can configure browsers, reporter and failure allowance there)
    - Open watch mode
    - Show last HTML report
    - Stop (stops any pending e2e test process)
  - Using `make`:
    - `make test:e2e` to start a headless run
    - `make test:e2e:watch` to open watch mode
    - `make test:e2e:show:report` to open the last HTML report
## Helm unit tests
To run the helm unit tests you will need the following dependencies installed:

- `helm` - installation instructions
- `helm-unittest` plugin - installation instructions

Then you can simply run:

```bash
make test-helm
```
## Useful make commands
🚶 This part was moved to the `make help` command. Run it to see all the available commands and their descriptions.
## Setting environment variables
If you need to override any additional environment variables, you should set these in a root `.env.dev` file.
This file is automatically picked up by the OnCall engine Docker containers. This file is ignored from source
control and also overrides any defaults that are set in other `.env*` files.
## Slack application setup
For Slack app configuration check our docs: https://grafana.com/docs/oncall/latest/open-source/#slack-setup
## Update drone build
The `.drone.yml` build file must be signed when changes are made to it. Follow these steps:

If you have not installed the drone CLI, follow these instructions.

To sign the `.drone.yml` file:

```bash
export DRONE_SERVER=https://drone.grafana.net

# Get your drone token from https://drone.grafana.net/account
export DRONE_TOKEN=<Your DRONE_TOKEN>

drone sign --save grafana/oncall .drone.yml
```
## Troubleshooting
### ld: library not found for -lssl

**Problem:**

```
make backend-bootstrap
...
ld: library not found for -lssl
clang: error: linker command failed with exit code 1 (use -v to see invocation)
error: command 'gcc' failed with exit status 1
...
```

**Solution:**

```bash
export LDFLAGS=-L/usr/local/opt/openssl/lib
make backend-bootstrap
```
### Could not build wheels for cryptography which use PEP 517 and cannot be installed directly

Happens on Apple Silicon.

**Problem:**

```
build/temp.macosx-12-arm64-3.9/_openssl.c:575:10: fatal error: 'openssl/opensslv.h' file not found
#include <openssl/opensslv.h>
         ^~~~~~~~~~~~~~~~~~~~
1 error generated.
error: command '/usr/bin/clang' failed with exit code 1
----------------------------------------
ERROR: Failed building wheel for cryptography
```

**Solution:**

```bash
LDFLAGS="-L$(brew --prefix openssl@1.1)/lib" CFLAGS="-I$(brew --prefix openssl@1.1)/include" pip install `cat engine/requirements.txt | grep cryptography`
```
### django.db.utils.OperationalError: (1366, "Incorrect string value")

**Problem:**

```
django.db.utils.OperationalError: (1366, "Incorrect string value: '\\xF0\\x9F\\x98\\x8A\\xF0\\x9F...' for column 'cached_name' at row 1")
```

**Solution:**

Recreate the database with the correct encoding (for MySQL, the database must use `utf8mb4` to store 4-byte UTF-8
characters such as emoji).
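For MySQL, the fix can be sketched as follows (the database name `oncall_local_dev` is hypothetical - substitute your actual database name, and back up any data you need first; `utf8mb4` is the charset that supports 4-byte characters such as emoji):

```sql
-- Drop and recreate the database with full 4-byte UTF-8 support
DROP DATABASE oncall_local_dev;
CREATE DATABASE oncall_local_dev CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
```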
### /bin/sh: line 0: cd: grafana-plugin: No such file or directory

**Problem:**

When running `make init`:

```
/bin/sh: line 0: cd: grafana-plugin: No such file or directory
make: *** [init] Error 1
```

This arises when the environment variable [CDPATH](https://www.theunixschool.com/2012/04/what-is-cdpath.html) is
set and the current path (`.`) is not explicitly part of `CDPATH`.

**Solution:**

Either make `.` part of `CDPATH` in your `.rc` file setup, or temporarily override the variable when running
`make` commands:

```bash
$ CDPATH="." make init

# Setting CDPATH to empty seems to also work - only tested on zsh, YMMV
$ CDPATH="" make init
```
### Error response from daemon: open /var/lib/docker/overlay2/.../committed: is a directory

**Problem:**

When running `make init start`:

```
Error response from daemon: open /var/lib/docker/overlay2/ac57b871108ee1b98ff4455e36d2175eae90cbc7d4c9a54608c0b45cfb7c6da5/committed: is a directory
make: *** [start] Error 1
```

**Solution:**

Clear everything in Docker by resetting, or run:

```bash
make cleanup
```
### Encountered error while trying to install package - grpcio

**Problem:**

We are currently using a library, fcm-django, which has a dependency on grpcio. Google does not provide grpcio
wheels built for Apple Silicon Macs. The best solution so far has been to use a conda virtualenv. There's
apparently a lot of community work put into making packages play well with M1/arm64 architecture.

```
pip install -r requirements.txt
...
note: This error originates from a subprocess, and is likely not a problem with pip.
error: legacy-install-failure

× Encountered error while trying to install package.
╰─> grpcio
...
```

**Solution:**

Use a conda virtualenv, and then run the following when installing the engine dependencies.
See here for more details.

```bash
GRPC_PYTHON_BUILD_SYSTEM_OPENSSL=1 GRPC_PYTHON_BUILD_SYSTEM_ZLIB=1 pip install -r requirements.txt
```
### distutils.errors.CompileError: command '/usr/bin/clang' failed with exit code 1

See the solution for "Encountered error while trying to install package - grpcio" above.
### symbol not found in flat namespace '_EVP_DigestSignUpdate'

**Problem:**

This problem seems to occur when running the Celery process outside of docker-compose
(via `make run-backend-celery`) while using a conda virtual environment.

```bash
conda create --name oncall-dev python=3.9.13
conda activate oncall-dev
make backend-bootstrap
make run-backend-celery
```

```
File "~/oncall/engine/engine/__init__.py", line 5, in <module>
  from .celery import app as celery_app
File "~/oncall/engine/engine/celery.py", line 11, in <module>
  from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
File "/opt/homebrew/Caskroom/miniconda/base/envs/oncall-dev/lib/python3.9/site-packages/opentelemetry/exporter/otlp/proto/grpc/trace_exporter/__init__.py", line 20, in <module>
  from grpc import ChannelCredentials, Compression
File "/opt/homebrew/Caskroom/miniconda/base/envs/oncall-dev/lib/python3.9/site-packages/grpc/__init__.py", line 22, in <module>
  from grpc import _compression
File "/opt/homebrew/Caskroom/miniconda/base/envs/oncall-dev/lib/python3.9/site-packages/grpc/_compression.py", line 20, in <module>
  from grpc._cython import cygrpc
ImportError: dlopen(/opt/homebrew/Caskroom/miniconda/base/envs/oncall-dev/lib/python3.9/site-packages/grpc/_cython/cygrpc.cpython-39-darwin.so, 0x0002): symbol not found in flat namespace '_EVP_DigestSignUpdate'
```

**Solution:**

This solution, posted in a GitHub issue thread for the grpc/grpc repository, fixes the issue:

```bash
conda install grpcio
make run-backend-celery
```
## IDE Specific Instructions

### PyCharm
1. Follow the instructions listed in "Running backend services outside Docker".
2. Open the project in PyCharm.
3. Under Settings → Project OnCall:
   - In Python Interpreter, click the gear and create a new Virtualenv from the existing environment, selecting
     the venv created in Step 1.
   - In Project Structure, make sure the project root is the content root and add `/engine` to Sources.
4. Under Settings → Languages & Frameworks → Django:
   - Enable Django support
   - Set Django project root to `/engine`
   - Set Settings to `settings/dev.py`
5. Create a new Django Server run configuration to Run/Debug the engine:
   - Use a plugin such as EnvFile to load the `.env.dev` file
   - Change the port from 8000 to 8080
## How to write database migrations
We use django-migration-linter to keep database migrations backwards compatible, so that:

- we can automatically run migrations and they are zero-downtime, e.g. old code can work with the migrated database
- we can run and roll back migrations without worrying about data safety
- OnCall is deployed to multiple environments the core team is not able to control

See the django-migration-linter checklist for common mistakes and best practices.
### Removing a nullable field from a model
> This only works for nullable fields (fields with `null=True` in the field definition).
> **DO NOT USE THIS APPROACH FOR NON-NULLABLE FIELDS, IT CAN BREAK THINGS!**
- Remove all usages of the field you want to remove. Make sure the field is not used anywhere, including
  filtering, querying, or explicit field referencing from views, models, forms, serializers, etc.

- Remove the field from the model definition.

- Generate migrations using the following management command:

  ```bash
  python manage.py remove_field <APP_LABEL> <MODEL_NAME> <FIELD_NAME>
  ```

  Example:

  ```bash
  python manage.py remove_field alerts AlertReceiveChannel restricted_at
  ```

  This command will generate two migrations that **MUST BE DEPLOYED IN TWO SEPARATE RELEASES**:

  - Migration #1 will remove the field from Django's state, but not from the database. Release #1 must include
    migration #1, and must not include migration #2.
  - Migration #2 will remove the field from the database. Stash this migration for use in a future release.

- Make release #1 (removal of the field + migration #1). Once released and deployed, Django will not be aware of
  this field anymore, but the field will still be present in the database. This allows for a gradual migration,
  where the field is no longer used in new code, but still exists in the database for backward compatibility with
  old code.

- In any subsequent release, include migration #2 (the one that removes the field from the database).

- After releasing and deploying migration #2, the field will be removed both from the database and Django state,
  without backward compatibility issues or downtime 🎉
## Autogenerating TS types based on OpenAPI schema

| ⚠️ WARNING |
|---|
| Transition to this approach is in progress |
### Overview
In order to automate type creation and prevent API usage pitfalls, the OnCall project uses the following approach:
- OnCall Engine (backend) exposes OpenAPI schema
- OnCall Grafana Plugin (frontend) autogenerates TS type definitions based on it
- OnCall Grafana Plugin (frontend) uses autogenerated types as a single source of truth for any backend-related interactions (url paths, request bodies, params, response payloads)
### Instructions

- Whenever the API contract changes, run `yarn generate-types` from the `grafana-plugin` directory.

- Then you can start consuming types and use the fully typed http client:

  ```typescript
  import { ApiSchemas } from "network/oncall-api/api.types";
  import { onCallApi } from "network/oncall-api/http-client";

  const {
    data: { results },
  } = await onCallApi().GET("/alertgroups/");
  const alertGroups: Array<ApiSchemas["AlertGroup"]> = results;
  ```

- [Optional] If there is any property that is not yet exposed in the OpenAPI schema and you already want to use
  it, you can append missing properties to particular schemas by editing the
  `grafana-plugin/src/network/oncall-api/types-generator/custom-schemas.ts` file:

  ```typescript
  export type CustomApiSchemas = {
    Alert: {
      propertyMissingInOpenAPI: string;
    };
    AlertGroup: {
      anotherPropertyMissingInOpenAPI: number[];
    };
  };
  ```

  Then add their names to the `CUSTOMIZED_SCHEMAS` array in
  `grafana-plugin/src/network/oncall-api/types-generator/generate-types.ts`:

  ```typescript
  const CUSTOMIZED_SCHEMAS = ["Alert", "AlertGroup"];
  ```

  The outcome is that autogenerated schemas will be modified as follows:

  ```typescript
  import type { CustomApiSchemas } from './types-generator/custom-schemas';

  export interface components {
    schemas: {
      Alert: CustomApiSchemas['Alert'] & {
        readonly id: string;
        ...
      };
      AlertGroup: CustomApiSchemas['AlertGroup'] & {
        readonly pk: string;
        ...
      };
      ...
    };
  }
  ```
## System components

```mermaid
flowchart TD
    client[Monitoring System]
    third_party["Slack, Twilio,
    3rd party services.."]
    server[Server]
    celery[Celery Worker]
    db[(SQL Database)]
    redis[("Cache
    (Redis)")]
    broker[("AMQP Broker
    (Redis or RabbitMQ)")]

    subgraph OnCall Backend
        server <--> redis
        server <--> db
        server -->|"Schedule tasks
        with ETA"| broker
        broker -->|"Fetch tasks"| celery
        celery --> db
    end

    subgraph Grafana Stack
        plugin["OnCall Frontend
        Plugin"]
        proxy[Plugin Proxy]
        api[Grafana API]
        plugin --> proxy --> server
        api --> server
    end

    client -->|Alerts| server
    third_party -->|"Statuses,
    events"| server
    celery -->|"Notifications,
    Outgoing Webhooks"| third_party
```