# oncall-engine/docker-compose-developer.yml
version: "3.9"
x-labels: &oncall-labels
  - "com.grafana.oncall.env=dev"
x-user: &oncall-user "appuser"
x-oncall-build: &oncall-build-args
  context: ./engine
  target: ${ONCALL_IMAGE_TARGET:-dev}
  labels: *oncall-labels
x-oncall-volumes: &oncall-volumes
  - ./engine:/etc/app
  # see all the fun answers/comments here on why we need to do this
  # tldr; using /dev/null as a default leads to a lot of fun problems
  # https://stackoverflow.com/a/60456034
  - ${ENTERPRISE_ENGINE:-/dev/null}:${ENTERPRISE_ENGINE_VOLUME_MOUNT_DEST_DIR:-/tmp/empty:ro}
  - ${SQLITE_DB_FILE:-/dev/null}:/var/lib/oncall/oncall.db
  # this is mounted for testing purposes. Some of the authorization tests
  # reference this file
  - ./grafana-plugin/src/plugin.json:/etc/grafana-plugin/src/plugin.json
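  # Example (illustrative only): the ENTERPRISE_ENGINE mount above is a no-op
  # unless the variable is set. A hypothetical invocation, assuming the
  # enterprise code lives in a sibling checkout (both paths below are made up,
  # not the actual enterprise layout), might look like:
  #   ENTERPRISE_ENGINE=../oncall-enterprise/engine \
  #   ENTERPRISE_ENGINE_VOLUME_MOUNT_DEST_DIR=/etc/app/extensions \
  #   docker compose -f docker-compose-developer.yml up
  # When unset, /dev/null is mounted to a throwaway destination instead.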
x-env-files: &oncall-env-files
  - ./dev/.env.dev
  - ./dev/.env.${DB}.dev
x-env-vars: &oncall-env-vars
  BROKER_TYPE: ${BROKER_TYPE}
  GRAFANA_API_URL: http://localhost:3000
  GOOGLE_APPLICATION_CREDENTIALS: /etc/app/gcp_service_account.json
  FCM_PROJECT_ID: oncall-mobile-dev
# basically this is needed because the oncall backend containers have been configured to communicate w/ grafana via
# http://localhost:3000 (GRAFANA_API_URL). This URL is used in two scenarios:
# 1. oncall backend -> grafana API communication (happens within docker)
# 2. accessing templated URLs generated by the oncall backend - meant to be accessed via a browser on your host machine
# The alternative is to set GRAFANA_API_URL to http://grafana:3000. However, this would only work in scenario #1
# as http://grafana:3000 would not be resolvable on the host machine (without modifying /etc/hosts)
#
# by adding this extra_host entry to the oncall backend containers any calls to localhost will get routed to the docker
# gateway, onto the host machine, where localhost:3000 points to grafana
x-extra-hosts: &oncall-extra-hosts
  - "localhost:host-gateway"
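# Example (illustrative): with the extra_hosts entry above, a request made from
# inside a backend container, e.g.
#   curl http://localhost:3000/api/health
# is routed via the Docker gateway to port 3000 on the host machine, where the
# grafana container publishes its port - so the same http://localhost:3000 URL
# resolves both inside the containers and in a browser on the host.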
services:
  oncall_ui:
    container_name: oncall_ui
    labels: *oncall-labels
    build:
      context: ./grafana-plugin
      dockerfile: Dockerfile.dev
      labels: *oncall-labels
    environment:
      ONCALL_API_URL: http://host.docker.internal:8080
      MOBILE_APP_QR_INTERVAL_QUEUE: 290000 # 4 minutes and 50 seconds
    volumes:
      - ./grafana-plugin:/etc/app
      - node_modules_dev:/etc/app/node_modules
      # https://stackoverflow.com/a/60456034
      # see the explanation above that uses the $ENTERPRISE_ENGINE env var
      - ${ENTERPRISE_FRONTEND:-/dev/null}:${ENTERPRISE_FRONTEND_VOLUME_MOUNT_DEST_DIR:-/tmp/empty:ro}
    profiles:
      - oncall_ui
  oncall_engine:
    container_name: oncall_engine
    labels: *oncall-labels
    build: *oncall-build-args
    restart: always
    user: *oncall-user
    command: sh -c "uwsgi --disable-logging --py-autoreload 3 --ini uwsgi.ini"
    env_file: *oncall-env-files
    environment: *oncall-env-vars
    volumes: *oncall-volumes
    extra_hosts: *oncall-extra-hosts
    ports:
      - "8080:8080"
    depends_on:
      oncall_db_migration:
        condition: service_completed_successfully
    profiles:
      - engine
  # used to invoke one-off commands, primarily from the Makefile
  # oncall_engine couldn't (easily) be used due to its depends_on property
  # we could alternatively just use `docker run`, however that would require
  # duplicating the env-files, volume mounts, etc. in the Makefile
  oncall_engine_commands:
    container_name: oncall_engine_commands
    labels: *oncall-labels
    build: *oncall-build-args
    user: *oncall-user
    env_file: *oncall-env-files
    environment: *oncall-env-vars
    volumes: *oncall-volumes
    extra_hosts: *oncall-extra-hosts
    profiles:
      # no need to start this except from within the Makefile
      - _engine_commands
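  # Example (illustrative): because this service has no depends_on, a one-off
  # management command can be run without waiting for migrations/databases, e.g.
  #   docker compose -f docker-compose-developer.yml run --rm \
  #     oncall_engine_commands python manage.py shell
  # (the exact targets the Makefile wraps around this are not shown here)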
  oncall_celery:
    container_name: oncall_celery
    labels: *oncall-labels
    build: *oncall-build-args
    restart: always
    user: *oncall-user
    command: "python manage.py start_celery"
    env_file: *oncall-env-files
    environment: *oncall-env-vars
    volumes: *oncall-volumes
    extra_hosts: *oncall-extra-hosts
    depends_on:
      oncall_db_migration:
        condition: service_completed_successfully
    profiles:
      - engine
  flower:
    container_name: flower
    labels: *oncall-labels
    image: mher/flower:1.2.0
    environment:
      # TODO: make this work properly w/ BROKER_TYPE env var
      CELERY_BROKER_URL: "redis://redis:6379/0"
    ports:
      - "5555:5555"
    depends_on:
      oncall_celery:
        condition: service_started
    profiles:
      - engine
  oncall_db_migration:
    container_name: oncall_db_migration
    labels: *oncall-labels
    build: *oncall-build-args
    user: *oncall-user
    command: "python manage.py migrate --noinput"
    env_file: *oncall-env-files
    environment: *oncall-env-vars
    volumes: *oncall-volumes
    extra_hosts: *oncall-extra-hosts
    depends_on:
      postgres:
        condition: service_healthy
      mysql:
        condition: service_healthy
      rabbitmq:
        condition: service_healthy
      redis:
        condition: service_healthy
    profiles:
      - engine
  redis:
    container_name: redis
    labels: *oncall-labels
    image: redis:7.0.5
    restart: always
    ports:
      - "6379:6379"
    deploy:
      labels: *oncall-labels
      resources:
        limits:
          memory: 500m
          cpus: "0.5"
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      timeout: 5s
      interval: 5s
      retries: 10
    volumes:
      - redisdata_dev:/data
    profiles:
      - redis
  rabbitmq:
    container_name: rabbitmq
    labels: *oncall-labels
    image: "rabbitmq:3.12.0-management"
    restart: always
    environment:
      RABBITMQ_DEFAULT_USER: "rabbitmq"
      RABBITMQ_DEFAULT_PASS: "rabbitmq"
      RABBITMQ_DEFAULT_VHOST: "/"
    ports:
      - "15672:15672"
      - "5672:5672"
    deploy:
      labels: *oncall-labels
      resources:
        limits:
          memory: 1000m
          cpus: "0.5"
    healthcheck:
      test: rabbitmq-diagnostics -q ping
      interval: 30s
      timeout: 30s
      retries: 3
    volumes:
      - rabbitmqdata_dev:/var/lib/rabbitmq
    profiles:
      - rabbitmq
  mysql:
    container_name: mysql
    labels: *oncall-labels
    image: mysql:8.0.32
    command: --default-authentication-plugin=mysql_native_password --character-set-server=utf8mb4 --collation-server=utf8mb4_unicode_ci --max_connections=1024
    restart: always
    environment:
      MYSQL_ROOT_PASSWORD: empty
      MYSQL_DATABASE: oncall_local_dev
    ports:
      - "3306:3306"
    deploy:
      labels: *oncall-labels
      resources:
        limits:
          memory: 1000m
          cpus: "0.5"
    healthcheck:
      test: "mysql -uroot -pempty oncall_local_dev -e 'select 1'"
      timeout: 20s
      retries: 10
    volumes:
      - mysqldata_dev:/var/lib/mysql
    profiles:
      - mysql
  mysql_to_create_grafana_db:
    container_name: mysql_to_create_grafana_db
    labels: *oncall-labels
    image: mysql:8.0.32
    command: bash -c "mysql -h mysql -uroot -pempty -e 'CREATE DATABASE IF NOT EXISTS grafana CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;'"
    depends_on:
      mysql:
        condition: service_healthy
    profiles:
      - mysql
  postgres:
    container_name: postgres
    labels: *oncall-labels
    image: postgres:14.4
    restart: always
    environment:
      POSTGRES_DB: oncall_local_dev
      POSTGRES_PASSWORD: empty
      POSTGRES_INITDB_ARGS: --encoding=UTF-8
    ports:
      - "5432:5432"
    deploy:
      labels: *oncall-labels
      resources:
        limits:
          memory: 500m
          cpus: "0.5"
    healthcheck:
      test: ["CMD", "pg_isready", "-U", "postgres"]
      interval: 10s
      timeout: 5s
      retries: 5
    volumes:
      - postgresdata_dev:/var/lib/postgresql/data
    profiles:
      - postgres
  postgres_to_create_grafana_db:
    container_name: postgres_to_create_grafana_db
    labels: *oncall-labels
    image: postgres:14.4
    command: bash -c "PGPASSWORD=empty psql -U postgres -h postgres -tc \"SELECT 1 FROM pg_database WHERE datname = 'grafana'\" | grep -q 1 || PGPASSWORD=empty psql -U postgres -h postgres -c \"CREATE DATABASE grafana\""
    depends_on:
      postgres:
        condition: service_healthy
    profiles:
      - postgres
  prometheus:
    container_name: prometheus
    labels: *oncall-labels
    image: prom/prometheus
    volumes:
      - ./dev/prometheus.yml:/etc/prometheus/prometheus.yml
    ports:
      - "9090:9090"
    profiles:
      - prometheus
  grafana:
    container_name: grafana
    labels: *oncall-labels
    image: "grafana/${GRAFANA_IMAGE:-grafana:latest}"
    restart: always
    environment:
      GF_SECURITY_ADMIN_USER: oncall
      GF_SECURITY_ADMIN_PASSWORD: oncall
      GF_PLUGINS_ALLOW_LOADING_UNSIGNED_PLUGINS: grafana-oncall-app
    env_file:
      - ./dev/.env.${DB}.dev
    ports:
      - "3000:3000"
    deploy:
      labels: *oncall-labels
      resources:
        limits:
          memory: 500m
          cpus: "0.5"
    extra_hosts:
      - "host.docker.internal:host-gateway"
    volumes:
      - grafanadata_dev:/var/lib/grafana
      - ./grafana-plugin:/var/lib/grafana/plugins/grafana-plugin
      - ./provisioning:/etc/grafana/provisioning
      - ${GRAFANA_DEV_PROVISIONING:-/dev/null}:/etc/grafana/grafana.ini
    depends_on:
      postgres:
        condition: service_healthy
      mysql:
        condition: service_healthy
    profiles:
      - grafana
volumes:
  redisdata_dev:
    labels: *oncall-labels
  grafanadata_dev:
    labels: *oncall-labels
  rabbitmqdata_dev:
    labels: *oncall-labels
  postgresdata_dev:
    labels: *oncall-labels
  mysqldata_dev:
    labels: *oncall-labels
  node_modules_dev:
    labels: *oncall-labels
networks:
  default:
    name: oncall_dev
    labels: *oncall-labels
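# Example (illustrative): every service above is gated behind a compose profile,
# so nothing starts unless profiles are activated. A dev stack using the MySQL
# database and RabbitMQ broker might be brought up with something like:
#   COMPOSE_PROFILES=engine,oncall_ui,mysql,rabbitmq,redis,grafana \
#   DB=mysql BROKER_TYPE=rabbitmq \
#   docker compose -f docker-compose-developer.yml up -d
# The exact profile/env-var combination used by the project's Makefile may
# differ; this is only a sketch of how the profiles compose.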