# Grafana OnCall Helm Chart
This Grafana OnCall Chart is the best way to operate Grafana OnCall on Kubernetes.
It deploys the Grafana OnCall engine and Celery workers, along with a RabbitMQ cluster, a Redis cluster, and a MySQL 5.7 database.
It also deploys cert-manager and the NGINX ingress controller, because the Grafana OnCall backend might need to be externally available
to receive alerts from other monitoring systems. The Grafana OnCall engine acts as a backend and can be connected to the
Grafana frontend plugin named Grafana OnCall.
The architecture diagram can be found [here](https://raw.githubusercontent.com/grafana/oncall/dev/docs/img/architecture_diagram.png).
## Production usage
**The default Helm chart configuration is not intended for production.**
The Helm chart bundles all services into a single release, which is not recommended for production usage.
Run stateful services such as MySQL and RabbitMQ separately from this release, or use managed
PaaS solutions, to significantly reduce the overhead of managing them.
The following sections describe how to set up your own [ingress](#set-up-external-access), [MySQL](#connect-external-mysql),
[RabbitMQ](#connect-external-rabbitmq), and [Redis](#connect-external-redis).
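As a rough sketch of that recommendation, a production-oriented values file might disable the bundled stateful services and point the chart at externally managed ones. The value names are the ones described in the sections linked above; every hostname, port, database name, and secret name below is a placeholder assumption, not a chart default.

```yaml
# Hypothetical values.yaml fragment; hosts, ports, and secret names are placeholders.
mariadb:
  enabled: false # do not deploy the bundled MySQL/MariaDB
rabbitmq:
  enabled: false # do not deploy the bundled RabbitMQ
redis:
  enabled: false # do not deploy the bundled Redis
externalMysql:
  host: mysql.example.internal
  port: 3306
  db_name: oncall
  existingSecret: oncall-mysql # assumed to be created beforehand
  usernameKey: username
  passwordKey: password
externalRabbitmq:
  host: rabbitmq.example.internal
  port: 5672
  existingSecret: oncall-rabbitmq # assumed to be created beforehand
  usernameKey: username
  passwordKey: password
externalRedis:
  host: redis.example.internal
  existingSecret: oncall-redis # assumed to be created beforehand
  passwordKey: password
```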
### Cluster requirements
- Ensure you can run x86-64/amd64 workloads. The arm64 architecture is currently not supported.
- Kubernetes version 1.25+ is not supported if cert-manager is enabled.
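For example, on a Kubernetes 1.25+ cluster the bundled cert-manager could be switched off in the values file. This is a minimal sketch that uses the `cert-manager.enabled` value shown in the external access section of this README; TLS certificates would then have to be managed by other means.

```yaml
# values.yaml fragment: skip the bundled cert-manager subchart
cert-manager:
  enabled: false
```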
## Install
### Prepare the repo
```bash
# Add the repository
helm repo add grafana https://grafana.github.io/helm-charts
helm repo update
```
### Installing the helm chart
```bash
# Install the chart
helm install \
    --wait \
    --set base_url=example.com \
    --set grafana."grafana\.ini".server.domain=example.com \
    release-oncall \
    grafana/oncall
```
Follow the `helm install` output to finish setting up the Grafana OnCall backend and the Grafana OnCall frontend plugin, e.g.:
```bash
👋 Your Grafana OnCall instance has been successfully deployed
❗ Set up a DNS record for your domain (use A Record and "@" to point a root domain to the IP address)
Get the external IP address by running the following commands and point example.com to it:
kubectl get ingress release-oncall -o jsonpath="{.status.loadBalancer.ingress[0].ip}"
Wait until the dns record got propagated.
NOTE: Check with the following command: nslookup example.com
Try reaching https://example.com/ready/ from the browser, make sure it is not cached locally
🦎 Grafana was installed as a part of this helm release. Open https://example.com/grafana/plugins/grafana-oncall-app
The User is admin
Get password by running this command:
kubectl get secret --namespace default release-oncall-grafana -o jsonpath="{.data.admin-password}" | base64 --decode ; echo
🔗 Connect Grafana OnCall Plugin to Grafana OnCall backend:
Fill the Grafana OnCall Backend URL:
http://release-oncall-engine:8080
🎉🎉🎉 Done! 🎉🎉🎉
```
## Configuration
You can edit `values.yaml` to change the Helm chart configuration, then re-deploy the release with the following command:
```bash
helm upgrade \
    --install \
    --wait \
    --set base_url=example.com \
    --set grafana."grafana\.ini".server.domain=example.com \
    release-oncall \
    grafana/oncall
```
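The `--set` flags above can also be kept in a values file. The fragment below is a sketch of the equivalent YAML, assuming a file named `values.yaml` in the current directory; it can be passed to the same command with Helm's `-f values.yaml` option.

```yaml
# values.yaml: equivalent of the two --set flags used above
base_url: example.com
grafana:
  grafana.ini:
    server:
      domain: example.com
```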
### Passwords and external secrets
As the OnCall subcharts are Bitnami charts, they share a common approach to secrets. Bundled charts allow specifying passwords
in values.yaml explicitly or as a K8s secret value. The OnCall chart refers either to the secret created by the subchart or
to the specified external secret.
Similarly, if a component chart is disabled, the password(s) can be supplied in the `external<Component>` value
(e.g. `externalMysql`) explicitly or as a K8s secret value. In the first case, the secret is created with the specified
value. In the second case, the external secret is used.
If the bundled subchart is used:
- If `<subchart>.auth.existingSecret` is non-empty, then this secret is used. Secret keys are pre-defined by the chart.
- If the subchart supports password files and the `<subchart>.customPasswordFiles` dictionary is non-empty, then password files
are used. Dictionary keys are pre-defined per subchart. Password files are not supported by the OnCall chart and should
not be used with bundled subcharts.
- Otherwise, passwords are specified via `auth` section values, e.g. `auth.password`, and a K8s secret is created.
- If `<subchart>.auth.forcePassword` is `true`, then passwords MUST be specified. Otherwise, missing passwords
are generated.
If an external component is used instead of the bundled one:
- If `existingSecret` within the appropriate external component values is non-empty (e.g. `externalMysql.existingSecret`), then
it is used together with the corresponding key names, e.g. `externalMysql.passwordKey`.
- Otherwise, the corresponding password values are used, e.g. `externalMysql.password`. A K8s secret is created by the OnCall chart.
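As an illustration of the external-component case, the manifest below sketches a pre-created secret for an external MySQL instance. The secret name `oncall-mysql` and the placeholder password are assumptions made for this example; only the `externalMysql.existingSecret` and `externalMysql.passwordKey` value names come from the chart.

```yaml
# Hypothetical Secret holding the external MySQL password.
# It has to live in the same namespace as the Helm release.
apiVersion: v1
kind: Secret
metadata:
  name: oncall-mysql
type: Opaque
stringData:
  password: change-me # placeholder, replace with the real password
```

The chart values would then reference the secret and the key inside it:

```yaml
mariadb:
  enabled: false
externalMysql:
  existingSecret: oncall-mysql # name of the Secret sketched above
  passwordKey: password # key inside that Secret
```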
Below is the summary for the dependent charts.
MySQL/MariaDB:
```yaml
database:
  type: "mysql" # This is default
mariadb:
  enabled: true # Default
  auth:
    existingSecret: ""
    forcePassword: false
    # Secret name: `<release>-mariadb`
    rootPassword: "" # Secret key: mariadb-root-password
    password: "" # Secret key: mariadb-password
    replicationPassword: "" # Secret key: mariadb-replication-password
externalMysql:
  password: ""
  existingSecret: ""
  passwordKey: ""
```
Postgres:
```yaml
database:
  type: postgresql
mariadb:
  enabled: false # Must be set to false for Postgres
postgresql:
  enabled: true # Must be set to true for bundled Postgres
  auth:
    existingSecret: ""
    secretKeys:
      adminPasswordKey: ""
      userPasswordKey: "" # Not needed
      replicationPasswordKey: "" # Not needed with disabled replication
    # Secret name: `<release>-postgresql`
    postgresPassword: "" # password for admin user postgres. As non-admin user is not created, only this one is relevant.
    password: "" # Not needed
    replicationPassword: "" # Not needed with disabled replication
externalPostgresql:
  user: ""
  password: ""
  existingSecret: ""
  passwordKey: ""
```
RabbitMQ:
```yaml
rabbitmq:
  enabled: true
  auth:
    existingPasswordSecret: "" # Must contain `rabbitmq-password` key
    existingErlangSecret: "" # Must contain `rabbitmq-erlang-cookie` key
    # Secret name: `<release>-rabbitmq`
    password: ""
    erlangCookie: ""
externalRabbitmq:
  user: ""
  password: ""
  existingSecret: ""
  passwordKey: ""
  usernameKey: ""
```
Redis:
```yaml
redis:
  enabled: true
  auth:
    existingSecret: ""
    existingSecretPasswordKey: ""
    # Secret name: `<release>-redis`
    password: ""
externalRedis:
  password: ""
  existingSecret: ""
  passwordKey: ""
```
### Running split ingestion and API services
You can run a detached service for handling integrations by setting up the following variables:
```yaml
detached_integrations:
  enabled: true
detached_integrations_service:
  enabled: true
```
This runs an integrations-only service that listens on port 30003 by default.
### Set up Slack and Telegram
You can set up the Slack connection via the following variables:
```yaml
oncall:
  slack:
    enabled: true
    commandName: oncall
    clientId: ~
    clientSecret: ~
    signingSecret: ~
    existingSecret: ""
    clientIdKey: ""
    clientSecretKey: ""
    signingSecretKey: ""
    redirectHost: ~
```
`oncall.slack.commandName` changes the bot slash command, which defaults to
`oncall`. In Slack, the command is invoked as `/<oncall.slack.commandName>`.
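If the Slack credentials should not appear in the values file, they can presumably be read from an existing secret through `existingSecret` and the `*Key` values listed above. The sketch below assumes a pre-created secret named `oncall-slack` with the keys `client-id`, `client-secret`, and `signing-secret`; these names are illustrative, not chart defaults.

```yaml
oncall:
  slack:
    enabled: true
    existingSecret: oncall-slack # hypothetical secret name
    clientIdKey: client-id # key names inside the secret are assumptions
    clientSecretKey: client-secret
    signingSecretKey: signing-secret
```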
To set up the Telegram token and webhook URL, use:
```yaml
oncall:
  telegram:
    enabled: true
    token: ~
    webhookUrl: ~
```
To use Telegram long polling instead of a webhook, use:
```yaml
telegramPolling:
  enabled: true
```
### Set up external access
Grafana OnCall can be connected to external monitoring systems or to a Grafana instance deployed in another cluster.
The NGINX Ingress Controller and cert-manager charts are included in the Helm chart with the default configuration.
If you create a DNS A record that points to the external IP address of the installation, with a hostname matching the
`base_url` parameter, HTTPS is set up automatically. If Grafana is enabled in the chart values, it is also
available at `https://<base_url>/grafana/`. See the details in the `helm install` output.
To use a different ingress controller or TLS certificate management system, set the following values to
`false` and edit the ingress settings:
```yaml
ingress-nginx:
  enabled: false
cert-manager:
  enabled: false
ingress:
  enabled: true
  annotations:
    kubernetes.io/ingress.class: "nginx"
    cert-manager.io/issuer: "letsencrypt-prod"
```
### Use PostgreSQL instead of MySQL
It is possible to use PostgreSQL instead of MySQL. To do so, set `mariadb.enabled` to `false`,
`postgresql.enabled` to `true`, and `database.type` to `postgresql`.
```yaml
mariadb:
  enabled: false
postgresql:
  enabled: true
database:
  type: postgresql
```
### Connect external MySQL
It is recommended to use the managed MySQL 5.7 database provided by your cloud provider.
Make sure to create the database with the following parameters before installing this chart:
```sql
CREATE DATABASE oncall CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
```
To use an external MySQL instance, set `mariadb.enabled` to `false` and configure the `externalMysql` parameters.
```yaml
mariadb:
  enabled: false
# Make sure to create the database with the following parameters:
# CREATE DATABASE oncall CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
externalMysql:
  host:
  port:
  db_name:
  user:
  password:
  existingSecret: ""
  usernameKey: username
  passwordKey: password
```
### Connect external PostgreSQL
To use an external PostgreSQL instance, set `mariadb.enabled` to `false`,
`postgresql.enabled` to `false`, and `database.type` to `postgresql`, then configure
the `externalPostgresql` parameters.
```yaml
mariadb:
  enabled: false
postgresql:
  enabled: false
database:
  type: postgresql
# Make sure to create the database with the following parameters:
# CREATE DATABASE oncall WITH ENCODING UTF8;
externalPostgresql:
  host:
  port:
  db_name:
  user:
  password:
  existingSecret: ""
  passwordKey: password
```
### Connect external RabbitMQ
- Option 1. Install RabbitMQ separately into the cluster using the [official documentation](https://www.rabbitmq.com/kubernetes/operator/operator-overview.html).
- Option 2. Use a managed solution such as [CloudAMQP](https://www.cloudamqp.com/).

To use an external RabbitMQ instance, set `rabbitmq.enabled` to `false` and configure the `externalRabbitmq` parameters.
```yaml
rabbitmq:
  enabled: false # Disable the RabbitMQ dependency from the release
externalRabbitmq:
  host:
  port:
  user:
  password:
  protocol:
  vhost:
  existingSecret: ""
  passwordKey: password
  usernameKey: username
```
### Connect external Redis
To use an external Redis instance, set `redis.enabled` to `false` and configure the `externalRedis` parameters.
```yaml
redis:
  enabled: false # Disable the Redis dependency from the release
externalRedis:
  host:
  password:
  existingSecret: ""
  passwordKey: password
```
## Update
```bash
# Add & upgrade the repository
helm repo add grafana https://grafana.github.io/helm-charts
helm repo update
# Re-deploy
helm upgrade \
    --install \
    --wait \
    --set base_url=example.com \
    --set grafana."grafana\.ini".server.domain=example.com \
    release-oncall \
    grafana/oncall
```
After re-deploying, also update the Grafana OnCall plugin on the plugin version page.
See the [Grafana docs](https://grafana.com/docs/grafana/latest/administration/plugin-management/#update-a-plugin) for
more information on updating Grafana plugins.
## Uninstall
### Uninstalling the helm chart
```bash
helm delete release-oncall
```
### Clean up PVCs
```bash
kubectl delete pvc data-release-oncall-mariadb-0 data-release-oncall-rabbitmq-0 \
    redis-data-release-oncall-redis-master-0 redis-data-release-oncall-redis-replicas-0 \
    redis-data-release-oncall-redis-replicas-1 redis-data-release-oncall-redis-replicas-2
```
### Clean up secrets
```bash
kubectl delete secrets certificate-tls release-oncall-cert-manager-webhook-ca release-oncall-ingress-nginx-admission
```
## Troubleshooting
### Issues during initial configuration
If you run into issues during initial configuration, mismatched versions between
your OnCall backend and UI may be the culprit. Ensure that the versions match and, if they do not,
consider updating your `helm` deployment.