oncall-engine/helm/oncall
Michael Derynck 4572131951
Merge dev to main (#73)
Co-authored-by: Matias Bordese <mbordese@gmail.com>
Co-authored-by: Jack Baldry <jack.baldry@grafana.com>
Co-authored-by: Ildar Iskhakov <ildar.iskhakov@grafana.com>
Co-authored-by: Matvey Kukuy <Matvey-Kuk@users.noreply.github.com>
Co-authored-by: Innokentii Konstantinov <innokenty.konstantinov@grafana.com>
Co-authored-by: Matvey Kukuy <matvey@amixr.io>
Co-authored-by: Alyssa Wada <101596687+alyssawada@users.noreply.github.com>
Co-authored-by: Yulia Shanyrova <yulia.shanyrova@grafana.com>
2022-06-14 09:54:41 -06:00

Grafana OnCall Helm Chart

This Grafana OnCall chart is the best way to operate Grafana OnCall on Kubernetes. It deploys the Grafana OnCall engine and Celery workers, along with a RabbitMQ cluster, a Redis cluster, and a MySQL 5.7 database. It also deploys Cert Manager and the Nginx Ingress Controller, because the Grafana OnCall backend may need to be externally reachable to receive alerts from other monitoring systems. The Grafana OnCall engine acts as a backend and can be connected to the Grafana frontend plugin named Grafana OnCall. Architecture diagram can be found here.

The default helm chart configuration is not intended for production. The chart bundles all services into a single release, which is not recommended for production use. Run stateful services such as MySQL and RabbitMQ separately from this release, or use managed PaaS solutions. This significantly reduces the overhead of managing them.

Cluster requirements:

  • Ensure you can run x86-64/amd64 workloads. The arm64 architecture is currently not supported.
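One way to check this requirement before installing is to look at the architecture each node reports. The sketch below is not part of the chart: the function name `has_amd64_node` is hypothetical, and it assumes `kubectl` is already pointed at the target cluster. It reads one architecture name per line and succeeds if at least one of them is `amd64`.

```shell
# Hypothetical helper: succeeds if at least one line on stdin is exactly "amd64".
has_amd64_node() {
    grep -qx 'amd64'
}

# Intended usage against the current kubectl context (not executed here):
#   kubectl get nodes \
#     -o jsonpath='{range .items[*]}{.status.nodeInfo.architecture}{"\n"}{end}' \
#     | has_amd64_node && echo "amd64 nodes available"
```

On a mixed-architecture cluster this only confirms that amd64 nodes exist. Whether the pods are actually scheduled onto them depends on your scheduling configuration.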

Install

Installing the helm chart

helm install \
    --wait \
    --set base_url=example.com \
    --set grafana."grafana\.ini".server.domain=example.com \
    oncall \
    .

Follow the helm install output to finish setting up the Grafana OnCall backend and the Grafana OnCall frontend plugin.

Configuration

You can edit values.yaml to change the helm chart configuration and re-deploy the release with the following command:

helm upgrade \
    --install \
    --wait \
    --set base_url=example.com \
    --set grafana."grafana\.ini".server.domain=example.com \
    oncall \
    .
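As an alternative to editing values.yaml in place, overrides can be kept in a separate file and passed to helm with `-f`. The following is a minimal sketch, assuming a hypothetical file name `my-values.yaml` and a hypothetical wrapper function `deploy_oncall`. The keys mirror the two `--set` flags above, and it shows that the escaped `grafana\.ini` on the command line corresponds to a single YAML key containing a dot.

```shell
# Write the overrides to a separate file (hypothetical name) instead of
# passing them as --set flags.
cat > my-values.yaml <<'EOF'
base_url: example.com
grafana:
  grafana.ini:
    server:
      domain: example.com
EOF

# Hypothetical wrapper: re-deploy the release with the overrides.
# Run it from the chart directory, like the commands above.
deploy_oncall() {
    helm upgrade --install --wait -f my-values.yaml oncall .
}
```

Call `deploy_oncall` once the file contains the values you want. Keeping overrides in their own file makes it easier to see what differs from the chart defaults.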

Set up external access

Grafana OnCall can be connected to external monitoring systems or to a Grafana instance deployed in another cluster. The Nginx Ingress Controller and Cert Manager charts are included in the helm chart with their default configuration. If you create a DNS A record that points to the external IP address of the installation, with a hostname matching the base_url parameter, HTTPS is set up automatically. If Grafana is enabled in the chart values, it is also available at https://<base_url>/grafana/. See the helm install output for details.

To use a different ingress controller or TLS certificate management system, set the following values to false and edit the ingress settings:

nginx-ingress:
  enabled: false

cert-manager:
  enabled: false
 
ingress:
  enabled: true
  annotations:
    kubernetes.io/ingress.class: "nginx"
    cert-manager.io/issuer: "letsencrypt-prod"

Connect external MySQL

It is recommended to use a managed MySQL 5.7 database provided by your cloud provider. Make sure to create the database with the following parameters before installing this chart:

CREATE DATABASE oncall CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;

To use an external MySQL instance, set mariadb.enabled to false and configure the externalMysql parameters.

mariadb:
  enabled: false  # Disable the bundled MariaDB dependency from the release

# Make sure to create the database with the following parameters:
# CREATE DATABASE oncall CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
externalMysql:
  host:
  port:
  db_name:
  user:
  password:
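For illustration, a filled-in version of these values might look like the sketch below. The host, user, and password are placeholders, and the file name `external-mysql-values.yaml` and the function `deploy_with_external_mysql` are hypothetical. It also assumes the bundled database is switched off through `mariadb.enabled: false`, and uses 3306, the MySQL default port.

```shell
# Write placeholder connection settings for an external MySQL instance.
cat > external-mysql-values.yaml <<'EOF'
mariadb:
  enabled: false                 # assumption: disables the bundled database
externalMysql:
  host: mysql.example.internal   # placeholder host
  port: 3306                     # MySQL default port
  db_name: oncall                # database created with the CREATE DATABASE statement
  user: oncall                   # placeholder user
  password: change-me            # placeholder, replace before deploying
EOF

# Hypothetical wrapper: deploy the release with the external database settings.
deploy_with_external_mysql() {
    helm upgrade --install --wait \
        --set base_url=example.com \
        --set grafana."grafana\.ini".server.domain=example.com \
        -f external-mysql-values.yaml \
        oncall \
        .
}
```

Call `deploy_with_external_mysql` from the chart directory after replacing the placeholders with the details of your own instance.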

Connect external RabbitMQ

Option 1. Install RabbitMQ separately into the cluster using the official documentation.
Option 2. Use a managed solution such as CloudAMQP.

To use an external RabbitMQ instance, set rabbitmq.enabled to false and configure the externalRabbitmq parameters.

rabbitmq:
  enabled: false  # Disable the RabbitMQ dependency from the release
 
externalRabbitmq:
  host:
  port:
  user:
  password:

Uninstall

Uninstalling the helm chart

helm delete oncall

Clean up PVCs

kubectl delete pvc data-oncall-mariadb-0 data-oncall-rabbitmq-0 \
redis-data-oncall-redis-master-0 redis-data-oncall-redis-replicas-0 \
redis-data-oncall-redis-replicas-1 redis-data-oncall-redis-replicas-2

Clean up secrets

kubectl delete secrets certificate-tls oncall-cert-manager-webhook-ca oncall-ingress-nginx-admission