# What this PR does
Short summary: this PR improves security and configuration management
for Helm deployment. Please take a look at the details below.
## Which issue(s) this PR fixes
Issues:
- Cannot explicitly define redis database (only 0 and 1 numbers are
used)
- Cannot securely use TLS for Redis (cannot set CA certificate; cannot
set client certificates)
- Cannot securely use TLS for Postgres (cannot set CA certificate;
cannot set client certificates; cannot set `verify-full` validation)
- ~~Chart option `securityContext.readOnlyRootFilesystem: true` issues
CrashLoopBack pod state~~ will be moved to new PR
## Checklist
- [x] ~~Unit, integration, and e2e (if applicable) tests updated~~ (not
required)
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
- [x] Helm tests are fixed and updated
- [x] Manually verified the features:
- [x] postgres TLS connection with `verify-full` validation
- [x] redis TLS connection with `cert_required` validation
- [x] redis protocol and database number controls
- [x] all containers properly work in read-only root filesystem
- [x] all changes are backward compatible (doesn't break old
deployments)
## Changelog
- Fixed helm tests
- Added configuration options for secure TLS communication with
dependencies like Redis, MySQL, and Postgres
- ~~Added configuration option for relocating `celerybeat` database file
(read-only root filesystem issue)~~ will be moved to new PR
- Improved redis database configuration options
- Now only single redis database is used
- Added ability to mount custom volumes (with CA certificates, for
example) into Helm chart
- ~~Fixed issue with read-only root filesystem for Helm chart~~ will be
moved to new PR
- Add ability to work with Redis ACL (and AWS ElastiCache)
# What this PR does
* Create Direct Paging integration (with default route) when team is
created with bulk_update
* Create notification policies when user is created with bulk_update
* If user notification policies are empty change it to Email
* Minor markup and wording improvements
* Add grafana queue to helm chart
* Remove disabled commands for redis helm chart
* Improve Dockerfile caching
## Which issue(s) this PR fixes
## Checklist
- [ ] Unit, integration, and e2e (if applicable) tests updated
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required)
- [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
# What this PR does
Set `oncall.smtp.enabled` to `true` by default to enable email
notifications on Helm deployments.
Email notifications are enabled by default on docker-compose deployments
already: see [this feature
flag](df6f6183ec/engine/settings/base.py (L63)).
## Which issue(s) this PR fixes
https://github.com/grafana/oncall/issues/2917
## Checklist
- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
# What this PR does
Runs Telegram long polling to get updates.
It's enabled by setting `FEATURE_TELEGRAM_LONG_POLLING_ENABLED=True`.
That will disable webhook and run separate deployment for telegram long
polling.
Telegram long polling is not very HA mode, but it does not need to
expose webhook url to internet and simplifies telegram integration.
## Which issue(s) this PR fixes
closes#561
## Checklist
- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
# What this PR does
- updates the GitHub Actions workflow to move the e2e tests into a
"[reusable
workflow](https://docs.github.com/en/actions/using-workflows/reusing-workflows#creating-a-reusable-workflow)"
which are run in two scenarios:
- all tests _except_ those annotated as `@expensive` are run against
`grafana/grafana:latest` on all feature branches
- all tests _including_ `@expensive` tests are run on weekdays @ 07h00
UTC, against a matrix of 6 grafana versions. Results of these builds
will be posted to `#irm-amixr-flux` Slack channel.
- local development will now be:
```bash
make build-dev-images init-k8s start-k8s
```
- `build-dev-images` - builds the engine and UI docker images (only need
to run first time)
- `init-k8s` - creates a `kind` cluster and loads the two Docker images
onto the cluster nodes (only need to run first time)
- `start-k8s` - switches `kubectl` context to the created `kind`
cluster, and uses `helm` to deploy everything as defined in
`./dev/helm-local.yml` and `./dev/helm-local.dev.yml` (that latter file
is `.gitignored` and specific to how _you_ want your setup to look like.
Hot reloading works as before. This is the _start_ of #2381. (I've
marked these `make` commands as beta, because they've not yet been
thoroughly tested for local development).
- modifies the `helm` chart to add the concept of `oncall.devMode`,
`ui`, and ability to run oncall w/ sqlite
- `oncall.devMode` will essentially just add `volumes` and
`volumeMounts` to the various engine/migrate containers +
- `ui.enabled` + `ui.env` - create a ui container (which is needed for
hot reloading locally)
- `sqlite` - this was useful for the e2e test environments where Github
runner resources are scarce. Running `mariadb` eats up precious
resources, instead lets just use sqlite here
- fixes an issue that caused sporadic HTTP 502s from the grafana
plugin-proxy, which led to flaky tests. See [this
comment](https://github.com/grafana/oncall/pull/2751/files#diff-09040e8df192699b9c5742110ebbe8d9d5c3938cb156cc1cb99fa1c3fdee4fefR72-R77)
for more context + a link to a relevant Slack conversation. **tldr;**
there is a bug with the Grafana plugin proxy in Grafana >= v10.0.3.
Let's stop using the `latest`/`main` docker tags in our test and pin to
`10.0.2` for now
- ~~re-enables the e2e test which validates a phone number via SMS, and
asserts that we can receive an alert escalation via SMS (new Mailslurp
API Key has been added as a repo secret)~~ update: this is still blocked
by procurement, will be done in a future PR
## Checklist
- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
# What this PR does
Add [`yamllint`](https://github.com/adrienverge/yamllint) to
`pre-commit` configuration + fix pre-existing errors
## Checklist
- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
# What this PR does
fixes a bug introduced in #2655 where the `topologySpreadConstraints`
helm values are not scoped correctly on the engine template.
## Which issue(s) this PR fixes
closes#2655
## Checklist
- [ ] Unit, integration, and e2e (if applicable) tests updated
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required)
- [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
Co-authored-by: Joey Orlando <joey.orlando@grafana.com>