# What this PR does
Disable accessControlOnCall for Grafana 11.3
<!--
*Note*: If you want the issue to be auto-closed once the PR is merged,
change "Related to" to "Closes" in the line above.
If you have more than one GitHub issue that this PR closes, be sure to
preface
each issue link with a [closing
keyword](https://docs.github.com/en/get-started/writing-on-github/working-with-advanced-formatting/using-keywords-in-issues-and-pull-requests#linking-a-pull-request-to-an-issue).
This ensures that the issue(s) are auto-closed once the PR has been
merged.
-->
## Checklist
- [ ] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] Added the relevant release notes label (see labels prefixed w/
`release:`). These labels dictate how your PR will
show up in the autogenerated release notes.
# What this PR does
Updates the helm chart and docker compose files with the required
changes to support the plugin initialization changes. The updated
instructions in README.md show how to set up and initialize OnCall
without needing to go to the configuration page; this is currently the
preferred method.
## Which issue(s) this PR closes
Related to [issue link here]
## Checklist
- [ ] Unit, integration, and e2e (if applicable) tests updated
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required)
- [ ] Added the relevant release notes label (see labels prefixed w/
`release:`). These labels dictate how your PR will
show up in the autogenerated release notes.
---------
Co-authored-by: GitHub Actions <actions@github.com>
# What this PR does
Adds flexibility in the choice of encryption method for the SMTP email
app. Some email servers are configured to use port 465 (implicit TLS),
which requires `EMAIL_USE_SSL` instead of `EMAIL_USE_TLS`.
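As a rough illustration of the distinction (a sketch, not the engine's actual settings code), the helper below picks Django-style flags from the SMTP port. The function name and the port mapping (465 for implicit TLS, 587 for STARTTLS) are assumptions made for this example; Django itself treats `EMAIL_USE_SSL` and `EMAIL_USE_TLS` as mutually exclusive.

```python
# Hypothetical helper for illustration only; not taken from the OnCall codebase.
def smtp_encryption_settings(port: int) -> dict:
    """Pick Django-style SMTP encryption flags for a given port.

    Assumes the conventional mapping: 465 means implicit TLS,
    587 means STARTTLS. At most one flag is ever set to True.
    """
    if port == 465:
        # The connection is wrapped in TLS from the very first byte.
        return {"EMAIL_USE_SSL": True, "EMAIL_USE_TLS": False}
    if port == 587:
        # A plain connection is upgraded with STARTTLS.
        return {"EMAIL_USE_SSL": False, "EMAIL_USE_TLS": True}
    # Any other port (e.g. 25) is left unencrypted in this sketch.
    return {"EMAIL_USE_SSL": False, "EMAIL_USE_TLS": False}


print(smtp_encryption_settings(465))
```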
## Which issue(s) this PR fixes
Fixes https://github.com/grafana/oncall/issues/1044
## Checklist
- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
---------
Co-authored-by: Joey Orlando <joey.orlando@grafana.com>
Co-authored-by: Joey Orlando <joseph.t.orlando@gmail.com>
# What this PR does
Related to [this
discussion](https://raintank-corp.slack.com/archives/C04JCU51NF8/p1706550226831949)
Removes the `/oncall` Slack slash command + the concept of
`force_route_id` (as this Slack slash command was the last piece of code
to use this concept
[here](https://github.com/grafana/oncall/blob/dev/engine/apps/slack/scenarios/manual_incident.py#L146))
## TODO before merging
- [x] update the various env's Slack apps to remove the slash command
from the app manifests
## Checklist
- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
# What this PR does
- Use Grafana Scenes to add Insights as a separate page in OnCall
- Add an option to run a Prometheus instance via helm so that the
Prometheus Exporter feature can be used without setting up
Prometheus separately
## Which issue(s) this PR fixes
https://github.com/grafana/oncall-private/issues/2382
## Checklist
- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
The examples at extraVolumeMounts and extraVolumes properties are
swapped
# What this PR does
Fixing the properties extraVolumeMounts and extraVolumes in Helm chart
## Which issue(s) this PR fixes
## Checklist
- [ ] Unit, integration, and e2e (if applicable) tests updated
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required)
- [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
Signed-off-by: Kleber Rocha <klinux@gmail.com>
Co-authored-by: Joey Orlando <joey.orlando@grafana.com>
# What this PR does
## Issue
On the first run (`helm install ...`), the migration job cannot start
its container because it cannot find the Postgres/Redis/MySQL
credentials and the ServiceAccount.
Workaround: set the `.migrate.useHook` value to `false` for the
`install` stage; afterwards you can switch it back to `true`.
This PR completely resolves this issue.
## Checklist
- [x] Unit, integration, and e2e (if applicable) tests updated (doesn't
violate anything)
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
- [x] It is manually tested in the internal environment
---------
Co-authored-by: Joey Orlando <joey.orlando@grafana.com>
Co-authored-by: Joey Orlando <joseph.t.orlando@gmail.com>
# What this PR does
Add option to add additional pod labels.
## Checklist
- [ ] Unit, integration, and e2e (if applicable) tests updated
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required)
- [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
- [x] It is manually tested in the internal environment
---------
Co-authored-by: Marius Ensrud <marius.ensrud@skatteetaten.no>
Co-authored-by: Joey Orlando <joey.orlando@grafana.com>
Co-authored-by: Joey Orlando <joseph.t.orlando@gmail.com>
Co-authored-by: Ildar Iskhakov <Ildar.iskhakov@grafana.com>
# What this PR does
Short summary: this PR improves security and configuration management
for Helm deployment. Please take a look at the details below.
## Which issue(s) this PR fixes
Issues:
- Cannot explicitly define the Redis database (only databases 0 and 1
are used)
- Cannot securely use TLS for Redis (cannot set CA certificate; cannot
set client certificates)
- Cannot securely use TLS for Postgres (cannot set CA certificate;
cannot set client certificates; cannot set `verify-full` validation)
- ~~Chart option `securityContext.readOnlyRootFilesystem: true` causes
the pod to enter the `CrashLoopBackOff` state~~ will be moved to new PR
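To make the Redis items above concrete, here is a minimal sketch of how a connection URL could carry the database number, ACL credentials and TLS options. The function and its parameters are hypothetical and do not describe how the chart actually assembles its configuration; the `rediss://` scheme and the `ssl_cert_reqs` / `ssl_ca_certs` query parameters follow redis-py's URL conventions.

```python
from urllib.parse import quote, urlencode


# Hypothetical helper for illustration; the chart's real templating may differ.
def redis_url(host, port=6379, database=0, username="", password="",
              use_tls=False, ca_certs=None):
    """Build a Redis URL with an explicit database number and optional TLS."""
    # "rediss" (double s) selects a TLS connection in redis-py.
    scheme = "rediss" if use_tls else "redis"
    auth = ""
    if username or password:
        # Percent-encode credentials so characters such as "@" stay unambiguous.
        auth = f"{quote(username, safe='')}:{quote(password, safe='')}@"
    url = f"{scheme}://{auth}{host}:{port}/{database}"
    if use_tls:
        # Require certificate verification, optionally against a custom CA file.
        params = {"ssl_cert_reqs": "required"}
        if ca_certs:
            params["ssl_ca_certs"] = ca_certs
        url += "?" + urlencode(params)
    return url


print(redis_url("redis.example", database=3))
```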
## Checklist
- [x] ~~Unit, integration, and e2e (if applicable) tests updated~~ (not
required)
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
- [x] Helm tests are fixed and updated
- [x] Manually verified the features:
  - [x] postgres TLS connection with `verify-full` validation
  - [x] redis TLS connection with `cert_required` validation
  - [x] redis protocol and database number controls
  - [x] all containers properly work in read-only root filesystem
  - [x] all changes are backward compatible (doesn't break old
  deployments)
## Changelog
- Fixed helm tests
- Added configuration options for secure TLS communication with
dependencies like Redis, MySQL, and Postgres
- ~~Added configuration option for relocating `celerybeat` database file
(read-only root filesystem issue)~~ will be moved to new PR
- Improved Redis database configuration options
- Only a single Redis database is now used
- Added ability to mount custom volumes (with CA certificates, for
example) into Helm chart
- ~~Fixed issue with read-only root filesystem for Helm chart~~ will be
moved to new PR
- Add ability to work with Redis ACL (and AWS ElastiCache)
# What this PR does
* Create Direct Paging integration (with default route) when team is
created with bulk_update
* Create notification policies when user is created with bulk_update
* If a user's notification policies are empty, set them to Email
* Minor markup and wording improvements
* Add grafana queue to helm chart
* Remove disabled commands for redis helm chart
* Improve Dockerfile caching
## Which issue(s) this PR fixes
## Checklist
- [ ] Unit, integration, and e2e (if applicable) tests updated
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required)
- [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
# What this PR does
Set `oncall.smtp.enabled` to `true` by default to enable email
notifications on Helm deployments.
Email notifications are enabled by default on docker-compose deployments
already: see [this feature
flag](df6f6183ec/engine/settings/base.py (L63)).
## Which issue(s) this PR fixes
https://github.com/grafana/oncall/issues/2917
## Checklist
- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
# What this PR does
Runs Telegram long polling to get updates.
It's enabled by setting `FEATURE_TELEGRAM_LONG_POLLING_ENABLED=True`.
This disables the webhook and runs a separate deployment for Telegram
long polling.
Telegram long polling is not a highly available mode, but it does not
require exposing a webhook URL to the internet, which simplifies the
Telegram integration.
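For readers unfamiliar with the mechanism, the sketch below shows the core of a long-polling step against the Bot API's `getUpdates` method, which accepts `offset` and `timeout` parameters. It is not the code added by this PR: `poll_once`, `fetch` and `handle` are hypothetical names, and `fetch` is injected so the network call stays out of the example.

```python
# Hypothetical sketch; `fetch` stands in for an HTTP call to Telegram's getUpdates.
def poll_once(fetch, handle, offset: int, timeout: int = 30) -> int:
    """Fetch one batch of updates and return the offset for the next call."""
    # With a non-zero timeout the server holds the request open until
    # an update arrives or the timeout expires (long polling).
    updates = fetch(offset=offset, timeout=timeout)
    for update in updates:
        handle(update)
        # Requesting update_id + 1 next time acknowledges this update.
        offset = max(offset, update["update_id"] + 1)
    return offset


# Usage with a fake fetcher that returns two updates.
def fake_fetch(offset, timeout):
    return [{"update_id": 10}, {"update_id": 11}]


seen = []
next_offset = poll_once(fake_fetch, seen.append, offset=0)
print(next_offset)  # 12
```

Because only the process running this loop receives updates, running several replicas would not add availability, which matches the trade-off described above.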
## Which issue(s) this PR fixes
closes #561
## Checklist
- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
# What this PR does
- updates the GitHub Actions workflow to move the e2e tests into a
"[reusable
workflow](https://docs.github.com/en/actions/using-workflows/reusing-workflows#creating-a-reusable-workflow)"
which are run in two scenarios:
- all tests _except_ those annotated as `@expensive` are run against
`grafana/grafana:latest` on all feature branches
- all tests _including_ `@expensive` tests are run on weekdays @ 07h00
UTC, against a matrix of 6 grafana versions. Results of these builds
will be posted to `#irm-amixr-flux` Slack channel.
- local development will now be:
```bash
make build-dev-images init-k8s start-k8s
```
- `build-dev-images` - builds the engine and UI docker images (only need
to run first time)
- `init-k8s` - creates a `kind` cluster and loads the two Docker images
onto the cluster nodes (only need to run first time)
- `start-k8s` - switches `kubectl` context to the created `kind`
cluster, and uses `helm` to deploy everything as defined in
`./dev/helm-local.yml` and `./dev/helm-local.dev.yml` (the latter file
is `.gitignored` and specific to how _you_ want your setup to look).
Hot reloading works as before. This is the _start_ of #2381. (I've
marked these `make` commands as beta, because they've not yet been
thoroughly tested for local development).
- modifies the `helm` chart to add the concept of `oncall.devMode`,
`ui`, and ability to run oncall w/ sqlite
- `oncall.devMode` will essentially just add `volumes` and
`volumeMounts` to the various engine/migrate containers
- `ui.enabled` + `ui.env` - create a ui container (which is needed for
hot reloading locally)
- `sqlite` - this was useful for the e2e test environments where GitHub
runner resources are scarce. Running `mariadb` eats up precious
resources; instead, let's just use sqlite here
- fixes an issue that caused sporadic HTTP 502s from the grafana
plugin-proxy, which led to flaky tests. See [this
comment](https://github.com/grafana/oncall/pull/2751/files#diff-09040e8df192699b9c5742110ebbe8d9d5c3938cb156cc1cb99fa1c3fdee4fefR72-R77)
for more context + a link to a relevant Slack conversation. **tldr;**
there is a bug with the Grafana plugin proxy in Grafana >= v10.0.3.
Let's stop using the `latest`/`main` docker tags in our test and pin to
`10.0.2` for now
- ~~re-enables the e2e test which validates a phone number via SMS, and
asserts that we can receive an alert escalation via SMS (new Mailslurp
API Key has been added as a repo secret)~~ update: this is still blocked
by procurement, will be done in a future PR
## Checklist
- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
# What this PR does
Add [`yamllint`](https://github.com/adrienverge/yamllint) to
`pre-commit` configuration + fix pre-existing errors
## Checklist
- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
# What this PR does
Adds support for defining extra containers which run as sidecar
alongside the celery and engine containers
## Which issue(s) this PR fixes
## Checklist
- [x] Unit, integration, and e2e (if applicable) tests updated
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
# What this PR does
Adds support for `topologySpreadConstraints` and `priorityClassName` on
the celery/engine deployment templates
## Which issue(s) this PR fixes
https://github.com/grafana/oncall/issues/2655
## Checklist
- [X] Unit, integration, and e2e (if applicable) tests updated
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required)
- [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
Co-authored-by: Joey Orlando <joey.orlando@grafana.com>
# What this PR does
Adds support for custom annotations on the helm chart's migrate job
## Which issue(s) this PR fixes
https://github.com/grafana/oncall/issues/1618
## Checklist
- [X] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
# What this PR does
Added a resource limits definition for the `wait-for-db` container
## Which issue(s) this PR fixes
I ran into a problem: when I install OnCall with Helm, the
oncall-engine and oncall-celery pods get stuck in the Init state because
they don't have enough resources to run.
## Checklist
- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required)
---------
Co-authored-by: Joey Orlando <joey.orlando@grafana.com>
Co-authored-by: Joey Orlando <joseph.t.orlando@gmail.com>
# What this PR does
Add affinity and tolerations for celery
## Which issue(s) this PR fixes
## Checklist
- [x] Unit, integration, and e2e (if applicable) tests updated
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
Co-authored-by: Joey Orlando <joey.orlando@grafana.com>
- Enabling existing secrets for external MySQL and Redis
- Tolerate existing secrets for bundled charts.
- README.md: secrets handling explained.
- Fixed multiple bugs where missing required field was replaced with
default instead of failing.
- PHONE_NOTIFICATIONS_LIMIT was on the wrong level: it was not set if
existingSecret was true.
Next are the cosmetic changes. They improve chart consistency, e.g.
prevent generation of multiple new lines in certain cases:
- A common approach to whitespace trimming. This typically allows
indenting both curly blocks and literal strings, and clean `nindent` usage:
- Two curly blocks should not trim the same space. I.e. "{{ ... -}} {{-
... }}" shouldn't happen.
- A template generates either a single-line or a multiline string. In
both cases, no new line appears on either side of the output string. So
we delete unnecessary new lines inside and at the end of the string with
"trim-to-left" (`{{-` ) and the leading new line using "trim-to-right"
(`-}}`).
Note that trimming both the leading and the trailing new line is not
always easily possible: https://github.com/Masterminds/sprig/issues/357
Example:
```
{{- define "mytemplate" -}}
{{ if .Values.someBoolean -}}
{{ .Values.some }}
{{- else -}}
some string
{{- end }}
{{- end }}
```
- `template` replaced with `include`. It is often recommended to use
`include` by default, as it allows pipelining.
## Checklist
- [ ] Tests updated - No tests for Helm chart
- [X] Documentation added
- [x] `CHANGELOG.md` updated
Co-authored-by: Ildar Iskhakov <Ildar.iskhakov@grafana.com>
# What this PR does
This PR adds the ability to use an existing secret for external Redis
and external MySQL and it follows the same changes that PR #761 did for
RabbitMQ (including the fix that was done for it later in #775)
## Checklist
- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
---------
Co-authored-by: Ildar Iskhakov <Ildar.iskhakov@grafana.com>
# What this PR does
1. Fixes setting extra envs using:
```yaml
env:
proxy: http://example.com
SOME_VAR: some-value
```
It previously failed when the postgresql setting was enabled, and also
in `job-migrate`
2. Fixes an issue where the `MYSQL_` envs did not use a custom database
and username set for the internal mariadb
```yaml
mariadb:
auth:
database: grafana_oncall
username: grafana_oncall
```
3. Added `imagePullSecrets: []` to values.yaml. It is used in the helm
chart but was not present in values.yaml
4. More unit tests
## Which issue(s) this PR fixes
## Checklist
- [x] Unit, integration, and e2e (if applicable) tests updated
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
Co-authored-by: Ildar Iskhakov <Ildar.iskhakov@grafana.com>
# What this PR does
* Upgrade to the recent Grafana
* Upgrade to the recent bitnami mariadb, rabbitmq charts which support
arm64 now
* Remove deprecated psp policies from grafana chart
* Make startupProbe period smaller to increase installation speed
## Which issue(s) this PR fixes
## Checklist
- [ ] Unit, integration, and e2e (if applicable) tests updated
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required)
- [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
# What this PR does
Adds nodeSelector for celery deployment and migrate job
## Which issue(s) this PR fixes
Fixes errors while deploying resources in a cluster with a Gatekeeper
policy (that restricts deployments without nodeSelector).
## Checklist
- [x] Tests updated
- [x] Documentation added
- [x] `CHANGELOG.md` updated
---------
Co-authored-by: Joey Orlando <joseph.t.orlando@gmail.com>
Co-authored-by: Joey Orlando <joey.orlando@grafana.com>
# What this PR does
Adds `uwsgi` configuration to helm chart.
Sets environment variables with the option name capitalized and prefixed
with `UWSGI_`, with dashes replaced by underscores, as described
[here](https://uwsgi-docs.readthedocs.io/en/latest/Configuration.html#environment-variables)
Sets `UWSGI_LISTEN=1024` by default, but this can be overridden or
completely removed with `uwsgi: null`
Or is it better to not specify a default value (it's not backward
compatible)?
Also, small indentation fixes for postgresql configuration.
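The naming rule above can be sketched in a few lines of Python. This is only an illustration of the mapping, not the chart's template code; `uwsgi_env` is a hypothetical name and `max-requests` is used purely as a sample uWSGI option.

```python
# Hypothetical helper illustrating the option-name to env-var mapping.
def uwsgi_env(options: dict) -> dict:
    """Map uWSGI option names to UWSGI_* environment variables."""
    return {
        # Upper-case the name, swap dashes for underscores, add the prefix.
        "UWSGI_" + name.upper().replace("-", "_"): str(value)
        for name, value in options.items()
    }


print(uwsgi_env({"listen": 1024, "max-requests": 5000}))
# {'UWSGI_LISTEN': '1024', 'UWSGI_MAX_REQUESTS': '5000'}
```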
## Which issue(s) this PR fixes
closes https://github.com/grafana/oncall/issues/562
Also, [this PR](https://github.com/grafana/oncall/pull/856) has been
closed because of this PR
## Checklist
- [x] Unit, integration, and e2e (if applicable) tests updated
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
Co-authored-by: Joey Orlando <joey.orlando@grafana.com>
# What this PR does
Fixing some bugs with external Postgresql configuration.
Also I added some unit tests for helm chart using
[helm-unittest](https://github.com/helm-unittest/helm-unittest). If it's
not an appropriate tool, please suggest another, or I can remove that
test. I added
[this](https://github.com/marketplace/actions/helm-unit-tests) Github
Action to run helm unit tests.
## Which issue(s) this PR fixes
closes #1727, closes #1923, closes #1245, closes #845
## Checklist
- [x] Unit, integration, and e2e (if applicable) tests updated
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required)
- [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
---------
Co-authored-by: Joey Orlando <joey.orlando@grafana.com>
Co-authored-by: Joey Orlando <joseph.t.orlando@gmail.com>
# What this PR does
This allows the use of existing secrets for the Twilio configuration,
much like Slack already does; the Slack implementation was used as the
model for these changes
## Which issue(s) this PR fixes
## Checklist
- [ ] Tests updated
- [x] Documentation added
- [x] `CHANGELOG.md` updated
---------
Co-authored-by: Matvey Kukuy <Matvey-Kuk@users.noreply.github.com>
# What this PR does
This PR adds the option to use helm hooks for the database migration.
## Which issue(s) this PR fixes
Currently oncall always shows as out-of-sync in argo-cd because the
migration job's name changes on each hard refresh.
When using a helm hook, the job is executed on sync but does not show as
out-of-sync.
## Checklist
- [ ] Tests updated
- [ ] Documentation added
- [x] `CHANGELOG.md` updated
---------
Co-authored-by: Ildar Iskhakov <Ildar.iskhakov@grafana.com>
# What this PR does
The templates actually generate the SLACK_SLASH_COMMAND_NAME envvar from
`.Values.oncall.slack.commandName`, not `command`. This commit changes
the default values file to reflect this.
## Which issue(s) this PR fixes
#1092
## Checklist
- [ ] Tests updated
- [ ] Documentation added
- [ ] `CHANGELOG.md` updated
^ Are these applicable with regards to updating the helm chart, not the
core application? I did take the time to test both the default and
changed value after this change in a kubernetes deployment.
Co-authored-by: Joey Orlando <joey.orlando@grafana.com>
# What this PR does
Now it is possible to change engine deployment update strategy in
values.yaml.
## Which issue(s) this PR fixes
This is due to #334 and #316, as with rolling update, race conditions
might happen when there is still an old engine pod running.
Co-authored-by: Joey Orlando <joey.orlando@grafana.com>
* Get rid of installation token (for OSS installations)
This is done by requiring the Grafana API URL to be supplied as an
environment variable on the backend. Optionally, an OnCall API URL
environment variable can also be passed to the frontend (this basically
makes it possible to skip configuration entirely).
- deduplicated a lot of the sync logic on the frontend + made
error message more useful and consistent
- Split PluginConfigPage component into several subcomponents
(making it easier to test each individual component)
- Moved RootWithLoader (from plugin/GrafanaPluginRootPage) into its own
subcomponent (making it easier to test)
- Added tests for pre-existing components that were touched:
- PluginConfigPage component (and its new subcomponents)
- state/plugin and state/rootBaseStore functions
- apps.grafana_plugin django app
Helm changes:
- add GRAFANA_API_URL to oncall.env
- some yaml autoformatting changes
- remove reference to python manage.py issue_invite_for_the_frontend --override
Co-authored-by: Joey Orlando <joseph.t.orlando@gmail.com>
* init rabbitmq existing secret
Signed-off-by: David van der Spek <vanderspek.david@gmail.com>
* bump chart
Signed-off-by: David van der Spek <vanderspek.david@gmail.com>