# What this PR does
Disable accessControlOnCall for Grafana 11.3
<!--
*Note*: If you want the issue to be auto-closed once the PR is merged,
change "Related to" to "Closes" in the line above.
If you have more than one GitHub issue that this PR closes, be sure to
preface
each issue link with a [closing
keyword](https://docs.github.com/en/get-started/writing-on-github/working-with-advanced-formatting/using-keywords-in-issues-and-pull-requests#linking-a-pull-request-to-an-issue).
This ensures that the issue(s) are auto-closed once the PR has been
merged.
-->
## Checklist
- [ ] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] Added the relevant release notes label (see labels prefixed w/
`release:`). These labels dictate how your PR will
show up in the autogenerated release notes.
# What this PR does
- Fix some types errors when OnCall type checking runs under IRM
- Make type check required on CI
## Checklist
- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] Added the relevant release notes label (see labels prefixed w/
`release:`). These labels dictate how your PR will
show up in the autogenerated release notes.
# What this PR does
- add missing labels-related permissions for external service account
used by new oncall init process
- fix expensive e2e tests in new oncall init process
- unify Grafana versions between standard and expensive e2e tests
- fix running tilt through ops-devenv in new oncall init process
- avoid duplicated standard e2e tests on workflows that run daily and on
merges to main
## Which issue(s) this PR closes
Related to https://github.com/grafana/oncall-private/issues/2656
<!--
*Note*: If you want the issue to be auto-closed once the PR is merged,
change "Related to" to "Closes" in the line above.
If you have more than one GitHub issue that this PR closes, be sure to
preface
each issue link with a [closing
keyword](https://docs.github.com/en/get-started/writing-on-github/working-with-advanced-formatting/using-keywords-in-issues-and-pull-requests#linking-a-pull-request-to-an-issue).
This ensures that the issue(s) are auto-closed once the PR has been
merged.
-->
## Checklist
- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] Added the relevant release notes label (see labels prefixed w/
`release:`). These labels dictate how your PR will
show up in the autogenerated release notes.
---------
Co-authored-by: Joey Orlando <joseph.t.orlando@gmail.com>
# What this PR does
New OnCall plugin initialization process
## Checklist
- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] Added the relevant release notes label (see labels prefixed w/
`release:`). These labels dictate how your PR will
show up in the autogenerated release notes.
---------
Co-authored-by: Michael Derynck <michael.derynck@grafana.com>
Co-authored-by: Matias Bordese <mbordese@gmail.com>
# What this PR does
Related to https://github.com/grafana/oncall-private/issues/2692
This PR simply deduplicates a lot of steps in our
`linting-and-tests.yml` GitHub Actions workflow. This will make it much
easier in `grafana/oncall-private` to be able to reuse some of these
composable building blocks.
# What this PR does
Refactors the PagerDuty migration script to be a bit more generic + adds
a migration script to migrate from Splunk OnCall (VictorOps)
tldr;
```bash
❯ docker build -t oncall-migrator .
[+] Building 0.4s (10/10) FINISHED
❯ docker run --rm \
-e MIGRATING_FROM="pagerduty" \
-e MODE="plan" \
-e ONCALL_API_URL="http://localhost:8080" \
-e ONCALL_API_TOKEN="<ONCALL_API_TOKEN>" \
-e PAGERDUTY_API_TOKEN="<PAGERDUTY_API_TOKEN>" \
oncall-migrator
running pagerduty migration script...
❯ docker run --rm \
-e MIGRATING_FROM="splunk" \
-e MODE="plan" \
-e ONCALL_API_URL="http://localhost:8080" \
-e ONCALL_API_TOKEN="<ONCALL_API_TOKEN>" \
-e SPLUNK_API_ID="<SPLUNK_API_ID>" \
-e SPLUNK_API_KEY="<SPLUNK_API_KEY>" \
oncall-migrator
migrating from splunk oncall...
```
https://www.loom.com/share/a855062d436a4ef79f030e22528d8c71
## Checklist
- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] Added the relevant release notes label (see labels prefixed w/
`release:`). These labels dictate how your PR will
show up in the autogenerated release notes.
# What this PR does
- Use cross-plugin e2e tests setup (cloning ops-devenv, gops-labels)
only on a daily runs and on dev/main branch pipelines (exclude it from
PRs so that community PRs don't rely on secrets)
- Rename "Daily e2e tests" to "Expensive e2e tests" and run them both
daily and when PRs are merged to dev/main
- Post Slack message only if e2e tests fail
## Checklist
- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] Added the relevant release notes label (see labels prefixed w/
`release:`). These labels dictate how your PR will
show up in the autogenerated release notes.
---------
Co-authored-by: Joey Orlando <joey.orlando@grafana.com>
[uv](https://github.com/astral-sh/uv) is an extremely fast Python
package installer and resolver, written in Rust, and designed as a
drop-in replacement for pip and pip-tools workflows (see
[post](https://astral.sh/blog/uv))
# What this PR does
- Run e2e tests using ops-devenv against environment that includes
OnCall & Labels
- Add first e2e test for Labels (creating new label key and value)
## Which issue(s) this PR closes
Closes https://github.com/grafana/oncall/issues/4083
<!--
*Note*: if you have more than one GitHub issue that this PR closes, be
sure to preface
each issue link with a [closing
keyword](https://docs.github.com/en/get-started/writing-on-github/working-with-advanced-formatting/using-keywords-in-issues-and-pull-requests#linking-a-pull-request-to-an-issue).
This ensures that the issue(s) are auto-closed once the PR has been
merged.
-->
## Checklist
- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] Added the relevant release notes label (see labels prefixed w/
`release:`). These labels dictate how your PR will
show up in the autogenerated release notes.
# What this PR does
- bring back and fix frontend unit tests
## Checklist
- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] Added the relevant release notes label (see labels prefixed w/
`release:`). These labels dictate how your PR will
show up in the autogenerated release notes.
# What this PR does
Allow e2e tests GitHub Actions job to run for PRs from external forks.
At the moment they are not allowed to run because PRs from external
forks are not allowed to access secrets in this repository. The secrets
accessed by the e2e tests are _really_ only required by the
daily/"expensive" e2e tests.
[Example PR](https://github.com/grafana/oncall/pull/3992) from external
fork where e2e tests fail ([CI
job](https://github.com/grafana/oncall/actions/runs/8175205794?pr=3992)):
<img width="1110" alt="Screenshot 2024-03-06 at 11 58 05"
src="https://github.com/grafana/oncall/assets/9406895/00ca97c2-0740-4e3e-9a03-7e92f20b69e6">
## Which issue(s) this PR closes
N/A
## Checklist
- [ ] Unit, integration, and e2e (if applicable) tests updated
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required)
- [ ] Added the relevant release notes label (see labels prefixed w/
`release:`). These labels dictate how your PR will
show up in the autogenerated release notes.
# What this PR does
- fixes our e2e tests to work on all tested versions
- updates Grafana versions that we run the daily e2e tests against (bump
`10.0.2` to `10.0.11` + add `10.1.7` tags)
- updates the Slack status message format + change channel from
#irm-amixr-flux to #gops-oncall-dev
<img width="1479" alt="Screenshot 2024-02-24 at 08 30 06"
src="https://github.com/grafana/oncall/assets/9406895/f5cb91f8-12ce-4978-9c37-c72ee8a01e4b">
## NOTE
It looks like we have some e2e tests that fail under the following
circumstances:
- on Firefox or WebKit
- on Grafana 10.2 and 10.3 (once we fix these, we should [update our e2e
tests that run on all PR
builds](https://github.com/grafana/oncall/blob/dev/.github/workflows/linting-and-tests.yml#L325)
to run against `10.3.3` which is the current latest major version
available)
## Checklist
- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
[pip-tools](https://github.com/jazzband/pip-tools) helps making builds
deterministic, controlling deps (and indirect deps) upgrades (and
versions consistency) avoiding unexpected (and potentially breaking)
changes.
We keep our direct deps in `requirements.in` from which we generate the
`requirements.txt` (where *all* deps are pinned). We also constrain dev
(and enterprise) deps based on base requirements.
Check how to [update
deps](https://github.com/jazzband/pip-tools?tab=readme-ov-file#updating-requirements).
# What this PR does
Upgrade to Python 3.12 + fix several invalid test assertions that lead
to test failures in the latest version of `pytest`:
```
AttributeError: 'called_once_with' is not a valid assertion. Use a spec for the mock if 'called_once_with' is meant to be an attribute.. Did you mean: 'assert_called_once_with'?
```
## Checklist
- [ ] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
# What this PR does
In PR pipelines install dependencies and run e2e tests only in Chromium.
In daily e2e workflow use Chromium, Firefox and Webkit.
## Which issue(s) this PR fixes
## Checklist
- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
# What this PR does
- add missing db migration files generated via `python manage.py
makemigrations`
- fail the `lint-migrations-backend-mysql-rabbitmq` GitHub Actions CI
job if there are missing Django database migration files
# What this PR does
- updates the GitHub Actions workflow to move the e2e tests into a
"[reusable
workflow](https://docs.github.com/en/actions/using-workflows/reusing-workflows#creating-a-reusable-workflow)"
which are run in two scenarios:
- all tests _except_ those annotated as `@expensive` are run against
`grafana/grafana:latest` on all feature branches
- all tests _including_ `@expensive` tests are run on weekdays @ 07h00
UTC, against a matrix of 6 grafana versions. Results of these builds
will be posted to `#irm-amixr-flux` Slack channel.
- local development will now be:
```bash
make build-dev-images init-k8s start-k8s
```
- `build-dev-images` - builds the engine and UI docker images (only need
to run first time)
- `init-k8s` - creates a `kind` cluster and loads the two Docker images
onto the cluster nodes (only need to run first time)
- `start-k8s` - switches `kubectl` context to the created `kind`
cluster, and uses `helm` to deploy everything as defined in
`./dev/helm-local.yml` and `./dev/helm-local.dev.yml` (that latter file
is `.gitignored` and specific to how _you_ want your setup to look like.
Hot reloading works as before. This is the _start_ of #2381. (I've
marked these `make` commands as beta, because they've not yet been
thoroughly tested for local development).
- modifies the `helm` chart to add the concept of `oncall.devMode`,
`ui`, and ability to run oncall w/ sqlite
- `oncall.devMode` will essentially just add `volumes` and
`volumeMounts` to the various engine/migrate containers +
- `ui.enabled` + `ui.env` - create a ui container (which is needed for
hot reloading locally)
- `sqlite` - this was useful for the e2e test environments where Github
runner resources are scarce. Running `mariadb` eats up precious
resources, instead lets just use sqlite here
- fixes an issue that caused sporadic HTTP 502s from the grafana
plugin-proxy, which led to flaky tests. See [this
comment](https://github.com/grafana/oncall/pull/2751/files#diff-09040e8df192699b9c5742110ebbe8d9d5c3938cb156cc1cb99fa1c3fdee4fefR72-R77)
for more context + a link to a relevant Slack conversation. **tldr;**
there is a bug with the Grafana plugin proxy in Grafana >= v10.0.3.
Let's stop using the `latest`/`main` docker tags in our test and pin to
`10.0.2` for now
- ~~re-enables the e2e test which validates a phone number via SMS, and
asserts that we can receive an alert escalation via SMS (new Mailslurp
API Key has been added as a repo secret)~~ update: this is still blocked
by procurement, will be done in a future PR
## Checklist
- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
# What this PR does
Add [`yamllint`](https://github.com/adrienverge/yamllint) to
`pre-commit` configuration + fix pre-existing errors
## Checklist
- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
# What this PR does
There are the following tests added:
- admin is allowed to edit other profiles
- editor is not allowed to edit other profiles
## Which issue(s) this PR fixes
https://github.com/grafana/oncall/issues/1586
## Checklist
- [x] Unit, integration, and e2e (if applicable) tests updated
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required)
- [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
---------
Co-authored-by: Rares Mardare <rares.mardare@grafana.com>
My previous PR only updated the CI that ran on PRs, I forgot about the
CD for releases.
Fixes https://github.com/grafana/oncall/actions/runs/5547818896.
You can see that this will resolve the problem because it is what allows
the PR CI to pass. I just forgot to include it in the release CD.
Signed-off-by: Jack Baldry <jack.baldry@grafana.com>
# What this PR does
The `docs/reference` shortcode supports contextual destinations and
version inference.
`<ONCALL VERSION>` is inferred to match the version of the documentation
set. For example, the inferred version for the page
/docs/grafana/oncall/latest/get-started/ is "latest". It can also be
overriden using front matter.
Given the same page, but with the additional front matter
`oncall_version: next`, the variable is substituted with "next" rather
than "latest".
Contextual destinations are achieved using repeated labels in the
shortcode inner text. The format is [<LABEL>]: "<PAGE PATH PREFIX> ->
<HUGO REFERENCE>".
- _`<LABEL>`_ matches the reference style link label used in the rest of
the text.
- _`<PAGE PATH PREFIX>`_ is matched against the page during the
production build. If the match is successful, the destination that is
used is _`<HUGO REFERENCE>`_. The first matching prefix is used, not the
longest matching prefix.
## Which issue(s) this PR fixes
- Broken links due to ambiguous relref resolution. Any relref parameter
that does not start with either `/`, `./`, or `../` can resolve
ambiguously and is resulting in broken link behavior on the current
site.
- Broken links in Grafana Cloud. We mount OnCall documentation in
Grafana Cloud. In https://github.com/grafana/website/pull/13872 the
location will become /docs/grafana-cloud/alerting-and-irm/oncall. This
PR is intended to be merged alongside that PR.
---------
Signed-off-by: Jack Baldry <jack.baldry@grafana.com>
Co-authored-by: Joey Orlando <joey.orlando@grafana.com>
# What this PR does
Lays ground work for #1586. Adds three new fixtures, `adminRolePage`,
`editorRolePage`, and `viewerRolePage`. These fixtures can be easily
accessed in a `test` context and allow the test to be run as a user
authenticated with one of these Grafana basic roles.
The bulk of the changes in the PR are to the "global setup" step. There
is a bit of logic + communication with the Grafana instance's API, in
order to create all the necessary authentication credentials.
Lastly, adds the first basic role authorization test, asserting that
Admin/Editors can view the list of OnCall users, whereas Viewers cannot.
## Checklist
- [x] Unit, integration, and e2e (if applicable) tests updated
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required)
- [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
# What this PR does
```bash
❯ mypy .
Success: no issues found in 595 source files
```
- re-enable the mypy CI check
- fixes all `django-manager-missing` mypy errors
- disable all other rules currently giving mypy errors
- changing the approach here. rather than enforcing that backend
contributors fix >= 1 `mypy` error on their PR, lets simply disable all
the rules that're currently returning errors and slowly re-enable these
one at a time #2392
## Checklist
- [ ] Unit, integration, and e2e (if applicable) tests updated (N/A)
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required) (N/A)
- [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required) (N/A)
# What this PR does
Update `rabbitmq` Docker containers used in the `docker-compose` config
files, Drone pipelines, and GitHub Actions to use version 3.12.0.
FWIW, we're already using v12.0.0 of the bitnami `rabbitmq` `helm` chart
which, by default, uses the `3.12.0-debian-11-r0` tag for the `rabbitmq`
image ([chart
docs](https://artifacthub.io/packages/helm/bitnami/rabbitmq/12.0.0)).
closes#695
## Checklist
- [ ] Unit, integration, and e2e (if applicable) tests updated (N/A)
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required) (N/A)
- [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required) (N/A)
# What this PR does
- Adds [`mypy` static type checking](https://mypy-lang.org/) to our CI
pipeline. Currently there is still a **ton** of errors being returned by
the tool, as we'll need to fix pre-existing errors. I think we can
slowly chip away at these errors in small PRs, doing them all in one
large PR is likely very risky.
- Also, this PR starts chipping away at one of the main type errors that
we have which is accessing the `datetime` class (from the `datetime`
library) or `timedelta` function on the `django.utils.timezone` module.
Basically we should be instead accessing these two objects from the
native `datetime` module. This makes sense because the [`__all__`
attribute](https://github.com/django/django/blob/main/django/utils/timezone.py#L14-L30)
in `django.utils.timezone` does not re-export `datetime` or `timedelta`.
- splits `engine` dependencies out into `requirements.txt` and
`requirements-dev.txt`
## Checklist
- [ ] Unit, integration, and e2e (if applicable) tests updated (N/A)
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required) (N/A)
- [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required) (N/A)
# What this PR does
* Upgrade to the recent Grafana
* Upgrade to the recent bitnami mariadb, rabbitmq charts which support
arm64 now
* Remove deprecated psp policies from grafana chart
* Make startupProbe period smaller to increase installation speed
## Which issue(s) this PR fixes
## Checklist
- [ ] Unit, integration, and e2e (if applicable) tests updated
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required)
- [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
# What this PR does
Fixing some bugs with external Postgresql configuration.
Also I added some unit tests for helm chart using
[helm-unittest](https://github.com/helm-unittest/helm-unittest). If it's
not an appropriate tool, please suggest another, or I can remove that
test. I added
[this](https://github.com/marketplace/actions/helm-unit-tests) Github
Action to run helm unit tests.
## Which issue(s) this PR fixes
closes#1727closes#1923closes#1245closes#845
## Checklist
- [x] Unit, integration, and e2e (if applicable) tests updated
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required)
- [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
---------
Co-authored-by: Joey Orlando <joey.orlando@grafana.com>
Co-authored-by: Joey Orlando <joseph.t.orlando@gmail.com>
Occasionally, the Playwright global setup step (which authenticates w/
the Grafana API + configures the plugin) would fail, leading to the CI
job to instantly fail (playwright doesn't retry global setup if it
fails).
My current hypothesis as to why this is happening is because the
`oncall-engine` and `oncall-celery` pods aren't _actually_ ready in
these cases based on the way the `jupyterhub/action-k8s-await-workloads`
action await k8s workloads:
<img width="1076" alt="Screenshot 2023-05-23 at 18 24 36"
src="https://github.com/grafana/oncall/assets/9406895/68d8d2d9-4274-4749-8788-e0a9a3dbad83">
By using the `kubectl rollout status deployment/<deployment-name>
--timeout=300s` instead, we can be sure that these pods are _actually_
ready to receive traffic before we start the tests.
```bash
❯ kubectl rollout status --help
Show the status of the rollout.
By default 'rollout status' will watch the status of the latest rollout until it's done. If you don't want to wait for
the rollout to finish then you can use --watch=false. Note that if a new rollout starts in-between, then 'rollout
status' will continue watching the latest revision. If you want to pin to a specific revision and abort if it is rolled
over by another revision, use --revision=N where N is the revision you need to watch for.
```
Lastly, even despite this, sometimes the `POST
/api/internal/v1/plugin/sync` endpoint will return HTTP 500 ([example
logs](https://github.com/grafana/oncall/actions/runs/5062712137/jobs/9088529416#step:19:2536)
from failed CI job). In this case, let's setup the Playwright global
setup to retry 3 times.
#1692 is still open. This PR is not an ideal approach, but it's a quick
win while we wait for that issue to be resolved.
By retrying failing tests up to 3 times, we _should_ be fine to
re-enable these on CI. If a test is failing > 3 times, there's likely a
legitimate issue occuring.
# What this PR does
- Improvement to the local development environment for the grafana
plugin
- Run initial yarn build inside the docker container with the same
version that is later used for periodic rebuilds
- Removes the requirement for having yarn/nodejs installed locally
- Using a named volume for storing the node_modules, so they are only
stored once
- Remove the yarn install step from the Dockerfile
- Ideally we store the node_modules only once inside the named volumes.
Currently they are stored times
- on the host system outside of dockerin grafana-plugins/node_modules
- inside the docker image
- inside the anonymous docker volume created at the start of a container
- update `node` to 18.16.0 (14.17.0 has reached end-of-life as of 3
weeks ago)
## Which issue(s) this PR fixes
## Checklist
- [X] ~Unit, integration, and e2e (if applicable) tests updated~ N/A
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required)
- [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
---------
Co-authored-by: Joey Orlando <joseph.t.orlando@gmail.com>
# What this PR does
## Which issue(s) this PR fixes
## Checklist
- [ ] Unit, integration, and e2e (if applicable) tests updated
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required)
- [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
# What this PR does
Upgrades the backend to Python 3.11.3 (latest stable release) + update
linting step on Drone builds to run **all** the linting steps, not just
the Python ones.
## Checklist
- [ ] Unit, integration, and e2e (if applicable) tests updated (N/A)
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required) (N/A)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)