# What this PR does
Add [`yamllint`](https://github.com/adrienverge/yamllint) to
`pre-commit` configuration + fix pre-existing errors
## Checklist
- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
# What this PR does
The following tests are added:
- admin is allowed to edit other profiles
- editor is not allowed to edit other profiles
## Which issue(s) this PR fixes
https://github.com/grafana/oncall/issues/1586
## Checklist
- [x] Unit, integration, and e2e (if applicable) tests updated
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required)
- [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
---------
Co-authored-by: Rares Mardare <rares.mardare@grafana.com>
My previous PR only updated the CI that ran on PRs, I forgot about the
CD for releases.
Fixes https://github.com/grafana/oncall/actions/runs/5547818896.
You can see that this will resolve the problem because it is what allows
the PR CI to pass. I just forgot to include it in the release CD.
Signed-off-by: Jack Baldry <jack.baldry@grafana.com>
# What this PR does
The `docs/reference` shortcode supports contextual destinations and
version inference.
`<ONCALL VERSION>` is inferred to match the version of the documentation
set. For example, the inferred version for the page
/docs/grafana/oncall/latest/get-started/ is "latest". It can also be
overridden using front matter.
Given the same page, but with the additional front matter
`oncall_version: next`, the variable is substituted with "next" rather
than "latest".
Contextual destinations are achieved using repeated labels in the
shortcode inner text. The format is `[<LABEL>]: "<PAGE PATH PREFIX> ->
<HUGO REFERENCE>"`.
- _`<LABEL>`_ matches the reference style link label used in the rest of
the text.
- _`<PAGE PATH PREFIX>`_ is matched against the page during the
production build. If the match is successful, the destination that is
used is _`<HUGO REFERENCE>`_. The first matching prefix is used, not the
longest matching prefix.
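The first-match rule matters when prefixes overlap. A minimal Python sketch of the resolution order described above, assuming the mappings are evaluated in the order they are written; the `resolve_destination` helper, the paths, and the `ref-*` values are hypothetical and are not the shortcode's actual implementation:

```python
from typing import Optional, Sequence, Tuple

# Each entry is (page path prefix, Hugo reference), in the order written.
Mapping = Tuple[str, str]


def resolve_destination(page_path: str, mappings: Sequence[Mapping]) -> Optional[str]:
    """Return the reference of the first prefix that matches page_path."""
    for prefix, reference in mappings:
        if page_path.startswith(prefix):
            # First match wins, even if a later prefix is longer.
            return reference
    return None


# Hypothetical mappings for a single label.
mappings = [
    ("/docs/oncall/", "ref-a"),
    ("/docs/oncall/latest/", "ref-b"),
    ("/docs/grafana-cloud/", "ref-c"),
]

# "/docs/oncall/" is listed first, so it is chosen over the longer
# "/docs/oncall/latest/" prefix.
chosen = resolve_destination("/docs/oncall/latest/get-started/", mappings)
```

Under this assumption, more specific prefixes need to be listed before more general ones if they are meant to take effect.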
## Which issue(s) this PR fixes
- Broken links due to ambiguous relref resolution. Any relref parameter
that does not start with `/`, `./`, or `../` can resolve
ambiguously, which is causing broken links on the current
site.
- Broken links in Grafana Cloud. We mount OnCall documentation in
Grafana Cloud. In https://github.com/grafana/website/pull/13872 the
location will become /docs/grafana-cloud/alerting-and-irm/oncall. This
PR is intended to be merged alongside that PR.
---------
Signed-off-by: Jack Baldry <jack.baldry@grafana.com>
Co-authored-by: Joey Orlando <joey.orlando@grafana.com>
# What this PR does
Fixes the CODEOWNERS file which is marked as invalid because of the team
name change.
The team will need to be given write access to the repository by a
repository admin.
https://github.com/orgs/grafana/teams/docs-gops is the team.
Signed-off-by: Jack Baldry <jack.baldry@grafana.com>
# What this PR does
Lays ground work for #1586. Adds three new fixtures, `adminRolePage`,
`editorRolePage`, and `viewerRolePage`. These fixtures can be easily
accessed in a `test` context and allow the test to be run as a user
authenticated with one of these Grafana basic roles.
The bulk of the changes in the PR are to the "global setup" step. There
is a bit of logic + communication with the Grafana instance's API, in
order to create all the necessary authentication credentials.
Lastly, adds the first basic role authorization test, asserting that
Admin/Editors can view the list of OnCall users, whereas Viewers cannot.
## Checklist
- [x] Unit, integration, and e2e (if applicable) tests updated
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required)
- [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
# What this PR does
```bash
❯ mypy .
Success: no issues found in 595 source files
```
- re-enable the mypy CI check
- fixes all `django-manager-missing` mypy errors
- disable all other rules currently giving mypy errors
- changes the approach here: rather than requiring that backend
contributors fix >= 1 `mypy` error in their PR, let's simply disable all
the rules that are currently returning errors and slowly re-enable them
one at a time #2392
## Checklist
- [ ] Unit, integration, and e2e (if applicable) tests updated (N/A)
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required) (N/A)
- [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required) (N/A)
# What this PR does
Update `rabbitmq` Docker containers used in the `docker-compose` config
files, Drone pipelines, and GitHub Actions to use version 3.12.0.
FWIW, we're already using v12.0.0 of the bitnami `rabbitmq` `helm` chart
which, by default, uses the `3.12.0-debian-11-r0` tag for the `rabbitmq`
image ([chart
docs](https://artifacthub.io/packages/helm/bitnami/rabbitmq/12.0.0)).
closes #695
## Checklist
- [ ] Unit, integration, and e2e (if applicable) tests updated (N/A)
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required) (N/A)
- [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required) (N/A)
# What this PR does
- Adds [`mypy` static type checking](https://mypy-lang.org/) to our CI
pipeline. Currently the tool still returns a **ton** of errors, because
we need to fix pre-existing errors. I think we can slowly chip away at
these errors in small PRs; doing them all in one large PR is likely very
risky.
- Also, this PR starts chipping away at one of the main type errors that
we have, which is accessing the `datetime` or `timedelta` classes (from
the `datetime` library) through the `django.utils.timezone` module.
Instead, we should import these two classes from the
native `datetime` module. This makes sense because the [`__all__`
attribute](https://github.com/django/django/blob/main/django/utils/timezone.py#L14-L30)
in `django.utils.timezone` does not re-export `datetime` or `timedelta`.
- splits `engine` dependencies out into `requirements.txt` and
`requirements-dev.txt`
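The `datetime`/`timedelta` change described above amounts to importing both classes from the standard library. A minimal sketch, assuming a hypothetical `expires_at` helper (not code from this repository) and a fixed UTC timestamp so that it runs without Django:

```python
# Import the classes from the standard library instead of reaching them
# through django.utils.timezone, which does not list them in __all__.
from datetime import datetime, timedelta, timezone


def expires_at(now: datetime, ttl: timedelta) -> datetime:
    """Hypothetical helper: return the moment `ttl` after `now`."""
    return now + ttl


# In a Django project `now` would typically come from
# django.utils.timezone.now(); note that `timezone` below is the standard
# library's datetime.timezone, not the Django module.
start = datetime(2023, 1, 1, 12, 0, tzinfo=timezone.utc)
deadline = expires_at(start, timedelta(minutes=30))
```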
## Checklist
- [ ] Unit, integration, and e2e (if applicable) tests updated (N/A)
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required) (N/A)
- [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required) (N/A)
# What this PR does
* Upgrade to the recent Grafana
* Upgrade to the recent bitnami mariadb, rabbitmq charts which support
arm64 now
* Remove deprecated psp policies from grafana chart
* Make startupProbe period smaller to increase installation speed
## Which issue(s) this PR fixes
## Checklist
- [ ] Unit, integration, and e2e (if applicable) tests updated
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required)
- [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
# What this PR does
Use [this](https://github.com/actions/stale) GitHub Action to run a
cron-job every morning to triage stale pull requests. The messages
posted to the pull request when stale/closed were borrowed from the
`grafana/grafana` repo
([example](https://github.com/grafana/grafana/pull/65754)).
## Checklist
- [ ] Unit, integration, and e2e (if applicable) tests updated
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required)
- [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
# What this PR does
Fixes some bugs with the external PostgreSQL configuration.
I also added some unit tests for the helm chart using
[helm-unittest](https://github.com/helm-unittest/helm-unittest). If it's
not an appropriate tool, please suggest another, or I can remove that
test. I added
[this](https://github.com/marketplace/actions/helm-unit-tests) GitHub
Action to run helm unit tests.
## Which issue(s) this PR fixes
closes #1727, closes #1923, closes #1245, closes #845
## Checklist
- [x] Unit, integration, and e2e (if applicable) tests updated
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required)
- [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
---------
Co-authored-by: Joey Orlando <joey.orlando@grafana.com>
Co-authored-by: Joey Orlando <joseph.t.orlando@gmail.com>
Occasionally, the Playwright global setup step (which authenticates w/
the Grafana API + configures the plugin) would fail, causing the CI
job to fail instantly (Playwright doesn't retry global setup if it
fails).
My current hypothesis is that the
`oncall-engine` and `oncall-celery` pods aren't _actually_ ready in
these cases, because of the way the `jupyterhub/action-k8s-await-workloads`
action awaits k8s workloads:
<img width="1076" alt="Screenshot 2023-05-23 at 18 24 36"
src="https://github.com/grafana/oncall/assets/9406895/68d8d2d9-4274-4749-8788-e0a9a3dbad83">
By using the `kubectl rollout status deployment/<deployment-name>
--timeout=300s` instead, we can be sure that these pods are _actually_
ready to receive traffic before we start the tests.
```bash
❯ kubectl rollout status --help
Show the status of the rollout.
By default 'rollout status' will watch the status of the latest rollout until it's done. If you don't want to wait for
the rollout to finish then you can use --watch=false. Note that if a new rollout starts in-between, then 'rollout
status' will continue watching the latest revision. If you want to pin to a specific revision and abort if it is rolled
over by another revision, use --revision=N where N is the revision you need to watch for.
```
Lastly, even with this change, sometimes the `POST
/api/internal/v1/plugin/sync` endpoint will return HTTP 500 ([example
logs](https://github.com/grafana/oncall/actions/runs/5062712137/jobs/9088529416#step:19:2536)
from failed CI job). In this case, let's set up the Playwright global
setup to retry 3 times.
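The retry idea can be illustrated with a small language-agnostic sketch, written here in Python rather than in the TypeScript the global setup actually uses. It assumes "retry 3 times" means up to three attempts in total; `run_with_retries` and its parameters are hypothetical names, not part of Playwright:

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def run_with_retries(step: Callable[[], T], attempts: int = 3, delay_seconds: float = 0.0) -> T:
    """Call `step` until it succeeds or `attempts` is exhausted.

    The exception from the final attempt is re-raised unchanged.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return step()
        except Exception:
            if attempt == attempts:
                raise  # out of attempts: surface the last failure
            time.sleep(delay_seconds)  # optional pause before the next attempt
    raise AssertionError("unreachable")
```

A step that fails intermittently (such as the sync call returning HTTP 500) then only fails the job if every attempt fails.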
#1692 is still open. This PR is not an ideal approach, but it's a quick
win while we wait for that issue to be resolved.
By retrying failing tests up to 3 times, we _should_ be fine to
re-enable these on CI. If a test is failing > 3 times, there's likely a
legitimate issue occurring.
# What this PR does
- Improvement to the local development environment for the grafana
plugin
- Run initial yarn build inside the docker container with the same
version that is later used for periodic rebuilds
- Removes the requirement for having yarn/nodejs installed locally
- Using a named volume for storing the node_modules, so they are only
stored once
- Remove the yarn install step from the Dockerfile
- Ideally we store the node_modules only once inside the named volumes.
Currently they are stored three times:
  - on the host system outside of docker in grafana-plugins/node_modules
  - inside the docker image
  - inside the anonymous docker volume created at the start of a container
- update `node` to 18.16.0 (14.17.0 has reached end-of-life as of 3
weeks ago)
## Which issue(s) this PR fixes
## Checklist
- [X] ~Unit, integration, and e2e (if applicable) tests updated~ N/A
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required)
- [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
---------
Co-authored-by: Joey Orlando <joseph.t.orlando@gmail.com>
# What this PR does
## Which issue(s) this PR fixes
## Checklist
- [ ] Unit, integration, and e2e (if applicable) tests updated
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required)
- [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
# What this PR does
Upgrades the backend to Python 3.11.3 (latest stable release) + update
linting step on Drone builds to run **all** the linting steps, not just
the Python ones.
## Checklist
- [ ] Unit, integration, and e2e (if applicable) tests updated (N/A)
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required) (N/A)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
# What this PR does
Replaces `snyk test` with `snyk monitor` so results get pushed to our
Snyk platform and the [Snyk
Dashboards](https://ops.grafana-ops.net/d/H0w7l5NVk/snyk-overview?orgId=1)
get updated.
## Which issue(s) this PR fixes
## Checklist
- [ ] Tests updated
- [ ] Documentation added
- [ ] `CHANGELOG.md` updated
This PR cuts GitHub Action build times from 14-15 minutes, down to just
under 7 minutes. It does this by:
- caching `grafana-plugins/node_modules` and `pip` dependencies based on
their respective dependency files (eg. `requirements.txt` &
`yarn.lock`). This step alone saves ~3 minutes.
- get rid of the "build-engine-docker-image" and
"backend-integration-tests" jobs in the old "Integration Tests"
workflow. This was split out this way so that we could build the backend
docker image once, upload the artifact, and then reuse it across the
backend and e2e tests. We no longer need these backend integration tests
because we are testing the same thing in the e2e tests. This saves ~45
seconds of having to upload the image artifact.
- a few improvements within the integration tests themselves:
  - move plugin configuration to the `globalSetup.ts`. This means that
  every test does not need to check if the plugin has been configured
  because it is done once before all the tests are run.
  - cache the plugin frontend build. If your commit doesn't change
  anything in `grafana-plugin/src` or `grafana-plugin/yarn.lock` it should
  be safe to reuse a previously built/cached version of the plugin
  frontend. This saves ~3 minutes.
  - cache playwright binaries/dependencies. Only re-install them if the
  version of `@playwright/test` in `grafana-plugin/yarn.lock` changes.
  This saves ~3 minutes.
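The caching steps above all rely on the same idea: derive the cache key from the contents of the dependency files, so that the cache is reused until one of those files changes. A minimal Python sketch of that idea, similar in spirit to the `hashFiles` expression in GitHub Actions; the `cache_key` helper and the key prefix are hypothetical and not taken from the workflow files:

```python
import hashlib
from pathlib import Path
from typing import Iterable


def cache_key(prefix: str, dependency_files: Iterable[Path]) -> str:
    """Build a cache key that changes whenever any dependency file changes."""
    digest = hashlib.sha256()
    # Sort so that the key does not depend on the order the files are passed in.
    for path in sorted(dependency_files):
        digest.update(path.read_bytes())
    return f"{prefix}-{digest.hexdigest()}"
```

For example, `cache_key("node-modules", [Path("yarn.lock")])` stays the same across commits that leave `yarn.lock` untouched, so the cached dependencies can be restored instead of reinstalled.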
**Other things to mention**
Once we refactor the `GSelect` component to not call the `onChange`
callback on every keyDown event (#1628), this should allow us to
parallelize the integration tests, and cut the time required to execute
the tests themselves in half.
Related to #828
- Enable web UI for API/Terraform schedules to add overrides
- Refactor backend to add a flag toggling between web-based and
iCal-based overrides (these options are mutually exclusive)
Also updated read-only tooltips (related to #1483)
**What this PR does**:
Adds our first UI integration test using
[Playwright](https://playwright.dev/) and runs the test on CI. Right now
the test:
- logs into Grafana
- configures the plugin (if it isn't already)
- creates an OnCall schedule, where the current user will be OnCall
- creates an escalation chain to notify based on the newly created
OnCall schedule
- creates a webhook integration, attached to the created escalation
chain
- sends a demo alert for the new integration
- goes to the alert groups page and validates that the escalation step
to alert the OnCall user actually happened
Currently the Playwright tests are run against the 3 default headless
browsers: Chromium, Firefox, and WebKit. The CI job that runs these
tests is run as a matrix against 3 tagged versions of `grafana`: `main`,
`latest`, and `9.2.6`.
Secondly, it adds most of the logic for a second test which:
- logs into Grafana
- configures the plugin (if it isn't already)
- goes to the user's settings, verifies their phone number (using a tool
called [MailSlurp](https://www.mailslurp.com/))
- configures the current user's default escalation policy to send alerts
via SMS
- creates an escalation policy and configures it to send alerts to our
current user
- creates an integration and assigns the created escalation policy
- triggers a test alert + verifies that we receive the SMS alert text
(again, using MailSlurp)
**Which issue(s) this PR fixes**:
Closes #873
**Checklist**
- [x] Tests updated
- [ ] Documentation added (N/A)
- [ ] `CHANGELOG.md` updated (N/A)