oncall-engine

Joey Orlando eefe7be56a e2e tests on CI - actually await k8s resources to be ready before starting tests (#1997)

Occasionally, the Playwright global setup step (which authenticates with the Grafana API and configures the plugin) would fail, causing the CI job to fail immediately (Playwright doesn't retry global setup if it fails).

My current hypothesis is that the `oncall-engine` and `oncall-celery` pods aren't _actually_ ready in these cases, because of the way the `jupyterhub/action-k8s-await-workloads` action awaits k8s workloads:

<img width="1076" alt="Screenshot 2023-05-23 at 18 24 36" src="https://github.com/grafana/oncall/assets/9406895/68d8d2d9-4274-4749-8788-e0a9a3dbad83">

By using `kubectl rollout status deployment/<deployment-name> --timeout=300s` instead, we can be sure that these pods are _actually_ ready to receive traffic before we start the tests.

```bash
❯ kubectl rollout status --help
Show the status of the rollout.

By default 'rollout status' will watch the status of the latest rollout until it's done. If you don't want to wait for the rollout to finish then you can use --watch=false. Note that if a new rollout starts in-between, then 'rollout status' will continue watching the latest revision. If you want to pin to a specific revision and abort if it is rolled over by another revision, use --revision=N where N is the revision you need to watch for.
```

Lastly, even with this change, the `POST /api/internal/v1/plugin/sync` endpoint will sometimes return HTTP 500 ([example logs](https://github.com/grafana/oncall/actions/runs/5062712137/jobs/9088529416#step:19:2536) from a failed CI job). To handle this case, let's set up the Playwright global setup to retry 3 times.

2023-05-23 20:20:46 -04:00
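The two ideas in the commit message above (block until each deployment has rolled out, then retry a flaky step a bounded number of times) can be sketched in shell as follows. This is a minimal illustration, not the code from the commit: the function names `await_deployments` and `retry` and the `RETRY_DELAY_SECONDS` variable are hypothetical. The deployment names in the usage note are assumed to match the pod names mentioned above. The commit itself places the retry inside Playwright's global setup rather than in a shell wrapper.

```shell
#!/usr/bin/env bash

# Wait for each named deployment to finish rolling out.
# Usage: await_deployments <timeout> <deployment-name>...
# Returns non-zero as soon as one deployment is not ready within the timeout.
await_deployments() {
  local timeout="$1"
  shift
  local name
  for name in "$@"; do
    kubectl rollout status "deployment/${name}" --timeout="${timeout}" || return 1
  done
}

# Run a command up to <max-attempts> times, stopping at the first success.
# Usage: retry <max-attempts> <command> [args...]
# RETRY_DELAY_SECONDS (hypothetical knob) sets the pause between attempts.
retry() {
  local max_attempts="$1"
  shift
  local attempt=1
  until "$@"; do
    if [ "${attempt}" -ge "${max_attempts}" ]; then
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "${RETRY_DELAY_SECONDS:-1}"
  done
}
```

A CI step might then call `await_deployments 300s oncall-engine oncall-celery` before starting the tests, and wrap a flaky command as `retry 3 <command>`. Both functions fail fast, so the job stops with a clear non-zero exit code instead of starting tests against pods that are not ready.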
helm_release.yml	Revert "Revert "speed up ci builds from 15 to <7 minutes"" (#1643)	2023-03-28 09:34:03 +02:00
helm_release_pr.yml	Update helm_release_pr.yml	2023-01-20 16:41:51 +08:00
issue_commands.yml	Revert "Revert "speed up ci builds from 15 to <7 minutes"" (#1643)	2023-03-28 09:34:03 +02:00
issues_add_to_project.yml	GH Action to add OSS issues to team's kanban board (#1674)	2023-03-30 14:53:08 +03:00
linting-and-tests.yml	e2e tests on CI - actually await k8s resources to be ready before starting tests (#1997)	2023-05-23 20:20:46 -04:00
publish-technical-documentation-next.yml	Revert "Revert "speed up ci builds from 15 to <7 minutes"" (#1643)	2023-03-28 09:34:03 +02:00
publish-technical-documentation-release.yml	Revert "Revert "speed up ci builds from 15 to <7 minutes"" (#1643)	2023-03-28 09:34:03 +02:00
snyk.yml	Feat(Dev): Improve Building of Grafana Plugin in Development Env + update node version (#1890)	2023-05-17 16:12:51 -04:00
verify-changelog-updated.yml	don't run changelog/public-docs CI checks on merge_group Github events (#1388)	2023-02-22 16:18:25 +01:00
verify-public-docs-updated.yml	don't run changelog/public-docs CI checks on merge_group Github events (#1388)	2023-02-22 16:18:25 +01:00