Commit graph

22 commits

Author SHA1 Message Date
Ildar Iskhakov
c7a7a3f81a
Use dataclass methods in custom ratelimits and fix tests (#5036)
# What this PR does

Follow up PR for https://github.com/grafana/oncall/pull/5004
Tests haven’t caught a bug, so the method and the tests are fixed

## Which issue(s) this PR closes

Related to [issue link here]

<!--
*Note*: If you want the issue to be auto-closed once the PR is merged,
change "Related to" to "Closes" in the line above.
If you have more than one GitHub issue that this PR closes, be sure to
preface
each issue link with a [closing
keyword](https://docs.github.com/en/get-started/writing-on-github/working-with-advanced-formatting/using-keywords-in-issues-and-pull-requests#linking-a-pull-request-to-an-issue).
This ensures that the issue(s) are auto-closed once the PR has been
merged.
-->

## Checklist

- [ ] Unit, integration, and e2e (if applicable) tests updated
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required)
- [ ] Added the relevant release notes label (see labels prefixed w/
`release:`). These labels dictate how your PR will
    show up in the autogenerated release notes.
2024-09-18 13:32:16 +00:00
Ildar Iskhakov
c718863bd8
Add custom ratelimits per org (#5004)
# What this PR does

This PR refactors Throttling for public API and integrations API and
allows to specify organization ratelimits.


## Which issue(s) this PR closes

Related to [issue link here]

<!--
*Note*: If you want the issue to be auto-closed once the PR is merged,
change "Related to" to "Closes" in the line above.
If you have more than one GitHub issue that this PR closes, be sure to
preface
each issue link with a [closing
keyword](https://docs.github.com/en/get-started/writing-on-github/working-with-advanced-formatting/using-keywords-in-issues-and-pull-requests#linking-a-pull-request-to-an-issue).
This ensures that the issue(s) are auto-closed once the PR has been
merged.
-->

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] Added the relevant release notes label (see labels prefixed w/
`release:`). These labels dictate how your PR will
    show up in the autogenerated release notes.
2024-09-17 23:16:41 +00:00
Ildar Iskhakov
5ffbc18dc6
Remove Simulate Incident button (#4479)
# What this PR does

Removes "Simulate Incident" button which is replaced by "Send Demo
Alert" button

Co-authored-by: Joey Orlando <joey.orlando@grafana.com>
2024-06-07 13:54:45 +00:00
Michael Derynck
f5855915a2
Improve performance for deleted integration lookups (#4163)
# What this PR does
- Refactor alert receive channel lookup so it is easier to follow
- Remove the additional lookup that was taking place for alert receive
channels that belong to a deleted organization, these can be treated as
deleted for usage purposes even though the alert receive channel itself
does not have `deleted_at` populated
- Organizations that have been moved will still need to be looked up
everytime. This is not optimized in favor of not maintaining a cache of
Organizations. These are not frequent requests and can be optimized
later if necessary.

## Which issue(s) this PR closes

<!--
*Note*: if you have more than one GitHub issue that this PR closes, be
sure to preface
each issue link with a [closing
keyword](https://docs.github.com/en/get-started/writing-on-github/working-with-advanced-formatting/using-keywords-in-issues-and-pull-requests#linking-a-pull-request-to-an-issue).
This ensures that the issue(s) are auto-closed once the PR has been
merged.
-->

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] Added the relevant release notes label (see labels prefixed w/
`release:`). These labels dictate how your PR will
    show up in the autogenerated release notes.
2024-04-05 16:16:30 +00:00
Michael Derynck
c255ec66d8
Fix bug in ratelimit logic introduced in #4137 (#4162)
# What this PR does
Fix bug introduced in #4137

## Which issue(s) this PR closes

Closes [issue link here]

<!--
*Note*: if you have more than one GitHub issue that this PR closes, be
sure to preface
each issue link with a [closing
keyword](https://docs.github.com/en/get-started/writing-on-github/working-with-advanced-formatting/using-keywords-in-issues-and-pull-requests#linking-a-pull-request-to-an-issue).
This ensures that the issue(s) are auto-closed once the PR has been
merged.
-->

## Checklist

- [ ] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] Added the relevant release notes label (see labels prefixed w/
`release:`). These labels dictate how your PR will
    show up in the autogenerated release notes.
2024-04-04 20:34:18 +00:00
Michael Derynck
d14b1c8e28
Improve performance for rate-limited, banned and deleted integrations (#4137)
# What this PR does
- Remove BanAlertConsumptionBasedOnSettingsMiddleware under high traffic
scenarios this causes too much DB load and the same effect can be
achieved by soft-deleting the integration
- Change lookups for integrations so that non-existent and deleted
integrations are also cached to reduce DB load

## Which issue(s) this PR closes

Closes https://github.com/grafana/oncall-private/issues/2608

<!--
*Note*: if you have more than one GitHub issue that this PR closes, be
sure to preface
each issue link with a [closing
keyword](https://docs.github.com/en/get-started/writing-on-github/working-with-advanced-formatting/using-keywords-in-issues-and-pull-requests#linking-a-pull-request-to-an-issue).
This ensures that the issue(s) are auto-closed once the PR has been
merged.
-->

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] Added the relevant release notes label (see labels prefixed w/
`release:`). These labels dictate how your PR will
    show up in the autogenerated release notes.
2024-04-04 18:05:34 +00:00
Vadim Stepanov
b7e2dc14f8
Fix ratelimit bug (#4108)
# What this PR does

Fixes a bug in the ratelimit logic when integration-specific ratelimit
429s are still counted towards the organization-wide ratelimit.

## Which issue(s) this PR closes

Related to https://github.com/grafana/support-escalations/issues/9579

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] Added the relevant release notes label (see labels prefixed w/
`release:`). These labels dictate how your PR will
    show up in the autogenerated release notes.
2024-03-26 17:20:05 +00:00
Matias Bordese
dbd5452a0b
Handle a possible outdated cached integration error (#3741)
Related to
[logs](https://ops.grafana-ops.net/explore?schemaVersion=1&panes=%7B%22hum%22:%7B%22datasource%22:%22c-R8UWvVk%22,%22queries%22:%5B%7B%22refId%22:%22A%22,%22expr%22:%22%7Bnamespace%3D%5C%22amixr-prod%5C%22,%20job%3D%5C%22amixr-prod%2Famixr-integrations%5C%22%7D%20%7C%3D%20%5C%22django.core.serializers.base.DeserializationError%5C%22%22,%22queryType%22:%22range%22,%22datasource%22:%7B%22type%22:%22loki%22,%22uid%22:%22c-R8UWvVk%22%7D,%22editorMode%22:%22code%22%7D%5D,%22range%22:%7B%22from%22:%221706023840486%22,%22to%22:%221706024722486%22%7D%7D%7D&orgId=1)
2024-01-23 20:46:12 +00:00
Ildar Iskhakov
95a3ab3b75
Revert "Cache independent ingestion" (#3417)
Reverts grafana/oncall#3415
2023-11-23 21:38:06 +08:00
Ildar Iskhakov
566e8c53ba Ignore typing checks for imported library (https://mypy.readthedocs.io/en/stable/running_mypy.html\#missing-library-stubs-or-py-typed-marker) 2023-11-23 16:14:30 +08:00
Ildar Iskhakov
0d5ef785bf Make alert ingestion cache independent 2023-11-23 11:27:47 +08:00
Matvey Kukuy
a898835eb4 Fixing ratelimit texts 2023-09-27 14:18:00 +03:00
Joey Orlando
df21be3a50
add more integration tests for integrations api (#2845) 2023-08-21 15:40:29 +02:00
Vadim Stepanov
b2f4ffb98a
apps.get_model -> import (#2619)
# What this PR does

Remove
[`apps.get_model`](https://docs.djangoproject.com/en/3.2/ref/applications/#django.apps.apps.get_model)
invocations and use inline `import` statements in places where models
are imported within functions/methods to avoid circular imports.

I believe `import` statements are more appropriate for most use cases as
they allow for better static code analysis & formatting, and solve the
issue of circular imports without being unnecessarily dynamic as
`apps.get_model`. With `import` statements, it's possible to:

- Jump to model definitions in most IDEs
- Automatically sort inline imports with `isort`
- Find import errors faster/easier (most IDEs highlight broken imports)
- Have more consistency across regular & inline imports when importing
models

This PR also adds a flake8 rule to ban imports of `django.apps.apps`, so
it's harder to use `apps.get_model` by mistake (it's possible to ignore
this rule by using `# noqa: I251`). The rule is not enforced on
directories with migration files, because `apps.get_model` is often used
to get a historical state of a model, which is useful when writing
migrations ([see this SO answer for more
details](https://stackoverflow.com/a/37769213)). So `apps.get_model` is
considered OK in migrations (even necessary in some cases).

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2023-07-25 09:43:23 +00:00
Innokentii Konstantinov
056b0ddc7e
Add ratelimit for AmazonSNS (#2032)
Adds a ratelimit for AmazonSNS. 
AlertChannelDefining mixin is now injecting alert_receive_channel only
in request, not in kwargs to not to break AmazonSNS.
2023-05-26 09:57:26 +00:00
Joey Orlando
014a9c2ec2
allow the POST incoming alert endpoints to queue create_alert tasks independent of the database status (#1896)
# What this PR does

https://www.loom.com/share/18cc445117de4895a10892d56c7d3699

In preparation to upgrade our cloud databases, this PR makes some minor
changes which, after testing locally, allowed the `POST
/<integration_type>/<alert_channel_key>` endpoints to successfully
receive incoming alerts and queue the celery tasks.

I've tested all of the defined `POST
/integrations/v1/<integration_type>/<alert_channel_key>` endpoints by
sending `POST` requests to an integrations' URL while the MySQL database
was down, bringing the database back up, and ensuring the alerts were
created.

## Some other findings
- the integration heartbeat endpoints will not work as we interact w/
the database to persist the incoming heartbeat instance
- if the integration was created in the last 180 seconds, incoming
alerts will fail due to the way we cache the integration IDs
([code](https://github.com/grafana/oncall/blob/dev/engine/apps/integrations/mixins/alert_channel_defining_mixin.py#L47-L50))
- The `create_alert` celery task is set to `max_retries=None` and
`retry_backoff=True`. This means that the queued tasks will continue
retrying forever w/ an exponential backoff, until the alerts can be
created in the database (ie. when the database is back online).

## Checklist

- [ ] Unit, integration, and e2e (if applicable) tests updated (N/A)
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required) (N/A)
- [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required) (N/A)
2023-05-10 12:36:23 +00:00
Michael Derynck
49946e6a4e
Change Organization Deleted/Moved Precedence (#1402)
# What this PR does
When an organization is migrated to a different cluster it has it's
`migration_destination_slug` set for redirection purposes but it also
needs to be deleted so scheduled tasks for it do not run in the old
cluster. By changing the order so moved has precedence over deleted API
calls will be correctly redirected for moved organizations while the
organization is still considered deleted to suppress tasks that are no
longer needed in the old cluster.

## Which issue(s) this PR fixes

## Checklist

- [ ] Tests updated
- [ ] Documentation added
- [ ] `CHANGELOG.md` updated
2023-02-24 11:45:21 +00:00
Innokentii Konstantinov
8abbcee050
Org soft-delete (#1073)
# What this PR does
It introduces soft-delete of organization, since grafana stacks are
soft-deleted too. Also, we had a problem with deleting orgs with large
amounts of alerts, so soft-deletion will fix this problem. I think, that
problem of cleaning alerts of deleted orgs should be solved as a part of
alert retention
2023-01-05 12:42:55 +08:00
Michael Derynck
6267e31b22 Check id instead of object to avoid unnecessary query 2022-10-28 15:45:51 -06:00
Michael Derynck
febe1b2185 Add basic organization moved exception handling and middleware 2022-10-20 15:04:58 -06:00
Michael Derynck
ce8f4e53fa
Conform URLs (#281)
* Make any URLs build from env vars tolerant of path prefix, trailing/leading slashes

* Add comment

* Lint
2022-07-25 09:12:50 -06:00
Michael Derynck
6b40f95033 World, meet OnCall!
Co-authored-by: Eve832 <eve.meelan@grafana.com>
    Co-authored-by: Francisco Montes de Oca <nevermind89x@gmail.com>
    Co-authored-by: Ildar Iskhakov <ildar.iskhakov@grafana.com>
    Co-authored-by: Innokentii Konstantinov <innokenty.konstantinov@grafana.com>
    Co-authored-by: Julia <ferril.darkdiver@gmail.com>
    Co-authored-by: maskin25 <kengurek@gmail.com>
    Co-authored-by: Matias Bordese <mbordese@gmail.com>
    Co-authored-by: Matvey Kukuy <motakuk@gmail.com>
    Co-authored-by: Michael Derynck <michael.derynck@grafana.com>
    Co-authored-by: Richard Hartmann <richih@richih.org>
    Co-authored-by: Robby Milo <robbymilo@fastmail.com>
    Co-authored-by: Timur Olzhabayev <timur.olzhabayev@grafana.com>
    Co-authored-by: Vadim Stepanov <vadimkerr@gmail.com>
    Co-authored-by: Yulia Shanyrova <yulia.shanyrova@grafana.com>
2022-06-03 08:09:47 -06:00