Commit graph

18 commits

Author SHA1 Message Date
Yulya Artyukhina
503939783f
Add settings var to choose application metrics to collect (#4781)
# What this PR does
Adds settings var `METRICS_TO_COLLECT` to choose what metrics should be
collected by `ApplicationMetricsCollector`.
It allows to collect different application metrics using different
exporters.

## Which issue(s) this PR closes

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] Added the relevant release notes label (see labels prefixed w/
`release:`). These labels dictate how your PR will
    show up in the autogenerated release notes.
2024-08-12 10:37:48 +00:00
Yulya Artyukhina
4e03732fb1
Drop stale metrics cache (#4695)
# What this PR does
based on this PR - https://github.com/grafana/oncall/pull/4517

## Which issue(s) this PR closes

Related to https://github.com/grafana/oncall/issues/4455

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] Added the relevant release notes label (see labels prefixed w/
`release:`). These labels dictate how your PR will
    show up in the autogenerated release notes.
2024-07-17 17:01:04 +00:00
Yulya Artyukhina
191814b25e
User notifications bundle (#4457)
# What this PR does
This PR adds two new models: UserNotificationBundle and
BundledNotification (proposals for naming are welcome).

`UserNotificationBundle` manages the information about last notification
time and scheduled notification task for bundled notifications. It is
unique per user + notification_channel + notification importance.

`BundledNotification` contains notification policy and alert group, that
triggered the notification. The BundledNotification instance is created
in `notify_user_task` for every notification, that should be bundled,
and is attached to UserNotificationBundle by ForeignKey connection.

How it works:
If the user was notified recently (within the last two minutes) by the
current notification channel, and this channel is bundlable,
BundledNotification instance will be created and attached to the
UserNotificationBundle instance, and `send_bundled_notification` task
will be scheduled to execute in 2 min.
In `send_bundled_notification` task we get all BundledNotification
attached to the current UserNotificationBundle instance, check if alert
groups are still active and if there is only one notification - perform
regular notification by calling `perform_notification` task, otherwise
call "notify_by_<channel>_bundle" method for the current notification
channel.

PR with method to send notification bundle by SMS -
https://github.com/grafana/oncall/pull/4624

**This feature is disabled by default by feature flag. Public docs will
be added in a separate PR with enabling this feature.**
## Which issue(s) this PR closes
related to https://github.com/grafana/oncall-private/issues/2712

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] Added the relevant release notes label (see labels prefixed w/
`release:`). These labels dictate how your PR will
    show up in the autogenerated release notes.
2024-07-16 11:24:08 +00:00
Yulya Artyukhina
f583da5b56
Add service_name label to insight metrics (#4300)
# What this PR does
Adds `service_name` label to insight metrics
NOTE: It is related to [this
PR](https://github.com/grafana/oncall/pull/4227) and should be merged no
sooner than two days after the next release (current release version is
1.4.4), because we need to wait for the metrics cache to be updated for
all organizations (uses the new cache structure with `services`)

## Which issue(s) this PR closes
Related to https://github.com/grafana/oncall-private/issues/2610

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] Added the relevant release notes label (see labels prefixed w/
`release:`). These labels dictate how your PR will
    show up in the autogenerated release notes.
2024-05-22 14:17:42 +00:00
Matias Bordese
b07b140383
Avoid generating response time value metrics for empty integrations (#4339)
This should help with ongoing issue generating too big metrics payloads
(issue introduced when including service name labels).
2024-05-13 17:23:22 +00:00
Yulya Artyukhina
0790d45ab5
Fix calculating metrics from different services in metrics collector (#4297)
# What this PR does
Fix calculating metric values per integration from different services

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] Added the relevant release notes label (see labels prefixed w/
`release:`). These labels dictate how your PR will
    show up in the autogenerated release notes.
2024-04-29 12:08:53 +00:00
Yulya Artyukhina
d1085b718c
Prepare insight metrics structure for adding service_name label (#4227)
# What this PR does
Prepare insight metrics for adding `service_name` label.
This PR updates metrics cache structure, supporting both old and new
version of cache.
`service_name` label can be added with additional PR when all metric
cache is updated.

## Which issue(s) this PR closes
https://github.com/grafana/oncall-private/issues/2610

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] Added the relevant release notes label (see labels prefixed w/
`release:`). These labels dictate how your PR will
    show up in the autogenerated release notes.
2024-04-29 09:45:23 +00:00
Yulya Artyukhina
3c93375244
Update alert group state by backsync (#4089)
# What this PR does
Adds method to update alert group state by backsync
Related to https://github.com/grafana/oncall-private/issues/2542
Should be merged with
https://github.com/grafana/oncall-private/pull/2606

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] Added the relevant release notes label (see labels prefixed w/
`release:`). These labels dictate how your PR will
    show up in the autogenerated release notes.
2024-03-27 12:37:01 +00:00
Michael Derynck
2a466a0c4f
Add transaction on_commit before signals for alert group actions (#3731)
# What this PR does
Add transactions around log record creation and check transaction
on_commit before sending signals passing DB id of alert group log
records. In cases for delete we can then assume any missing IDs on tasks
are from intentionally deleted alert groups and we can stop tasks from
retrying endlessly.

## Which issue(s) this PR fixes

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2024-01-31 15:54:50 -07:00
Joey Orlando
2bb80b487e
address issue with metrics calculations when redis cluster is used (#3510)
## Which issue(s) this PR fixes

Fixes this issue we started seeing popping up because of a change
introduced in #3496:
```python3
File "/etc/app/apps/metrics_exporter/views.py", line 22, in get
    result = generate_latest(application_metrics_registry).decode("utf-8")
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/prometheus_client/exposition.py", line 198, in generate_latest
    for metric in registry.collect():
  File "/usr/local/lib/python3.11/site-packages/prometheus_client/registry.py", line 97, in collect
    yield from collector.collect()
  File "/etc/app/apps/metrics_exporter/metrics_collectors.py", line 56, in collect
    alert_groups_total, missing_org_ids_1 = self._get_alert_groups_total_metric(org_ids)
                                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/etc/app/apps/metrics_exporter/metrics_collectors.py", line 97, in _get_alert_groups_total_metric
    org_id_from_key = RE_ALERT_GROUPS_TOTAL.match(org_key).groups()[0]
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'groups'
```



```python3
>>> import re
>>> ALERT_GROUPS_TOTAL = "oncall_alert_groups_total"
>>> _RE_BASE_PATTERN = r"{{?{}}}?_(\d+)"
>>> RE_ALERT_GROUPS_TOTAL = re.compile(_RE_BASE_PATTERN.format(ALERT_GROUPS_TOTAL))
>>> org_key = "{oncall_alert_groups_total}_1"
>>> RE_ALERT_GROUPS_TOTAL.match(org_key).groups()[0]
'1'
>>> org_key = "oncall_alert_groups_total_1"
>>> RE_ALERT_GROUPS_TOTAL.match(org_key).groups()[0]
'1'
```

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required)
- [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2023-12-05 12:12:08 -05:00
Vadim Stepanov
8188dd5dd2
Create missing direct paging integrations (#3468)
# What this PR does

Makes organization sync create direct paging integrations for Grafana
teams that don't have one.

## Which issue(s) this PR fixes

Related to https://github.com/grafana/oncall-private/issues/2302

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2023-11-30 17:18:18 +00:00
Ildar Iskhakov
2fdd885abe
Move new alert group metric creation into async task (#3451)
# What this PR does

Moving metrics creation into separate task to make alert ingestion more
robust

## Which issue(s) this PR fixes

## Checklist

- [ ] Unit, integration, and e2e (if applicable) tests updated
- [ ] Documentation added (or `pr:no public docs` PR label added if not
required)
- [ ] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)

---------

Co-authored-by: Julia <ferril.darkdiver@gmail.com>
2023-11-29 12:45:36 +00:00
Joey Orlando
d6140cbe8d
Re-enable a few mypy rules + fix existing errors (#2725)
# What this PR does

Related to https://github.com/grafana/oncall/issues/2392

- Re-enable the following `mypy` rules + fix their pre-existing errors:
  - `no-redef`
  - `valid-type`
  - `var-annotated`
- Add stronger return typing to the `GrafanaAPIClient` by use of
generics + add some links to documentation in the method docstrings

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
2023-08-03 09:43:03 +00:00
Yulya Artyukhina
f1fcb41fb4
Add "user_was_notified_of_alert_groups" metric (#2334)
This PR adds new metric for Prometheus exporter
"user_was_notified_of_alert_groups" which counts how many alert groups
user was notified of.

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)

---------

Co-authored-by: Joey Orlando <joey.orlando@grafana.com>
2023-06-28 08:15:19 +00:00
Matias Bordese
f21bd8d0b5
Refactor metrics exporter to use cache.get_many (#2212)
Eventually we could also process orgs by chunks.
2023-06-14 12:14:29 +00:00
Joey Orlando
8c5f6238dc
get rid of need for FEATURE_PROMETHEUS_EXPORTER_ENABLED to be present in ci-test.py settings file (#2161) 2023-06-12 09:43:05 -04:00
Matias Bordese
cc3c18c89c
Add instructions for prometheus exporter setup (#2103) 2023-06-12 13:04:07 +00:00
Yulya Artyukhina
15ef692009
OnCall prometheus metrics exporter (#1605)
# What this PR does
Add OnCall prometheus metrics exporter

## Which issue(s) this PR fixes

## Checklist

- [x] Tests updated
- [ ] Documentation added
- [ ] `CHANGELOG.md` updated

---------

Co-authored-by: Joey Orlando <joey.orlando@grafana.com>
Co-authored-by: Matias Bordese <mbordese@gmail.com>
2023-05-25 18:26:13 +00:00