modify check_escalation_finished_task task (#1266)
# What this PR does

This PR:

- modifies the `check_escalation_finished_task` celery task to:
  - do stricter escalation validation based on the alert group's escalation snapshot (see the `audit_alert_group_escalation` method in `engine/apps/alerts/tasks/check_escalation_finished.py` for the validation logic)
  - use a read-only database for querying alert groups if one is configured, otherwise use the "default" one
  - ping a configurable heartbeat (new env var `ALERT_GROUP_ESCALATION_AUDITOR_CELERY_TASK_HEARTBEAT_URL` added)
  - increase the task frequency from every 10 to every 13 minutes (this can be configured via an env variable)
- adds public documentation on how to configure this auditor task
- modifies the local celery startup command to properly take into consideration all celery related env vars (similar to the ones we use in `engine/celery_with_exporter.sh`; this made it easier to enable `celery beat` locally for testing)
- removes the following code:
  - references to `AlertGroup.estimate_escalation_finish_time`, marking the model field as deprecated using the [`django-deprecate-fields` library](https://pypi.org/project/django-deprecate-fields/). This field was only used for the previous version of this validation task
  - `EscalationSnapshotMixin.calculate_eta_for_finish_escalation`, which was only used to calculate the value for `AlertGroup.estimate_escalation_finish_time`
  - the `calculate_escalation_finish_time` celery task

## Which issue(s) this PR fixes

https://github.com/grafana/oncall-private/issues/1558

## Checklist

- [x] Tests updated
- [x] Documentation added
- [x] `CHANGELOG.md` updated
This commit is contained in:
parent
f85f77dbc7
commit
4d655dff60
21 changed files with 1369 additions and 254 deletions
@@ -7,6 +7,11 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

 ## Unreleased

+### Added
+
+- Modified `check_escalation_finished_task` celery task to use read-only databases for its query, if one is defined +
+  make the validation logic stricter + ping a configurable heartbeat on successful completion of this task ([1266](https://github.com/grafana/oncall/pull/1266))
+
 ### Changed

 - Updated wording in some Slack messages to use 'Alert Group' instead of 'Incident' ([1565](https://github.com/grafana/oncall/pull/1565))
@@ -29,10 +29,11 @@ GRAFANA_INCIDENT_STATIC_API_KEY=

 GRAFANA_API_URL=http://localhost:3000

 CELERY_WORKER_QUEUE="default,critical,long,slack,telegram,webhook,retry,celery"
-CELERY_WORKER_CONCURRENCY=1
+CELERY_WORKER_CONCURRENCY=3
 CELERY_WORKER_MAX_TASKS_PER_CHILD=100
 CELERY_WORKER_SHUTDOWN_INTERVAL=65m
 CELERY_WORKER_BEAT_ENABLED=True
 CELERY_WORKER_DEBUG_LOGS=False

 RABBITMQ_USERNAME=rabbitmq
 RABBITMQ_PASSWORD=rabbitmq
@@ -243,7 +243,7 @@ The limit can be changed using env variables:

 ## Mobile application set up

->**Note**: This application is currently in beta
+> **Note**: This application is currently in beta

 Grafana OnCall OSS users can use the mobile app to receive push notifications from OnCall.
 Grafana OnCall OSS relies on Grafana Cloud as a relay for push notifications.
@@ -255,3 +255,29 @@ For Grafana OnCall OSS, the mobile app QR code includes an authentication token

 Your Grafana OnCall OSS instance should be reachable from the same network as your mobile device, preferably from the internet.

 For more information, see [Grafana OnCall mobile app]({{< relref "../mobile-app" >}})

+## Alert Group Escalation Auditor
+
+Grafana OnCall has a periodic background task, which runs to check that all alert group escalations have finished
+properly. This feature, if configured, can also ping an OnCall Webhook Integration's heartbeat URL, so that you can be
+alerted in the event that something goes wrong.
+
+Logs originating from the celery worker, for the `apps.alerts.tasks.check_escalation_finished.check_escalation_finished_task`
+task, that reference an `AlertGroupEscalationPolicyExecutionAuditException` exception
+indicate that the auditor periodic task is failing check(s) on one or more alert groups. Logs for this task which
+mention `.. passed the audit checks` indicate that there were no issues with the escalation on the audited
+alert groups.
+
+To configure this feature:
+
+1. Create a Webhook, or Formatted Webhook, Integration type.
+1. Under the "Heartbeat" tab in the Integration modal, copy the unique heartbeat URL that is shown.
+1. Set the heartbeat's expected time interval to 15 minutes (see note below regarding `ALERT_GROUP_ESCALATION_AUDITOR_CELERY_TASK_HEARTBEAT_INTERVAL`)
+1. Configure the integration's escalation chain as necessary
+1. Populate the following env variables:
+
+   - `ALERT_GROUP_ESCALATION_AUDITOR_CELERY_TASK_HEARTBEAT_URL` - the integration's unique heartbeat URL
+   - `ALERT_GROUP_ESCALATION_AUDITOR_CELERY_TASK_HEARTBEAT_INTERVAL` - how often the auditor task should run. By default the
+     task runs every 13 minutes, so we recommend setting the heartbeat's expected time interval to 15 minutes. If you
+     would like to modify this, we recommend configuring this env variable to 1 or 2 minutes less than the value set for the
+     integration's heartbeat expected time interval.
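The heartbeat behaviour described above amounts to a conditional GET against the configured URL. A minimal, stdlib-only sketch (the env var name is from this PR; the function name and use of `urllib` instead of `requests` are illustrative):

```python
import os
import urllib.request

# Read at import time; unset means "heartbeat not configured".
HEARTBEAT_URL = os.environ.get("ALERT_GROUP_ESCALATION_AUDITOR_CELERY_TASK_HEARTBEAT_URL")


def ping_heartbeat() -> bool:
    """GET the configured heartbeat URL; return True if a ping was attempted."""
    if not HEARTBEAT_URL:
        return False  # no heartbeat configured; the auditor runs without pinging
    # urlopen raises urllib.error.HTTPError on 4xx/5xx responses,
    # mirroring requests' raise_for_status() behaviour
    urllib.request.urlopen(HEARTBEAT_URL, timeout=10).close()
    return True
```

With the env var unset, `ping_heartbeat()` simply returns `False`, which is why the auditor can be deployed without a heartbeat and have it wired in later.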
@@ -1,3 +1,4 @@
+import datetime
 import logging
 from typing import Optional
@@ -5,19 +6,16 @@ import pytz
 from celery import uuid as celery_uuid
 from dateutil.parser import parse
-from django.apps import apps
 from django.utils import timezone
 from django.utils.functional import cached_property
 from rest_framework.exceptions import ValidationError

-from apps.alerts.constants import NEXT_ESCALATION_DELAY
 from apps.alerts.escalation_snapshot.snapshot_classes import (
     ChannelFilterSnapshot,
     EscalationChainSnapshot,
     EscalationPolicySnapshot,
     EscalationSnapshot,
 )
-from apps.alerts.escalation_snapshot.utils import eta_for_escalation_step_notify_if_time
-from apps.alerts.tasks import calculate_escalation_finish_time, escalate_alert_group
+from apps.alerts.tasks import escalate_alert_group

 logger = logging.getLogger(__name__)
@@ -90,8 +88,7 @@ class EscalationSnapshotMixin:
             'next_step_eta': '2021-10-18T10:28:28.890369Z'
         }
         """
-        escalation_snapshot = None
         data = {}

         if self.escalation_chain_exists:
             channel_filter = self.channel_filter
@@ -104,53 +101,7 @@ class EscalationSnapshotMixin:
             "escalation_policies_snapshots": escalation_policies,
             "slack_channel_id": self.slack_channel_id,
         }
-        escalation_snapshot = EscalationSnapshot.serializer(data).data
-        return escalation_snapshot
-
-    def calculate_eta_for_finish_escalation(self, escalation_started=False, start_time=None):
-        if not self.escalation_snapshot:
-            return
-        EscalationPolicy = apps.get_model("alerts", "EscalationPolicy")
-        TOLERANCE_SECONDS = 1
-        TOLERANCE_TIME = timezone.timedelta(seconds=NEXT_ESCALATION_DELAY + TOLERANCE_SECONDS)
-        start_time = start_time or timezone.now()  # start time may be different for silenced incidents
-        wait_summ = timezone.timedelta()
-        # Get next_active_escalation_policy_order using flag `escalation_started` because this calculation can be
-        # started in parallel with escalation task where next_active_escalation_policy_order can be changed.
-        # That's why we are using `escalation_started` flag here, which means, that we want count eta from the first
-        # step.
-        next_escalation_policy_order = (
-            self.escalation_snapshot.next_active_escalation_policy_order if escalation_started else 0
-        )
-        escalation_policies = self.escalation_snapshot.escalation_policies_snapshots[next_escalation_policy_order:]
-        for escalation_policy in escalation_policies:
-            if escalation_policy.step == EscalationPolicy.STEP_WAIT:
-                if escalation_policy.wait_delay is not None:
-                    wait_summ += escalation_policy.wait_delay
-                else:
-                    wait_summ += EscalationPolicy.DEFAULT_WAIT_DELAY  # Default wait in case it's not selected yet
-            elif escalation_policy.step == EscalationPolicy.STEP_NOTIFY_IF_TIME:
-                if escalation_policy.from_time and escalation_policy.to_time:
-                    estimate_start_time = start_time + wait_summ
-                    STEP_TOLERANCE = timezone.timedelta(minutes=1)
-                    next_step_estimate_start_time = eta_for_escalation_step_notify_if_time(
-                        escalation_policy.from_time,
-                        escalation_policy.to_time,
-                        estimate_start_time + STEP_TOLERANCE,
-                    )
-                    wait_summ += next_step_estimate_start_time - estimate_start_time
-            elif escalation_policy.step == EscalationPolicy.STEP_REPEAT_ESCALATION_N_TIMES:
-                # the part of escalation with repeat step will be passed six times: the first time plus five repeats
-                wait_summ *= EscalationPolicy.MAX_TIMES_REPEAT + 1
-            elif escalation_policy.step == EscalationPolicy.STEP_NOTIFY_IF_NUM_ALERTS_IN_TIME_WINDOW:
-                # In this case we cannot calculate finish time, so we return None
-                return
-            elif escalation_policy.step == EscalationPolicy.STEP_FINAL_RESOLVE:
-                break
-        wait_summ += TOLERANCE_TIME
-
-        escalation_finish_time = start_time + wait_summ
-        return escalation_finish_time
+        return EscalationSnapshot.serializer(data).data

     @property
     def channel_filter_with_respect_to_escalation_snapshot(self):
@@ -166,38 +117,52 @@ class EscalationSnapshotMixin:

     @cached_property
     def channel_filter_snapshot(self) -> Optional[ChannelFilterSnapshot]:
-        # in some cases we need only channel filter and don't want to serialize whole escalation
-        channel_filter_snapshot_object = None
+        """
+        in some cases we need only channel filter and don't want to serialize whole escalation
+        """
         escalation_snapshot = self.raw_escalation_snapshot
-        if escalation_snapshot is not None:
-            channel_filter_snapshot = ChannelFilterSnapshot.serializer().to_internal_value(
-                escalation_snapshot["channel_filter_snapshot"]
-            )
-            channel_filter_snapshot_object = ChannelFilterSnapshot(**channel_filter_snapshot)
-        return channel_filter_snapshot_object
+        if not escalation_snapshot:
+            return None
+
+        channel_filter_snapshot = escalation_snapshot["channel_filter_snapshot"]
+        if not channel_filter_snapshot:
+            return None
+
+        channel_filter_snapshot = ChannelFilterSnapshot.serializer().to_internal_value(channel_filter_snapshot)
+        return ChannelFilterSnapshot(**channel_filter_snapshot)

     @cached_property
     def escalation_chain_snapshot(self) -> Optional[EscalationChainSnapshot]:
-        # in some cases we need only escalation chain and don't want to serialize whole escalation
-        escalation_chain_snapshot_object = None
+        """
+        in some cases we need only escalation chain and don't want to serialize whole escalation
+        """
         escalation_snapshot = self.raw_escalation_snapshot
-        if escalation_snapshot is not None:
-            escalation_chain_snapshot = EscalationChainSnapshot.serializer().to_internal_value(
-                escalation_snapshot["escalation_chain_snapshot"]
-            )
-            escalation_chain_snapshot_object = EscalationChainSnapshot(**escalation_chain_snapshot)
-        return escalation_chain_snapshot_object
+        if not escalation_snapshot:
+            return None
+
+        escalation_chain_snapshot = escalation_snapshot["escalation_chain_snapshot"]
+        if not escalation_chain_snapshot:
+            return None
+
+        escalation_chain_snapshot = EscalationChainSnapshot.serializer().to_internal_value(escalation_chain_snapshot)
+        return EscalationChainSnapshot(**escalation_chain_snapshot)

     @cached_property
     def escalation_snapshot(self) -> Optional[EscalationSnapshot]:
-        escalation_snapshot_object = None
         raw_escalation_snapshot = self.raw_escalation_snapshot
-        if raw_escalation_snapshot is not None:
+        if raw_escalation_snapshot:
             try:
-                escalation_snapshot_object = self._deserialize_escalation_snapshot(raw_escalation_snapshot)
+                return self._deserialize_escalation_snapshot(raw_escalation_snapshot)
             except ValidationError as e:
                 logger.error(f"Error trying to deserialize raw escalation snapshot: {e}")
-        return escalation_snapshot_object
+        return None

+    @cached_property
+    def has_escalation_policies_snapshots(self) -> bool:
+        if not self.raw_escalation_snapshot:
+            return False
+        return len(self.raw_escalation_snapshot["escalation_policies_snapshots"]) > 0
+
     def _deserialize_escalation_snapshot(self, raw_escalation_snapshot) -> EscalationSnapshot:
         """
@@ -225,20 +190,34 @@ class EscalationSnapshotMixin:
         return escalation_snapshot_object

     @property
-    def escalation_chain_exists(self):
-        return not self.pause_escalation and self.channel_filter and self.channel_filter.escalation_chain
+    def escalation_chain_exists(self) -> bool:
+        if self.pause_escalation:
+            return False
+        elif not self.channel_filter:
+            return False
+        return self.channel_filter.escalation_chain is not None

     @property
-    def pause_escalation(self):
-        # get pause_escalation field directly to avoid serialization overhead
-        return self.raw_escalation_snapshot is not None and self.raw_escalation_snapshot.get("pause_escalation", False)
+    def pause_escalation(self) -> bool:
+        """
+        get pause_escalation field directly to avoid serialization overhead
+        """
+        if not self.raw_escalation_snapshot:
+            return False
+        return self.raw_escalation_snapshot.get("pause_escalation", False)

     @property
-    def next_step_eta(self):
-        # get next_step_eta field directly to avoid serialization overhead
-        raw_next_step_eta = (
-            self.raw_escalation_snapshot.get("next_step_eta") if self.raw_escalation_snapshot is not None else None
-        )
+    def next_step_eta(self) -> Optional[datetime.datetime]:
+        """
+        get next_step_eta field directly to avoid serialization overhead
+        """
+        if not self.raw_escalation_snapshot:
+            return None
+
+        raw_next_step_eta = self.raw_escalation_snapshot.get("next_step_eta")
+        if not raw_next_step_eta:
+            return None
+
-        if raw_next_step_eta:
-            return parse(raw_next_step_eta).replace(tzinfo=pytz.UTC)
+        return parse(raw_next_step_eta).replace(tzinfo=pytz.UTC)
@@ -272,13 +251,10 @@ class EscalationSnapshotMixin:
             is_escalation_finished=False,
             raw_escalation_snapshot=raw_escalation_snapshot,
         )
-        if not self.pause_escalation:
-            calculate_escalation_finish_time.apply_async((self.pk,), immutable=True)
         escalate_alert_group.apply_async((self.pk,), countdown=countdown, immutable=True, eta=eta, task_id=task_id)

     def stop_escalation(self):
         self.is_escalation_finished = True
-        self.estimate_escalation_finish_time = None
         # change active_escalation_id to prevent alert escalation
         self.active_escalation_id = "intentionally_stopped"
-        self.save(update_fields=["is_escalation_finished", "estimate_escalation_finish_time", "active_escalation_id"])
+        self.save(update_fields=["is_escalation_finished", "active_escalation_id"])
@@ -8,11 +8,11 @@ from apps.alerts.escalation_snapshot.serializers import (


 class EscalationSnapshotSerializer(serializers.Serializer):
-    channel_filter_snapshot = ChannelFilterSnapshotSerializer()
-    escalation_chain_snapshot = EscalationChainSnapshotSerializer()
+    channel_filter_snapshot = ChannelFilterSnapshotSerializer(allow_null=True, default=None)
+    escalation_chain_snapshot = EscalationChainSnapshotSerializer(allow_null=True, default=None)
     last_active_escalation_policy_order = serializers.IntegerField(allow_null=True, default=None)
-    escalation_policies_snapshots = EscalationPolicySnapshotSerializer(many=True)
-    slack_channel_id = serializers.CharField(allow_null=True)
+    escalation_policies_snapshots = EscalationPolicySnapshotSerializer(many=True, default=list)
+    slack_channel_id = serializers.CharField(allow_null=True, default=None)
     pause_escalation = serializers.BooleanField(allow_null=True, default=False)
     next_step_eta = serializers.DateTimeField(allow_null=True, default=None)
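The serializer changes above make every snapshot field tolerant of absent or null input, which is what lets the auditor deserialize older, partially-populated snapshots. Stripped of DRF, the effect is just per-key defaulting; a plain-Python sketch (names assumed, not the DRF machinery itself):

```python
# Defaults mirroring the updated EscalationSnapshotSerializer fields.
SNAPSHOT_DEFAULTS = {
    "channel_filter_snapshot": None,
    "escalation_chain_snapshot": None,
    "last_active_escalation_policy_order": None,
    "escalation_policies_snapshots": [],
    "slack_channel_id": None,
    "pause_escalation": False,
    "next_step_eta": None,
}


def with_defaults(raw: dict) -> dict:
    """Fill in a default for any snapshot key that is absent from the input."""
    return {key: raw.get(key, default) for key, default in SNAPSHOT_DEFAULTS.items()}
```

Before this change, a snapshot missing e.g. `slack_channel_id` would fail validation outright; with `allow_null`/`default` set, it deserializes to the default instead.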
@@ -1,15 +1,23 @@
 import logging
-from typing import Optional
+import typing

 from celery.utils.log import get_task_logger
+from django.utils import timezone

 from apps.alerts.escalation_snapshot.serializers import EscalationSnapshotSerializer
-from apps.alerts.escalation_snapshot.snapshot_classes.escalation_policy_snapshot import EscalationPolicySnapshot
-from apps.alerts.models.alert_group_log_record import AlertGroupLogRecord

 logger = get_task_logger(__name__)
 logger.setLevel(logging.DEBUG)

+if typing.TYPE_CHECKING:
+    from apps.alerts.escalation_snapshot.snapshot_classes import (
+        ChannelFilterSnapshot,
+        EscalationChainSnapshot,
+        EscalationPolicySnapshot,
+    )
+    from apps.alerts.models import AlertGroup
+

 class EscalationSnapshot:
     __slots__ = (
@@ -28,34 +36,34 @@ class EscalationSnapshot:

     def __init__(
         self,
-        alert_group,
-        channel_filter_snapshot,
-        escalation_chain_snapshot,
-        last_active_escalation_policy_order,
-        escalation_policies_snapshots,
-        slack_channel_id,
-        pause_escalation,
-        next_step_eta,
+        alert_group: "AlertGroup",
+        channel_filter_snapshot: "ChannelFilterSnapshot",
+        escalation_chain_snapshot: "EscalationChainSnapshot",
+        last_active_escalation_policy_order: int,
+        escalation_policies_snapshots: typing.List["EscalationPolicySnapshot"],
+        slack_channel_id: str,
+        pause_escalation: bool,
+        next_step_eta: typing.Optional[str],
     ):
         self.alert_group = alert_group
-        self.channel_filter_snapshot = channel_filter_snapshot  # ChannelFilterSnapshot object
-        self.escalation_chain_snapshot = escalation_chain_snapshot  # EscalationChainSnapshot object
+        self.channel_filter_snapshot = channel_filter_snapshot
+        self.escalation_chain_snapshot = escalation_chain_snapshot
         self.last_active_escalation_policy_order = last_active_escalation_policy_order
-        self.escalation_policies_snapshots = escalation_policies_snapshots  # list of EscalationPolicySnapshot objects
+        self.escalation_policies_snapshots = escalation_policies_snapshots
         self.slack_channel_id = slack_channel_id
         self.pause_escalation = pause_escalation
         self.next_step_eta = next_step_eta
         self.stop_escalation = False

     @property
-    def last_active_escalation_policy_snapshot(self) -> Optional[EscalationPolicySnapshot]:
+    def last_active_escalation_policy_snapshot(self) -> typing.Optional["EscalationPolicySnapshot"]:
         order = self.last_active_escalation_policy_order
         if order is None:
             return None
         return self.escalation_policies_snapshots[order]

     @property
-    def next_active_escalation_policy_snapshot(self) -> Optional[EscalationPolicySnapshot]:
+    def next_active_escalation_policy_snapshot(self) -> typing.Optional["EscalationPolicySnapshot"]:
         order = self.next_active_escalation_policy_order
         if len(self.escalation_policies_snapshots) < order + 1:
             next_link = None
@@ -71,6 +79,31 @@ class EscalationSnapshot:
             next_order = self.last_active_escalation_policy_order + 1
         return next_order

+    @property
+    def executed_escalation_policy_snapshots(self) -> typing.List["EscalationPolicySnapshot"]:
+        """
+        Returns a list of escalation policy snapshots that have already been executed, according
+        to the value of last_active_escalation_policy_order
+        """
+        if self.last_active_escalation_policy_order is None:
+            return []
+        elif self.last_active_escalation_policy_order == 0:
+            return [self.escalation_policies_snapshots[0]]
+        return self.escalation_policies_snapshots[: self.last_active_escalation_policy_order]
+
+    def next_step_eta_is_valid(self) -> typing.Union[None, bool]:
+        """
+        `next_step_eta` should never be less than the current time (with a 5 minute buffer provided),
+        as this field should be updated as the escalation policy is executed over time. If it is, this means that
+        an escalation policy step has been missed, or is substantially delayed.
+
+        If `next_step_eta` is `None` then `None` is returned, otherwise a boolean is returned
+        representing the result of the time comparison.
+        """
+        if self.next_step_eta is None:
+            return None
+        return self.next_step_eta > (timezone.now() - timezone.timedelta(minutes=5))
+
     def save_to_alert_group(self) -> None:
         self.alert_group.raw_escalation_snapshot = self.convert_to_dict()
         self.alert_group.save(update_fields=["raw_escalation_snapshot"])
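The tri-state check added above is the core of the stricter validation: `None` means "no ETA to check", `True` means the ETA is still in the (buffered) future, and `False` flags a missed or delayed step. A framework-free sketch of the same comparison, with the clock injectable for testing (function and parameter names are illustrative):

```python
from datetime import datetime, timedelta, timezone

# Same 5 minute grace period as next_step_eta_is_valid above.
BUFFER = timedelta(minutes=5)


def next_step_eta_is_valid(next_step_eta, now=None):
    """Return None when there is no ETA; otherwise whether the ETA
    is not more than BUFFER in the past relative to `now`."""
    if next_step_eta is None:
        return None
    now = now or datetime.now(timezone.utc)
    return next_step_eta > (now - BUFFER)
```

An ETA four minutes in the past still passes (it is within the buffer); six minutes in the past fails and would trip the audit.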
@@ -83,7 +116,6 @@ class EscalationSnapshot:
         Executes actual escalation step and saves result of execution like stop_escalation param and eta,
         that will be used for start next escalate_alert_group task.
         Also updates self.last_active_escalation_policy_order if escalation step was executed.
-        :return: None
         """
         escalation_policy_snapshot = self.next_active_escalation_policy_snapshot
         if escalation_policy_snapshot is None:
@@ -134,7 +134,7 @@ class IncidentLogBuilder:
         # check if escalation snapshot wasn't saved and channel filter was deleted.
         # We cannot generate escalation plan in this case
-        escalation_snapshot = self.alert_group.escalation_snapshot
-        if escalation_snapshot is None:
+        if not self.alert_group.has_escalation_policies_snapshots:
             return escalation_plan_dict

         if self.alert_group.silenced_until:
@@ -13,6 +13,7 @@ from django.db import IntegrityError, models, transaction
 from django.db.models import JSONField, Q, QuerySet
 from django.utils import timezone
 from django.utils.functional import cached_property
+from django_deprecate_fields import deprecate_field

 from apps.alerts.escalation_snapshot import EscalationSnapshotMixin
 from apps.alerts.incident_appearance.renderers.constants import DEFAULT_BACKUP_TITLE
@@ -336,7 +337,9 @@ class AlertGroup(AlertGroupSlackRenderingMixin, EscalationSnapshotMixin, models.
     maintenance_uuid = models.CharField(max_length=100, unique=True, null=True, default=None)

     raw_escalation_snapshot = JSONField(null=True, default=None)
-    estimate_escalation_finish_time = models.DateTimeField(null=True, default=None)
+
+    # THIS FIELD IS DEPRECATED AND SHOULD EVENTUALLY BE REMOVED
+    estimate_escalation_finish_time = deprecate_field(models.DateTimeField(null=True, default=None))

     # This field is used for constraints so we can use get_or_create() in concurrent calls
     # https://docs.djangoproject.com/en/3.2/ref/models/querysets/#get-or-create
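The `deprecate_field` wrapper above keeps the database column in place while making Python-level reads of the attribute log a warning and return a placeholder, so stray usages surface in logs before the column is finally dropped. A rough, Django-free sketch of that idea (this is an assumed simplification, not the library's actual implementation):

```python
import logging

logger = logging.getLogger(__name__)


class deprecated_attribute:
    """Descriptor sketch: warn whenever the deprecated attribute is read, return None."""

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, obj, objtype=None):
        logger.warning("accessing deprecated field %s", self.name)
        return None


class AlertGroupSketch:
    # stand-in for the deprecated model field
    estimate_escalation_finish_time = deprecated_attribute()
```

Any code still touching `estimate_escalation_finish_time` keeps working (it just sees `None`) while emitting a warning that points at the leftover usage.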
@@ -1464,14 +1467,7 @@ class AlertGroup(AlertGroupSlackRenderingMixin, EscalationSnapshotMixin, models.
     def start_unsilence_task(self, countdown):
         task_id = celery_uuid()
         self.unsilence_task_uuid = task_id
-
-        # recalculate finish escalation time
-        escalation_start_time = timezone.now() + timezone.timedelta(seconds=countdown)
-        self.estimate_escalation_finish_time = self.calculate_eta_for_finish_escalation(
-            start_time=escalation_start_time
-        )
-
-        self.save(update_fields=["unsilence_task_uuid", "estimate_escalation_finish_time"])
+        self.save(update_fields=["unsilence_task_uuid"])
         unsilence_task.apply_async((self.pk,), task_id=task_id, countdown=countdown)

     @property
@@ -3,7 +3,6 @@ from .alert_group_web_title_cache import (  # noqa:F401
     update_web_title_cache,
     update_web_title_cache_for_alert_receive_channel,
 )
-from .calculcate_escalation_finish_time import calculate_escalation_finish_time  # noqa
 from .call_ack_url import call_ack_url  # noqa: F401
 from .check_escalation_finished import check_escalation_finished_task  # noqa: F401
 from .create_contact_points_for_datasource import create_contact_points_for_datasource  # noqa: F401
@@ -1,15 +0,0 @@
-from django.apps import apps
-from django.conf import settings
-
-from common.custom_celery_tasks import shared_dedicated_queue_retry_task
-
-
-@shared_dedicated_queue_retry_task(
-    autoretry_for=(Exception,), retry_backoff=True, max_retries=1 if settings.DEBUG else None
-)
-def calculate_escalation_finish_time(alert_group_pk):
-    AlertGroup = apps.get_model("alerts", "AlertGroup")
-    alert_group = AlertGroup.all_objects.filter(pk=alert_group_pk)[0]
-    if alert_group.escalation_snapshot:
-        alert_group.estimate_escalation_finish_time = alert_group.calculate_eta_for_finish_escalation()
-        alert_group.save(update_fields=["estimate_escalation_finish_time"])
@@ -1,48 +1,148 @@
+import datetime
+import typing
+
+import requests
+from celery import shared_task
 from django.apps import apps
 from django.conf import settings
 from django.db.models import Q
 from django.utils import timezone

 from apps.alerts.tasks.task_logger import task_logger
-from common.custom_celery_tasks import shared_dedicated_queue_retry_task
+from common.database import get_random_readonly_database_key_if_present_otherwise_default
+
+if typing.TYPE_CHECKING:
+    from apps.alerts.models.alert_group import AlertGroup
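`get_random_readonly_database_key_if_present_otherwise_default` comes from `common.database` and is what routes the auditor's expensive query to a read replica when one is configured. Its behaviour can be sketched roughly like this (an assumed simplification over a Django `DATABASES`-style dict; the real helper's alias convention may differ):

```python
import random


def readonly_db_key_or_default(databases: dict) -> str:
    """Pick a random database alias containing 'read_only' if any exist,
    otherwise fall back to the 'default' alias."""
    readonly = [alias for alias in databases if "read_only" in alias]
    return random.choice(readonly) if readonly else "default"
```

In Django terms the returned key is then fed to `QuerySet.using(...)`, so deployments without replicas transparently keep querying the primary.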
-@shared_dedicated_queue_retry_task(
-    autoretry_for=(Exception,), retry_backoff=True, max_retries=1 if settings.DEBUG else None, default_retry_delay=60
-)
+class AlertGroupEscalationPolicyExecutionAuditException(BaseException):
+    """This exception is raised when an alert group's escalation policy did not execute properly for some reason"""
+
+
+def send_alert_group_escalation_auditor_task_heartbeat() -> None:
+    heartbeat_url = settings.ALERT_GROUP_ESCALATION_AUDITOR_CELERY_TASK_HEARTBEAT_URL
+    if heartbeat_url:
+        task_logger.info(f"Sending heartbeat to configured URL: {heartbeat_url}")
+        requests.get(heartbeat_url).raise_for_status()
+        task_logger.info(f"Heartbeat successfully sent to {heartbeat_url}")
+    else:
+        task_logger.info("Skipping sending heartbeat as no heartbeat URL is configured")
+def audit_alert_group_escalation(alert_group: "AlertGroup") -> None:
+    escalation_snapshot = alert_group.escalation_snapshot
+    alert_group_id = alert_group.id
+    base_msg = f"Alert group {alert_group_id}"
+
+    if not escalation_snapshot:
+        raise AlertGroupEscalationPolicyExecutionAuditException(
+            f"{base_msg} does not have an escalation snapshot associated with it, this should never occur"
+        )
+    task_logger.info(f"{base_msg} has an escalation snapshot associated with it, auditing if it executed properly")
+
+    escalation_policies_snapshots = escalation_snapshot.escalation_policies_snapshots
+
+    if not escalation_policies_snapshots:
+        task_logger.info(
+            f"{base_msg}'s escalation snapshot has an empty escalation_policies_snapshots, skipping further validation"
+        )
+        return
+    task_logger.info(
+        f"{base_msg}'s escalation snapshot has a populated escalation_policies_snapshots, continuing validation"
+    )
+
+    if escalation_snapshot.next_step_eta_is_valid() is False:
+        raise AlertGroupEscalationPolicyExecutionAuditException(
+            f"{base_msg}'s escalation snapshot does not have a valid next_step_eta: {escalation_snapshot.next_step_eta}"
+        )
+    task_logger.info(f"{base_msg}'s escalation snapshot has a valid next_step_eta: {escalation_snapshot.next_step_eta}")
+
+    executed_escalation_policy_snapshots = escalation_snapshot.executed_escalation_policy_snapshots
+    num_of_executed_escalation_policy_snapshots = len(executed_escalation_policy_snapshots)
+
+    if num_of_executed_escalation_policy_snapshots == 0:
+        task_logger.info(
+            f"{base_msg}'s escalation snapshot does not have any executed escalation policies, skipping further validation"
+        )
+    else:
+        task_logger.info(
+            f"{base_msg}'s escalation snapshot has {num_of_executed_escalation_policy_snapshots} executed escalation policies"
+        )
+
+    # TODO: consider adding the below checks later on. This is a bit trickier to properly audit as the
+    # number of log records can vary if there are any STEP_NOTIFY_IF_NUM_ALERTS_IN_TIME_WINDOW or
+    # STEP_REPEAT_ESCALATION_N_TIMES escalation policy steps in the escalation chain
+    # see conversations in the original PR (https://github.com/grafana/oncall/pull/1266) for more context on this
+    #
+    # compare number of triggered/failed alert group log records to the number of executed
+    # escalation policy snapshot steps
+    # num_of_relevant_log_records = AlertGroupLogRecord.objects.filter(
+    #     alert_group_id=alert_group_id,
+    #     type__in=[AlertGroupLogRecord.TYPE_ESCALATION_TRIGGERED, AlertGroupLogRecord.TYPE_ESCALATION_FAILED],
+    # ).count()
+
+    # if num_of_relevant_log_records < num_of_executed_escalation_policy_snapshots:
+    #     raise AlertGroupEscalationPolicyExecutionAuditException(
+    #         f"{base_msg}'s number of triggered/failed alert group log records ({num_of_relevant_log_records}) is less "
+    #         f"than the number of executed escalation policy snapshot steps ({num_of_executed_escalation_policy_snapshots})"
+    #     )

+    # task_logger.info(
+    #     f"{base_msg}'s number of triggered/failed alert group log records ({num_of_relevant_log_records}) is greater "
+    #     f"than or equal to the number of executed escalation policy snapshot steps ({num_of_executed_escalation_policy_snapshots})"
+    # )
+
+    task_logger.info(f"{base_msg} passed the audit checks")
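The audit above reduces to three checks: a snapshot must exist, an empty policy list short-circuits the audit, and a `False` ETA validity fails it. A condensed, framework-free sketch over plain values (the names and string return values here are hypothetical, used only to make the branching testable):

```python
class AuditError(Exception):
    """Stand-in for AlertGroupEscalationPolicyExecutionAuditException."""


def audit_snapshot(snapshot, next_step_eta_valid=None):
    """Condensed flow of audit_alert_group_escalation over plain values.

    snapshot: dict or None; next_step_eta_valid mirrors next_step_eta_is_valid()
    (None = no ETA to check, True = on time, False = missed/delayed).
    """
    if not snapshot:
        raise AuditError("no escalation snapshot; this should never occur")
    if not snapshot.get("escalation_policies_snapshots"):
        return "skipped"  # nothing to validate
    if next_step_eta_valid is False:  # note: None (no ETA) does not fail the audit
        raise AuditError("next_step_eta is in the past")
    return "passed"
```

The explicit `is False` comparison matters in the real task too: a missing ETA (`None`) is tolerated, only a provably stale one raises.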
def get_auditable_alert_groups_started_at_range() -> typing.Tuple[datetime.datetime, datetime.datetime]:
|
||||
"""
|
||||
NOTE: this started_at__range is a bit of a hack..
|
||||
we wanted to avoid performing a migration on the alerts_alertgroup table to update
|
||||
alert groups where raw_escalation_snapshot was None. raw_escalation_snapshot being None is a legitimate case,
|
||||
where the alert group's integration does not have an escalation chain associated with it.
|
||||
|
||||
However, we wanted a way to be able to differentiate between "actually None" and "there was an error writing to
|
||||
raw_escalation_snapshot" (as this is performed async by a celery task).
|
||||
|
||||
This field was updated, in the commit that added this comment, to no longer be set to None by default.
|
||||
As part of this celery task we do a check that this field is in fact not None, so if we were to check older
|
||||
alert groups, whose integration did not have an escalation chain at the time the alert group was created
|
||||
we would raise errors
|
||||
"""
|
||||
return (datetime.datetime(2023, 3, 25), timezone.now() - timezone.timedelta(days=2))
|
||||
|
||||
|
||||
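The window returned above has a fixed lower bound (the date raw_escalation_snapshot stopped defaulting to None) and an upper bound two days in the past. A minimal standalone sketch of that cutoff logic, with the constants taken from the function above (the helper names here are illustrative):

```python
import datetime

# Cutover date after which raw_escalation_snapshot is always written,
# taken from get_auditable_alert_groups_started_at_range above.
CUTOVER_DATE = datetime.datetime(2023, 3, 25)


def auditable_range(now: datetime.datetime) -> tuple:
    """Alert groups must post-date the cutover and be at least two days old."""
    return (CUTOVER_DATE, now - datetime.timedelta(days=2))


def is_auditable(started_at: datetime.datetime, now: datetime.datetime) -> bool:
    lower, upper = auditable_range(now)
    return lower <= started_at <= upper
```

This mirrors the `started_at__range` filter used by the task: very recent alert groups are skipped so escalations have time to run, and pre-cutover ones are skipped because their missing snapshot may be legitimate.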
# don't retry this task as the AlertGroup DB query is rather expensive
@shared_task
def check_escalation_finished_task():
    """
    This task periodically checks that there are no alert groups with unfinished escalations.
    TODO: QA this properly; check whether new types of escalations have been added
    """
    AlertGroup = apps.get_model("alerts", "AlertGroup")
    AlertReceiveChannel = apps.get_model("alerts", "AlertReceiveChannel")

    alert_groups = AlertGroup.all_objects.using(get_random_readonly_database_key_if_present_otherwise_default()).filter(
        ~Q(channel__integration=AlertReceiveChannel.INTEGRATION_MAINTENANCE),
        ~Q(silenced=True, silenced_until__isnull=True),  # filter out alert groups that are silenced forever
        is_escalation_finished=False,
        resolved=False,
        acknowledged=False,
        root_alert_group=None,
        started_at__range=get_auditable_alert_groups_started_at_range(),
    )

    if not alert_groups.exists():
        task_logger.info("There are no alert groups to audit, everything is good :)")

    alert_group_ids_that_failed_audit: typing.List[str] = []

    for alert_group in alert_groups:
        try:
            audit_alert_group_escalation(alert_group)
        except AlertGroupEscalationPolicyExecutionAuditException:
            alert_group_ids_that_failed_audit.append(str(alert_group.id))

    if alert_group_ids_that_failed_audit:
        raise AlertGroupEscalationPolicyExecutionAuditException(
            f"The following alert group id(s) failed auditing: {', '.join(alert_group_ids_that_failed_audit)}"
        )

    send_alert_group_escalation_auditor_task_heartbeat()
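The task body above uses a collect-then-raise pattern: every alert group is audited, failures are remembered, and a single aggregated exception is raised at the end so one failing alert group cannot hide the others. A minimal sketch of that pattern (the exception class name mirrors the code above; `run_audit` and its arguments are illustrative stand-ins):

```python
class AlertGroupEscalationPolicyExecutionAuditException(Exception):
    pass


def run_audit(alert_group_ids, audit_fn):
    """Audit every id; aggregate any failures into one exception at the end."""
    failed = []
    for ag_id in alert_group_ids:
        try:
            audit_fn(ag_id)
        except AlertGroupEscalationPolicyExecutionAuditException:
            # remember the failure but keep auditing the remaining alert groups
            failed.append(str(ag_id))
    if failed:
        raise AlertGroupEscalationPolicyExecutionAuditException(
            f"The following alert group id(s) failed auditing: {', '.join(failed)}"
        )
```

Raising only after the loop is what lets the tests further down assert that `audit_alert_group_escalation` was called for every alert group even when earlier ones fail.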
@@ -1,6 +1,9 @@
import datetime

import pytest

from apps.alerts.incident_appearance.templaters import AlertSlackTemplater
from apps.alerts.models import EscalationPolicy


@pytest.fixture()
@@ -9,3 +12,51 @@ def mock_alert_renderer_render_for(monkeypatch):
        return "invalid_render_for"

    monkeypatch.setattr(AlertSlackTemplater, "_render_for", mock_render_for)


@pytest.fixture()
def escalation_snapshot_test_setup(
    make_organization_and_user,
    make_user_for_organization,
    make_alert_receive_channel,
    make_channel_filter,
    make_escalation_chain,
    make_escalation_policy,
    make_alert_group,
):
    organization, user_1 = make_organization_and_user()
    user_2 = make_user_for_organization(organization)

    alert_receive_channel = make_alert_receive_channel(organization)

    escalation_chain = make_escalation_chain(organization)
    channel_filter = make_channel_filter(
        alert_receive_channel,
        escalation_chain=escalation_chain,
        notification_backends={"BACKEND": {"channel_id": "abc123"}},
    )

    notify_to_multiple_users_step = make_escalation_policy(
        escalation_chain=channel_filter.escalation_chain,
        escalation_policy_step=EscalationPolicy.STEP_NOTIFY_MULTIPLE_USERS,
    )
    notify_to_multiple_users_step.notify_to_users_queue.set([user_1, user_2])
    wait_step = make_escalation_policy(
        escalation_chain=channel_filter.escalation_chain,
        escalation_policy_step=EscalationPolicy.STEP_WAIT,
        wait_delay=EscalationPolicy.FIFTEEN_MINUTES,
    )
    # random time for test
    from_time = datetime.time(10, 30)
    to_time = datetime.time(18, 45)
    notify_if_time_step = make_escalation_policy(
        escalation_chain=channel_filter.escalation_chain,
        escalation_policy_step=EscalationPolicy.STEP_NOTIFY_IF_TIME,
        from_time=from_time,
        to_time=to_time,
    )

    alert_group = make_alert_group(alert_receive_channel, channel_filter=channel_filter)
    alert_group.raw_escalation_snapshot = alert_group.build_raw_escalation_snapshot()
    alert_group.save()
    return alert_group, notify_to_multiple_users_step, wait_step, notify_if_time_step


@@ -1,45 +1,329 @@
from unittest.mock import Mock, PropertyMock, patch

import pytest
import requests
from django.test import override_settings
from django.utils import timezone

from apps.alerts.models import AlertGroup, AlertReceiveChannel
from apps.alerts.tasks.check_escalation_finished import (
    AlertGroupEscalationPolicyExecutionAuditException,
    audit_alert_group_escalation,
    check_escalation_finished_task,
    send_alert_group_escalation_auditor_task_heartbeat,
)

MOCKED_HEARTBEAT_URL = "https://hello.com/lsdjjkf"


# def _get_relevant_log_record_type() -> int:
#     return random.choice([AlertGroupLogRecord.TYPE_ESCALATION_TRIGGERED, AlertGroupLogRecord.TYPE_ESCALATION_FAILED])


def test_send_alert_group_escalation_auditor_task_heartbeat_does_not_call_the_heartbeat_url_if_one_is_not_configured():
    with patch("apps.alerts.tasks.check_escalation_finished.requests") as mock_requests:
        send_alert_group_escalation_auditor_task_heartbeat()
        mock_requests.get.assert_not_called()


@override_settings(ALERT_GROUP_ESCALATION_AUDITOR_CELERY_TASK_HEARTBEAT_URL=MOCKED_HEARTBEAT_URL)
def test_send_alert_group_escalation_auditor_task_heartbeat_calls_the_heartbeat_url_if_one_is_configured():
    with patch("apps.alerts.tasks.check_escalation_finished.requests") as mock_requests:
        send_alert_group_escalation_auditor_task_heartbeat()

        mock_requests.get.assert_called_once_with(MOCKED_HEARTBEAT_URL)
        mock_requests.get.return_value.raise_for_status.assert_called_once_with()


@override_settings(ALERT_GROUP_ESCALATION_AUDITOR_CELERY_TASK_HEARTBEAT_URL=MOCKED_HEARTBEAT_URL)
def test_send_alert_group_escalation_auditor_task_heartbeat_raises_an_exception_if_the_heartbeat_url_request_fails():
    with patch("apps.alerts.tasks.check_escalation_finished.requests") as mock_requests:
        mock_response = Mock()
        mock_response.status_code = 500
        mock_response.raise_for_status.side_effect = requests.exceptions.HTTPError

        mock_requests.get.return_value = mock_response

        with pytest.raises(requests.exceptions.HTTPError):
            send_alert_group_escalation_auditor_task_heartbeat()

        mock_requests.get.assert_called_once_with(MOCKED_HEARTBEAT_URL)
        mock_requests.get.return_value.raise_for_status.assert_called_once_with()
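The three tests above pin down the heartbeat contract: do nothing when no URL is configured, otherwise GET the URL and raise on a non-2xx status. A dependency-injected sketch of that contract (the function name and the injected `http_get` client are illustrative, not the module's actual API):

```python
def send_heartbeat(heartbeat_url, http_get):
    """Ping the heartbeat URL if one is configured; let HTTP errors propagate."""
    if not heartbeat_url:
        return False  # nothing configured, nothing to ping
    response = http_get(heartbeat_url)
    response.raise_for_status()  # surface non-2xx responses to the caller
    return True
```

Injecting the HTTP client keeps the sketch testable without network access, which is also why the real tests patch the module-level `requests` object.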
|
||||
|
||||
|
||||
@pytest.mark.django_db
|
||||
def test_check_escalation_finished_task(
|
||||
def test_audit_alert_group_escalation_raises_exception_if_the_alert_group_does_not_have_an_escalation_snapshot(
|
||||
escalation_snapshot_test_setup,
|
||||
):
|
||||
alert_group, _, _, _ = escalation_snapshot_test_setup
|
||||
alert_group.escalation_snapshot = None
|
||||
|
||||
with pytest.raises(AlertGroupEscalationPolicyExecutionAuditException):
|
||||
audit_alert_group_escalation(alert_group)
|
||||
|
||||
|
||||
@pytest.mark.django_db
|
||||
def test_audit_alert_group_escalation_skips_further_validation_if_the_escalation_policies_snapshots_is_empty(
|
||||
escalation_snapshot_test_setup,
|
||||
):
|
||||
alert_group, _, _, _ = escalation_snapshot_test_setup
|
||||
|
||||
alert_group.escalation_snapshot.escalation_policies_snapshots = []
|
||||
audit_alert_group_escalation(alert_group)
|
||||
|
||||
alert_group.escalation_snapshot.escalation_policies_snapshots = None
|
||||
audit_alert_group_escalation(alert_group)
|
||||
|
||||
|
||||
@pytest.mark.django_db
|
||||
@pytest.mark.parametrize(
|
||||
"next_step_eta_is_valid_return_value,raises_exception",
|
||||
[
|
||||
(None, False),
|
||||
(True, False),
|
||||
(False, True),
|
||||
],
|
||||
)
|
||||
def test_audit_alert_group_escalation_next_step_eta_validation(
|
||||
escalation_snapshot_test_setup, next_step_eta_is_valid_return_value, raises_exception
|
||||
):
|
||||
alert_group, _, _, _ = escalation_snapshot_test_setup
|
||||
|
||||
with patch(
|
||||
"apps.alerts.escalation_snapshot.snapshot_classes.escalation_snapshot.EscalationSnapshot.next_step_eta_is_valid"
|
||||
) as mock_next_step_eta_is_valid:
|
||||
mock_next_step_eta_is_valid.return_value = next_step_eta_is_valid_return_value
|
||||
|
||||
if raises_exception:
|
||||
with pytest.raises(AlertGroupEscalationPolicyExecutionAuditException):
|
||||
audit_alert_group_escalation(alert_group)
|
||||
else:
|
||||
try:
|
||||
audit_alert_group_escalation(alert_group)
|
||||
except AlertGroupEscalationPolicyExecutionAuditException:
|
||||
pytest.fail()
|
||||
|
||||
mock_next_step_eta_is_valid.assert_called_once_with()
|
||||
|
||||
|
||||
@pytest.mark.django_db
|
||||
def test_audit_alert_group_escalation_no_executed_escalation_policy_snapshots(escalation_snapshot_test_setup):
|
||||
alert_group, _, _, _ = escalation_snapshot_test_setup
|
||||
|
||||
with patch(
|
||||
"apps.alerts.escalation_snapshot.snapshot_classes.escalation_snapshot.EscalationSnapshot.executed_escalation_policy_snapshots",
|
||||
new_callable=PropertyMock,
|
||||
) as mock_executed_escalation_policy_snapshots:
|
||||
mock_executed_escalation_policy_snapshots.return_value = []
|
||||
audit_alert_group_escalation(alert_group)
|
||||
mock_executed_escalation_policy_snapshots.assert_called_once_with()
|
||||
|
||||
|
||||
# see TODO: comment in engine/apps/alerts/tasks/check_escalation_finished.py
# @pytest.mark.django_db
# def test_audit_alert_group_escalation_all_executed_escalation_policy_snapshots_have_triggered_log_records(
#     escalation_snapshot_test_setup,
#     make_organization_and_user,
#     make_alert_group_log_record,
# ):
#     _, user = make_organization_and_user()
#     alert_group, _, _, _ = escalation_snapshot_test_setup
#     escalation_policies_snapshots = alert_group.escalation_snapshot.escalation_policies_snapshots

#     for escalation_policy_snapshot in escalation_policies_snapshots:
#         escalation_policy = EscalationPolicy.objects.get(id=escalation_policy_snapshot.id)
#         log_record_type = _get_relevant_log_record_type()

#         make_alert_group_log_record(alert_group, log_record_type, user, escalation_policy=escalation_policy)

#     with patch(
#         "apps.alerts.escalation_snapshot.snapshot_classes.escalation_snapshot.EscalationSnapshot.executed_escalation_policy_snapshots",
#         new_callable=PropertyMock,
#     ) as mock_executed_escalation_policy_snapshots:
#         mock_executed_escalation_policy_snapshots.return_value = escalation_policies_snapshots
#         audit_alert_group_escalation(alert_group)
#         mock_executed_escalation_policy_snapshots.assert_called_once_with()


# see TODO: comment in engine/apps/alerts/tasks/check_escalation_finished.py
# @pytest.mark.django_db
# def test_audit_alert_group_escalation_one_executed_escalation_policy_snapshot_does_not_have_a_triggered_log_record(
#     escalation_snapshot_test_setup,
#     make_organization_and_user,
#     make_alert_group_log_record,
# ):
#     _, user = make_organization_and_user()
#     alert_group, _, _, _ = escalation_snapshot_test_setup
#     escalation_policies_snapshots = alert_group.escalation_snapshot.escalation_policies_snapshots

#     # let's skip creating a relevant alert group log record for the first executed escalation policy
#     for idx, escalation_policy_snapshot in enumerate(escalation_policies_snapshots):
#         if idx != 0:
#             escalation_policy = EscalationPolicy.objects.get(id=escalation_policy_snapshot.id)
#             make_alert_group_log_record(
#                 alert_group, _get_relevant_log_record_type(), user, escalation_policy=escalation_policy
#             )

#     with patch(
#         "apps.alerts.escalation_snapshot.snapshot_classes.escalation_snapshot.EscalationSnapshot.executed_escalation_policy_snapshots",
#         new_callable=PropertyMock,
#     ) as mock_executed_escalation_policy_snapshots:
#         mock_executed_escalation_policy_snapshots.return_value = escalation_policies_snapshots

#         with pytest.raises(AlertGroupEscalationPolicyExecutionAuditException):
#             audit_alert_group_escalation(alert_group)
#         mock_executed_escalation_policy_snapshots.assert_called_once_with()


@pytest.mark.django_db
def test_check_escalation_finished_task_queries_doesnt_grab_alert_groups_outside_of_date_range(
    make_organization_and_user,
    make_alert_receive_channel,
    make_alert_group,
):
    organization, _ = make_organization_and_user()
    alert_receive_channel = make_alert_receive_channel(organization)

    now = timezone.now()
    two_days_ago = now - timezone.timedelta(days=2)
    two_days_in_future = now + timezone.timedelta(days=2)

    # we can't simply pass started_at to the fixture because started_at is being "auto-set" on the Model
    alert_group1 = make_alert_group(alert_receive_channel)
    alert_group1.started_at = now

    alert_group2 = make_alert_group(alert_receive_channel)
    alert_group2.started_at = now - timezone.timedelta(days=5)

    alert_group3 = make_alert_group(alert_receive_channel)
    alert_group3.started_at = now + timezone.timedelta(days=5)

    AlertGroup.all_objects.bulk_update([alert_group1, alert_group2, alert_group3], ["started_at"])

    with patch(
        "apps.alerts.tasks.check_escalation_finished.get_auditable_alert_groups_started_at_range"
    ) as mocked_get_auditable_alert_groups_started_at_range:
        with patch(
            "apps.alerts.tasks.check_escalation_finished.audit_alert_group_escalation"
        ) as mocked_audit_alert_group_escalation:
            with patch(
                "apps.alerts.tasks.check_escalation_finished.send_alert_group_escalation_auditor_task_heartbeat"
            ) as mocked_send_alert_group_escalation_auditor_task_heartbeat:
                mocked_get_auditable_alert_groups_started_at_range.return_value = (two_days_ago, two_days_in_future)

                check_escalation_finished_task()

                mocked_audit_alert_group_escalation.assert_called_once_with(alert_group1)
                mocked_send_alert_group_escalation_auditor_task_heartbeat.assert_called_once_with()


@pytest.mark.django_db
def test_check_escalation_finished_task_calls_audit_alert_group_escalation_for_every_alert_group(
    make_organization_and_user,
    make_alert_receive_channel,
    make_alert_group,
):
    organization, _ = make_organization_and_user()
    alert_receive_channel = make_alert_receive_channel(organization)

    now = timezone.now()
    two_days_ago = now - timezone.timedelta(days=2)
    two_days_in_future = now + timezone.timedelta(days=2)

    # we can't simply pass started_at to the fixture because started_at is being "auto-set" on the Model
    alert_group1 = make_alert_group(alert_receive_channel)
    alert_group1.started_at = now

    alert_group2 = make_alert_group(alert_receive_channel)
    alert_group2.started_at = now

    alert_group3 = make_alert_group(alert_receive_channel)
    alert_group3.started_at = now

    AlertGroup.all_objects.bulk_update([alert_group1, alert_group2, alert_group3], ["started_at"])

    with patch(
        "apps.alerts.tasks.check_escalation_finished.get_auditable_alert_groups_started_at_range"
    ) as mocked_get_auditable_alert_groups_started_at_range:
        with patch(
            "apps.alerts.tasks.check_escalation_finished.audit_alert_group_escalation"
        ) as mocked_audit_alert_group_escalation:
            with patch(
                "apps.alerts.tasks.check_escalation_finished.send_alert_group_escalation_auditor_task_heartbeat"
            ) as mocked_send_alert_group_escalation_auditor_task_heartbeat:
                mocked_get_auditable_alert_groups_started_at_range.return_value = (two_days_ago, two_days_in_future)

                check_escalation_finished_task()

                mocked_audit_alert_group_escalation.assert_any_call(alert_group1)
                mocked_audit_alert_group_escalation.assert_any_call(alert_group2)
                mocked_audit_alert_group_escalation.assert_any_call(alert_group3)
                mocked_send_alert_group_escalation_auditor_task_heartbeat.assert_called_once_with()


@pytest.mark.django_db
def test_check_escalation_finished_task_simply_calls_heartbeat_when_no_alert_groups_found():
    with patch(
        "apps.alerts.tasks.check_escalation_finished.audit_alert_group_escalation"
    ) as mocked_audit_alert_group_escalation:
        with patch(
            "apps.alerts.tasks.check_escalation_finished.send_alert_group_escalation_auditor_task_heartbeat"
        ) as mocked_send_alert_group_escalation_auditor_task_heartbeat:
            check_escalation_finished_task()
            mocked_audit_alert_group_escalation.assert_not_called()
            mocked_send_alert_group_escalation_auditor_task_heartbeat.assert_called_once_with()


@pytest.mark.django_db
def test_check_escalation_finished_task_calls_audit_alert_group_escalation_for_every_alert_group_even_if_one_fails(
    make_organization_and_user,
    make_alert_receive_channel,
    make_alert_group,
):
    organization, _ = make_organization_and_user()
    alert_receive_channel = make_alert_receive_channel(organization)

    now = timezone.now()
    two_days_ago = now - timezone.timedelta(days=2)
    two_days_in_future = now + timezone.timedelta(days=2)

    # we can't simply pass started_at to the fixture because started_at is being "auto-set" on the Model
    alert_group1 = make_alert_group(alert_receive_channel)
    alert_group1.started_at = now

    alert_group2 = make_alert_group(alert_receive_channel)
    alert_group2.started_at = now

    alert_group3 = make_alert_group(alert_receive_channel)
    alert_group3.started_at = now

    AlertGroup.all_objects.bulk_update([alert_group1, alert_group2, alert_group3], ["started_at"])

    def _mocked_audit_alert_group_escalation(alert_group):
        if alert_group.id != alert_group3.id:
            raise AlertGroupEscalationPolicyExecutionAuditException("asdfasdf")

    with patch(
        "apps.alerts.tasks.check_escalation_finished.get_auditable_alert_groups_started_at_range"
    ) as mocked_get_auditable_alert_groups_started_at_range:
        with patch(
            "apps.alerts.tasks.check_escalation_finished.audit_alert_group_escalation"
        ) as mocked_audit_alert_group_escalation:
            with patch(
                "apps.alerts.tasks.check_escalation_finished.send_alert_group_escalation_auditor_task_heartbeat"
            ) as mocked_send_alert_group_escalation_auditor_task_heartbeat:
                mocked_get_auditable_alert_groups_started_at_range.return_value = (two_days_ago, two_days_in_future)
                mocked_audit_alert_group_escalation.side_effect = _mocked_audit_alert_group_escalation

                with pytest.raises(AlertGroupEscalationPolicyExecutionAuditException) as exc:
                    check_escalation_finished_task()

                assert (
                    str(exc.value)
                    == f"The following alert group id(s) failed auditing: {alert_group1.id}, {alert_group2.id}"
                )

                mocked_audit_alert_group_escalation.assert_any_call(alert_group1)
                mocked_audit_alert_group_escalation.assert_any_call(alert_group2)
                mocked_audit_alert_group_escalation.assert_any_call(alert_group3)

                mocked_send_alert_group_escalation_auditor_task_heartbeat.assert_not_called()


@@ -134,7 +134,7 @@ def test_escalation_step_notify_multiple_users(
    escalation_step_test_setup,
    make_escalation_policy,
):
    _, user, _, channel_filter, alert_group, reason = escalation_step_test_setup

    notify_users_step = make_escalation_policy(
        escalation_chain=channel_filter.escalation_chain,
@@ -386,7 +386,7 @@ def test_escalation_step_notify_if_num_alerts_in_window(
    ).exists()
    assert not mocked_execute_tasks.called

    _, _, _, channel_filter, alert_group, reason = escalation_step_test_setup

    make_alert(alert_group=alert_group, raw_request_data={})

@@ -1,5 +1,3 @@
import pytest
from django.utils import timezone

@@ -11,54 +9,6 @@ from apps.alerts.escalation_snapshot.snapshot_classes import (
from apps.alerts.models import EscalationPolicy


@pytest.mark.django_db
def test_raw_escalation_snapshot(escalation_snapshot_test_setup):
    alert_group, notify_to_multiple_users_step, wait_step, notify_if_time_step = escalation_snapshot_test_setup

@@ -142,7 +92,7 @@ def test_raw_escalation_snapshot(escalation_snapshot_test_setup):

@pytest.mark.django_db
def test_serialized_escalation_snapshot(escalation_snapshot_test_setup):
    alert_group, _, _, _ = escalation_snapshot_test_setup
    escalation_snapshot = alert_group.escalation_snapshot
    assert isinstance(escalation_snapshot, EscalationSnapshot)
    assert escalation_snapshot.channel_filter_snapshot is not None and isinstance(
@@ -163,7 +113,7 @@ def test_serialized_escalation_snapshot(escalation_snapshot_test_setup):

@pytest.mark.django_db
def test_escalation_snapshot_with_deleted_channel_filter(escalation_snapshot_test_setup):
    alert_group, _, _, _ = escalation_snapshot_test_setup
    alert_group.channel_filter.delete()

    escalation_snapshot = alert_group.escalation_snapshot
@@ -174,7 +124,7 @@ def test_escalation_snapshot_with_deleted_channel_filter(escalation_snapshot_tes

@pytest.mark.django_db
def test_change_escalation_snapshot(escalation_snapshot_test_setup):
    alert_group, _, _, _ = escalation_snapshot_test_setup

    new_active_order = 2
    now = timezone.now()
@@ -194,7 +144,7 @@ def test_change_escalation_snapshot(escalation_snapshot_test_setup):

@pytest.mark.django_db
def test_next_escalation_policy_snapshot(escalation_snapshot_test_setup):
    alert_group, _, _, _ = escalation_snapshot_test_setup
    escalation_snapshot = alert_group.escalation_snapshot

    assert escalation_snapshot.last_active_escalation_policy_order is None
@@ -226,3 +176,39 @@ def test_next_escalation_policy_snapshot(escalation_snapshot_test_setup):
        is escalation_snapshot.escalation_policies_snapshots[-1]
    )
    assert escalation_snapshot.next_active_escalation_policy_snapshot is None


@pytest.mark.django_db
@pytest.mark.parametrize(
    "next_step_eta,expected",
    [
        (None, None),
        (timezone.now() - timezone.timedelta(weeks=50), False),
        (timezone.now() - timezone.timedelta(minutes=4), True),
        (timezone.now() + timezone.timedelta(minutes=4), True),
    ],
)
def test_next_step_eta_is_valid(escalation_snapshot_test_setup, next_step_eta, expected) -> None:
    alert_group, _, _, _ = escalation_snapshot_test_setup
    escalation_snapshot = alert_group.escalation_snapshot

    escalation_snapshot.next_step_eta = next_step_eta

    assert escalation_snapshot.next_step_eta_is_valid() is expected
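The parametrized cases above imply the validity rule: a missing ETA is indeterminate (None), and an ETA counts as valid unless it is already well in the past. A minimal sketch of that rule (the 5-minute grace period is an assumption inferred from the "-4 minutes is still valid" case; the real tolerance may differ):

```python
import datetime

# Assumed tolerance, inferred from the test cases above (not the library's constant).
GRACE_PERIOD = datetime.timedelta(minutes=5)


def next_step_eta_is_valid(next_step_eta, now):
    """None means 'no ETA to validate'; otherwise the ETA must not be long overdue."""
    if next_step_eta is None:
        return None
    return next_step_eta + GRACE_PERIOD > now
```

Under this rule an ETA 50 weeks in the past fails, while ETAs within a few minutes of now (past or future) pass, matching the four parametrized cases.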


@pytest.mark.django_db
def test_executed_escalation_policy_snapshots(escalation_snapshot_test_setup):
    alert_group, _, _, _ = escalation_snapshot_test_setup
    escalation_snapshot = alert_group.escalation_snapshot

    escalation_snapshot.last_active_escalation_policy_order = None
    assert escalation_snapshot.executed_escalation_policy_snapshots == []

    escalation_snapshot.last_active_escalation_policy_order = 0
    assert escalation_snapshot.executed_escalation_policy_snapshots == [
        escalation_snapshot.escalation_policies_snapshots[0]
    ]

    escalation_snapshot.last_active_escalation_policy_order = len(escalation_snapshot.escalation_policies_snapshots)
    assert escalation_snapshot.executed_escalation_policy_snapshots == escalation_snapshot.escalation_policies_snapshots
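The assertions above describe `executed_escalation_policy_snapshots` as a prefix slice over the ordered snapshots, keyed by `last_active_escalation_policy_order`. A list-based sketch of that behavior, inferred from the test's expectations (the function name here is illustrative):

```python
def executed_policy_snapshots(policies, last_active_order):
    """Return the prefix of policies up to and including the last active order."""
    if last_active_order is None:
        return []  # escalation has not started executing yet
    return policies[: last_active_order + 1]
```

Note that slicing past the end of the list is harmless in Python, which is why an order equal to `len(policies)` simply yields every snapshot, as the final assertion above expects.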
|
||||
|
|
|
|||
658
engine/apps/alerts/tests/test_escalation_snapshot_mixin.py
Normal file
658
engine/apps/alerts/tests/test_escalation_snapshot_mixin.py
Normal file
|
|
@ -0,0 +1,658 @@
|
|||
from unittest.mock import PropertyMock, patch

import pytest
import pytz
from rest_framework.exceptions import ValidationError

from apps.alerts.escalation_snapshot.snapshot_classes import EscalationSnapshot
from apps.alerts.models import EscalationPolicy

MOCK_SLACK_CHANNEL_ID = "asdfljaskdf"
EMPTY_RAW_ESCALATION_SNAPSHOT = {
    "channel_filter_snapshot": None,
    "escalation_chain_snapshot": None,
    "last_active_escalation_policy_order": None,
    "escalation_policies_snapshots": [],
    "slack_channel_id": None,
    "pause_escalation": False,
    "next_step_eta": None,
}


@patch("apps.alerts.models.alert_group.AlertGroup.slack_channel_id", new_callable=PropertyMock)
@pytest.mark.django_db
def test_build_raw_escalation_snapshot_escalation_chain_exists(
    mock_alert_group_slack_channel_id,
    make_organization_and_user,
    make_alert_receive_channel,
    make_channel_filter,
    make_escalation_chain,
    make_escalation_policy,
    make_alert_group,
):
    mock_alert_group_slack_channel_id.return_value = MOCK_SLACK_CHANNEL_ID

    organization, _ = make_organization_and_user()
    alert_receive_channel = make_alert_receive_channel(organization)
    escalation_chain = make_escalation_chain(organization=organization)
    channel_filter = make_channel_filter(alert_receive_channel, escalation_chain=escalation_chain)
    make_escalation_policy(
        escalation_chain=channel_filter.escalation_chain,
        escalation_policy_step=EscalationPolicy.STEP_WAIT,
        wait_delay=EscalationPolicy.FIFTEEN_MINUTES,
    )

    alert_group = make_alert_group(alert_receive_channel, channel_filter=channel_filter)

    expected_snapshot = EscalationSnapshot.serializer(
        {
            "channel_filter_snapshot": alert_group.channel_filter,
            "escalation_chain_snapshot": alert_group.channel_filter.escalation_chain,
            "escalation_policies_snapshots": alert_group.channel_filter.escalation_chain.escalation_policies.all(),
            "slack_channel_id": MOCK_SLACK_CHANNEL_ID,
        }
    )

    assert alert_group.build_raw_escalation_snapshot() == expected_snapshot.data


@patch(
    "apps.alerts.escalation_snapshot.escalation_snapshot_mixin.EscalationSnapshotMixin.pause_escalation",
    new_callable=PropertyMock,
)
@pytest.mark.django_db
def test_build_raw_escalation_snapshot_escalation_chain_does_not_exist_escalation_paused(
    mocked_pause_escalation,
    make_organization_and_user,
    make_alert_receive_channel,
    make_channel_filter,
    make_escalation_chain,
    make_alert_group,
):
    mocked_pause_escalation.return_value = True

    organization, _ = make_organization_and_user()
    alert_receive_channel = make_alert_receive_channel(organization)
    escalation_chain = make_escalation_chain(organization=organization)
    channel_filter = make_channel_filter(alert_receive_channel, escalation_chain=escalation_chain)
    alert_group = make_alert_group(alert_receive_channel, channel_filter=channel_filter)

    assert alert_group.build_raw_escalation_snapshot() == EMPTY_RAW_ESCALATION_SNAPSHOT


@pytest.mark.django_db
def test_build_raw_escalation_snapshot_escalation_chain_does_not_exist_no_channel_filter(
    make_organization_and_user,
    make_alert_receive_channel,
    make_alert_group,
):
    organization, _ = make_organization_and_user()
    alert_receive_channel = make_alert_receive_channel(organization)
    alert_group = make_alert_group(alert_receive_channel)

    assert alert_group.build_raw_escalation_snapshot() == EMPTY_RAW_ESCALATION_SNAPSHOT
|
||||
|
||||
|
||||
@pytest.mark.django_db
|
||||
def test_build_raw_escalation_snapshot_escalation_chain_does_not_exist_no_channel_filter_escalation_chain(
|
||||
make_organization_and_user,
|
||||
make_alert_receive_channel,
|
||||
make_channel_filter,
|
||||
make_alert_group,
|
||||
):
|
||||
organization, _ = make_organization_and_user()
|
||||
alert_receive_channel = make_alert_receive_channel(organization)
|
||||
channel_filter = make_channel_filter(alert_receive_channel)
|
||||
alert_group = make_alert_group(alert_receive_channel, channel_filter=channel_filter)
|
||||
|
||||
assert alert_group.build_raw_escalation_snapshot() == EMPTY_RAW_ESCALATION_SNAPSHOT
|
||||
|
||||
|
||||
@patch(
|
||||
"apps.alerts.escalation_snapshot.escalation_snapshot_mixin.EscalationSnapshotMixin.channel_filter_snapshot",
|
||||
new_callable=PropertyMock,
|
||||
)
|
||||
@pytest.mark.django_db
|
||||
def test_channel_filter_with_respect_to_escalation_snapshot(
|
||||
mock_channel_filter_snapshot,
|
||||
make_organization_and_user,
|
||||
make_alert_receive_channel,
|
||||
make_channel_filter,
|
||||
make_alert_group,
|
||||
):
|
||||
channel_filter_snapshot = "asdfasdfadsfadsf"
|
||||
mock_channel_filter_snapshot.return_value = channel_filter_snapshot
|
||||
|
||||
organization, _ = make_organization_and_user()
|
||||
alert_receive_channel = make_alert_receive_channel(organization)
|
||||
channel_filter = make_channel_filter(alert_receive_channel)
|
||||
alert_group = make_alert_group(alert_receive_channel, channel_filter=channel_filter)
|
||||
|
||||
assert alert_group.channel_filter_with_respect_to_escalation_snapshot == channel_filter_snapshot
|
||||
|
||||
|
||||
@patch(
|
||||
"apps.alerts.escalation_snapshot.escalation_snapshot_mixin.EscalationSnapshotMixin.channel_filter_snapshot",
|
||||
new_callable=PropertyMock,
|
||||
)
|
||||
@pytest.mark.django_db
|
||||
def test_channel_filter_with_respect_to_escalation_snapshot_no_channel_filter_snapshot(
|
||||
mock_channel_filter_snapshot,
|
||||
make_organization_and_user,
|
||||
make_alert_receive_channel,
|
||||
make_channel_filter,
|
||||
make_alert_group,
|
||||
):
|
||||
mock_channel_filter_snapshot.return_value = None
|
||||
|
||||
organization, _ = make_organization_and_user()
|
||||
alert_receive_channel = make_alert_receive_channel(organization)
|
||||
channel_filter = make_channel_filter(alert_receive_channel)
|
||||
alert_group = make_alert_group(alert_receive_channel, channel_filter=channel_filter)
|
||||
|
||||
assert alert_group.channel_filter_with_respect_to_escalation_snapshot == channel_filter
|
||||
|
||||
|
||||
@patch(
|
||||
"apps.alerts.escalation_snapshot.escalation_snapshot_mixin.EscalationSnapshotMixin.escalation_chain_snapshot",
|
||||
new_callable=PropertyMock,
|
||||
)
|
||||
@pytest.mark.django_db
|
||||
def test_escalation_chain_with_respect_to_escalation_snapshot(
|
||||
mock_escalation_chain_snapshot,
|
||||
make_organization_and_user,
|
||||
make_alert_receive_channel,
|
||||
make_channel_filter,
|
||||
make_alert_group,
|
||||
):
|
||||
escalation_chain_snapshot = "asdfasdfadsfadsf"
|
||||
mock_escalation_chain_snapshot.return_value = escalation_chain_snapshot
|
||||
|
||||
organization, _ = make_organization_and_user()
|
||||
alert_receive_channel = make_alert_receive_channel(organization)
|
||||
channel_filter = make_channel_filter(alert_receive_channel)
|
||||
alert_group = make_alert_group(alert_receive_channel, channel_filter=channel_filter)
|
||||
|
||||
assert alert_group.escalation_chain_with_respect_to_escalation_snapshot == escalation_chain_snapshot
|
||||
|
||||
|
||||
@patch(
|
||||
"apps.alerts.escalation_snapshot.escalation_snapshot_mixin.EscalationSnapshotMixin.escalation_chain_snapshot",
|
||||
new_callable=PropertyMock,
|
||||
)
|
||||
@pytest.mark.django_db
|
||||
def test_escalation_chain_with_respect_to_escalation_snapshot_no_escalation_chain_snapshot(
|
||||
mock_escalation_chain_snapshot,
|
||||
make_organization_and_user,
|
||||
make_alert_receive_channel,
|
||||
make_channel_filter,
|
||||
make_escalation_chain,
|
||||
make_alert_group,
|
||||
):
|
||||
mock_escalation_chain_snapshot.return_value = None
|
||||
|
||||
organization, _ = make_organization_and_user()
|
||||
alert_receive_channel = make_alert_receive_channel(organization)
|
||||
escalation_chain = make_escalation_chain(organization=organization)
|
||||
channel_filter = make_channel_filter(alert_receive_channel, escalation_chain=escalation_chain)
|
||||
|
||||
alert_group = make_alert_group(alert_receive_channel, channel_filter=channel_filter)
|
||||
|
||||
assert alert_group.escalation_chain_with_respect_to_escalation_snapshot == escalation_chain
|
||||
|
||||
alert_group = make_alert_group(alert_receive_channel)
|
||||
|
||||
assert alert_group.channel_filter is None
|
||||
assert alert_group.escalation_chain_with_respect_to_escalation_snapshot is None
|
||||
|
||||
|
||||
@patch(
|
||||
"apps.alerts.escalation_snapshot.escalation_snapshot_mixin.EscalationSnapshotMixin.escalation_chain_snapshot",
|
||||
new_callable=PropertyMock,
|
||||
)
|
||||
@pytest.mark.django_db
|
||||
def test_escalation_chain_with_respect_to_escalation_snapshot_no_escalation_chain_snapshot_and_no_channel_filter(
|
||||
mock_escalation_chain_snapshot,
|
||||
make_organization_and_user,
|
||||
make_alert_receive_channel,
|
||||
make_alert_group,
|
||||
):
|
||||
mock_escalation_chain_snapshot.return_value = None
|
||||
|
||||
organization, _ = make_organization_and_user()
|
||||
alert_receive_channel = make_alert_receive_channel(organization)
|
||||
alert_group = make_alert_group(alert_receive_channel)
|
||||
|
||||
assert alert_group.escalation_chain_with_respect_to_escalation_snapshot is None
|
||||
|
||||
|
||||
@pytest.mark.django_db
def test_channel_filter_snapshot(
    make_organization_and_user,
    make_alert_receive_channel,
    make_channel_filter,
    make_escalation_chain,
    make_escalation_policy,
    make_alert_group,
):
    organization, _ = make_organization_and_user()
    alert_receive_channel = make_alert_receive_channel(organization)
    escalation_chain = make_escalation_chain(organization=organization)
    channel_filter = make_channel_filter(alert_receive_channel, escalation_chain=escalation_chain)
    make_escalation_policy(
        escalation_chain=channel_filter.escalation_chain,
        escalation_policy_step=EscalationPolicy.STEP_WAIT,
        wait_delay=EscalationPolicy.FIFTEEN_MINUTES,
    )

    alert_group = make_alert_group(alert_receive_channel, channel_filter=channel_filter)
    alert_group.raw_escalation_snapshot = alert_group.build_raw_escalation_snapshot()

    assert alert_group.channel_filter_snapshot.id == channel_filter.id


@pytest.mark.django_db
def test_channel_filter_snapshot_no_escalation_chain_exists(
    make_organization_and_user,
    make_alert_receive_channel,
    make_channel_filter,
    make_alert_group,
):
    organization, _ = make_organization_and_user()
    alert_receive_channel = make_alert_receive_channel(organization)
    channel_filter = make_channel_filter(alert_receive_channel)
    alert_group = make_alert_group(alert_receive_channel, channel_filter=channel_filter)

    alert_group.raw_escalation_snapshot = alert_group.build_raw_escalation_snapshot()

    assert alert_group.raw_escalation_snapshot["channel_filter_snapshot"] is None
    assert alert_group.channel_filter_snapshot is None


@pytest.mark.django_db
def test_channel_filter_snapshot_no_alert_group_raw_escalation_snapshot(
    make_organization_and_user,
    make_alert_receive_channel,
    make_channel_filter,
    make_alert_group,
):
    organization, _ = make_organization_and_user()
    alert_receive_channel = make_alert_receive_channel(organization)
    channel_filter = make_channel_filter(alert_receive_channel)
    alert_group = make_alert_group(alert_receive_channel, channel_filter=channel_filter)

    assert alert_group.channel_filter_snapshot is None


@pytest.mark.django_db
def test_escalation_chain_snapshot(
    make_organization_and_user,
    make_alert_receive_channel,
    make_channel_filter,
    make_escalation_chain,
    make_escalation_policy,
    make_alert_group,
):
    organization, _ = make_organization_and_user()
    alert_receive_channel = make_alert_receive_channel(organization)
    escalation_chain = make_escalation_chain(organization=organization)
    channel_filter = make_channel_filter(alert_receive_channel, escalation_chain=escalation_chain)
    make_escalation_policy(
        escalation_chain=channel_filter.escalation_chain,
        escalation_policy_step=EscalationPolicy.STEP_WAIT,
        wait_delay=EscalationPolicy.FIFTEEN_MINUTES,
    )

    alert_group = make_alert_group(alert_receive_channel, channel_filter=channel_filter)
    alert_group.raw_escalation_snapshot = alert_group.build_raw_escalation_snapshot()

    assert alert_group.escalation_chain_snapshot.id == escalation_chain.id


@pytest.mark.django_db
def test_escalation_chain_snapshot_no_escalation_chain_exists(
    make_organization_and_user,
    make_alert_receive_channel,
    make_channel_filter,
    make_alert_group,
):
    organization, _ = make_organization_and_user()
    alert_receive_channel = make_alert_receive_channel(organization)
    channel_filter = make_channel_filter(alert_receive_channel)
    alert_group = make_alert_group(alert_receive_channel, channel_filter=channel_filter)

    alert_group.raw_escalation_snapshot = alert_group.build_raw_escalation_snapshot()

    assert alert_group.raw_escalation_snapshot["escalation_chain_snapshot"] is None
    assert alert_group.escalation_chain_snapshot is None


@pytest.mark.django_db
def test_escalation_chain_snapshot_no_alert_group_raw_escalation_snapshot(
    make_organization_and_user,
    make_alert_receive_channel,
    make_channel_filter,
    make_alert_group,
):
    organization, _ = make_organization_and_user()
    alert_receive_channel = make_alert_receive_channel(organization)
    channel_filter = make_channel_filter(alert_receive_channel)
    alert_group = make_alert_group(alert_receive_channel, channel_filter=channel_filter)

    assert alert_group.escalation_chain_snapshot is None


@pytest.mark.django_db
def test_escalation_snapshot(
    make_organization_and_user,
    make_alert_receive_channel,
    make_channel_filter,
    make_alert_group,
):
    organization, _ = make_organization_and_user()
    alert_receive_channel = make_alert_receive_channel(organization)
    channel_filter = make_channel_filter(alert_receive_channel)
    alert_group = make_alert_group(alert_receive_channel, channel_filter=channel_filter)
    alert_group.raw_escalation_snapshot = alert_group.build_raw_escalation_snapshot()

    return_value = "asdfasdfasdf"
    with patch(
        "apps.alerts.escalation_snapshot.escalation_snapshot_mixin.EscalationSnapshotMixin._deserialize_escalation_snapshot",
        return_value=return_value,
    ):
        assert alert_group.escalation_snapshot == return_value


@pytest.mark.django_db
def test_escalation_snapshot_validation_error(
    make_organization_and_user,
    make_alert_receive_channel,
    make_channel_filter,
    make_alert_group,
):
    organization, _ = make_organization_and_user()
    alert_receive_channel = make_alert_receive_channel(organization)
    channel_filter = make_channel_filter(alert_receive_channel)
    alert_group = make_alert_group(alert_receive_channel, channel_filter=channel_filter)
    alert_group.raw_escalation_snapshot = alert_group.build_raw_escalation_snapshot()

    with patch(
        "apps.alerts.escalation_snapshot.escalation_snapshot_mixin.EscalationSnapshotMixin._deserialize_escalation_snapshot",
        side_effect=ValidationError("asdfasdf"),
    ):
        assert alert_group.escalation_snapshot is None


@pytest.mark.django_db
def test_escalation_snapshot_no_alert_group_raw_escalation_snapshot(
    make_organization_and_user,
    make_alert_receive_channel,
    make_channel_filter,
    make_alert_group,
):
    organization, _ = make_organization_and_user()
    alert_receive_channel = make_alert_receive_channel(organization)
    channel_filter = make_channel_filter(alert_receive_channel)
    alert_group = make_alert_group(alert_receive_channel, channel_filter=channel_filter)

    assert alert_group.escalation_snapshot is None


@pytest.mark.django_db
def test_escalation_snapshot_empty_escalation_policies_snapshot(
    make_organization_and_user,
    make_alert_receive_channel,
    make_channel_filter,
    make_alert_group,
):
    organization, _ = make_organization_and_user()
    alert_receive_channel = make_alert_receive_channel(organization)
    channel_filter = make_channel_filter(alert_receive_channel)
    alert_group = make_alert_group(alert_receive_channel, channel_filter=channel_filter)
    alert_group.raw_escalation_snapshot = alert_group.build_raw_escalation_snapshot()

    assert alert_group.raw_escalation_snapshot is not None
    assert alert_group.has_escalation_policies_snapshots is False


@pytest.mark.django_db
def test_escalation_snapshot_nonempty_escalation_policies_snapshot(
    make_organization_and_user,
    make_alert_receive_channel,
    make_channel_filter,
    make_escalation_chain,
    make_escalation_policy,
    make_alert_group,
):
    organization, _ = make_organization_and_user()
    alert_receive_channel = make_alert_receive_channel(organization)
    escalation_chain = make_escalation_chain(organization=organization)
    channel_filter = make_channel_filter(alert_receive_channel, escalation_chain=escalation_chain)
    make_escalation_policy(
        escalation_chain=channel_filter.escalation_chain,
        escalation_policy_step=EscalationPolicy.STEP_WAIT,
        wait_delay=EscalationPolicy.FIFTEEN_MINUTES,
    )

    alert_group = make_alert_group(alert_receive_channel, channel_filter=channel_filter)
    alert_group.raw_escalation_snapshot = alert_group.build_raw_escalation_snapshot()

    assert alert_group.raw_escalation_snapshot is not None
    assert alert_group.has_escalation_policies_snapshots is True


@pytest.mark.django_db
def test_has_escalation_policies_snapshots_no_alert_group_raw_escalation_snapshot(
    make_organization_and_user,
    make_alert_receive_channel,
    make_channel_filter,
    make_alert_group,
):
    organization, _ = make_organization_and_user()
    alert_receive_channel = make_alert_receive_channel(organization)
    channel_filter = make_channel_filter(alert_receive_channel)
    alert_group = make_alert_group(alert_receive_channel, channel_filter=channel_filter)

    assert alert_group.raw_escalation_snapshot is None
    assert alert_group.has_escalation_policies_snapshots is False

@patch("apps.alerts.models.alert_group.AlertGroup.slack_channel_id", new_callable=PropertyMock)
@pytest.mark.django_db
def test_deserialize_escalation_snapshot(
    mock_alert_group_slack_channel_id,
    make_organization_and_user,
    make_alert_receive_channel,
    make_channel_filter,
    make_escalation_chain,
    make_escalation_policy,
    make_alert_group,
):
    mock_alert_group_slack_channel_id.return_value = MOCK_SLACK_CHANNEL_ID

    organization, _ = make_organization_and_user()
    alert_receive_channel = make_alert_receive_channel(organization)
    escalation_chain = make_escalation_chain(organization=organization)
    channel_filter = make_channel_filter(alert_receive_channel, escalation_chain=escalation_chain)
    escalation_policy = make_escalation_policy(
        escalation_chain=channel_filter.escalation_chain,
        escalation_policy_step=EscalationPolicy.STEP_WAIT,
        wait_delay=EscalationPolicy.FIFTEEN_MINUTES,
    )

    alert_group = make_alert_group(alert_receive_channel, channel_filter=channel_filter)
    alert_group.raw_escalation_snapshot = alert_group.build_raw_escalation_snapshot()

    deserialized_escalation_snapshot = alert_group._deserialize_escalation_snapshot(alert_group.raw_escalation_snapshot)

    assert deserialized_escalation_snapshot.alert_group == alert_group
    assert deserialized_escalation_snapshot.channel_filter_snapshot.id == channel_filter.id
    assert deserialized_escalation_snapshot.escalation_chain_snapshot.id == escalation_chain.id
    assert deserialized_escalation_snapshot.last_active_escalation_policy_order is None
    assert len(deserialized_escalation_snapshot.escalation_policies_snapshots) == 1
    assert deserialized_escalation_snapshot.escalation_policies_snapshots[0].id == escalation_policy.id
    assert deserialized_escalation_snapshot.slack_channel_id == MOCK_SLACK_CHANNEL_ID
    assert deserialized_escalation_snapshot.pause_escalation is False
    assert deserialized_escalation_snapshot.next_step_eta is None
    assert deserialized_escalation_snapshot.stop_escalation is False


@pytest.mark.django_db
def test_escalation_chain_exists(
    make_organization_and_user,
    make_alert_receive_channel,
    make_channel_filter,
    make_escalation_chain,
    make_alert_group,
):
    organization, _ = make_organization_and_user()
    alert_receive_channel = make_alert_receive_channel(organization)
    escalation_chain = make_escalation_chain(organization=organization)
    channel_filter = make_channel_filter(alert_receive_channel, escalation_chain=escalation_chain)
    alert_group = make_alert_group(alert_receive_channel, channel_filter=channel_filter)

    assert alert_group.pause_escalation is False
    assert alert_group.escalation_chain_exists is True


@patch(
    "apps.alerts.escalation_snapshot.escalation_snapshot_mixin.EscalationSnapshotMixin.pause_escalation",
    new_callable=PropertyMock,
)
@pytest.mark.django_db
def test_escalation_chain_exists_paused_escalation(
    mocked_pause_escalation,
    make_organization_and_user,
    make_alert_receive_channel,
    make_alert_group,
):
    mocked_pause_escalation.return_value = True

    organization, _ = make_organization_and_user()
    alert_receive_channel = make_alert_receive_channel(organization)
    alert_group = make_alert_group(alert_receive_channel)

    assert alert_group.pause_escalation is True
    assert alert_group.escalation_chain_exists is False


@pytest.mark.django_db
def test_escalation_chain_exists_no_channel_filter(
    make_organization_and_user,
    make_alert_receive_channel,
    make_alert_group,
):
    organization, _ = make_organization_and_user()
    alert_receive_channel = make_alert_receive_channel(organization)
    alert_group = make_alert_group(alert_receive_channel)

    assert alert_group.pause_escalation is False
    assert alert_group.channel_filter is None
    assert alert_group.escalation_chain_exists is False


@pytest.mark.django_db
def test_escalation_chain_exists_no_channel_filter_escalation_chain(
    make_organization_and_user,
    make_alert_receive_channel,
    make_channel_filter,
    make_alert_group,
):
    organization, _ = make_organization_and_user()
    alert_receive_channel = make_alert_receive_channel(organization)
    channel_filter = make_channel_filter(alert_receive_channel)
    alert_group = make_alert_group(alert_receive_channel, channel_filter=channel_filter)

    assert alert_group.pause_escalation is False
    assert alert_group.channel_filter == channel_filter
    assert alert_group.channel_filter.escalation_chain is None
    assert alert_group.escalation_chain_exists is False


@pytest.mark.django_db
def test_pause_escalation_no_raw_escalation_snapshot(
    make_organization_and_user,
    make_alert_receive_channel,
    make_alert_group,
):
    organization, _ = make_organization_and_user()
    alert_receive_channel = make_alert_receive_channel(organization)
    alert_group = make_alert_group(alert_receive_channel)

    assert alert_group.raw_escalation_snapshot is None
    assert alert_group.pause_escalation is False


@pytest.mark.django_db
def test_pause_escalation_raw_escalation_snapshot_exists(
    make_organization_and_user,
    make_alert_receive_channel,
    make_alert_group,
):
    organization, _ = make_organization_and_user()
    alert_receive_channel = make_alert_receive_channel(organization)
    alert_group = make_alert_group(alert_receive_channel)
    alert_group.raw_escalation_snapshot = alert_group.build_raw_escalation_snapshot()

    assert alert_group.raw_escalation_snapshot is not None
    assert alert_group.raw_escalation_snapshot["pause_escalation"] is False

    alert_group.raw_escalation_snapshot["pause_escalation"] = True

    assert alert_group.pause_escalation is True


@pytest.mark.django_db
def test_next_step_eta_no_raw_escalation_snapshot(
    make_organization_and_user,
    make_alert_receive_channel,
    make_alert_group,
):
    organization, _ = make_organization_and_user()
    alert_receive_channel = make_alert_receive_channel(organization)
    alert_group = make_alert_group(alert_receive_channel)

    assert alert_group.raw_escalation_snapshot is None
    assert alert_group.next_step_eta is None


@pytest.mark.django_db
def test_next_step_eta_no_next_step_eta(
    make_organization_and_user,
    make_alert_receive_channel,
    make_alert_group,
):
    organization, _ = make_organization_and_user()
    alert_receive_channel = make_alert_receive_channel(organization)
    alert_group = make_alert_group(alert_receive_channel)
    alert_group.raw_escalation_snapshot = alert_group.build_raw_escalation_snapshot()

    assert alert_group.raw_escalation_snapshot is not None
    assert alert_group.raw_escalation_snapshot["next_step_eta"] is None
    assert alert_group.next_step_eta is None


@patch("apps.alerts.escalation_snapshot.escalation_snapshot_mixin.parse")
@pytest.mark.django_db
def test_next_step_eta(
    mock_dateutil_parser,
    make_organization_and_user,
    make_alert_receive_channel,
    make_alert_group,
):
    mocked_raw_date = "mcvnmcvmnvc"
    mocked_parsed_date = "asdfasdfaf"
    mock_dateutil_parser.return_value.replace.return_value = mocked_parsed_date

    organization, _ = make_organization_and_user()
    alert_receive_channel = make_alert_receive_channel(organization)
    alert_group = make_alert_group(alert_receive_channel)
    alert_group.raw_escalation_snapshot = alert_group.build_raw_escalation_snapshot()
    alert_group.raw_escalation_snapshot["next_step_eta"] = mocked_raw_date

    assert alert_group.raw_escalation_snapshot is not None
    assert alert_group.raw_escalation_snapshot["next_step_eta"] is mocked_raw_date
    assert alert_group.next_step_eta == mocked_parsed_date

    mock_dateutil_parser.assert_called_once_with(mocked_raw_date)
    mock_dateutil_parser.return_value.replace.assert_called_once_with(tzinfo=pytz.UTC)

@@ -5,11 +5,11 @@ from django.conf import settings
 
 def get_random_readonly_database_key_if_present_otherwise_default() -> str:
     """
-    This function returns a string, representing a key in the DATABASES django settings.
-    If settings.READONLY_DATABASES is set, and non-empty, it randomly chooses one of the read-only databases,
+    This function returns a string, representing a key in the `DATABASES` django settings.
+    If `settings.READONLY_DATABASES` is set, and non-empty, it randomly chooses one of the read-only databases,
     otherwise it falls back to "default".
 
-    This is primarily intended to be used for django's QuerySet.using() function
+    This is primarily intended to be used for django's `QuerySet.using()` function
     """
     using_db = "default"
     if hasattr(settings, "READONLY_DATABASES") and len(settings.READONLY_DATABASES) > 0:
@@ -49,9 +49,6 @@ class Command(BaseCommand):
             alert_group.unsilence_task_uuid = task_id
 
             escalation_start_time = max(now, alert_group.silenced_until)
-            alert_group.estimate_escalation_finish_time = alert_group.calculate_eta_for_finish_escalation(
-                start_time=escalation_start_time,
-            )
             alert_groups_to_update.append(alert_group)
 
             tasks.append(
@@ -65,9 +62,6 @@ class Command(BaseCommand):
             # otherwise start escalate_alert_group task
             else:
                 if alert_group.escalation_snapshot:
-                    alert_group.estimate_escalation_finish_time = alert_group.calculate_eta_for_finish_escalation(
-                        escalation_started=True,
-                    )
                 alert_group.active_escalation_id = task_id
                 alert_groups_to_update.append(alert_group)
@@ -82,7 +76,7 @@ class Command(BaseCommand):
 
         AlertGroup.all_objects.bulk_update(
             alert_groups_to_update,
-            ["estimate_escalation_finish_time", "active_escalation_id", "unsilence_task_uuid"],
+            ["active_escalation_id", "unsilence_task_uuid"],
             batch_size=5000,
         )
@@ -1,9 +1,12 @@
+import os
 import shlex
 import subprocess
 
 from django.core.management.base import BaseCommand
 from django.utils import autoreload
 
+from common.utils import getenv_boolean
+
 WORKER_ID = 0
@@ -11,8 +14,18 @@ def restart_celery(*args, **kwargs):
     global WORKER_ID
     kill_worker_cmd = "celery -A engine control shutdown"
     subprocess.call(shlex.split(kill_worker_cmd))
-    start_worker_cmd = "celery -A engine worker -l info --concurrency=3 -Q celery,retry -n {}".format(WORKER_ID)
-    subprocess.call(shlex.split(start_worker_cmd))
+
+    queues = os.environ.get("CELERY_WORKER_QUEUE", "celery,retry")
+    max_tasks_per_child = os.environ.get("CELERY_WORKER_MAX_TASKS_PER_CHILD", 100)
+    concurrency = os.environ.get("CELERY_WORKER_CONCURRENCY", 3)
+    log_level = "debug" if getenv_boolean("CELERY_WORKER_DEBUG_LOGS", False) else "info"
+
+    celery_args = f"-A engine worker -l {log_level} --concurrency={concurrency} -Q {queues} --max-tasks-per-child={max_tasks_per_child} -n {WORKER_ID}"
+
+    if getenv_boolean("CELERY_WORKER_BEAT_ENABLED", False):
+        celery_args += " --beat"
+
+    subprocess.call(shlex.split(f"celery {celery_args}"))
     WORKER_ID = 1 + WORKER_ID
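The rewritten command assembles the worker's command line from the same env vars used by `engine/celery_with_exporter.sh`. The assembly can be sketched in isolation; `getenv_boolean` is re-implemented here under the assumption that it treats common truthy strings as `True` (the real helper lives in `common.utils`), and `build_celery_args` is a hypothetical name for illustration:

```python
import os


def getenv_boolean(name: str, default: bool) -> bool:
    # assumed behaviour of common.utils.getenv_boolean: truthy-string parsing
    return os.environ.get(name, str(default)).lower() in ("true", "1", "yes")


def build_celery_args(worker_id: int) -> str:
    # mirror the env-var handling from the management command above
    queues = os.environ.get("CELERY_WORKER_QUEUE", "celery,retry")
    max_tasks_per_child = os.environ.get("CELERY_WORKER_MAX_TASKS_PER_CHILD", 100)
    concurrency = os.environ.get("CELERY_WORKER_CONCURRENCY", 3)
    log_level = "debug" if getenv_boolean("CELERY_WORKER_DEBUG_LOGS", False) else "info"

    args = (
        f"-A engine worker -l {log_level} --concurrency={concurrency} "
        f"-Q {queues} --max-tasks-per-child={max_tasks_per_child} -n {worker_id}"
    )
    # appending --beat runs the scheduler inside the worker, which is what
    # makes local testing of periodic tasks easier
    if getenv_boolean("CELERY_WORKER_BEAT_ENABLED", False):
        args += " --beat"
    return args
```

Keeping the argument-building separate from `subprocess.call` like this also makes the command string easy to inspect when debugging the local setup.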
@@ -51,3 +51,4 @@ pyroscope-io==0.8.1
 django-dbconn-retry==0.1.7
 django-ipware==4.0.2
 django-anymail==8.6
+django-deprecate-fields==0.1.1
@@ -395,6 +395,10 @@ CELERY_MAX_TASKS_PER_CHILD = 1
 CELERY_WORKER_SEND_TASK_EVENTS = True
 CELERY_TASK_SEND_SENT_EVENT = True
 
+ALERT_GROUP_ESCALATION_AUDITOR_CELERY_TASK_HEARTBEAT_URL = os.getenv(
+    "ALERT_GROUP_ESCALATION_AUDITOR_CELERY_TASK_HEARTBEAT_URL", None
+)
+
 CELERY_BEAT_SCHEDULE = {
     "restore_heartbeat_tasks": {
         "task": "apps.heartbeat.tasks.restore_heartbeat_tasks",
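The new setting above is read straight from the environment and defaults to `None`, meaning heartbeat reporting is disabled unless configured. A minimal sketch of how a task might ping such a URL; `ping_auditor_heartbeat` is a hypothetical name, and a plain GET via `urllib` is an assumption about the heartbeat protocol:

```python
import os
import urllib.request

# same env var name as in the settings hunk above; evaluated once at import
# time, mirroring how Django settings modules are loaded
HEARTBEAT_URL = os.getenv("ALERT_GROUP_ESCALATION_AUDITOR_CELERY_TASK_HEARTBEAT_URL", None)


def ping_auditor_heartbeat() -> bool:
    # no-op when no heartbeat URL is configured
    if HEARTBEAT_URL is None:
        return False
    # a simple GET is assumed; services like healthchecks.io work this way
    urllib.request.urlopen(HEARTBEAT_URL, timeout=5)
    return True
```

The task would call this only after its audit completes successfully, so a missed heartbeat signals either a failed or a stalled run.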
@@ -403,7 +407,11 @@ CELERY_BEAT_SCHEDULE = {
     },
     "check_escalations": {
         "task": "apps.alerts.tasks.check_escalation_finished.check_escalation_finished_task",
-        "schedule": 10 * 60,
+        # the task should be executed a minute or two less than the integration's configured interval
+        #
+        # ex. if the integration is configured to expect a heartbeat every 15 minutes then this value should be set
+        # to something like 13 * 60 (every 13 minutes)
+        "schedule": getenv_integer("ALERT_GROUP_ESCALATION_AUDITOR_CELERY_TASK_HEARTBEAT_INTERVAL", 13 * 60),
         "args": (),
     },
     "start_refresh_ical_files": {
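The schedule above falls back to 13 minutes (two minutes less than a 15-minute heartbeat expectation) when the interval env var is unset. A sketch of a `getenv_integer`-style helper showing that fallback; the re-implementation is an assumption, as the real helper lives in OnCall's settings utilities:

```python
import os


def getenv_integer(name: str, default: int) -> int:
    # assumed behaviour: parse the env var as an int, else use the default
    value = os.environ.get(name)
    return int(value) if value is not None else default


# default auditor schedule: every 13 minutes, expressed in seconds
AUDITOR_SCHEDULE = getenv_integer(
    "ALERT_GROUP_ESCALATION_AUDITOR_CELERY_TASK_HEARTBEAT_INTERVAL", 13 * 60
)
```

Keeping the schedule slightly shorter than the heartbeat integration's expected interval leaves headroom for task startup latency, so a healthy auditor never appears late.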