Alerts Reference
Objective
This document provides reference information for the alerts supported by F5® Distributed Cloud Services. Use this information to understand the details of each alert and the actions required to address it.
In Distributed Cloud Console, the Alerts page displays two tabs: Active Alerts and All Alerts.
Active Alerts
An alert is generated when its alert condition evaluates to true. Alert rules are evaluated periodically, and the alert status remains active as long as the alert condition holds.
Note: There are two alert APIs. The Get Alerts API returns active alerts, whose state is always active. The Alerts History API returns a history of alert notifications for a selected time interval; the status of each notification can be firing (which is the same as active) or resolved.
The following are some keys and their corresponding values for an active alert (can be viewed from the JSON view of an alert in Console):
- state: The value is active.
- startsAt: The time at which the alert started firing.
- endsAt: The time at which the alert was resolved (if it is resolved). Ignore this field while the alert status is active or firing.
- generatorURL: Identifies the entity that generated this alert. Because this is an internal URL, it is always set to "".
- silencedBy and inhibitedBy: These are always null and should be ignored.
- receivers: The list of alert receivers this alert notification was sent to, based on the user-configured alert policy. This list is empty if no alert policy is configured or the alert did not match any configured policy.
- fingerprint: This is a hash of the key-value pairs in the alert, and it uniquely identifies an alert.
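The fields above can be read directly from the alert JSON. A minimal sketch in Python, using the field names documented above; the sample values (timestamps, fingerprint) are illustrative only:

```python
import json

# Illustrative active-alert JSON using the documented field names.
alert_json = """
{
  "state": "active",
  "startsAt": "2024-01-15T10:00:00Z",
  "endsAt": "0001-01-01T00:00:00Z",
  "generatorURL": "",
  "silencedBy": null,
  "inhibitedBy": null,
  "receivers": [],
  "fingerprint": "c4f3a9d2e1b0"
}
"""

alert = json.loads(alert_json)

# endsAt is meaningless while the alert is active, so ignore it here.
if alert["state"] == "active":
    print(f"alert {alert['fingerprint']} firing since {alert['startsAt']}")
```

Because fingerprint uniquely identifies an alert, it is a natural key for de-duplicating alerts collected across repeated API polls.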
All Alerts
The All Alerts tab shows the history of alerts triggered for the selected date and time interval. The following are some keys and their corresponding values (can be viewed from the JSON view of an alert in Console):
- status: An alert can have one of the following values:
- firing: This is the same as active state.
- resolved: This indicates that the alert is resolved.
- startsAt: The time at which the alert started firing.
- endsAt: The time at which the alert was resolved (if it is resolved). Ignore this field while the alert status is active or firing.
Key Points
An alert is resolved in the following cases:
- The alert condition is no longer active.
- The alert has a valid endsAt time, and that time has lapsed.
- The alert has no valid endsAt time, and no updates are received for the resolve_timeout duration (15 minutes).
Note: For an active alert, you can ignore the endsAt time. The entity generating the alert may set it, and the alert manager resolves the alert after that time has lapsed.
There is no separate alert for the health score, because the health score is composed of multiple components. For example, the health score of a site is computed from the data-plane connection status to the Regional Edge (RE) sites, the control-plane connection status, and the K8s API server status on the site. Individual alerts are defined for each of these conditions, but no alert is available for the health score itself.
Note: You can obtain the health score of a site in F5® Distributed Cloud Console (Console). You can also obtain it using the NodeQuery API (https://www.volterra.io/docs/api/graph-connectivity#operation/ves.io.schema.graph.connectivity.CustomAPI.NodeQuery) with "field_selector":{"healthscore":{"types":["HEALTHSCORE_OVERALL"]}}.
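A sketch of building the NodeQuery request body carrying the field_selector shown above. The endpoint path and authentication details are not documented here, so only the body is constructed; see the linked API reference for the full request:

```python
import json

# field_selector payload from the note above; only HEALTHSCORE_OVERALL is
# documented here, so that is the only type requested.
body = {
    "field_selector": {
        "healthscore": {"types": ["HEALTHSCORE_OVERALL"]}
    }
}

print(json.dumps(body))
```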
The amount of time before alert generation is not the same for all alerts; the duration is determined by the severity of the alert. For example, an alert is raised as soon as the tunnel connection to the RE goes down, whereas the health check alert for a service is raised only if the condition persists for 10 minutes. This keeps the alert volume at a manageable level and avoids generating alerts for temporary or transient failure conditions.
You cannot change the threshold for alerts.
You cannot define new alerts using an API. However, if the existing alerts do not satisfy your requirements, you can create a support request for a new alert in Console.
Alerts and Descriptions
The following table presents alerts and associated details, such as group, type, severity, and associated actions.
TSA Severity vs Anomaly Scores
Time-Series Anomaly (TSA) alerts are generated when the anomaly detection algorithm determines anomalies in any one of the following metrics:
- Request rate
- Error Rate
- Response Throughput
- Request Throughput
- Response Latency
Note: The metrics are evaluated in requests per second (rps), errors per second (erps), seconds (s), and Megabits per second (Mbps).
The alerts are classified into three groups (minor, major, and critical) based on severity. The minimum (absolute) thresholds for the metrics to trigger these alerts are provided in the following table.
Metric | Severity | Anomaly Score | Absolute Threshold | Alert |
---|---|---|---|---|
Request Rate | minor | 0.6 | 5 rps | RequestRateAnomaly |
Request Rate | major | 1.5 | 50 rps | RequestRateAnomaly |
Request Rate | critical | 3.0 | 100 rps | RequestRateAnomaly |
Request Throughput | minor | 0.6 | 0.25 Mbps | RequestThroughputAnomaly |
Request Throughput | major | 1.5 | 2.5 Mbps | RequestThroughputAnomaly |
Request Throughput | critical | 3.0 | 5 Mbps | RequestThroughputAnomaly |
Response Throughput | minor | 0.6 | 2.5 Mbps | ResponseThroughputAnomaly |
Response Throughput | major | 1.5 | 25 Mbps | ResponseThroughputAnomaly |
Response Throughput | critical | 3.0 | 50 Mbps | ResponseThroughputAnomaly |
Response Latency | minor | 0.6 | 5 s | ResponseLatencyAnomaly |
Response Latency | major | 1.5 | 50 s | ResponseLatencyAnomaly |
Response Latency | critical | 3.0 | 100 s | ResponseLatencyAnomaly |
Error Rate | minor | 0.6 | 2.5 erps | ErrorRateAnomaly |
Error Rate | major | 1.5 | 25 erps | ErrorRateAnomaly |
Error Rate | critical | 3.0 | 50 erps | ErrorRateAnomaly |
Note: For more information on TSA, see the Configure DDoS Detection guide.
In the case of an L7 DDoS event, the minimum thresholds are the same as the absolute thresholds defined for critical TSA alerts. That is, for an L7 DDoS event, the following minimum thresholds are defined for the metrics and are not configurable by end users:
- Request Rate: 100 rps
- Error Rate: 50 erps
- Request Throughput: 5 Mbps
- Response Throughput: 50 Mbps
- Response Latency: 100 s
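The fixed L7 DDoS minimums above can be captured in a small lookup, for example to pre-filter metric samples that cannot qualify for a DDoS event. The metric key names here are illustrative; the threshold values are from the list above:

```python
# Non-configurable minimum thresholds for an L7 DDoS event (from the list
# above). Keys are illustrative names; units are encoded in the suffix.
L7_DDOS_MIN = {
    "request_rate_rps": 100.0,
    "error_rate_erps": 50.0,
    "request_throughput_mbps": 5.0,
    "response_throughput_mbps": 50.0,
    "response_latency_s": 100.0,
}

def meets_ddos_threshold(metric: str, value: float) -> bool:
    """True if the sampled value reaches the L7 DDoS minimum for its metric."""
    return value >= L7_DDOS_MIN[metric]

print(meets_ddos_threshold("request_rate_rps", 120.0))  # True
```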