DNS HealthCheck Failure Reference

Objective

This document describes the errors that can cause a DNS health check to fail and the remedial actions for each. DNS health checks periodically check the status of the pool of origins that a DNS load balancer serves. For more information on DNS load balancers and health checks, see DNS Load Balancer.

Use the information provided in this guide to determine why a health check failure occurred for your pool and which actions can resolve the issue.


Errors and Remediation

The following sections describe the errors that cause DNS health check failures, why each error occurs, and the associated remedial actions:

Error: Received String did not Match Configured Value for Monitor

Description: The monitor was configured with a specific string to match, but the health check received a different string.

Remediation: Investigate further. This error can indicate either an application failure (for example, a failing database connection) or a misconfigured health check.
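
To tell an application failure apart from a misconfigured monitor, it can help to replay the request by hand and inspect what the origin actually returns. The following is a minimal sketch, not the monitor's implementation. It assumes a plain TCP health check and assumes that matching means "the expected string appears somewhere in the response"; the actual matching rules may differ. The function names and the example host, path, and strings are hypothetical.

```python
import socket


def response_matches(response: bytes, expected: str) -> bool:
    """Return True if the expected string occurs anywhere in the response.

    Substring matching is an assumption made for this sketch.
    """
    return expected.encode("utf-8") in response


def fetch_response(host: str, port: int, send_string: str,
                   timeout: float = 5.0, max_bytes: int = 4096) -> bytes:
    """Send send_string over plain TCP and return the first reply bytes.

    A single recv() call may return only part of a long response, which is
    enough for checking a status line or a short banner.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(send_string.encode("utf-8"))
        return sock.recv(max_bytes)


# Example usage (hypothetical origin and strings; replace with your own):
# reply = fetch_response("origin.example.com", 80,
#                        "GET /health HTTP/1.0\r\nHost: origin.example.com\r\n\r\n")
# print(response_matches(reply, "200 OK"), reply[:200])
```

If the reply printed by such a script contains an application error instead of the expected string, the application is the likely cause. If the reply looks healthy but differs slightly from the configured string, the monitor configuration is the likely cause.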


Error: ICMP Ping Failure for Monitor from at least One Region

Description: ICMP health checks run from all F5 Distributed Cloud regions. This error indicates that at least one region could not complete the ICMP health check against the origin server.

Remediation: If the error persists, check whether an ACL is blocking the F5 Distributed Cloud IP addresses used to perform health checks (see this page).


Error: Connection Aborted for Monitor from at least One Region

Description: While performing the health check, a connection with the origin server was established but was then closed by a reset (RST) packet issued by the origin server.

Remediation: Check whether an ACL or a security device is blocking the F5 Distributed Cloud IP addresses used to perform health checks (see this page).


Error: Connection reset for monitor from at least One Region

Description: While running the health check, a connection to the origin server was established, but the server reset the connection before sending a response. This can happen, for example, when the remote server is configured to ignore invalid requests or when the remote server crashes.

Remediation: Check the remote server configuration, or modify the health check settings to match the remote server configuration.


Error: Connection Refused for Monitor from at least One Region

Description: Health checks from at least one region were refused. Usually, a SYN packet was observed and a reset (RST) packet or an ICMP "connection refused" message was received in response.

Remediation: Check whether an ACL or a security device is blocking the F5 Distributed Cloud IP addresses used to perform health checks (see this page).


Error: Connection Timed Out for Monitor from at least One Region

Description: The origin server could not be reached in a timely manner. This can be caused by an incorrect IP address or by the remote server being temporarily unavailable (for example, while rebooting).

Remediation: Check the health check configuration, and check that the origin server is reachable.


Error: Network Unreachable for Monitor from at least One Region

Description: The network to which the origin server belongs is not reachable from the F5 Distributed Cloud backbone.

Remediation: Check the IP address configured for the origin server, or whether there is an ongoing network outage on the origin server network.


Error: Host Unreachable for Monitor from at least One Region

Description: Same as Network Unreachable, but the error applies only to a specific host.

Remediation: Check that the host is reachable and that no ACL or firewall rules are blocking access from F5 Distributed Cloud.
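
The connection-level errors above (aborted, reset, refused, timed out, network unreachable, host unreachable) correspond roughly to distinct operating-system errors, so a quick TCP connection attempt from a test machine can narrow down which one applies. The following is a minimal sketch under that assumption. The function names are hypothetical, the categories are local approximations rather than the monitor's own classification, and a test from your own machine does not reproduce ACL rules that target the F5 Distributed Cloud IP addresses.

```python
import errno
import socket


def classify_connect_error(exc: OSError) -> str:
    """Map an OSError from a TCP connection attempt to a rough category."""
    if isinstance(exc, ConnectionRefusedError):
        return "connection refused"
    if isinstance(exc, ConnectionResetError):
        return "connection reset"
    if isinstance(exc, ConnectionAbortedError):
        return "connection aborted"
    if isinstance(exc, (socket.timeout, TimeoutError)):
        return "connection timed out"
    if exc.errno == errno.ENETUNREACH:
        return "network unreachable"
    if exc.errno == errno.EHOSTUNREACH:
        return "host unreachable"
    return f"other error: {exc}"


def try_connect(host: str, port: int, timeout: float = 5.0) -> str:
    """Attempt one TCP connection and describe the outcome.

    Resets and aborts usually surface later, while sending or receiving;
    classify_connect_error() can be applied to those exceptions as well.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "connected"
    except OSError as exc:
        return classify_connect_error(exc)


# Example usage (hypothetical origin address):
# print(try_connect("203.0.113.10", 443))
```

A result of "connected" from your machine combined with a failing monitor suggests that filtering is applied specifically to the health check source addresses.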


Error: Timeout Waiting for Response for Monitor from at least One Region

Description: The connection to the endpoint succeeded, but the expected response, as configured in the receive parameter of the health check, was not received within the 30-second timeout.

Remediation: Check that the host sends a response to the health check request.
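
One way to check whether the host responds in time is to measure how long it takes to return its first bytes after receiving the request. The following is a minimal sketch, assuming a plain TCP health check. The function name is hypothetical, and the default timeout simply mirrors the 30-second value stated in the description above.

```python
import socket
import time

RESPONSE_TIMEOUT_S = 30.0  # Timeout value taken from the error description.


def time_to_first_byte(sock: socket.socket, send_bytes: bytes,
                       timeout: float = RESPONSE_TIMEOUT_S):
    """Send send_bytes on a connected socket and time the first reply.

    Returns the elapsed seconds until the first reply bytes arrive, or None
    if no data arrives (the wait timed out or the peer closed the connection).
    """
    sock.settimeout(timeout)
    start = time.monotonic()
    sock.sendall(send_bytes)
    try:
        data = sock.recv(4096)
    except socket.timeout:
        return None
    if not data:
        return None
    return time.monotonic() - start


# Example usage (hypothetical origin and request):
# with socket.create_connection(("origin.example.com", 80), timeout=5.0) as s:
#     print(time_to_first_byte(s, b"GET /health HTTP/1.0\r\n\r\n"))
```

A return value close to the timeout, or None, indicates that the application is slow to answer or never answers the request as sent.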


Error: Exceeded maxBytes (1MB) for Response Received for Monitor from at least One Region

Description: The maximum response size for a health check is 1 MB. This error means that the response exceeded that size.

Remediation: Reduce the response size to less than 1 MB.
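
To confirm how large the health check response is, the response can be read in chunks and counted against the limit. The following is a minimal sketch. It assumes that "1 MB" means 1,048,576 bytes and does not establish whether the monitor counts protocol headers toward the limit; both points are assumptions. The function name is hypothetical.

```python
import io

MAX_BYTES = 1024 * 1024  # Assumption: "1 MB" means 1,048,576 bytes.


def exceeds_limit(stream, limit: int = MAX_BYTES,
                  chunk_size: int = 64 * 1024) -> bool:
    """Return True if the binary stream holds more than limit bytes.

    Reading stops as soon as the limit is crossed, so an oversized
    response is never loaded fully into memory.
    """
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            return False
        total += len(chunk)
        if total > limit:
            return True


# Example usage with an HTTP endpoint (hypothetical URL):
# import urllib.request
# with urllib.request.urlopen("http://origin.example.com/health") as resp:
#     print(exceeds_limit(resp))
```

A dedicated, lightweight health endpoint that returns only a short status string is one way to keep the response well under the limit.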


Error: Error Encountered in Sending a Monitor Request from at least One Region

Description: The connection to the endpoint succeeded, but an error occurred while sending the send string to the endpoint.

Remediation: This is most likely a transient error. Either the message was too large or the socket was disconnected from the endpoint. Check whether the health check request works when you run it locally.


Error: Error Encountered in Receiving a Monitor Response from at least One Region

Description: The connection to the endpoint succeeded and the send string was sent, but an error occurred while receiving the response.

Remediation: This is most likely a transient error. Check whether the health check request works when you run it locally.


Error: Encountered Internal Error for Monitor from at least One Region

Description: An internal error was encountered when attempting to fire a health check request.

Remediation: This is likely an issue with the F5XC monitoring service. File a support request if the error persists.


Error: Unknown error

Description: This error is similar to the internal error. See Encountered Internal Error for Monitor from at least One Region for more information.


Error: TLS Handshake Failure for Monitor Request from at least One Region

Description: An error occurred during the initial TLS handshake, which authenticates the server, determines the TLS version and cipher suite used for the connection, and exchanges the symmetric session key used for communication.

Remediation: A TLS handshake can fail for many reasons, including an incorrect system time, a man-in-the-middle attack, or a cipher suite used by the client that the origin server does not support.
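
To narrow down the cause, a test handshake against the origin shows either the negotiated TLS version and cipher suite or the reason for the failure. The following is a minimal sketch that uses the default settings of Python's standard ssl module. Those settings are an assumption and will not exactly match the TLS versions and cipher suites offered by the health check client. The function names and the example host are hypothetical.

```python
import socket
import ssl


def describe_failure(exc: Exception) -> str:
    """Summarize an exception raised while connecting or handshaking."""
    if isinstance(exc, ssl.SSLCertVerificationError):
        # Covers expired or not-yet-valid certificates (often a clock
        # problem), untrusted issuers, and hostname mismatches.
        detail = getattr(exc, "verify_message", None) or str(exc)
        return f"certificate verification failed: {detail}"
    if isinstance(exc, ssl.SSLError):
        # Covers protocol-level problems such as no shared cipher suite.
        detail = getattr(exc, "reason", None) or str(exc)
        return f"TLS error: {detail}"
    return f"connection error: {exc}"


def probe_tls(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Attempt a TLS handshake and report the result as a string."""
    context = ssl.create_default_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with context.wrap_socket(raw, server_hostname=host) as tls:
                name, _protocol, bits = tls.cipher()
                return f"ok: {tls.version()} {name} ({bits} bits)"
    except OSError as exc:  # ssl.SSLError is a subclass of OSError.
        return describe_failure(exc)


# Example usage (hypothetical origin):
# print(probe_tls("origin.example.com"))
```

A certificate verification failure points toward certificate validity or system time, while a protocol-level TLS error points toward a version or cipher suite mismatch.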