Security

In order to provide a secure environment for applications, data, and connectivity, many aspects of security need to be chained together, with each link rooted in a hard-to-compromise system backed by cryptographically secure building blocks. Our approach to security is to define and implement these building blocks only after vetting each of them against industry best practice or academic peer review.

We implement multi-layer defense using the following techniques as shown in the diagram below.

Figure: High-Level View of Multi-Layer Security
  1. Identity Management - This provides a common identity to every software component that runs in the system - both customer workloads and our microservices. A common identity is critical to ensure interworking across different cloud platforms.
  2. Authentication and Authorization - This identity forms the basis for authentication used for secure communication between customer sites and our regional edge sites. This identity is also used to set up mTLS connections between applications.
  3. Secrets - There are many types of secrets (like TLS certificates, passwords, tokens, etc) that need to be stored securely by our software as well as customer workloads across various sites. We provide a mechanism whereby secrets can be stored in our centralized control plane without worrying about system compromise.
  4. Key Management System (KMS) - Data security is critical for distributed applications and data stores, and the KMS can be used to manage keys, versioning, and rotation for encryption and decryption of data at rest and in transit.
  5. Transport Security - Every customer site and every site in our infrastructure must use its issued identity for authentication, and based on that a secure IPSec/SSL channel is created. A transport firewall ensures that a compromised site cannot communicate across tenants.
  6. Network Security - Customer sites in the edge or cloud require a combination of Network Policy, in order to isolate across subnets/applications, and Network Firewall, configured to prevent volumetric DDoS attacks and perform anomaly detection.
  7. Application Security - Customer workloads that are protected by F5® Distributed Cloud Mesh (Mesh) or running within F5® Distributed Cloud App Stack’s application management platform can be secured from attacks using Application Firewall (WAF, App DoS, and anomaly detection), and Service Policy can be used to isolate APIs or applications.

Securing F5® Distributed Cloud’s infrastructure starts with diligent protection of bootstrap secrets and credentials in completely offline, secure storage that can be accessed and used only by a very limited number of individuals. These credentials are then used to bootstrap the SaaS, which is designed to boot up as a multi-tenant system with tenant and namespace isolation built into every component.


Tenant Isolation

Since F5 Distributed Cloud Services is a multi-tenant platform, isolation between tenants is designed in from the ground up: every single data object stored in the system is attached to a tenant, which allows for strict enforcement of role-based access policies and enables tenant isolation. Each tenant’s infrastructure is isolated at multiple layers - network, identity, object stores, etc.

Great care is taken to make sure that even if one tenant’s edge infrastructure is compromised, it does not result in lower security for any other tenant. In fact, even within a single tenant, the impact of a compromise of a site is contained to that site.


Identity Management

Securely bootstrapping identity is the most fundamental challenge and one of the earliest steps in designing a secure infrastructure. Security components in the SaaS maintain the cryptographic material that acts as the root of trust for many security services built and offered as part of our platform.

All other software services that make up the SaaS obtain their cryptographic identity, in a secure way, from F5® Distributed Cloud’s Identity Authority, one of the key security components. Cryptographic identity is essential for secure communication and access control between components.

The second link in the security chain is the set of SaaS components responsible for registering and keeping track of customer sites and their software services. These components perform secure registration and mint appropriate credentials for each customer site so that the site and its local management components can bootstrap the identities of all other applications that will ever run on these sites.

F5 Distributed Cloud treats its own software and customer workloads at the same security level - all workloads are bootstrapped with a cryptographic identity that can be used for secure communication and authorization. This is a very useful feature of the F5 Distributed Cloud platform, as it enables customer workloads to use the identity as-is (an X.509 certificate) without any translation/transformation across different cloud providers.

There are two types of identity that need to be handled by our system:

  1. Site Identity - Site identity is used for many different purposes; one of them is to establish a secure communication channel (using IPSec or SSL) with the regional edge sites (within F5 Distributed Cloud’s global infrastructure) or for secure site-to-site connectivity.

  2. Workload Identity - Customer workloads running in vk8s get their identities provisioned via a security sidecar (called Wingman) that is automatically injected at launch time.

Figure: F5 Distributed Cloud Identity Management

Once each component - F5 Distributed Cloud’s infrastructure software or a customer workload - is provisioned with a unique cryptographic identity, solutions for other security requirements like authentication, authorization, policy enforcement, and key management can be built on top of it.


Authentication and Authorization

All software services use their X.509 identities to communicate over authenticated and secure mTLS or IPSec tunnels. This provides end-to-end protection against network-level spoofing, eavesdropping, and man-in-the-middle attacks.

Figure: Authentication and Authorization in F5 Distributed Cloud Services

Each component adheres to a set of policies for authorization that can allow or deny individual API requests based on the client's identity and attributes. All authorization decisions are closely monitored, and appropriate alerts and notifications are raised in case of any unexpected request.
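As a rough illustration, an identity-based, default-deny authorization check of this kind might look like the following sketch. All names and fields here are illustrative, not the actual F5 Distributed Cloud API.

```python
# Illustrative sketch of identity-based API authorization (not the actual
# F5 Distributed Cloud implementation). A request is allowed only if a
# matching rule permits it; everything else is denied by default.

from dataclasses import dataclass, field


@dataclass
class ClientIdentity:
    # Attributes extracted from the client's X.509 certificate (hypothetical).
    common_name: str
    tenant: str
    attributes: dict = field(default_factory=dict)


@dataclass
class AuthzRule:
    tenant: str        # tenant the rule applies to
    allowed_apis: set  # API methods this identity may call
    action: str = "allow"


def authorize(identity: ClientIdentity, api: str, rules: list) -> bool:
    """Deny by default; allow only if a matching rule permits the API."""
    for rule in rules:
        if rule.tenant == identity.tenant and api in rule.allowed_apis:
            return rule.action == "allow"
    return False


rules = [AuthzRule(tenant="acme", allowed_apis={"GET /config", "PUT /config"})]
client = ClientIdentity(common_name="billing.acme", tenant="acme")

print(authorize(client, "GET /config", rules))     # True
print(authorize(client, "DELETE /config", rules))  # False
```

A denied request in this sketch is where the monitoring hook would raise the alerts described above.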


Secrets Management and Blindfold

Almost all software programs handle some sort of sensitive or secret material, such as database passwords, TLS private keys, API keys, and IAM credentials. F5 Distributed Cloud Services components also need such secrets. However, our strict security policy does not allow these secret materials to reside in source code or configuration. F5 Distributed Cloud has designed a state-of-the-art secrets management system - named Blindfold - that uses advanced cryptographic techniques to keep secrets hidden from all systems, including F5 Distributed Cloud, except for the owner of the secret.

Since customer workloads are at the same security level as F5 Distributed Cloud Services components, all customer workloads can also use Blindfold. The most notable feature of Blindfold is that it allows customers to encrypt their secrets while completely offline - that is, the clear secret is never given to F5 Distributed Cloud during provisioning or use. Only the customer's workloads ever see their secrets in the clear, even though F5 Distributed Cloud is involved in decrypting them.

We have created a simple tutorial describing how to use Blindfold to secure a TLS certificate when configuring a Virtual Host.

Secret Policy

Secret Policy is an ACL for secrets managed by Blindfold. The policy dictates who is allowed to decrypt secrets encrypted under it. A Secret Policy consists of one or more Secret Policy Rules. Rules are evaluated according to the configured rule-combining algorithm, and each Secret Policy Rule is a set of predicates. A rule is said to match if all the predicates in that rule are true; upon a match, the assigned action (allow or deny) is taken.
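The rule semantics above can be sketched as follows, assuming a simple first-match rule-combining algorithm; the predicates and request shape are illustrative, not the actual Blindfold schema.

```python
# Minimal sketch of Secret Policy evaluation: each rule is a set of
# predicates; a rule matches when ALL predicates are true, and the first
# matching rule's action (allow/deny) is taken. Assumes a "first-match"
# rule-combining algorithm; other algorithms are possible.

def evaluate_secret_policy(rules, request):
    for predicates, action in rules:
        if all(pred(request) for pred in predicates):
            return action
    return "deny"  # default-deny when no rule matches


rules = [
    # Only workloads in the "payments" namespace may decrypt this secret.
    ([lambda r: r["namespace"] == "payments",
      lambda r: r["operation"] == "decrypt"], "allow"),
]

print(evaluate_secret_policy(rules, {"namespace": "payments", "operation": "decrypt"}))  # allow
print(evaluate_secret_policy(rules, {"namespace": "web", "operation": "decrypt"}))       # deny
```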

Details on all the parameters and how they can be configured for this object are covered in the API Specification.

External Secrets Management System

F5® Distributed Cloud Services provides the Blindfold system for secrets management. However, customers may have their own secrets management systems, for example HashiCorp Vault, and can choose to use them instead. In this scenario, tenants need to provide credentials for the external secrets management system via the Secret Management Access configuration. These credentials can themselves be protected using Blindfold.

The external system can either be available through a public network (public IP) or be exposed as a virtual-host using our platform and advertised to a precise set of sites. This allows customers to keep their secrets management system in a protected environment, yet available to sites across a distributed infrastructure.

The Secret Management Access configuration object uses the concept of “where” to apply the configuration.


Key Management and Data Security

Another priority for F5 Distributed Cloud is to keep customers' configuration and data safe from unauthorized modification and exposure. All F5 Distributed Cloud Services components that handle customer data encrypt it at rest and in transit. Encryption keys are managed by a key management system (KMS) that enforces strict policies around access to and use of each key. The KMS supports symmetric and asymmetric encryption, digital signatures, and HMAC operations with strong cryptographic algorithms like AES-GCM (AEAD), RSA-OAEP, RSA-PSS, and HMAC-SHA256. It also supports key versioning and rotation. All keys in the system are protected by a highly secure root of trust.
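As an illustration of key versioning and rotation, the following stdlib-only sketch shows an HMAC-SHA256 key that keeps old versions available for verification after rotation. This is a conceptual sketch, not the actual KMS implementation; in the real system, key bytes would be protected by the root of trust rather than held in process memory.

```python
# Sketch of KMS-style key versioning and HMAC-SHA256 (stdlib only; the
# real KMS also supports AES-GCM, RSA-OAEP, and RSA-PSS, not shown here).

import hashlib
import hmac
import secrets


class VersionedHmacKey:
    def __init__(self):
        self._versions = {}  # version number -> key bytes
        self._current = 0
        self.rotate()        # create version 1

    def rotate(self):
        """Mint a new key version; old versions stay valid for verification."""
        self._current += 1
        self._versions[self._current] = secrets.token_bytes(32)

    def sign(self, data: bytes):
        key = self._versions[self._current]
        return self._current, hmac.new(key, data, hashlib.sha256).digest()

    def verify(self, version: int, data: bytes, tag: bytes) -> bool:
        key = self._versions.get(version)
        if key is None:
            return False
        expected = hmac.new(key, data, hashlib.sha256).digest()
        return hmac.compare_digest(expected, tag)


kms_key = VersionedHmacKey()
version, tag = kms_key.sign(b"customer record")
kms_key.rotate()  # rotation does not invalidate existing tags
print(kms_key.verify(version, b"customer record", tag))  # True
```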

Just like Secrets Management, customer workloads can also use F5 Distributed Cloud’s key management system to protect their data at-rest and in-transit.

KMS Key

The KMS Key object holds the handle and/or value of a cryptographic key. Keys can be for symmetric encryption, asymmetric encryption, digital signatures, or HMAC operations. Each key has a sensitivity level that determines where the actual bytes of the key are stored and whether or not it is ever allowed to leave the KMS.

Details on all the parameters and how they can be configured for this object are covered in the API Specification.

KMS Policy

A KMS Policy is an ACL for cryptographic keys created in F5 Distributed Cloud KMS. It dictates who is allowed to perform which operations using the keys attached to it. A KMS Policy consists of one or more KMS Policy Rules. Each rule is evaluated based on the rule-combining algorithm configured in the policy.

Details on all the parameters and how they can be configured for this object are covered in the API Specification.

KMS Policy Rule

Each KMS Policy Rule is a set of predicates defining access to a key. A rule is said to match if all the predicates in that rule are true; upon a match, the assigned action (allow or deny) is taken.

Details on all the parameters and how they can be configured for this object are covered in the API Specification.


Security Sidecar - Wingman

There is a security sidecar, named Wingman, that does almost all of the security-related work for a workload: identity bootstrapping, secrets management, external secrets management, key management, certificate rotation, and more.

Every customer workload is automatically injected with this sidecar, which acts as an assistant to the workload. Since the sidecar executes within the same security boundary as the workload and is accessible only by the workload, no security risk is added; quite the contrary, it spares applications from handling sensitive material themselves.

In addition, this sidecar is part of every software component of F5 Distributed Cloud Services and is used to secure configuration and manage keys for encryption and decryption. It is also used in conjunction with Blindfold to secure customers' TLS certificates without these ever being available in the clear to any system within F5 Distributed Cloud Services.

Wingman support includes status, identity, and secret management APIs. For more information, see Wingman API Reference.


Security Policies

F5 Distributed Cloud Services offer a variety of security controls, and users can choose any or all of them based on their use case. A common concept of policies is used to enable and configure each security control. Three main concepts are common to every type of policy (e.g., network, service):

  1. Policy set: A set of policies that the user wants to be active in a given use case.
  2. Policy: A shareable object. For example, a policy for PCI-compliant virtual hosts can be applied to all virtual-hosts that we want to be PCI compliant.
  3. Rules: Each individual policy is made up of rules.

Figure: Security Policies of F5 Distributed Cloud Services

Network Policy

Traditionally, network policy is attached to networking constructs like interfaces, networks, or connectors. Our approach, however, is to decouple network configuration from network policy, as this makes it easier to define intent rather than worry about network topologies or address assignments. It also makes policies significantly easier to reuse.

A Network Policy set consists of many Network Policies, and each Network Policy is comprised of many Network Rules. These policies have three main concepts:

  1. Local-endpoint and Remote-endpoint - Local here is from the point of view of the policy. The “local endpoint” is the “network” for which the policy is being written. The remote endpoint is the other end of the connection. A local or remote “endpoint” can be a virtual network, a network interface, a set of IP addresses, a specific IP address, a subnet prefix, or an arbitrary set.
  2. Egress Rules - “Egress” is from the point of view of the local endpoint. These rules are for connections originated from the local endpoint to a remote endpoint. For example, if the local-endpoint is an interface, then all connections from endpoints reachable through this interface, when processed by the dataplane, are classified as egress connections. Rules match on remote-endpoint, protocol, and port; the actions are allow or deny.
  3. Ingress Rules - “Ingress” is from the point of view of the local endpoint. These rules are for incoming connections to the local-endpoint from a remote-endpoint. For example, if the local-endpoint is a network, then all connections to endpoints that are members of this network, when processed by the dataplane, are classified as ingress connections. Rules match on remote-endpoint, protocol, and port; the actions are allow or deny.

Traditionally, network and firewall rules were written from the point of view of the packet, using the five-tuple in the packet {source IP, destination IP, protocol, source port, destination port}. This is very network-centric and requires the operator to know how the traffic flows, which makes it very hard to define reusable, intent-based policies. Endpoints, Egress Rules, and Ingress Rules are built to abstract away knowledge of routing and traffic flow and focus on the intent.
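The endpoint-centric model can be sketched as follows, with the remote endpoint expressed as a prefix. Field names are illustrative, not the actual rule schema.

```python
# Sketch of endpoint-centric network policy evaluation. Rules are written
# from the local endpoint's point of view and match on remote endpoint
# (a prefix here), protocol, and port -- not on a raw five-tuple.

import ipaddress


def match_rule(rule, remote_ip, protocol, port):
    return (ipaddress.ip_address(remote_ip) in ipaddress.ip_network(rule["remote_prefix"])
            and protocol == rule["protocol"]
            and port in rule["ports"])


def evaluate(rules, remote_ip, protocol, port):
    # First matching rule wins; default-deny when nothing matches.
    for rule in rules:
        if match_rule(rule, remote_ip, protocol, port):
            return rule["action"]
    return "deny"


# Egress rules for a local endpoint: allow HTTPS to 10.1.0.0/16 only.
egress_rules = [
    {"remote_prefix": "10.1.0.0/16", "protocol": "TCP", "ports": {443}, "action": "allow"},
]

print(evaluate(egress_rules, "10.1.2.3", "TCP", 443))   # allow
print(evaluate(egress_rules, "192.0.2.7", "TCP", 443))  # deny
```

Note that nothing in the rule depends on how traffic reaches the local endpoint; routing knowledge stays out of the policy.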

Configuring Network Policy Set

A network policy set is an ordered list of network policies that is applied to all traffic in a given virtual-network. The policies themselves, however, are designed to be independent of the virtual network so they can be easily reused.

There are two categories of networks, and each can have a network policy set assigned to it:

  1. Infrastructure Networks on a Site - These exist on a particular site, or exist globally, and are defined in the system namespace. Configuration of these networks is valid on a given site by virtue of associating them to a site on a network interface and using fleet configuration. The network policy set for a site is configured as a network firewall object in the fleet object. This policy set is applied to all infrastructure objects on the sites that are members of the fleet.

  2. Namespace Networks - There is an implicit isolated network in every application namespace, called the namespace virtual-network. The namespace-network is a per-site network in which all workload Pods are instantiated; there is also a per-site service-network where all Kubernetes services on the site are exposed. Exactly one network policy set is allowed to be configured in a particular namespace, and it is applicable only to the namespace-network and to service IPs accessed from that namespace. Since only virtual Kubernetes pods exist in the namespace-network, the policy serves as a network-level microsegmentation policy for the namespace (even though this is popularly referred to as application microsegmentation in the industry).

Details on all the parameters and how they can be configured for this object are covered in the API Specification.

Configuring Network Policy

Network policy works at the session level for TCP/UDP and at the network layer for other traffic types. As described earlier, Network Policy is designed from the point of view of an endpoint. An endpoint can be a single IP address, a set of IP addresses that belong to a subnet/prefix, or all IP addresses that match a given label expression (for example, all those labeled PCI-compliant).

The endpoint or set of endpoints for which the policy is written is called the local endpoint from the policy's point of view. It is defined in one of two ways:

  1. Prefix - A subnet or a prefix written in the form <ip address>/<prefix length>, for example 10.1.2.3/32 or 10.1.2.0/24. If the prefix length is 32, it is a single IP address; otherwise it covers all the IP addresses in the subnet.

  2. Prefix_selector - A prefix selector is a label expression; if an IP address’s labels match the label expression, that IP address is considered part of the local-endpoint.

It is important to note that matching an IP address to a label (for Prefix_selector) follows these label inheritance rules:

  • An IP address gets all labels of the virtual network on which it originated
  • An IP address gets all labels of an interface through which it is reachable:
    • When the IP belongs to the interface or to a subnet of the interface
    • When the IP is reachable by a route that points to the interface
  • When an IP belongs to a Pod in vK8s, it gets the labels of that Pod; these labels are usually defined in the pod spec template in the deployment spec
  • When IPs are from a remote site, the F5 Distributed Cloud fabric syncs labels from the remote site
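The two local-endpoint forms, prefix and prefix_selector, can be sketched as follows. The label expression here is simple key/value equality; real selectors are richer, and the labels shown are illustrative.

```python
# Sketch of the two ways to define a local endpoint: a prefix (subnet or
# single address) or a prefix_selector (label expression over labels the
# IP inherits from its network, interface, or pod).

import ipaddress


def in_prefix(ip: str, prefix: str) -> bool:
    # 10.1.2.3/32 matches a single address; 10.1.2.0/24 matches the subnet.
    return ipaddress.ip_address(ip) in ipaddress.ip_network(prefix)


def matches_selector(ip_labels: dict, selector: dict) -> bool:
    # A simple equality-based label expression; real selectors are richer.
    return all(ip_labels.get(k) == v for k, v in selector.items())


print(in_prefix("10.1.2.3", "10.1.2.0/24"))  # True
print(in_prefix("10.1.3.3", "10.1.2.0/24"))  # False

# Labels a pod IP might inherit from its pod spec template (illustrative).
pod_labels = {"app": "checkout", "compliance": "pci"}
print(matches_selector(pod_labels, {"compliance": "pci"}))  # True
```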

Once the local endpoint is specified, policy rules are written for that endpoint. There are two sets of rules:

  • Ingress rules. From the point of view of the local endpoint, these rules apply to all the sessions/traffic received by the local endpoint from a remote endpoint.
  • Egress rules. From the point of view of the local endpoint, these rules apply to all the sessions/traffic sent by the local endpoint to a remote endpoint.

Details on all the parameters and how they can be configured for this object are covered in the API Specification.

Configuring Network Policy Rules

Each Network Policy Rule can be configured with the following fields:

  • Action - Currently, the supported actions are allow and deny
  • Protocol - For example, TCP, UDP, etc.
  • Port - Destination port list for the session
  • Remote-endpoint
    • Prefix - as defined in the earlier section on the local-endpoint
    • Prefix_selector - as defined in the earlier section on the local-endpoint
    • Prefix set - a list of prefixes, usually used for a whitelist or blacklist
  • Label matcher - A list of label keys whose values must be the same for the local-endpoint and the remote-endpoint. This is useful when writing reusable rules.

A good use case for the label matcher is writing reusable rules that take the deployment type into account. Assume we have two deployments, “deployment=staging” and “deployment=production”, and a rule that says local-endpoint “A” can talk to remote-endpoint “B” over TCP port 80. Since we don’t want an endpoint “A” in production to be allowed to set up a connection to endpoint “B” in staging, we can use the label matcher to prevent this. The rule can be modified as: local endpoint “A” can talk to remote endpoint “B”, matching the label for key “deployment”. This ensures the connection is allowed only if the value of “deployment” is the same for A and B.
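The label-matcher check in this staging/production example can be sketched as:

```python
# Sketch of the label-matcher check: the rule allows the connection only
# if the local and remote endpoints carry the same value for every key
# in the label matcher (here, "deployment").

def label_matcher_allows(local_labels, remote_labels, matcher_keys):
    return all(k in local_labels and local_labels.get(k) == remote_labels.get(k)
               for k in matcher_keys)


a_prod = {"name": "A", "deployment": "production"}
b_prod = {"name": "B", "deployment": "production"}
b_staging = {"name": "B", "deployment": "staging"}

print(label_matcher_allows(a_prod, b_prod, ["deployment"]))     # True
print(label_matcher_allows(a_prod, b_staging, ["deployment"]))  # False
```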

Details on all the parameters and how they can be configured for this object are covered in the API Specification.


Service Policy

Traditionally, application policies are attached to a proxy construct like a load balancer. Our approach, however, is to decouple policy configuration from connectivity configuration. As stated in the network policy section, this allows the user to define intent rather than worry about packet flows, and it also makes policies easier to reuse.

As described earlier, a Service Policy set consists of many Service Policies, and each Service Policy is comprised of many Rules. These policies have two main concepts:

  1. Server and Client - Server here is from the point of view of the policy. The server is the set of domains for which the policy is being written. The client is the set that consumes this service and sits at the other end of the connection. For example, the server can be a domain in a virtual host; the client can be a set of IP addresses, a specific IP address, a subnet/prefix, a service name, or an arbitrary set.

  2. Rules - Rules are applied to all requests from the Client to the Server. In the case of a TCP proxy with SNI, only the server domain and the Client can be matched. In the case of HTTP and HTTPS proxies, the domain, path, query parameters, and HTTP headers can be matched. Requests can be allowed or denied based on this match.

An application may be comprised of many microservices or VMs. As a result, an application can be defined as a set of virtual hosts, each with its own service policy set.

Configuring Service Policy Set

A service policy set is an ordered list of service policies that is applied to all requests/traffic to all virtual hosts (proxies) in a given namespace. However, the policies themselves are designed to be independent of the virtual host so they can be easily reused.

There are two categories of proxies in the system, and each can have a service policy set associated with it:

  • Infrastructure Proxies on a Site - These exist on a particular site when connecting two networks and are defined in the system namespace. Configuration of these proxies is valid on a given site by virtue of associating them to a site on a network interface and using fleet configuration. The service policy set for a site is configured in the network firewall object in the fleet object. This policy set is applied to all proxies on the sites that are members of the fleet.

  • Application Virtual Hosts in a Namespace - Exactly one service policy set is allowed in a namespace. A service policy set configured in a namespace is applicable to all virtual-hosts or services created in vk8s in that namespace. This policy can be used to perform application-level microsegmentation for the namespace.

Details on all the parameters and how they can be configured for this object are covered in the API Specification.

Configuring Service Policy

Service policy works at the request level for HTTP/HTTPS proxies, at the SNI level for TCP proxies with SNI, and at the IP level for TCP/UDP proxies. The policy is designed from the point of view of the proxy and can be written for a single proxy, a set of proxies, or all proxies matching a given label expression.

The proxy or set of proxies for which the policy is written is called the server from a service policy point of view. The server is defined in one of three ways:

  1. Name - The fully qualified domain name of the server

  2. Server selector - A label expression; if a given endpoint matching the Host header or SNI matches the label expression, it is considered a server

  3. Name matcher - A regular expression over the fully qualified domain names of all possible servers

Once the server is specified, then policy rules are written for that endpoint.
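The three ways of identifying a server can be sketched as follows; the field names and FQDNs are illustrative.

```python
# Sketch of the three ways a "server" can be identified in a service
# policy: exact name, label expression (server selector), or a regular
# expression over the FQDN (name matcher).

import re


def server_matches(server, *, name=None, selector=None, name_regex=None):
    if name is not None:
        return server["fqdn"] == name
    if selector is not None:
        # Simple key/value equality; real label expressions are richer.
        return all(server["labels"].get(k) == v for k, v in selector.items())
    if name_regex is not None:
        return re.fullmatch(name_regex, server["fqdn"]) is not None
    return False


server = {"fqdn": "api.example.com", "labels": {"tier": "frontend"}}

print(server_matches(server, name="api.example.com"))          # True
print(server_matches(server, selector={"tier": "frontend"}))   # True
print(server_matches(server, name_regex=r".*\.example\.com"))  # True
```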

Details on all the parameters and how they can be configured for this object are covered in the API Specification.

Configuring Service Policy Rules

Each Service Policy Rule can be configured with the following fields:

  • Action - Currently, supported actions are allow and deny
  • Client - The client accessing the server can have any of these configured:
    • Name - the fully qualified name of the client (e.g., a service name)
    • Label selector expression - for the client. For a client coming over the public Internet, implicit labels like geo-ip country or geo-ip region can be used
    • Roles - roles are currently used in F5 Distributed Cloud RBAC policies only
    • Name matcher - regular expressions to match the client name
  • Label matcher - A list of label keys whose values must be the same for the server and the client
  • Client IP list - to implement a whitelist or blacklist
  • Client ASN - for clients coming over the public Internet

The following additional fields are valid for the HTTP/HTTPS proxy type:

  • Path - this is the URL path in the request
  • Header match - HTTP headers in the request
  • Query param match - Query parameters in the request
  • HTTP method - the method of the request, e.g., GET, POST, or PUT
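Matching an HTTP request against these rule fields can be sketched as follows; the field names are illustrative, not the actual API schema.

```python
# Sketch of matching an HTTP request against a service policy rule using
# the fields listed above: method, path, headers, and query parameters.

def http_rule_matches(rule, request):
    if "methods" in rule and request["method"] not in rule["methods"]:
        return False
    if "path_prefix" in rule and not request["path"].startswith(rule["path_prefix"]):
        return False
    for name, value in rule.get("headers", {}).items():
        if request.get("headers", {}).get(name) != value:
            return False
    for key, value in rule.get("query_params", {}).items():
        if request.get("query", {}).get(key) != value:
            return False
    return True


rule = {"methods": {"GET"}, "path_prefix": "/api/", "action": "allow"}
req = {"method": "GET", "path": "/api/v1/items", "headers": {}, "query": {}}

print(http_rule_matches(rule, req))                                         # True
print(http_rule_matches(rule, {"method": "POST", "path": "/api/v1/items"}))  # False
```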

Details on all the parameters and how they can be configured for this object are covered in the API Specification.


User Rate Limiting

Rate limiting is a method of protecting backend applications by controlling the rate of traffic coming into or out of an application. The rate is specified as how many times a Virtual Host Route can be called within a specific time interval (per second or minute). Without such a limit, excess incoming requests can overwhelm the capacity of the services, resulting in poor performance, reduced functionality, and downtime. These can be the result of either intentional events (DDoS) or unintentional ones (misconfiguration of applications/clients).

As applications get deployed into production, DevOps teams need to expose those applications' VIPs to clients. Those clients can be external public users (F5 Distributed Cloud ADN or Mesh CE with public IPs) or clients in uncontrolled environments (like private networks at enterprise edge sites). These applications need to ensure reliable user experiences, SLAs, and security, with end-to-end visibility.

F5 Distributed Cloud's User Rate Limiting allows the SecOps or DevOps admin to define security policy in our shared or <app> namespaces. F5 Distributed Cloud's implementation of User Rate Limiting allows the administrator to limit the number of API requests per second, minute, or hour from each user. The combination of a User Identifier and a Rate Limiter is what enables the User Rate Limiting feature in Mesh.

Each User Identification can be defined as ONE of the following:

  • Client IP Address: The client IP source address as the user identifier.
  • Cookie Name: An HTTP cookie value as the user identifier.
  • HTTP Header Name: Use a specific HTTP header value as the user identifier.
  • Query Parameter Key: Use the query parameter value for the given key as the user identifier.

Each Rate Limiter is a tuple consisting of the following Rate Limit Values:

  • Number: The total number of allowed requests per unit (second/minute) of the specified period.
  • Per Period: Unit for a period per which the rate limit is applied.
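A per-user rate limiter built from these two pieces, a user identifier and a rate-limit tuple, can be sketched as a fixed-window counter. This is a simplified illustration, not the actual Mesh implementation.

```python
# Sketch of per-user rate limiting: a user identifier (client IP, cookie,
# header, or query-parameter value) keyed to a fixed-window counter of
# allowed requests per period.

import time
from collections import defaultdict


class UserRateLimiter:
    def __init__(self, limit: int, period_seconds: int, clock=time.time):
        self.limit = limit
        self.period = period_seconds
        self.clock = clock
        self.counters = defaultdict(int)  # (user, window) -> request count

    def allow(self, user_id: str) -> bool:
        window = int(self.clock() // self.period)
        key = (user_id, window)
        if self.counters[key] >= self.limit:
            return False
        self.counters[key] += 1
        return True


# 3 requests per second, with the client IP as the user identifier.
# A fixed clock is used here so the example is deterministic.
limiter = UserRateLimiter(limit=3, period_seconds=1, clock=lambda: 100.0)
results = [limiter.allow("203.0.113.5") for _ in range(4)]
print(results)  # [True, True, True, False]
```

A production limiter would typically use sliding windows or token buckets to avoid bursts at window boundaries; the fixed window keeps the sketch short.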

Network Firewall

The Network Firewall object holds all network security configuration for a site and serves as a single place to configure all infrastructure security. A fleet object also has a network firewall object that can be used to simplify configuration of the network firewall across many sites in a fleet.

Configuring Network Firewall

Network Firewall has the following objects that can be configured:

  • Network Policy set - Network policies in a network policy set are applied to all physical networks on the sites within the fleet. These session-based rules are applied to every new session on these sites.

  • Service Policy set - Service policies in a service policy set are applied to all forward proxies and dynamic reverse proxies in all the network connectors in the fleet.

  • Fast ACLs - Fast ACLs are lists of rules that are not session-based and are applied to all packets. The rules support the actions allow, deny, and rate-limit, and are also used to mitigate DoS attacks. In the case of regional edge sites within the F5 Distributed Cloud global infrastructure, they are also used to mitigate volumetric DDoS attacks (in conjunction with other techniques) and are automatically managed by our SaaS service. In the case of customer sites, these rules can be configured by the customer.

Details on all the parameters and how they can be configured for this object are covered in the API Specification.


Application Firewall

Application security is a complex topic and needs a multi-layer approach. Many technologies are needed to comprehensively secure an application, and we provide the set of tools expected of an application infrastructure platform. These tools use a combination of traditional static signature-based techniques, statistical algorithms, and more dynamic machine learning approaches to solve the problem of application protection. This set of tools is represented by three capabilities in the system:

  1. WAF - The Web Application Firewall (WAF) combines multiple security solutions into a single simplified WAF, providing protection from a range of attacks on HTTP traffic with the ability to identify and take action. The action can be either to log threats or to both log and block threats; that is, the WAF can be used in monitoring or blocking mode.

  2. Behavioral Analysis - The system performs machine learning to understand the behavior of clients and servers based on logs and metrics collected by the monitoring system in a service mesh. It generates alarms when it detects anomalies in this behavior.

  3. Application DoS Protection - While our global infrastructure and backbone have built-in capability to protect against volumetric DDoS attacks, there is a growing need to protect against more complex application-level attacks and bots. These can be prevented using a combination of the rules-based WAF, behavioral analysis, and Fast ACLs from the Network Firewall section.

WAF

The ability to selectively apply security while continually monitoring for threats on both requests and responses is the centerpiece of the WAF. The WAF integrates behavioral analysis and signatures to assess the threat associated with any given client session. It detects signature-based attacks, distinguishes good bots from malicious bots, detects threat campaigns, detects various security violations, detects false-positive events, and applies the associated preventive actions. The WAF also provides the ability to mask sensitive data in request logs, customize the blocking response page, and define the list of allowed response codes. The user simply needs to define a name for the WAF to activate protection in the default mode. The default mode enables the WAF to monitor all attack types, high- and medium-accuracy signatures, automatic attack signature tuning, threat campaigns, and all violations.

The WAF provides the following security functionalities that can be selectively applied to the application:

  • Attack Types - Attack types are the rules or patterns that identify attacks or classes of attacks on a web application and its components. These rules and patterns are known as attack signatures, and each attack type can have one or more attack signatures. The WAF compares patterns in the attack signatures against the contents of requests and responses looking for potential attacks. Some of the signatures are designed to protect specific operating systems, web servers, databases, frameworks, or applications.

  • Attack Signatures - Attack signatures are rules or patterns that identify attack sequences or classes of attacks on a web application and its components. Attack signatures can apply to both requests and responses.

  • Accuracy Levels - Indicates the ability of the attack signature to identify the attack, including its susceptibility to false-positive alarms. Users can choose accuracy levels for the signatures; the following accuracy levels are supported:

    • High Accuracy - This results in a low possibility of false positives.
    • High and Medium Accuracy - This results in some possibility of false positives. This is the default setting.
    • High, Medium, and Low Accuracy - This results in a high possibility of false positives.
  • False Positive Suppression - The WAF uses advanced machine learning to suppress false positives, and the user can choose to enable or disable false positive suppression. By default, this setting is enabled. The default settings for signature accuracy and false positive suppression balance the false positive and false negative rates, which yields the best WAF accuracy. Disabling suppression increases the false positive rate (legitimate users improperly identified as attackers) while decreasing the false negative rate (real attacks not properly identified). In the same way, turning on all signatures (high, medium, and low accuracy) decreases false negatives while increasing the possibility of false positives.

  • Threat Campaigns - Attackers constantly look for ways to exploit the latest vulnerabilities and for new ways to exploit old vulnerabilities. A threat campaign is an attack associated with a specific malicious actor, attack vector, technique, or intent. Threat Campaign protection is based on a variety of threat intel sourced from real-world campaigns to attack and/or take over resources. The Threat Campaign signatures are based on current “in-the-wild” attacks. These signatures contain contextual information about the nature and purpose of the attack.

The Threat Campaign signatures are different from the Automatic Attack Signatures. Threat Campaigns provide targeted signatures to protect organizations from pervasive attacks that are often coordinated by organized crime and nation states. Threat Campaigns provide critical intelligence to fingerprint and mitigate sophisticated attacks with nearly real-time updates resulting from continuous research.

As an example, a normal WAF signature might report that SQL injection was attempted. A Threat Campaign signature will report that a known threat actor used a specific exploit of the latest Apache Struts vulnerability (CVE-xxxx) in an attempt to deploy ransomware or cryptomining software.

The WAF provides an option for the user to enable or disable threat campaigns.

  • Violation Detection - Attacks may contain violations such as a missing mandatory header or a malformed request body. The WAF supports setting detection for specific violations from a list of known violations. All violations are enabled for detection by default.

  • Bot Protection - Bot detection and related protective actions are simplified for users. The WAF detects and classifies bots into Good, Suspicious, and Malicious bots based on user-agent signatures, and the user can set an associated protective action for each category. Bots are classified into the following three categories:

    • Good Bot - A client that exhibits known search engine behaviors/signatures.

    • Suspicious Bot - A client that exhibits non-malicious tools or bot behaviors/signatures. The following are suspicious bot examples:

      • Tools — Site crawlers, monitors, spiders, and web downloaders.
      • Bots — Search bots and social media agents.
    • Malicious Bot - A client that the system detects using a combination of bot signatures, browser verification tests, and anomaly detection heuristics, including DoS tools, known exploit tools, and vulnerability scanners.

  • Allowed Response Status Codes - User can specify which HTTP response status codes are allowed.

  • Mask Sensitive Parameters in Logs - Users can mask sensitive data in request logs by specifying an HTTP header name, cookie name, or query parameter name. Only the values are masked. By default, the values of the query parameters card, pass, pwd, and password are masked.

  • Custom Blocking Response Page - When a request or response is blocked by the WAF, users have the ability to customize the blocking response page served to the client.
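The user-agent based bot classification above can be sketched in a few lines. The signature lists here are toy examples, not the platform's actual signature database, and the real system additionally uses browser verification tests and anomaly heuristics.

```python
# Illustrative sketch of user-agent signature bot classification into the
# three categories described above. Signature lists are toy examples.

GOOD_BOT_SIGNATURES = ("googlebot", "bingbot")                    # search engines
SUSPICIOUS_BOT_SIGNATURES = ("wget", "curl", "python-requests")   # tools/crawlers
MALICIOUS_BOT_SIGNATURES = ("sqlmap", "nikto")                    # exploit/scan tools

def classify_bot(user_agent: str) -> str:
    """Return the bot category for a client's user-agent string."""
    ua = user_agent.lower()
    if any(sig in ua for sig in MALICIOUS_BOT_SIGNATURES):
        return "malicious"
    if any(sig in ua for sig in GOOD_BOT_SIGNATURES):
        return "good"
    if any(sig in ua for sig in SUSPICIOUS_BOT_SIGNATURES):
        return "suspicious"
    return "human"  # no bot signature matched

# A protective action can then be set per category:
ACTIONS = {"good": "allow", "suspicious": "log", "malicious": "block", "human": "allow"}
```

Checking malicious signatures first ensures that a malicious tool spoofing a search-engine token is still blocked.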

In addition to configuring the detection settings, a decision needs to be made on whether to log and block attacks, or to just log an attack and let security or application experts decide what actions to take:

  1. Monitoring mode - This will configure the WAF to log requests and responses that match the configured detection settings. No attack is blocked in this mode.

  2. Blocking Mode - This will configure the WAF to both log and block the requests and responses that match the configured detection settings.
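The two modes above share identical detection settings; only the action taken on a match differs. A minimal sketch of that decision, with illustrative mode names:

```python
# Minimal sketch of the monitoring vs. blocking decision: detection is the
# same in both modes; only the resulting action differs.

def waf_action(threat_detected: bool, mode: str) -> dict:
    """Return what the WAF does with a request given its mode."""
    if not threat_detected:
        return {"log": False, "block": False}
    if mode == "monitoring":
        return {"log": True, "block": False}   # log only; experts decide later
    if mode == "blocking":
        return {"log": True, "block": True}    # log and block
    raise ValueError(f"unknown mode: {mode}")
```

Because detection is unchanged between modes, a WAF can be run in monitoring mode first to tune out false positives, then switched to blocking without altering its detection configuration.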

Configuring WAF

The user can configure the WAF per load balancer by attaching a waf object to it. Individual waf objects can be shared by multiple load balancers within a namespace. These objects can be attached either at the load balancer level or on a per-route basis. Since waf objects can be shared across multiple load balancers, and multiple WAFs can be configured for a given load balancer by way of routes, monitoring is not based on these waf objects. All monitoring for a WAF is done on a per-load-balancer basis to provide insights into the application.
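The attachment and monitoring model above can be sketched as follows. The object and load balancer names are hypothetical; the point is that one waf object may back several load balancers while monitoring is keyed by load balancer, not by waf object.

```python
# Sketch of the attachment model: waf objects are shared within a namespace,
# attached at the load balancer level or per route, while monitoring is
# aggregated per load balancer. All names are illustrative.

from collections import defaultdict

attachments = [
    # (load_balancer, route, waf_object); route None = load-balancer-level
    ("lb-frontend", None, "waf-default"),
    ("lb-api", "/v1/*", "waf-strict"),
    ("lb-api", "/public/*", "waf-default"),   # waf-default shared by two LBs
]

# Monitoring is keyed by load balancer, not by waf object:
monitoring_scope = defaultdict(list)
for lb, route, waf in attachments:
    monitoring_scope[lb].append((route, waf))
```

Here `lb-api` carries two different WAFs via routes, yet all of its security events roll up under a single per-load-balancer view.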

Details on all the parameters and how they can be configured for this object are covered in the API Specification under the Waf object.

Behavioral Firewall

Behavioral analysis happens only if advanced machine learning is enabled for a namespace. Machine learning happens in the centralized control plane, using metrics and log data collected from the distributed proxies to create a model. This model is then distributed to each customer and regional edge site where tenant services are enabled. The network data-plane performs inference using the generated model.

There are three types of behavioral analysis that can be performed by the system:

  1. Per request anomaly detection - An AI model is created to learn the baseline behavior of different kinds of requests. This model is used for inference in the proxy’s request path to flag requests that deviate from the learned model. Request behavior is characterized by metrics such as request size, response size, and request-to-response latency.
  2. Business logic markup - The system uses the request logs (from client to server) and metrics from the virtual-host to learn:
    1. API endpoints Markup - set of API Endpoints that make up an application. This includes identifying and tokenizing dynamic components in the URL.
    2. Probability Distribution Function - for each API Endpoint, various behavioral models are generated that allow per request anomaly detection.
  3. Time-series anomaly detection - Time-series metrics for request rate, errors, latency, and throughput are used to detect anomalies. This uses statistical algorithms rather than neural networks.
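The anomaly-flagging idea behind items 1 and 3 can be sketched with a simple statistical baseline. This is a deliberately minimal mean/standard-deviation model for illustration; the actual per-request models are learned centrally and are more sophisticated.

```python
# Illustrative sketch of anomaly detection on a request metric (e.g. latency):
# learn a baseline from observed samples, then flag values that deviate by
# more than k standard deviations. A toy stand-in for the real learned models.

import statistics

def learn_baseline(samples: list) -> tuple:
    """Learn a (mean, stdev) baseline from observed metric samples."""
    return statistics.mean(samples), statistics.stdev(samples)

def is_anomalous(value: float, mean: float, stdev: float, k: float = 3.0) -> bool:
    """Flag values more than k standard deviations from the learned mean."""
    return abs(value - mean) > k * stdev

# Baseline learned from observed request latencies (ms):
mean, stdev = learn_baseline([100, 105, 98, 102, 101, 99, 103, 97])
```

At inference time in the proxy's request path, `is_anomalous(latency, mean, stdev)` is a cheap check, which is why the heavy training can stay in the control plane while only the learned parameters ship to the data-plane.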

Configuring Behavioral Analysis

A set of one or more services is considered an application and can be defined by creating an app_type object in the shared namespace. The app_type object is needed because all machine learning models are created per application type.

Definition of an Application “Instance” - One instance of an application is defined by all virtual-host objects with the same app_type label within the same namespace. To simplify deployment and monitoring, instances are assumed not to span namespaces, and all monitoring is scoped by the namespace.
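The instance definition above amounts to grouping virtual-host objects by (namespace, app_type label). A small sketch, with illustrative object shapes and names:

```python
# Sketch of the "application instance" definition: all virtual-host objects
# with the same ves.io/app_type label in one namespace form one instance.
# Object shapes and names are illustrative.

from collections import defaultdict

virtual_hosts = [
    {"name": "vh-web-1", "namespace": "prod", "labels": {"ves.io/app_type": "shop"}},
    {"name": "vh-web-2", "namespace": "prod", "labels": {"ves.io/app_type": "shop"}},
    {"name": "vh-api",   "namespace": "prod", "labels": {"ves.io/app_type": "api"}},
    {"name": "vh-web-3", "namespace": "dev",  "labels": {"ves.io/app_type": "shop"}},
]

instances = defaultdict(list)
for vh in virtual_hosts:
    key = (vh["namespace"], vh["labels"]["ves.io/app_type"])  # instance identity
    instances[key].append(vh["name"])
```

Note that `prod/shop` and `dev/shop` are distinct instances even though they share an app_type, because instances never span namespaces.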

As more instances associated with an application are deployed, the system uses data from all of them to train the AI model. When a new instance is deployed, the model trained on data from the other instances can be used for inference.

Within a namespace, many instances of an application can be present. For each instance, advanced monitoring and AI algorithms can be enabled or disabled. This is done using the app_setting configuration object.

These two aspects are configured using the following objects:

  1. app_type object - A set of one or more services is considered to comprise an application. The app_type name is used as a label, ves.io/app_type=<app_type.name>, which can be assigned to virtual hosts and services.
  2. app_setting object - There is exactly one app_setting object per namespace. The app_setting object controls the following:
    1. List of app_types that are enabled for anomaly detection
    2. Types of anomalies to be detected:
      1. Per-site
      2. Per-service
      3. Per-site, per-service
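The two objects above can be sketched as follows. All field names are hypothetical stand-ins; the authoritative schemas are in the API Specification.

```python
# Hypothetical sketch of the app_type and app_setting objects described
# above. Field names are illustrative, not the platform's actual schema.

app_type = {
    "name": "shop",
    # Applied to virtual hosts and services as: ves.io/app_type=shop
    "label": "ves.io/app_type=shop",
}

app_setting = {                      # exactly one per namespace
    "namespace": "prod",
    "enabled_app_types": ["shop"],   # app_types enabled for anomaly detection
    # Types of anomalies to detect, mirroring the list above:
    "anomaly_scopes": ["per-site", "per-service", "per-site-per-service"],
}
```

Keeping a single app_setting per namespace means enabling anomaly detection for an application is a one-place change, rather than per virtual host.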

Details on all the parameters and how they can be configured for this object are covered in the API Specification.

Application Denial of Service Attacks

Denial of service attacks on applications and APIs can be detected using a combination of techniques - alerts from the rules-based WAF as well as anomaly alerts from behavioral analysis - and these alerts can be used to generate service policies at both the application level and the network level. Since applications are always protected by our distributed proxies and our global network, any network-level denial of service attack affects only the data-plane, and the data-plane is able to handle various resource-exhaustion attacks (e.g. flow-table exhaustion via SYN flood, fragmentation buffers, NAT pools). In addition, the data-plane provides a way for users to configure fast ACLs and prevent application-level attacks from clients or BOTs.
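A fast ACL in this context is essentially a data-plane deny decision made before any application processing. A minimal sketch, assuming a simple source-prefix deny list; real fast ACLs match on more fields than the source address:

```python
# Illustrative sketch of a fast ACL: a source-prefix deny list evaluated in
# the data-plane before any application-level processing. The prefix below
# is an example value, e.g. derived from a WAF or behavioral-analysis alert.

import ipaddress

deny_prefixes = [ipaddress.ip_network("203.0.113.0/24")]  # flagged BOT source

def fast_acl_allows(src_ip: str) -> bool:
    """Drop traffic from denied prefixes; allow everything else."""
    addr = ipaddress.ip_address(src_ip)
    return not any(addr in net for net in deny_prefixes)
```

Because the check runs early in the data-plane, traffic from an attacking prefix is dropped before it can consume proxy or application resources, while other clients are unaffected.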

Details on all the parameters and how they can be configured for this object are covered in the API Specification.