Accelerated Offload Connection Load Balancing in Envoy; Envoy's First Hardware Feature

looong

What is connection load balancing?

Connection load balancing, also known as connection balancing, is a core networking solution used to distribute traffic across multiple servers in a server farm. Load balancers improve application availability and responsiveness and prevent server overload. Each load balancer sits between client devices and backend servers, receiving and then distributing incoming requests to any available server capable of fulfilling them.
A typical web server has multiple workers (processes or threads). If too many clients connect to a single worker, that worker becomes overloaded and adds significant tail latency while the other workers sit idle, which hurts the performance of the whole server. Connection load balancing solves this problem.

What does Envoy do for connection load balancing?

Envoy provides a connection load balance implementation called Exact connection balance. As its name implies, a lock is held during balancing so that connection counts are nearly exactly balanced between workers. This is "nearly" exact in the sense that a connection might close in parallel, thus making the counts incorrect; this should be rectified on the next acceptance. This balancer sacrifices accept throughput for accuracy and should be used when there are a small number of connections that rarely cycle, e.g., service mesh gRPC egress.

Figure 1: Default thread mode of Envoy

This section explains how workers in Envoy first receive connections. As Figure 1 shows, the kernel acts as a dispatcher: it computes a hash over four properties of each connection (the 4-tuple of source IP, source port, destination IP, and destination port) and uses the result to choose a worker.
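As a rough illustration of hash-based dispatch, the sketch below maps a 4-tuple to a worker index. It is a toy model, not the kernel's actual algorithm: `zlib.crc32` stands in for the kernel's hash, and `pick_worker` and `dispatch` are hypothetical names chosen for this example.

```python
import zlib


def pick_worker(src_ip, src_port, dst_ip, dst_port, num_workers):
    """Map a connection 4-tuple to a worker index.

    Assumption: crc32 is only a stand-in for the kernel's real hash.
    """
    key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}".encode()
    return zlib.crc32(key) % num_workers


def dispatch(connections, num_workers):
    """Count how many connections land on each worker."""
    counts = [0] * num_workers
    for conn in connections:
        counts[pick_worker(*conn, num_workers)] += 1
    return counts


if __name__ == "__main__":
    # Twelve clients that differ only in source port.
    conns = [("10.0.0.1", 40000 + i, "10.0.0.2", 443) for i in range(12)]
    print(dispatch(conns, 3))
```

Because the choice depends only on the 4-tuple and not on how busy each worker is, nothing prevents one worker from ending up with more connections than the others.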

Figure 2: Workflow of exact connection balance in Envoy

Figure 2 shows how Exact connection balance works. Assume there are three threads whose queues currently hold 2, 5, and 1 connections (referred to as T (2), T (5), and T (1)). When a new connection arrives, the kernel dispatches it to T (5):

  1. T (5) gets the connection, acquires the lock, and compares the number of connections currently queued on each thread. The thread with the fewest connections is the least loaded, so the result is T (1).
  2. Since the result is not T (5) itself, T (5) posts the connection to T (1). T (1) receives the connection and handles it directly, without running the comparison again.
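The two steps above can be sketched as follows. This is a minimal Python model of the idea, not Envoy's C++ implementation; the class `ExactBalancer` and the method `pick_target` are hypothetical names used only for illustration.

```python
import threading


class ExactBalancer:
    """Toy model: pick the worker with the fewest connections under a lock."""

    def __init__(self, counts):
        self._lock = threading.Lock()
        self.counts = list(counts)  # current connection count per worker

    def pick_target(self):
        # Every accept serializes on this lock, which is the
        # throughput cost the article describes.
        with self._lock:
            target = min(range(len(self.counts)), key=lambda i: self.counts[i])
            self.counts[target] += 1
            return target


# Workers 0, 1, 2 correspond to T (2), T (5), T (1) in Figure 2.
balancer = ExactBalancer([2, 5, 1])
accepting_worker = 1  # the kernel handed the new connection to T (5)
target = balancer.pick_target()
needs_post = target != accepting_worker  # True: post the connection to T (1)
```

In this model the count is incremented while the lock is still held, so two workers accepting at the same time cannot both choose the same least-loaded target based on a stale count.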

This approach is not well suited to an ingress gateway, which may accept thousands of connections within a short time. Under that load, contention on the lock causes a large drop in throughput.

How Intel® Dynamic Load Balancer accelerates connection load balance in Envoy

Intel® Dynamic Load Balancer (Intel® DLB) is a hardware-managed system of queues and arbiters that connects producers and consumers. It is a PCI device designed to reside in the server CPU uncore, and it can interact with software running on cores and potentially with other devices.

Intel DLB implements the following load balancing features:

  • Lock-free multi-producer/multi-consumer operation
  • Multiple priorities for varying traffic types
  • Various distribution schemes

There are three types of load balancing queues:

  • Unordered: sprays packets across multiple workers and does not preserve their order.
  • Ordered: like unordered, except that the system provides a means of restoring the original flow order. Synchronization mechanisms may still be required in the software.
  • Atomic: ensures that packets from a given flow can be outstanding on only a single worker at a given time. It dynamically pins flows to workers and migrates flows between workers to balance load when required. This preserves flow order and allows the processing software to operate in a lock-free manner. As such, this type of distribution is highly desirable in modern packet processing equipment, such as NICs.
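To make the difference between the unordered and atomic types concrete, here is a small software-only sketch. It is a toy model and does not reflect how Intel DLB is implemented: round-robin spraying and static first-seen pinning are assumptions made for this example (the real device migrates flows dynamically), the ordered type is left out, and all function names are hypothetical.

```python
from itertools import cycle


def unordered(packets, num_workers):
    """Spray packets across workers with no regard for which flow they belong to."""
    workers = cycle(range(num_workers))
    return [(pkt, next(workers)) for pkt in packets]


def atomic(packets, num_workers):
    """Pin each flow to one worker so a flow is never on two workers at once."""
    pinned = {}  # flow id -> worker index
    assignments = []
    for flow, seq in packets:
        if flow not in pinned:
            pinned[flow] = len(pinned) % num_workers
        assignments.append(((flow, seq), pinned[flow]))
    return assignments


def workers_per_flow(assignments):
    """Collect the set of workers that touched each flow."""
    seen = {}
    for (flow, _seq), worker in assignments:
        seen.setdefault(flow, set()).add(worker)
    return seen


# Packets are (flow id, sequence number) pairs.
packets = [("a", 0), ("a", 1), ("b", 0), ("a", 2)]
spread = workers_per_flow(unordered(packets, 2))
pinned = workers_per_flow(atomic(packets, 2))
```

With this input, `spread["a"]` contains both workers while `pinned["a"]` contains only one, which is why the atomic type can preserve per-flow order without locks in the processing software.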

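An ingress gateway is expected to process as much data as possible as quickly as possible, and the order in which separate new connections are handed to workers does not need to be preserved, so the unordered queue is sufficient for it.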

Figure 3: Workflow of Intel DLB connection balance in Envoy

Figure 3 shows how Intel DLB connection balance works. Assume there are three threads (T (3), T (5), and T (2)). When a new connection arrives, the kernel dispatches it to T (5):

  1. T (5) gets the connection and sends it to Intel DLB.
  2. Intel DLB does the balancing and uses eventfd to notify T (2) to fetch the connection. T (2) then handles the connection directly and does not forward it again.
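The flow in these two steps can be modeled in software as shown below. This is only a sketch under stated assumptions: the class `ToyDlb` is hypothetical, a Python list stands in for eventfd notifications, and round-robin is a placeholder for whatever scheduling the hardware actually performs. It does not use the real Intel DLB API.

```python
from collections import deque


class ToyDlb:
    """Software stand-in for the device: producers enqueue, consumers are notified."""

    def __init__(self, num_workers):
        self.consumer_queues = [deque() for _ in range(num_workers)]
        self.notified = []  # stand-in for eventfd writes, one entry per notification
        self._next = 0

    def enqueue(self, conn):
        # Assumption: round-robin is a placeholder for the hardware's balancing.
        target = self._next
        self._next = (self._next + 1) % len(self.consumer_queues)
        self.consumer_queues[target].append(conn)
        self.notified.append(target)
        return target

    def drain(self, worker):
        """Called by a notified worker: take its connections and handle them directly."""
        handled = list(self.consumer_queues[worker])
        self.consumer_queues[worker].clear()
        return handled


dlb = ToyDlb(num_workers=3)
# Whichever worker the kernel picked forwards the connection instead of handling it.
for conn_id in range(6):
    dlb.enqueue(conn_id)
handled = [dlb.drain(w) for w in range(3)]
```

Note that the accepting worker takes no lock and makes no comparison in this model; the balancing decision lives entirely inside the `ToyDlb` object, mirroring how the article describes offloading it to the device.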

In this way, we get lock-free connection balancing that is offloaded to and accelerated by hardware.

How to use accelerated offload connection load balance in Envoy

Intel DLB connection balance support has been added to Envoy. For configuration details, see https://www.envoyproxy.io/docs/envoy/latest/configuration/other_features/dlb#config-connection-balance-dlb.
To try it, you can download the envoyproxy/envoy-contrib image directly from Docker Hub.

About the Author
I am a cloud native engineer at Intel and a Microsoft MVP. I have worked in the cloud native industry for many years, have taken part in the whole process of microservice splitting, development, and governance, and am familiar with the upstream and downstream needs of the microservice field. I am deeply involved in open source, have contributed to many cloud native projects, and have my own perspective on contributing to and governing open source communities. I am currently a maintainer of Dapr, Thanos, and Golangci-lint. My main focus now is service mesh, where I am exploring new ways of combining cloud native software with hardware.