Load Balancing

What is load balancing?
Load balancing is the process of distributing the same information and resources across multiple servers. This lightens the load on each server, as user requests can be spread across a cluster of servers rather than being directed to a single one. Load balancing also helps improve reliability and redundancy: if one server fails, requests are simply redirected to another working server.
As web applications grow, so does the need for more resources. Load balancing is a straightforward method to help minimize those growing pains and facilitate application scaling.
How does it work?
The way a load balancer works is quite simple:
- The load balancer is in most cases a software program that is listening on the port where client requests are made.
- When a request comes in, the load balancer takes that request and forwards it to a backend server that is under acceptable load.
- The backend server then fulfills the request and replies back to the load balancer.
- Finally, the load balancer passes on the reply from the backend server to the client.
This way, the user is never aware of the division of work between the load balancer and the backend servers.
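The request cycle above can be sketched as a toy in-process model (real balancers forward requests over the network; here, backends are plain functions and `pick` stands in for whichever selection strategy is in use):

```python
def handle_request(request, backends, pick):
    """Relay one client request through the balancer and back."""
    # 1. Choose a backend that is under acceptable load
    #    (the selection strategy is pluggable, e.g. round-robin).
    server = pick(backends)
    # 2. Forward the request; the backend fulfills it and replies.
    reply = server(request)
    # 3. Pass the backend's reply straight back to the client,
    #    so the client never sees which server did the work.
    return reply

# Hypothetical backends: functions standing in for real servers.
backends = [
    lambda req: f"served {req} by A",
    lambda req: f"served {req} by B",
]
```

The client only ever talks to `handle_request`; swapping the `pick` strategy changes how load is spread without the client noticing.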
Types of traffic load balancers handle
Load balancers handle four main types of traffic:
- HTTPS - Load balancers handle HTTPS traffic by setting the X-Forwarded-For, X-Forwarded-Proto, and X-Forwarded-Port headers to give the backends additional information about the original request. Learn more about how the X-Forwarded-For header works.
- HTTP - Load balancers handle HTTP traffic in the same way as HTTPS, except that the traffic is unencrypted.
- UDP - Certain load balancers support the balancing of protocols such as DNS and syslogd that use UDP.
- TCP - TCP traffic can also be spread across load balancers. Traffic to a database cluster would be a good example of this.
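The forwarded-header handling described for HTTPS above can be sketched as follows (the function name and signature are illustrative, not a real library API):

```python
def forwarded_headers(client_ip, scheme, port, existing=None):
    """Build the X-Forwarded-* headers a balancer adds before
    passing a request to a backend."""
    headers = dict(existing or {})
    # Append this client's IP to any chain left by earlier proxies,
    # so the backend can recover the original client address.
    prior = headers.get("X-Forwarded-For")
    headers["X-Forwarded-For"] = f"{prior}, {client_ip}" if prior else client_ip
    # Record the original scheme and port the client used.
    headers["X-Forwarded-Proto"] = scheme
    headers["X-Forwarded-Port"] = str(port)
    return headers
```

With these headers set, a backend that only ever sees connections from the balancer can still log the real client IP and know whether the original request was HTTP or HTTPS.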
Load balancing techniques
Which load balancing technique you should use depends on what you want to achieve. The following describes four common load balancing techniques.
Round-robin
This is the simplest method of load balancing. Using round-robin, the load balancer simply goes down the list of servers sequentially and passes a request to each one at a time. When the list of servers has reached its end, the load balancer simply restarts the process from the beginning.
This method is straightforward and easy to implement, however, it can pose problems since:
- Not all servers may have the same capacity
- Not all servers may have the same storage
- Not all servers may be up
This results in a less-than-optimal distribution of client requests: one server may become overloaded before the others, yet it will continue to receive requests regardless.
Two ways around this are weighted round-robin and dynamic round-robin. Weighted round-robin lets the site administrator assign a weight to each server based on its capacity to handle requests, while dynamic round-robin assigns weights dynamically based on real-time statistics of each server's load.
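Plain and weighted round-robin can both be sketched in a few lines (a minimal in-memory model; the server names and weights are illustrative):

```python
import itertools

def round_robin(servers):
    # Hand out servers in order, restarting from the top when
    # the end of the list is reached.
    return itertools.cycle(servers)

def weighted_round_robin(weighted_servers):
    # weighted_servers: list of (server, weight) pairs. A server
    # with weight 3 appears three times per cycle, so it receives
    # three times as many requests as a weight-1 server.
    expanded = [s for server, w in weighted_servers for s in [server] * w]
    return itertools.cycle(expanded)
```

A dynamic round-robin would recompute the weights periodically from live load statistics instead of taking them from a static configuration.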
IP hash
Based on the vSphere Networking document, IP hash load balancing is described as:
Choose an uplink based on a hash of the source and destination IP addresses of each packet. For non-IP packets, whatever is at those offsets is used to compute the hash.
Put simply, IP hash uses the client's IP address to determine which server will receive the request, which means requests from the same client consistently land on the same server. The largest downfall of this method is that, more often than not, the load balancer will not distribute requests across servers equally.
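A minimal IP hash selection might look like this (SHA-256 is used here for illustration; real balancers use their own hash functions):

```python
import hashlib

def pick_server(client_ip, servers):
    # Hash the client's IP so the same client always maps to the
    # same backend, as long as the server list does not change.
    digest = hashlib.sha256(client_ip.encode()).digest()
    return servers[int.from_bytes(digest, "big") % len(servers)]
```

Because the mapping depends only on the IP, no per-client state needs to be stored; the trade-off is that nothing guarantees the hash spreads clients evenly across the cluster.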
Least connections
This method compares the number of active connections on each server in the cluster and sends the next request to the server with the fewest. Each server's capacity limitations are also taken into account when determining which server has the smallest relative load.
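A capacity-aware least-connections pick can be sketched as follows (the `{name: (active_connections, capacity)}` shape is an assumption for illustration):

```python
def least_connections(servers):
    # servers: {name: (active_connections, capacity)}
    # Pick the server using the lowest fraction of its capacity,
    # so capacity limits matter, not just raw connection counts.
    return min(servers, key=lambda name: servers[name][0] / servers[name][1])
```

Dividing by capacity means a big server with 5 of 100 connections in use is preferred over a small one with 2 of 10, even though its raw count is higher.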
Least response time
Using this method, the server with the fewest active connections and the lowest average response times is selected. This ensures speedy delivery of content. To learn more about response times and how to improve them, check out our server response time guide.
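This selection can be sketched by comparing connection counts first and response times second (the stats shape is an assumed example, not a real balancer's API):

```python
def least_response_time(stats):
    # stats: {name: (active_connections, avg_response_ms)}
    # Prefer the server with the fewest active connections and
    # break ties using the lowest average response time.
    return min(stats, key=lambda name: stats[name])
```

Python's tuple comparison does the two-level ordering for free: tuples compare by their first element, then by the second only on a tie.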
Benefits of load balancing
There are a variety of benefits that can be realized from implementing a load balancing system. A few of these include:
- Less downtime and greater redundancy for website operators: if a single server goes down, the system simply reroutes traffic to another active server
- A better user experience for visitors, as content will most likely load faster for them. The load balancing algorithms mentioned above either distribute the load equally amongst servers, which reduces stress and therefore allows servers to respond faster, or deliver content to users from the nearest available server.
- Less stress on a single server. This goes hand-in-hand with the benefit above. Since there is less stress on a single server, not only will visitors receive content faster but there is also less of a chance that the server will be overloaded with requests.
Of course, there can also be drawbacks to implementing a load balancer. If you're doing it yourself this will add additional complexity to your setup and will take time. Furthermore, whether you're using a software or hardware-based load balancer you will likely need to pay an additional fee for that unless it is already included as a feature in your existing stack.
Summary
For scalability purposes, load balancing can be an efficient and straightforward resource distribution method to implement. Various techniques are available to distribute client requests with the goal of mitigating the chance of overloading the server. Implementing a load balancer is an option to consider for scaling a fast growing web application.