What is Load Balancing?
Load balancing is the process of distributing incoming requests across multiple servers that provide the same information and resources. This lightens the load on each server, since user requests are spread across a cluster rather than directed to a single machine. Load balancing also helps improve reliability and redundancy: if one server fails, requests are simply redirected to another working server.
As web applications grow, so does the need for more resources. Load balancing is a straightforward method to help minimize those growing pains and facilitate application scaling.
How Does it Work?
The way a load balancer works is quite simple:
- The load balancer is, in most cases, a software program that listens on the port where client requests arrive.
- When a request comes in, the load balancer takes that request and forwards it to a backend server that is under acceptable load.
- The backend server then fulfills the request and replies to the load balancer.
- Finally, the load balancer passes on the reply from the backend server to the client.
This way the user is never aware of the division of functions between the load balancer and each backend server.
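The steps above can be sketched in a few lines of Python. This is only an in-memory illustration, not a network proxy: the `Backend` and `LoadBalancer` classes, the `max_requests` threshold used to stand in for "acceptable load", and the request strings are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    """Hypothetical stand-in for a backend server."""
    name: str
    active_requests: int = 0
    max_requests: int = 100  # assumed threshold for "acceptable load"

    def handle(self, request: str) -> str:
        # A real backend would do actual work here.
        return f"{self.name} handled {request}"


class LoadBalancer:
    """Receives client requests and forwards them to a backend."""

    def __init__(self, backends):
        self.backends = list(backends)

    def forward(self, request: str) -> str:
        # Pick the first backend that is under acceptable load.
        for backend in self.backends:
            if backend.active_requests < backend.max_requests:
                backend.active_requests += 1
                try:
                    # The backend fulfills the request and replies.
                    reply = backend.handle(request)
                finally:
                    backend.active_requests -= 1
                # The reply is passed back to the client.
                return reply
        raise RuntimeError("no backend is under acceptable load")


lb = LoadBalancer([Backend("a", active_requests=100), Backend("b")])
reply = lb.forward("GET /")  # "a" is saturated, so "b" answers
```

The client only ever calls `forward`, which mirrors the point that the user never sees which backend did the work.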
Load Balancing Techniques
Which load balancing technique you should use depends on what you want to achieve. The following describes three load balancing techniques.
Round Robin
This is the simplest method of load balancing. Using round robin, the load balancer goes down the list of servers sequentially and passes one request to each server in turn. When it reaches the end of the list, the load balancer restarts the process from the beginning.
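A minimal sketch of this rotation, assuming a fixed list of hypothetical server names and using `itertools.cycle` from the Python standard library to wrap around at the end of the list:

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out servers in list order, wrapping around at the end."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self._rotation = cycle(list(servers))

    def next_server(self):
        return next(self._rotation)


balancer = RoundRobinBalancer(["server-a", "server-b", "server-c"])
picks = [balancer.next_server() for _ in range(4)]
# picks == ["server-a", "server-b", "server-c", "server-a"]
```

Note that the sketch never checks whether a server is healthy or busy; it simply keeps rotating.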
This method is straightforward and easy to implement; however, it can pose problems since:
- Not all servers may have the same capacity
- Not all servers may have the same storage
- Not all servers may be up
This results in a less than optimal distribution of client requests: one server may become overloaded before the others, yet it will continue to receive requests regardless.
Two ways around this are weighted round robin and dynamic round robin. With weighted round robin, the site administrator assigns a weight to each server based on its capacity to handle requests. With dynamic round robin, the weight is instead assigned dynamically, based on real-time statistics of the server’s load.
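As a rough illustration of weighted round robin, the sketch below repeats each server in the rotation as many times as its weight. The server names and weights are hypothetical, and this naive expansion is only one possible approach; a real load balancer may interleave servers more evenly. A dynamic variant could rebuild the rotation whenever fresh load statistics arrive.

```python
from itertools import cycle
from typing import Dict, Iterator


def weighted_rotation(weights: Dict[str, int]) -> Iterator[str]:
    """Return an endless rotation where each server appears `weight` times."""
    expanded = [name for name, weight in weights.items() for _ in range(weight)]
    if not expanded:
        raise ValueError("at least one server needs a positive weight")
    return cycle(expanded)


# Assumed weights: "big" can handle three times the traffic of "small".
rotation = weighted_rotation({"big": 3, "small": 1})
picks = [next(rotation) for _ in range(8)]
# "big" receives 6 of the 8 requests, "small" receives 2
```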
IP Hash
The vSphere Networking document describes IP Hash load balancing as:
Choose an uplink based on a hash of the source and destination IP addresses of each packet. For non-IP packets, whatever is at those offsets is used to compute the hash.
Put simply, IP Hash uses the client’s IP address to determine which server receives the request, so a given client is consistently sent to the same server. The main drawback of this method is that the load balancer often does not distribute requests equally, because the split depends on which client addresses happen to be sending traffic.
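A simplified sketch of the idea, assuming only the client's address is hashed and that the server list is fixed. The function name and addresses are hypothetical. `hashlib` is used instead of Python's built-in `hash()` because string hashes from `hash()` can differ between interpreter runs.

```python
import hashlib
from typing import List


def pick_server(client_ip: str, servers: List[str]) -> str:
    """Map a client IP address to one server in a stable way."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    # Turn the first 8 bytes of the digest into an index into the list.
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


servers = ["server-a", "server-b", "server-c"]
first = pick_server("203.0.113.7", servers)
second = pick_server("203.0.113.7", servers)
# The same client address always maps to the same server.
```

If most traffic comes from a handful of addresses, those addresses decide which servers are busy, which is the uneven distribution described above.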
Least Connections
This method compares the number of connections on each server in the cluster and uses that to determine which server should process the next request. Each server’s capacity limitations are also taken into account when determining which has the fewest connections and will receive the next request.
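One way to sketch this, assuming "capacity" means a maximum number of concurrent connections and that servers are compared by how full they are relative to that limit. The `Server` class, its fields, and the sample numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Server:
    name: str
    capacity: int          # assumed maximum concurrent connections
    connections: int = 0   # connections currently open


def pick_least_connections(servers: List[Server]) -> Server:
    """Choose the server with the lowest connections-to-capacity ratio."""
    available = [s for s in servers if s.connections < s.capacity]
    if not available:
        raise RuntimeError("all servers are at capacity")
    return min(available, key=lambda s: s.connections / s.capacity)


pool = [
    Server("a", capacity=100, connections=50),  # 50% full
    Server("b", capacity=10, connections=2),    # 20% full
    Server("c", capacity=10, connections=10),   # at capacity, skipped
]
chosen = pick_least_connections(pool)  # server "b"
```

Comparing ratios rather than raw counts keeps a small server from being treated the same as a large one with the same number of open connections.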
Summary
For scalability purposes, load balancing can be an efficient and straightforward resource distribution method to implement. Various techniques are available to distribute client requests, with the goal of reducing the chance of overloading any single server. Implementing a load balancer is an option to consider when scaling a fast-growing web application.