Okay, so you understand webfarms now. What's the magic that actually distributes the load, and how does it decide where each request goes?
At ORCS Web we use a redundant pair of Foundry Server Iron switches to perform our webfarm load-balancing. If one of them fails, the other instantly takes over (in our testing, failover took less than a second!).
So what is this "Server Iron" thing? In simplest terms, it's a layer 4-7 switch. It has multiple network ports and can be used just like other types of switches, but it can also perform load balancing and traffic distribution. A VIP (virtual IP) can be assigned to the SI (Server Iron), which then handles all traffic sent to that address. Further configuration tells the SI what to actually do with the traffic sent to the VIP.
The traffic that hits the VIP on the Server Iron is, of course, distributed across a number of server nodes so the client request can be satisfied - that's the whole point of a webfarm. If one or more server nodes stop responding, the switches detect this and send all new requests to the servers that are still online, making the failure of a server node almost transparent to the client.
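The switch handles this health checking itself, but the idea is easy to picture. Here's a minimal Python sketch of the concept - the node addresses are made up, and a simple TCP check stands in for whatever the switch actually does internally:

```python
import socket

# Made-up node addresses, purely for illustration.
NODES = ["10.0.0.11", "10.0.0.12", "10.0.0.13"]

def is_online(node, port=80, timeout=2.0):
    """A node counts as online if it accepts a TCP connection on the web port."""
    try:
        with socket.create_connection((node, port), timeout=timeout):
            return True
    except OSError:
        return False

def eligible_nodes():
    """New client requests only go to nodes that pass the check."""
    return [n for n in NODES if is_online(n)]
```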
The traffic can be distributed using a few different algorithms. The most common are listed below (a simplified sketch of each follows the list):
- Round Robin: The switches send requests to each server in rotation, regardless of how many connections each server has or how fast it may reply.
- Fastest response: The switches select the server node with the fastest response time and send new connection requests to that server.
- Least connections: The switches send traffic to whichever server node shows as having the fewest active connections.
- Active-passive: This is called Local/Remote on a Foundry switch, but it is still basically active/passive. One or more servers are designated as "local," which marks them as primary for all traffic. This is combined with one of the methods above to determine the order in which the "local" server nodes receive requests. If all of the "local" (active) server nodes were to go down, traffic would then be sent to the "remote" server nodes. Note that "remote" in this case doesn't really have to mean remote - the "remote" server could be sitting right next to the "local" servers but is marked as remote in the configuration so it operates as a hot-standby server. This setting can also be used in a true remote situation where there are servers in a different physical data center - perhaps for extreme disaster recovery scenarios.
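To make the differences concrete, here is a rough Python sketch of the four methods. The node records and their fields (connection counts, response times, a "local" flag) are invented for illustration - this is not how the Server Iron is configured or implemented, just the gist of each selection rule:

```python
import itertools

# Invented node stats - the fields are assumptions for illustration only.
nodes = [
    {"name": "web1", "connections": 12, "response_ms": 45, "local": True},
    {"name": "web2", "connections": 7,  "response_ms": 80, "local": True},
    {"name": "web3", "connections": 3,  "response_ms": 60, "local": False},  # "remote" hot-standby
]

# Round robin: hand out nodes in rotation, ignoring load and speed.
round_robin = itertools.cycle(nodes)

def fastest_response(candidates):
    """Send the new connection to whichever node has been answering quickest."""
    return min(candidates, key=lambda n: n["response_ms"])

def least_connections(candidates):
    """Send the new connection to the node with the fewest active connections."""
    return min(candidates, key=lambda n: n["connections"])

def local_remote(candidates, secondary=least_connections):
    """Prefer "local" (active) nodes; only use "remote" (standby) nodes if no local node is up."""
    local = [n for n in candidates if n["local"]]
    return secondary(local if local else candidates)

print(next(round_robin)["name"])         # web1, then web2, web3, web1, ...
print(fastest_response(nodes)["name"])   # web1
print(least_connections(nodes)["name"])  # web3
print(local_remote(nodes)["name"])       # web2 - fewest connections among the "local" nodes
```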
What method is best? It really depends on your application and some other surrounding factors. Each method is solid, though, and will probably satisfy your requirements regardless of which one you choose - especially if you are closely monitoring each server node with an external tool (rather than relying only on the load-balancing switch itself). With external monitoring in place, you can confirm that all server nodes are operating without errors and within the response-time thresholds you have set.
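As an example of what such an external check might look like, here is a small Python sketch that fetches a page from one node and flags it if the request errors out or exceeds a response-time threshold. The URL and threshold are hypothetical, not a description of any particular monitoring product:

```python
import time
import urllib.request

def node_is_healthy(url="http://web1.example.com/", max_ms=500):
    """Request a page from one node; fail on errors or responses slower than max_ms."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            ok = (resp.status == 200)
    except OSError:
        return False
    elapsed_ms = (time.monotonic() - start) * 1000
    return ok and elapsed_ms <= max_ms
```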
Also, remember that regardless of which traffic algorithm is chosen, if a node goes down, traffic is sent to the remaining nodes. And when a node comes back online, it can automatically be placed back into the webfarm and start receiving client requests again.
Clustered hosting does require some consideration of how state is managed within applications, which will be covered in a future article.
By Brad Kingsley, President and Founder of ORCS Web, Inc. - a company that provides managed hosting services for clients who develop and deploy their applications on Microsoft Windows platforms.