D Murli: November 2019

Load balancers

How do load balancers distribute the web traffic? There are several algorithms:

Round-robin: each request is assigned to the next server in the list, one server after the other. This is also called the poor man’s load balancer as this is not true load balancing. Web traffic is not distributed according to the actual load of each server.
Weight-based: each server is given a weight and requests are assigned to the servers according to their weight. Can be an option if your web servers are not of equal quality and you want to direct more traffic to the stronger ones.
Random: the server to handle the request is randomly selected
Sticky sessions: the load balancer keeps track of the sessions and ensures that return visits within the session always return to the same server
Least current request: route traffic to the server that currently has the least amount of requests
Response time: route traffic to the web server with the shortest response time
User or URL information: some load balancers offer the ability to distribute traffic based on the URL or the user information. Users from one geographic location region may be sent to the server in that location. Requests can be routed based on the URL, the query string, cookies etc.

Apart from algorithms we can group load balancers according to the technology they use:

Reverse Proxy: a reverse proxy takes an incoming request and makes another request on behalf of the user. We say that the Reverse Proxy server is a middle-man or a man-in-the-middle in between the web server and the client. The load balancer maintains two separate TCP connections: one with the user and one with the web server. This option requires only minimal changes to your network architecture. The load balancer has full access to the all the traffic on the way through allowing it to check for any attacks and to manipulate the URL or header information. The downside is that as the reverse proxy server maintains the connection with the client you may need to set a long time-out to prepare for long sessions, e.g. in case of a large file download. This opens the possibility for DoS attacks. Also, the web servers will see the load balancer server as the client. Thus any logic that is based on headers like REMOTE_ADDR or REMOTE_HOST will see the IP of the proxy server rather than the original client. There are software solutions out there that rewrite the server variables and fool the web servers into thinking that they had a direct line with the client.
Transparent Reverse Proxy: similar to Reverse Proxy except that the TCP connection between the load balancer and the web server is set with the client IP as the source IP so the web server will think that the request came directly from the client. In this scenario the web servers must use the load balancer as their default gateway.
Direct Server Return (DSR): this solution runs under different names such as nPath routing, 1 arm LB, Direct Routing, or SwitchBack. This method forwards the web request by setting the web server’s MAC address. The result is that the web server responds directly back to the client. This method is very fast which is also its main advantage. As the web response doesn’t go through the load balancer, even less capable load balancing solutions can handle a relatively large amount of web requests. However, this solution doesn’t offer some of the great options of other load balancers, such as SSL offloading – more on that later
NAT load balancing: NAT, which stands for Network Address Translation, works by changing the destination IP address of the packets
Microsoft Network Load Balancing: NLB manipulates the MAC address of the network adapters. The servers talk among themselves to decide which one of them will respond to the request. The next blog post is dedicated to NLB.

Let’s pick 3 types of load balancers and the features available to them:

Physical load balancers that sit in front of the web farm, also called Hardware
ARR: Application Request Routing which is an extension to IIS that can be placed in front of the web tier or directly on the web tier
NLB: Network Load Balancing which is built into Windows Server and performs some basic load balancing behaviour

No additional failure points:

This point means whether the loadbalancing solution introduces any additional failure points in the overall network.

Physical machines are placed in front of your web farm and they can of course fail. You can put a multiple of these to minimise the possibility of a failure but we still have this possible failure point.

With ARR you can put the load balancer in front of your web farm on a separate machine or a web farm of load balancers or on the same web tier as the web servers. If it’s on a separate tier then it has some additional load balancing features. Putting it on the same tier adds complexity to the configuration but eliminates additional failure points, hence the -X sign in the appropriate cell.

NLB runs on the web server itself so there are no additional failure points.

Health checks

This feature means whether the load balancer can check whether the web server is healthy. This usually means a check where we instruct the load balancer to periodically send a request to the web servers and expect some type of response: either a full HTML page or just a HTTP 200.

NLB is only solution that does not have this feature. NLB will route traffic to any web server and will be oblivious of the answer: can be a HTTP 500 or even no answer at all.

Caching

This feature means the caching of static – or at least relatively static – elements on your web pages, such as CSS or JS, or even entire HTML pages. The effect is that the load balancer does not have to contact the web servers for that type of content which decreases the response times.

NLB does not have this feature. If you put ARR on your web tier then this feature is not available really as it will be your web servers that perform caching.

SSL offload

SSL Offload means that the load balancer will take over the SSL encryption-decryption process from the web servers which also adds to the overall efficiency. SSL is fairly expensive from a CPU perspective so it’s nice to relieve the web machine of that responsibility and hand it over to the probably lot more powerful load balancer.

NLB doesn’t have this feature. Also, if you put ARR on your web tier then this feature is not available really as it will be your web servers that perform SSL encryption and decryption.

A benefit of this feature is that you only have to install the certificate on the load balancer. Otherwise you must make sure to replicate the SSL certificate(s) on every node of the web farm.

If you go down this path then make sure to go through the SSL issuing process on one of the web farm servers – create a Certificate Signing Request (CSR) and send it to a certificate authority (CA). The certificate that the CA generates will only work on the server where the CSR was generated.

Install the certificate on the web farm server where you initiated the process and then you can export it to the other servers. The CSR can only be used on one server but an exported certificate can be used on multiple servers.

There’s a new feature in IIS8 called Central Certificate Store which lets you synchronise your certificates across multiple servers.

Geo location

Physical loadbalancers and ARR provide some geolocation features. You can employ many load balancers throughout the world to be close to your customers or have your load balancer point to different geographically distributed data centers. In reality you’re better off looking at cloud based solutions or CDNs such as Akamai, Windows Azure or Amazon.

Low upfront cost

Hardware load balancers are very expensive. ARR and NLB are for free meaning that you don’t have to pay anything extra as they are built-in features of Windows Server and IIS. You probably want to put ARR on a separate machine so that will involve some extra cost but nowhere near what hardware loadbalancers will cost you.

Non-HTTP traffic

Hardware LBs and NLB can handle non-HTTP traffic whereas ARR is a completely HTTP based solution. So if you’re looking into possibilities to distribute other types of traffic such as for SMTP based mail servers then ARR is not an option.

Sticky sessions

This feature means that if a client returns for a second request then the load balancer will redirect that traffic to the same web server. It is also called client affinity. This can be important for web servers that store session state locally so that when the same visitor comes back then we don’t want the state relevant to that user to be unavailable because the request was routed to a different web server.

Hardware LBs and ARR provide a lot of options to introduce sticky sessions including cookie-based solutions. NLB can only perform IP-based sticky sessions, it doesn’t know about cookies and HTTP traffic.

Your target should be to avoid sticky sessions and solve your session management in a different way – more on state management in a future post. If you have sticky sessions then the load balancer is forced to direct traffic to a certain server irrespective of its actual load, thus beating the purpose of load distribution. Also, if the server that received the first request becomes unavailable then the user will lose all session data and may receive an exception or unexpected default values in place of the values saved in the session variables.

Other types of load balancers

Software

With software load balancers you can provide your own hardware while using the vendor-supported software for load balancing. The advantage is that you can provide your own hardware to meet your load balancing needs which can save you a lot of money.

Above valuable information I have copied from following link "https://dotnetcodr.com/2013/06/17/web-farms-in-net-and-iis-part-1-a-general-introduction/" for more details you can check this link

D Murli

Thursday, November 7, 2019

Load Balancer