Load balancers
How do load balancers distribute web traffic? There are several algorithms:
- Round-robin: each request is assigned to the next server in the list, one server after the other. This is also called the poor man's load balancer because it is not true load balancing: traffic is not distributed according to the actual load of each server.
- Weight-based: each server is given a weight and requests are assigned to the servers according to their weight. This can be an option if your web servers are not of equal quality and you want to direct more traffic to the stronger ones.
- Random: the server to handle the request is selected at random.
- Sticky sessions: the load balancer keeps track of sessions and ensures that return visits within a session always go to the same server.
- Least current request: route traffic to the server that currently has the fewest active requests.
- Response time: route traffic to the web server with the shortest response time.
- User or URL information: some load balancers offer the ability to distribute traffic based on the URL or on user information. Users from one geographic region may be sent to the server in that location, and requests can be routed based on the URL, the query string, cookies etc.
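The first few algorithms above can be sketched in a few lines of Python; the server names and request counts are made up for illustration:

```python
import itertools
import random

def round_robin(servers):
    """Round-robin: yield the servers one after the other, wrapping around."""
    return itertools.cycle(servers)

def weighted_choice(weights, rng=random):
    """Weight-based: pick a server at random, proportionally to its weight."""
    servers = list(weights)
    return rng.choices(servers, weights=[weights[s] for s in servers], k=1)[0]

def least_requests(active):
    """Least current request: the server with the fewest in-flight requests."""
    return min(active, key=active.get)

rr = round_robin(["web1", "web2", "web3"])
print([next(rr) for _ in range(4)])  # ['web1', 'web2', 'web3', 'web1']
print(least_requests({"web1": 7, "web2": 2, "web3": 5}))  # 'web2'
print(weighted_choice({"web1": 5, "web2": 1}))  # 'web1' roughly 5 times as often
```

Note that round-robin keeps no state about the servers at all, which is exactly why it is called the poor man's load balancer.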
Apart from algorithms, we can group load balancers according to the technology they use:
- Reverse Proxy: a reverse proxy takes an incoming request and makes another request on behalf of the client. The Reverse Proxy server acts as a man in the middle between the web server and the client, maintaining two separate TCP connections: one with the client and one with the web server. This option requires only minimal changes to your network architecture. The load balancer has full access to all the traffic passing through, allowing it to check for attacks and to manipulate the URL or header information. The downside is that, as the reverse proxy maintains the connection with the client, you may need to set a long time-out to allow for long sessions, e.g. a large file download, which opens the possibility of DoS attacks. Also, the web servers will see the load balancer as the client, so any logic based on headers like REMOTE_ADDR or REMOTE_HOST will see the IP of the proxy server rather than that of the original client. There are software solutions out there that rewrite the server variables and fool the web servers into thinking they have a direct line to the client.
- Transparent Reverse Proxy: similar to Reverse Proxy except that the TCP connection between the load balancer and the web server is set with the client IP as the source IP so the web server will think that the request came directly from the client. In this scenario the web servers must use the load balancer as their default gateway.
- Direct Server Return (DSR): this solution runs under different names such as nPath routing, one-arm LB, Direct Routing, or SwitchBack. This method forwards the web request by setting the web server's MAC address, with the result that the web server responds directly to the client. Its main advantage is speed: as the response doesn't go through the load balancer, even less capable load balancing solutions can handle a relatively large number of web requests. However, this solution doesn't offer some of the great options of other load balancers, such as SSL offloading – more on that later.
- NAT load balancing: NAT, which stands for Network Address Translation, works by changing the destination IP address of the packets to that of the selected web server.
- Microsoft Network Load Balancing: NLB manipulates the MAC address of the network adapters. The servers talk among themselves to decide which one of them will respond to the request. The next blog post is dedicated to NLB.
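To make the reverse proxy idea concrete, here is a minimal sketch using only the Python standard library. The backend addresses are assumptions, it handles GET requests only, and it skips the error handling, streaming and header copying a real load balancer needs. Note how it forwards the original client address in an X-Forwarded-For header, which is one common answer to the REMOTE_ADDR problem described above:

```python
import itertools
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from urllib.request import Request, urlopen

# Hypothetical backend addresses; substitute the nodes of your own web farm.
BACKENDS = itertools.cycle(["http://127.0.0.1:8081", "http://127.0.0.1:8082"])

class ProxyHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        backend = next(BACKENDS)                     # round-robin pick
        upstream = Request(backend + self.path)
        # Forward the original client address; the backend would otherwise
        # only see the proxy's IP in REMOTE_ADDR.
        upstream.add_header("X-Forwarded-For", self.client_address[0])
        with urlopen(upstream, timeout=10) as resp:  # second TCP connection
            body = resp.read()
            status = resp.status
        self.send_response(status)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

# Port 0 picks any free port; in practice you would bind e.g. port 80 or 8080
# and then call server.serve_forever() to start accepting clients.
server = ThreadingHTTPServer(("127.0.0.1", 0), ProxyHandler)
```

The two TCP connections mentioned in the Reverse Proxy bullet are visible here: one is accepted by the server, the other is opened by `urlopen`.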
Let's pick three types of load balancers and look at the features available to them:
- Physical load balancers that sit in front of the web farm, also called hardware load balancers
- ARR: Application Request Routing, an extension to IIS that can be placed in front of the web tier or directly on the web tier
- NLB: Network Load Balancing, which is built into Windows Server and performs some basic load balancing
No additional failure points
This point indicates whether the load balancing solution introduces any additional failure points into the overall network.
Physical machines are placed in front of your web farm and they can of course fail. You can deploy several of them to minimise the possibility of a failure, but this possible failure point remains.
With ARR you can put the load balancer in front of your web farm, on a separate machine or on a farm of load balancers, or directly on the same tier as the web servers. If it's on a separate tier then it offers some additional load balancing features. Putting it on the web tier adds complexity to the configuration but eliminates additional failure points, hence the -X sign in the appropriate cell.
NLB runs on the web server itself so there are no additional failure points.
Health checks
This feature indicates whether the load balancer can check that a web server is healthy. This usually means instructing the load balancer to periodically send a request to the web servers and expect some type of response: either a full HTML page or just an HTTP 200.
NLB is the only solution that does not have this feature. NLB will route traffic to any web server and remain oblivious to the answer, which can be an HTTP 500 or even no answer at all.
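A basic health check of the kind described above can be sketched as an HTTP probe that the load balancer runs periodically. The probe here only accepts an HTTP 200, which is an assumption; many load balancers let you match a full page or other status codes:

```python
import socket
from urllib.request import urlopen
from urllib.error import URLError

def is_healthy(url, timeout=2.0):
    """Probe one server: healthy only if it answers with HTTP 200 in time."""
    try:
        with urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (URLError, socket.timeout, OSError):
        return False  # refused connection, timeout, DNS failure...

def healthy_pool(servers, probe=is_healthy):
    """Keep only the servers that passed the last probe; route traffic to these."""
    return [s for s in servers if probe(s)]

# The probe is injectable so the pool logic can be tried without a live farm;
# these server names are hypothetical.
print(healthy_pool(["web1", "web2", "web3"], probe=lambda s: s != "web2"))
# ['web1', 'web3']
```

An unhealthy server simply drops out of the routing pool until a later probe succeeds again.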
Caching
This feature means caching static, or at least relatively static, elements of your web pages, such as CSS or JS files, or even entire HTML pages. The effect is that the load balancer does not have to contact the web servers for that type of content, which decreases response times.
NLB does not have this feature. If you put ARR on your web tier then this feature is not really available either, as it will be your web servers that perform the caching.
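The caching idea can be sketched as a small TTL cache that the load balancer consults before contacting the web tier. The cacheable extensions and the TTL here are assumptions; real load balancers typically also honour Cache-Control headers:

```python
import time

class StaticCache:
    """Cache responses for static paths (.css, .js, images) for a TTL."""
    def __init__(self, fetch, ttl=60.0, extensions=(".css", ".js", ".png")):
        self.fetch = fetch       # function that asks a real web server
        self.ttl = ttl
        self.extensions = extensions
        self.store = {}          # path -> (expiry time, body)
        self.hits = 0

    def get(self, path):
        cacheable = path.endswith(self.extensions)
        if cacheable:
            entry = self.store.get(path)
            if entry and entry[0] > time.monotonic():
                self.hits += 1
                return entry[1]  # served without touching the web tier
        body = self.fetch(path)  # dynamic or expired: ask a web server
        if cacheable:
            self.store[path] = (time.monotonic() + self.ttl, body)
        return body
```

Every cache hit is one request the web farm never sees, which is where the response-time improvement comes from.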
SSL offload
SSL offload means that the load balancer takes over the SSL encryption and decryption work from the web servers, which also adds to the overall efficiency. SSL is fairly expensive in CPU terms, so it's nice to relieve the web machines of that responsibility and hand it over to the probably far more powerful load balancer.
NLB doesn't have this feature. Also, if you put ARR on your web tier then this feature is not really available, as it will be your web servers that perform the SSL encryption and decryption.
A benefit of this feature is that you only have to install the certificate on the load balancer. Otherwise you must make sure to replicate the SSL certificate(s) on every node of the web farm.
If you go down that path then make sure to go through the SSL issuing process on one of the web farm servers: create a Certificate Signing Request (CSR) and send it to a certificate authority (CA). The certificate that the CA generates will only work on the server where the CSR was generated. Install the certificate on that server and then export it to the other servers. The CSR can only be used on one server, but an exported certificate can be used on multiple servers.
There's a new feature in IIS 8 called Central Certificate Store which lets you synchronise your certificates across multiple servers.
Geo location
Physical load balancers and ARR provide some geolocation features. You can deploy many load balancers throughout the world to be close to your customers, or have your load balancer point to geographically distributed data centers. In reality you're better off looking at cloud-based solutions or CDNs such as Akamai, Windows Azure or Amazon.
Low upfront cost
Hardware load balancers are very expensive. ARR and NLB are free, meaning you don't have to pay anything extra as they are built-in features of Windows Server and IIS. You'll probably want to put ARR on a separate machine, which involves some extra cost, but nowhere near what hardware load balancers will cost you.
Non-HTTP traffic
Hardware LBs and NLB can handle non-HTTP traffic, whereas ARR is a purely HTTP-based solution. So if you're looking to distribute other types of traffic, such as for SMTP-based mail servers, then ARR is not an option.
Sticky sessions
This feature, also called client affinity, means that if a client returns for a second request then the load balancer will direct that traffic to the same web server. This can be important for web servers that store session state locally: when the same visitor comes back, we don't want the state relevant to that user to be unavailable because the request was routed to a different web server.
Hardware LBs and ARR provide many options for sticky sessions, including cookie-based solutions. NLB can only perform IP-based sticky sessions; it knows nothing about cookies or HTTP traffic.
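IP-based affinity of the kind NLB offers can be sketched as a stable hash of the client address. This is a simplified illustration of the idea, not NLB's actual algorithm:

```python
import hashlib

def sticky_server(client_ip, servers):
    """IP-based affinity: hash the client address to a stable index, so the
    same client always lands on the same server while the farm is unchanged."""
    digest = hashlib.sha256(client_ip.encode()).digest()
    return servers[int.from_bytes(digest[:4], "big") % len(servers)]

# Hypothetical farm and client address, for illustration.
farm = ["web1", "web2", "web3"]
print(sticky_server("203.0.113.7", farm))  # same server on every call
```

Note that adding or removing a server changes the mapping for most clients, and that many clients behind one NAT gateway share an IP and therefore one server, which is part of why cookie-based affinity is usually preferable when available.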
Your target should be to avoid sticky sessions and solve session management in a different way; more on state management in a future post. With sticky sessions the load balancer is forced to direct traffic to a certain server irrespective of its actual load, defeating the purpose of load distribution. Also, if the server that received the first request becomes unavailable then the user will lose all session data and may receive an exception, or unexpected default values in place of the values saved in the session variables.
Other types of load balancers
Software
With software load balancers you provide your own hardware while using vendor-supported software for load balancing. The advantage is that you can size the hardware to match your load balancing needs, which can save you a lot of money.
The information above is taken from https://dotnetcodr.com/2013/06/17/web-farms-in-net-and-iis-part-1-a-general-introduction/; see that link for more details.