As your cloud environment expands, your application instances scale out and their number grows daily. At some point, you'll need a mechanism to help you manage them all. One such solution is a load balancer, which not only distributes network request load across virtual instances but also provides several other features.
What is load balancing?
Load balancing is a mechanism that automatically distributes traffic across multiple servers or virtual instances. In AWS, you can manage your own load balancers installed on EC2 instances, such as F5 BIG-IP or the open-source HAProxy, or you can use the AWS native service called Elastic Load Balancing (ELB).
In this article, I will cover the basics of Elastic Load Balancer. We will run through a comparison of different types of Elastic Load Balancers, then I will provide you with the essential glossary terms, and finally, we’ll investigate the pricing options. Let’s dive in!
Getting Started with AWS Elastic Load Balancer (ELB)
Elastic Load Balancing is an AWS managed service providing highly available load balancers that automatically scale in and out according to your demand. In general, the service helps you achieve high availability by distributing traffic between healthy instances in multiple availability zones.
Although the ELB service is highly available within a single region, it is always recommended to use multiple availability zones to provide HA for your application. Additionally, to make achieving high availability easier, Elastic Load Balancers integrate with the Auto Scaling service, which allows scaling out the EC2 instances behind a load balancer and automatically registers new instances with the load balancer's target group.
How to choose the right load balancer on Amazon
Elastic Load Balancing provides three types of load balancers:
- Classic Load Balancer
- Network Load Balancer
- Application Load Balancer
The below table includes a per-feature comparison of the three balancers:
|Feature|Application Load Balancer|Network Load Balancer|Classic Load Balancer|
|---|---|---|---|
|Protocols|HTTP, HTTPS|TCP|TCP, SSL, HTTP, HTTPS|
|Connection draining (deregistration delay)|✔|✔|✔|
|Load balancing to multiple ports on the same instance|✔|✔||
|IP addresses as targets|✔|✔||
|Load balancer deletion protection|✔|✔||
|Configurable idle connection timeout|✔||✔|
|Cross-zone load balancing|✔|✔|✔|
|Server Name Indication (SNI)|✔|||
|Back-end server encryption|✔||✔|
|Elastic IP address||✔||
|Preserve source IP address||✔||
How do you choose the best one for your application? Let's see what the three types of load balancer offer in more detail.
Classic Load Balancer
This is the oldest type of load balancer provided by AWS. It can perform the following operations:
- passing through TCP traffic,
- handling HTTP/HTTPS requests,
- or terminating SSL and inserting HTTP headers.
Classic Load Balancers were initially designed for applications within the EC2-Classic network, and they are not recommended when using Virtual Private Clouds (VPCs). Let’s leave them out for now.
Network Load Balancer
The Network Load Balancer operates at Layer 4 (the transport layer) of the OSI model. When an NLB receives a connection request, it selects a target from the associated target group and attempts to open a TCP connection to it on the port specified in the listener configuration. A target can be an EC2 instance, a container, or an IP address; it cannot, however, be in a peered VPC or in a network reached over a VPN connection.
This type of balancing is capable of handling millions of requests per second while maintaining ultra-low latencies, which is essential for latency-sensitive applications. It is also optimized to handle sudden and volatile traffic patterns while using a single static IP address per Availability Zone.
Passing traffic on the Network Load Balancer
The main idea is that the Network Load Balancer listens on a specific port, and when traffic arrives, it is distributed between hosts within a target group. Additionally, with an NLB you can set up health checks to monitor the condition of instances. If any of them is unable to serve traffic, it is marked as unhealthy, and no traffic is routed to that instance until it recovers.
Hosts within a target group can be registered by IP address or by instance ID. However, when targets are registered by IP, the client's source IP cannot be preserved, and the proxy protocol has to be used to pass the client IP on to the target. This issue does not occur when targets are registered by instance ID.
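The behavior described above can be sketched in Python. This is a simplified illustration, not the actual NLB algorithm: it hashes the connection 5-tuple to pick a target from the healthy subset of a hypothetical target group, so every packet of one flow lands on the same target, and instances that failed their health check receive no traffic.

```python
import hashlib

# Hypothetical target group: instance targets with a health-check result.
targets = [
    {"id": "i-0a1b2c", "healthy": True},
    {"id": "i-0d3e4f", "healthy": False},  # failed its health check
    {"id": "i-0g5h6i", "healthy": True},
]

def select_target(src_ip, src_port, dst_ip, dst_port, protocol="tcp"):
    """Pick a healthy target by hashing the flow 5-tuple (simplified flow hashing)."""
    healthy = [t for t in targets if t["healthy"]]
    if not healthy:
        raise RuntimeError("no healthy targets in target group")
    key = f"{protocol}:{src_ip}:{src_port}:{dst_ip}:{dst_port}".encode()
    index = int(hashlib.sha256(key).hexdigest(), 16) % len(healthy)
    return healthy[index]["id"]

# The same flow always maps to the same target; unhealthy targets are skipped.
t1 = select_target("203.0.113.10", 54321, "10.0.0.5", 443)
t2 = select_target("203.0.113.10", 54321, "10.0.0.5", 443)
assert t1 == t2 and t1 != "i-0d3e4f"
```

Because the hash is computed over the whole 5-tuple, a new client port starts a new flow and may be routed to a different healthy target.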
Application Load Balancer
Application Load Balancers work similarly to Network Load Balancers, but they support only the HTTP and HTTPS protocols. After an ALB receives a request, the listener rules are evaluated to determine which rule applies, and a target host is then selected from the appropriate target group. Requests can be routed to different target groups based on the content of the application traffic. Just as with Network Load Balancers, health checks can be used so that traffic is routed only to healthy instances, and instances within a target group can be registered by IP address or instance ID.
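The content-based routing described above can be sketched as a small rule engine. The rule set, paths, and target-group names below are hypothetical; the point is the evaluation order: rules are checked by priority (lowest value first), the first match wins, and a default rule catches everything else.

```python
# Hypothetical ALB listener rules, evaluated in priority order (lowest first).
rules = [
    {"priority": 10, "path_prefix": "/api/", "target_group": "tg-api"},
    {"priority": 20, "path_prefix": "/static/", "target_group": "tg-static"},
]
DEFAULT_TARGET_GROUP = "tg-web"  # the default rule matches any request

def route(path):
    """Return the target group for a request path: first matching rule wins."""
    for rule in sorted(rules, key=lambda r: r["priority"]):
        if path.startswith(rule["path_prefix"]):
            return rule["target_group"]
    return DEFAULT_TARGET_GROUP

print(route("/api/users"))   # tg-api
print(route("/index.html"))  # tg-web (default rule)
```

A real ALB supports more condition types (host headers, HTTP headers, query strings), but they are evaluated with the same priority-ordered, first-match semantics.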
Internet-facing and internal load balancers
When setting up load balancers, you can choose whether they are to be Internet-facing or internal. An Internet-facing load balancer, as the name implies, receives requests from the Internet and passes them on to internal instances. It has a DNS name that should be used to send requests to the application. It is recommended to use the DNS name instead of an IP address, as the IPs can change, for instance when the load balancer scales out.
Internal load balancers cannot receive traffic from the Internet. They can be used, for instance, to spread traffic between different layers of your infrastructure: an Internet-facing load balancer distributes traffic between front-end servers, which then point to an internal load balancer distributing traffic between application servers.
Load balancers – a glossary of essential terms
Let’s go through the most important terms related to load balancing.
Listener
A process that waits for connection requests. It is similar to an SSH daemon on a Linux server waiting for incoming connections, but it offers more features: while the SSH daemon only listens on a specific port and protocol, a listener also has a port and protocol configured to send traffic to.
Application Load Balancers listen on HTTP or HTTPS ports, while Network Load Balancers can listen on any TCP port. Traffic is then forwarded to the port configured in the target group.
When using HTTPS with an Application Load Balancer, you can attach an SSL certificate to the load balancer and decrypt traffic before passing it to the target.
Rules
These are rules configured on the Application Load Balancer that determine how requests are routed to targets in one or more target groups. Rules are evaluated in priority order, from the lowest to the highest value. When a rule's condition is met, traffic is forwarded to the specified target group.
Targets
Targets are the destinations within target groups to which traffic is routed. They can be either instance IDs or IP addresses of instance interfaces.
If an instance ID is used, requests are routed to the primary private IP address specified in the primary network interface of the instance. If an IP address is used, it can be any private IP address from any interface of the instance. For instance ID targets, a load balancer can leverage an Auto Scaling group: when a target group is attached to the Auto Scaling group, new instances are automatically registered with the target group.
Target group
An object that allows you to group your targets and configure health checks for them.
Connection draining (deregistration delay)
A process that ensures a load balancer stops sending new requests to instances that are deregistering or unhealthy, while keeping existing connections open. Thanks to this mechanism, the load balancer can complete in-flight requests made to these instances.
Sticky sessions
This Application Load Balancer feature allows you to bind a session to a specific instance, ensuring that all requests within the same session are served by the same instance. Sessions on the Network Load Balancer are inherently sticky due to the flow hashing algorithm it uses.
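A minimal sketch of cookie-based stickiness, under simplified assumptions (the cookie name, its value format, and the round-robin choice for new sessions are all hypothetical, not the ALB's actual implementation): the first response pins the session to a target via a cookie, and later requests carrying that cookie go back to the same target.

```python
import itertools

# Hypothetical targets behind the load balancer.
targets = ["i-aaa", "i-bbb", "i-ccc"]
_rr = itertools.cycle(targets)   # round-robin for brand-new sessions
sessions = {}                    # stickiness cookie value -> pinned target

def handle_request(cookie=None):
    """Return (target, cookie). Reuse the pinned target when the cookie is known."""
    if cookie in sessions:
        return sessions[cookie], cookie
    target = next(_rr)                       # new session: pick the next target
    cookie = f"session-{len(sessions)}"      # hypothetical cookie value
    sessions[cookie] = target                # pin the session to this target
    return target, cookie

t1, c = handle_request()          # new session, gets pinned to some target
t2, _ = handle_request(cookie=c)  # same cookie -> same target
assert t1 == t2
```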
How does AWS load balancing pricing work?
The AWS ELB pricing depends on the balancing type.
Application Load Balancer pricing
With the Application Load Balancer, you pay for each hour (or partial hour) that the balancer is running, plus for the number of Load Balancer Capacity Units (LCUs) consumed per hour (or partial hour).
LCUs measure the dimensions on which the Application Load Balancer processes your traffic (averaged over an hour). These four dimensions are:
- New connections – the number of newly established connections per second
- Active connections – the number of active connections per minute
- Bandwidth – the amount of traffic processed by the load balancer in Mbps
- Rule evaluations – the product of the number of rules processed by your load balancer and the request rate. The first ten rules are free: Rule evaluations = Request rate × (Number of rules processed − 10 free rules).
You pay only for the dimension with the highest usage. An LCU contains:
- 25 new connections per second
- 3,000 active connections per minute
- 2.22 Mbps (which translates to 1 GB per hour)
- 1,000 rule evaluations per second
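The billing rules above can be expressed as a small calculation: divide your usage in each dimension by the per-LCU quota listed above, and you are billed for the dimension with the highest result. The traffic numbers in the example are made up for illustration.

```python
def alb_lcus(new_conns_per_sec, active_conns_per_min, mbps, rules, requests_per_sec):
    """LCUs consumed per hour: the maximum across the four ALB dimensions."""
    # The first 10 rules are free.
    rule_evals_per_sec = requests_per_sec * max(rules - 10, 0)
    dims = {
        "new_connections": new_conns_per_sec / 25,        # 25 new conns/s per LCU
        "active_connections": active_conns_per_min / 3000, # 3,000 active/min per LCU
        "bandwidth": mbps / 2.22,                          # 2.22 Mbps (1 GB/hour) per LCU
        "rule_evaluations": rule_evals_per_sec / 1000,     # 1,000 evals/s per LCU
    }
    return max(dims.values())

# Example: 50 new conns/s, 4,000 active conns/min, 1 Mbps, 15 rules, 100 req/s.
# new=2.0, active≈1.33, bandwidth≈0.45, rules=0.5 -> billed on 2.0 LCUs.
assert abs(alb_lcus(50, 4000, 1.0, 15, 100) - 2.0) < 1e-9
```

Multiply the result by the regional LCU-hour price (check the current AWS pricing page) and add the per-hour charge for the load balancer itself.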
Network Load Balancer pricing
Calculations for the Network Load Balancer are similar, but only three dimensions are measured:
- New connections or flows
- Active connections or flows
- Bandwidth – the amount of traffic processed by the load balancer in Mbps
An LCU for the Network Load Balancer (NLCU) contains:
- 800 new non-SSL connections or flows per second
- 100,000 active connections or flows (sampled per minute)
- 2.22 Mbps (which translates to 1GB per hour).
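The same maximum-dimension rule applies, just with the NLCU quotas listed above; the example traffic figures are hypothetical.

```python
def nlb_nlcus(new_flows_per_sec, active_flows, mbps):
    """NLCUs consumed per hour: the maximum across the three NLB dimensions."""
    dims = (
        new_flows_per_sec / 800,   # 800 new connections/flows per second per NLCU
        active_flows / 100_000,    # 100,000 active flows (sampled per minute) per NLCU
        mbps / 2.22,               # 2.22 Mbps (1 GB/hour) per NLCU
    )
    return max(dims)

# Example: 400 new flows/s, 50,000 active flows, 4.44 Mbps -> bandwidth dominates.
assert abs(nlb_nlcus(400, 50_000, 4.44) - 2.0) < 1e-9
```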
Which AWS Load Balancing should you choose?
It… depends. One thing is certain: if you are planning to implement a scalable environment in AWS, Auto Scaling alone won't suffice.
ELB configuration is easy and straightforward. All you need to do is choose the right load balancer for your application. If you need to handle millions of requests per second or you use protocols other than HTTP/HTTPS, go with the Network Load Balancer. To offload SSL, choose the Application Load Balancer.
Analyze your needs, write down your requirements, and study the documentation before making the final decision. Once you select your load balancing mechanism, move on to the configuration. That is where the fun begins.