Auto scaling is often presented as the ultimate selling point of cloud computing. It is seen as a handy emergency fuse kit that activates when something goes wrong and brings your environment back to normal in no time.
However, like any other feature, auto scaling works reliably only when it is configured properly.
In today’s post, we will take a more in-depth look at AWS Auto Scaling and explore how to use it properly to secure your EC2 deployment. It is also possible to auto scale other resources, such as Amazon ECS, DynamoDB, or Aurora, but today I will focus solely on EC2.
Let’s start with the basics and get all your questions about AWS EC2 auto scaling answered.
What Is Amazon EC2 Auto Scaling?
AWS Auto Scaling monitors your applications or instances and adjusts their capacity to maintain steady performance. The feature helps you secure your deployment in two ways.
First, auto scaling helps you keep the desired number of instances. If you need two instances to maintain your application’s performance, auto scaling makes sure the group never drops below that number. Likewise, if you simply want your application to always be up, you can place its instance in a scaling group of size one. If the instance encounters a problem, another one launches to replace it.
Second, auto scaling ensures that the number of instances is sufficient to maintain stable performance. New instances can be added to the group automatically when, for example, CPU utilization stays high for a given amount of time. The mechanism also works in the opposite direction: if, say, average CPU utilization across several instances stays below 10%, auto scaling can reduce the total number of instances.
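To make the second mechanism more tangible, here is a minimal sketch of such a scaling decision in plain Python. It does not call AWS at all; the `ScalingRule` class, the `next_capacity` function, and the thresholds are my own illustrative assumptions, not part of any AWS API.

```python
from dataclasses import dataclass


@dataclass
class ScalingRule:
    scale_out_above: float  # average CPU % above which one instance is added
    scale_in_below: float   # average CPU % below which one instance is removed
    min_size: int
    max_size: int


def next_capacity(current: int, avg_cpu: float, rule: ScalingRule) -> int:
    """Return the instance count after one evaluation of the rule."""
    target = current
    if avg_cpu > rule.scale_out_above:
        target += 1
    elif avg_cpu < rule.scale_in_below:
        target -= 1
    # Clamp so the group never leaves its configured range.
    return max(rule.min_size, min(rule.max_size, target))


# Hypothetical thresholds and group limits, chosen only for illustration.
rule = ScalingRule(scale_out_above=75.0, scale_in_below=10.0, min_size=2, max_size=4)
print(next_capacity(2, 90.0, rule))  # busy group: one instance is added
print(next_capacity(3, 5.0, rule))   # idle group: one instance is removed
print(next_capacity(2, 5.0, rule))   # already at the minimum: no change
```

The clamp in the last line of `next_capacity` is what keeps the first guarantee intact: however idle the group is, it never shrinks below the minimum.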
What Are AWS Auto Scaling Core Components?
Auto scaling in AWS relies on three kinds of components: auto scaling groups, launch configurations or launch templates, and scaling plans.
- Auto scaling groups are collections of EC2 instances. You can specify the minimum and maximum number of instances in a group, and auto scaling ensures that the group size never leaves that range. You can also set an exact number of instances.
- Launch configurations and launch templates serve as blueprints for the EC2 instances launched within auto scaling groups. When you create one, you provide information such as the AMI ID, instance type, key pair, security groups, user data, and storage. Launch templates are more sophisticated and let you specify more details than launch configurations. They can also be updated and versioned, which is not possible for launch configurations.
- The last component is a scaling plan: a policy that tells auto scaling when and how to scale. Scaling can be dynamic, based on specific conditions such as CPU utilization, or schedule-driven.
The X Factor – Elastic Load Balancer
There is one more crucial component that we have not mentioned yet: an Elastic Load Balancer, which distributes traffic between the EC2 instances within a group.
When a load balancer is registered with an auto scaling group, either directly (Classic Load Balancer) or through a target group (Application and Network Load Balancers), any EC2 instance launched within the group is automatically registered with that balancer or target group.
How Does Amazon EC2 Auto Scaling Work?
By now, you may have some idea already, but let me provide you with an overview of the process.
Instances are launched within a scaling group using a launch configuration or a launch template. Once they start, they automatically register with the Elastic Load Balancer (ELB), which spreads network traffic across the healthy instances. Auto scaling monitors the instances, and if any of them goes down, another one is launched to keep the number of instances at the desired level. CloudWatch collects the instance metrics and raises alarms, to which auto scaling reacts according to the scaling plans: it increases or decreases the total number of instances while keeping it between the configured minimum and maximum.
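The monitoring part of this process can be pictured as a reconciliation step that runs over and over. The sketch below is a toy model in plain Python, not the actual AWS implementation; `reconcile` and the instance IDs are hypothetical names I made up for the illustration.

```python
import itertools
from typing import Dict, Iterator


def reconcile(instances: Dict[str, bool], desired: int, ids: Iterator[int]) -> Dict[str, bool]:
    """Drop unhealthy instances, then launch replacements up to `desired`."""
    healthy = {iid: True for iid, ok in instances.items() if ok}
    while len(healthy) < desired:
        healthy[f"i-replacement-{next(ids)}"] = True  # freshly launched instance
    return healthy


ids = itertools.count(1)
group = {"i-first": True, "i-second": False}  # the second instance went down
group = reconcile(group, desired=2, ids=ids)
print(group)
```

After the call, the group again holds two healthy instances: the surviving one and a newly launched replacement.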
I’ve included some examples to illustrate how auto scaling can help you keep the total number of instances at the desired level and ensure that at least one healthy instance is running at any given time.
Suppose we have a single instance within an auto scaling group. If that instance is terminated, it is detected as unhealthy, and another instance launches to replace it.
Simple, isn’t it? However, more advanced setups with more than one instance need a more elaborate configuration. That’s when a fully configured auto scaling group comes in handy.
How to Create and Configure an Auto Scaling Group in AWS?
Let’s start with some configuration first. We will launch two instances within a scaling group. A Network Load Balancer will balance traffic between them, and we will also use a scaling policy to launch another instance when CPU utilization within the group exceeds 75%.
Create Network Load Balancer
First, we will create the Network Load Balancer.
It will be a simple balancer listening on TCP port 80, with its targets in a single subnet in a single AZ.
Next, we will create a new target group associated with the balancer.
As there are no targets to register yet, we are going to skip the next step.
In the Review section, verify your load balancer settings and complete the creation.
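If you prefer scripting to clicking, the same balancer can be described as API parameters. The sketch below only builds the request dictionaries for the boto3 `elbv2` calls `create_load_balancer`, `create_target_group`, and `create_listener`; it sends nothing to AWS. The function name, the balancer name, and the subnet and VPC IDs are placeholders I assumed for the example.

```python
def nlb_params(name: str, subnet_id: str, vpc_id: str) -> dict:
    """Build parameters for a single-subnet NLB with a TCP:80 target group."""
    return {
        "load_balancer": {
            "Name": name,
            "Type": "network",
            "Subnets": [subnet_id],  # one subnet, hence one AZ
        },
        "target_group": {
            "Name": f"{name}-targets",
            "Protocol": "TCP",
            "Port": 80,
            "VpcId": vpc_id,
            "TargetType": "instance",
        },
        # The listener also needs the ARNs returned by the first two calls.
        "listener": {"Protocol": "TCP", "Port": 80},
    }


params = nlb_params("demo-nlb", "subnet-0example", "vpc-0example")
# With boto3 installed and credentials configured, roughly:
#   elbv2 = boto3.client("elbv2")
#   lb = elbv2.create_load_balancer(**params["load_balancer"])
#   tg = elbv2.create_target_group(**params["target_group"])
#   elbv2.create_listener(
#       LoadBalancerArn=lb["LoadBalancers"][0]["LoadBalancerArn"],
#       DefaultActions=[{"Type": "forward",
#                        "TargetGroupArn": tg["TargetGroups"][0]["TargetGroupArn"]}],
#       **params["listener"])
print(params["load_balancer"])
```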
Create Launch Configuration
It is now time to prepare a launch configuration for the instances in the group.
Define the AMI, instance type, IAM role, name, and more detailed information like user data if needed. Add storage and assign a security group. Then review your settings and create the configuration.
Note that once you create a launch configuration, you won’t be able to edit it!
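One way to picture this immutability is a frozen data class. The sketch below is only a local model: `LaunchConfig` is a hypothetical class of mine, and all IDs are placeholders. Its `to_params` method produces keyword arguments in the shape that the boto3 `autoscaling` call `create_launch_configuration` accepts.

```python
from dataclasses import FrozenInstanceError, dataclass
from typing import Tuple


@dataclass(frozen=True)
class LaunchConfig:
    name: str
    image_id: str
    instance_type: str
    security_groups: Tuple[str, ...]
    key_name: str

    def to_params(self) -> dict:
        """Keyword arguments for create_launch_configuration."""
        return {
            "LaunchConfigurationName": self.name,
            "ImageId": self.image_id,
            "InstanceType": self.instance_type,
            "SecurityGroups": list(self.security_groups),
            "KeyName": self.key_name,
        }


lc_v1 = LaunchConfig("demo-lc-v1", "ami-0example", "t2.micro", ("sg-0example",), "demo-key")

edit_rejected = False
try:
    lc_v1.instance_type = "t2.small"  # editing in place is not allowed
except FrozenInstanceError:
    edit_rejected = True

print(edit_rejected, lc_v1.to_params())
```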
Later, I will show you how to change the configuration used for new instances in the scaling group. This knowledge becomes useful if you want to modify some parameters, e.g., the AMI or instance type. For now, with the load balancer and launch configuration in place, you can create an auto scaling group.
Create Auto Scaling Group
In EC2, choose Auto Scaling Groups and then Create Auto Scaling Group.
We will use the existing launch configuration and select the configuration we have just created.
In the next step, choose a name for the group, the initial group size (we will start with two instances), network, subnet and target groups.
Remember that the network (VPC) and subnet have to match the security group from the launch configuration, as well as the VPC and subnet of the load balancer associated with the chosen target group.
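For reference, these group settings map onto the parameters of the boto3 `autoscaling` call `create_auto_scaling_group`. The sketch below only assembles them. The helper name, the resource names, the target group ARN, and the minimum of two and maximum of three instances are my assumptions for this walkthrough.

```python
from typing import List


def asg_params(name: str, launch_config: str, subnet_ids: List[str],
               target_group_arns: List[str], size: int = 2, max_size: int = 3) -> dict:
    """Build parameters for a group that starts with `size` instances."""
    return {
        "AutoScalingGroupName": name,
        "LaunchConfigurationName": launch_config,
        "MinSize": size,
        "MaxSize": max_size,
        "DesiredCapacity": size,
        # The API expects the subnets as one comma-separated string.
        "VPCZoneIdentifier": ",".join(subnet_ids),
        "TargetGroupARNs": list(target_group_arns),
    }


group_params = asg_params(
    "demo-asg",
    "demo-lc-v1",
    ["subnet-0example"],
    ["arn:aws:elasticloadbalancing:REGION:ACCOUNT:targetgroup/demo-nlb-targets/ID"],
)
# With boto3: boto3.client("autoscaling").create_auto_scaling_group(**group_params)
print(group_params)
```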
In the next step, we will create an auto scaling policy. We want to add an instance when CPU utilization exceeds 75% and remove one when it drops below 50%. To do so, we will need to configure new CloudWatch alarms by clicking Add new alarm.
Create two alarms: the first to increase the group size and the second to decrease it.
Configure the policies to add one instance when the first alarm fires and to remove one when the second fires.
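The two alarm and policy pairs could be expressed as parameters for the boto3 calls `put_scaling_policy` (client `autoscaling`) and `put_metric_alarm` (client `cloudwatch`), as sketched below. The helper, the names, the 300-second period, and the single evaluation period are assumptions on my side; only the 75% and 50% thresholds come from this walkthrough.

```python
def cpu_policy_and_alarm(group: str, policy_name: str, adjustment: int,
                         threshold: float, comparison: str):
    """Return (policy, alarm) parameter dictionaries for one scaling direction."""
    policy = {
        "AutoScalingGroupName": group,
        "PolicyName": policy_name,
        "PolicyType": "SimpleScaling",
        "AdjustmentType": "ChangeInCapacity",
        "ScalingAdjustment": adjustment,  # +1 adds an instance, -1 removes one
    }
    alarm = {
        "AlarmName": f"{policy_name}-alarm",
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Statistic": "Average",
        "Dimensions": [{"Name": "AutoScalingGroupName", "Value": group}],
        "Period": 300,            # assumed: 5-minute data points
        "EvaluationPeriods": 1,   # assumed: react after a single period
        "Threshold": threshold,
        "ComparisonOperator": comparison,
        # "AlarmActions" would hold the PolicyARN returned by put_scaling_policy.
    }
    return policy, alarm


scale_out = cpu_policy_and_alarm("demo-asg", "cpu-high-add-one", 1, 75.0, "GreaterThanThreshold")
scale_in = cpu_policy_and_alarm("demo-asg", "cpu-low-remove-one", -1, 50.0, "LessThanThreshold")
print(scale_out[0]["ScalingAdjustment"], scale_in[0]["ScalingAdjustment"])
```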
Moving further, you can add notifications and tags, and eventually review the configuration.
Finish by clicking the Create Auto Scaling Group button.
The creation takes a few seconds, and new instances are launched to meet the desired quantity. A few minutes later, you will see them up and running.
New instances have been added to the scaling group.
They have also appeared in the target group, which means that they have been automatically registered with the load balancer.
Take a look at the status of the instances. They are healthy from the scaling group perspective but unhealthy in the context of the target group.
Nothing is listening on port 80 on the instances, so the target group health check fails and reports them as unhealthy. You can change the health check type to ‘ELB’ in the auto scaling group settings so that the group takes the load balancer health status into account. See what happens now:
Auto scaling terminates unhealthy instances, launches new ones to keep the desired group size and then terminates them again as unhealthy.
Finally, let’s change the health check type back to ‘EC2’ until a service is listening on port 80. Auto scaling then relies on the EC2 status checks alone and keeps the number of healthy instances at the desired level without the terminate-and-relaunch loop.
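The difference between the two health check types boils down to a small rule, sketched below as a toy model. `considered_healthy` is a hypothetical helper of mine; in a real group you would switch the type in the console or through the `HealthCheckType` parameter of `update_auto_scaling_group`.

```python
def considered_healthy(health_check_type: str, ec2_ok: bool, target_ok: bool) -> bool:
    """Decide whether the group keeps an instance or replaces it."""
    if health_check_type == "EC2":
        return ec2_ok                # only the EC2 status checks count
    if health_check_type == "ELB":
        return ec2_ok and target_ok  # the load balancer check counts as well
    raise ValueError(f"unknown health check type: {health_check_type}")


# Our instances run fine, but nothing answers on port 80 yet.
print(considered_healthy("EC2", ec2_ok=True, target_ok=False))  # instance is kept
print(considered_healthy("ELB", ec2_ok=True, target_ok=False))  # instance is replaced
```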
Amazon EC2 Scaling Policies
Part of our scaling plan was a policy under which a new instance automatically spins up when CPU utilization exceeds 75%, and one is terminated when it drops below 50%.
Let’s explore how the policy works.
There are two healthy instances in the group.
To increase the CPU load on the instances, we will use a small tool called stress. You can install it from the yum repository on Amazon Linux.
Let’s run it on the first instance and see what happens.
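The average CPU load on the first instance is now well above 70%.
However, our scaling policy does not launch a new instance yet. The alarm evaluates CPU utilization averaged across the whole group, and the idle second instance keeps that average below the 75% threshold. Let’s run the same stress on the second instance and wait a few minutes.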
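A quick back-of-the-envelope calculation shows why one busy instance is not enough, assuming the alarm watches the group-wide average. The CPU figures below are made-up illustrative values, not measurements from my instances.

```python
from statistics import mean
from typing import Dict

SCALE_OUT_THRESHOLD = 75.0  # from our scaling policy


def group_average(cpu_by_instance: Dict[str, float]) -> float:
    """Average CPU utilization across all instances in the group."""
    return mean(cpu_by_instance.values())


one_busy = {"i-first": 100.0, "i-second": 2.0}     # stress on one instance only
both_busy = {"i-first": 100.0, "i-second": 100.0}  # stress on both instances

for label, load in (("one busy", one_busy), ("both busy", both_busy)):
    average = group_average(load)
    print(label, average, "scale out" if average > SCALE_OUT_THRESHOLD else "no action")
```

With a single stressed instance, the average lands near the middle of the range, so neither the scale-out nor the scale-in alarm fires.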
Soon, you will see that a new instance has launched.
When the stress tool stops on both instances, the group scales in again and one of the three instances is terminated.
Scale in Protection Feature
It’s worth noting that the instance that gets terminated does not have to be the one that launched last.
If you need to prevent a specific instance from terminating, enable the Scale In Protection feature in the Instances tab of the auto scaling group.
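The same switch is available through the API. The sketch below builds the parameters for the boto3 `autoscaling` call `set_instance_protection`; the helper name, the group name, and the instance ID are placeholders of mine.

```python
from typing import List


def scale_in_protection(group: str, instance_ids: List[str], protected: bool = True) -> dict:
    """Build parameters that turn scale-in protection on or off."""
    return {
        "AutoScalingGroupName": group,
        "InstanceIds": list(instance_ids),
        "ProtectedFromScaleIn": protected,
    }


protect = scale_in_protection("demo-asg", ["i-0example"])
# With boto3: boto3.client("autoscaling").set_instance_protection(**protect)
print(protect)
```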
It Works, but What if I Need to Change Something in the Configuration?
I’ve mentioned before that you cannot edit launch configurations, but there is a way to make some modifications in new instances within the scaling group.
Although you are not able to modify a saved launch configuration, you can still create a new one and then attach it to the existing scaling group.
Any new instance launched within the group will use the new configuration. After the first such instance launches, you can manually terminate the instances that still use the old configuration and wait for auto scaling to launch replacements up to the desired quantity.
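The replacement procedure can be modeled as a simple loop. This is a toy simulation in plain Python with hypothetical names, meant only to show the order of events: terminate one old instance, let the group launch a replacement on the new configuration, and repeat.

```python
from typing import Dict


def roll_instances(instances: Dict[str, str], new_config: str) -> Dict[str, str]:
    """Replace every instance on an old configuration, one at a time."""
    result = dict(instances)
    launched = 0
    for instance_id, config in instances.items():
        if config == new_config:
            continue  # already on the new configuration
        del result[instance_id]  # we terminate the old instance manually
        launched += 1
        # The group launches a replacement using the attached configuration.
        result[f"i-replacement-{launched}"] = new_config
    return result


before = {"i-first": "demo-lc-v1", "i-second": "demo-lc-v1"}
after = roll_instances(before, "demo-lc-v2")
print(after)
```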
This feature becomes useful when you need to upgrade instances to more robust ones, or you have to update instances with the newest AMIs or the latest version of your application.
Auto Scaling Is Powerful but Tricky
Creating a fully-automated auto scaling environment can be a grueling exercise. Auto Scaling is an extremely beneficial and powerful tool provided you use it with caution.
Keep in mind that you must test any instance that is to launch automatically to verify that it works as desired. The easiest way to set instances up is to keep the configuration in a repository and use tools like Puppet, Chef, or Ansible to configure instances after they start. This process takes time, however, and baking the configuration into an Amazon Linux AMI helps save it. Deployment becomes much less laborious. On the other hand, you need to build a new AMI on every application or configuration update.
Another aspect is the environment capacity and the decision when to scale. Maybe it would be better for you to keep more instances than needed and avoid downtime? It takes time for the alarm and the policy to trigger a launch, and for the new instance itself to start. That time can be crucial when the existing instances cannot serve all incoming requests while the new one is still starting up. Perhaps you would prefer to add instances earlier, before the load gets that high? You have to consider all of these aspects when deploying your auto scaling setup and decide which approach is more suitable for your needs.
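To get a feeling for how much that delay matters, you can run a rough estimate like the one below. Every number in it is hypothetical (alarm period, boot time, request rate, per-instance capacity), so plug in your own measurements.

```python
def scale_out_delay(alarm_period_s: int, evaluation_periods: int, boot_time_s: int) -> int:
    """Seconds from the start of a load spike until the new instance serves traffic."""
    return alarm_period_s * evaluation_periods + boot_time_s


def unserved_requests(request_rate: float, capacity_per_instance: float,
                      instances: int, delay_s: int) -> float:
    """Requests exceeding the group's capacity while it waits for the new instance."""
    shortfall = max(0.0, request_rate - capacity_per_instance * instances)
    return shortfall * delay_s


delay = scale_out_delay(alarm_period_s=300, evaluation_periods=1, boot_time_s=120)
backlog = unserved_requests(request_rate=250, capacity_per_instance=100,
                            instances=2, delay_s=delay)
print(delay, backlog)
```

If the resulting backlog is more than your application can tolerate, that is an argument for a lower scale-out threshold or for keeping a spare instance running.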
New Auto Scaling Features
Last but not least, the auto scaling features are constantly being enhanced.
Earlier this year, Amazon launched a new, unified AWS Auto Scaling for cloud applications. Check the official AWS announcement to find out more: https://aws.amazon.com/blogs/aws/aws-auto-scaling-unified-scaling-for-your-cloud-applications/.
Questions? Comments? Share your thoughts!