Many organizations have already kicked off “Agile” and “DevOps” transformation plans on the people and process levels. However, they quickly discovered that it’s hard to reap the benefits of this change due to a lack of a supporting architecture that abides by Conway’s law.
It’s tough to create business agility if each alteration involves a handoff between layers upon layers of IT components and teams responsible for their management. One of the most common strategies to address this problem is to adopt the microservice architecture pattern. This approach allows small two-pizza teams to own a complete vertical functionality end-to-end while leveraging the principles and best practices of SOA.
Microservices and IaaS providers like AWS form a natural marriage. When your task is to design a microservice-based architecture on AWS, you can choose between two quick ways to approach this without high entry costs:
- Docker containers on ECS/ECR or
- Serverless lambda functions
*If you require a higher level of control and need to run your own service orchestration infrastructure (like Kubernetes) on “bare” EC2, you’re always free to do so on AWS. This, however, is a different ball game and beyond the scope of this article.
Run Docker containers
AWS offers a framework for container management in the form of the EC2 Container Service and the EC2 Container Registry.
EC2 Container Service (ECS)
This is a container orchestration framework similar to Docker Swarm or Marathon but fully managed by AWS. Its core building block is a task comprising a set of interconnected containers (something along the lines of Docker Compose).
ECS operates on top of a cluster of EC2 instances that auto-scale according to user-defined rules. By default, the EC2 images are based on dedicated Amazon ECS-optimized AMIs (preconfigured with the ECS agent and Docker). You can use your own AMIs if needed (e.g., CoreOS). Besides sizing the EC2 cluster, you can also manage individual tasks by dynamically adjusting the number of active container instances.
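To make the task abstraction concrete, here is a minimal sketch of an ECS task definition, shaped as the parameters that boto3's `ecs.register_task_definition()` accepts. The service name, account ID, image URI, and ports are hypothetical placeholders.

```python
# Sketch of an ECS task definition: one "essential" container, sized in
# CPU units and MiB, with a dynamic host port so an ELB can route to it.
# All names and the account/region in the image URI are hypothetical.
task_definition = {
    "family": "orders-service",            # logical name for this task
    "containerDefinitions": [
        {
            "name": "orders-api",
            "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/orders:latest",
            "cpu": 256,                    # CPU units (1024 = one vCPU)
            "memory": 512,                 # hard memory limit in MiB
            "essential": True,             # task stops if this container dies
            "portMappings": [
                {"containerPort": 8080, "hostPort": 0}  # 0 = dynamic host port
            ],
        }
    ],
}

# With boto3 this would be registered roughly like so (not executed here):
#   import boto3
#   boto3.client("ecs").register_task_definition(**task_definition)
```

A task can bundle several such container definitions, which is what makes it resemble a Docker Compose file.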
ECS integrates with ELB, which can function as an internal API gateway. This spares you from hand-rolling client-side service discovery: you won't have to set up routing daemons like linkerd, query service SRV records, or maintain Nginx templates.
From a pricing standpoint, ECS doesn’t incur any additional costs besides the EC2 resources that provide the necessary processing power.
EC2 Container Registry (ECR)
ECR is a highly available and scalable artifact repository responsible for storing Docker images. Think Docker Registry, but without the hassle of having to set it up from scratch or worrying about storage exhaustion during an important presentation of your new deployment pipeline (yes, this happened to one of my colleagues). With ECR you pay only for the storage that you leverage for your images.
A typical code delivery setup involving ECS and ECR would look like this:
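One hedged sketch of such a pipeline, expressed as the commands a CI job might issue. The account ID, region, repository, and service names are hypothetical; the commands are built as argument lists but not executed here.

```python
# Sketch of a delivery flow: build the image, tag and push it to ECR,
# then point the ECS service at a new task definition revision.
# Registry, repo, and service names below are illustrative assumptions.
REGISTRY = "123456789012.dkr.ecr.us-east-1.amazonaws.com"
REPO, TAG = "orders", "1.4.2"

def build_push_commands(registry, repo, tag):
    image = f"{registry}/{repo}:{tag}"
    return [
        ["docker", "build", "-t", f"{repo}:{tag}", "."],   # build locally
        ["docker", "tag", f"{repo}:{tag}", image],         # tag for ECR
        ["docker", "push", image],                         # push to ECR
        # Roll the ECS service onto the new task definition revision:
        ["aws", "ecs", "update-service", "--service", repo,
         "--task-definition", f"{repo}-task"],
    ]

commands = build_push_commands(REGISTRY, REPO, TAG)
```

In a real setup each list would be handed to the CI runner (e.g., via `subprocess.run`), with `aws ecr get-login` or equivalent authentication happening first.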
Run serverless Lambda functions
AWS Lambda is a service that enables us to fire specific functions in response to events. An event can originate from a plethora of sources, e.g., AWS API Gateway, Kinesis Streams, CloudWatch Events, and others.
All you have to do here is to prepare your code, use the native Lambda API to denote a handler function (sync or async) and upload it to AWS. Additionally, you define the max memory limit for function execution, with CPU and network bandwidth proportionally allocated.
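A minimal sketch of what such a handler might look like in Python. The event shape and the function's purpose (acknowledging an order) are illustrative assumptions, not an AWS-prescribed schema.

```python
# Minimal Lambda handler sketch. Lambda invokes the entry point you
# configure (e.g., "module.handler") with the triggering event and a
# runtime context object (unused here). The event fields are made up.
import json

def handler(event, context):
    order_id = event.get("order_id", "unknown")
    return {
        "statusCode": 200,
        "body": json.dumps({"order_id": order_id, "status": "accepted"}),
    }
```

You would zip this module, upload it, point Lambda at `module.handler`, and pick a memory size; CPU and network follow from that choice.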
As you can see, nowhere does this simple process even mention the notion of a “server.” Hence the term “serverless.” Containers that execute the different lambda functions are created and scaled in the background, in response to incoming requests. This process is invisible to the AWS customer, who only has to deal with the underlying domain logic and can neglect infrastructure concerns altogether (well, almost…please see the comparison below).
To aid the governance of functions and their lifecycle across successive environments, Lambda provides features like environment variables, versioning, and aliases.
Effectively, AWS Lambda charges you per:
- the number of function calls
- the memory allocated to the function
- the execution duration of each call, rounded up to the nearest 100 ms
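The billing model above can be sketched as back-of-the-envelope arithmetic. The unit prices below are illustrative (roughly the published us-east-1 rates at the time of writing; the free tier is ignored), so treat the numbers as a sketch, not a quote.

```python
# Back-of-the-envelope Lambda bill for one month, using illustrative
# unit prices (free tier ignored). Duration is billed per call in
# 100 ms increments, rounded up.
PRICE_PER_REQUEST = 0.20 / 1_000_000      # USD per invocation (assumed)
PRICE_PER_GB_SECOND = 0.00001667          # USD per GB-second (assumed)

def monthly_cost(calls, memory_mb, avg_duration_ms):
    billed_ms = -(-avg_duration_ms // 100) * 100      # ceil to 100 ms
    gb_seconds = calls * (memory_mb / 1024) * (billed_ms / 1000)
    return calls * PRICE_PER_REQUEST + gb_seconds * PRICE_PER_GB_SECOND

# 5M calls/month at 512 MB, averaging 120 ms (billed as 200 ms each):
cost = monthly_cost(5_000_000, 512, 120)   # ≈ $9.34 per month
```

Note how the 100 ms rounding matters: a 120 ms function pays for 200 ms, so shaving a function under the next 100 ms boundary directly cuts the bill.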
ECS Cluster vs. AWS Lambda – Feature-by-Feature Comparison
Let’s analyze in more detail how the two approaches stack up. Here’s a feature-by-feature comparison of ECS Cluster versus AWS Lambda based on 11 parameters.
1. Simplicity
In terms of simplicity, ECS cluster configuration and maintenance may pose a significant challenge and require a solid background in containerization technology. On the other hand, Lambda abstracts away infrastructure complexity in favor of focusing on business outcomes delivered through code.
Moreover, Lambda enables rapid code updates using per-environment function aliases (no need for separate testing environments!), provides monitoring via CloudWatch, and has security baked in via IAM Roles. Specifically, Lambda makes it EXTREMELY simple to build event processing pipelines and create backend services for external APIs (by using AWS API gateway integration). When we couple this with frameworks like AWS SAM and Serverless, creating new environments becomes a piece of cake.
Score: Lambda wins hands down
2. Usage patterns
Unpredictable, bursty traffic scenarios are AWS Lambda’s sweet spot. In these circumstances, Lambda is the most cost-effective and convenient choice. Thanks to the built-in auto-scaling feature, it requires no upfront capacity planning whatsoever.
In contrast, in low-traffic use cases you may encounter uneven performance with Lambda, due to the “cold start” effect that is a natural consequence of the technology. If a function hasn’t been invoked within a given grace period, AWS may have to spin up a fresh container and load your code into memory before handling the request.
With ECS it’s a different story. Here you can define the minimum number of active service tasks at any given moment (you will have to pay for the resources you keep on the back burner though).
Also, for highly predictable traffic scenarios, ECS would be the way to go. For instance, one of our customers needed to poll GPS metrics from thousands of data providers at 5-second intervals; for them, a cluster of reserved EC2 instances proved an order of magnitude cheaper than relying on Lambda.
In the case of recurring usage spikes (daily, weekly, seasonal), ECS gives you full control over your cluster’s scaling behavior. Scryer by Netflix used historical data and the FFT algorithm (the Fast Fourier transform, used for MP3 encoding, among other things) to automate EC2 scaling in preparation for anticipated traffic. Not only did they significantly cut AWS server costs, but they also improved customer experience through “just in time” provisioning of processing power.
Since spinning up a new EC2 instance can take minutes, reactive scaling was often too slow for Netflix to handle sudden spikes in request volume (you can, of course, tune your scaling rules to be more aggressive, but this incurs additional costs and flaky cluster behavior). With Scryer, they managed to overcome this problem completely.
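The core trick behind Scryer-style prediction can be illustrated with a toy example: run a discrete Fourier transform over historical request counts to find the dominant cycle, then scale out ahead of the predicted peak. This is a pure-stdlib sketch of the idea, not Netflix's implementation; real predictive autoscaling is far more involved.

```python
# Toy illustration of FFT-based traffic prediction: a naive DFT over
# historical request counts finds the strongest recurring cycle.
import cmath
import math

def dominant_period(samples):
    """Return the strongest cycle length (in samples) via a naive DFT."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]      # drop the DC component
    best_k, best_mag = 1, 0.0
    for k in range(1, n // 2):                  # frequencies below Nyquist
        coeff = sum(centered[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))
        if abs(coeff) > best_mag:
            best_k, best_mag = k, abs(coeff)
    return n // best_k                          # period in samples

# One week of synthetic hourly request counts with a 24-hour rhythm:
hourly = [1000 + 400 * math.sin(2 * math.pi * t / 24) for t in range(168)]
period = dominant_period(hourly)               # detects the daily cycle
```

Having detected a daily cycle, a scheduler could pre-provision EC2 capacity shortly before each predicted peak instead of reacting after the spike hits.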
In contrast, Lambda does not give you these kinds of controls. Auto-scaling is hidden under AWS’s hood, and resources cannot be explicitly pre-warmed.
Score: No winner. The choice depends on your business usage patterns.
3. Polyglot support
Docker naturally enables support for polyglot microservices. Lambda functions, by contrast, are ultimately constrained to the runtimes currently supported by AWS.
Score: Clear win for ECS
4. Stateful services
ECS enables the deployment of both stateless and stateful services, since EBS disks can be attached to the EC2 instances comprising the cluster.
In contrast, Lambda is applicable only to stateless services (you can make use of ephemeral /tmp file storage, but it is not preserved across function calls). Then again, everything aside from databases should be stateless anyway to enable linear service scalability.
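To show what "ephemeral" means in practice, here is a sketch of the only local state a Lambda container offers: scratch files in /tmp that survive warm re-invocations of the same container but vanish on every cold start. The cached-config use case and the fetch function are hypothetical.

```python
# Sketch of warm-container caching in Lambda's /tmp scratch space.
# The config-fetching scenario is an illustrative assumption; the cache
# is lost whenever AWS recycles the container (cold start).
import json
import os

CACHE_PATH = os.path.join("/tmp", "config-cache.json")

def load_config(fetch_remote):
    """Reuse a cached config if this container has fetched it before."""
    if os.path.exists(CACHE_PATH):
        with open(CACHE_PATH) as f:
            return json.load(f)
    config = fetch_remote()                 # e.g., a call to S3 or SSM
    with open(CACHE_PATH, "w") as f:
        json.dump(config, f)
    return config
```

Treat this strictly as an optimization: correctness must never depend on the cache being there, because across scaled-out or recycled containers it won't be.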
Score: ECS is the winner in this race, but the prize money is quite insignificant
5. User interface
AWS Lambda functions belong strictly to the backend variety. If you need to build a frontend for your microservices based on Lambda, you can go for the Backend for Frontend pattern as successfully implemented at SoundCloud.
If you require more agility and are afraid that your frontend will become a new monolith/bottleneck, you should instead look at what Spotify did. The Spotify desktop app uses Chromium, with each page element being directly rendered by an independent microservice. You can only pull off this model with services that expose an “HTML interface” directly.
Score: Lambda suffers defeat here
6. API granularity
In Lambda, each microservice effectively enforces the relation: one function = one API. While you may create functions that route to a particular domain service version based on an input parameter, you may quickly hit the limit on the code size you can upload to AWS (which includes all packaged dependencies).
As a result, the above factors pose a HUGE threat of running into the “nanoservice” anti-pattern and creating significant code redundancy. While the microservice style does assume some level of redundancy in the name of service independence, the amount we are potentially talking about here is ridiculous and may create a governance and security challenge (e.g., when you need to update a shared library that has a security hole identified).
Regarding managing specs and consumer contracts, Lambda falls behind HTTP-enabled microservices that publish Swagger or RAML definitions (unless you put the functions behind some internal API gateway).
Score: IMHO, a victory for Docker and ECS
7. Portability
As Docker is the de facto standard for containerization, you can easily port your services to other Docker-based orchestration platforms if needed (a private cloud, or a public cloud provider offering a better value deal).
Lambda, on the other hand, tightly couples your microservices to the AWS platform. Encapsulating all AWS API calls behind a reusable “cloud-agnostic” client library helps minimize this coupling. We always recommend this solution to our customers; it incurs additional cost and maintenance effort but is usually worth it.
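A minimal sketch of that "cloud-agnostic client" idea: domain code depends on a small storage interface, and the AWS specifics live in a single adapter. The class and method names here are our own illustration, not an AWS API.

```python
# Sketch of a cloud-agnostic storage client: domain code programs
# against ObjectStore; only S3Store knows about boto3. All names here
# are hypothetical.
from abc import ABC, abstractmethod

class ObjectStore(ABC):
    """Minimal storage contract the rest of the codebase depends on."""
    @abstractmethod
    def put(self, key: str, data: bytes) -> None: ...
    @abstractmethod
    def get(self, key: str) -> bytes: ...

class S3Store(ObjectStore):
    """The one place that knows about S3; swapping clouds means one new adapter."""
    def __init__(self, bucket: str):
        import boto3                        # deferred: only needed on AWS
        self._s3 = boto3.client("s3")
        self._bucket = bucket
    def put(self, key, data):
        self._s3.put_object(Bucket=self._bucket, Key=key, Body=data)
    def get(self, key):
        return self._s3.get_object(Bucket=self._bucket, Key=key)["Body"].read()

class InMemoryStore(ObjectStore):
    """Drop-in fake for tests and local development."""
    def __init__(self):
        self._data = {}
    def put(self, key, data):
        self._data[key] = data
    def get(self, key):
        return self._data[key]
```

The in-memory fake is the payoff: services can be tested locally with no AWS credentials, and a future migration touches one adapter rather than every call site.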
Score: Still, an easy win for ECS
8. Long running processes
A Lambda function call cannot take longer than 5 minutes; more complex tasks need to be split into smaller, piped Lambda functions, as per SEDA (the staged event-driven architecture).
This is not necessarily a bad thing in terms of maintaining architecture simplicity, though. You can package all processing stages in one lambda function and do a self-call with different parameters. Alternatively, you can stage intermediate results in Kinesis for persistence.
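The self-call pattern above can be sketched as follows. The stages and the local invoke stub are illustrative assumptions; in AWS the hand-off would go through an asynchronous `lambda.invoke()` call rather than direct recursion.

```python
# Sketch of packaging several processing stages in one Lambda and
# "self-calling" with a stage marker, so no single invocation runs long.
# Stage names are made up; invoke stubs the AWS self-invocation locally.

def handler(event, context=None, invoke=None):
    invoke = invoke or (lambda e: handler(e))   # stand-in for lambda.invoke()
    stage = event.get("stage", "extract")
    if stage == "extract":
        items = event["raw"].split(",")
        return invoke({"stage": "transform", "items": items})
    if stage == "transform":
        items = [i.strip().upper() for i in event["items"]]
        return invoke({"stage": "load", "items": items})
    # Final stage: persist the result (here, just return it).
    return {"loaded": event["items"]}

result = handler({"raw": "a, b, c"})
```

Each hop restarts the 5-minute clock, and staging intermediate results in Kinesis (as mentioned above) makes each hop restartable if one fails.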
Score: No winner
9. Resource heavy processes
You can allocate at most 1.5 GB of RAM to a single Lambda function execution. With ECS, RAM, CPU, and network bandwidth are constrained only by the underlying EC2 resources, which offer a breadth of processing power.
Score: ECS steals this one
10. Communication style
Cross-lambda communication is strictly RPC. You cannot reap the benefits of REST, including caching or hypermedia controls in this case. To be perfectly honest, almost no one uses hypermedia in machine-to-machine interfaces. Most of the existing REST implementations are RPC based on JSON sent over an HTTP wire. Still, if you want to use REST by the book, you can do this with microservices deployed on ECS.
Score: +half a point for ECS
11. New vs. existing services
Settling on a microservice for each new functionality would play against the risk-driven “monolith first” approach coined by Martin Fowler. This model aims to minimize the possibility of creating an integration knot of nanoservices that is impossible to manage.
You can move to Lambda or Docker once you’ve learned how the new functionality behaves in production. As a result, you’ll be able to make an informed decision to split some of the code from the monolith – in tune with the popular “strangler pattern.”
Then again, for some legacy applications that you want to move to microservices, it may be way cheaper to “Dockerize” a selected system module than to re-architect it into a Lambda function.
Score: A tie. You should always take a risk-oriented stance before creating a new microservice (independent of the technology you will employ).
And the winner is…
Not so fast. As you may have noticed, there is no hard-and-fast rule for choosing a microservice implementation strategy on AWS. The path you select is dependent on:
- Your org’s appetite for control
- The available ops resources
- Your commitment to AWS
- The given service’s business usage patterns
- The given service’s lifecycle stage: is it new (greenfield) or legacy (brownfield)?
- Your front-end architecture
- …and many others
To summarize, I believe that the most sensible route today is often a “best of both worlds” approach – a mix of Lambda and Docker on ECS/ECR – while regularly weighing short- and long-term risks against substantial cost savings.