Auto Scaling Production Services on Titus

Netflix Technology Blog

Overview

The article discusses Netflix's implementation of auto scaling for its container management platform, Titus, leveraging AWS Auto Scaling features. It details the collaborative efforts with AWS to enhance the scalability of microservices, allowing for dynamic resource allocation based on real-time demand.

What You'll Learn

1. How to configure auto scaling for services running on Titus
2. Why leveraging AWS Auto Scaling enhances microservices scalability
3. When to use CloudWatch metrics for auto scaling decisions

Prerequisites & Requirements

  • Understanding of container management and microservices architecture
  • Familiarity with AWS services, particularly EC2 and CloudWatch (optional)

Key Questions Answered

How does Titus implement auto scaling for microservices?
Titus implements auto scaling by integrating with the AWS Auto Scaling engine, which computes desired capacity based on metrics and adjusts container counts accordingly. This allows services to dynamically scale in response to varying workloads, utilizing familiar scaling policies from AWS.
What are the benefits of using AWS Auto Scaling with Titus?
Using AWS Auto Scaling with Titus provides several benefits, including leveraging a proven scaling engine, utilizing familiar Target Tracking and Step Scaling policies, and enabling applications to scale based on both custom and AWS-specific metrics.
What metrics can be used for auto scaling in Titus?
Metrics for auto scaling in Titus include application-specific metrics like requests per second and container CPU utilization, as well as AWS-specific metrics such as SQS queue depth. These metrics are published to CloudWatch for monitoring and scaling decisions.
How does the integration with AWS API Gateway facilitate auto scaling?
The integration with AWS API Gateway allows the AWS Auto Scaling engine to communicate with the Titus control plane securely. It acts as an accessible API front door that enables scaling actions to be executed based on the configured policies.
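
The capacity computation mentioned in the first answer can be illustrated with a small sketch. It assumes a simple proportional rule (scale the container count by the ratio of the observed metric to its target), which is a common way to reason about Target Tracking but is not a statement of the exact algorithm AWS or Titus uses. The function name, the capacity bounds, and the numbers are hypothetical.

```python
import math


def desired_capacity(current: int, metric_value: float, target_value: float,
                     min_capacity: int, max_capacity: int) -> int:
    """Proportional estimate of the container count needed to bring the
    per-container metric back to target_value (illustrative assumption)."""
    if current <= 0 or target_value <= 0:
        raise ValueError("current and target_value must be positive")
    # If each container sees metric_value on average, total load is
    # current * metric_value; dividing by the target gives containers needed.
    raw = math.ceil(current * metric_value / target_value)
    # Clamp to the job's configured bounds.
    return max(min_capacity, min(max_capacity, raw))


# Hypothetical service: 10 containers at 150 RPS each, target 100 RPS each.
print(desired_capacity(10, 150.0, 100.0, min_capacity=2, max_capacity=50))
```

The clamp matters as much as the ratio: without minimum and maximum bounds, a noisy metric could shrink a service to nothing or grow it without limit.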

Technologies & Tools

Container Management
Titus
Used for scheduling and managing application containers across AWS EC2 instances.
Cloud Service
AWS Auto Scaling
Provides the auto scaling engine that powers dynamic scaling for services running on Titus.
Monitoring
CloudWatch
Used for monitoring application metrics and triggering scaling actions based on defined thresholds.
Cloud Service
API Gateway
Facilitates secure communication between the AWS Auto Scaling engine and the Titus control plane.

Key Actionable Insights

1. Implement auto scaling in your microservices using Titus to enhance resource efficiency.
By adopting auto scaling, services can dynamically adjust to traffic patterns, ensuring optimal performance without manual intervention. This is particularly useful during peak usage times.
2. Utilize CloudWatch metrics to monitor application performance and trigger scaling actions.
By effectively using CloudWatch, you can ensure that your applications respond to real-time demand, improving user experience and resource utilization.
3. Leverage existing AWS technologies to simplify the adoption of container management.
Using familiar AWS tools and policies can reduce the learning curve for teams transitioning to containerized environments, leading to faster implementation and better integration.
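
As a rough illustration of the second insight, the sketch below builds a single metric datum in the shape that CloudWatch's PutMetricData request expects. The metric name, dimension name, namespace, and job identifier are hypothetical and not taken from the article. The publishing call is shown only as a comment because it needs boto3 and AWS credentials.

```python
from datetime import datetime, timezone


def build_metric_datum(job_id: str, requests_per_second: float) -> dict:
    """Build one entry for the MetricData list of a PutMetricData request."""
    return {
        "MetricName": "RequestsPerSecond",  # hypothetical metric name
        # Hypothetical dimension tying the datapoint to one job.
        "Dimensions": [{"Name": "TitusJobId", "Value": job_id}],
        "Timestamp": datetime.now(timezone.utc),
        "Value": float(requests_per_second),
        "Unit": "Count/Second",
    }


datum = build_metric_datum("example-job", 420.0)
print(datum["MetricName"], datum["Value"], datum["Unit"])

# Publishing would look roughly like this (needs boto3 and AWS credentials):
#   import boto3
#   boto3.client("cloudwatch").put_metric_data(
#       Namespace="Example/Titus", MetricData=[datum])
```

Keeping the dimension per job means a scaling policy can track one service's load without mixing in traffic from others.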

Common Pitfalls

1. Failing to properly configure CloudWatch alarms can lead to insufficient scaling actions.
If alarm thresholds are not set correctly, the auto scaling process may not respond adequately to traffic spikes, resulting in degraded service performance.
2. Over-reliance on static resource allocation can hinder the benefits of containerization.
Static sizing does not leverage the full potential of cloud resources, which can lead to inefficiencies and increased costs. Dynamic scaling is essential for optimizing resource use.
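
The first pitfall can be made concrete with a simplified model of alarm evaluation. The sketch assumes an alarm fires only when the most recent N datapoints all exceed the threshold; real CloudWatch alarms have further options (such as missing-data handling) that are left out here. The function name and the CPU figures are hypothetical.

```python
from typing import Sequence


def alarm_breached(datapoints: Sequence[float], threshold: float,
                   evaluation_periods: int) -> bool:
    """True if the most recent `evaluation_periods` datapoints all exceed threshold."""
    if evaluation_periods <= 0:
        raise ValueError("evaluation_periods must be positive")
    if len(datapoints) < evaluation_periods:
        return False  # not enough data yet
    recent = datapoints[-evaluation_periods:]
    return all(value > threshold for value in recent)


# Hypothetical per-minute CPU utilization (%) during a traffic spike.
cpu = [40.0, 45.0, 72.0, 78.0, 81.0]

print(alarm_breached(cpu, threshold=70.0, evaluation_periods=3))  # True
print(alarm_breached(cpu, threshold=90.0, evaluation_periods=3))  # False
```

With the threshold at 90%, the same spike never triggers a scale-out, which is the under-scaling failure described above.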

Related Concepts

Container Orchestration
Microservices Architecture
AWS Services Integration
Cloud Monitoring And Metrics