Auto Scaling on AWS — EC2, ECS, and DynamoDB Scaling Strategies

8 min read
Goel Academy
DevOps & Cloud Learning Hub

Your load balancer is distributing traffic perfectly across three servers. Then a marketing campaign goes live and traffic triples in ten minutes. Two of your servers hit 100% CPU, response times spike to 8 seconds, and users start dropping off. You needed six servers, not three — but only for the next four hours. Auto Scaling adds and removes capacity automatically so you stop paying for servers you don't need and stop losing customers when you don't have enough.

EC2 Auto Scaling Groups

An Auto Scaling Group (ASG) manages a fleet of EC2 instances. You define the minimum, maximum, and desired capacity. The ASG launches or terminates instances to maintain the desired count and replaces unhealthy ones automatically.
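One detail worth internalizing: whatever a scaling policy requests, the desired count is always clamped into the [min, max] range. A toy sketch of that clamping, with made-up numbers:

```shell
# Toy illustration: desired capacity is always clamped into [min, max]
min=2; max=10
requested=15                        # what a scaling policy asked for
desired=$requested
[ "$desired" -gt "$max" ] && desired=$max
[ "$desired" -lt "$min" ] && desired=$min
echo "$desired"                     # prints 10, not 15
```

This is why a too-low max-size silently caps your scale-out during a spike.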

Launch Templates

Every ASG needs a launch template that defines what kind of instance to spin up:

# Create a launch template (UserData must be base64-encoded; -w0 is GNU base64, use -b 0 on macOS)
aws ec2 create-launch-template \
--launch-template-name web-server-template \
--version-description "v1" \
--launch-template-data '{
"ImageId": "ami-0abcdef1234567890",
"InstanceType": "t3.medium",
"KeyName": "prod-key",
"SecurityGroupIds": ["sg-0abc123def456"],
"UserData": "'$(base64 -w0 <<< '#!/bin/bash
yum update -y
yum install -y httpd
systemctl start httpd
systemctl enable httpd')'",
"TagSpecifications": [{
"ResourceType": "instance",
"Tags": [{"Key": "Environment", "Value": "production"}]
}],
"Monitoring": {"Enabled": true}
}'

Creating the ASG

# Create an Auto Scaling Group
aws autoscaling create-auto-scaling-group \
--auto-scaling-group-name web-asg \
--launch-template LaunchTemplateName=web-server-template,Version='$Latest' \
--min-size 2 \
--max-size 10 \
--desired-capacity 3 \
--vpc-zone-identifier "subnet-abc123,subnet-def456,subnet-ghi789" \
--target-group-arns "arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/web-targets/abc123" \
--health-check-type ELB \
--health-check-grace-period 300 \
--tags '[{
"Key": "Name",
"Value": "web-asg-instance",
"PropagateAtLaunch": true
}]'

The health-check-grace-period gives new instances 300 seconds to boot and pass health checks before the ASG marks them unhealthy and replaces them. Set this too low and you'll get a termination loop.

Scaling Policies Compared

| Policy Type | How It Works | Best For | Reaction Time |
| --- | --- | --- | --- |
| Target Tracking | Maintains a metric at a target value (like a thermostat) | Steady-state workloads | ~1-3 min |
| Step Scaling | Adds/removes different amounts based on alarm thresholds | Variable spikes, fine-grained control | ~1-3 min |
| Simple Scaling | Adds/removes a fixed amount per alarm, waits for cooldown | Basic workloads (legacy) | Cooldown-dependent |
| Scheduled | Changes capacity at a specific time | Predictable traffic patterns | Exact schedule |
| Predictive | ML-based forecasting from historical patterns | Recurring daily/weekly cycles | Proactive (ahead of time) |

Target Tracking Scaling

Target tracking is the simplest and most effective policy. You set a target value for a metric, and AWS handles the math to keep it there:

# Scale to maintain 50% average CPU utilization
aws autoscaling put-scaling-policy \
--auto-scaling-group-name web-asg \
--policy-name cpu-target-tracking \
--policy-type TargetTrackingScaling \
--target-tracking-configuration '{
"PredefinedMetricSpecification": {
"PredefinedMetricType": "ASGAverageCPUUtilization"
},
"TargetValue": 50.0,
"ScaleInCooldown": 300,
"ScaleOutCooldown": 60,
"DisableScaleIn": false
}'

# Scale based on requests per target (ALB)
aws autoscaling put-scaling-policy \
--auto-scaling-group-name web-asg \
--policy-name request-count-tracking \
--policy-type TargetTrackingScaling \
--target-tracking-configuration '{
"PredefinedMetricSpecification": {
"PredefinedMetricType": "ALBRequestCountPerTarget",
"ResourceLabel": "app/web-alb/abc123/targetgroup/web-targets/def456"
},
"TargetValue": 1000.0
}'

Notice the asymmetric cooldowns: scale-out cooldown is 60 seconds (react fast to spikes) while scale-in cooldown is 300 seconds (don't remove instances too quickly during fluctuating traffic).
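The "math" AWS handles is roughly proportional control: new desired capacity is about ceil(currentCapacity × currentMetric / targetValue). A back-of-the-envelope sketch with assumed numbers (4 instances running at 75% average CPU against a 50% target):

```shell
# Simplified target-tracking math: scale capacity in proportion
# to how far the metric sits above the target
current=4        # running instances
metric=75        # observed average CPU (%)
target=50        # TargetValue from the policy
# integer ceiling of current * metric / target
desired=$(( (current * metric + target - 1) / target ))
echo "$desired"  # 4 * 75 / 50 = 6 instances
```

The real algorithm smooths over multiple datapoints and respects cooldowns, but this proportion is why halving the target value roughly doubles your steady-state fleet.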

Step Scaling

Step scaling gives you fine-grained control. Different alarm thresholds trigger different scaling amounts:

# Create a CloudWatch alarm for high CPU (create the step policy below first,
# since the alarm action references the policy's ARN)
aws cloudwatch put-metric-alarm \
--alarm-name web-asg-high-cpu \
--metric-name CPUUtilization \
--namespace AWS/EC2 \
--statistic Average \
--period 60 \
--evaluation-periods 2 \
--threshold 60 \
--comparison-operator GreaterThanThreshold \
--dimensions Name=AutoScalingGroupName,Value=web-asg \
--alarm-actions "arn:aws:autoscaling:us-east-1:123456789012:scalingPolicy:abc-123:autoScalingGroupName/web-asg:policyName/step-scale-out"

# Define step adjustments: add more instances as CPU climbs higher
aws autoscaling put-scaling-policy \
--auto-scaling-group-name web-asg \
--policy-name step-scale-out \
--policy-type StepScaling \
--adjustment-type ChangeInCapacity \
--step-adjustments '[
{"MetricIntervalLowerBound": 0, "MetricIntervalUpperBound": 20, "ScalingAdjustment": 1},
{"MetricIntervalLowerBound": 20, "MetricIntervalUpperBound": 40, "ScalingAdjustment": 3},
{"MetricIntervalLowerBound": 40, "ScalingAdjustment": 5}
]' \
--estimated-instance-warmup 180

This means: when CPU exceeds 60% by 0-20 points (60-80%), add 1 instance. By 20-40 points (80-100%), add 3. Over 40 points above threshold, add 5. The estimated-instance-warmup tells the ASG to exclude new instances from the metric calculation for 180 seconds while they boot.
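Because the step boundaries are offsets from the alarm threshold, selecting an adjustment amounts to a simple lookup on the breach size. Sketched here with an assumed CPU reading of 92%:

```shell
# Map a CPU reading to a step adjustment (alarm threshold = 60)
cpu=92; threshold=60
breach=$(( cpu - threshold ))       # 32 points above threshold
if   [ "$breach" -ge 40 ]; then add=5
elif [ "$breach" -ge 20 ]; then add=3
elif [ "$breach" -ge 0  ]; then add=1
else                            add=0
fi
echo "add $add instances"           # breach of 32 falls in [20, 40): add 3
```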

Scheduled Scaling

If your traffic follows a predictable pattern, schedule capacity changes ahead of time:

# Scale up before business hours (8 AM UTC weekdays)
aws autoscaling put-scheduled-update-group-action \
--auto-scaling-group-name web-asg \
--scheduled-action-name scale-up-morning \
--recurrence "0 8 * * MON-FRI" \
--min-size 5 \
--max-size 15 \
--desired-capacity 8

# Scale down after hours (10 PM UTC weekdays)
aws autoscaling put-scheduled-update-group-action \
--auto-scaling-group-name web-asg \
--scheduled-action-name scale-down-evening \
--recurrence "0 22 * * MON-FRI" \
--min-size 2 \
--max-size 10 \
--desired-capacity 2

# Weekend minimal capacity
aws autoscaling put-scheduled-update-group-action \
--auto-scaling-group-name web-asg \
--scheduled-action-name weekend-minimum \
--recurrence "0 0 * * SAT" \
--min-size 1 \
--max-size 4 \
--desired-capacity 1

Predictive Scaling

Predictive scaling uses machine learning to analyze up to 14 days of historical data (it needs at least 24 hours of history before it starts forecasting) and predicts future traffic. It proactively provisions instances before the spike arrives:

# Enable predictive scaling
aws autoscaling put-scaling-policy \
--auto-scaling-group-name web-asg \
--policy-name predictive-scaling \
--policy-type PredictiveScaling \
--predictive-scaling-configuration '{
"MetricSpecifications": [{
"TargetValue": 50.0,
"PredefinedMetricPairSpecification": {
"PredefinedMetricType": "ASGCPUUtilization"
}
}],
"Mode": "ForecastAndScale",
"SchedulingBufferTime": 300
}'

The SchedulingBufferTime launches instances 300 seconds before the predicted spike, giving them time to boot and warm up. Set Mode to ForecastOnly first to evaluate predictions without actually scaling.

Lifecycle Hooks

Lifecycle hooks let you pause an instance during launch or termination to run custom actions — install software, drain connections, or deregister from a service registry:

# Add a launch lifecycle hook
aws autoscaling put-lifecycle-hook \
--auto-scaling-group-name web-asg \
--lifecycle-hook-name pre-launch-setup \
--lifecycle-transition autoscaling:EC2_INSTANCE_LAUNCHING \
--heartbeat-timeout 600 \
--default-result CONTINUE

# Add a termination lifecycle hook (drain connections)
aws autoscaling put-lifecycle-hook \
--auto-scaling-group-name web-asg \
--lifecycle-hook-name pre-terminate-drain \
--lifecycle-transition autoscaling:EC2_INSTANCE_TERMINATING \
--heartbeat-timeout 300 \
--default-result ABANDON

# Complete the lifecycle action from your script
aws autoscaling complete-lifecycle-action \
--auto-scaling-group-name web-asg \
--lifecycle-hook-name pre-launch-setup \
--instance-id i-0abc123def456 \
--lifecycle-action-result CONTINUE

ECS Service Auto Scaling

ECS Service Auto Scaling changes the number of tasks (containers), not instances. It's built on Application Auto Scaling, which provides the same policy types:

# Register the ECS service as a scalable target
aws application-autoscaling register-scalable-target \
--service-namespace ecs \
--resource-id service/production-cluster/api-service \
--scalable-dimension ecs:service:DesiredCount \
--min-capacity 2 \
--max-capacity 20

# Target tracking on CPU
aws application-autoscaling put-scaling-policy \
--service-namespace ecs \
--resource-id service/production-cluster/api-service \
--scalable-dimension ecs:service:DesiredCount \
--policy-name ecs-cpu-tracking \
--policy-type TargetTrackingScaling \
--target-tracking-scaling-policy-configuration '{
"PredefinedMetricSpecification": {
"PredefinedMetricType": "ECSServiceAverageCPUUtilization"
},
"TargetValue": 60.0,
"ScaleInCooldown": 300,
"ScaleOutCooldown": 60
}'

If you're running ECS on EC2 (not Fargate), you need two layers of scaling: ECS Service Auto Scaling for tasks, and EC2 Auto Scaling or ECS Capacity Providers for the underlying instances.
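The reason the second layer matters: task scaling stalls if the cluster has no instance headroom to place new tasks on. A rough sizing sketch, assuming hypothetical task and instance sizes (512 CPU units per task, 2,048 units for a 2-vCPU instance):

```shell
# How many EC2 instances a task count implies (assumed sizes)
tasks=12
task_cpu=512            # CPU units reserved per task
instance_cpu=2048       # CPU units on a 2-vCPU instance
# integer ceiling of total task CPU / per-instance CPU
instances=$(( (tasks * task_cpu + instance_cpu - 1) / instance_cpu ))
echo "$instances"       # 12 * 512 / 2048 = 3 instances
```

Capacity providers do this placement-aware math for you, scaling the ASG to fit the pending tasks.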

DynamoDB Auto Scaling

DynamoDB auto scaling adjusts provisioned read/write capacity units based on actual usage:

# Register DynamoDB table for write capacity scaling
aws application-autoscaling register-scalable-target \
--service-namespace dynamodb \
--resource-id "table/orders" \
--scalable-dimension "dynamodb:table:WriteCapacityUnits" \
--min-capacity 5 \
--max-capacity 1000

# Target tracking to maintain 70% utilization
aws application-autoscaling put-scaling-policy \
--service-namespace dynamodb \
--resource-id "table/orders" \
--scalable-dimension "dynamodb:table:WriteCapacityUnits" \
--policy-name dynamo-write-scaling \
--policy-type TargetTrackingScaling \
--target-tracking-scaling-policy-configuration '{
"PredefinedMetricSpecification": {
"PredefinedMetricType": "DynamoDBWriteCapacityUtilization"
},
"TargetValue": 70.0
}'

Pro tip: DynamoDB scales up quickly, but scale-downs are rate-limited: roughly four decreases per table per day, plus an extra one whenever the previous hour had none. If you have wildly variable traffic, consider on-demand capacity mode instead — it costs more per request but has no scaling delays.
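In provisioned mode, the 70% target means auto scaling tries to keep consumed capacity at 70% of what's provisioned. The implied provisioned level for a given consumption rate, with an assumed workload of 350 consumed WCU:

```shell
# Provisioned WCU needed to hold utilization at the 70% target
consumed=350   # observed consumed write capacity units
target=70      # TargetValue (%) from the scaling policy
# integer ceiling of consumed * 100 / target
provisioned=$(( (consumed * 100 + target - 1) / target ))
echo "$provisioned"   # 350 / 0.70 = 500 WCU
```

The 30% headroom is what absorbs short bursts while the scaling action (which takes minutes) catches up.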

Monitoring and Troubleshooting

# Check current ASG status
aws autoscaling describe-auto-scaling-groups \
--auto-scaling-group-names web-asg \
--query 'AutoScalingGroups[0].{Min:MinSize,Max:MaxSize,Desired:DesiredCapacity,Instances:Instances[].{Id:InstanceId,State:LifecycleState,Health:HealthStatus}}' \
--output table

# View scaling activities (why did it scale?)
aws autoscaling describe-scaling-activities \
--auto-scaling-group-name web-asg \
--max-items 5 \
--query 'Activities[].{Time:StartTime,Status:StatusCode,Cause:Cause}' \
--output table

What's Next?

Your infrastructure now scales automatically to handle any traffic pattern. But scaling up means more resources, and more resources means a bigger bill if you're not careful. In an upcoming post, we'll explore S3 Security — locking down bucket policies, encryption at rest, access points, and preventing the data breaches that make headlines.


This is Part 15 of our AWS series. Target tracking is the right default for 80% of workloads. Start there, and only switch to step scaling if you need finer control over the response curve.