Scale Down EC2 Container Instances in ECS

Dinusha Dissanayake
4 min read · Dec 3, 2017

AWS ECS in Brief

This article explains a sample script that scales down an ECS cluster in a cost-efficient way using the AWS SDK for Python.

Amazon ECS (EC2 Container Service) is a container management system that runs Docker containers with high scalability and high performance. The default container orchestration that comes with AWS ECS takes care of everything related to container management, so the user does not have to worry about it. You can define a static number of instances to occupy the cluster.

But that would not be very cost friendly, because you pay for resources even when they are not in use. How can we overcome this? Well, AWS provides the Auto Scaling group feature: we can attach one to the cluster and define how the cluster scales out and in.

The problem with Auto Scaling group scale-in

An Auto Scaling group can be attached to the ECS cluster, and we can control the conditions under which it scales. But it has its limitations: the cluster can only be auto scaled on the following metrics, and they apply to the whole cluster, not to a single node in it.

  1. CPU reservation
  2. Memory reservation
  3. CPU utilization
  4. Memory utilization

How do you use the above conditions in auto scaling? As an example, you can define the cluster to scale out by one instance if the average CPU utilization stays above 95% for a consecutive five-minute period.

Scaling out posed no problem, but when scaling in we ran into a few difficulties. For example, instances could be shut down with containers still running inside. The other major issue was that even after an instance was shut down, the cluster might scale out again within a few minutes. If this keeps happening in a loop, you pay unnecessarily for AWS resources. At the time we were testing this, AWS billed EC2 on an hourly basis.

How to overcome this…

Therefore, we wanted something to combine with these conditions for a better, yet economical, scale-in. For example: shut down an instance only if it is approaching its next billing hour and no containers are running inside that EC2 instance.
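The billing-hour rule can be sketched in isolation. The helper below is a hypothetical distillation of that idea (the function names and the 15-minute margin are my own choices, not an AWS API): an instance qualifies for termination only once the current time falls inside the final margin before its next whole billed hour.

```python
import datetime
import math

def next_billing_hour(registered_at, now):
    # Hours elapsed since the instance registered, e.g. 0.5, 1.3, 2.5 ...
    elapsed_hours = (now - registered_at).total_seconds() / 3600
    # The next billed hour starts at registration time + the rounded-up elapsed hours
    return registered_at + datetime.timedelta(hours=math.ceil(elapsed_hours))

def in_termination_window(registered_at, now, margin_minutes=15):
    # True only within the last margin_minutes before the next billing hour begins
    threshold = next_billing_hour(registered_at, now) - datetime.timedelta(minutes=margin_minutes)
    return threshold < now

# An instance registered at 10:00 is 50 minutes old at 10:50:
# the next billing hour starts at 11:00, so 10:50 is inside the 15-minute window.
reg = datetime.datetime(2017, 12, 3, 10, 0)
print(in_termination_window(reg, datetime.datetime(2017, 12, 3, 10, 50)))  # True
print(in_termination_window(reg, datetime.datetime(2017, 12, 3, 10, 30)))  # False
```

With per-hour billing, terminating at minute 50 of a billed hour costs the same as terminating at minute 5, so waiting until the end of the hour gives the cluster a chance to reuse the instance for free.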

What we did was write a custom script that runs periodically as a cron job and scales in when necessary. Wait, how can you write a custom script to manage AWS resources? Well, there is an AWS SDK for Python called boto3, and it solved the problem.

I am not going to describe here how to write boto3 scripts with Python; I will cover that in another article for sure. But I will walk through the boto3 script I wrote for scaling down.

The script is shown below. You can go through it; I have commented the code to make it clear.

#---------------Auto Scaling Down Instances When There Are No Running Containers---------------------
import boto3
import datetime
import math
import time

#---This method returns the Unix time for the provided timestamp
def unix_time(time_stamp):
    return time.mktime(time_stamp.timetuple())

#---This method scales down the cluster.
#---The container instance and the container instance count are needed.
def scale_down(containerInstance, container_instances_count):
    # Here "containerInstance" is an EC2 instance in the cluster
    print("\nContainer Instance ID: " + containerInstance['ec2InstanceId'])
    # Check that the instance has no running or pending tasks and that the cluster holds more than one instance.
    # Running tasks = running containers; pending tasks are the containers queued to run on this instance.
    if (containerInstance['runningTasksCount'] == 0 and containerInstance['pendingTasksCount'] == 0 and container_instances_count > 1):
        # Registration time of the container instance,
        # i.e. the starting time of the EC2 instance
        time_reg = containerInstance['registeredAt']
        # Current time
        current_time = datetime.datetime.now()
        # current_time - registered time gives the elapsed time; dividing by 60*60 converts it to hours.
        # E.g. hours_difference could be 0.5, 1.3, 2.5, etc.
        hours_difference = (unix_time(current_time) - unix_time(time_reg)) / (60 * 60)
        # The next billing hour is the registered time plus the rounded-up hours difference
        next_billing_hour = time_reg + datetime.timedelta(hours=math.ceil(hours_difference))
        print("Next billing hour begins: %s" % next_billing_hour)
        # Check whether the current time is within the last 15 minutes of the current billing hour.
        # You can change the value 15 below to any value you like;
        # it should depend on how often the cron job runs.
        threshold_time = next_billing_hour - datetime.timedelta(minutes=15)
        print("Threshold time to kill: %s" % threshold_time)
        print("Current time: %s" % current_time)
        # Terminate only if the current time has passed the threshold time
        if unix_time(threshold_time) < unix_time(current_time):
            # Terminate the instance; the number of available container instances decreases by 1
            print("Terminating instance " + containerInstance['ec2InstanceId'])
            # The Auto Scaling group API terminates the instance given its instance ID.
            # The desired capacity should be decremented after termination, hence ShouldDecrementDesiredCapacity=True
            asgClient.terminate_instance_in_auto_scaling_group(InstanceId=containerInstance['ec2InstanceId'], ShouldDecrementDesiredCapacity=True)
            container_instances_count -= 1
            print("Size of the cluster after termination: %s\n" % container_instances_count)
    # If there are running/pending containers inside the instance
    else:
        print("Running containers {}\nPending tasks {}\nCluster size {}\n".format(containerInstance['runningTasksCount'], containerInstance['pendingTasksCount'], container_instances_count))
    # Return the (possibly decremented) count so terminations accumulate across calls
    return container_instances_count

# Choose the AWS profile used to access the resources;
# it is taken from the AWS CLI configuration on your machine.
session = boto3.Session(profile_name='myAccount')
# ECS client and Auto Scaling group client generation
ecsClient = session.client(service_name='ecs')
asgClient = session.client(service_name='autoscaling')
# List the container instances of the cluster.
# You will have to provide the cluster name here, e.g. ECS-Cluster
clusterListResp = ecsClient.list_container_instances(cluster='ECS-Cluster')
# Details of the EC2 container instances
containerDetails = ecsClient.describe_container_instances(cluster='ECS-Cluster', containerInstances=clusterListResp['containerInstanceArns'])
# Get the instance count of the cluster
container_instances_count = len(containerDetails['containerInstances'])
# Loop through every instance to check whether it should be terminated
for containerInstance in containerDetails['containerInstances']:
    container_instances_count = scale_down(containerInstance, container_instances_count)
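For the script to work, the IAM identity behind the configured profile needs at least the ECS read calls and the Auto Scaling termination call used above. A minimal policy could look like the following sketch (tighten the Resource element as appropriate for your account):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ecs:ListContainerInstances",
        "ecs:DescribeContainerInstances",
        "autoscaling:TerminateInstanceInAutoScalingGroup"
      ],
      "Resource": "*"
    }
  ]
}
```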

You can also get the code from this GitHub location.

You can run it simply with

python scale_down_script.py

To make it more efficient and get the full use of it, set this script to run periodically as a cron job.
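A crontab entry like the one below would do; the paths and the 10-minute interval are placeholders, but the interval should be smaller than the threshold margin used in the script, or the termination window can be missed entirely.

```
# m h dom mon dow  command — run the scale-in check every 10 minutes
*/10 * * * * /usr/bin/python /opt/scripts/scale_down_script.py >> /var/log/ecs-scale-down.log 2>&1
```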

I hope this article helped you in some way to write a cost-saving scale-down policy with Python and boto3. Hope you enjoyed it.
