Reduce Costs with Scheduled EC2 Instance Shutdowns on AWS
Author: Eric McCowan (@ericrmc)
It is often still necessary to run certain applications on EC2 instances, particularly if they are not easily containerised and you do not want to pay extra for a managed service. In a non-production environment, you can reduce costs by shutting down these instances when they are not needed by your developers or users.
Off-the-shelf scheduling with an AWS template
The team at AWS has already put together a CloudFormation template you can deploy. The documentation and deployment guide are available here: https://docs.aws.amazon.com/solutions/latest/instance-scheduler/overview.html
It uses a new command line interface to initialise and configure a schedule. This metadata is stored in DynamoDB, and a Lambda function will be run every few minutes to ensure the rules get applied to your instances at the correct times. It is scalable but can seem a bit heavy if you don't need all the features.
A custom but lightweight alternative
The previous template may be too heavy for some. As a simple alternative, you can make something similar with your own Lambda function, albeit with reduced functionality. This approach is suited to non-production environments where developers have permission to start instances manually when they need them, and have those instances stop automatically at the end of the day.
This approach is just a serverless application deployed with the AWS Serverless Application Model (SAM).
The two key resources are a Lambda function and a CloudWatch Events rule.
With SAM, a single template.yaml can describe both in one resource of Type: AWS::Serverless::Function.
This resource will neatly wrap up your function's code, event triggers, and IAM policies and make them ready for deployment.
Here is the entire template used for this shutdown app:
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Description: >
  ec2-autoshutdown
  SAM Template for ec2-autoshutdown
Resources:
  LambdaFunction:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: ec2autoshutdown/
      Handler: app.lambda_handler
      Runtime: python3.8
      Events:
        Schedule1:
          Type: Schedule
          Properties:
            Description: Schedule for shutdown (7:30/UTC = 5:30pm/+10)
            Enabled: True
            Schedule: cron(30 7 * * ? *)
      Policies:
        - !Ref EC2AccessPolicy
      Environment:
        Variables:
          SHUTDOWN_TAG_NAME: 'AutoShutdown'
          WAIT_TIMEOUT_SECS: 10
      MemorySize: 256
      Timeout: 900
  EC2AccessPolicy:
    Type: AWS::IAM::ManagedPolicy
    Properties:
      Description: Allows access and state management of EC2 instances
      PolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Action:
              - 'ec2:Get*'
              - 'ec2:Describe*'
              - 'ec2:StopInstances'
            Resource: '*'
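The cron expression in the Schedule property is written in UTC, so the local shutdown time has to be converted by hand. As a rough sketch of that arithmetic, the helper below (utc_cron_for_local_time is a hypothetical name, not part of SAM or boto3) builds the expression for a daily schedule. It assumes a fixed UTC offset and does not account for daylight saving.

```python
def utc_cron_for_local_time(hour, minute, utc_offset_hours):
    """Return a daily cron expression in UTC for a local wall-clock time.

    utc_offset_hours is the fixed offset of the local zone from UTC
    (for example 10 for UTC+10). Daylight saving is not handled.
    """
    local_minutes = hour * 60 + minute
    # Subtract the offset to get UTC, wrapping around midnight in either direction.
    utc_minutes = (local_minutes - int(utc_offset_hours * 60)) % (24 * 60)
    utc_hour, utc_minute = divmod(utc_minutes, 60)
    # Field order: minutes hours day-of-month month day-of-week year
    return "cron({} {} * * ? *)".format(utc_minute, utc_hour)


print(utc_cron_for_local_time(17, 30, 10))  # 5:30pm at UTC+10
```

For 5:30pm at UTC+10 this prints cron(30 7 * * ? *), the same value used in the template.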
Now onto the application itself. It gets initiated by a CloudWatch Events rule, with a daily time configured in the template above. It requests a list of all running EC2 instances, then checks each one for the tag key AutoShutdown. If the tag's value is True, the instance ID is saved along with any others that match. Once all instances have been checked, it sends a single request to stop the saved instances.
The application code is written in Python and uses the standard boto3 library for interacting with AWS services.
The environment variables above are given to the function below on load, so it knows which tag to look for, and how long to wait.
import json
import os
from time import sleep

import boto3

# AWS_REGION is provided by the Lambda runtime; the other two are set in template.yaml.
REGION = os.environ['AWS_REGION']
WAIT_TIMEOUT = os.environ['WAIT_TIMEOUT_SECS']
SHUTDOWN_TAG_NAME = os.environ['SHUTDOWN_TAG_NAME']


def lambda_handler(event, context):
    ec2 = boto3.client('ec2', region_name=REGION)
    # Ask only for instances that are currently running.
    response_describe = ec2.describe_instances(Filters=[{
        'Name': 'instance-state-name',
        'Values': ['running']
    }])['Reservations']
    shutdown_list = list()
    for res in response_describe:
        for instance in res.get('Instances', []):
            ec2_id = instance.get('InstanceId')
            taglist = instance.get('Tags', [])
            ec2_has_true_shutdown_tag = False
            ec2_name = None
            for tag in taglist:
                if tag.get('Key') == SHUTDOWN_TAG_NAME:
                    if tag.get('Value') == 'True':
                        ec2_has_true_shutdown_tag = True
                # The Name tag is only used to make the log output readable.
                if tag.get('Key') == 'Name':
                    ec2_name = tag.get('Value')
            if not ec2_name:
                ec2_name = 'Unnamed'
            if ec2_has_true_shutdown_tag:
                shutdown_list.append(ec2_id)
                print('Affected instance: {} ({})'.format(ec2_name, ec2_id))
    if shutdown_list:
        print("\nShutting down the following instances in {} seconds: {}".format(WAIT_TIMEOUT, ', '.join(shutdown_list)))
        # Short pause before the stop request is sent.
        sleep(int(WAIT_TIMEOUT))
        response_shutdown = ec2.stop_instances(InstanceIds=shutdown_list)
        print(response_shutdown)
        return {
            "statusCode": 200,
            "body": json.dumps(response_shutdown, default=str)
        }
    else:
        print("No instances to shut down.")
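The handler above fetches every running instance and checks the tags in Python. EC2 can also filter by tag on the server side, which may be worth considering in accounts with many instances. The following is only a sketch of that variation, not the code this app uses: find_instances_to_stop is a hypothetical helper, and it expects an EC2 client created elsewhere with boto3.client('ec2').

```python
def find_instances_to_stop(ec2_client, tag_name, tag_value="True"):
    """Return the IDs of running instances whose tag_name tag equals tag_value."""
    # The paginator follows NextToken for us, so large accounts are fully covered.
    paginator = ec2_client.get_paginator("describe_instances")
    pages = paginator.paginate(Filters=[
        {"Name": "instance-state-name", "Values": ["running"]},
        # A "tag:<key>" filter matches on the value of the tag with that key.
        {"Name": "tag:" + tag_name, "Values": [tag_value]},
    ])
    instance_ids = []
    for page in pages:
        for reservation in page.get("Reservations", []):
            for instance in reservation.get("Instances", []):
                instance_ids.append(instance["InstanceId"])
    return instance_ids
```

Inside the handler this could be called as find_instances_to_stop(ec2, SHUTDOWN_TAG_NAME). The trade-off is that the Name tag used for the log lines would have to be read from each returned instance separately.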
To deploy using SAM, save the code blocks above to template.yaml and ec2autoshutdown/app.py respectively (the directory must match the CodeUri in the template), then run the following two commands:
sam build --use-container
sam deploy --guided
This will create the resources in AWS for you, and you can tinker with the YAML parameters without having to change the Python app code.
When it runs, we get simple log output like this, followed by the actual API response from the stop_instances request:
Affected instance: ENV-TEST-APP-EC2 (i-0bc99...)
Affected instance: ENV-TEST-DATABASE-EC2 (i-0a239...)
Shutting down the following instances in 10 seconds: i-0bc99..., i-0a239...
This simple output will help developers check that the right instances are getting shut down.
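The raw stop_instances response that follows those lines is a nested dictionary and is harder to scan. If a shorter summary is preferred, something like the sketch below could be printed instead. The helper name summarise_stop_response is an assumption for illustration, and the sample dictionary is hand-written in the shape boto3 documents for this call, not real output.

```python
def summarise_stop_response(response):
    """Return one readable line per instance in a stop_instances response."""
    lines = []
    for item in response.get("StoppingInstances", []):
        lines.append("{}: {} -> {}".format(
            item["InstanceId"],
            item["PreviousState"]["Name"],
            item["CurrentState"]["Name"],
        ))
    return lines


# Hand-written sample; the instance ID is a placeholder.
sample = {
    "StoppingInstances": [{
        "InstanceId": "i-0123456789abcdef0",
        "CurrentState": {"Code": 64, "Name": "stopping"},
        "PreviousState": {"Code": 16, "Name": "running"},
    }]
}
for line in summarise_stop_response(sample):
    print(line)
```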
But the next day when you start up the instance again, what if your application doesn't start?
Ensuring your application runs on instance boot
We also need to reduce the time spent starting instances that have been shut down, especially if the application doesn't automatically start for us.
Use systemd
If you are running a Linux application, you may be able to configure the built-in service manager systemd to run your application on boot as a system service. All you really need to do is change the sample values (Description, User/Group, ExecStart) in the example below in a text editor, then save the file as /etc/systemd/system/sampleapp.service:
[Unit]
Description=sampleapp
Requires=network-online.target
After=network-online.target
[Service]
User=ubuntu
Group=ubuntu
Restart=on-failure
ExecStart=/path/to/sampleapp -listen ":8080"
ExecReload=/bin/kill -HUP $MAINPID
KillSignal=SIGINT
TimeoutStopSec=30s
[Install]
WantedBy=multi-user.target
Original source code thanks to Tristan.
To start the service now and enable it to run on every boot, run:
sudo systemctl start sampleapp
sudo systemctl enable sampleapp
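If several instances need the same kind of unit file, the values that change can be filled in from a script rather than by hand. This is a minimal sketch under that assumption; render_unit and UNIT_TEMPLATE are hypothetical names, and the directives are copied from the sample unit above.

```python
UNIT_TEMPLATE = """\
[Unit]
Description={description}
Requires=network-online.target
After=network-online.target

[Service]
User={user}
Group={group}
Restart=on-failure
ExecStart={exec_start}
ExecReload=/bin/kill -HUP $MAINPID
KillSignal=SIGINT
TimeoutStopSec=30s

[Install]
WantedBy=multi-user.target
"""


def render_unit(description, user, group, exec_start):
    """Fill the sample unit file with the values that differ per application."""
    return UNIT_TEMPLATE.format(
        description=description, user=user, group=group, exec_start=exec_start
    )


# Same sample values as the unit file above.
print(render_unit("sampleapp", "ubuntu", "ubuntu", '/path/to/sampleapp -listen ":8080"'))
```

The rendered text still has to be written to /etc/systemd/system/ with root privileges, and systemd needs a sudo systemctl daemon-reload to pick up a new or changed unit file.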
Other related cost reductions
AWS Glue Development Endpoints are another good candidate to shut down using this method because of their higher hourly cost.
Cleaning these up can also be done using the simple Lambda above, where you just list Dev Endpoints (with client.list_dev_endpoints) and delete them.
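A minimal sketch of that list-and-delete step is shown here. It assumes a Glue client created elsewhere with boto3.client('glue'), and delete_all_dev_endpoints is a hypothetical helper name. It deletes every endpoint it finds, so a real version would probably want a tag or name check first.

```python
def delete_all_dev_endpoints(glue_client):
    """Delete every Glue Dev Endpoint visible to glue_client and return their names."""
    # Collect all names first so deletions do not interfere with paging.
    names = []
    kwargs = {}
    while True:
        page = glue_client.list_dev_endpoints(**kwargs)
        names.extend(page.get("DevEndpointNames", []))
        token = page.get("NextToken")
        if not token:
            break
        kwargs["NextToken"] = token
    for name in names:
        glue_client.delete_dev_endpoint(EndpointName=name)
    return names
```

The function's IAM policy would also need the corresponding Glue list and delete permissions, which the EC2AccessPolicy in the template does not grant.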
Consider using Cloud9 for development instances as it has an automatic shutdown based on a configurable timeout (default 30 minutes).
For a routine clean-up of your environment that removes unused resources, Marat has developed a clean-up tool. You can read more about it here.