Saturday, July 9, 2016

AWS Auto Scaling Lifecycle Hook with Lambda and CloudFormation



There are many advantages to placing instances in AWS Auto Scaling groups; scaling is the obvious one. Even for a single-instance appliance, Auto Scaling provides resiliency, health monitoring, and automatic recovery. In many cases, the ASG high-availability model is superior to running active/standby appliances in terms of seamless automation and cost effectiveness.

However, Auto Scaling has limitations: not all instance actions and properties can be defined with an ASG. For example, an instance launched in an ASG can have only one network interface; Auto Scaling currently does not support attaching multiple interfaces. AWS Lambda, on the other hand, is great for defining custom actions that run efficiently and on demand. Putting the two together, an Auto Scaling lifecycle hook allows Lambda-defined custom actions to be inserted during ASG instance launch or termination, which is powerful and flexible.

See the reference links below for more details about Auto Scaling lifecycle hooks, as well as an excellent example with implementation steps using the AWS console, written by Vyom Nagrani.

To automate the setup, CloudFormation is used to define both the ASG and the lifecycle hook. In the following example, a lifecycle hook is defined to send a notification via SNS when an instance launches. A Lambda function is then triggered through its subscription to the SNS topic.
"GatewayAutoscalingGroupHook" : {
    "Type" : "AWS::AutoScaling::LifecycleHook",
    "Properties" : {
        "AutoScalingGroupName" : { "Ref" : "GatewayAutoscalingGroup" },
        "HeartbeatTimeout" : 300,
        "LifecycleTransition" : "autoscaling:EC2_INSTANCE_LAUNCHING",
        "NotificationMetadata" : { "Fn::Join" : ["", [
            "{",
            "\"ENI1\"",
            ":",
            "\"",
            { "Ref" : "GatewayInstanceENI1" },
            "\"",
            ",",
            "\"ENI2\"",
            ":",
            "\"",
            { "Ref" : "GatewayInstanceENI2" },
            "\"",
            "}"
        ]]},
        "NotificationTargetARN" : "arn:aws:sns:us-east-1:697686697680:gateway-asg-lifecycle-hook",
        "RoleARN" : "arn:aws:iam::697686697680:role/gateway-sns-hook-role"
    }
},
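
The hook above passes two ENI IDs to the function as a JSON string in NotificationMetadata. As a rough sketch of what the Lambda side could do with them (this is not the function from the referenced example), the code below attaches both interfaces and then completes the lifecycle action. The function name `attach_enis_and_continue` is hypothetical, the device indexes 1 and 2 are an assumption (the primary interface is taken to be index 0), and the `ec2` and `autoscaling` arguments are assumed to be boto3 clients created by the caller.

```python
import json


def attach_enis_and_continue(message, ec2, autoscaling):
    """Attach the ENIs named in NotificationMetadata, then release the hook.

    `message` is the decoded lifecycle notification (a dict). `ec2` and
    `autoscaling` are assumed to be boto3 clients, for example
    boto3.client("ec2") and boto3.client("autoscaling").
    """
    # NotificationMetadata arrives as a string, here the JSON built by Fn::Join.
    metadata = json.loads(message["NotificationMetadata"])
    instance_id = message["EC2InstanceId"]
    result = "CONTINUE"
    try:
        # Assumption: eth0 already uses device index 0, so start at 1.
        for index, key in enumerate(("ENI1", "ENI2"), start=1):
            ec2.attach_network_interface(
                NetworkInterfaceId=metadata[key],
                InstanceId=instance_id,
                DeviceIndex=index,
            )
    except Exception as exc:
        # Abandon the launch so the ASG replaces the half-configured instance.
        print("ENI attach failed: %s" % exc)
        result = "ABANDON"
    # Always complete the action; otherwise the instance waits out the
    # HeartbeatTimeout (300 seconds in the template above).
    autoscaling.complete_lifecycle_action(
        LifecycleHookName=message["LifecycleHookName"],
        AutoScalingGroupName=message["AutoScalingGroupName"],
        LifecycleActionToken=message["LifecycleActionToken"],
        LifecycleActionResult=result,
    )
    return result
```

Passing the clients in as arguments keeps the function easy to exercise with stand-in objects before it is wired to real AWS calls.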

CloudFormation behaves oddly when it is used to define an ASG lifecycle hook. According to AWS, the lifecycle hook is created AFTER the first instance in the ASG has launched. As a result, the first instance launches without the expected lifecycle hook action. Only when that first instance is terminated does the next instance kick off the lifecycle action and trigger the Lambda function as expected. AWS suggests several workarounds, including launching the ASG with 0 instances and increasing to 1 later, or using custom resources.
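
The "start at 0, then increase to 1" workaround could be scripted roughly as follows, run after the stack has been created with a desired capacity of 0. This is only a sketch: the function name `start_first_instance` is hypothetical, the `autoscaling` argument is assumed to be a boto3 Auto Scaling client, and the group's maximum size is assumed to allow the requested capacity.

```python
LAUNCHING = "autoscaling:EC2_INSTANCE_LAUNCHING"


def start_first_instance(autoscaling, asg_name, desired=1):
    """Raise desired capacity only once a launch lifecycle hook exists.

    `autoscaling` is assumed to be boto3.client("autoscaling").
    Returns True if capacity was raised, False if no launch hook was found.
    """
    response = autoscaling.describe_lifecycle_hooks(
        AutoScalingGroupName=asg_name
    )
    hooks = response.get("LifecycleHooks", [])
    if not any(h.get("LifecycleTransition") == LAUNCHING for h in hooks):
        # Hook not created yet; leave the group at 0 and try again later.
        return False
    autoscaling.set_desired_capacity(
        AutoScalingGroupName=asg_name,
        DesiredCapacity=desired,
        HonorCooldown=False,
    )
    return True
```

Checking for the hook first means the script can be retried safely until CloudFormation has finished creating it.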

Use the Lambda monitoring features to see if and when the function is triggered by lifecycle hooks. It is helpful to log the received message. AWS sends a TEST notification when the lifecycle hook is initially created. The TEST notification does not carry the complete notification content, but it still triggers Lambda. Since it currently can't be turned off, the Lambda function needs some error handling for it.
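
A minimal sketch of that error handling is shown below, assuming the function is subscribed to the SNS topic and therefore receives the notification as a JSON string under `Records[].Sns.Message`. The test notification is recognized by its `Event` field being `autoscaling:TEST_NOTIFICATION`; the helper name `parse_lifecycle_record` and the shape of the returned dict are hypothetical choices made for illustration.

```python
import json

TEST_EVENT = "autoscaling:TEST_NOTIFICATION"
LAUNCHING = "autoscaling:EC2_INSTANCE_LAUNCHING"


def parse_lifecycle_record(record):
    """Return the fields of a launch lifecycle message, or None to skip it."""
    message = json.loads(record["Sns"]["Message"])
    if message.get("Event") == TEST_EVENT:
        # Sent once when the hook is created; it has no instance or token.
        return None
    if message.get("LifecycleTransition") != LAUNCHING:
        # Not a launch notification; nothing for this function to do.
        return None
    metadata = json.loads(message.get("NotificationMetadata") or "{}")
    return {
        "instance_id": message["EC2InstanceId"],
        "hook_name": message["LifecycleHookName"],
        "asg_name": message["AutoScalingGroupName"],
        "token": message["LifecycleActionToken"],
        "eni_ids": [metadata.get("ENI1"), metadata.get("ENI2")],
    }


def lambda_handler(event, context):
    # Log the raw event so it shows up in the function's CloudWatch logs.
    print(json.dumps(event))
    actions = []
    for record in event.get("Records", []):
        parsed = parse_lifecycle_record(record)
        if parsed is None:
            print("Skipping test or unrelated notification")
            continue
        actions.append(parsed)
    return actions
```

Returning the parsed messages keeps this sketch free of AWS calls; a real handler would act on each entry and then complete the lifecycle action.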