Saturday, July 25, 2015

AWS automation – lessons from setting up Cron

I spent quite a bit of time on what appeared to be a rather simple problem, so I figured posting the solution might save others time and frustration. This may seem trivial to a Linux admin, but these days people from various backgrounds are wrestling with scripts, Python, DevOps, and infrastructure as code in the “cloud”, so it’s nice to share some common lessons along the way.

I have some “monkey” jobs, which are Python scripts that run on servers in AWS. For performing regular monitoring, generating custom CloudWatch metrics, and raising alarms via SNS, Cron seems like the natural method.

But this is where I hit an unexpected glitch. The Python scripts run like a charm from the command line, but not from Cron. The crontab job ran at the specified interval (as seen in /var/log/cron), but no metric was received by CloudWatch. Unfortunately, this is where my Google searches misled me in the wrong direction (there are many unrelated posts about crontab misbehavior). I tried a number of things with no success, including running Cron as root and even rebuilding the server.

Finally, I got back on track by focusing on getting more output from the job. Use the following to direct output to a log file. Note that “2>&1” redirects standard error (file descriptor 2) to the same place as standard output, so error messages end up in the log as well.
Crontab –
* * * * * /home/ec2-user/monkey.sh > /home/ec2-user/cron.log 2>&1
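To see what the redirection changes, here is a minimal sketch. The noisy_job function is a hypothetical stand-in for monkey.sh, and the log file is a temporary file rather than the real cron.log.

```shell
#!/bin/sh
# Hypothetical stand-in for monkey.sh: one line to stdout, one to stderr.
noisy_job() {
    echo "metric sent"
    echo "socket.timeout: timed out" >&2
}

log_file=$(mktemp)

# Only stdout is redirected to the file; stderr is thrown away for this demo.
noisy_job > "$log_file" 2>/dev/null
stdout_only=$(cat "$log_file")

# With 2>&1, stderr follows stdout into the same file.
noisy_job > "$log_file" 2>&1
both_streams=$(cat "$log_file")

echo "stdout only:"
echo "$stdout_only"
echo "with 2>&1:"
echo "$both_streams"

rm -f "$log_file"
```

The order matters: “> file” has to come before “2>&1”, otherwise standard error is pointed at the old standard output rather than at the file.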

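Among the output received, the message “socket.timeout: timed out” was a clear indication of some sort of network problem. At that point it was pretty obvious that this was an internet access issue: Cron does not pick up the proxy settings. Cron jobs start with a minimal environment and do not read the login profile scripts (such as those under /etc/profile.d) that set the proxy for an interactive shell.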
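The difference between the two environments can be approximated from the command line. This is only a rough sketch: “env -i” starts a child process with an empty environment, which is close to, but not identical to, what Cron provides (Cron normally still sets a few variables such as HOME, SHELL and PATH). The proxy address is made up for illustration.

```shell
#!/bin/sh
# Pretend a login shell has already exported a proxy (hypothetical address).
export http_proxy="http://proxy.example.com:3128"

# A child of the current shell inherits the variable.
interactive_view=$(/bin/sh -c 'echo "${http_proxy:-unset}"')

# A child started with an empty environment does not, much like a cron job.
cron_like_view=$(env -i /bin/sh -c 'echo "${http_proxy:-unset}"')

echo "interactive shell: $interactive_view"
echo "cron-like shell:   $cron_like_view"
```

Running a failing job this way is often quicker than waiting for the next Cron run to reproduce the problem.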

There was another twist when I tried to set the proxy for crontab. I tried setting environment variables in /etc/crontab, which did not work for the proxy in my case. I then went on a detour to set the proxy inside Python, which was way too complicated and unnecessary. It then occurred to me that all that is needed is to set the proxy as the first step of the crontab job. The last piece of the puzzle is to execute everything on one line (so it is one job, run sequentially), like this:
Crontab –
* * * * * source /etc/profile.d/proxy.sh && /home/ec2-user/monkey.sh > /dev/null 2>&1

The above job runs every minute: it sets up the proxy first, and then does some monkey business.
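The one-line job depends on the proxy file being sourced in the same shell before the script starts. The sketch below illustrates this with a hypothetical proxy file in a temporary location (standing in for /etc/profile.d/proxy.sh, whose real contents are not shown here) and a made-up proxy address. It compares chaining the two steps with “&&”, which runs them one after the other, against a single “&”, which would push the first step into a background subshell.

```shell
#!/bin/sh
# Hypothetical proxy file standing in for /etc/profile.d/proxy.sh.
proxy_file=$(mktemp)
cat > "$proxy_file" <<'EOF'
export http_proxy="http://proxy.example.com:3128"
export https_proxy="http://proxy.example.com:3128"
EOF

# Sequential (&&): the file is sourced in this shell first, then the
# second command runs and sees the exported variables.
sequential=$(env -i /bin/sh -c ". '$proxy_file' && echo \"\${http_proxy:-unset}\"")

# Backgrounded (&): the file is sourced in a separate subshell, so the
# foreground command never sees the variables.
backgrounded=$(env -i /bin/sh -c ". '$proxy_file' & echo \"\${http_proxy:-unset}\"; wait")

echo "with &&: $sequential"
echo "with &:  $backgrounded"

rm -f "$proxy_file"
```

The sketch uses “.” rather than “source” because Cron runs jobs with /bin/sh by default, and “source” is not available in every sh implementation.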