Way back in the Before Time, I did some work to automate a couple of our recurring manual Harmonia tasks. One of these was the task to back up our shared Google Drive to Amazon S3. Prior to this we'd been running the rclone sync command manually on one of our local machines. One significant downside of this was that we each needed to have a local copy of all the files (~17GB), so I was keen to come up with an automated solution running in the cloud.
Unlike with our Trello backup, it wasn't obvious to me how we could split the work up into tasks short enough to run as AWS Lambda functions. Also, although it was interesting from an educational point of view, I felt that the orchestration/coordination complexities introduced by splitting up the Trello backup tasks had been overly cumbersome. So I decided to explore the idea of spinning up some compute to execute a script in one go, much more like how a cron job would run on a traditional server.
I was (and still am) enjoying using the AWS CDK and so, after a bit of research, I decided to use the ScheduledFargateTask construct, which is one of the higher-level patterns made available in the CDK. This construct meant that it was relatively straightforward to spin up a container on Amazon Elastic Container Service (ECS) at regular intervals and execute a shell script on that container.
Scheduled Fargate Task
The task needed access to the internet, but there was no need for it to be accessible from the internet. I could've run it on a private subnet, but this would've meant I'd need either a NAT Gateway (expensive) or to run a NAT Instance on Amazon EC2 (maintenance/complexity overhead). Since the tasks only run for a few minutes every week I was willing to sacrifice the extra security provided by a private subnet in favour of a simpler/cheaper system where the tasks run on a public subnet.
However, at that point ScheduledFargateTask only ran if its VPC had a private subnet; if there was no private subnet available, an error was reported. So I decided to take the opportunity to contribute to the AWS CDK project and opened a pull request to allow ECS tasks to run on a public subnet, which was released in v1.29.0. I really enjoy contributing to open-source projects like this. It's a really good way to get a deeper understanding of how it all works.
Having incorporated that change, I used ecs.ContainerImage.fromAsset to define the container image using a local Dockerfile. This installs rclone on an Ubuntu base image and copies a backup script and associated rclone configuration files into the home directory. This means you need the Docker CLI available locally when you run cdk deploy so it can build the container image and push it up to Amazon Elastic Container Registry ready for use by ECS.
It turned out that using the rclone sync command on a Google Drive folder containing so much data needs quite a bit of CPU and memory, but it was easy to increase this from the default of ¼ vCPU & ½ GB to 4 vCPU & 16 GB so that the command ran very quickly. Even though this is pretty beefy, given that it only runs for a few minutes once a week, the cost is negligible.
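For anyone making a similar change, it may help to see how those sizes translate into the numbers the task definition expects. The sketch below is not taken from the project; it assumes the usual Fargate convention of 1024 CPU units per vCPU and memory in MiB, and it checks the result against the CPU/memory pairings Fargate accepts for sizes up to 4 vCPU. The names `toFargateSize`, `memoryRange` and `ALLOWED_MEMORY_MIB` are hypothetical helpers invented for this illustration; `cpu` and `memoryLimitMiB` are chosen to match the property names used in the CDK's task image options.

```typescript
// Fargate sizes are given in CPU units (1024 units = 1 vCPU) and MiB of memory.
const CPU_UNITS_PER_VCPU = 1024;
const MIB_PER_GIB = 1024;

interface FargateSize {
  cpu: number;            // CPU units
  memoryLimitMiB: number; // memory in MiB
}

// Inclusive list of memory values from minMiB to maxMiB in stepMiB increments.
function memoryRange(minMiB: number, maxMiB: number, stepMiB: number): number[] {
  const values: number[] = [];
  for (let m = minMiB; m <= maxMiB; m += stepMiB) {
    values.push(m);
  }
  return values;
}

// Memory values Fargate accepts for each CPU size (hypothetical helper table,
// deliberately limited to sizes up to 4 vCPU).
const ALLOWED_MEMORY_MIB: Record<number, number[]> = {
  256: [512, 1024, 2048],
  512: memoryRange(1024, 4096, 1024),
  1024: memoryRange(2048, 8192, 1024),
  2048: memoryRange(4096, 16384, 1024),
  4096: memoryRange(8192, 30720, 1024),
};

// Convert a human-friendly size (vCPU, GiB) into Fargate units, rejecting
// combinations that are not in the table above.
function toFargateSize(vcpu: number, memoryGiB: number): FargateSize {
  const cpu = vcpu * CPU_UNITS_PER_VCPU;
  const memoryLimitMiB = memoryGiB * MIB_PER_GIB;
  const allowed: number[] | undefined = ALLOWED_MEMORY_MIB[cpu];
  if (allowed === undefined) {
    throw new Error(`Unsupported CPU size: ${vcpu} vCPU`);
  }
  if (allowed.indexOf(memoryLimitMiB) === -1) {
    throw new Error(`${memoryGiB} GiB is not valid with ${vcpu} vCPU`);
  }
  return { cpu, memoryLimitMiB };
}
```

Under these assumptions, the default size corresponds to `toFargateSize(0.25, 0.5)` (256 CPU units, 512 MiB) and the larger one to `toFargateSize(4, 16)` (4096 CPU units, 16384 MiB).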
The task is scheduled using the CronOptions interface. Configuration is supplied to the container via environment variables using dotenv. Credentials for Google Drive are supplied via Secrets Manager. Those for the S3 bucket are made available via the IAM role assigned to the ECS Task and picked up by rclone through its env_auth option.
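To make the scheduling part a little more concrete, here is a rough sketch of how a set of cron-style options ends up as the cron(...) expression that the scheduled rule uses. It is a local re-implementation for illustration only, not the CDK's own code: `CronFields` and `toCronExpression` are hypothetical names, the fields are assumed to mirror those of CronOptions (minute, hour, day, month, year, weekDay), and the weekly Sunday 03:00 UTC schedule in the usage note is an invented example rather than the project's real schedule.

```typescript
// Assumed to mirror the optional fields of the CDK's CronOptions interface.
interface CronFields {
  minute?: string;
  hour?: string;
  day?: string;     // day of the month
  month?: string;
  year?: string;
  weekDay?: string; // day of the week
}

// Build a six-field expression in the order:
// minute, hour, day-of-month, month, day-of-week, year.
function toCronExpression(options: CronFields): string {
  // Day-of-month and day-of-week cannot both be specified;
  // whichever one is not used becomes '?'.
  if (options.day !== undefined && options.weekDay !== undefined) {
    throw new Error("Supply at most one of 'day' and 'weekDay'");
  }
  const minute = options.minute ?? "*";
  const hour = options.hour ?? "*";
  const month = options.month ?? "*";
  const year = options.year ?? "*";
  const day = options.day ?? (options.weekDay !== undefined ? "?" : "*");
  const weekDay = options.weekDay ?? "?";
  return `cron(${minute} ${hour} ${day} ${month} ${weekDay} ${year})`;
}
```

With these assumptions, `toCronExpression({ minute: "0", hour: "3", weekDay: "SUN" })` gives `cron(0 3 ? * SUN *)`, i.e. once a week.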
The task is monitored with the excellent Healthchecks service which we were already using for the Trello backup. This is effectively a dead man's switch which alerts us if the script doesn't complete successfully at a given frequency and within a defined grace period.
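The dead man's switch pattern can be sketched roughly as follows. The real backup runs as a shell script, so this TypeScript version is purely illustrative and not the project's code. It assumes the Healthchecks convention of a per-check ping URL, with /start and /fail suffixes signalling the start of a run and a failure, and the bare URL signalling success. `pingUrl`, `runWithHealthcheck` and the `Ping` type are hypothetical names, and the HTTP call is passed in as a function so that the sketch has no dependencies.

```typescript
// Function that performs the HTTP request for a ping; injected by the caller.
type Ping = (url: string) => Promise<unknown>;

type Signal = "start" | "success" | "fail";

// Build the URL for a given signal from the check's base ping URL.
function pingUrl(baseUrl: string, signal: Signal): string {
  const trimmed = baseUrl.replace(/\/+$/, ""); // drop any trailing slashes
  return signal === "success" ? trimmed : `${trimmed}/${signal}`;
}

// Report the start of the job, run it, then report success or failure.
// If the process dies before any final ping is sent, the monitoring
// service raises the alert once the grace period has passed.
async function runWithHealthcheck(
  baseUrl: string,
  job: () => Promise<void>,
  ping: Ping
): Promise<void> {
  await ping(pingUrl(baseUrl, "start"));
  try {
    await job();
  } catch (err) {
    await ping(pingUrl(baseUrl, "fail"));
    throw err; // keep the original failure visible to the caller
  }
  await ping(pingUrl(baseUrl, "success"));
}
```

In a runtime with a global fetch, the `ping` argument could be as simple as `(url) => fetch(url)`; the base URL would be the ping URL shown for the check in the Healthchecks dashboard.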
Two years on, I'm really happy with how this turned out. Since I got the backup running successfully, we've only had one failure, which was due to a recent change to the Google Drive API requiring a newer version of rclone. This meant I had to dive back into the code again to fix it, but I found it pretty easy to find my way around again, partly because there's not actually very much code!
The source code for the whole CDK project is available on GitHub.