Conrad Irwin

Autoscaling with Capistrano, NFS, and Runit

Bugsnag is architected to deal with huge variations in load — some days we get over 10x more API hits than others. In order to maintain a reasonable processing speed for a reasonable cost, we use AWS autoscaling to add and remove workers as needed.

This is surprisingly easy to do: first set up a launch configuration with an AMI and a shell script, then add an autoscaling group that boots more machines when the worker queue is building up and terminates them when they’re not using much CPU.

Once we had set up AWS autoscaling, we needed to ensure that new instances run the latest version of our code. To solve this we use a combination of NFS, Capistrano, and runit.

Deployment Architecture

Sharing App Code with NFS

NFS lets you share a directory between multiple servers (think of it as Dropbox for servers).

Our first approach was to [CODE]scp[/CODE] our app’s code when the autoscaler booted an instance. This was fragile, and meant that servers could get out of sync with each other, so we needed a solution that let us maintain just one copy of the app in a central location. NFS seemed perfect.

NFS was built in the 1980s, so it has a very mature interface. It literally just works. All we had to do was install the [CODE]nfs-kernel-server[/CODE] package on our build server and [CODE]nfs-common[/CODE] on the worker machines.

Configuring NFS is done by editing [CODE]/etc/exports[/CODE] on the build server. We want to export our [CODE]/apps[/CODE] directory read-only to all machines, which requires the following line of config:

-- CODE language-bash --
/apps *(ro,no_subtree_check)
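
Editing the file by hand works, but a provisioning script needs the change to be repeatable. The sketch below shows one way to add the line only when it is missing, assuming a POSIX shell on the build server. The [CODE]add_export[/CODE] helper is hypothetical and not part of the setup described here. After the file changes, [CODE]exportfs -ra[/CODE] re-exports everything it lists.

```shell
#!/bin/sh
# add_export FILE ENTRY (hypothetical helper)
# Appends ENTRY to FILE unless an identical line is already present,
# so running it several times leaves a single copy of the entry.
add_export() {
  exports_file="$1"
  entry="$2"
  touch "$exports_file"
  # -x matches whole lines, -F treats the entry as a literal string
  if ! grep -qxF -- "$entry" "$exports_file"; then
    printf '%s\n' "$entry" >> "$exports_file"
  fi
}

# Possible use on the build server, as root:
#   add_export /etc/exports '/apps *(ro,no_subtree_check)'
#   exportfs -ra             # re-export everything listed in /etc/exports
#   showmount -e localhost   # list what is currently exported
```

Taking the file path as an argument keeps the helper easy to try against a scratch file before pointing it at [CODE]/etc/exports[/CODE].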

Now, as part of our autoscaling shell script, we mount the app directory on each worker machine. The app is immediately ready to boot!

-- CODE language-bash --
mkdir -p /apps/worker
mount -t nfs nfs.internal.bugsnag.com:/apps/worker /apps/worker

Deploying with Capistrano

We use Capistrano 3 for deployment, as it neatly wraps running shell scripts on multiple servers. In fact, the only change we needed to make to move from having a list of hard-coded servers to autoscaling was to configure the list of machine names dynamically.

First we add the build machine as a server with the [CODE]deploy[/CODE] role. No tasks run under this role; it exists only so that Capistrano has somewhere to deploy the code. Then we use the AWS API to find all the running autoscaled workers and add each one as a server. The [CODE]no_release[/CODE] attribute prevents Capistrano from uploading code to that server (it can just read the code over NFS), and the [CODE]worker[/CODE] role tells it to run the bugsnag-worker.

-- CODE language-ruby --
# config/deploy/production.rb

# deploy to build server
server 'nfs.internal.bugsnag.com', user: 'deploy', roles: %w{deploy}

# find all running EC2 servers tagged with Role=worker
instances = AWS::EC2::Client.new.describe_instances(filters: [
  {name: 'tag:Role', values: ['worker']},
  {name: 'instance-state-name', values: ['running']}
])

instances.instance_index.values.each do |host|
  server host[:dns_name], user: 'deploy', roles: %w{worker}, no_release: true
end

Booting with runit

There are a plethora of process managers out there, but we’ve used runit for a while due to its simplicity.

To keep the worker running, we use cap-runit to define runit services during deployment:

-- CODE language-ruby --
# config/deploy.rb

set :runit_service_directory, "/apps/service"

runit_service :'bugsnag-worker' do
 roles :worker

 run File.read("./config/service/bugsnag-worker/run")
 log File.read("./config/service/bugsnag-worker/log/run")
end

The two run scripts get deployed to all worker machines when we run [CODE]cap production deploy[/CODE]. Runit then picks up the files we’ve added and starts running the service.
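
The contents of the two run scripts are not shown in this post. As a rough illustration of what runit expects, the sketch below writes a [CODE]run[/CODE] script and a [CODE]log/run[/CODE] script into a service directory. The [CODE]write_runit_service[/CODE] helper and the worker command passed to it are hypothetical; only the directory layout and the [CODE]svlogd[/CODE] logger come from runit itself.

```shell
#!/bin/sh
# write_runit_service DIR COMMAND (hypothetical helper)
# Creates DIR/run and DIR/log/run, the two scripts runit looks for when a
# service has logging enabled. COMMAND is a placeholder for whatever starts
# the worker; the real command is not given in this post.
write_runit_service() {
  dir="$1"
  command="$2"
  mkdir -p "$dir/log/main"

  cat > "$dir/run" <<EOF
#!/bin/sh
# Send stderr to stdout so the log service captures both streams.
exec 2>&1
cd /apps/worker/current
# exec replaces the shell, so runit supervises the worker process directly.
exec $command
EOF

  cat > "$dir/log/run" <<'EOF'
#!/bin/sh
# svlogd ships with runit; -tt prefixes each line with a readable timestamp.
exec svlogd -tt ./main
EOF

  chmod +x "$dir/run" "$dir/log/run"
}

# Example with a made-up worker command:
#   write_runit_service ./config/service/bugsnag-worker "bundle exec rake worker"
```

Both scripts end in [CODE]exec[/CODE] so that the process runit supervises is the worker (or the logger) itself rather than a wrapper shell.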

To make this work on autoscaled machines we add another command to the autoscaling script that copies the runit scripts into place. The relevant part of the script then looks like this:

-- CODE language-bash --
# AWS autoscaling user data

mkdir -p /apps/worker
mount -t nfs nfs.internal.bugsnag.com:/apps/worker /apps/worker
cp -R /apps/worker/current/config/service /apps/
chown -R deploy:dev /apps/service
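
The script above takes for granted that the NFS server is reachable the moment the instance boots. If that does not hold in your environment, a small retry wrapper is one option. This is a sketch rather than part of the setup described here: [CODE]retry[/CODE] is a hypothetical helper, and the attempt count and delay in the usage comment are arbitrary.

```shell
#!/bin/sh
# retry ATTEMPTS DELAY CMD... (hypothetical helper)
# Runs CMD until it succeeds, at most ATTEMPTS times, sleeping DELAY seconds
# after each failed attempt. Returns 0 on success, 1 if every attempt failed.
retry() {
  attempts="$1"
  delay="$2"
  shift 2
  n=1
  while ! "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      return 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
  return 0
}

# Possible use in the boot script:
#   mkdir -p /apps/worker
#   retry 10 3 mount -t nfs nfs.internal.bugsnag.com:/apps/worker /apps/worker
```

If the mount never succeeds, [CODE]retry[/CODE] returns a non-zero status, which the boot script can use to stop before copying the runit scripts from an empty directory.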

———

That’s all you need to get basic autoscaling working on AWS! Send us feedback, or tell us how you autoscale your servers, via email or Twitter. We’d love to hear from you!
