As you might imagine (depending on just how nerdy and imaginative you are), Booko is a poster child for the concept of long running background tasks. Grabbing prices from 40 online stores isn’t a fast process and you certainly would not want your front end webservers making your users wait as long as the slowest of the 40 stores before responding to a user request.
Over the years, I’ve tried various approaches to running user level daemons. My first attempt was ok – I rolled my own and slowly improved it. It could handle HUP signals, write PID files, die gracefully and it knew if it hadn’t died properly and attempted to kill zombie versions of itself. It had stop / start / restart commands. But it wasn’t all sweetness and light. What happens when it dies? This is probably the trickiest part of running daemons (well, having to fork twice and make sure you have detached from the terminal is probably trickier, but still).
So, how do you make sure your daemon is running? Cron immediately springs to mind. So, part two of writing your own daemons is writing something to keep them going. You may have found yourself in this position and felt a little tickle in the back of your mind when you set up a cron job to solve this problem. My cron job looked at the modified time of the daemon’s log file and, if it was more than 5 minutes old, looked for the PID file and sent that process a KILL signal.
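For illustration, a check along those lines might look something like the Ruby sketch below. It is not the original cron job: the method names, the 300 second default and the file paths in the usage comment are all assumptions made for the example, and it only covers the "is the log stale, then KILL the PID" part.

```ruby
# Returns true when the log file has not been modified within max_age seconds.
# A missing log file is treated as stale.
def log_stale?(log_path, max_age: 300, now: Time.now)
  return true unless File.exist?(log_path)
  (now - File.mtime(log_path)) > max_age
end

# Reads the PID from pid_path and sends it KILL if the log looks stale.
# Returns the PID that was signalled, or nil if nothing was done.
def kill_if_stale(log_path, pid_path, max_age: 300, now: Time.now)
  return nil unless log_stale?(log_path, max_age: max_age, now: now)
  return nil unless File.exist?(pid_path)

  pid = File.read(pid_path).to_i
  return nil unless pid > 0

  begin
    Process.kill("KILL", pid)
    pid
  rescue Errno::ESRCH, Errno::EPERM
    nil # process already gone, or not ours to kill
  end
end

# Usage from a script run by cron (both paths are hypothetical):
#   kill_if_stale("/var/log/fetcher.log", "/var/run/fetcher.pid")
```

Run from cron every few minutes, something like this only notices a dead or hung daemon on the next tick, and only if the daemon writes to its log often enough.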
It’s an easy, stable solution to the problem at hand – albeit with a 5 minute lag to detect crashed daemons. It’s ok because I run multiple daemons which can take the load if one dies. But, what happens if you only have a single daemon? Increase the frequency of checking? Cron’s smallest resolution is 1 minute – that’s not really ok (depending on what your daemon does, it may be fine). But now you have to make sure that your daemon’s writing to the log at least every minute. Ugh.
This solution is starting to smell. So, what does everyone else do? Well, I checked out God – but it just doesn’t feel like an elegant solution to this problem. It may solve the problem nicely, but there must be a better way? Hard core nerds would probably move on to daemontools but it’s too much work for me.
That tickle you may have had in the back of your mind earlier was your subconscious telling you the problem is already solved and you already use it for your webserver, mail server, DNS server, ssh server and more. Your operating system can provide this exact service for you. Since I’m using Ubuntu that service is provided by Upstart.
Running your service with Upstart has two very nice consequences. Firstly – you can remove all the code used to manage daemonising. You can now write your code to hang around in the foreground. Leaving your code in the foreground while you’re in development mode is good anyway – you can watch it more closely. If you really want to daemonise in your dev environment, bang up a tiny Ruby script with the Ruby daemons gem which calls your actual script and manages PIDs, signals and a stop/start interface for you.
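To give a feel for how little is left once the daemonising code is gone, here is a rough sketch of a foreground worker. It is not Booko’s actual fetcher: the ForegroundWorker class, the interval and fetch_prices are all made up for the example. It assumes the supervisor asks the process to stop with a TERM signal, which is Upstart’s default.

```ruby
# Minimal foreground worker loop: no forking, no PID files, no detaching.
class ForegroundWorker
  def initialize(interval: 1.0)
    @interval = interval
    @running = true
  end

  # Ask the loop to finish after the current iteration.
  def stop
    @running = false
  end

  def running?
    @running
  end

  # Exit the loop cleanly on TERM (sent by the supervisor) or INT (Ctrl-C).
  def trap_signals
    %w[TERM INT].each { |sig| Signal.trap(sig) { stop } }
  end

  # Calls the block repeatedly until stop is requested.
  # Returns the number of iterations that ran.
  def run
    count = 0
    while running?
      yield count
      count += 1
      sleep @interval if running? && @interval > 0
    end
    count
  end
end

# Usage (fetch_prices is a hypothetical stand-in for the real work):
#   instance_id = ARGV[0] || "0"
#   worker = ForegroundWorker.new(interval: 5)
#   worker.trap_signals
#   worker.run { fetch_prices(instance_id) }
```

In development you just run the script in a terminal and stop it with Ctrl-C; in production the init system starts it, stops it and restarts it.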
Setting up a service to run with Upstart requires just a config file – here’s one I prepared earlier:
description "Price Fetcher Upstart script"
author "Dan Milne"
start on startup
stop on shutdown
console output
respawn
instance $FID
script
  env RAILS_ENV=production
  export RAILS_ENV
  exec sudo -u booko RAILS_ENV=production /opt/ruby-enterprise/bin/ruby /var/www/booko.com.au/booko/bin/fetcher.rb $FID
end script
That file gets named “fetcher.conf” and goes in the /etc/init/ directory. This has some nice features; the first of which is that once it’s started, it will keep running. If it dies, it’ll respawn (you can see the option right there in the script). The fact that it died goes in /var/log/daemons – but what’s even awesomer, you can run multiple instances of the same script, by passing in FID=0 or FID=1 etc when you’re starting it. Finally, it gets the standard init features. You can start it with ‘service fetcher start FID=0’ for example.
The only missing feature that I can see is that, because I need to pass in FID=0 to the script, it doesn’t start at bootup. There appears to be no way of stating “Start up 2 of these at boot time”.
In summary, if you use your OS init services, you get to write simpler code, get respawning at an OS level and you get all the normal daemon control features.
Would something like
instance ${FID:-0}
let you start at least instance 0 on bootup?
From http://upstart.ubuntu.com/wiki/ExpandEventVariables?highlight=((CategorySpec))
Hey Paul, turns out, that doesn’t work. It looks like it should, but doesn’t seem to. Something to do with the expansion rules I suspect.
Have you tried the @reboot in Cron?
https://help.ubuntu.com/community/CronHowto
You probably just need to add:
@reboot /usr/bin/service fetcher start FID=0
@reboot /usr/bin/service fetcher start FID=1
@adam, I asked around on IRC – the recommendation was to create a master job which starts the other services.
script
env RAILS_ENV=production
export RAILS_ENV
start fetcher FID=0
start fetcher FID=1
start fetcher FID=2
start fetcher FID=3
end script
daemontools is way easy. Like 30 seconds from install to running daemons.
Like all daemon things though, who watches the watcher?
That’s where you need other servers to keep an eye on their peers. Puppet or Nagios.
Timo – After trying this setup out for a while, Upstart (and probably init) is a pretty good place to spawn from. Processes are respawned immediately after they’ve died. It’s also very easy to set up with Upstart – the config is pretty simple. Also, there’s little chance of either Upstart or init dying. Probably the biggest issue will be processes which are hung, rather than terminated.
Hey cool, I came here looking for cheap books and found something else I could use! My approach so far has been to make my download script more reliable, but there’s always something new to crash it.