Configuring Monit to watch over your processes

I use quite a few useful gems in my Rails app, among them Delayed_job and Thinking_Sphinx, and both of these run background jobs as daemons. One of the issues with running background daemons is having to make sure that they are always up and running. I use Thinking_Sphinx to enhance the search functionality over the data in the database. If for whatever reason its process gets killed without my knowledge, search suffers badly: the Rails app throws an error if Thinking Sphinx (searchd) is not running when a user tries to do a search.

So, as useful as they can be, these background workers need to be watched constantly to make sure they are still alive and kicking. You certainly do not want to lose your precious sleep by being woken up at 3am to be told that the web server is down or that certain email jobs are not going out. This is where Monit comes to the rescue. Monit is a program that also runs in the background and does a good job of watching over your processes and services to make sure that they are up and running. If it detects that a process is down (not running as it should), it will attempt to revive that process or service. If that fails for whatever reason, it will send an email or message to inform you. There are many ways you can configure Monit to work for you.

As useful as Monit may be, configuring it for your environment is a different ball game. Troubleshooting problems with Monit is frustrating, as I found out for myself. It took me hours (almost a couple of days) of googling for answers to finally get the configuration working right in my environment.

Here are a few gotchas that you should be aware of if you gather enough courage to venture into this territory.

1. You need to watch your $PATH.

By default Monit starts up with a plain “spartan” path:

/bin:/usr/bin:/sbin:/usr/sbin

Monit also does not define a $HOME environment variable. What that means is that if you are using Bundler, it will not be able to locate your gems or your Ruby executable. Soon you will be pulling your hair out trying to figure out why a perfectly written script isn't working, and Monit is not vocal about the cause either. So make sure to set both HOME and PATH in your start and stop commands. The standard script example I followed religiously goes like this:

check process delayed_job with pidfile /var/www/my_app/shared/pids/delayed_job.pid
start program = "/usr/bin/env RAILS_ENV=production /var/www/my_app/current/script/delayed_job start"
stop program = "/usr/bin/env RAILS_ENV=production /var/www/my_app/current/script/delayed_job stop"

Unfortunately it took me hours to figure out the need to include your environment PATH within the script to make this work. The final script that works looks like this:

check process delayed_job with pidfile /var/www/my_app/shared/pids/delayed_job.pid
start program = "/usr/bin/env HOME=/home/deployer PATH=/usr/local/bin:/usr/bin:/bin:/home/deployer/.rbenv/shims:$PATH RAILS_ENV=production /var/www/my_app/current/script/delayed_job start"
stop program = "/usr/bin/env HOME=/home/deployer PATH=/usr/local/bin:/usr/bin:/bin:/home/deployer/.rbenv/shims:$PATH RAILS_ENV=production /var/www/my_app/current/script/delayed_job stop"

The path /home/deployer/.rbenv/shims is where my Ruby executable resides, and it definitely needs to be specified since I am running a Ruby script here.

So if you are struggling with getting your monitrc script to work, make sure your PATH is included.
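One way to reproduce this kind of failure outside Monit is to run your command by hand with a similarly stripped-down environment. The sketch below is only an approximation of what Monit provides: it assumes the minimal PATH quoted above and no HOME, and the function name run_like_monit is made up for this example.

```shell
#!/bin/sh
# Run a command with an environment that approximates Monit's:
# everything is cleared (so HOME is unset) and only the minimal
# PATH quoted earlier is defined. This is an assumption-based
# approximation, not an exact copy of Monit's environment.
run_like_monit() {
    env -i PATH=/bin:/usr/bin:/sbin:/usr/sbin "$@"
}

# Show the environment a command would see; only PATH should be listed.
run_like_monit /usr/bin/env
```

If your start command fails under run_like_monit but works in your normal shell, a missing HOME or PATH entry is a likely suspect. For example, you could try run_like_monit /var/www/my_app/current/script/delayed_job start and compare the result with the version that sets HOME and PATH explicitly.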

2. Which user account should I use to run Monit as?

I struggled to get this right. Getting it wrong means you keep hitting permission errors, whether you are starting a Monit job or deploying with Capistrano.

The Monit daemon can be started by a regular user or by a privileged user such as root. Which one to choose depends on your app setup, in my case on whether I need to start and stop Monit during deployment. Since my deployment runs as a regular user called deployer, and I need to be able to start/stop Monit as that user, I chose to start Monit as deployer to begin with. To do that I just need to change the ownership of the monitrc file to the deployer account.
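Besides ownership, Monit is strict about the permissions on its control file: it should not be accessible to group or others (0700 or stricter). The sketch below checks that, assuming GNU stat and a control file at ~/.monitrc. Both the function name monitrc_perms_ok and the MONITRC variable are made up for this example, so adjust them to your setup.

```shell
#!/bin/sh
# Succeed if the file grants no permissions to group or others,
# i.e. its mode is 0700 or stricter. Assumes GNU stat (stat -c).
monitrc_perms_ok() {
    mode=$(stat -c '%a' "$1") || return 2
    case "$mode" in
        0|*00) return 0 ;;   # e.g. 0, 400, 600, 700
        *)     return 1 ;;   # some group/other bit is set
    esac
}

# Hypothetical location; point MONITRC at wherever your monitrc lives.
MONITRC="${MONITRC:-$HOME/.monitrc}"

if [ -f "$MONITRC" ]; then
    if monitrc_perms_ok "$MONITRC"; then
        echo "$MONITRC: permissions look fine"
    else
        echo "$MONITRC: too permissive, try: chmod 600 $MONITRC"
    fi
fi
```

After changing the owner with chown and tightening the mode, running monit -t as the deployment user is a quick way to confirm that Monit accepts the control file.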

Why don't I just keep Monit running under the root account and start/stop it with sudo during deployment? That sounds reasonable, but as I found out later, there are still permission issues involved. After hours of trial and error, I came to the conclusion that the real issue is not whether to run Monit as deployer or as root, because in either case I still had permission issues. How so, you may ask.

I have three processes that Monit will be monitoring: Delayed_job, Sphinx and Nginx. Both Delayed_job and Sphinx are started as user deployer because I need to be able to do that during deployment. Nginx is started as root by default, I think, and I just kept it that way. So I have two sets of processes that need to be started with different user permissions. (There is probably a better way to do this, but this is the best I can come up with for the moment.)

So if I start Monit with sudo, I run into a permission issue when the time comes to restart Delayed_job during deployment, because somewhere along the way Monit may have restarted Delayed_job (in the event that the process stopped for whatever reason). When I then try to stop the Delayed_job process during deployment, I get a permission error because that process is now owned by a privileged user (root), and it stops my deployment in its tracks.

What if I start Monit as the regular user deployer? Then Monit will have a problem restarting Nginx in the event it goes down, because Nginx needs to be started as a privileged user and Monit is now owned by a regular user. I tried the "as uid root gid root" option, but it can only be used if Monit itself is started by a privileged user such as root.

So either way, I got permission issues. The solution I came up with, after hours of laborious trying, is to have Monit start each process as its respective user. Easier said than done.

The final solution I found was to use wrapper scripts to start and stop Nginx.

First, create a start script file /usr/local/bin/startNginxServer.sh containing the following:

#!/bin/sh
/etc/init.d/nginx start

Similarly, create a stop script file /usr/local/bin/stopNginxServer.sh containing:

#!/bin/sh
/etc/init.d/nginx stop

Make both scripts executable:

chmod a+x /usr/local/bin/startNginxServer.sh
chmod a+x /usr/local/bin/stopNginxServer.sh

In /etc/sudoers, add the following line:

deployer ALL=NOPASSWD: /usr/local/bin/startNginxServer.sh, /usr/local/bin/stopNginxServer.sh

In the monitrc file, the entry for Nginx will look like this (adapt the paths to your own environment; essentially you are starting and stopping Nginx through the startNginxServer.sh / stopNginxServer.sh scripts):

check process nginx with pidfile /opt/nginx/logs/nginx.pid
start program = "/usr/bin/sudo /usr/local/bin/startNginxServer.sh"
stop program = "/usr/bin/sudo /usr/local/bin/stopNginxServer.sh"

What this essentially does is allow Monit to run the Nginx scripts with root permission while everything else runs as the deployer user.

Hopefully you will find this information useful, especially if you are struggling to get Monit up and running.

~ Just A Few Random Thoughts
