It's all connected

Somehow

Goodbye Zenoss

Five months ago, I switched jobs. One of the things I needed to implement at my new workplace was a systems monitoring and management solution. I decided to use Zenoss.

The reasons were many, but the main ones were:

  • Auto discovery of hosts and services
  • Nice, modern Python feel (not very rational, I know)
  • Nice and usable UI
  • Good graphing
  • User management
  • Agent-less monitoring
  • Support for Nagios plugins

I am now in the process of throwing out Zenoss and switching to Nagios. Why? Because Zenoss may fit very well when you need to monitor simple devices such as routers and switches, but I needed to monitor various processes that were not as easy to fit into the normal Zenoss way of thinking.

Take virtual hosts, for example. I wanted to use the Nagios check_http plugin to check every virtual host so that I could catch deployment errors early. This proved hard to do with Zenoss for two reasons. First, I had to create a template for each virtual host, which takes a lot of clicks. Second, Zenoss barfs if one IP is connected to multiple devices, so I could not represent the different virtual hosts as separate devices.
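With plain Nagios config files, a per-virtual-host check can be generated instead of clicked together. The following is a minimal Python sketch of that idea, not a finished tool. The host name, the virtual host names, the `generic-service` template and the `check_http_vhost` command name are all assumptions made up for illustration. A matching command definition would have to exist on the Nagios side, for example one that calls check_http with `-H $ARG1$ -I $HOSTADDRESS$`.

```python
def vhost_service(host_name, vhost, template="generic-service"):
    """Return one Nagios service definition that checks a single virtual host."""
    return (
        "define service {\n"
        f"    use                  {template}\n"
        f"    host_name            {host_name}\n"
        f"    service_description  HTTP {vhost}\n"
        # check_http_vhost is a hypothetical command name; the vhost is passed as $ARG1$.
        f"    check_command        check_http_vhost!{vhost}\n"
        "}\n"
    )


def render_services(vhosts_by_host):
    """Render service definitions for every virtual host on every physical host."""
    blocks = []
    for host_name in sorted(vhosts_by_host):
        for vhost in vhosts_by_host[host_name]:
            blocks.append(vhost_service(host_name, vhost))
    return "\n".join(blocks)


if __name__ == "__main__":
    # Example data only: one web server carrying two virtual hosts.
    example = {"web01": ["www.example.com", "shop.example.com"]}
    print(render_services(example))
```

Because every virtual host becomes a service on the one real host, the same IP never has to appear as more than one device.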

Another problem is that I feel Zenoss hides a bit of how it works. I was getting the feeling that I didn’t completely understand how I could change different values. This became a problem when I needed to adjust a set of values to reflect that different hosts behave differently.

Graphing

A problem with the Zenoss graphs is that some of them are in the prefs tab, while for the others you need to click into each file system to find them. The result is that you get Repetitive Strain Injury (RSI) before you know it. Nagios doesn’t have a graphing solution of its own, so Zenoss beats it there.

To solve this I have ended up with a combined approach. I’ve installed Munin on most of the hosts in question - this was something I did before I started using Zenoss, and Munin gives a lot of nice graphs out of the box. I find Munin a bit of a PITA to configure at times though. I wish someone would write a quick test script to check that a munin.conf file works correctly.

The final graphing solution I am planning to use is Munin combined with Nagios-PNP. Nagios-PNP works, is easy to set up and has the great feature that it will graph any performance data you provide with your plugin WITHOUT ANY EXTRA CONFIG! I just love it.
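The performance data in question is whatever a plugin prints after the pipe character on its output line, in the form label=value[unit];warn;crit. A minimal Python sketch of a plugin producing such a line might look like this. The metric name `queue_length` and the thresholds are invented for the example, and the actual measurement is left out.

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2
STATUS_NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL"}


def evaluate(value, warn, crit):
    """Map a measured value to a Nagios status, higher values being worse."""
    if value >= crit:
        return CRITICAL
    if value >= warn:
        return WARNING
    return OK


def format_output(label, value, warn, crit, uom=""):
    """Return (status, line); the part after '|' is the performance data."""
    status = evaluate(value, warn, crit)
    text = f"{STATUS_NAMES[status]} - {label} is {value}{uom}"
    perfdata = f"{label}={value}{uom};{warn};{crit}"
    return status, f"{text} | {perfdata}"


if __name__ == "__main__":
    # Hypothetical metric and thresholds, for illustration only.
    status, line = format_output("queue_length", 12, warn=50, crit=100)
    print(line)
    # A real plugin would also exit with `status` as its exit code.
```

Anything printed in that label=value form is what ends up as a graph.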

Auto discovery

My conclusion is that auto discovery is very nice when you want to get started, but that you will quickly tire of it. Instead I suggest a combination of nmap with XML output and a custom script to create services. I might write one some day. For now, I ended up just using a simple list of my hosts and a couple of Perl scripts to generate per-host configurations where they were needed.
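As a rough idea of what such a script could look like, here is a minimal Python sketch that reads nmap XML output (as produced by `nmap -oX`) and emits a Nagios service for each open port it knows how to check. The sample XML, the service-to-command mapping and the use of the IP address as host_name are assumptions for illustration. A real setup would need matching host and command definitions.

```python
import xml.etree.ElementTree as ET

# Trimmed-down sample of nmap XML output, made up for this example.
SAMPLE_XML = """<nmaprun>
  <host>
    <status state="up"/>
    <address addr="192.0.2.10" addrtype="ipv4"/>
    <ports>
      <port protocol="tcp" portid="22"><state state="open"/><service name="ssh"/></port>
      <port protocol="tcp" portid="80"><state state="open"/><service name="http"/></port>
      <port protocol="tcp" portid="443"><state state="closed"/><service name="https"/></port>
    </ports>
  </host>
</nmaprun>"""

# Assumed mapping from nmap service names to Nagios check commands.
CHECKS = {"ssh": "check_ssh", "http": "check_http"}


def open_services(xml_text):
    """Return (address, service name) pairs for every open port in the scan."""
    root = ET.fromstring(xml_text)
    found = []
    for host in root.findall("host"):
        address = host.find("address")  # first address element only
        if address is None:
            continue
        for port in host.findall("ports/port"):
            state = port.find("state")
            service = port.find("service")
            if state is None or service is None:
                continue
            if state.get("state") == "open":
                found.append((address.get("addr"), service.get("name")))
    return found


def render(found):
    """Render a Nagios service definition for each service with a known check."""
    blocks = []
    for addr, name in found:
        command = CHECKS.get(name)
        if command is None:
            continue  # no check known for this service, skip it
        blocks.append(
            "define service {\n"
            "    use                  generic-service\n"
            f"    host_name            {addr}\n"
            f"    service_description  {name}\n"
            f"    check_command        {command}\n"
            "}\n"
        )
    return "\n".join(blocks)


if __name__ == "__main__":
    print(render(open_services(SAMPLE_XML)))
```

The generated file can be reviewed and edited by hand before Nagios ever sees it, which is the main difference from automatic discovery.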

Further needs

After this rant, I hope to come back later with a couple of posts that are more detailed with regard to configuring Nagios NRPE and also some simple autodiscovery tools.

Quick tip at the end: Use Nagios 3

It is really worth compiling Nagios from source to get the latest version - this eases configuring Nagios-PNP a lot!

 
