SNMP monitoring with Shinken


Nagios did many things right...

One of them was externalizing checks to plugins. That flexibility has led to literally hundreds of SNMP-based plugins. Some offer wide coverage, others target a single device or a single OID, and others are very generic, like check_snmp. But they all have something in common.

They all suffer from design limitations that make them difficult to maintain and impossible to use at the scale of very large environments.

What if..

  • there was an integrated SNMP acquisition method for Shinken
And..
  • What if it were distributed and scalable to thousands of devices
  • What if it were efficient and performant in operation
  • What if it were compatible with network weathermap lines for NagVis
  • What if it did automatic instance mapping
  • What if it had a discovery module to create the Shinken configuration for you
  • What if you could send all that nifty data to Graphite seamlessly for outstanding graphs (or PNP for that matter)
  • What if you could have simple-to-define triggers that go beyond basic thresholds
  • What if it were written in Python like the rest of Shinken
  • What if it had been designed and written by professionals, yet still licensed under the AGPL

That would be interesting, eh?

SNMP monitoring is the bread and butter of network device monitoring, and it is exactly where plugin-based checks struggle. That is why Nagios and Shinken have always been relegated to small and medium networks, and have focused more on systems and applications. Monitoring hundreds of switches and routers that can change every day is not for the faint of heart in Nagios. In fact, it plain sucks. Even with new frameworks that cobble together a series of independent Nagios systems, or with really great check scripts, there is room for improvement.

  • What if this monitoring system just needed simple configurations for services and host definitions
  • What if it was really friendly to add more device definitions
  • What if … Oh what if I told you it was available today in a secret lab in the French Alps.
Ah ha.. Gotcha, I was only pulling your leg.
And yet, there is this piece of software running on my laptop along with Shinken. Ahh, here it is.

SnmpBooster

The SnmpBooster module was born from the need for a reliable and performant SNMP polling mechanism that does not suck to configure. You read that right: who the heck wants to pore over SNMP MIBs and screw around with Nagios-type configuration files to add each and every device? Shinken configuration needs to be programmatic and able to handle any SNMP collection need.

You know what I like:

  1. Define what you want to monitor (a list of IPs, the dependencies between devices and a few options)
  2. Launch a device discovery script that will generate the configuration for you
  3. Restart your Shinken arbiter to start monitoring your new devices
I gloss over administrative items (notification groups, building custom dashboards, Graphite graph templates, etc.) for readability.
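
To give a feel for steps 1 and 2, here is a minimal sketch of turning a device list into Nagios-style host definitions, which is the object format Shinken reads. This is not the actual discovery script: the template name `generic-snmp-host`, the function names and the sample devices are all assumptions made up for illustration, and a real discovery run would also query each device over SNMP.

```python
import ipaddress

# Hypothetical host template name, invented for this sketch.
# It is not taken from SnmpBooster or genDevConfig.
HOST_TEMPLATE = "generic-snmp-host"


def render_host(name, address, parents=()):
    """Return one Nagios-style `define host` block as text."""
    ipaddress.ip_address(address)  # raises ValueError on a malformed IP
    lines = [
        "define host {",
        f"    use        {HOST_TEMPLATE}",
        f"    host_name  {name}",
        f"    address    {address}",
    ]
    if parents:
        # Dependencies from step 1 map to the standard `parents` directive.
        lines.append(f"    parents    {','.join(parents)}")
    lines.append("}")
    return "\n".join(lines)


def render_config(devices):
    """Render every device in the list, one block per device."""
    return "\n\n".join(render_host(**d) for d in devices) + "\n"


# Step 1: what we want to monitor (sample data, documentation IP range).
devices = [
    {"name": "core-sw1", "address": "192.0.2.1"},
    {"name": "edge-sw1", "address": "192.0.2.10", "parents": ["core-sw1"]},
]

if __name__ == "__main__":
    # Step 2: generate the configuration text.
    print(render_config(devices))
```

The output would be written to a configuration file that the arbiter loads on restart (step 3). Keeping the device list as plain data is what makes the configuration programmatic: adding a switch means adding one entry, not hand-editing a definition.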

The design specification was established based on experience with many different tools and with the unique characteristics that make up Shinken. Months of development, refactoring and testing were recently completed. Thanks to the Shinken design, a SINGLE line of code is all it took to integrate the new poller module. Shinken 1.2 is ready for SnmpBooster TODAY.

Learn how to install and set up SnmpBooster.

Note: A bug slipped in with a last-minute debug commit last week. It was fixed on 2012-10-26.

Testing has shown that the module is very capable and provides healthy feedback. We consider it ready for public beta testing in preparation for making SnmpBooster the core SNMP collection method for Shinken. The installation process has more steps than we would like, but this will improve over time.

Background about the project

The design was influenced by the Cricket project, the Shinken architecture, the NrpeBooster module, the trigger mechanism of Zabbix, the genDevConfig configuration profile generator and the technical experience of the SnmpBooster development team.

Development team

Savoir Faire Linux : Thibault Cohen (Lead Developer, Design and Testing)

Acktomic Net Architects : Francois Mikus (Design, Testing, Documentation)