Shinken 1.2 is out
After a long wait, we are proud to announce Shinken version 1.2! As previously mentioned, this is a huge update, with numerous commits and contributions from many new developers! Have a look at the Changelog.
Let’s look at the WebUI, which has evolved since the last version. Some pages have been restyled, like the detail page:
You can now easily zoom in to a specific time period in graphs, like in the PNP graphing visualization module:
The WebUI now has a dashboard! You can add/remove widgets to control what information is displayed on a per user basis.
The filtering capabilities are much improved, and you can easily filter data to see only what you want. You can also save such queries as “bookmarks”:
The mobile web interface was also redesigned. Thanks to the Lyon Sciences-U guys!
Setup for lazy admins
OK, so you are an efficient system administrator and you don’t want to download and untar a tarball? Here is something for you. To launch the Shinken installation, just paste the following command into a root shell:
curl -L http://install.shinken-monitoring.org | /bin/bash
And you are done!
This will install the Shinken core automatically using install.sh. install.sh is also the right tool to install addons, manage dependencies, upgrade, remove and more; just follow the Shinken 10-minute installation guide.
Passive check scaling: direct routing
In previous versions, it was very easy to scale active checks by just adding pollers, but it was harder for passive checks: even with numerous receiver daemons, all the checks were routed through the arbiter daemon. Things have changed with this new version! Now you can enable the direct routing mode on the receivers, and they will send checks directly to the schedulers without going through the arbiter!
So now, it doesn’t matter if you are doing active or passive checks, scaling is always possible!
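As a sketch, enabling direct routing is a one-line flag on the receiver definition in shinken-specific.cfg (the daemon name, address and port below are placeholders; check the option name against your version’s documentation):

```
define receiver {
    receiver_name   receiver-1
    address         192.168.0.10
    port            7773
    # Send passive check results straight to the schedulers,
    # bypassing the arbiter daemon
    direct_routing  1
}
```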
One big part of this 1.2 release was configuration. We know how hard it can be to manage a huge configuration, so we did our best here, and the results are quite good!
A major component in making configuration management easy is the discovery tool. You can now define multiple “levels” in your discovery rules! Here’s an example: you want to check all your Windows shares. Of course you can write a specific runner to scan your whole network, but it is more efficient to apply these rules only to Windows hosts!
Now you can! The classic Nmap scan will discover your Windows hosts, and for each Windows server, and only those, it will launch a new runner that scans and configures the shares!
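To illustrate, a first-level rule can tag hosts by the OS that Nmap detected; the rule and template names below are hypothetical, and the exact property names should be checked against the discovery documentation:

```
# Level 1: hosts whose Nmap-detected OS is Windows
# get the "windows" template (tag)
define discoveryrule {
    discoveryrule_name  WindowsHosts
    creation_type       host
    os                  windows
    +use                windows
}

# Level 2 (sketch): a second runner is restricted to hosts that
# carry the "windows" tag, so only Windows servers are scanned
# for shares, and one service is created per discovered share.
```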
Shinken packs and community.shinken-monitoring.org
You know how useful default templates can be when you are a beginner in monitoring. Now you can share your experience with the whole community through Packs: .zip files containing templates, service definitions, command configurations, discovery rules, icons for the WebUI and graph visualization templates for PNP or Graphite. You can use the new shinken-packs command to create such a pack file, then share it with the whole community on the very intuitive community.shinken-monitoring.org website!
The last configuration feature is a massive one: the new Skonf configuration UI! It’s still considered beta, but it’s already quite good and useful. It makes it easy for everyone to add new hosts by using the discovery library. Administrators can define custom objects and download and set up new configuration packs directly from the community website: SkonfUI automatically integrates the packs available there.
Take a close look at SkonfUI; it will save you a lot of time in the future!
When using SkonfUI, you start with a discovery or a rescan of your hosts. Tags will be added automatically from your discovery rules!
Then, if you need some special parameters (like an SNMP community or an Oracle database password), you can modify your hosts:
Then, if you need new packs, just search for them on the community website from SkonfUI and launch the one-click install!
A great monitoring tool is Zabbix. One of its nice features is the trigger definition: a way to define error criteria with an expression. Whereas in Shinken, as in Nagios, all the intelligence of error checking lives in the plugins, Zabbix lets the plugins focus only on gathering data, and centralized triggers apply rules to determine possible errors or warnings.
This approach is powerful but more complex, and the simplicity of the plugin model is one of the reasons Nagios became the number one open-source monitoring tool over the past 10 years. But such triggers can be useful for more than just looking at a CPU value.
The Shinken team added a way to define “user code” in the scheduler that is executed after each check, with full access to the data and modules: the Shinken triggers. Basically, they are .trig files with embedded Python code. They can access object states and values (their own or others’), so you can define correlation rules that are not possible with the classic Shinken business rules and their OR/AND/Xof: logic.
You can also use triggers to compute KPIs, like the average response time of some web servers. Such a trigger will fill a new service’s perfdata, so the value will be available in your performance data visualization frontends like PNP or Graphite! You can also raise alerts based on the trigger output.
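Inside a .trig file, the code calls helper functions that the scheduler injects at runtime (such as one to read a metric from a service’s perfdata). The sketch below mocks that helper so the averaging KPI logic can be read and run standalone; the service names, metric name and perf() helper are illustrative, so check them against the trigger documentation for your version:

```python
# Minimal sketch of the KPI logic a Shinken trigger (.trig) could run.
# perf() is normally provided by the scheduler at runtime; here it is
# mocked with static data so the example is self-contained.

PERFDATA = {
    "web1/Http": {"time": 0.120},
    "web2/Http": {"time": 0.180},
    "web3/Http": {"time": 0.300},
}

def perf(service_name, metric):
    """Mocked helper: return one metric value from a service's perfdata."""
    return PERFDATA[service_name][metric]

def average_response_time(services, metric="time"):
    """The kind of KPI a trigger can compute after each check."""
    values = [perf(s, metric) for s in services]
    return sum(values) / len(values)

avg = average_response_time(["web1/Http", "web2/Http", "web3/Http"])
print(round(avg, 3))  # 0.2
```

In a real trigger, the computed average would then be written into the perfdata of a dedicated KPI service so PNP or Graphite can graph it.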
A new acquisition method for Shinken – Collectd – Getting only performance data
The Shinken team also added a new module to Shinken: collectd. This module listens for collectd data and fills services with the values (so they will be graphed automatically). With triggers, you can easily write “checks” around these values, from a regular plugin-style rule that warns when memory use is too high to more complex calculations. All you need on the remote server is the collectd client:
- Step 1: Install a collectd agent
- Step 2: Configure it to gather data and forward it to a remote receiver (the Shinken collectd module)
- Step 3: Enable the Shinken collectd module
- Step 4: Configure some triggers (or don’t, if you just want to graph the data in PNP or Graphite)
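For step 2, the collectd side is the standard network plugin; the hostname below is a placeholder for the machine running the Shinken collectd module, and 25826 is collectd’s default network port:

```
# /etc/collectd/collectd.conf on the monitored server
LoadPlugin cpu
LoadPlugin memory
LoadPlugin network

<Plugin network>
  # Forward all gathered metrics to the Shinken receiver
  # running the collectd module (hostname is a placeholder)
  Server "shinken.example.com" "25826"
</Plugin>
```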
Triggers are very powerful, and they will become even more so in the future, with scheduler modules that will add new functions available to triggers (possibly an SNMP TRAP OID parser module?).
In the 1000+ commits for 1.2, many other features made it in:
- Support for very large configurations (50K+ services, 100K+ metrics)
- Faster overall startup time for Shinken
- Increased scalability and performance of the core
- Livestatus API 10x performance increase thanks to new indexing and core optimizations (great for NagVis, Thruk, MK Multisite)
- The Graphite module is now very efficient; no problem with 500K metrics at a high rate (fix required, needs Shinken 1.2.1)
- Graphite module has many new options for manipulating the name of metrics sent to Graphite
- Reorganization of the wiki documentation to make it easier to get started with Shinken
- Increased use of MongoDB for resiliency of logs, configurations and retention data
- Consult the changelog for more information
The next version will be dedicated to completing SkonfUI, getting the most out of the new integrated SNMP polling module, some work on triggers (like new functions, or letting modules add functions too), and a new feature proposed in Shinken ideas: changing the check command by timeperiod, like we already do for notifications!