2011 retrospective and 2012 roadmap
Hi all,
First of all, Happy New Year!
Thanks to you, 2011 was great!
Last year was a great one for the Shinken project with some great features like escalations based on time, business rules, business impacts and of course the new WebUI.
But code is just a small part of a project, and I would like to thank everyone who sent us bug reports, took some time to code or helped correct my horrible English on the website and the wiki.
And what about 2012?
You asked what is coming in 2012! There are two main objectives:
- simplification for beginners, very easy setup, ready to use templates, tutorial oriented doc and an easy configuration interface
- more correlation features for the experts (trigger like)
Setup and (huge) configuration simplification
Configuration IS complex
If I go back a few years, to 2005, and remember how long it took to set up and configure my first Nagios box, the need for simplification becomes obvious. It took me a week: the compilation/install phase was quite quick, but reaching a “good” configuration, one that is effective and still in use years later, was a nightmare.
Is the situation better now, even in Shinken? There are configuration UIs of course, but these are just overlays on top of vi. You still need to think about how to design your configuration. There is no “easy best practice” way that is ready to go in a few minutes. This is a major problem.
Lots of beginners are going to Zabbix, because it gives them a ready-to-run setup in a few minutes. And guess what? If I were a beginner today, I would want this too.
The complexity problem
Why is it so complex? It’s because it follows the Nagios way of thinking, which can be summed up in a few points:
- Setup each part one by one
- No pre-built ready to go templates
- The focus is on the services, not the host
- The admins who manage the monitoring configuration are not the same people who are (trying) to add hosts to it
- Focus on “declaration”, not discovery
Solutions
The first point is in progress: we are already working on the new setup script, thanks to David Guenault. It helps set up the core but also the plugins. It will help newcomers a lot in getting a working environment.
The second point was quite easy to resolve: ready-to-run templates have been added in the Shinken git version. There are templates for typical devices:
- Windows
- Linux
- Network services: HTTP(S), FTP, IMAP/POP/SMTP, LDAP(S), DHCP
- Databases: MySQL, MSSQL and Oracle
- Active directory
- Exchange
- Printers
- VMware ESX and virtual machines
- Network equipment is in progress
If you have ideas or templates to propose for the sample templates, just send them to us. :)
The third and fourth points are strongly linked. In the Nagios world, the host was just “something” to link services to. Until the latest versions, hosts were not even “checked” unless a service raised a problem!
Most admins do not think like that. They see the host as a sum of properties. For them, their main ERP database server is a Linux+Prod+ERP+Oracle server. Looks like a list of templates? Yes, it is. In fact they can be seen as template “tags”. We were already able to set up how to check the “host” object, like the check period or the contacts. In addition to that, we should be able to automatically link services to them.
Of course there are the hostgroups. But why create groups when we already have the template tags? Why ask the admin to create another object when he already has all that he needs? That’s why Shinken introduced the “service template over host template” setup. For each template tag, the host automatically inherits the services defined for it. It is no longer required to define useless groups for that.
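To make the idea concrete, here is a minimal Python sketch of how template tags could resolve to services. It is only a model of the concept, not Shinken's code or configuration syntax: `SERVICES_BY_TAG`, `Host` and `services_for` are hypothetical names, and the service lists are made up for the example.

```python
from dataclasses import dataclass, field

# Hypothetical registry: each template tag brings its own services.
# Tag and service names are illustrative only.
SERVICES_BY_TAG = {
    "linux": ["Load", "Disks", "Memory"],
    "oracle": ["Tablespaces", "Listener"],
    "prod": [],  # a tag may only change host properties (check period, contacts)
}


@dataclass
class Host:
    name: str
    tags: list = field(default_factory=list)


def services_for(host, services_by_tag=SERVICES_BY_TAG):
    """Return the services a host inherits from its tags, in tag order, without duplicates."""
    inherited = []
    for tag in host.tags:
        # A tag without any attached service (or an unknown tag) adds nothing.
        for service in services_by_tag.get(tag, []):
            if service not in inherited:
                inherited.append(service)
    return inherited


# The "Linux+Prod+ERP+Oracle" server from the text, written as tags.
erp_db = Host("erp-db-1", tags=["linux", "prod", "erp", "oracle"])
print(services_for(erp_db))
```

No hostgroup appears anywhere: the list of tags on the host is the only thing that decides which services it gets.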
We also see more and more devops-style practices. We have seen too many places where the monitoring admin has to create the templates, groups and service definitions and then also add each new host to the configuration himself (or herself). I think there should be two roles: one that creates service packs and templates, and another for the other admins, who add new hosts and simply “tag” them based on their requirements. That’s aligned with the devops spirit: the monitoring tool expert creates “easy” tools for everyone else.
The last point is quite obvious in fact : if you can have an easy tool to automatically “tag” your hosts when you add them, why not use it?
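As an illustration of what such automatic tagging could look like, here is a small Python sketch. It assumes discovery gives us a dictionary of properties per host; the rule format, the property names (`os`, `open_port`) and the function `tags_from_discovery` are all hypothetical and do not describe Shinken's actual discovery rules.

```python
# Hypothetical discovery rules: (property name, expected value, tag to add).
RULES = [
    ("os", "linux", "linux"),
    ("os", "windows", "windows"),
    ("open_port", 1521, "oracle"),
    ("open_port", 80, "http"),
]


def tags_from_discovery(discovered, rules=RULES):
    """Return the template tags for a host, given its discovered properties.

    `discovered` maps a property name to the list of values found for it.
    """
    tags = []
    for prop, expected, tag in rules:
        if expected in discovered.get(prop, []) and tag not in tags:
            tags.append(tag)
    return tags


# What a scan of one host might return (made-up data).
scan = {"os": ["linux"], "open_port": [22, 1521]}
print(tags_from_discovery(scan))
```

The admin who adds the host then only has to review the proposed tags instead of writing them by hand.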
What is in progress
The sample templates follow this spirit: the user just needs to add his host with tags and properties (like a specific connection user if it’s not the default one), and that’s all. Services live in a “pack directory” with all they need in it:
- the host template definition
- specific macros if needed
- check commands definitions
- specific contact groups if needed
- service definitions, with only a few arguments, since most parameters should already be in the host template
The goal is that the new user should only touch host objects; that’s all they need to see. It’s the monitoring admins’ role to adapt templates to the IT environment, not the role of the other admins, who don’t have time to read the whole documentation.
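The content of a pack, as listed above, can be modelled roughly like this in Python. This is a sketch under assumptions: `ServicePack`, its field names and the command lines are hypothetical, and the real packs are plain configuration files in a directory, not Python objects. The small `missing_commands` helper only illustrates the “all they need in it” rule: every service should find its check command inside the pack.

```python
from dataclasses import dataclass, field


@dataclass
class ServicePack:
    """Everything the monitoring admin ships for one template tag (hypothetical model)."""
    host_template: str
    macros: dict = field(default_factory=dict)          # specific macros, if needed
    check_commands: dict = field(default_factory=dict)  # command name -> command line
    contact_groups: list = field(default_factory=list)  # specific contact groups, if needed
    services: dict = field(default_factory=dict)        # service name -> command name

    def missing_commands(self):
        """Return the commands used by services but not defined in the pack."""
        used = set(self.services.values())
        return sorted(cmd for cmd in used if cmd not in self.check_commands)


# Illustrative pack; the plugin names and arguments are placeholders.
linux_pack = ServicePack(
    host_template="linux",
    macros={"_SSH_USER": "monitor"},
    check_commands={
        "check_linux_load": "hypothetical_load_plugin --host $HOSTADDRESS$",
        "check_linux_disks": "hypothetical_disk_plugin --host $HOSTADDRESS$",
    },
    services={"Load": "check_linux_load", "Disks": "check_linux_disks"},
)
print(linux_pack.missing_commands())
```

An empty list means the pack is self-contained, so a user who tags a host with it does not need to define anything else.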
We are thinking about opening a place to exchange such “service packs”.
To tie all this together, the configuration UI needs to mature (we already have the discovery classes and objects). It’s our major project for the coming months.
The advanced correlation
One other objective for 2012 is to allow users to define even more advanced correlation. Of course they can use business rules, but there are things that even those can’t do. If you want to set up a rule that raises an alert when the average latency of your web servers is too high, it’s currently difficult to do.
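To show the kind of computation meant here, this is a minimal Python sketch of the average latency rule. It is not the future trigger syntax, which is still a draft: the function name, the thresholds and the latency values are assumptions made for the example.

```python
from statistics import mean


def average_latency_state(latencies_ms, warning_ms, critical_ms):
    """Return a Nagios-style state from the average latency of several hosts.

    `latencies_ms` maps a host name to its last measured latency in milliseconds.
    """
    if not latencies_ms:
        return "UNKNOWN"  # nothing to average
    average = mean(latencies_ms.values())
    if average >= critical_ms:
        return "CRITICAL"
    if average >= warning_ms:
        return "WARNING"
    return "OK"


# Made-up measurements for three web servers.
web_latencies = {"web1": 120.0, "web2": 340.0, "web3": 200.0}
print(average_latency_state(web_latencies, warning_ms=200, critical_ms=500))
```

No single server is above the critical threshold here, but the average (220 ms) crosses the warning one. This is the kind of result a business rule, which combines the states of its members rather than their values, cannot express.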
We are thinking about Zabbix trigger-like objects. Don’t worry, you won’t have to drop your plugins, it’s not for this use case.
You can see examples of such triggers in this wiki page. The goal is to allow complex expressions to be computed. For now it’s a draft, but the goal is to allow SEC-like definitions in a simpler format.
I think this should be done after the configuration UI.
Start of Shinken Enterprise
We are starting to see huge projects in their qualification phases, and that is a very good thing. But this also means that they need professional support. That’s why I’m starting Shinken Enterprise. But don’t worry, there will be no Shinken XI or other open core things.
For now I have started this on the side of my current day-to-day admin job, to answer professional support requests and on-demand development. I hope it will allow me to reduce my admin job and take more time for Shinken development and documentation. This won’t hurt my community presence of course, I’ll still be available by mail or in the forum. It can even boost my presence by giving me more free time.
We made 2011 a great year, our core is a great one, the visualization UI is growing, and if we still have as much fun as we did last year, 2012 will be an even better one with a lot of Shinken installations everywhere.
Jean
This is why I believe Shinken has what it takes to be the best and most flexible open source monitoring core.
It has the vision and the design to make it happen.