Scale in a few minutes
With Shinken, you manage a single configuration and view all your data from a single point! Shinken cuts the configuration into parts and dispatches them to all the available resources. Need more capacity? Just add new resources. You don’t even need to restart, because Shinken supports hot scaling!
It’s cloud computing applied to the IT monitoring world!
Load balancing is automatic: the administrator doesn’t need to remember which server is responsible for launching a specific check, because Shinken keeps track of that.
If you want to use virtual machines as resources, you can.
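The idea of cutting one configuration into parts can be illustrated with a minimal Python sketch. It is not Shinken's actual dispatching code: the function name split_config, the flat list of host names, and the round-robin placement are all assumptions made for illustration, and the real daemon may group hosts differently.

```python
def split_config(hosts, n_parts):
    """Cut a flat list of host names into n_parts roughly equal parts."""
    if n_parts < 1:
        raise ValueError("need at least one part")
    parts = [[] for _ in range(n_parts)]
    for index, host in enumerate(hosts):
        # Round-robin placement is an assumption made for this sketch.
        parts[index % n_parts].append(host)
    return parts


# Hypothetical host names, one part per available resource.
hosts = ["web1", "web2", "db1", "db2", "mail1"]
parts = split_config(hosts, 2)
print(parts)
```

Adding a resource then amounts to calling the function again with a larger n_parts and dispatching the new parts.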
The global architecture of Shinken: the Unix way
Shinken’s architecture has been designed according to the Unix way: one tool, one task. Right now the Nagios daemon does nearly everything: it loads the configuration, schedules and launches checks, and raises notifications. In Shinken, each part is isolated and connected to the others through a communication channel, which makes building a distributed monitoring architecture quite easy.
The major innovation of Shinken over Nagios is that it splits the different roles into separate daemons. They are:
Arbiter: it reads the configuration, cuts it into parts (N schedulers = N parts), and then sends them to all the other elements. It also manages high availability: if an element dies, it re-routes the configuration managed by the failed element to a spare one. Its other role is to receive input from users (such as external commands from nagios.cmd) and send it to the other elements. There can be only one active arbiter in the architecture.
Schedulers: they are in charge of scheduling checks, analyzing results, and triggering follow-up actions (for example, if a service is down, asking for a host check). They do not launch checks or notifications themselves. They keep a queue of pending checks and notifications for other elements of the architecture (such as pollers or reactionners). There can be many schedulers.
Pollers: they are in charge of launching plugins as requested by schedulers. When a check finishes, they return the result to the schedulers. There can be many pollers.
Reactionners: they are in charge of sending notifications and launching event_handlers. These tasks are not handled by pollers because it is easier to have a single place that sends notifications (plus one spare), for example to need fewer SMTP authorizations or to have only one RSS feed to read for all hosts/services. There can be several reactionners if the administrator so desires.
Broker: its role is to get data from schedulers (such as status) and manage it (for example by storing it in a database). The management is done by modules. Different modules exist: export to an ndo database (MySQL and Oracle backends), export to Graphite for time-series graphing, and many others.
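The division of labour described above can be sketched in a few lines of Python. This is a toy model, not Shinken's code: the Scheduler class, the poll and reroute functions, the fake_plugin check, and all host and daemon names are hypothetical, and the real daemons communicate over a network channel rather than through direct function calls.

```python
from collections import deque


class Scheduler:
    """Keeps a queue of pending checks; never runs a plugin itself."""

    def __init__(self, name, hosts):
        self.name = name
        self.pending = deque(hosts)   # checks waiting for a poller
        self.results = {}             # host -> status returned by a poller

    def get_check(self):
        return self.pending.popleft() if self.pending else None

    def put_result(self, host, status):
        self.results[host] = status


def poll(scheduler, run_plugin):
    """Poller role: take pending checks, run them, hand results back."""
    host = scheduler.get_check()
    while host is not None:
        scheduler.put_result(host, run_plugin(host))
        host = scheduler.get_check()


def reroute(assignment, dead, spare):
    """Arbiter role: move the part of a dead scheduler to a spare one."""
    assignment[spare] = assignment.pop(dead)
    return assignment


def fake_plugin(host):
    # Hypothetical check: every host answers except "db1".
    return "DOWN" if host == "db1" else "UP"


# One configuration part per scheduler, as cut by the arbiter.
assignment = {"scheduler-1": ["web1", "db1"], "scheduler-2": ["mail1"]}
# Assume scheduler-2 dies: its part goes to the spare.
assignment = reroute(assignment, dead="scheduler-2", spare="scheduler-spare")

schedulers = [Scheduler(name, hosts) for name, hosts in assignment.items()]
for scheduler in schedulers:
    poll(scheduler, fake_plugin)
    print(scheduler.name, scheduler.results)
```

Keeping the queue in the scheduler and the execution in the poller is what lets more pollers be added without touching the scheduling logic.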
This architecture is fully flexible and scalable: the only performance-demanding daemons are the pollers and schedulers, and the administrator can add as many of them as needed. A picture is worth a thousand words:
An animation is worth a thousand pictures:
If you want to know how it works, you can see this in the architecture area.
Looking for other features?
Smart and automatic load balancing is just one feature of Shinken. See here for the whole list.