Shinken’s architecture has been designed according to the Unix Way: one tool, one task. Each part is isolated and connects to the others via standard interfaces. Shinken is based on the Python Pyro remote objects library, which makes building a highly available or distributed monitoring architecture quite easy. In contrast, the Nagios daemon does nearly everything: it loads the configuration, schedules and launches checks, and raises notifications.
One of the major innovations of Shinken over Nagios is to split the different roles into separate daemons.
Another innovation is to permit the use of modules to extend and enrich the various Shinken daemons.
Arbiter : The arbiter daemon reads the configuration, divides it into parts (N schedulers = N parts), and distributes them to the appropriate Shinken daemons. Additionally, it manages the high availability features: if a particular daemon dies, it re-routes the configuration managed by this failed daemon to the configured spare. Finally, it receives input from users (such as external commands from nagios.cmd) or passive check results and routes them to the appropriate daemon. Passive check results are forwarded to the Scheduler responsible for the check. There can only be one active arbiter with other arbiters acting as hot standby spares in the architecture.
Modules for data collection : NSCA
Modules for configuration data storage : MongoDB
Modules for status retention : PickleRetentionArbiter
Modules for configuration manipulation : IP_Tag, MySQLImport, GLPI, vmware autolinking
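The arbiter's two main jobs described above (cutting the configuration into N parts for N schedulers, and re-routing a part to a spare when a daemon dies) can be sketched as follows. This is a minimal illustration, not Shinken's actual code: the function names `split_config` and `dispatch`, the daemon names, and the round-robin split are all assumptions made for the example, and the real arbiter's splitting logic is likely more involved.

```python
def split_config(hosts, n_parts):
    """Divide a list of hosts into n_parts parts (simple round-robin)."""
    parts = [[] for _ in range(n_parts)]
    for i, host in enumerate(hosts):
        parts[i % n_parts].append(host)
    return parts


def dispatch(parts, schedulers, spares, alive):
    """Give each part to its scheduler, or to a live spare if that scheduler is dead."""
    free_spares = [s for s in spares if alive(s)]
    assignment = {}
    for part, sched in zip(parts, schedulers):
        if alive(sched):
            target = sched
        elif free_spares:
            target = free_spares.pop(0)  # re-route to the configured spare
        else:
            raise RuntimeError("no live daemon available for %s" % sched)
        assignment[target] = part
    return assignment


# Hypothetical inventory and daemon names, for illustration only.
hosts = ["web1", "web2", "db1", "db2", "mail1"]
parts = split_config(hosts, 2)  # N schedulers = N parts

down = {"scheduler-2"}  # pretend this daemon has died
result = dispatch(
    parts,
    ["scheduler-1", "scheduler-2"],
    ["scheduler-spare"],
    lambda name: name not in down,
)
print(result)
```

In this run the part originally meant for `scheduler-2` ends up on `scheduler-spare`, while `scheduler-1` keeps its own part.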
Scheduler : The scheduler daemon manages the dispatching of checks and actions to the poller and reactionner daemons respectively. The scheduler daemon is also responsible for processing the check result queue, analyzing the results, doing correlation and following up actions accordingly (if a service is down, ask for a host check). It does not launch checks or notifications. It just keeps a queue of pending checks and notifications for other daemons of the architecture (like pollers or reactionners). This permits distributing load equally across many pollers. There can be many schedulers for load-balancing or hot standby roles. Status persistence is achieved using a retention module.
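To make the "queue only, never execute" role of the scheduler concrete, here is a small sketch. The class `SchedulerSketch`, its method names, and the `check_dummy` command string are hypothetical and chosen for the example; they are not Shinken's real interfaces.

```python
from collections import deque


class SchedulerSketch:
    """Holds pending checks and analyzes results; never runs a check itself."""

    def __init__(self):
        self.pending = deque()  # checks waiting to be picked up by a poller
        self.states = {}        # last known state per service

    def schedule(self, service, command):
        self.pending.append({"service": service, "command": command})

    def get_checks(self, max_checks):
        """Called by a poller: hand out up to max_checks pending checks."""
        batch = []
        while self.pending and len(batch) < max_checks:
            batch.append(self.pending.popleft())
        return batch

    def put_result(self, service, exit_code):
        """Called by a poller: record the outcome of a finished check."""
        self.states[service] = "OK" if exit_code == 0 else "PROBLEM"


sched = SchedulerSketch()
for svc in ["web1/http", "web2/http", "db1/mysql"]:
    sched.schedule(svc, "check_dummy")  # placeholder command name

batch_a = sched.get_checks(2)  # a first poller asks for work
batch_b = sched.get_checks(2)  # a second poller gets what is left
sched.put_result("web1/http", 0)
sched.put_result("db1/mysql", 2)
print(len(batch_a), len(batch_b), sched.states)
```

Because pollers pull work from the queue, adding a poller spreads the load without any change on the scheduler side.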
Poller : The poller daemon launches check plugins as requested by schedulers. When a check is finished, it returns the result to the schedulers. Pollers can be tagged for specialized checks (e.g. Windows versus Unix, customer A versus customer B, DMZ). There can be many pollers for load-balancing or hot standby spare roles.
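Poller tagging can be pictured as a simple filter: a poller only receives the checks whose tag it handles. The sketch below assumes each pending check carries a single `tag` and each poller a set of tags; the field names, tag values, and service names are illustrative and not taken from Shinken's configuration reference.

```python
def checks_for_poller(pending, poller_tags):
    """Return the pending checks whose tag is one of this poller's tags."""
    return [check for check in pending if check["tag"] in poller_tags]


# Hypothetical pending checks, tagged by platform or network zone.
pending = [
    {"service": "srv-win/cpu", "tag": "windows"},
    {"service": "srv-dmz/http", "tag": "dmz"},
    {"service": "srv-lin/load", "tag": "unix"},
]

windows_checks = checks_for_poller(pending, {"windows"})
edge_checks = checks_for_poller(pending, {"dmz", "unix"})
print([c["service"] for c in windows_checks])
print([c["service"] for c in edge_checks])
```

A poller placed inside the DMZ would then run only the DMZ checks, so the rest of the architecture needs no direct access to those hosts.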
Reactionner : The reactionner daemon issues notifications and launches event_handlers. This centralizes communication channels with external systems in order to simplify SMTP authorisations or RSS feed sources (only one for all hosts/services). There can be many reactionners for load-balancing and spare roles.
Broker : The broker daemon exports and manages data from schedulers. The management is done exclusively with modules. Multiple Broker modules can be enabled simultaneously.
Modules for data retention : Pickle, Simple-log (flat file), ToNdodb_Mysql, ToNdodb_Oracle, ToMerlindb_Mysql, ToMerlindb_Sqlite, couchdb.
Modules for exporting data : Graphite-Perfdata, NPCDMOD(PNP4Nagios) and Syslog.
Modules for the Livestatus API - status retention and history : SQLite (default), MongoDB.
Modules for the Shinken WebUI : GRAPHITE_UI, PNP_UI.
Modules for compatibility : Service-Perfdata, Host-Perfdata and Status-Dat
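Since the broker does all of its work through modules, and several can be enabled at once, its core loop can be thought of as a fan-out: every piece of data coming from a scheduler is handed to each enabled module in turn. The sketch below is an assumption-laden illustration; `BrokerSketch`, `LogModule`, `StatusModule`, and the `manage` method are hypothetical names, not the interface of the real modules listed above.

```python
class LogModule:
    """Toy stand-in for a flat-file log module: keeps one line per event."""

    def __init__(self):
        self.lines = []

    def manage(self, event):
        self.lines.append("%s %s" % (event["host"], event["state"]))


class StatusModule:
    """Toy stand-in for a status module: keeps only the latest state per host."""

    def __init__(self):
        self.status = {}

    def manage(self, event):
        self.status[event["host"]] = event["state"]


class BrokerSketch:
    """Passes every event received from schedulers to all enabled modules."""

    def __init__(self, modules):
        self.modules = modules

    def handle(self, event):
        for module in self.modules:
            module.manage(event)


log_mod = LogModule()
status_mod = StatusModule()
broker = BrokerSketch([log_mod, status_mod])  # two modules enabled together

for event in [
    {"host": "web1", "state": "DOWN"},
    {"host": "db1", "state": "UP"},
    {"host": "web1", "state": "UP"},
]:
    broker.handle(event)

print(log_mod.lines)
print(status_mod.status)
```

Each module sees the same stream but stores it in its own way, which is why exporting to a new backend only requires enabling one more module.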
The various daemons can be run on a single server for small deployments and split on different hardware for larger deployments as performance or availability requirements dictate. For larger deployments, running multiple Schedulers is recommended, even if they are on the same server. Consult planning a large scale Shinken deployment for more information.
If you wish to plan a large scale installation of Shinken, you can consult the Scaling Shinken reference.
This is essential to avoid time-consuming mistakes and aggravation.