About
Introduction
"mon" is a tool for monitoring the availability of services, and sending alerts on prescribed events. Services are defined as anything tested by a "monitor" program, which can be something as simple as pinging a system, or as complex as analyzing the results of an application-level transaction. Alerts are actions such as sending emails, making submissions to ticketing systems, or triggering resource fail-over in a high-availability cluster.
The tool is extremely useful for system administrators, but its use is not limited to them. It was designed as a general-purpose problem alerting system that separates the task of testing services for availability from the task of sending alerts when things fail. To achieve this, "mon" is implemented as a scheduler that runs the programs that do the testing and triggers alert programs when those programs detect a failure. Alerts can be controlled by a variety of "squelch" knobs, and complex dependencies can be configured to help suppress excessive alerts.
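The separation between testing and alerting can be illustrated with a minimal Python sketch of a single scheduling step. This is not mon's implementation: the function name run_check is hypothetical, and both the use of a non-zero exit status to signal failure and the passing of the monitor's output to alerts on standard input are assumptions made for illustration.

```python
import subprocess


def run_check(monitor_cmd, alert_cmds):
    """Run one monitor command; if it fails, run every alert command.

    Assumption: a monitor signals failure with a non-zero exit status.
    Assumption: each alert receives the monitor's output on stdin.
    Returns True if the monitor succeeded, False otherwise.
    """
    result = subprocess.run(monitor_cmd, capture_output=True, text=True)
    if result.returncode == 0:
        return True
    for alert_cmd in alert_cmds:
        # The scheduler knows nothing about what the alert does with the text.
        subprocess.run(alert_cmd, input=result.stdout, text=True)
    return False
```

A real scheduler would call something like this repeatedly at configured intervals and apply squelching and dependency rules before running any alert. The point of the sketch is only that the scheduler treats both monitors and alerts as opaque external programs.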
None of the service testing or reporting is handled directly by the mon server itself. These functions are handled by auxiliary programs. This model was chosen because it is very extensible and does not require changing the scheduler's code to add new tests or alert types. For example, an alphanumeric paging alert can be added simply by writing a new alert script and referencing it in the configuration file. Monitoring the temperature in a room can be done by adding a script that gathers data from a thermistor via a serial port. Often these monitoring scripts can just be wrappers for pre-existing software, such as "ping" or "ftp".
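As an illustration of how small such an auxiliary program can be, here is a sketch of a monitor script in Python. Rather than wrapping the "ftp" program, it substitutes a plain TCP connection attempt to port 21. The calling convention it assumes (hosts passed as command-line arguments, failed hosts printed on the first line of output, non-zero exit status on failure) should be checked against the mon documentation. The names tcp_probe, check_hosts and main are hypothetical.

```python
import socket
import sys


def tcp_probe(host, port=21, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_hosts(hosts, probe=tcp_probe):
    """Return the sorted list of hosts for which the probe fails."""
    return sorted(h for h in hosts if not probe(h))


def main(argv, probe=tcp_probe):
    """Assumed convention: argv[1:] are the hosts to test."""
    failed = check_hosts(argv[1:], probe=probe)
    if failed:
        # Assumed convention: summarize the failed hosts on one line.
        print(" ".join(failed))
        return 1
    return 0

# To use this as a standalone script, finish with: sys.exit(main(sys.argv))
```

The probe is passed in as a parameter so that the same wrapper logic can be reused with a different test, such as one that runs "ping" in a subprocess.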
The mon scheduler can also serve network clients, allowing them to manipulate run-time parameters, disable and enable alerts and tests, list failure and alert history, and report the current state of all monitors.
