System Monitoring

It pays to know what your machines are doing. Gathering them into a machine room is the traditional approach, allowing specialised personnel to maintain high operational standards. With the VLSI revolution, units are now found everywhere, and the luxury of a specialised environment is no longer available for them. This makes remote sensing a primary concern for system reliability and for the extended lifetimes we have come to expect from these versatile units.

I'm using the lm_sensors package to monitor the sensors built into the more modern of the units in my server farm. A cron job runs every five minutes on each unit I monitor. This job runs the lm78 tools and grabs data out of the motherboard hardware, then sends it to the web server.
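As a rough illustration of the "grab data" step, here is a minimal Python sketch that pulls numeric readings out of text shaped like "sensors" output. The sample text, the `parse_sensors` name and the label names are assumptions made for the example; real output differs between chips and lm_sensors versions, and the scripts used here are shell and awk rather than Python.

```python
import re

# Hypothetical sample of `sensors` output; real output varies by chip and version.
SAMPLE = """\
lm78-isa-0290
Adapter: ISA adapter
VCore 1:   +2.00 V  (min =  +1.90 V, max =  +2.10 V)
fan1:      4500 RPM  (min = 3000 RPM, div = 2)
temp:      +35.5 C  (limit = +60 C, hysteresis = +50 C)
"""

# A label, a colon, then the first (optionally signed) decimal number on the line.
LINE_RE = re.compile(r"^([^:]+):\s*([+-]?\d+(?:\.\d+)?)")


def parse_sensors(text):
    """Return a dict mapping each sensor label to its first numeric value."""
    readings = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            # Lines without a number after the colon (e.g. "Adapter: ...") are skipped.
            readings[match.group(1).strip()] = float(match.group(2))
    return readings


if __name__ == "__main__":
    for label, value in parse_sensors(SAMPLE).items():
        print(label, value)
```

Only the first number on each line is kept, so limits and hysteresis values in parentheses are ignored.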

Currently I'm using RRD to plot the temperature values. RRD is better suited to general purpose plotting than MRTG, as RRD does not have the positive-only and integer-only limitations imposed by MRTG.

I did hack the lm_sensors code so that it does not print character 176 (the "degree" symbol) with its values.
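If patching the source is not an option, a similar effect could be had by filtering the output afterwards. The Python sketch below is an assumption about how that might look, not what the patch does: `strip_degree` and `clean_bytes` are hypothetical helpers, and replacing the symbol with a space (rather than dropping it) is an arbitrary choice.

```python
# Character 176 is the degree sign in Latin-1 and in Unicode (U+00B0).
DEGREE = chr(176)


def strip_degree(line):
    """Replace every degree sign in a text line with a plain space."""
    return line.replace(DEGREE, " ")


def clean_bytes(raw):
    """Decode raw Latin-1 bytes from the tool and strip the degree sign."""
    return strip_degree(raw.decode("latin-1"))


if __name__ == "__main__":
    print(clean_bytes(b"temp: +35.5\xb0C"))
```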

Patch for prog/sensors/chips.c.

Perusing the awk docs led me to modify the following scripts slightly, which saves a couple of pipes plus an awk and a sed invocation.

Each monitored host has a script, run out of cron, that grabs "sensors" data and copies it with ssh to the web host. The web host first creates an RRD data file with this script and then runs a cron script to update the images for the web page.
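To give a feel for the RRD side, the following Python sketch builds the argument lists for `rrdtool create` and `rrdtool update` without running them. The data source name `temp`, the file name, and the archive sizes are assumptions chosen to match a five-minute cron interval; they are not taken from the scripts used here.

```python
def rrd_create_args(path, step=300):
    """Build an `rrdtool create` command for one temperature gauge.

    The heartbeat is twice the step, so a single missed sample is tolerated.
    """
    return [
        "rrdtool", "create", path,
        "--step", str(step),
        f"DS:temp:GAUGE:{2 * step}:U:U",  # no lower or upper bound on values
        "RRA:AVERAGE:0.5:1:288",          # one day of 5-minute samples
        "RRA:AVERAGE:0.5:12:168",         # one week of hourly averages
    ]


def rrd_update_args(path, timestamp, value):
    """Build an `rrdtool update` command for a single reading."""
    return ["rrdtool", "update", path, f"{int(timestamp)}:{value}"]


if __name__ == "__main__":
    print(" ".join(rrd_create_args("temp.rrd")))
    # Negative and fractional readings are passed through unchanged.
    print(" ".join(rrd_update_args("temp.rrd", 1000000000, -3.5)))
```

The lists can be handed to `subprocess.run` on a host where rrdtool is installed. Leaving the minimum and maximum as `U` (unknown) keeps negative readings from being rejected.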