A Nagios plugin for Sun hardware failure detection
By Le Seb, Saturday 3 March 2007 at 23:17 :: Linusque
As I needed a script to parse Solaris' prtdiag output and couldn't find one, I wrote my own.
Analyzing prtdiag output made easy ...
The script is logically called check_prtdiag.
It is entirely driven by a configuration file and relies heavily on Perl regular expressions and logic, because prtdiag output varies widely from one system to another.
Without any parameters, check_prtdiag runs prtdiag -v and parses its output.
For testing purposes, you can use the -v flag to get more verbose output (on STDERR), or the -f <file> option to use the contents of the specified file as input.
See check_prtdiag.txt for details about configuration file format.
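The input handling described above (run prtdiag -v by default, or read a saved capture with -f <file>, with -v for extra output on STDERR) can be sketched as follows. This is only an illustration in Python, not the plugin's actual Perl code: the function names read_prtdiag, parse_args and main are made up for this example, and the configuration file handling is left out entirely.

```python
import argparse
import subprocess
import sys


def parse_args(argv):
    """Parse the two options mentioned in the post: -f <file> and -v."""
    parser = argparse.ArgumentParser(description="Sketch of check_prtdiag-style input handling")
    parser.add_argument("-f", metavar="FILE", dest="input_file", default=None,
                        help="read prtdiag output from FILE instead of running prtdiag")
    parser.add_argument("-v", action="store_true", dest="verbose",
                        help="print extra details on STDERR")
    return parser.parse_args(argv)


def read_prtdiag(path=None, command=("prtdiag", "-v")):
    """Return prtdiag output as a list of lines.

    With a path, a saved capture is read, so prtdiag itself is not needed.
    Without one, the command is executed (this only works on a Solaris host).
    """
    if path is not None:
        with open(path, encoding="utf-8", errors="replace") as handle:
            return handle.read().splitlines()
    # The exit status is deliberately not checked in this sketch;
    # only the text written to standard output is used.
    result = subprocess.run(list(command), capture_output=True, text=True, check=False)
    return result.stdout.splitlines()


def main(argv):
    """Hypothetical entry point: fetch the lines that would then be parsed."""
    args = parse_args(argv)
    lines = read_prtdiag(args.input_file)
    if args.verbose:
        print(f"read {len(lines)} lines of prtdiag output", file=sys.stderr)
    return lines
```

Calling main(["-f", "saved_prtdiag.txt", "-v"]) would read the capture from saved_prtdiag.txt (a placeholder file name) and report the line count on STDERR, which makes it possible to test parsing rules on a machine that has no prtdiag command.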
Edited 2008-12-05 - release 1.10 :
You can now specify an alternate config file location using the -c <file> option.
The prtdiag command is no longer required when using the -f <file> option for testing purposes.
The "Unrecognized escape \s passed through" warnings should be gone.
Added tests for SunFire V210 in the sample configuration file.
Edited 2009-01-08 - release 1.11 :
Corrected exit codes on CRITICAL and WARNING statuses.
Thanks to Jonathon Weiss for finding this bug.
Edited 2009-01-08 - release 1.12 :
Release 1.11 was crappy.
Thanks to Eric Pearce for feedback.
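The exit-code fix in release 1.11 matters because Nagios decides the service state from the plugin's return code: 0 for OK, 1 for WARNING, 2 for CRITICAL and 3 for UNKNOWN. Here is a minimal Python sketch of that convention. The helper names overall_status and report, the "PRTDIAG" label and the severity ordering are assumptions made for this example; they are not taken from check_prtdiag itself.

```python
# Standard Nagios plugin return codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def overall_status(statuses):
    """Return the most severe status found, or UNKNOWN if no check ran.

    Severity order used in this sketch: CRITICAL, then WARNING, then UNKNOWN, then OK.
    """
    statuses = list(statuses)
    if not statuses:
        return UNKNOWN
    for level in (CRITICAL, WARNING, UNKNOWN):
        if level in statuses:
            return level
    return OK


def report(statuses, label="PRTDIAG"):
    """Build the (exit code, status line) pair a plugin would hand back to Nagios."""
    code = overall_status(statuses)
    return code, f"{label} {LABELS[code]}"
```

A real plugin would print the status line on standard output and then exit with the code, for example by passing it to sys.exit().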
The provided sample configuration file checks these :
[Enterprise 150]
- IO Cards : checks for the presence of "No failures found" / "No System Faults" messages
[Enterprise 250]
- IO Cards : checks for the presence of "No failures found" / "No System Faults" messages
- Memory : looks for memory modules not in "OK" state
- System leds : looks for lit 'ERROR' leds
- Disks : looks for disks not in 'OK' or 'EMPTY' states
- Fans : looks for fans not in 'OK' state
- Power Supplies : looks for PSU not in 'OK' state
[Enterprise 450]
- IO Cards : checks for the presence of "No failures found" / "No System Faults" messages
- Memory : looks for memory modules not in "OK" state
- System leds : looks for lit 'ERROR' leds
- Disks : looks for disks not in 'OK' or 'EMPTY' states
- Fans : looks for fans not in 'OK' state
- Power Supplies : looks for PSU not in 'OK' state
[Enterprise 3000]
- System leds : looks for lit failure system led
- Fans : looks for fans not in 'OK' state
- Temperatures : looks for temperature sensors not in 'stable' trend
- Power Supplies : looks for PSU not in 'OK' state
- IO Cards : checks for the presence of "No failures found" / "No System Faults" messages
[SunFire 280R]
- System leds : looks for lit 'FAULT' leds
- Fans : looks for fans not in 'NO_FAULT' state
- Disks : looks for disks not in 'NO_FAULT' state
- Power Supplies : looks for PSUs not in 'OK' state
[SunFire V120]
- IO Cards : checks for the presence of "No failures found" / "No System Faults" messages
[SunFire V210]
- CPU : checks for CPUs not in 'on-line' state
- Fans : checks for fans not in 'okay' state
- System leds : looks for lit 'SERVICE' leds
- Temperatures : looks for temperature sensors not in 'okay' state
- Voltages : looks for voltage sensors not in 'okay' state
- Current : looks for current sensors not in 'okay' state
- Field Replaceable Units : looks for FRUs not in 'okay' (PSUs) or 'present' (disks) states
[SunFire V240]
- Fans : checks for fans not in 'okay' state
- System leds : looks for lit 'SERVICE' leds
- Temperatures : looks for temperature sensors not in 'okay' state
- Voltages : looks for voltage sensors not in 'okay' state
- Current : looks for current sensors not in 'okay' state
- Field Replaceable Units : looks for FRUs not in 'okay' (PSUs) or 'present' (disks) states
[SunFire V440]
- Fans : checks for fans not in 'okay' state
- System leds : looks for lit 'SERVICE' leds
- Temperatures : looks for temperature sensors not in 'okay' state
- Voltages : looks for voltage sensors not in 'okay' state
- Current : looks for current sensors not in 'okay' state
- Field Replaceable Units : looks for FRUs not in 'okay' (PSUs) or 'present' (disks) states
[SunFire V490]
- Temperatures : looks for temperature sensors not in 'OK' state
- System leds : looks for lit 'FAULT' leds
- Disks : looks for disks not in 'NO_FAULT' state
- Fans : looks for fans not in 'NO_FAULT' state
- Power Supplies : looks for PSUs not in 'NO_FAULT' state
[SunFire 880]
- Temperatures : looks for temperature sensors not in 'OK' state
- System leds : looks for lit 'FAULT' leds
- Disks : looks for disks with lit 'FAULT' led
- Fans : looks for fans not in 'OK' state
- Power Supplies : looks for PSUs not in 'GOOD' state
- IO Cards : checks for the presence of "No failures found" / "No System Faults" messages
[Ultra 10]
- IO Cards : checks for the presence of "No failures found" / "No System Faults" messages
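
Most of the checks listed above follow the same pattern: match the lines describing a component, then flag any whose state is not in an accepted set. The Python sketch below illustrates the idea only. The sample text is invented for the example and is not real prtdiag output, the regular expression and the function name failed_components are hypothetical, and the plugin's own rules live in its Perl-based configuration file (see check_prtdiag.txt).

```python
import re

# Invented sample, NOT actual prtdiag output: one component per line, name then state.
SAMPLE = """\
FAN0   OK
FAN1   FAILED
FAN2   OK
DISK0  OK
DISK1  EMPTY
"""

# Hypothetical patterns with named groups for the component name and its state.
FAN_LINE = re.compile(r"^(?P<name>FAN\d+)\s+(?P<state>\S+)\s*$")
DISK_LINE = re.compile(r"^(?P<name>DISK\d+)\s+(?P<state>\S+)\s*$")


def failed_components(lines, pattern, ok_states):
    """Return (name, state) pairs for matching lines whose state is not accepted."""
    failures = []
    for line in lines:
        match = pattern.match(line)
        if match and match.group("state") not in ok_states:
            failures.append((match.group("name"), match.group("state")))
    return failures


lines = SAMPLE.splitlines()
fan_failures = failed_components(lines, FAN_LINE, {"OK"})
disk_failures = failed_components(lines, DISK_LINE, {"OK", "EMPTY"})
```

With this sample, fan_failures contains the single entry for FAN1, while disk_failures is empty because 'EMPTY' is an accepted disk state, as in the Enterprise 250 and 450 rules.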
