Ticket #330 (new enhancement)

Opened 2 weeks ago

Last modified 2 weeks ago

System monitoring at NSO

Reported by: alisdair
Owned by: niles
Priority: high
Milestone:
Component: NSO Servers
Version: 1.4
Severity: major
Keywords: NSO, system monitoring
Cc:

Description

It appears that VSO03 was down for a few days without anyone noticing. We need to improve system monitoring at NSO and in general. This is partly covered by ticket 239.

Change History

comment:1 Changed 2 weeks ago by jacob

My system monitoring scripts are a step in this direction: I have vso1 SCPing JSON containing status information over to vso, and it would not be hard to set up the same code at Boulder. However, since this is more of a server issue than a database issue, it might make more sense to install Nagios Core on vso03. (I had to tear down my Nagios installation at SDAC because of security and other concerns.)
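
For illustration only, here is a minimal sketch of the push-style approach described in this comment: a host writes a small JSON status file and copies it to a central machine with scp. It is not the actual monitoring script. The function names, the status fields, and the remote target are hypothetical. The is_stale check is an added assumption about how the receiving side might notice that a host has stopped reporting, which is the failure this ticket describes.

```python
import json
import shutil
import socket
import subprocess
import time
from pathlib import Path


def collect_status(path="/"):
    """Gather a small snapshot of host health as a plain dict (fields are illustrative)."""
    usage = shutil.disk_usage(path)
    return {
        "host": socket.gethostname(),
        "timestamp": int(time.time()),
        "disk_total_bytes": usage.total,
        "disk_free_bytes": usage.free,
    }


def write_status(status, out_file):
    """Serialize the status dict to a JSON file and return its path."""
    out_file = Path(out_file)
    out_file.write_text(json.dumps(status, indent=2))
    return out_file


def scp_command(local_file, remote):
    """Build the scp argument list; BatchMode=yes makes scp fail rather than prompt."""
    return ["scp", "-o", "BatchMode=yes", str(local_file), remote]


def is_stale(status, max_age_seconds, now=None):
    """Receiver-side check: True if the last report is older than max_age_seconds."""
    now = time.time() if now is None else now
    return now - status["timestamp"] > max_age_seconds


def main():
    # Hypothetical local path and remote target; replace with real values.
    local = write_status(collect_status(), "/tmp/status.json")
    subprocess.run(scp_command(local, "monitor@central.example:/var/monitor/"), check=True)
```

Run from cron on each monitored host, main() would push a fresh report at a fixed interval. The central machine can then apply is_stale to each host's latest file and raise an alert when a report is overdue, so a host that is down for days does not go unnoticed.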
