Version 1 (modified by niles, 11 years ago)

SUMS - General maintenance notes

SUMS is supported by a family of daemons, which must run as a specific user. Where the software resides is chosen at installation time; there is no required location. A typical location is /opt/drms, and that location is assumed throughout this discussion. The simplest check that the daemons are running is the ps command, which should list them:

prompt> ps aux | grep sum_svc | grep -v grep
sumUser       4940  0.0  0.0  61160  3192 ?        S    Jan27  10:39 sum_svc nso_drms_sums sum_svc_2014.01.27.125018.log
sumUser       4942  0.0  0.0  61152  2860 ?        S    Jan27   4:19 Sdelser nso_drms_sums sum_svc_2014.01.27.125018.log
sumUser       4944  0.0  0.0  61104  3024 ?        S    Jan27   4:53 Sinfo nso_drms_sums sum_svc_2014.01.27.125018.log
sumUser       4946  0.0  0.0  61104  3104 ?        S    Jan27   4:53 Sinfo1 nso_drms_sums sum_svc_2014.01.27.125018.log
sumUser       4949  0.0  0.0  61164  3032 ?        S    Jan27   4:42 Sinfo2 nso_drms_sums sum_svc_2014.01.27.125018.log
sumUser       4951  0.0  0.0  61200  3128 ?        S    Jan27  14:19 Sput nso_drms_sums sum_svc_2014.01.27.125018.log
sumUser       4954  0.0  0.0  61164  3096 ?        S    Jan27  14:06 Sput1 nso_drms_sums sum_svc_2014.01.27.125018.log
sumUser       4957  0.0  0.0  61164  3104 ?        S    Jan27  13:59 Sput2 nso_drms_sums sum_svc_2014.01.27.125018.log
sumUser       4959  0.0  0.0  61200  3192 ?        S    Jan27   7:25 Sget nso_drms_sums sum_svc_2014.01.27.125018.log
sumUser       4962  0.0  0.0  61180  3224 ?        S    Jan27   7:14 Sget1 nso_drms_sums sum_svc_2014.01.27.125018.log
sumUser       4964  0.0  0.0  61164  3188 ?        S    Jan27   7:02 Sget2 nso_drms_sums sum_svc_2014.01.27.125018.log
sumUser       4966  0.1  0.0  61104  3088 ?        S    Jan27  38:22 Salloc nso_drms_sums sum_svc_2014.01.27.125018.log
sumUser       4968  0.0  0.0  61152  2860 ?        S    Jan27   5:17 Salloc1 nso_drms_sums sum_svc_2014.01.27.125018.log
sumUser       4970  0.0  0.0  61152  2860 ?        S    Jan27   5:13 Salloc2 nso_drms_sums sum_svc_2014.01.27.125018.log
sumUser       4972  0.0  0.0  61136  3128 ?        S    Jan27  17:37 Sopen nso_drms_sums sum_svc_2014.01.27.125018.log
sumUser       4974  0.0  0.0  61160  3108 ?        S    Jan27  17:31 Sopen1 nso_drms_sums sum_svc_2014.01.27.125018.log
sumUser       4976  0.0  0.0  61160  3116 ?        S    Jan27  17:32 Sopen2 nso_drms_sums sum_svc_2014.01.27.125018.log
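
The listing above can also be checked mechanically. The following is a minimal sketch rather than part of the SUMS distribution: the function name missing_daemons is hypothetical, and the list of expected daemon names is simply copied from the sample output above, so it may need adjusting for a given site.

```shell
#!/bin/sh
# Assumption: the expected daemon names are those in the sample listing above.
EXPECTED="sum_svc Sdelser Sinfo Sinfo1 Sinfo2 Sput Sput1 Sput2 Sget Sget1 Sget2 Salloc Salloc1 Salloc2 Sopen Sopen1 Sopen2"

# Hypothetical helper: read "ps aux"-style lines on stdin and print the
# name of every expected daemon that does not appear in them.
missing_daemons() {
    # Field 11 of "ps aux" output is the command; strip any leading path.
    running=$(awk '{ n = split($11, parts, "/"); print parts[n] }')
    for name in $EXPECTED; do
        printf '%s\n' "$running" | grep -qx "$name" || echo "$name"
    done
}

# Typical use (prints nothing when every expected daemon is present):
#   ps aux | missing_daemons
```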

These daemons are started with the script /opt/drms/scripts/sum_start.NetDRMS, which may print messages at startup that are not entirely clear. The stop script is /opt/drms/scripts/sum_stop.NetDRMS, whose messages can be similarly unclear. After running sum_stop.NetDRMS, check that all daemons have in fact exited; if any remain, the "kill" command may have to be used on them individually. Many sites have a cron job that restarts the daemons on a regular basis.
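
One way to find daemons that survived the stop script is sketched below. This is an illustration under stated assumptions, not a tool shipped with SUMS: the helper name leftover_pids is hypothetical, the account name sumUser comes from the sample listing, and the name pattern is inferred from the daemon names shown there.

```shell
#!/bin/sh
# Assumption: the SUMS daemons run as this account (taken from the sample listing).
SUMS_USER="sumUser"

# Hypothetical helper: read "ps aux"-style lines on stdin and print the PIDs
# of processes owned by $SUMS_USER whose command name looks like a SUMS daemon.
leftover_pids() {
    awk -v user="$SUMS_USER" '
        $1 == user {
            n = split($11, parts, "/")
            if (parts[n] ~ /^(sum_svc|Sdelser|Sinfo[0-9]*|Sput[0-9]*|Sget[0-9]*|Salloc[0-9]*|Sopen[0-9]*)$/)
                print $2
        }'
}

# Typical use, after sum_stop.NetDRMS has been run:
#   pids=$(ps aux | leftover_pids)
#   [ -n "$pids" ] && kill $pids
```

It is worth printing the PIDs and inspecting them before passing them to kill.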

Although the "ps" command above confirms that the daemons are running, it does not confirm that they are functional. One way to check this is to allocate some space as a test:

prompt> /opt/drms/bin/linux_x86_64/vso_sum_alloc sunum=516847681 size=56246 seriesname=hmi.rdvfitsc_fd05
sunum:516847681;size:56246;sudir:/SUM01/D516847681

If the allocation is successful, that's a good sign.
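
For use in a script, the result of the allocation can be recognised by its form. The sketch below assumes that a successful allocation always prints a line shaped like the one above (sunum, size and sudir separated by semicolons); the helper name alloc_sudir is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read vso_sum_alloc output on stdin, print the sudir
# if a line matches the form shown above, and return non-zero otherwise.
# Assumption: success always produces "sunum:N;size:N;sudir:/path".
alloc_sudir() {
    sed -n 's|^sunum:[0-9]*;size:[0-9]*;sudir:\(/.*\)$|\1|p' | grep .
}

# Typical use, with the same arguments as the example above:
#   if dir=$(/opt/drms/bin/linux_x86_64/vso_sum_alloc sunum=516847681 \
#            size=56246 seriesname=hmi.rdvfitsc_fd05 | alloc_sudir); then
#       echo "allocation OK: $dir"
#   else
#       echo "allocation failed" >&2
#   fi
```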

The directory associated with the sum_svc daemons contains the log files (and, somewhat idiosyncratically, the configuration file for the sum_rm daemon). It is typically something like /usr/local/logs/SUM. Running "ls -ltr" in that directory shows which logs are actively being written.
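
As an alternative to reading the "ls -ltr" output by eye, recently written logs can be listed with find. This is a sketch only: the helper name recent_logs is hypothetical, the default directory is the typical location mentioned above, and the -mmin test is a GNU/BSD find extension rather than strict POSIX.

```shell
#!/bin/sh
# Hypothetical helper: print files under a log directory that were modified
# within the last N minutes (defaults: /usr/local/logs/SUM, 10 minutes).
recent_logs() {
    dir=${1:-/usr/local/logs/SUM}
    minutes=${2:-10}
    # -mmin is supported by GNU and BSD find, but is not in POSIX.
    find "$dir" -type f -mmin "-$minutes"
}

# Typical use:
#   recent_logs /usr/local/logs/SUM 10
```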