Changes between Initial Version and Version 1 of drmsHandyChecks


Timestamp:
02/10/14 12:57:09 (10 years ago)
Author:
niles
Comment:

--

  • drmsHandyChecks

= NetDRMS Useful Debugging Checks =

This is a collection of short checks that can be run against the SUMS or DRMS databases to debug problems. It should be fairly
clear from the context whether a query should be run on the DRMS database or on the SUMS one. If in doubt, try one and fall
back to the other.

=== Check sunum_queue size ===

This checks the size of sunum_queue, which holds the sunums waiting to be processed. Ideally this is 0, unless a lot
of sunums have come in at once.

{{{
select count(*) from sunum_queue;
}}}

=== Check sunum_queue entries older than 1 day ===

This counts the entries in sunum_queue that are more than a day old. This should be 0.

{{{
select count(*) from sunum_queue where timestamp < now() - interval '1 day';
}}}
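
If the count is not 0, it can help to look at the stale entries themselves. A minimal sketch, using only the table and the timestamp column from the query above; the limit of 10 is an arbitrary choice:

{{{
-- List the oldest entries still waiting in sunum_queue, oldest first.
select * from sunum_queue
 where timestamp < now() - interval '1 day'
 order by timestamp
 limit 10;
}}}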

=== See what partitions SUMS has available ===

This shows what partitions SUMS has available. The last column in the table, pds_set_num, should be 0. If it
is not, then perhaps the disk is unmounted, or SUMS sees it as having filled up (note that SUMS treats a disk as full
slightly before the disk reaches 100% use). You will have to work with sum_rm to clear up some space and then set
pds_set_num to 0 again.

{{{
select * from sum_partn_avail;
}}}

=== Temporal coverage ===

When data are written to disk, they have an "effective date", a date after which they can be deleted by sum_rm.
This returns the earliest effective date that is still available.

{{{
select min(effective_date) from sum_partn_alloc;
}}}

=== Slony updates ===

This shows the time of the last Slony update and the time it was last applied. Both should be very recent, at least from the current day.

{{{
select * from _jsoc.sl_archive_tracking;
}}}

=== Show data on disk ===

This shows the data that are on disk, grouped by series. Note that you can be subscribed to a dataset and yet not have data for it on disk (there is no trigger to get the data).

{{{
select owning_series, sum(bytes), count(*) from sum_main group by owning_series order by sum(bytes);
}}}
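
The raw byte totals can be hard to read for large series. A variant of the same query, assuming a PostgreSQL server (pg_size_pretty is a built-in PostgreSQL function), prints the sizes in human-readable units. The aliases size_on_disk and num_entries are only illustrative names.

{{{
-- Same grouping as above, with sizes formatted as kB/MB/GB/TB.
select owning_series,
       pg_size_pretty(sum(bytes)::bigint) as size_on_disk,
       count(*) as num_entries
  from sum_main
 group by owning_series
 order by sum(bytes);
}}}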