Changes between Initial Version and Version 1 of jmdBulkLoad


Timestamp: 02/18/15 13:56:57
Author: joe

     1= Bulk Loading SUNUMS into the JMD = 
     2 
     3There are times when scientists want to retrieve a lot of data.  If we can ensure that all of the data is already local, their downloads will go much faster. 
     4 
     5But to do this, we need to get the list of SUNUMs to the JMD. 
     6 
     7There are at least three ways to do this: 
     8 
     9== Using the JMD admin tools == 
     10 
     11In your JMD install directory, there should be a `bin` directory with the script `jmd_admin.pl`.  It allows you to do many things, including requesting that specific SUNUMs be retrieved: 
     12 
     13{{{ 
     14   ## To make a data request 
     15     ./jmd_admin.pl --request -- --series=<series_name> --sunums=<comma separated sunum list> [--priority=<a number from 0 to 1000>] 
     16       e.g. 
     17         ./jmd_admin.pl  --request -- --series=hmi_test.v_45s --sunums=16983416,16977656 --priority=20 
     18}}} 
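
When the SUNUM list is produced by another program, it can be convenient to assemble the command line in code rather than by hand. The following is a minimal Python sketch and is not part of the JMD tools: `build_request_command` is a hypothetical helper, and it relies only on the `--request`, `--series`, `--sunums`, and `--priority` options shown in the usage text above.

```python
def build_request_command(series, sunums, priority=None, script="./jmd_admin.pl"):
    """Build the argument list for a jmd_admin.pl data request.

    Hypothetical helper: it only mirrors the usage line shown above.
    """
    # Coerce every SUNUM to int so that stray text cannot end up on the command line.
    sunum_list = ",".join(str(int(s)) for s in sunums)
    if not sunum_list:
        raise ValueError("at least one SUNUM is required")

    argv = [script, "--request", "--", f"--series={series}", f"--sunums={sunum_list}"]

    if priority is not None:
        # The usage text gives the priority range as 0 to 1000.
        if not 0 <= int(priority) <= 1000:
            raise ValueError("priority must be between 0 and 1000")
        argv.append(f"--priority={int(priority)}")
    return argv
```

The returned list can be passed to `subprocess.run(argv, check=True)` from the JMD `bin` directory. Building a list rather than a single string avoids shell quoting problems.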
     19 
     20Typically, you won't know which SUNUMs need to be retrieved, so you'll need to use one of the other methods. 
     21 
     22== Using show_info == 
     23 
     24If you call the DRMS command `show_info` with the argument `-p`, it will attempt to retrieve the records identified by the query.  This is useful for contiguous time ranges, but I'm not certain whether DRMS and the VSO use the same sampling logic.  If the user is going to retrieve the data through DRMS, just use their query.  If they're going to use the VSO with the IDL `sample` keyword, use the next method. 
     25 
     26 
     27== Using the tables in DRMS == 
     28 
     29If we have a list of VSO fileids, we can write queries that will put the appropriate values into the table `public.sunum_queue`, which is used to track new observations that we'd like the JMD to retrieve. 
     30 
     31From IDL, we can get a list of VSO fileids: 
     32 
     33{{{ 
     34 
     35IDL> a=vso_search('2012/12/29','2015/03/01', inst='aia', sample=3600*6., wave=304, prov='jsoc') 
     36Records Returned : JSOC : 3112/3112 
     37IDL> print, a.fileid 
     38aia__lev1:304:1167588044 aia__lev1:304:1136808044 aia__lev1:304:1138406444 aia__lev1:304:1173830444 aia__lev1:304:1150243244 
     39aia__lev1:304:1192989644 aia__lev1:304:1180418444 aia__lev1:304:1159056044 aia__lev1:304:1136527244 aia__lev1:304:1153893644 
     40aia__lev1:304:1187503244 aia__lev1:304:1200312044 aia__lev1:304:1200376844 aia__lev1:304:1160870444 aia__lev1:304:1177221644 
     41aia__lev1:304:1142316044 aia__lev1:304:1195495244 aia__lev1:304:1153008044 aia__lev1:304:1146528044 aia__lev1:304:1185300044 
     42... 
     43}}} 
     44 
     45AIA fileids have three parts separated by colons: the series identifier, the wavelength, and the time (`T_REC_INDEX`).  HMI fileids have two parts: the series identifier and the time. 
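
As one way to pull the time values out of the fileids, here is a minimal Python sketch. It assumes the fileids have already been saved out of IDL as plain strings. The function names `t_rec_index` and `comma_separated_times` are hypothetical, and the sample values are copied from the IDL output above.

```python
def t_rec_index(fileid):
    """Return the trailing time field (T_REC_INDEX) of a VSO fileid as an int."""
    parts = fileid.split(":")
    # AIA: series:wavelength:time (3 parts); HMI: series:time (2 parts).
    if len(parts) not in (2, 3):
        raise ValueError(f"unexpected fileid format: {fileid!r}")
    return int(parts[-1])


def comma_separated_times(fileids):
    """Return the unique time indices, sorted, as one comma-separated string."""
    indices = sorted({t_rec_index(f) for f in fileids})
    return ",".join(str(i) for i in indices)


sample = [
    "aia__lev1:304:1167588044",
    "aia__lev1:304:1136808044",
    "aia__lev1:304:1138406444",
]
print(comma_separated_times(sample))  # prints 1136808044,1138406444,1167588044
```

Sorting and de-duplicating are not required for the query to work. They only make the resulting list easier to check by eye.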
     46 
     47Use your preferred tools to extract the list of times and turn it into a comma-separated list, then issue an INSERT similar to: 
     48 
     49{{{ 
     50INSERT INTO sunum_queue (  
     51SELECT NEXTVAL('sunum_queue_key'::regclass), lev1.sunum,  'aia.lev1' AS series_name, NOW() AS TIMESTAMP, lev1.recnum 
     52FROM aia.lev1 LEFT OUTER JOIN sunum_queue ON lev1.sunum = sunum_queue.sunum 
     53WHERE sunum_queue.sunum IS NULL and WAVELNTH=304 and T_REC_INDEX IN ( 
     541135814444,1135836044,1135857644,1135879244,1135900844,1135922444,1135944044 
     55... 
     56)); 
     57}}} 
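
For long lists, generating the statement can avoid copy-and-paste mistakes. The sketch below is only an illustration in Python: `build_queue_insert` is a hypothetical helper, the SQL text is copied from the statement above (so it applies to `aia.lev1` only), and every value is coerced to an integer before it is placed in the query.

```python
def build_queue_insert(time_indices, wavelength=304):
    """Return the sunum_queue INSERT statement for aia.lev1 as a string.

    Hypothetical helper: the SQL mirrors the hand-written statement above.
    """
    # int() rejects anything that is not a plain number, so no other text reaches the SQL.
    indices = sorted({int(i) for i in time_indices})
    if not indices:
        raise ValueError("at least one T_REC_INDEX value is required")
    in_list = ",".join(str(i) for i in indices)
    return (
        "INSERT INTO sunum_queue (\n"
        "SELECT NEXTVAL('sunum_queue_key'::regclass), lev1.sunum, "
        "'aia.lev1' AS series_name, NOW() AS TIMESTAMP, lev1.recnum\n"
        "FROM aia.lev1 LEFT OUTER JOIN sunum_queue ON lev1.sunum = sunum_queue.sunum\n"
        f"WHERE sunum_queue.sunum IS NULL AND WAVELNTH={int(wavelength)} "
        f"AND T_REC_INDEX IN (\n{in_list}\n));"
    )


print(build_queue_insert([1135814444, 1135836044, 1135857644]))
```

The printed statement can be reviewed and then run with whatever PostgreSQL client you normally use against the DRMS database.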
     58 
     59Note that this should only be used after the other options.  Due to the way that the JMD pulls records out of the queue table, more recent observations will take precedence.