Last modified on 07/15/15 18:37:19

Originally edited at http://jsoc.stanford.edu/jsocwiki/DRMSSetup on 15 July 2015

NetDRMS - a shared data management system

Introduction

In order to process, archive, and distribute the substantial quantity of data flowing from the Atmospheric Imaging Assembly (AIA) and Helioseismic and Magnetic Imager (HMI) instruments on the Solar Dynamics Observatory (SDO), the Joint Science Operations Center (JSOC) has developed its own data management system. This system, the Data Record Management System (DRMS), consists of data series, each of which is a collection of related data. For example, there exists a data series named hmi.M_45s, which contains the HMI 45-second cadence magnetograms. Each data series consists of several DRMS objects: records, keywords, segments, and links. A DRMS record is the smallest unit of data-series data. Typically, it represents data for a single observation in time (hence the term series in data series), but there is no restriction on how a user organizes their data. A data series may contain one or more DRMS keywords, each of which represents a named bit of metadata. For example, many data series contain a DRMS keyword named CRPIX1. A DRMS segment is a collection of data that contains storage/retrieval information needed by DRMS to locate auxiliary data files. These data files contain large sets of data like image arrays. Generally, they are image files, but what they contain is arbitrary and user-defined. A data series optionally contains one or more DRMS links, each of which is a collection of data that links the data series to other DRMS data series. Each DRMS record contains record-specific values for the DRMS keywords, segments, and links. In this way, one record may have one set of keyword, segment, and link values, and another record may have a different set of these values.

The Storage Unit Management System (SUMS) is the file-management system that contains the data files that DRMS records refer to. Each DRMS segment value is used by DRMS code to derive the SUMS file-system path to a single data file. Because each DRMS series may contain multiple DRMS segments, each DRMS record may point to more than one data file.

To manage all these data, DRMS comprises several components, one of which is a database instance in a relational-database management system (PostgreSQL). The DRMS Library code uses a database instance and several tables to implement the DRMS objects. For each data series, there exists a database table that contains one row per DRMS record. The columns of this table contain the DRMS keyword, segment, and link values - bits of data that are all small enough to efficiently fit in a database record. The data-file data are too large to fit into a database record, so those data reside in data files in SUMS. The DRMS-segment values point to the data files, using a unique identifier called a SUNUM. SUMS itself comprises several components, one of which is another database instance that contains several database tables. When DRMS needs a data file, it requests the file from SUMS by providing SUMS with a SUNUM, and then SUMS consults its database tables to derive the path to the data file. SUMS shuttles files between hard disk (aka the disk cache) and tape, so data files have no permanent file path. Therefore, when DRMS requests the path to a file, SUMS must obtain the current path by consulting a database table.
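The division of labour between DRMS and SUMS can be pictured with a toy in-memory model. This is only an illustrative sketch: the class names, the record layout, and the /SUM0/D1001-style paths are hypothetical and are not taken from the NetDRMS code. It shows only the idea that a record keeps small values plus a SUNUM, while the current file path is always looked up in SUMS.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Record:
    """One row of a series table: small values only (hypothetical layout)."""
    recnum: int
    keywords: Dict[str, object] = field(default_factory=dict)
    sunum: int = -1  # identifier handed to SUMS; -1 here means "no files"


class ToySums:
    """Stand-in for SUMS: maps a SUNUM to wherever the files live right now."""

    def __init__(self) -> None:
        self._current_path: Dict[int, str] = {}

    def store(self, sunum: int, path: str) -> None:
        self._current_path[sunum] = path

    def migrate(self, sunum: int, new_path: str) -> None:
        # Files can move (disk cache <-> tape), so DRMS never stores the path.
        self._current_path[sunum] = new_path

    def path_for(self, sunum: int) -> str:
        return self._current_path[sunum]


sums = ToySums()
sums.store(1001, "/SUM0/D1001")  # made-up path
rec = Record(recnum=1, keywords={"CRPIX1": 2048.5}, sunum=1001)

before = sums.path_for(rec.sunum)
sums.migrate(1001, "/SUM3/D1001")  # the record itself is untouched
after = sums.path_for(rec.sunum)
```

The record never changes when the file moves; only the SUMS-side lookup does, which is why DRMS must ask SUMS for the path each time.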

Installing NetDRMS

Before Installing NetDRMS for the First Time

The initial installation of NetDRMS requires installing database software, adding one or more new users, allocating a fair bit of additional disk space for file storage, and installing, configuring and compiling the custom NetDRMS code.

The entire NetDRMS system involves, from base to top:

  1. Two instances of the Postgres database, along with the users, procedures and data tables within them
  2. NetDRMS software written mainly in C, with some embedded Postgres calls and some Python v2.7 or higher. There are two pieces to this software: DRMS and SUMS. Each is compiled/made separately. The software also requires several third party libraries, such as cfitsio, math libraries, and mpi.
  3. If you want to receive replicated data from JSOC, you'll need to install some scripts, and work with your ssh keys and software called hpn-ssh.
  4. If you want to be a distributor of data, you'll need to install a 'JMD' java/derby database system, third party libraries for tar and curl, and possibly Slony replication software.
  5. If you are a VSO installation, you'll need to run a web server and install further Perl code.

When installing NetDRMS, it is best to do it in a nested order, as listed above, and test each phase for success as you go. Don't move on to the next piece of the installation until reasonably assured that the software installed in the prior step works as planned.

The zeroth step is really to download the NetDRMS Distribution and familiarize yourself with its contents. This is a gzipped tarfile. Unpack it into a target root directory of your choice, e.g. /usr/local/drms or $HOME/drms or /opt/netdrms. The size of the source distribution is currently about 230 MB. A built system (including SUMS) is typically about 630 MB, but this does not include any data in your databases, which can be considerably more depending on how much you want to store locally. You may wish to create a symbolic link to the NetDRMS directory. E.g. your code is really in /opt/netdrms87/, but you have a link /opt/netdrms/ that points to whatever your most current NetDRMS code directory is. This will facilitate updates without changing environment variables. Once you've decided where to put the code, unpack it and have a look at it. In particular, read the config.local.template file. You will need to copy this file, rename the copy to config.local, and then adjust it. The config.local file will drive most of your localized NetDRMS settings, so read the template carefully in preparation for adjusting it to your own site.

When you do create your config.local file, it is a good idea to save a copy in a directory outside your $DRMS directory; the SUMS_LOG_BASEDIR would be a good place to keep it if you are the SUMS_MANAGER.
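The symbolic-link arrangement mentioned above (a stable /opt/netdrms/ name pointing at a versioned directory) can be sketched as follows. This is a minimal illustration, not part of the NetDRMS distribution: the helper name point_link_at and the netdrms88 directory are made up, and the demonstration runs in a scratch directory rather than under /opt.

```python
import os
import tempfile


def point_link_at(link_path, target_dir):
    """Make link_path a symlink to target_dir, replacing an existing link."""
    staging = link_path + ".new"
    if os.path.lexists(staging):
        os.remove(staging)
    os.symlink(target_dir, staging)
    os.replace(staging, link_path)  # rename over the old link in one step


# Demonstration in a scratch directory; netdrms88 is a made-up "next" release.
root = tempfile.mkdtemp()
old_tree = os.path.join(root, "netdrms87")
new_tree = os.path.join(root, "netdrms88")
os.mkdir(old_tree)
os.mkdir(new_tree)
link = os.path.join(root, "netdrms")

point_link_at(link, old_tree)
first_target = os.readlink(link)
point_link_at(link, new_tree)
second_target = os.readlink(link)
```

Building the new link under a temporary name and renaming it over the old one means that anything following the stable path always finds either the old or the new tree.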

Installation

First, you will need to create a few linux users and groups, giving them the needed permissions (see "Users and Environment" below). Second, you will need to install the PostgreSQL Relational Database Management System (PG) and create two databases (see "Install Postgres" and "Start Postgres and Install Data Structures" below). Third, you will need to establish disk storage for SUMS (see "Set Up the SUMS database" below). Fourth, you will need to install third-party libraries needed by DRMS and SUMS (see "Third Party Software for NetDRMS" below). Fifth, you will need to build and install NetDRMS and SUMS (see "Configure NetDRMS and SUMS" and "Compile NetDRMS" below).

To install NetDRMS and SUMS, please follow these directions in order. All accounts/paths/ports/etc. referenced can be modified, but we recommend not doing this unless you are certain they must be different. Debugging issues from Stanford becomes difficult if every site does things differently. The accounts/paths/ports/etc. listed below are the ones used on Stanford's test NetDRMS (on the machine "shoom").

Bear in mind that you may have to change the ownership and permissions on the $DRMS directory as you go through the install process and determine the user that will run the code.

Users and Environment

  1. Set up your existing linux environment to accept NetDRMS (to be done by a superuser or someone with sudo privileges)
    1. Create a production linux user (named production by default). The name of this user is the value of the SUMS_MANAGER parameter in the config.local file. If necessary, modify the sudoers file to include the name of the production user so that this user has the privileges necessary to run a setuid program, sum_chmown, that is part of the SUMS-installation package:

           <production user> <host>=NOPASSWD:<path to sum_chmown>

       This will allow sum_chmown to be run without a password prompt being presented. Other sites have configured their production user to have highly specific ownership permissions as an alternative to giving the user sudo privileges, and nullified the sum_chmown script since all their data is written only by one user. Clearly, there are security concerns that must be addressed in configuring a production user. We recommend you think through what is best for your local circumstances.
    2. Create a linux group to which the production user belongs, e.g. sumsadmin. All users who will be using the NetDRMS system to access or create SUMS data files must also belong to this group.
    3. Create a linux user named postgres. This is the user that will own all of the Postgres data files. It is also the user that will run the server daemon process (postgres). It will also be the superuser inside the database environment, so think again about security for those who might su to become this user.
    4. Each user of DRMS, including the production user, must set two environment variables in their environment:

           setenv JSOCROOT <DRMS source tree root>
           setenv JSOC_MACHINE <OS and CPU>

       where <DRMS source tree root> is the root of the DRMS source tree installed by the production linux user, and <OS and CPU> is "linux_x86_64", if DRMS was installed on a machine with a Linux OS and a 64-bit processor, or "linux_avx", if DRMS was installed on a machine with a Linux OS and a 64-bit processor that supports Advanced Vector Extensions (which supports an extended instruction set). You may wish to have the NetDRMS software installed and compiled before you put the $JSOC_MACHINE variable into play.
    5. Create the SUMS log directory on the SUMS server machine, if it does not already exist. The name/path for this directory is defined in config.local in the SUMS_LOG_BASEDIR parameter. The actual directory must match the value of this parameter, which defaults to /usr/local/logs/SUM. You are free to change this path in SUMS_LOG_BASEDIR, to, say, /var/log/sums or whatever is consistent with your system logs. This directory must be writeable by the linux production user.

Install Postgres

  1. Set up the Postgres database.
    1. Install server version 8.4 (this is the only version supported by Stanford) on a dedicated machine. Obtain the latest 8.4 rpm binaries from ftp://ftp.postgresql.org/pub/binary/ or via yum. If you are not going to become a provider or generator of Slony data, you can install a later version of Postgres; versions up to 9.3 have been proven at other data sites. Slony is the database replication software that the JSOC at Stanford uses to distribute records, and its version is presently tied to Postgres 8.4.x.
    2. Install the client software, version 8.4 or the version corresponding to your chosen server version, on all machines that will be used to either access the database server or build DRMS software. All DRMS software must connect to the DRMS and SUMS databases. To do so, it must be linked against static and/or dynamic libraries that allow database access. These libraries are a component of the Postgres client software, so it must be installed on machines used to build DRMS software. Some dynamic libraries are involved, so the host on which this software is run must also have the Postgres client software installed.
    3. As user postgres, create a database cluster for the DRMS data. A database cluster is a storage area on disk that contains the data for one or more databases. The storage area is implemented as a directory (the data directory) and it is managed by a single instance of a Postgres server process. To create this cluster (data directory), first log in as the linux user postgres, and then run the initdb command:

           initdb --locale=C -D /var/lib/pgsql/data

       This will create the data directory /var/lib/pgsql/data on the database server host. If you want to place the data in a different directory, go right ahead and change the -D parameter value. The "--locale" argument will set cluster locale to "C". Locale support refers to an application respecting cultural preferences regarding alphabets, sorting, number formatting, etc. PostgreSQL uses the standard ISO C and POSIX locale facilities provided by the server operating system. We recommend "C" and make no guarantees what will happen to your formatting if you deviate.
    4. Also as user postgres, create a database cluster for the SUMS data. This cluster is distinct from the cluster for the DRMS data, and it is maintained by a separate server instance:

           initdb --locale=C -D /var/lib/pgsql/data_sums

       This will create the data directory /var/lib/pgsql/data_sums on the database server host (or wherever you've decided to put the cluster with the -D parameter).
    5. Edit the Postgres configuration files - you will have these in two different places, one for each cluster you created with initdb. The configuration files are cluster-specific, and they reside in the data directory created by the initdb command. These are the key parameters which will determine your database efficiency and security. A complete list of all modifiable parameters can be found in the Postgres online documentation, but a few are worth mentioning now.
      1. listen_addresses (in postgresql.conf) is a list of the server's own IP addresses (network interfaces) on which it listens for client connections. By default the value of the parameter is "localhost", which means the server accepts only connections made from the machine hosting the database server process. This is not what you want. The single-quoted string '*' makes the server listen on all of its interfaces, so that other machines can connect. If you want to be more restrictive, you can provide a comma-separated list of the server's hostnames or IP addresses to listen on; restricting which client machines may connect is done in pg_hba.conf (see below).
      2. port (in postgresql.conf) is the port on which the server listens for connections. If you create more than one cluster on the host server machine (e.g., if you create both the DRMS and SUMS clusters on a single host), then you'll need to change the port number for at least one cluster (you cannot have two server processes listening for connections on the same port). We suggest using port 5432 for the DRMS cluster (port = 5432 - no quotes), and port 5434 for the SUMS cluster. Note that port 5432 is the default port for Postgres.
      3. logging_collector (in postgresql.conf). Set this to 'on' so that the output of the Postgres server process will be captured into log files and rotated once per day.
      4. log_rotation_size (in postgresql.conf). Set this to 0. This will cause PG to emit one log every day (as opposed to starting a new log after the previous log is a certain size).
      5. log_min_duration_statement (in postgresql.conf). Set this to 1000 so that only queries that are greater than 1000 ms in run time will be logged. Otherwise, the log files will quickly get out of hand.
      6. shared_buffers. Set this to how much memory you want to devote to running the database. The default is 128 MB, so you should increase it. You may also wish to adjust the values for work_mem, maintenance_work_mem, and max_stack_depth, but consult the Postgres manual for a better understanding.
      7. Adjust and learn about the pg_hba.conf file. This file contains lines of the form

             <connection type> <databases> <user> <IP address> <IP mask> <authentication method>

         if you wish to use an IP-address mask to specify a range of IP addresses, or

             <connection type> <databases> <user> <CIDR-address> <authentication method>

         if you wish to use a CIDR-address to specify the range. To get yourself up and running, you'll need to add a line or two to this file. To allow access by one host, we suggest

             host all all XXX.XXX.XXX.XXX 255.255.255.255 md5

         or

             host all all XXX.XXX.XXX.XXX/32 md5

         For multiple-host access, we suggest

             host all all XXX.XXX.XXX.0 255.255.255.0 md5

         or

             host all all XXX.XXX.XXX.0/24 md5

         The md5 authentication method requires a password, which is what makes the user .pgpass files come into play. You may also wish to comment out the line

             local all all trust

         This line allows anyone on the local machine to log in with no password, and isn't good for long term security. Once you've commented out this line, you will no longer be able to log in without a correctly made .pgpass file. Please note that whenever you make changes to pg_hba.conf, you will need to restart (or reload) the database server to have the changes take effect. You can test your changes once you've started the server.
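The mask form and the CIDR form of a pg_hba.conf entry describe the same address range. The sketch below uses Python's standard ipaddress module to print both forms for one network and to check whether a client address falls inside it. The helper names and the 192.0.2.0/24 documentation range are assumptions made for illustration; substitute your own site's addresses.

```python
import ipaddress


def hba_lines(network, method="md5"):
    """Return the (mask form, CIDR form) pg_hba.conf lines for one network."""
    net = ipaddress.ip_network(network)  # raises ValueError if host bits are set
    mask_form = f"host all all {net.network_address} {net.netmask} {method}"
    cidr_form = f"host all all {net.with_prefixlen} {method}"
    return mask_form, cidr_form


def is_allowed(client, network):
    """True if the client address falls inside the network range."""
    return ipaddress.ip_address(client) in ipaddress.ip_network(network)


if __name__ == "__main__":
    for line in hba_lines("192.0.2.0/24") + hba_lines("192.0.2.17/32"):
        print(line)
    print(is_allowed("192.0.2.200", "192.0.2.0/24"))
```

Checking a client address this way before editing pg_hba.conf can save a restart cycle when a connection is unexpectedly rejected.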

Start Postgres and Install Data Structures

  1. The remainder of the instructions require that the Postgres servers (there is one for the DRMS cluster, and one for the SUMS cluster) be running. To start up the server instances run:

         su postgres
         pg_ctl start -D /var/lib/pgsql/data # start the DRMS-database cluster server
         pg_ctl start -D /var/lib/pgsql/data_sums -o "-p 5434" # start the SUMS-database cluster server

     (The port is passed through to the server with -o; pg_ctl's own -p option names the path to the postgres executable.) The server logs will be placed in the pg_log subdirectory for each cluster.
  2. Test pg_hba.conf.
    1. Make .pgpass files and ensure that they work. You'll know you've done it right when the production user can connect to the database via "psql" without being prompted for a password. To do this, create a .pgpass file in the production user's home directory. See the Postgres documentation web site for information on the .pgpass file. It is important that the permissions for the .pgpass file are set to 600, so that it is readable only by the individual user. You will need to adjust your pg_hba.conf settings in Postgres in order for the .pgpass file to work correctly, and if you need to change pg_hba.conf later, you'll need to restart the database to get it to see the new settings. It is important that you fully test your .pgpass access with at least one user before proceeding; much depends on its working. If you cannot get it to work and need to step backward with less security, add the "local all all trust" line back into pg_hba.conf and restart the database using pg_ctl restart.
  1. Create the DRMS database in the DRMS cluster, and create the SUMS database in the SUMS cluster:

         su postgres
         createdb --locale C -E LATIN1 -T template0 data # create the DRMS database in the DRMS-database cluster
         createdb --locale C -E LATIN1 -T template0 -p 5434 data_sums # create the SUMS database in the SUMS-database cluster

     NOTE: The -E flag sets the character encoding of the characters stored in the database. LATIN1 is not a great choice (it would have been better to have used SQL_ASCII or UTF8), but that is what was chosen at Stanford so we're stuck with it, which means remote sites that have become series subscribers are stuck with it too.
  2. Install the required DB-server languages:

         createlang -h <db server host> -p 5432 -U postgres plpgsql data # Add the plpgsql language to the DRMS database
         createlang -h <db server host> -p 5432 -U postgres plperl data # Add the plperl language to the DRMS database
         createlang -h <db server host> -p 5432 -U postgres plperlu data # Add the plperlu 'untrusted' language to the DRMS database

     At this time, there are no auxiliary languages needed for the SUMS database.
  3. Create various tables and DRMS database functions needed by the DRMS library. You will need the NetDRMS source code for this:

         psql -h <db server host> -p 5432 -U postgres data -f $JSOCROOT/base/drms/scripts/NetDRMS.sql # Create the 'admin' schema and tables within this schema; create the 'drms' schema
                                                                                                      # Create the SUMSADMIN database user
         su postgres
         cd $JSOCROOT/base/drms/scripts
         ./createpgfuncs.pl data # Create functions in the DRMS database
  4. Create database accounts for DRMS users. To use DRMS software/modules, a user of this software must have an account on the DRMS database (a DRMS series is implemented as several database objects). The software, when run, will log into a user account on the DRMS database - by default, the name of the user account is the name of the linux user account that the DRMS software runs under.
    1. Run the newdrmsuser.pl script - you will be prompted for the postgres dbuser password:

           $JSOCROOT/base/drms/scripts/newdrmsuser.pl data <db server host> 5432 <db user> <initial password> <db user namespace> user 1

       where <db user> is the name of the user whose account is to be created and <db user namespace> is the namespace DRMS should use when running as the db user and reading or writing database tables. The namespace is a logical container of database objects, like database tables, sequences, functions, etc. The names of all objects are qualified by the namespace. For example, to unambiguously refer to the table "mytable", you prepend the name with the namespace. So, for example, if this table is in the su_production namespace (container), then you refer to the table as "su_production.mytable". In this way, there can be other tables with the same name, but that reside in a different namespace (e.g., su_arta.mytable is a different table that just happens to have the same name). Please see the NOTE in this page for assistance with choosing a namespace. <initial password> is the initial password for this account. This is another useful place for you to test your .pgpass files if you have access to a home account for testing purposes, such as your own user account. You may have a mis-configuration in your pg_hba.conf file that would make it appear that .pgpass was not working.
    2. Have the user that owns the account change the password:

           psql -h <db server host> -p 5432 data
           data=> ALTER USER <db user> WITH PASSWORD '<new password>';

       where <new password> is the replacement for the original password. It must be enclosed in single quotes.
    3. Have the user put their password in their .pgpass file. See the Postgres documentation for information on the .pgpass file. This file allows the user to log in to their database account without having to provide a password at a prompt.
    4. Create a db account for the linux production user (the name is the value of the SUMS_MANAGER parameter in config.local). The name of the database user for this linux user is the same as the name of the linux user (typically 'production'). Follow the previous steps to create this database account.
    5. Create a password for the sumsadmin DRMS database user, following the "ALTER USER" directions above. The user was created by the newdrmsuser.pl script above.
    6. OPTIONALLY, create a table to be used for DRMS version control:

           psql -h <db server host> -p 5432 -U <postgres administrator> data
           CREATE TABLE drms.minvers(minversion text default '1.0' not null);
           GRANT SELECT ON drms.minvers TO public;
           INSERT INTO drms.minvers(minversion) VALUES('<version>');

       where <version> is the minimum DRMS version that a DRMS module must have before it can connect to the DRMS database. Because the column is of type text, the value must be enclosed in single quotes.
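Several of the steps above depend on a working .pgpass file. Postgres expects one entry per line in the form hostname:port:database:username:password, with any literal colon or backslash inside a field escaped by a backslash, and it ignores the file unless its permissions are 0600. The sketch below builds such entries and writes them with the right mode. The helper names, host name, and password are placeholders invented for the example, and the demonstration writes to a scratch file rather than a real ~/.pgpass.

```python
import os
import stat
import tempfile


def pgpass_line(host, port, database, user, password):
    """Build one .pgpass entry: hostname:port:database:username:password."""

    def escape(value):
        # A literal backslash or colon inside a field must be backslash-escaped.
        return str(value).replace("\\", "\\\\").replace(":", "\\:")

    return ":".join(escape(v) for v in (host, port, database, user, password))


def write_pgpass(path, lines):
    """Write the entries and make the file readable by its owner only (0600)."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write("\n".join(lines) + "\n")
    os.chmod(path, 0o600)  # tighten the mode if the file already existed


# Demonstration against a scratch file; host and password are placeholders.
demo_path = os.path.join(tempfile.mkdtemp(), ".pgpass")
write_pgpass(demo_path, [
    pgpass_line("dbhost.example.org", 5432, "data", "production", "not-a-real-password"),
    pgpass_line("dbhost.example.org", 5434, "data_sums", "production", "not-a-real-password"),
])
demo_mode = stat.S_IMODE(os.stat(demo_path).st_mode)
```

One entry per database (DRMS on port 5432, SUMS on port 5434) matches the two-cluster layout used in these instructions.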

Set Up the SUMS database

  1. Although the SUMS data cluster and SUMS database have been already created, you must create certain tables and users in this newly created database.
    1. Create the production user in the SUMS database:

           $JSOCROOT/base/drms/scripts/newdrmsuser.pl data_sums <db server host> 5434 <db production user> <password> <db production user namespace> sys 1

       where <db production user namespace> is the namespace. Please see the NOTE in this link for assistance with choosing a namespace for the production user.
    2. Put the production db user into the sumsadmin group:

           psql -h <db server host> -p 5432 data -U postgres
           postgres=> GRANT sumsadmin TO <db production user>;
    3. Put the production user's password into the .pgpass file. See the Postgres documentation for information on the .pgpass file.
    4. Create the SUMS database tables:

           psql -h <db server host> -p 5434 -U production -f scripts/create_sums_tables.sql data_sums
           ALTER SEQUENCE sum_ds_index_seq START <min val> RESTART <min val> MINVALUE <min val> MAXVALUE <max val>;

       where <min val> is <drms site code> << 48, <max val> is <min val> + 281474976710655 (that is, <min val> + 2^48 - 1), and <drms site code> is the value of the DRMS_SITE_CODE parameter in config.local.
    5. Grant elevated privileges to these tables to the db production user (the scripts should be modified to do this):

           psql -h <db server host> -p 5434 -U postgres data_sums
           data_sums=> GRANT ALL ON sum_tape TO production;
           data_sums=> GRANT ALL ON sum_ds_index_seq,sum_seq TO production;
           data_sums=> GRANT ALL ON sum_file,sum_group,sum_main,sum_open TO production;
           data_sums=> GRANT ALL ON sum_partn_alloc,sum_partn_avail TO production;
    6. SUMS data files are organized into "partitions" which are implemented as directories. Each partition must be named /SUM[0-9]* (e.g., /SUM, /SUM0, /SUM101). Each directory must be owned by the production linux user (e.g., "production"). The file-system group to which the directories belong, the SUMS user group (e.g., SOI), must also contain all DRMS users. So, if linux user art will be using DRMS and running DRMS modules, then art must be a member of the SUMS user group. You are free to create as few or as many of these partitions as you desire. Create these directories now.

       NOTE: Please avoid using file systems that limit the number of directories and/or files. For example, the EXT3 file system limits the number of directories to 64K. That number is far too small for SUMS usage.
    7. Initialize the sum_partn_avail table with the names of these partitions. For each SUMS partition run the following:

           psql -h <db server host> -p 5434 -U postgres data_sums
           data_sums=> INSERT INTO sum_partn_avail (partn_name, total_bytes, avail_bytes, pds_set_num, pds_set_prime) VALUES ('<SUMS partition path>', <avail bytes>, <avail bytes>, 0, 0);

       where <SUMS partition path> is the full path of the partition (the path must be enclosed in single quotes) and <avail bytes> is some number less than the number of bytes in the directory (multiply the number of blocks in the directory by the number of bytes per block). The number does not matter, as long as it is not bigger than the total number of bytes available. SUMS will adjust this number as needed.
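The sequence bounds used in the ALTER SEQUENCE step above follow directly from the site code: the minimum is the site code shifted left by 48 bits, and the maximum is the minimum plus 2^48 - 1. The sketch below works that arithmetic through. The function names and the example site code 4 are illustrative assumptions only; use the DRMS_SITE_CODE from your own config.local.

```python
def sunum_range(site_code):
    """Return (min val, max val) for sum_ds_index_seq given a DRMS site code."""
    if site_code < 0:
        raise ValueError("site code must be non-negative")
    min_val = site_code << 48
    max_val = min_val + (1 << 48) - 1
    if max_val > (1 << 63) - 1:
        # Postgres sequences are 64-bit signed integers.
        raise ValueError("site code too large for a bigint sequence")
    return min_val, max_val


def alter_sequence_sql(site_code):
    """Format the ALTER SEQUENCE statement with the computed bounds."""
    lo, hi = sunum_range(site_code)
    return (
        "ALTER SEQUENCE sum_ds_index_seq "
        f"START {lo} RESTART {lo} MINVALUE {lo} MAXVALUE {hi};"
    )


if __name__ == "__main__":
    print(sunum_range(4))       # hypothetical site code
    print(alter_sequence_sql(4))
```

Because each site's range occupies its own 48-bit block, SUNUMs generated at different sites cannot collide.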

Test your Postgres database installations

  1. Make sure that the production user and at least one other user can log in to both the SUMS and DRMS database instances with psql, using their .pgpass files, without a password prompt.
  2. Do a \dt in both databases and check that you can see tables listed.
  3. Run SELECT * FROM sum_partn_avail; in the SUMS database and make sure that your SUMS partitions are accurately entered.
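When checking the sum_partn_avail entries, it can help to compare the byte counts you entered with what the file system reports. The sketch below uses os.statvfs (number of blocks times bytes per block) to do that on a Linux host. The helper names are made up for this example, and the paths you pass in would be your own SUMS partitions.

```python
import os


def partition_bytes(path):
    """Return (total bytes, bytes available to unprivileged users) for path."""
    st = os.statvfs(path)
    total = st.f_blocks * st.f_frsize   # blocks times bytes per block
    avail = st.f_bavail * st.f_frsize
    return total, avail


def entry_fits(recorded_avail_bytes, path):
    """True if the value recorded in sum_partn_avail does not exceed free space."""
    _, avail = partition_bytes(path)
    return recorded_avail_bytes <= avail


if __name__ == "__main__":
    # Replace "/" with a SUMS partition such as /SUM0 on a real installation.
    total, avail = partition_bytes("/")
    print(f"total={total} avail={avail}")
```

A recorded value larger than what the file system reports is a sign that the partition path or the byte count was mistyped.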

Third Party Software for NetDRMS

You will need the following third party packages and main package libraries installed before compiling or the compilation will not work. Please note that these are examples from some successful installations, but your own machine may already be configured correctly or it may need an entirely different bunch of stuff installed to get to the same place. It's possible that even with the following installed, during your make you may see that you need further packages or libraries.

-- Development and standard package for postgres. Choose your version - this example shows packages for 9.3:
    postgresql93.x86_64
    postgresql93-devel.x86_64
    postgresql93-libs.x86_64
    postgresql93-plperl.x86_64
    postgresql93-plpython.x86_64
    postgresql93-pltcl.x86_64
    postgresql93-server.x86_64

--Perl for scripts: V. 5.10 minimum; you may want development libraries installed. (Note that your OS may be relying on an old version of Perl and installing a new one directly on top of it may cause you strange and unexpected problems; parallel installation may be necessary.)

--Python, version 2.7 or higher (Note that some CentOS versions expect a lower version of Python for their own purposes, and installing directly on top of the existing Python may cause unexpected problems):
    python33-python.x86_64
    python33-python-devel.x86_64
    python33-python-libs.x86_64

--Cfitsio development and standard packages:
    cfitsio.x86_64
    cfitsio-devel.x86_64

--OpenSSL, LibSSH2

--A compiler, choose either icc or gcc - you don't have to install these specific packages, these are only guides:
    gcc.x86_64
    libgcc.x86_64

--Development package and headers for C (gcc examples given here):
    glibc-devel.x86_64
    glibc-headers.x86_64

--Some compression stuff:
    zlib.x86_64
    zlib-devel.x86_64

--If you're going to be communicating regularly with the JSOC for replicated data, you may also need:
    openssh.x86_64
    openssh-clients.x86_64
    openssh-server.x86_64
    openssl.x86_64
    openssl-devel.x86_64

--To build hpn-ssh for regular file exchange with JSOC:
    See instructions on http://www.psc.edu/index.php/hpn-ssh , which will first instruct you to get the OpenSSH source code from OpenSSH.org. You will also need to install the "patch" package if it's not on your machine already, to put your hpn-ssh code together.

--If you're installing the JMD, you'll need Java installed along with its development library and the tools in tar.x86_64

Configure NetDRMS and SUMS

The configuration and compilation of NetDRMS described here can proceed largely independently of the site and/or user setup, which only needs to be done once. It is recommended that the site setup be done first, as the NetDRMS build requires the definition of certain site-dependent names, such as those of the database and server; however, if these names are already known, the libraries can be built without the database and SUMS storage in place. Any code that requires access to the database will not of course function until the DRMS and SUMS services have been set up.

These instructions assume that there is already a Postgres database server and associated SUMS server that you can connect to. If that is not the case, then you or someone else at your site will first have to do a Site Installation (above). You must also have the PostgreSQL Core (Server) installed at least as a client library on any machine on which you intend to build the package. You should have psql and pg_ctl in your path.
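A quick way to confirm that psql and pg_ctl are on your path before starting the build is sketched below. The function name missing_tools is an invention for this example; the check itself relies only on Python's standard shutil.which.

```python
import shutil


def missing_tools(names=("psql", "pg_ctl")):
    """Return the names that cannot be found on the current PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("psql and pg_ctl are both available.")
```

If either tool is reported missing, add the Postgres bin directory to your path or install the client packages before going further.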

If you have not already done so, download the NetDRMS Distribution. This is a gzipped tarfile. Unpack it into a target root directory of your choice, e.g. /usr/local/drms, /opt/netdrms/ or $HOME/drms. In the target root directory (hereinafter referred to as $DRMS), now is the time you must supply a config.local file describing your site configuration using the config.local.template to start. If V 2.7 or higher has been installed by your site administrator, you should simply copy or link to their version of the file.

If you had not previously installed a V 2.7 release or higher, you should create the config.local file fresh. You can do so either by copying one from the file config.local.template and editing it to supply the appropriate values, or by running the perl script netdrms_setup.pl which will walk you through the fields. (That script has not been widely tested, and might require some tweaking. In particular it tries to execute some additional scripts at the end that are not yet in the release.)

Most of the entries in the file should be self-explanatory. It is essential that the first variable, LOCAL_CONFIG_SET be changed from NO or commented out. Other variables that are almost certain to require changes are DBSERVER_HOST, DRMS_DATABASE, SUMS_SERVER_HOST, and DRMS_SITE_CODE. If you intend to export as well as import data, your DRMS_SITE_CODE must be registered. See the site code page for a list of currently assigned codes.

However you create your config.local file, as previously stated, it is a good idea to save a copy in a directory outside your $DRMS directory; the SUMS_LOG_BASEDIR would be a good place to keep it if you are the SUMS_MANAGER. Other users' config.local files should match that of the SUMS_MANAGER in any case.
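As a quick sanity check before building, the required settings named above can be verified with a short script. This is a minimal sketch, not part of the NetDRMS distribution: it assumes that config.local holds one whitespace-separated `NAME value` pair per line and that `#` starts a comment, and the function names are hypothetical.

```python
# Variables the text above says almost always need site-specific values.
REQUIRED = ("DBSERVER_HOST", "DRMS_DATABASE", "SUMS_SERVER_HOST", "DRMS_SITE_CODE")


def parse_config(text):
    """Parse 'NAME value' lines into a dict (assumed file format)."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        parts = line.split(None, 1)
        settings[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return settings


def config_problems(text):
    """Return a list of human-readable problems found in the config text."""
    settings = parse_config(text)
    problems = []
    # LOCAL_CONFIG_SET must be changed from NO, or commented out entirely.
    if settings.get("LOCAL_CONFIG_SET", "").upper() == "NO":
        problems.append("LOCAL_CONFIG_SET is still NO")
    for name in REQUIRED:
        if not settings.get(name):
            problems.append("missing value for " + name)
    return problems
```

Calling `config_problems(open("config.local").read())` from the $DRMS directory returns an empty list when none of these basic problems is present.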

Compile NetDRMS

In the target root directory $DRMS, run

./configure

This simply builds a set of links for include files, man pages, scripts, and jsd (JSOC Series Descriptor) files in common subdirectories below the $DRMS root. Note that it is a csh script. If you do not have csh or tcsh installed on your system, you will have to make those links yourself. (Chances are that you will have to perform the whole site configuration by hand.)

The NetDRMS distribution is currently supported for three target architectures under Linux, named (by default): linux_ia32 (uname -s = Linux, uname -m = ia32 | i686 | i386), linux_x86_64 (uname -s = Linux, uname -m = x86_64), and linux_avx. The distribution has been built on both Enterprise Linux versions 4 and 5. Enterprise Linux 5 has a system bug that needs to be fixed in order to build the SUMS server (it does not affect the DRMS client). See platform notes for instructions on how to fix this bug.

If you are building on any other architecture, the target name will be custom. Binaries and libraries will be placed in appropriate subdirectories based on these names. If you will be building on multiple architectures, or if you wish to change the target architecture name, you should either add the following line near the beginning of the file $DRMS/make_basic.mk

JSOC_MACHINE = name

or set your environment variable JSOC_MACHINE to name before running the make. The latter is recommended for future use, so that you can set appropriate paths in your login or shell initialization scripts. If necessary, edit the file $DRMS/make_basic.mk to set your compiler options. The default compilers for Linux are the Intel compiler icc and ifort if available; otherwise gcc and gfortran. If you prefer to use different compilers, change the following two lines in the file accordingly:

COMPILER = icc
FCOMPILER = ifort
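Before editing these lines it can help to see which compilers are actually on your PATH. The following is a minimal sketch of the documented default (Intel compilers if available, otherwise GNU). It assumes that both icc and ifort must be present for the Intel pair to be chosen; the real selection logic lives in make_basic.mk and may differ. The function name is hypothetical.

```python
import shutil


def default_compilers(which=shutil.which):
    """Return (C compiler, Fortran compiler) following the documented default.

    'which' is injectable so the logic can be exercised without the
    compilers being installed.
    """
    # Assumption: the Intel pair is used only when both tools are found.
    if which("icc") and which("ifort"):
        return ("icc", "ifort")
    return ("gcc", "gfortran")
```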

Note that the DRMS Fortran API requires a Fortran 90 compiler. The Fortran compiler is only required if you wish to build Fortran modules that will link against the DRMS library; nothing in the DRMS and SUMS internals and applications uses Fortran. Besides ifort, the gfortran43 compiler should work; there may be a problem with f95.

Note that you can only build on a system on which the PostgreSQL Client Applications libraries exist (e.g. libecpg.a). You will also require the OpenSSL secure sockets toolkit; you should have a /usr/include/openssl directory or equivalent on your system where the compiler can locate it by default.

N.B. If you are using the icc compiler, it is recommended to use version 11. There are some very nasty bugs in version 10.*.

In the root directory $DRMS, type make. If all goes well, the directory $DRMS/bin/arch_name will be created and filled, likewise the library directory $DRMS/lib/arch_name. If you are building on multiple architectures, repeat this step on each one, being careful to observe the rules described above for naming the target architecture and choosing compilers.

These instructions should suffice for all users except the manager who needs to initialize the database and/or start the SUMS server. If you do not need to start a SUMS server, you are done. The SUMS manager (production user) should continue with the next step.
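The target-name rules and output directories described in this section can be sketched as follows. This is an illustration only, not the logic of the actual build scripts. The function names are hypothetical, and because this page does not say how linux_avx is selected, the sketch returns it only when JSOC_MACHINE is set explicitly.

```python
import os
import platform


def target_name(system, machine, env=None):
    """Map uname -s / uname -m values to a NetDRMS target name."""
    env = os.environ if env is None else env
    # An explicit JSOC_MACHINE setting overrides the default name.
    override = env.get("JSOC_MACHINE")
    if override:
        return override
    if system == "Linux":
        if machine in ("ia32", "i686", "i386"):
            return "linux_ia32"
        if machine == "x86_64":
            return "linux_x86_64"
    # Any other architecture gets the target name "custom".
    return "custom"


def build_dirs(drms_root, name):
    """Return the (bin, lib) directories that make should create and fill."""
    return (os.path.join(drms_root, "bin", name),
            os.path.join(drms_root, "lib", name))
```

On the build machine, `target_name(platform.system(), platform.machine())` gives the expected name, and `build_dirs` gives the two directories to inspect after make finishes.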

There are two parts to setting up NetDRMS. First, the necessary services must be set up at the institution or group that will be hosting the NetDRMS service. The basic preparation and installation only needs to be done once, although the actual software distribution may be updated from time to time without affecting the setup. Second, individual users may wish to set up the NetDRMS software distribution for use or development in their own environment. Again, there are a few administrative tasks that need to be performed once when a user is registered, but the software may be updated or rebuilt at any time. Once the site preparation and setup is complete, user setup is a simple task, so there are two sets of instructions. Most users only need to concern themselves with the second, Installing / Upgrading NetDRMS.

Test your NetDRMS Installation

  1. Put the binaries made in $DRMS/bin/arch_name into your $PATH variable. Test that this was successful by running which show_info. You should get back the path to the binary that was just created.
  2. Make sure Postgres is still up by grepping for it in your active process list.
  3. Try to run a show_info command. See more on this at: http://jsoc.stanford.edu/doxygen_html/group__show__info.html If show_info runs successfully for the production user and for another, lower-level user, you've been successful.
  4. If you cannot make NetDRMS work, stop and get help. Do not proceed until you've gotten NetDRMS basically working.
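The first check in the list above (that which show_info resolves to the freshly built binary) can be automated. This is a minimal sketch with a hypothetical function name; it assumes only that the build placed show_info somewhere under $DRMS/bin.

```python
import os
import shutil


def show_info_from_build(drms_root, which=shutil.which):
    """Return True if the show_info found on PATH lives under <drms_root>/bin."""
    found = which("show_info")
    if found is None:
        return False  # not on PATH at all
    bin_root = os.path.realpath(os.path.join(drms_root, "bin"))
    # Resolve symlinks so a linked $DRMS directory still compares equal.
    return os.path.realpath(found).startswith(bin_root + os.sep)
```

A False result means either that $PATH was not updated or that a different show_info, for example from an older installation, is being picked up first.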

Compiling SUMS

To make the SUMS server available, the SUMS manager (only) needs to run make sums in the $DRMS root directory, as described in the steps below. This only needs to be done once for the system; individual users do not need to do it. At this point, if you are the SUMS manager, you are ready to proceed with the configuration, build and start of SUMS services. Proceed to the SUMS setup instructions. Please note that you will likely see many pages of warning messages as NetDRMS and SUMS compile. Unless you see an error, ignore them and proceed.

  1. Build the SUMS binaries:

     > su - <production user>
     > cd $JSOCROOT
     > ./configure
     > make sums

  2. Copy the sum_chmown program to <path to sum_chmown> (chosen in step 1a. above), make the production user the owner, and give it setuid privileges:

     > su - root
     > cp $JSOCROOT/drms/_linux_x86_64/base/sums/apps/sum_chmown <path to sum_chmown>
     > chown root:root <path to sum_chmown>
     > chmod u+s <path to sum_chmown>

     Note: some sites have replaced this program with one that does nothing when called. These sites have only one user that writes files to SUMS, however, and need not be concerned about different users with different permissions writing files to SUMS. If you have multiple users writing files to SUMS, you'll need sum_chmown.

Starting, Stopping and Testing SUMS

  1. Start SUMS:

     > $JSOCROOT/base/sums/scripts/sum_start.NetDRMS

     The script does not return a prompt after echoing "sum_svc now available". Just hit RETURN.
  2. To stop SUMS for any reason, run this script:

     > $JSOCROOT/base/sums/scripts/sum_stop.NetDRMS

  3. If both of these commands work and you find many sum_ processes in your list of active processes, you've been successful.
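The last check above (looking for sum_ processes in the process list) can be scripted. This is a minimal sketch with hypothetical function names; it assumes a Linux ps that accepts -eo comm, and a Python 3.7 or later interpreter for subprocess.run with capture_output.

```python
import subprocess


def sums_processes(ps_output):
    """Pick the command names starting with 'sum_' out of ps output."""
    names = [line.strip() for line in ps_output.splitlines()]
    return [name for name in names if name.startswith("sum_")]


def running_sums_processes():
    """List sum_ processes currently running on this host."""
    result = subprocess.run(["ps", "-eo", "comm"],
                            capture_output=True, text=True, check=True)
    return sums_processes(result.stdout)
```

After sum_start.NetDRMS the returned list should be non-empty, and after sum_stop.NetDRMS it should be empty.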

Deciding what's next

You may wish to run a JMD or use Remote SUMS. The decision should be discussed with JSOC personnel. Once you've made this decision and installed the appropriate software (see below for Remote SUMS), you'll need to populate your DRMS database with data. For this, you'll need to be a recipient of Slony subscription data. We recommend contacting the JSOC directly to become a subscriber.

Remote SUMS

A local NetDRMS may contain data produced by other, non-local NetDRMSs. Via a variety of means, the local NetDRMS can obtain and ingest the database information for these data series produced non-locally. In order to use the associated data files (typically image files), the local NetDRMS must download the storage units (SUs) associated with these data series too. There are currently two methods to facilitate these SU downloads. The Java Mirroring Daemon (JMD) is a tool that can be installed and configured to download SUs automatically as the series data records are ingested into the local NetDRMS. It fetches these SUs before they are actually used. It can obtain the SUs from any other NetDRMS that has the SUs, not just the NetDRMS that originally produced them. Remote SUMS is a built-in tool that comes with NetDRMS. It downloads SUs as needed - i.e., if a module or program requests the path to the SU or attempts to read it, and it is not present in the local SUMS yet, Remote SUMS will download the SUs. While the SUs are being downloaded, the initiating module or program will poll waiting for the download to complete.

Several components compose Remote SUMS. On the client side (the local NetDRMS), a daemon, rsumsd.py, must be running. There also must exist some database tables, as well as some binaries used by the daemon. On the server side (every NetDRMS site that wishes to act as a source of SUs for the client), there is a CGI, rs.sh. This CGI returns file-server information (hostname, port, user, SU paths, etc.) for the SUs the server has available in response to requests that contain a list of SUNUMs. When the client encounters requests for remote SUs that are not contained in the local SUMS, it asks the daemon to download those SUs. The client code then polls, waiting for the request to be serviced. The daemon in turn sends requests to the rs.sh CGIs at all the relevant providing sites. The owning sites return the file-server information to the daemon, and then the daemon downloads the SUs the client has requested, via scp, and notifies the client module once the SUs are available for use. The client module will then exit from its polling code and continue, using the freshly downloaded SUs.
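The client-side polling described above can be pictured with a small loop. This is an illustrative sketch only and not the DRMS implementation: the status strings ("pending", "complete", "error"), the function name, and the get_status callback (which stands in for reading the request's record from the requests table) are all hypothetical.

```python
import time


def wait_for_download(get_status, timeout_s, interval_s=1.0,
                      sleep=time.sleep, clock=time.monotonic):
    """Poll get_status() until the request finishes or timeout_s elapses."""
    deadline = clock() + timeout_s
    while True:
        status = get_status()
        if status in ("complete", "error"):
            return status  # the daemon has serviced the request
        if clock() >= deadline:
            return "timeout"  # give up; the daemon never answered in time
        sleep(interval_s)  # wait before checking the request again
```

The sleep and clock parameters are injectable only so that the loop can be exercised without real delays.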

To use Remote SUMS, the config.local configuration file must first be configured properly, and NetDRMS must be re-built. Here are the relevant config.local parameters:

  • JMD_IS_INSTALLED - This must be set to 0 for Remote SUMS use. Currently, either the JMD or the Remote SUMS features can be used, but not both at the same time.
  • RS_REQUEST_TABLE - This is the database table used by the local module and the rsumsd.py daemon running at the local site for communicating SU-download requests. Upon encountering a non-native SUNUM, DRMS will insert a new record into this table to initiate a request for the SUNUM from the owning NetDRMS. The Remote SUMS daemon will service the request and update this record with results.
  • RS_SU_TABLE - This is the database table used by the Remote SUMS daemon to track SUs downloaded from the providing sites.
  • RS_DBHOST - This is the local database-server host that contains the database that contains the requests and SU tables.
  • RS_DBNAME - This is the database on the host that contains the requests and SU tables.
  • RS_DBPORT - This is the port on which the local database-server host accepts connections.
  • RS_DBUSER - This is the database user account that the Remote SUMS daemon uses to manage the Remote SUMS requests.
  • RS_LOCKFILE - This is the path to a file that ensures that only one Remote SUMS daemon instance runs.
  • RS_LOGDIR - This is the directory into which the Remote SUMS daemon logs are written.
  • RS_REQTIMEOUT - This is the timeout, in minutes, for a new SU request to be accepted for processing by the daemon. If the daemon encounters a request older than this value, it will reject the new request.
  • RS_DLTIMEOUT - This is the timeout, in minutes, for an SU to download. If the time the download takes exceeds this value, then all requests waiting for the SU to download will fail.
  • RS_MAXTHREADS - The maximum number of download threads that the Remote SUMS daemon is permitted to run simultaneously. One thread is one scp call.
  • RS_BINPATH - The NetDRMS-binary-path that contains the external programs needed by the Remote SUMS daemon (jsoc_fetch, vso_sum_alloc, vso_sum_put).
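The parameter list above lends itself to a quick pre-flight check before rebuilding. This is a minimal sketch, assuming the settings have already been read into a dict of strings. The function names are hypothetical, and treating RS_DBPORT and RS_MAXTHREADS as whole numbers is an assumption based on their descriptions. The second function illustrates the RS_REQTIMEOUT rule that requests older than the timeout are rejected.

```python
from datetime import datetime, timedelta

RS_PARAMS = ("RS_REQUEST_TABLE", "RS_SU_TABLE", "RS_DBHOST", "RS_DBNAME",
             "RS_DBPORT", "RS_DBUSER", "RS_LOCKFILE", "RS_LOGDIR",
             "RS_REQTIMEOUT", "RS_DLTIMEOUT", "RS_MAXTHREADS", "RS_BINPATH")


def remote_sums_problems(settings):
    """Return a list of problems with the Remote SUMS settings."""
    problems = []
    # The JMD and Remote SUMS cannot be used at the same time.
    if str(settings.get("JMD_IS_INSTALLED", "")).strip() != "0":
        problems.append("JMD_IS_INSTALLED must be 0")
    for name in RS_PARAMS:
        if not str(settings.get(name, "")).strip():
            problems.append(name + " is not set")
    for name in ("RS_DBPORT", "RS_MAXTHREADS"):
        value = str(settings.get(name, "")).strip()
        if value and not value.isdigit():
            problems.append(name + " should be a whole number")
    return problems


def request_too_old(created, now, timeout_minutes):
    """True if a request is older than RS_REQTIMEOUT minutes."""
    return now - created > timedelta(minutes=timeout_minutes)
```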

After setting up config.local, you must build or re-build NetDRMS:

> cd $JSOCROOT
> ./configure
> make

It is important to ensure that three binaries needed by the Remote SUMS daemon have been built: jsoc_fetch, vso_sum_alloc, vso_sum_put.

Ensure that Python >= 2.7 is installed. You will need to install some packages if they are not already installed: psycopg2, ...
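Both prerequisites can be checked from Python itself. This is a minimal sketch with hypothetical function names. It uses importlib.util, so the check has to be run under Python 3, and it tests only psycopg2 by default because that is the only package named on this page.

```python
import importlib.util
import sys


def python_is_new_enough(version=sys.version_info, minimum=(2, 7)):
    """True if the interpreter version meets the documented minimum."""
    return tuple(version[:2]) >= minimum


def missing_packages(names=("psycopg2",)):
    """Return the packages from 'names' that cannot be imported."""
    return [name for name in names
            if importlib.util.find_spec(name) is None]
```

An empty list from `missing_packages()` means psycopg2 is importable by the interpreter that ran the check, which should be the same interpreter that will run rsumsd.py.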

An output log named rslog_YYYYMMDD.txt will be written to the directory identified by the RS_LOGDIR config.local parameter, so make sure that directory exists.
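The log-file naming can be reproduced in a few lines, which is handy for finding the current day's log or for creating the directory ahead of time. This is a minimal sketch with hypothetical function names; it assumes the date in the file name is the local calendar date.

```python
import os
from datetime import date


def rs_log_path(log_dir, day=None):
    """Return the path of the rslog_YYYYMMDD.txt file for a given day."""
    day = date.today() if day is None else day
    return os.path.join(log_dir, "rslog_" + day.strftime("%Y%m%d") + ".txt")


def ensure_log_dir(log_dir):
    """Create the RS_LOGDIR directory if it does not exist yet."""
    os.makedirs(log_dir, exist_ok=True)
    return log_dir
```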

Give your public SSH key to all providing NetDRMS sites. They will need to put that key in their authorized_keys file.

Create the client-side Remote SUMS database tables. Run:

> $JSOCROOT/base/drms/scripts/rscreatetabs.py op=create tabs=req,su

Start the rsumsd.py daemon as the user specified by the RS_DBUSER config.local parameter. As this user, first start an ssh-agent process and add your private key to it:

> ssh-agent -c > $HOME/.ssh-agent_rs
> source $HOME/.ssh-agent_rs
> ssh-add $HOME/.ssh/id_rsa

This will allow you to use a public-private key pair that is protected by a passphrase, without having to enter that passphrase manually each time the Remote SUMS daemon runs scp.

Start SUMS:

> $JSOCROOT/base/sums/scripts/sum_start.NetDRMS >& <log dir>/sumsStart.log

Substitute your favorite log directory for <log dir>. There is another daemon, sums_procck.py, that keeps SUMS up and running once it is started. Redirecting to a log will preserve important information that this daemon prints. To stop SUMS, use $JSOCROOT/base/sums/scripts/sum_stop.NetDRMS.

Start the Remote SUMS daemon:

> $JSOCROOT/base/drms/scripts/rsumsd.py