|   | 1 | = Minimum Information for Solar Observations =  | 
                  
                          |   | 2 |   | 
                  
                          |   | 3 | == draft v3.1 ==  | 
                  
                          |   | 4 |   | 
                  
                          |   | 5 | After months of trying to come up with various recommendations for 'best practices', this document is an attempt at a 'Minimum Information' standard in the spirit of the bioinformatics community's [http://mibbi.sourceforge.net/about.shtml MIBBI project].  | 
                  
                          |   | 6 |   | 
                  
                          |   | 7 | If you have any comments, please contact [People/JoeHourcle Joe Hourclé].  | 
                  
                          |   | 8 |   | 
                  
                          |   | 9 | ----  | 
                  
                          |   | 10 |   | 
                  
                          |   | 11 | = Documenting Solar Data =  | 
                  
                          |   | 12 |   | 
                  
                          |   | 13 | Goal:  | 
                  
                          |   | 14 |   | 
                  
                          |   | 15 | * Ensure that current and future researchers can use your data.  | 
                  
                          |   | 16 | * Ensure that researchers will properly acknowledge your data.  | 
                  
                          |   | 17 | * Reduce the amount of time needed to support researchers.  | 
                  
                          |   | 18 | * Reduce the likelihood of data or metadata being misunderstood.  | 
                  
                          |   | 19 | * Reduce the chance of improper use of the data.  | 
                  
                          |   | 20 | * Reduce the amount of effort needed to use the data.  | 
                  
                          |   | 21 |   | 
                  
                          |   | 22 | Well-thought-out documentation, organization, file naming and metadata (FITS headers) will make a real difference toward these goals.  | 
                  
                          |   | 23 |   | 
                  
                          |   | 24 | Both solar physicists and non-discipline scientists should be able to easily understand what is in a data file from an instrument that they have never dealt with before, and quickly determine if it is useful for their purposes.  | 
                  
                          |   | 25 |   | 
                  
                          |   | 26 | The following questions should be answered by the documentation or the data files themselves.  Where possible, individual files should link to where additional documentation can be found.  Active missions should review this information annually, or whenever there are significant updates.  | 
                  
                          |   | 27 |   | 
                  
                          |   | 28 | NOTE : Because URLs to documentation may change over time, the SDAC is looking into providing stable URLs (under !http://data.virtualsolar.org/... or !http://solardata.org/...) that would collect up relevant information and links that may change.  We're also looking into registering DOIs for these documents.  See [http://vso1.nascom.nasa.gov/spd2012/2012_SPD_citation.pdf Recommendations for Data & Software Citation in Solar Physics] (2012 SPD poster) and  [/wiki/Citation Guidelines / Recommendations for Citing Data] for more information.  | 
                  
                          |   | 29 |   | 
                  
                          |   | 30 | -----  | 
                  
                          |   | 31 |   | 
                  
                          |   | 32 | == The Overall Collection (High Level) ==  | 
                  
                          |   | 33 |   | 
                  
                          |   | 34 | * What is the name of the experiment?  | 
                  
                          |   | 35 | * Who ran the experiment?  | 
                  
                          |   | 36 |         * (organization/institution, PIs)  | 
                  
                          |   | 37 | * If a researcher has questions:  | 
                  
                          |   | 38 |         * Where can they get documentation?  | 
                  
                          |   | 39 |                 * (website; published papers)  | 
                  
                          |   | 40 |         * How can they get help or report possible problems?  | 
                  
                          |   | 41 |                 * (website w/ contact info or a generic email like 'instrument@...' )  | 
                  
                          |   | 42 | * How should the experiment be acknowledged in published research?  | 
                  
                          |   | 43 | * What was the goal of the experiment?  | 
                  
                          |   | 44 | * What instruments were used to perform the experiment?  | 
                  
                          |   | 45 |         * (names, acronyms/abbreviations)  | 
                  
                          |   | 46 |         * Where were they?  | 
                  
                          |   | 47 |                 * (spacecraft or observatory name, general location (eg, 'near L1', 'near Earth', 'Tenerife, Canary Islands'))  | 
                  
                          |   | 48 | * When did the experiment run?  | 
                  
                          |   | 49 | * What type of observations were collected?  | 
                  
                          |   | 50 | * What type of derived products are available?  | 
                  
                          |   | 51 |         * (eg, 'white light coronagraph images', 'EUV images', 'X-ray spectroscopy', 'daily plots', 'Carrington maps')  | 
                  
                          |   | 52 | * Are there caveats or other warnings for potential users of the data?  | 
                  
                          |   | 53 |         * (eg, issues with the collection process that might make the data unsuitable for specific uses; known environmental conditions that introduce error during certain periods; known biases introduced in the calibration or other processing; known misleading metadata (eg, clock drift); any other potential sources of error)  | 
                  
                          |   | 54 | * Is there software in SolarSoft to use the data?  | 
                  
                          |   | 55 |         * If so, where can we get documentation on using it?  | 
                  
                          |   | 56 | * Is there any other recommended software to use the data?  | 
                  
                          |   | 57 |   | 
                  
                          |   | 58 | -----  | 
                  
                          |   | 59 |   | 
                  
                          |   | 60 | == Dataset Details (Mid Level) ==  | 
                  
                          |   | 61 |   | 
                  
                          |   | 62 | * What different datasets are in the overall collection?  | 
                  
                          |   | 63 |         * ... different sensors / detectors / cameras  | 
                  
                          |   | 64 |         * ... different observing modes (filters, polarization, cadences, exposure times)  | 
                  
                          |   | 65 |         * ... different processed forms  | 
                  
                          |   | 66 |   | 
                  
                          |   | 67 |         (for background, see [http://dx.doi.org/10.2218/ijdc.v6i1.183 Wynholds, "Linking to Scientific Data: Identity Problems of Unruly and Poorly Bounded Digital Objects"])  | 
                  
                          |   | 68 |   | 
                  
                          |   | 69 |         For each specific dataset:  | 
                  
                          |   | 70 |                 * Is there a name or title to distinguish it from the other available datasets?  | 
                  
                          |   | 71 |                 * What type of data is it?  | 
                  
                          |   | 72 |                         * (eg, intensity, magnetic field, temperature)  | 
                  
                          |   | 73 |                 * What are the defining characteristics of the dataset? (eg, level of processing, detector used, calibration version, observing mode (filters, exposure time, cadence, etc.))  | 
                  
                          |   | 74 |                 * What is the purpose / intended use of this specific dataset?  | 
                  
                          |   | 75 |                         * (why was the dataset created?)  | 
                  
                          |   | 76 |                 * Is there a contact for this dataset different from the larger collection?  | 
                  
                          |   | 77 |                 * Should it be cited or acknowledged differently from the rest of the collection?  | 
                  
                          |   | 78 |                 * Are there specific or additional caveats?  | 
                  
                          |   | 79 |                         * (eg, DATE_OBS is a coordinated time, not the spacecraft time)   | 
                  
                          |   | 80 |                 * Is this final data, or is there a chance it will be revised?  | 
                  
                          |   | 81 |                 * Is this quicklook data or otherwise unsuited for science use?  | 
                  
                          |   | 82 |                 * How has the dataset been processed?  | 
                  
                          |   | 83 |                         * (flat fielded, limb darkened, reduced, correction for point-spread, compressed, etc.)  | 
                  
                          |   | 84 |                 * Is the calibration reversible?  | 
                  
                          |   | 85 |                 * What dataset is this derived from (or is it the lowest level available)?  | 
                  
                          |   | 86 |                 * What time reference are you using?  | 
                  
                          |   | 87 |                         * (UTC, GPS, UNIX, TAI, spacecraft clock)  | 
                  
                          |   | 88 |                 * What is the volume of the dataset?  | 
                  
                          |   | 89 |                         * Total number of images or data records?  | 
                  
                          |   | 90 |                         * Overall volume on disk (in GB or TB)?  | 
                  
                          |   | 91 |                         * (if still in planning stages, how quickly is it expected to grow?)  | 
                  
                          |   | 92 |   | 
                  
                          |   | 93 | * Which datasets are considered to be 'level0'?  | 
                  
                          |   | 94 |         * (or the lowest level available on the ground)  | 
                  
                          |   | 95 | * How are the different datasets related?  | 
                  
                          |   | 96 |         * (eg, calibrated version of ..., reduced form of ..., repackaging of ..., etc.)  | 
                  
                          |   | 97 | * How is the data organized?  | 
                  
                          |   | 98 |         * What does each file represent?  | 
                  
                          |   | 99 |         * How are the filenames constructed?  | 
                  
                          |   | 100 |         * Are the filenames unique, or is directory location significant?  | 
                  
                          |   | 101 |         * How are the directories structured?  | 
                  
                          |   | 102 |                 * (eg, 'year/month/day/instrument' vs. 'instrument/year/...')  | 
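As an illustration of the filename and directory questions above, here is a minimal Python sketch of one possible convention. The instrument name `xyz`, the level tag `lev1` and the `archive_path` helper are all hypothetical. The sketch only shows that a filename which repeats the instrument, processing level and observation time stays unique even after it is copied out of its 'year/month/day/instrument' directory.

```python
from datetime import datetime, timezone
from pathlib import PurePosixPath


def archive_path(instrument: str, level: str, obs_time: datetime) -> PurePosixPath:
    """Build 'year/month/day/instrument/<filename>' for one observation.

    The layout and the filename pattern are hypothetical examples only.
    """
    if obs_time.tzinfo is None:
        # Refuse naive timestamps: the time reference must be stated explicitly.
        raise ValueError("obs_time must be timezone-aware")
    utc = obs_time.astimezone(timezone.utc)
    # The filename carries instrument, level and time, so it does not
    # depend on the directory it sits in to be unique.
    filename = f"{instrument}_{level}_{utc:%Y%m%dT%H%M%S}Z.fits"
    return PurePosixPath(f"{utc:%Y}", f"{utc:%m}", f"{utc:%d}", instrument, filename)


example = archive_path(
    "xyz", "lev1", datetime(2012, 6, 11, 14, 30, 5, tzinfo=timezone.utc)
)
print(example)  # 2012/06/11/xyz/xyz_lev1_20120611T143005Z.fits
```

Whatever convention is actually used, the documentation should spell out each field of the filename and state whether the directory location carries information that the filename does not.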
                  
                          |   | 103 |   | 
                  
                          |   | 104 | -----  | 
                  
                          |   | 105 |   | 
                  
                          |   | 106 | == File & Observation Specific Details (Low Level) ==  | 
                  
                          |   | 107 |   | 
                  
                          |   | 108 | NOTE : FITS allows for all of this metadata to be included in each individual file, and where possible, this is the recommended practice.   If reprocessing the files would be a burden, or if the data is in another file format, the information might instead be supplied in tarballs or [http://tools.ietf.org/html/draft-kunze-bagit-09 BagIt archives] which include README and checksum files.  There may also be a catalog of the data in FITS tables, CSV or some other format, that includes this information.  | 
                  
                          |   | 109 |   | 
                  
                          |   | 110 | * Is the file in a self-describing scientific format?  | 
                  
                          |   | 111 | * Does it mention how to uniquely reference this file or observation to report problems or check for an updated calibration?  | 
                  
                          |   | 112 | * Does it mention which specific dataset it's a member of?  | 
                  
                          |   | 113 | * Does it provide a URL or other reference to the documentation?  | 
                  
                          |   | 114 |   | 
                  
                          |   | 115 | * Have you provided:  | 
                  
                          |   | 116 |         * The time of the observation?  | 
                  
                          |   | 117 |         * The duration (exposure) of the observation?  | 
                  
                          |   | 118 |         * The location of the detector?  | 
                  
                          |   | 119 |         * The pointing of the detector (if appropriate)?  | 
                  
                          |   | 120 |         * Any other details of the observing mode that may vary between observations?  | 
                  
                          |   | 121 |           | 
                  
                          |   | 122 |         * A checksum to verify file integrity?  | 
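For the checksum question above, the following Python sketch shows one way to compute and verify a file checksum using only the standard library. The choice of SHA-256, the helper names and the '<checksum>  <path>' line layout (similar to the manifest files that accompany BagIt archives) are assumptions made for illustration, not a required format.

```python
import hashlib
from pathlib import Path


def sha256_of_stream(stream, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a binary stream, read in chunks."""
    digest = hashlib.sha256()
    # Read in fixed-size chunks so large data files never sit fully in memory.
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def manifest_line(path):
    """Return '<checksum>  <path>' for one file (hypothetical manifest layout)."""
    path = Path(path)
    with path.open("rb") as handle:
        return f"{sha256_of_stream(handle)}  {path.as_posix()}"


def verify(path, expected_checksum):
    """Return True if the file on disk still matches the published checksum."""
    with Path(path).open("rb") as handle:
        return sha256_of_stream(handle) == expected_checksum.lower()
```

A data provider could publish one `manifest_line` per file alongside the data, and a user could call `verify` after download to confirm that the file arrived intact.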
                  
                          |   | 123 |   | 
                  
                          |   | 124 | -----  | 
                  
                          |   | 125 |   | 
                  
                          |   | 126 | == FITS-Specific Issues ==  | 
                  
                          |   | 127 |   | 
                  
                          |   | 128 | * Does the file clearly state that it's a FITS file?  | 
                  
                          |   | 129 |         * Is there a reference to the FITS standard, or a link to the FITS website?  | 
                  
                          |   | 130 |   | 
                  
                          |   | 131 | * Do the headers include the assigned filename (FILENAME)?  | 
                  
                          |   | 132 | * Do the headers include units in the comments?  | 
                  
                          |   | 133 | * Do the headers spell out abbreviations or other coded values?  | 
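To make the header questions above concrete, here is a simplified Python sketch that formats 80-character header cards with the unit in square brackets at the start of the comment, which is the convention suggested by the FITS standard. The `fits_card` helper, the `OBS_MODE` keyword and all example values are hypothetical, and the formatter ignores many details of the standard (long strings, non-finite numbers, complex values). Real files should be written with an established FITS library.

```python
import re


def fits_card(keyword, value, comment="", unit=""):
    """Format one 80-character FITS header card (simplified sketch)."""
    if not re.fullmatch(r"[A-Z0-9_-]{1,8}", keyword):
        raise ValueError("keyword must be 1-8 characters from A-Z, 0-9, '-', '_'")
    if isinstance(value, bool):
        # Logical values are a single T or F, right-justified in the value field.
        field = f"{'T' if value else 'F':>20}"
    elif isinstance(value, (int, float)):
        # Upper-case so any exponent is written with 'E'.
        field = f"{repr(value).upper():>20}"
    else:
        # Strings are quoted, with embedded quotes doubled, padded to 8 characters.
        text = str(value).replace("'", "''")
        field = f"{chr(39) + format(text, '<8') + chr(39):<20}"
    if unit:
        comment = f"[{unit}] {comment}".strip()
    card = f"{keyword:<8}= {field}"
    if comment:
        card += f" / {comment}"
    if len(card) > 80:
        raise ValueError("card exceeds 80 characters")
    return f"{card:<80}"


cards = [
    fits_card("FILENAME", "xyz_lev1_20120611T143005Z.fits", "assigned file name"),
    fits_card("EXPTIME", 2.5, "exposure duration", unit="s"),
    fits_card("OBS_MODE", "SYN", "SYN = synoptic program"),
]
for card in cards:
    print(card.rstrip())
```

The three example cards cover the checklist items in turn: the assigned filename, a unit stated in the comment, and a coded value spelled out for readers who do not know the instrument.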
                  
                          |   | 134 |   | 
                  
                          |   | 135 | ----  | 
                  
                          |   | 136 |   | 
                  
                          |   | 137 | == References, Recommended Reading & Other Related Stuff: ==  | 
                  
                          |   | 138 |   | 
                  
                          |   | 139 | * [http://dx.doi.org/10.2218/ijdc.v6i1.183 Wynholds, "Linking to Scientific Data: Identity Problems of Unruly and Poorly Bounded Digital Objects"]  | 
                  
                          |   | 140 | * [/wiki/Citation Guidelines / Recommendations for Citing Data]  | 
                  
                          |   | 141 | * [/wiki/Checklists Checklists for documenting solar physics data & catalogs]  | 
                  
                          |   | 142 | * [http://vso1.nascom.nasa.gov/spd2012/2012_SPD_FITS_headers.pdf Recommendation for FITS Headers], poster from 2012 SPD meeting.  | 
                  
                          |   | 143 |   | 
                  
                          |   | 144 |   |