ORNL DAAC Collaborates with Missions, Projects, and Data Producers
NASA is working to assign Distributed Active Archive Centers (DAACs) to missions and projects early in the project lifecycle. This allows data producers and ORNL DAAC to collaborate on data management and submission timelines. Collaboration can greatly reduce the effort to publish data, and it helps projects and ORNL DAAC plan for each other's needs.
If you are working on a NASA-funded project that has not been assigned to a DAAC (or other NASA-approved data repository), and your data is relevant to the NASA Terrestrial Ecology program or other programs within the NASA Carbon Cycle and Ecosystems Focus area, please email us to tell us about your project and potential data publication needs. We can then work with NASA to minimize the effort to get your data published in compliance with NASA directives, such as those described in the Science Mission Directorate Policy 41a (SPD-41a).
If you are working on a project that is not at least partially funded by NASA, and you are interested in publishing your data with ORNL DAAC, please let us know about the data as far ahead of submission as possible. To publish data that was not funded by NASA, we must follow an explicit approval process that demonstrates why it is in NASA's interest for us to accept the responsibility of publishing and managing that data. Please email us to discuss the potential of your data being published through ORNL DAAC. Also, see the section on Data Acceptance for a list of other data repositories that might be relevant to your work.
The types of services we provide as part of publishing your data will depend on the nature of your data and the project that funded its collection. See the Earth Science Data Systems Level of Service Model for further information.
1. Request the Publication of Your Data
To request that your data be published with the ORNL DAAC, complete the Submit Data Form. Inquiries will be reviewed within a few working days. If your data is from a mission or project already assigned to ORNL DAAC, we will open your submission and ask that you upload your data and other materials to our servers. Otherwise, please see the preceding section.
Data Versioning
If your data is an update to an existing dataset published at ORNL DAAC, please be particularly clear about how the new data differs from what was previously published. If you are extending the spatial or temporal extent of the data, adding observations, adding variables (measurements), or making other changes that only involve adding files to the dataset, we will generally append the data to the existing dataset and update the version number.
However, if you have reprocessed the data or changed the data preparation methods so that previously published data values change, we will assign a new DOI to the new version. Historically, we changed the DOI only when the underlying data preparation method or algorithm changed, and generally not when the data was simply reprocessed. In the years since the ORNL DAAC started using DOIs for data, the consensus around best practice has evolved, and we made this change to bring the ORNL DAAC into alignment with it. The change is also part of our response to an ORNL DAAC User Working Group (UWG) suggestion to improve the visibility of dataset revisions, particularly in citations, so that readers of a scientific paper can tell which specific revision of a dataset was used in the work.
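The versioning rule described above can be summarized as a small decision function. This is only an illustrative sketch: the function name, the change labels, and the returned strings are hypothetical and do not correspond to any ORNL DAAC system. The actual decision is made by ORNL DAAC staff.

```python
# Hypothetical labels for the kinds of changes described in this section.
ADDITIVE_CHANGES = {
    "extended_spatial_extent",
    "extended_temporal_extent",
    "added_observations",
    "added_variables",
}
VALUE_CHANGES = {"reprocessed", "changed_preparation_method"}


def versioning_action(changes):
    """Return the likely outcome for an update to a published dataset.

    `changes` is an iterable of labels drawn from the two sets above.
    """
    changes = set(changes)
    unknown = changes - ADDITIVE_CHANGES - VALUE_CHANGES
    if unknown:
        raise ValueError(f"unrecognized change labels: {sorted(unknown)}")
    if not changes:
        return "no update"
    # Any change to previously published values leads to a new DOI.
    if changes & VALUE_CHANGES:
        return "new DOI"
    # Purely additive changes are generally appended to the existing dataset.
    return "append and update version number"
```

For example, `versioning_action(["added_observations", "reprocessed"])` returns `"new DOI"`, because a single value-changing modification outweighs any additive ones.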
2. Ensure Your Data Follows Our Standards and Best Practices
Please ensure that your data follows data management best practices as much as possible, and use the directions below to confirm that specific standards and data requirements have been met. Ideally, we will have been working with your project well in advance of the publication request, which minimizes the effort for this step.
3. Submit Your Data and Documentation
Once your submission has been opened, you will receive an email from our system with instructions on how to perform a data upload. You will also be asked to answer Data Provider Questions. We will work with you on methods to get large datasets to us. If your data is from a mission or project already assigned to ORNL DAAC, there might be an established data delivery mechanism described in a data management plan.
Files expected in a submission include:
- Data Files
- Data Documentation
- Supplemental Files
- Code and Scripts
- Research Papers, Preprints, and Manuscripts
Data Files
Include all the files that represent a complete and reproducible body of work. When possible, field or input data should be included alongside higher-level, derived products. Uncertainty estimates (such as standard deviation or confidence estimates) should be included with the data, if available. Datasets consisting of summary statistics only will not be accepted. Note that all data files must be uniquely named.
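Before uploading, the unique-name requirement can be checked locally. The following is a minimal sketch, assuming the submission is staged in a single local folder; the function name is hypothetical, and the case-insensitive comparison is an extra precaution assumed here, not a stated ORNL DAAC rule.

```python
from collections import defaultdict
from pathlib import Path


def find_duplicate_names(root):
    """Map each file name that occurs more than once under `root` to its paths."""
    seen = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            # Compare in lowercase so "Site1.csv" and "site1.csv" are both flagged.
            seen[path.name.lower()].append(path)
    return {name: paths for name, paths in seen.items() if len(paths) > 1}
```

An empty result means no two files in the folder tree share a name, even when they sit in different subfolders.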
We accept data in the well-documented file formats noted below. Because of the importance of historic Earth science data, we must publish data in formats that will be readable decades from now.
- Spatial data, including GeoTIFF, NetCDF 4, HDF5, shapefile, GeoPackage, and KML/KMZ
- Tabular data in comma-separated values (CSV) format (Excel spreadsheets are not accepted)
- Common, open instrument and data formats, including LAS, ICARTT, and ENVI
During the quality assurance process, staff might recommend that you convert your data into another format to take advantage of NASA Earthdata Tools. Where appropriate, we will restructure your data into cloud-optimized formats, such as Cloud Optimized GeoTIFF (COG).
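A quick pre-submission pass over file names can flag formats that are likely to need conversion. The sketch below is illustrative only: the extension-to-format mapping uses conventional file extensions and is an assumption, not an official ORNL DAAC list, and the function name is hypothetical. Files with unrecognized extensions (for example, shapefile sidecar files or ENVI headers) are marked for manual review, not rejected.

```python
from pathlib import Path

# Assumed mapping from conventional extensions to the formats named above.
ACCEPTED_EXTENSIONS = {
    ".tif": "GeoTIFF", ".tiff": "GeoTIFF",
    ".nc": "NetCDF", ".nc4": "NetCDF",
    ".h5": "HDF5", ".hdf5": "HDF5",
    ".shp": "shapefile",
    ".gpkg": "GeoPackage",
    ".kml": "KML", ".kmz": "KMZ",
    ".csv": "CSV",
    ".las": "LAS",
    ".ict": "ICARTT",
}
# Excel spreadsheets are explicitly not accepted.
NOT_ACCEPTED = {".xls", ".xlsx"}


def classify_files(paths):
    """Sort file paths into accepted, not accepted, and needs-review groups."""
    result = {"accepted": [], "not_accepted": [], "review": []}
    for p in paths:
        ext = Path(p).suffix.lower()
        if ext in ACCEPTED_EXTENSIONS:
            result["accepted"].append(str(p))
        elif ext in NOT_ACCEPTED:
            result["not_accepted"].append(str(p))
        else:
            result["review"].append(str(p))
    return result
```

The check looks only at file names. It does not open the files or confirm that their contents are valid.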
Data Documentation
Documentation that describes your data should be uploaded with the data files. Important information includes:
- Name of the dataset and its associated project
- Authors of the data (following the Data Authorship Guidance below)
- Who funded the investigation, including award numbers
- Names of the data files, file formats, and an explanation of the file-naming convention
- Descriptions for all data parameters, variables, units, coded cell values, etc.
- What, when, where, and how data were collected and the scientific purpose
- Processing of data and sources for derived products
- Quality control and estimates of uncertainty
- References to published research papers, including DOIs
Please use our Data Provider Documentation Template to ensure you provide all the required information.
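One item in the list above, descriptions and units for every variable, is easy to check mechanically for CSV files. The following is a minimal sketch, assuming the variable descriptions are held in a simple dictionary keyed by column name; the function name and the `description` and `units` keys are hypothetical and are not part of the Data Provider Documentation Template.

```python
import csv
import io


def undocumented_columns(csv_text, data_dictionary):
    """Return CSV header names that lack a description or units entry."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])
    problems = []
    for column in header:
        name = column.strip()
        entry = data_dictionary.get(name)
        # A column counts as documented only if both fields are non-empty.
        if not entry or not entry.get("description") or not entry.get("units"):
            problems.append(name)
    return problems


# Usage with made-up example data:
csv_text = "site,date,co2_flux\nA,2020-01-01,1.2\n"
data_dictionary = {
    "site": {"description": "Site code", "units": "none"},
    "date": {"description": "Sampling date", "units": "YYYY-MM-DD"},
}
print(undocumented_columns(csv_text, data_dictionary))
```

In this example, `co2_flux` is reported because it has no entry in the dictionary.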
Supplemental Files
Include any additional files that are associated with your dataset, such as photos, reports, Algorithm Theoretical Basis Documents (ATBDs), or metadata files. Supplemental information that addresses the uncertainties in your data is particularly important.
Code and Scripts
Code used to generate or analyze data is an important part of the scientific record and can be extremely useful in enabling reuse of your data. If you have already published this code elsewhere (such as Zenodo), simply provide us the DOI for that code. If the code has not been published elsewhere, please provide a compressed archive or a link to a tag in a public repository.
NASA processes regarding software are evolving. There are multiple methods available for archiving your code independent of your data, and we are happy to discuss options with you. See the NASA Open-Source Science Initiative and NASA Open Source Development pages for more information.
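For code that has not been published elsewhere, a compressed archive can be created with the Python standard library. This is a minimal sketch, assuming the code lives in one local folder; the function name and paths are hypothetical, and any standard archive tool would serve equally well.

```python
import tarfile
from pathlib import Path


def archive_code(source_dir, archive_path):
    """Write a gzip-compressed tar archive of `source_dir` and list its files."""
    source = Path(source_dir).resolve()
    with tarfile.open(str(archive_path), "w:gz") as tar:
        # arcname stores paths relative to the top-level folder name,
        # so absolute local paths do not leak into the archive.
        tar.add(str(source), arcname=source.name)
    # Reopen the archive to confirm what was actually written.
    with tarfile.open(str(archive_path), "r:gz") as tar:
        return sorted(m.name for m in tar.getmembers() if m.isfile())
```

Write the archive outside the folder being archived so that it does not try to include itself.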
Research Papers, Preprints, and Manuscripts
Publication of a research paper is not required for data archival at the ORNL DAAC. However, if the collection or use of your data is described in a paper, include it in your submission (or provide the DOI for published materials) to help our staff understand your data. We will use the papers and preprints you provide to develop the user guide, which you will be asked to review before publication. We will not share these papers or include them in the data package.
4. Dataset Preprint Release (Issued by Request)
To support NASA Open Source Science data policies and to support the publication of scientific manuscripts, ORNL DAAC can provisionally publish a dataset as a Preprint Dataset. Preprint datasets are one way to make data available to journal article reviewers. For us to publish a Preprint Dataset, we need documentation from you that is as close to our user guide format as possible. We will then publish the data, as received, with an advisory banner on the dataset landing page stating that it is provisional. Preprint datasets are not indexed in NASA systems and generally require that users know the DOI to download the data. Preprint datasets are extra work for you and for us, so please request this service only if you have a specific need for it.
5. ORNL DAAC Reviews Your Submission and Writes a User Guide
Our staff will review the submitted data using the Data Quality Review Checklist. Ensuring that your data can be understood and manipulated is an important part of data reusability.
Given the documentation you submit, we will prepare structured metadata and write a comprehensive guide for potential users of your data. A citation will be generated and a DOI registered for the dataset so that others can credit your work (as described in Earthdata's Data Use and Citation Guidelines). If any questions arise during the processing of your dataset, we will contact you to resolve those questions. Your prompt response is important as we work to get your data published.
6. Review the Data Package
When the data and user guide are ready, you will be asked to review and approve the final data package before it is released online for the public.
7. Publication
ORNL DAAC will publish the data package, providing a landing page with links to data and services for your data. We will distribute metadata to the NASA EOSDIS Common Metadata Repository (CMR), which dispatches to other relevant catalogues and is the source for Earthdata Search. The data package will also be advertised online through email, social media, and ORNL DAAC news.
8. Long-Term Data Stewardship
After the data is published, ORNL DAAC will:
- Continue to provide and maintain tools to discover, explore, access, and extract data
- Provide long-term, secure archiving including back-up and recovery
- Address user questions and serve as a buffer between users and data producers
- Maintain usage, download, and data citation statistics, which are provided on dataset landing pages
See the How to Complete your Data Submission section below for details about in-progress submissions.