Open Science

NASA’s groundbreaking open data policy provides unrestricted access to more than 100 petabytes of Earth science data in NASA’s Earth Observing System Data and Information System (EOSDIS) collection. NASA's Earth Science Data Systems (ESDS) Program ensures that these data are fully available to any user for any purpose, and promotes and facilitates the open sharing of all metadata, documentation, models, images, and research results along with the source code used to generate, manipulate, and analyze these data. The agency's underlying objectives are that openness of data is fundamental, security of these data is essential, and freedom and integrity for using these data are crucial.

This page provides a deeper look at how ESDS defines open science and the evolving paradigm of open-source science, facilitates the unrestricted use of NASA Earth science data, and supports agency-wide open science initiatives.

Defining Open Science and Open-Source Science
Image: NASA open-source science practices place the agency closer to a fully open system (right side of image). New technologies and practices will enable NASA to continue to become more fully open-source. Credit: NASA ESDS.

NASA’s Earth Science Data Systems (ESDS) Program defines open science as a collaborative culture enabled by technology that empowers the open sharing of data, information, and knowledge within the scientific community and the wider public to accelerate scientific research and understanding. A system based on open science aims to make the scientific process as transparent (or open) as possible by making all elements of a claimed discovery readily accessible, which enables results to be repeated and validated.

Out of this open science concept, an evolving paradigm called open-source science is emerging. Open-source science accelerates discovery by conducting science openly from project initiation through implementation. The result is the inclusion of a wider, more diverse community in the scientific process as close to the start of research activities as possible. This increased level of commitment to conducting the full research process openly and without restriction enhances transparency and reproducibility, which engenders trust in the scientific process. It also represents a cultural shift that encourages collaboration and participation among practitioners of diverse backgrounds, including scientific discipline, gender, ethnicity, and expertise.

Since 1994, NASA Earth science data have been available without restriction to all users for any purpose, and since 2015, ESDS has ensured that all data systems software developed through NASA research and technology awards has been made available as open-source software.

Open Data
Image: The red area starting to the right of the Fiscal Year 2023 (FY23) dot indicates the enormous volume of data expected from upcoming high-data-volume missions that are projected to grow NASA's Earth science data collection to almost 600 petabytes (PB) by 2030, based on current launch schedules. Credit: NASA EMS.

The unrestricted availability of NASA Earth science data is the foundation of all ESDS activities, and the program has remained at the forefront of technological advances to ensure the efficient delivery and use of these data. As the volume of EOSDIS data continues to grow, ESDS is undertaking a groundbreaking effort to move this Big Data collection into the Earthdata Cloud. Having this data collection in the cloud will provide more efficient use of this vast archive, including the ability to conduct analyses in the cloud and download only the results, which greatly reduces computing time and processing requirements. New missions that are part of NASA's Earth System Observatory will generate higher volumes of data than any previous missions, all of which will be openly available through the Earthdata Cloud as early in the scientific process as possible.
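To put the projected growth in perspective, the short calculation below estimates the constant annual growth rate implied by going from roughly 100 PB to almost 600 PB by 2030. The 100 PB baseline, the FY23 starting point, and the seven-year span are simplifying assumptions read off the figures quoted on this page, not official planning numbers, and actual growth will follow launch schedules rather than a smooth curve.

```python
# Rough arithmetic only: the 100 PB baseline and the seven-year span
# (FY23 to 2030) are assumptions taken from the figures quoted in the text.

def implied_annual_growth(start_pb: float, end_pb: float, years: int) -> float:
    """Return the constant yearly growth rate that takes start_pb to end_pb."""
    return (end_pb / start_pb) ** (1 / years) - 1


def project(start_pb: float, rate: float, years: int) -> list[float]:
    """Archive size at the end of each year under constant compound growth."""
    return [start_pb * (1 + rate) ** y for y in range(1, years + 1)]


rate = implied_annual_growth(100, 600, 7)
sizes = project(100, rate, 7)

print(f"Implied growth: {rate:.1%} per year")
for year, size in zip(range(2024, 2031), sizes):
    print(year, f"{size:.0f} PB")
```

Under these assumptions the archive would need to grow by roughly 29 percent per year, which helps explain why analysis next to the data, rather than bulk download, is emphasized.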

The ESDS commitment to open-source data is also predicated on the collaborative use of these data. Through the Earthdata Cloud as well as efforts such as the cloud-based Multi-Mission Algorithm and Analysis Platform (MAAP), ESDS is enabling a broader base of users to interact with these data early in the scientific process. Using a standard internet connection, users in Arizona and Abu Dhabi, for example, can work together to analyze an EOSDIS dataset in real time without having to download the data.

Open Tools
Image: NASA Worldview enables the interactive exploration of more than 1,000 imagery layers. Users can compare more than 20 years of MODIS full Earth imagery, create animations, and easily track natural events using data curated by the NASA Earth Observatory Natural Event Tracker (EONET). Credit: NASA Worldview.

In addition to open data, ESDS provides a wide range of tools and applications for working with these data, together with the code behind them. The Earthdata Data Tools page provides descriptions of and links to resources created by EOSDIS Distributed Active Archive Centers (DAACs) for functions such as searching for and subsetting data. In addition, specialized tools such as NASA Worldview and Giovanni enable users to interactively explore hundreds of visualized data layers, overlay multiple layers, make comparisons, create animations, and much more.

Additional resources for using and working with NASA data include:

Algorithm Publication Tool
The Algorithm Publication Tool (APT) developed by NASA's Interagency Implementation and Advanced Concepts Team (IMPACT) enables open, reproducible science by helping scientists write standardized, high-quality algorithm documentation collaboratively. Algorithm Theoretical Basis Documents, or ATBDs, help ensure reproducible science by documenting key scientific assumptions made when writing algorithms and by promoting better understanding of Earth observation data.

Earthdata Code Collaborative (ECC)
The ECC is a platform for developing, testing, and discovering EOSDIS applications and services. The ECC provides a ready-to-use framework for running tests every time a codebase changes and makes recommendations about what testing frameworks and approaches to use.

Pangeo
The Pangeo project is helping the Earth science community analyze data in the cloud so they can spend less time downloading and managing data. The project is partially funded by NASA's Advancing Collaborative Connections for Earth System Science (ACCESS) Program, which develops technologies to effectively manage, discover, and utilize NASA’s archive of Earth observations for scientific research and applications. Pangeo’s collaborative tools allow researchers to access, process, and analyze NASA data in the commercial cloud without having to download the data. Its ecosystem of interconnected open-source tools uses software from Project Jupyter, which allows users to create and share collaborative workflows in open-source notebooks that contain code, equations, and visualizations.

Multi-Mission Algorithm and Analysis Platform (MAAP)
MAAP is a joint effort of NASA and ESA (European Space Agency), and brings together data, algorithms, and computing capabilities in a common cloud environment to facilitate the sharing and processing of data from field, airborne, and satellite measurements. Key features of MAAP are full and open access to mission data through the MAAP Dashboard, the use of open-source code, and unrestricted access to data and ancillary information.

Geographic Information Systems (GIS)
GIS is a collection of computer-based tools for organizing information from a variety of data sources to map and examine changes on Earth. The ESDS vision is to identify and deliver high-value Earth science data in formats compliant and compatible with GIS standards; to ensure data are interactive, interoperable, accessible, and GIS-enabled through primary GIS platforms; and to provide the maximum impact to research, education, and public user communities requiring visualization and spatial analysis. The ESDS GIS Team (EGIST) was created to provide sustained program-wide support to enable the appropriate use and adoption of GIS technology in support of Earth science research and applied science for EOSDIS data. More information is available on the Earthdata GIS Data Pathfinder and GIS at NASA pages.

Providing Open Data for NASA’s Earth System Observatory (ESO) 

NASA’s Earth System Observatory (ESO) is a coordinated series of complementary missions designed to obtain measurements of multiple Earth processes to help address and mitigate climate change. In keeping with NASA’s open data policies, ESO data will be available as early in the mission process as feasible.

Developing a Mission Data Processing System (MDPS) that will ensure the most transparent processing and delivery of ESO mission data is vital. The MDPS is the set of algorithms, software, compute infrastructure, operational procedures, documentation, and teams that process raw instrument data into science quality data products. The MDPS also includes the software tools that support the development of processing algorithms and the validation and analysis of processed data.

Image: Basic NASA data processing flow. After raw satellite data are downloaded (left box), MDPS elements (center box) facilitate the transformation and processing of instrument data into science data products (right box).

NASA Chief Science Data Officer Kevin Murphy set a challenge to the mission processing community to identify and assess potential architectures that can meet the ESO mission science processing objectives, enable data system efficiencies, promote open science principles, and seek opportunities that support Earth system science.

Addressing this challenge is being accomplished through the ESO Mission Data Processing Workshop Study. The study is composed of several phases:

  • Phase 1: MDPS Architecture Recommendation (October 2021 to March 2023)
  • Phase 2: Detailed Analysis of Recommended MDPS (March 2023 through September 2024)
  • Phase 3: Baseline Technical Solution and Implementation Plan
  • Phase 4: Implementation

Phase 1: ESO Data Processing Architecture Recommendation

The Open Source Science for ESO Mission Data Processing Architecture Study was conducted between October 2021 and March 2022, and comprised two public workshops. Workshop 1 (October 19-20, 2021) focused on collecting NASA stakeholder objectives and ESO mission requirements. Workshop 2 (March 1-4, 2022) studied practices across NASA and other agencies for developing science data processing systems. The information from these workshops was used to inform and identify potential architectures that could meet the study objectives. A technical trade study was performed along with a programmatic trade study of different architectures. The results from these studies were then combined to establish a final recommendation.

The study team recommends that each ESO mission develop its own MDPS using a common architecture and services that are provided and managed by an overarching Multi-Mission Organization (MMO). The MMO will establish standards across ESO missions and develop and deliver infrastructure, data catalog, analysis, and (potentially) processing services. This MDPS is designated Type 2 (Managed Services), Variant 4 (encompassing infrastructure, data, catalog, and analysis and processing services) and written as T2V4.
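The division of responsibility in this recommendation can be pictured with a small sketch: a shared service layer stands in for what the MMO would manage, and each mission supplies its own processing code while registering products in the common catalog. Every class, method, and mission name here is hypothetical and chosen only for illustration; the actual T2V4 design is described in the study's final report, not in this code.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical names throughout: an illustration of the "managed services"
# idea (shared services plus mission-owned processing), not the study's design.


@dataclass
class SharedServices:
    """Stands in for MMO-managed services shared by every mission."""

    catalog: dict[str, list[str]] = field(default_factory=dict)

    def register_product(self, mission: str, product_id: str) -> None:
        # One common catalog, so products from all missions are findable together.
        self.catalog.setdefault(mission, []).append(product_id)


@dataclass
class MissionMDPS:
    """A mission-owned processing system that plugs into the shared services."""

    mission: str
    algorithm: Callable[[list[float]], list[float]]  # mission-specific science code
    services: SharedServices

    def process(self, granule_id: str, raw: list[float]) -> list[float]:
        product = self.algorithm(raw)
        self.services.register_product(self.mission, f"{granule_id}-product")
        return product


mmo = SharedServices()
mission_a = MissionMDPS("mission-a", lambda raw: [x * 0.5 for x in raw], mmo)
mission_b = MissionMDPS("mission-b", lambda raw: [x + 1.0 for x in raw], mmo)

out_a = mission_a.process("g001", [2.0, 4.0])
out_b = mission_b.process("g001", [2.0, 4.0])
print(mmo.catalog)
```

The point of the sketch is the shape, not the details: the algorithms differ per mission, while cataloging goes through one shared component that can enforce common standards.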

For detailed information about the Phase 1 study, please see the Final Report of the Open Source Science for Earth System Observatory Mission Data Processing Architecture Study (doi:10.48577/jpl.AXZSUY).

Phase 2: Detailed Analysis of Recommended MDPS

A detailed analysis of the recommended T2V4 MDPS began in March 2023. This follow-on investigation will increase the fidelity of the architecture by establishing the use cases, system requirements, system definition, and potential impact to the ESO missions. Phase 2 will conclude with a final report, scheduled for release in September 2024, and a data processing prototype with associated documentation.

Image: Phase 2 is a detailed analysis of the recommended Type 2, Variant 4 (T2V4) MDPS. It will start with a systems engineering review followed by an analysis of extending the MDPS for multi-mission use leveraging the Unity Code developed by the NASA Sounder Science Investigator-led Processing System (SIPS). Phase 2 will end with Prototype Validation and Verification (V&V).

Phase 2 Formal Review

The Systems Engineering stage of Phase 2 concludes in October 2023 with a Formal Review. This open event will be conducted October 24-25, 2023, at NASA’s Langley Research Center in Hampton, Virginia. The goals of the review are to:

  • Review systems engineering artifacts to validate architectural compliance and maturation of the Phase 1 T2V4 MDPS recommendation
  • Demonstrate ability to support ESO mission use cases as provided by ESO project representatives
  • Deliver prioritized development activities to enable an MMO-managed MDPS architecture to support ESO mission processing needs
  • Assess the maturity of the project and the progress made in defining the mission data processing system requirements to support ESO missions

Agency-Wide Open Science Initiatives and Resources

NASA's Open-Source Science Initiative (OSSI) is a comprehensive program of agency activities to enable and support moving science towards openness, including enhancing existing data policies, supporting open-source software, and enabling cyberinfrastructure. OSSI aims to implement NASA’s Strategy for Data Management and Computing for Groundbreaking Science 2019-2024, which was developed through community input.

NASA's Transform to Open Science (TOPS) mission, which is part of OSSI, is working to accelerate the engagement of the scientific community in open science practices through open science events and activities aimed at:

  • Lowering barriers to entry for historically excluded communities
  • Developing a better understanding of how people use NASA data and code to take advantage of the agency's Big Data collections
  • Increasing opportunities for collaboration while promoting scientific innovation, transparency, and reproducibility

More information about TOPS activities is available on the TOPS GitHub page.

Other Resources

NASA Open Data, Services, and Software Policies

NASA Open Innovation sites provide access to agency-wide data, application programming interfaces (APIs), and code, and are under the Office of the Chief Information Officer:

  • NASA's Open Data Portal is the agency's central open data site for the public
  • NASA's API Portal is a clearinghouse site for information about the agency's APIs and serves as a passthrough site to NASA APIs located elsewhere
  • NASA's Code Portal contains information on and links to all open-sourced NASA code projects

Get Involved

NASA and ESDS offer many ways for you to benefit from and contribute to open science efforts. Since all code used in ESDS applications and tools is open source, you can create your own instance of Worldview to pull imagery from Global Imagery Browse Services (GIBS) or download code from any of the NASA GitHub sites. In addition, ESDS competitive programs such as the Advancing Collaborative Connections for Earth System Science (ACCESS) Program, Citizen Science for Earth Systems Program (CSESP), and Making Earth System Data Records for Use in Research Environments (MEaSUREs) Program provide opportunities for you to contribute data and observations to ongoing scientific investigations or compete for funding opportunities to help advance open science initiatives. More information about ESDS collaborations and data processes is available on the Earthdata Engage page.
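As a rough illustration of what pulling imagery from GIBS can look like, the sketch below builds a tile URL for one imagery layer on one day. The URL pattern, layer identifier, and tile matrix set name are assumptions based on the GIBS WMTS REST documentation and should be verified there before use; the helper name gibs_tile_url is hypothetical. The code only constructs the URL and does not download anything.

```python
from datetime import date

# Assumed endpoint pattern for GIBS WMTS REST tiles (geographic projection).
# Check the current GIBS documentation before relying on it.
GIBS_WMTS_TEMPLATE = (
    "https://gibs.earthdata.nasa.gov/wmts/epsg4326/best/"
    "{layer}/default/{day}/{matrix_set}/{zoom}/{row}/{col}.{ext}"
)


def gibs_tile_url(layer: str, day: date, matrix_set: str,
                  zoom: int, row: int, col: int, ext: str = "jpg") -> str:
    """Build (but do not fetch) a WMTS REST tile URL for one layer and day."""
    return GIBS_WMTS_TEMPLATE.format(
        layer=layer, day=day.isoformat(), matrix_set=matrix_set,
        zoom=zoom, row=row, col=col, ext=ext,
    )


url = gibs_tile_url(
    "MODIS_Terra_CorrectedReflectance_TrueColor",  # assumed layer identifier
    date(2012, 7, 9),
    "250m",  # assumed tile matrix set name
    zoom=6, row=13, col=36,
)
print(url)
```

A URL built this way can be opened in a browser or handed to any HTTP client or mapping library that understands WMTS tiles.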
