Empowering Open Science with the Science Discovery Engine (SDE)

The SDE facilitates discovery and access to scientific data and resources across NASA’s Science Mission Directorate.

In 2018, NASA’s Science Mission Directorate (SMD) declared a long-term commitment to championing open science through its Strategy for Data Management and Computing, 2019-2024. The Open Source Science Initiative (OSSI) emerged from this strategic plan. One major recommendation from the scientific community was for the SMD to develop a capability to "support discovery and access to complex scientific data across [SMD] Divisions" that enables open science.

The SDE provides quick access to data and resources throughout NASA's Science Mission Directorate (SMD). Credit: NASA IMPACT.

A team of researchers and developers began formulating a strategy in early 2020 to meet this ambitious SMD objective. Two years and close to 1,000,000 documents, datasets, and tools later, the Science Discovery Engine (SDE) search capability was ready for launch.

The SDE provides an infrastructure for vast quantities of NASA science information to be available and searchable in a single location, making it easier for science community members to collaborate and accelerate their work. Constructing the SDE is a key step in NASA's process of establishing and encouraging open science practices; data and information from across the SMD's five divisions (Heliophysics, Earth Science, Planetary Science, Astrophysics, and Biological and Physical Sciences) can be searched, filtered, and accessed. The beta version of the SDE went live on the SMD website in December 2022, and the SDE was presented to attendees at the American Geophysical Union (AGU) 2022 Fall Meeting.

"To me, the most exciting thing about the SDE is how it makes the rich wealth of NASA’s open science data and information more accessible to an ever-growing community of users," said Kaylin Bugbee, leader of SDE team operations and a NASA research scientist and member of the OSSI team. "This increased accessibility will open new pathways to scientific discovery and encourage more people to make use of the open science data and information NASA provides."

The primary SDE development group operates within NASA's Interagency Implementation and Advanced Concepts Team (IMPACT), which is located at NASA’s Marshall Space Flight Center in Huntsville, Alabama, and is a component of NASA's Earth Science Data Systems (ESDS) Program. SDE team members collaborated with several external partners to construct and refine features of the tool. 

The Enterprise Data Platform (EDP) and Mission Cloud Platform (MCP) teams within NASA's Office of the Chief Information Officer (OCIO) assisted in deploying the powerful search capabilities of the SDE. To ensure broad representation of NASA science efforts, the SDE team coordinated with a working group composed of members from all SMD divisions. The working group continues to help identify content for potential inclusion in the SDE and provides guidance on future project development. The SDE team also works with Sinequa, the developer of the intelligent search platform, and Left Right Mind, a digital design consulting firm, to craft user-centered web interfaces.

Compiling and organizing the information included in the SDE presents many challenges. First, the SDE team works to identify relevant data and information from a vast network of resources across NASA's SMD. The team then considers how to develop useful categories that encompass such a wide range of topics. Depending on the specificity of a query, a search may return thousands of results, including links to datasets, models, images, videos, software, and data analysis tools.

To refine the search process, the SDE team developed an SMD vocabulary extraction workflow that draws on more than 50 glossaries, thesauri, and keyword lists from across the SMD to generate lists of terms for categories such as platforms, instruments, and missions. These lists are then used to create SMD-relevant filtering options that allow for guided exploration in the SDE.
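
For readers curious about what such a workflow might look like in practice, the general idea of merging terms from several vocabularies into per-category lists and then using those lists as search filters can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the SDE team's actual implementation: the sample glossary entries, the function names (`normalize`, `build_term_lists`, `tag_document`), and the simple substring matching are all hypothetical.

```python
from collections import defaultdict

# Hypothetical sample vocabularies, each a list of (category, term) pairs.
# These stand in for real glossaries and thesauri, which are not reproduced here.
GLOSSARIES = [
    [("missions", "Landsat 8"), ("instruments", "MODIS")],
    [("missions", "landsat 8 "), ("platforms", "Terra"), ("instruments", "modis")],
]


def normalize(text):
    """Collapse whitespace and ignore case so spelling variants compare equal."""
    return " ".join(text.split()).casefold()


def build_term_lists(glossaries):
    """Merge all vocabularies into one de-duplicated term list per category."""
    seen = defaultdict(dict)
    for glossary in glossaries:
        for category, term in glossary:
            # Keep the first spelling encountered for each normalized term.
            seen[category].setdefault(normalize(term), " ".join(term.split()))
    return {category: sorted(terms.values()) for category, terms in seen.items()}


def tag_document(text, term_lists):
    """Return, per category, the terms that appear in the text (naive substring match)."""
    haystack = normalize(text)
    return {
        category: [term for term in terms if normalize(term) in haystack]
        for category, terms in term_lists.items()
    }


term_lists = build_term_lists(GLOSSARIES)
tags = tag_document("MODIS data from the Terra satellite", term_lists)
```

In this sketch, `tags` records which mission, instrument, and platform terms a document mentions, which is the kind of information a search interface could expose as filter options. A production workflow would likely need more careful matching than substring search, for example to avoid matching short terms inside longer words.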

Nearly a million scientific products are searchable within the SDE. Credit: NASA IMPACT.

Bugbee notes that consolidating NASA's science content in the SDE will assist researchers. "Before the SDE, information about science at NASA was spread out over 128 unique sources," she explains. "These sources included websites, data repositories, code repos, and document archives. For data specifically, over 84,000 science data products were found at [more than] 30 different repositories, making it a challenge for new scientists to find data they may not be familiar with. The SDE will make the scientific process more efficient by decreasing the amount of time required to search for data and information."

Now that the SDE is available to the broader scientific community, Bugbee and the SDE team hope that it will quickly become a go-to source for reliable, accessible science information. They anticipate that the application will foster significant collaboration and innovation within and across science disciplines. 

"This is only the beginning for the SDE," Bugbee said. "While we have brought in over 128 science information sources into the SDE, we plan to bring in more data and content in the coming months. We also plan to add enhanced features to the user interface and to further develop the SDE application programming interface [API]."

See for yourself, and explore the power of the SDE.
