Kevin Murphy is used to working with Big Data collections. In his current role as the NASA Earth Science Data Systems (ESDS) Program manager, he oversees the 48 petabytes (PB) of data in NASA’s Earth Observing System Data and Information System (EOSDIS) collection – one of the largest repositories of Earth observing data on the planet. This experience will serve him well as the new chief science data officer for NASA’s Science Mission Directorate (SMD).
Murphy’s role in this newly created position, which he will hold in addition to his work as program manager, is to identify where synergies and collaborations can be created across the agency’s science divisions and to advance the state of the art in cloud computing, machine learning, and other data management and analysis activities.
“The chief science data officer position is a recognition of the importance of the scientific data that are collected by NASA missions, generated through NASA research grants, and generally made available to the public,” says Murphy.
“One of our key science goals at NASA is to advance open access to data, which accelerates scientific discovery and increases participation in the scientific process,” says Dr. Thomas Zurbuchen, NASA associate administrator for science. “The position of chief science data officer is to take a strategic look at how we can make Earth science data available as broadly as possible.”
NASA engages the nation’s science community, sponsors scientific research, and develops and deploys satellites and probes in collaboration with global partners to answer fundamental questions requiring the view from and into space. NASA’s six science divisions (Earth science, heliophysics, planetary science, astrophysics, biological and physical sciences, and the joint agency satellite division) seek to address three core contexts spanning the breadth of the agency’s activities:
- Discover the secrets of the universe
- Search for life elsewhere
- Protect and improve life on Earth
As of 2019, NASA’s science divisions maintained a collective data volume of over 100 PB. High-data-volume missions scheduled for the next five years are expected to add as much as 100 PB each year to this vast archive. For Murphy, this presents not only challenges but also tremendous opportunities.
“We need ways to effectively manage these data while simultaneously encouraging an inclusive and equitable way for people to access this information,” he says. “We want to enable access to mission data, knowledge, software, and documentation as early as possible to build trust in the process through which we undertook the measurements and provide broad participation in mission science activities.”
Murphy adds that another challenge is sharing information and data across the science divisions. One of his recent achievements has been the co-development of the agency’s science strategy for data management and computing. This effort showed how open data, open-source software, and open science principles can be used to accelerate scientific discovery and increase participation in the scientific process.
As Murphy points out, the volume of science data will only increase as instruments become more sophisticated, resolutions become higher, and ways to store and work collaboratively with large data collections become more efficient. He also knows that making sure these data are available is vital to furthering scientific discovery.
“We need to expand our data into new communities and push our data out to the folks who need them and who can help us understand them better,” he says. “In my role as chief science data officer, I want to ensure we have all the eyes on NASA data we can get to really leverage the capabilities of these data throughout the SMD.”