To say that NASA has a lot of data is like saying a beach has a lot of sand. In October 2023, the total volume of data in NASA's Earth Observing System Data and Information System (EOSDIS) surpassed 100 petabytes (PB). To put this into perspective, 1 PB is equivalent to approximately 500 billion pages of standard printed text. And this massive archive is yours to use as part of the agency's open data policies and practices.
Resources created and managed by NASA's Earth Science Data Systems (ESDS) Program help you find the data you need in this vast archive. These include Earthdata Search, which provides sub-second search and discovery, and the Earthdata Forum, where you can pose data questions directly to NASA experts. Tools provided by NASA's discipline-specific Distributed Active Archive Centers (DAACs) also support data discovery. With a free Earthdata Login credential, you can access and download the data you need.
NASA also provides the code and application programming interfaces (APIs) that enable users to search, transform, and access data. Developers comfortable working with code, for example, can create their own instance of NASA Worldview. Working with data discovery systems in the Python programming language, however, has required more extensive domain knowledge. Until now.
A new Python API library called earthaccess streamlines access to NASA's Earth science data. earthaccess is a community-driven library whose development is being led through NASA Openscapes, including contributions by NASA's National Snow and Ice Data Center DAAC (NSIDC DAAC), Alaska Satellite Facility DAAC (ASF DAAC), Ocean Biology DAAC (OB.DAAC), and Atmospheric Science Data Center (ASDC). NASA Openscapes supports researchers using data distributed by NASA DAACs as these researchers migrate workflows to the cloud. Development of the earthaccess library also features leadership and contributions from private industry and the broader user community.
"earthaccess revolutionizes data access by drastically reducing the complexity and code required," says Luis López, an NSIDC software developer and earthaccess creator. "Since open science is a collaborative effort involving people from different technical backgrounds, our team took the approach that data analysis can and should be made more inclusive and accessible by reducing the complexities of underlying systems."
earthaccess resulted from workshops organized by NASA Openscapes, where attendees noted the need to reduce the time spent on the technicalities of data search and discovery tools. With a few lines of Python code (and a system running Python 3.8 or higher), users of earthaccess can search, download, or stream NASA Earth science data. Authentication is handled through Earthdata Login. Search and discovery are accomplished using NASA's Common Metadata Repository (CMR), the metadata management system for NASA Earth science data and the foundation of Earthdata Search. Data access is facilitated through the Python-based Filesystem Spec (fsspec).
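The authenticate, search, and download steps described above can be sketched in a few lines. This is a minimal illustration rather than an official recipe: the helper names build_query and search_and_download are hypothetical, and the dataset short name, dates, and bounding box in the usage note are placeholder values chosen for the example. The earthaccess calls themselves (login, search_data, download) are part of the library's public API, but the library must be installed separately and a valid Earthdata Login account is assumed.

```python
from typing import Any, Dict, List, Tuple


def build_query(
    short_name: str,
    start: str,
    end: str,
    bounding_box: Tuple[float, float, float, float],
    count: int = 5,
) -> Dict[str, Any]:
    """Collect search filters as keyword arguments for earthaccess.search_data.

    Hypothetical helper, not part of earthaccess. The bounding box is
    (west, south, east, north) in decimal degrees.
    """
    _west, south, _east, north = bounding_box
    if south >= north:
        raise ValueError("bounding_box must be (west, south, east, north)")
    return {
        "short_name": short_name,        # dataset short name as listed in CMR
        "temporal": (start, end),        # ISO dates, e.g. "2020-01-01"
        "bounding_box": bounding_box,
        "count": count,                  # cap on the number of granules returned
    }


def search_and_download(query: Dict[str, Any], local_path: str = "./data") -> List[Any]:
    """Log in, search CMR, and download matching granules (hypothetical helper)."""
    # Imported here so the query helper above works without the library installed.
    import earthaccess  # third-party: pip install earthaccess

    earthaccess.login()                          # authenticate with Earthdata Login
    granules = earthaccess.search_data(**query)  # search is backed by CMR
    return earthaccess.download(granules, local_path)
```

As a usage example, build_query("ATL06", "2020-01-01", "2020-01-31", (-46.5, 61.0, -42.5, 63.0)) describes a small search for one illustrative dataset, and passing the result to search_and_download would fetch the matching files into ./data. To stream rather than download, earthaccess.open(granules) can be used in place of earthaccess.download; it returns file-like objects backed by fsspec.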
earthaccess is accessible through the NSIDC GitHub page and can be downloaded at Zenodo. Several open-source projects are utilizing earthaccess, including an open data repository called opengeos. The development team welcomes issue submissions, pull requests, discussions, and other community contributions via the earthaccess Contributing GitHub page.