Common Metadata Repository (CMR)

CMR Contact Information

NASA Point of Contact: Steve Berrick
stephen.w.berrick@nasa.gov

Contact Earthdata Support for questions, comments, or technical issues.

What is CMR?

NASA's Common Metadata Repository (CMR) is a high-performance, high-quality, continuously evolving metadata system that catalogs all data and service metadata records for NASA's Earth Observing System Data and Information System (EOSDIS) and will be the authoritative management system for all EOSDIS metadata. These metadata records are registered, modified, discovered, and accessed through programmatic interfaces that use standard protocols and APIs.

CMR is the keystone that makes NASA Earth Observation data discoverable. As a metadata repository, CMR contains Unified Metadata Model (UMM) schema records that describe individual Earth Data files (UMM-Granules), collections of files (UMM-Collections), scientific details about the data files (UMM-Variables), related tools and services that act on the data files (UMM-Tools and -Services), and pertinent relationships between these concepts. Using the UMM allows CMR to host its metadata records in several supported native formats, with translation services available between formats.

As of 2024, CMR contains metadata describing over a billion Earth Observation data files stemming from approximately 10,000 data collections, and that covers only NASA’s Earth science holdings. Because NASA is a member of the Committee on Earth Observation Satellites (CEOS), CMR also houses collection records for the dozens of other CEOS member agencies, which contribute another 45,000 collection metadata records to CMR. Additionally, if our CEOS partners provide OpenSearch Description Documents (OSDDs) for their collection metadata records, the data files they house within those collections are also made discoverable through CMR, enabling interagency and international federated discovery. OpenSearch is an Open Geospatial Consortium (OGC) international standard for federated search and discovery.

Diagram showing the Earthdata components which use CMR and how they access it.

CMR also has application programming interfaces (APIs) for both creating new metadata holdings (CMR Ingest) and querying the repository’s contents (CMR Search). Distributed Active Archive Centers (DAACs) primarily use CMR Ingest, curating and maintaining the metadata for collections that fall under their scientific purview. Anyone across the globe, from private industry to scientists, students, and policymakers, can search CMR, and many do so extensively. CMR nominally sees an average search load of several thousand queries per minute, with typical peaks over 10,000 queries per minute.

CMR tools and services include:

CMR Application
Metadata Management Tool (MMT)
draft Metadata Management Tool (dMMT)
Metadata Curation Dashboard

CMR is the backend behind the following data search and discovery services:

Earthdata Search
International Data Network (IDN)

All our metadata applications and records (metadata contents) heavily rely upon the GCMD Keywords to maintain a controlled vocabulary, which is instrumental to being able to effectively search through a multitude of metadata records. These are accessible through a Keyword Management System (KMS) API or through a GUI GCMD Keyword Viewer application, with changes being community-driven via the Earthdata Forum.

CMR Search can be performed through the native CMR API or through CMR STAC, CMR Cloud STAC, and CMR OpenSearch API wrappers. CMR Graph Query Language (GraphQL) can also be used to find concepts and relationships. Metadata concept associations can be made directly through linkages within metadata fields, or through CMR Graph Database (GraphDB) relationships.
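As a rough illustration of a native CMR Search request, the sketch below only builds the URL for a keyword search over collection records and does not send it. The endpoint root, the `keyword` and `page_size` parameter names, and the page-size limit are assumptions based on the public CMR Search API documentation and should be checked there before use. The function name is hypothetical.

```python
from urllib.parse import urlencode

# Assumed root of the public CMR Search API; confirm against the API docs.
CMR_SEARCH_ROOT = "https://cmr.earthdata.nasa.gov/search"


def build_collection_search_url(keyword, page_size=10):
    """Return a URL for a keyword search over collection records (JSON)."""
    # Assumption: CMR caps page_size at 2000 results per page.
    if not 1 <= page_size <= 2000:
        raise ValueError("page_size must be between 1 and 2000")
    query = urlencode({"keyword": keyword, "page_size": page_size})
    return f"{CMR_SEARCH_ROOT}/collections.json?{query}"


url = build_collection_search_url("sea surface temperature", page_size=5)
print(url)
```

The resulting URL can be passed to any HTTP client. Building it separately from the request keeps the query logic easy to inspect and test offline.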

Goals of CMR

The primary goal of CMR is to be a highly performant tool for both search and ingest: to deliver search results (for both collection and granule records) in under a second and be able to handle the ingest, from receipt to visibility, in under an hour (though CMR does this in well under a minute) without dropping a single file or change.

Search and ingest loads have increased and will continue to increase over time and with the addition of new missions. Current loads are typically in the thousands of search and ingest requests per minute every day, with surges well into the upper tens of thousands per minute for each when a provider reprocesses and re-ingests a large collection or when a user makes a massive request. Having an easily scalable architecture is key to remaining both performant and cost effective.

How does CMR do that?

CMR is designed to handle metadata at the concept level. Collections and granules are common metadata concepts, but this can be extended out to visualizations, variables, documentation, services, and more. CMR provides a flexible ingest system with metadata libraries that can handle multiple metadata record formats, multiple metadata record concepts, and relationships and validations between them; native metadata remains unchanged. As new formats are introduced, new translations can be written for CMR’s metadata libraries to provide ingest, validation, and search support. The libraries also provide format conversions for backward compatibility.

diagram with colored wedges showing how the UMM operates.

Why CMR?

NASA's Earth Science Data Information System (ESDIS) Project is responsible for providing access to Earth Observation data and services to an ever-growing user community including, but not limited to, Earth scientists, educators, other government agencies, decision makers, and the general public. As data archives grow and more data becomes accessible online, cataloging, searching, and extracting relevant data from these archives becomes a critical part of Earth science research.

Formerly, NASA metadata providers had to contend with multiple, disparate systems, each requiring different formats and different mechanisms for submitting and updating data entries. For end users and application developers, this inconsistency reduced the value of the metadata and complicated finding and using Earth science data.

In response to these challenges, NASA set out to design a system that would reduce the burden on metadata providers and improve the metadata quality, consistency, and usability for end users by:

Handling metadata at the concept level, including collections, granules, visualizations, variables, documentation, services, and more.
Managing hundreds of millions of metadata records and making them available through high-performance, standards-based temporal, spatial, and faceted search.
Incorporating both human and machine metadata assessment features that work to ensure the highest quality metadata possible.
Supporting multiple metadata standards using an ever-evolving Unified Metadata Model (UMM).
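To make the temporal and spatial search in the list above concrete, here is a minimal sketch that formats a time range and a bounding box into a granule query URL without sending it. The endpoint root and the `short_name`, `temporal`, and `bounding_box` parameter names (with the west, south, east, north ordering) are assumptions taken from the public CMR Search API documentation. The function name and the short name value are hypothetical placeholders.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

# Assumed root of the public CMR Search API; confirm against the API docs.
CMR_SEARCH_ROOT = "https://cmr.earthdata.nasa.gov/search"


def build_granule_search_url(short_name, start, end, bbox):
    """Build a granule search URL limited by time range and bounding box.

    start and end must be timezone-aware datetimes; bbox is a
    (west, south, east, north) tuple in degrees.
    """
    west, south, east, north = bbox
    if not (-180 <= west <= 180 and -180 <= east <= 180):
        raise ValueError("longitudes must be within [-180, 180]")
    if not (-90 <= south <= north <= 90):
        raise ValueError("latitudes must satisfy -90 <= south <= north <= 90")
    if start > end:
        raise ValueError("start must not be after end")

    stamp = "%Y-%m-%dT%H:%M:%SZ"
    start_utc = start.astimezone(timezone.utc).strftime(stamp)
    end_utc = end.astimezone(timezone.utc).strftime(stamp)
    params = {
        "short_name": short_name,
        "temporal": f"{start_utc},{end_utc}",
        "bounding_box": f"{west},{south},{east},{north}",
    }
    # Keep commas and colons unescaped so the URL stays readable.
    return f"{CMR_SEARCH_ROOT}/granules.json?{urlencode(params, safe=',:')}"


url = build_granule_search_url(
    "EXAMPLE_SHORT_NAME",  # hypothetical collection short name
    datetime(2024, 1, 1, tzinfo=timezone.utc),
    datetime(2024, 1, 31, 23, 59, 59, tzinfo=timezone.utc),
    (-10, 35, 5, 45),
)
print(url)
```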

Benefits of CMR

Unified Repository

CMR provides a unified repository for NASA's Earth science metadata. Providers benefit because they only need to ingest their metadata into one system. Users benefit because they can search one system and retrieve both EOSDIS and IDN data.

Enhanced Performance

Modern Earth science applications strive to provide end users with nearly immediate access and interactivity across massive stores of Earth science data. That data is discovered, navigated, and often interrogated through science metadata. As the range of applications grows and more and more information moves from the underlying science data to metadata, the challenges of navigating even just the metadata increase. CMR is designed to handle hundreds of millions of metadata records and strives to make them available in under a second. CMR is able to do this by using sophisticated data indexing and caching strategies built on top of a high-performance infrastructure.

Quality Assurance

High-performance access to metadata is only part of the problem. To be useful to a broad range of Earth science applications, the metadata must be of high quality, complete, and consistent. CMR incorporates both human and machine metadata assessment features that work to ensure the highest quality metadata possible. During ingest, automated metadata scoring rubrics are applied, giving data providers insight into how to make their data more discoverable or usable by end users. Science coordinators and review teams can review metadata that fails verification or lacks required information to help providers make their metadata more consistent and complete.

Consistent Metadata Representation

CMR's ingest framework supports services which validate distinct metadata standards such as ECHO 10, IDN DIF, and ISO 19115 against a common set of core metadata elements described in UMM.

UMM describes the metadata related to key EOSDIS concepts (such as collection or granule) using UMM metadata “profiles” (such as UMM-C for collection and UMM-G for granule).

CMR provides services for mapping metadata records to and from any of the CMR-supported metadata standards, through UMM. Thus, CMR is able to take metadata records in various source formats and convert non-ISO 19115 metadata into a robust, standards-compliant ISO 19115 representation for interested clients. This can be seen in the diagram below:

For example, for collection metadata records, a UMM-C record may be mapped to any other supported collection metadata standard such as ISO 19115-1, ISO 19115-2, ECHO 10, DIF 9, or DIF 10. Records created using any of those supported standards may be mapped to any other of those supported standards by first mapping to UMM-C.
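The hub-style mapping described above can be sketched in a few lines. This is a toy model, not CMR's translation code: `format_a`, `format_b`, and all field names are hypothetical stand-ins for real standards, and the "UMM-like" dictionary is a simplified placeholder for an actual UMM-C record. The point is that each format needs only one translator into the common model and one out of it, rather than a direct translator for every pair of formats.

```python
# Translators from each (hypothetical) source format into a common,
# UMM-like dictionary. Field names are illustrative only.
TO_COMMON = {
    "format_a": lambda rec: {"ShortName": rec["short_name"],
                             "EntryTitle": rec["title"]},
    "format_b": lambda rec: {"ShortName": rec["Id"],
                             "EntryTitle": rec["Name"]},
}

# Translators from the common model back out to each format.
FROM_COMMON = {
    "format_a": lambda umm: {"short_name": umm["ShortName"],
                             "title": umm["EntryTitle"]},
    "format_b": lambda umm: {"Id": umm["ShortName"],
                             "Name": umm["EntryTitle"]},
}


def translate(record, source, target):
    """Convert a record between formats by passing through the common model."""
    if source not in TO_COMMON or target not in FROM_COMMON:
        raise KeyError(f"unsupported format: {source!r} or {target!r}")
    common = TO_COMMON[source](record)   # source format -> common model
    return FROM_COMMON[target](common)   # common model -> target format


record_a = {"short_name": "EXAMPLE", "title": "Example Collection"}
record_b = translate(record_a, "format_a", "format_b")
print(record_b)
```

With this layout, supporting a new format means adding one entry to each table, and every existing format becomes reachable from it.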

As additional metadata concepts are introduced to CMR, new ingest services will provide verification and search indexing capabilities across diverse metadata such as visualization and variable information.

Getting Started

There is a lot of excellent information available that can help get you off and running.

The CMR Partner Guide offers a wealth of information for users of any level.

The CMR Wiki houses the CMR and UMM administration documents as well as links to CMR related projects.

Developers and interested parties should visit the Earthdata Developer Resource to view:
- General API information
- API documentation
- API reference information
- A Metadata Ingest API Overview
- OpenSearch documentation
- Collection CSW documentation
- UMM-C and UMM-S schema
- Metadata Management Tool User Guide
- Client-Partner User Guide
- Access the CMR Client Developer Forum

Additional Information

Data Use and Citation Policies
Visit this page to get details, with examples, on how to use, cite, or reference NASA's discipline-specific data and services.

EOSDIS Tools and Services
- EOSDIS Tools and Service Portfolio was created to:
  - Evaluate and update the existing inventory of DAAC and EOSDIS tools and services to understand the current portfolio
  - Identify tools and services with apparent overlapping functionality and evaluate to determine what, if any, efficiencies may be gained moving forward
  - Create a framework for a governance process to apply to future tool and service development efforts
- EOSDIS Tools Information
  NASA's Distributed Active Archive Centers (DAACs) provide an extensive variety of tools that make it easy for users to discover, access, manipulate and use EOSDIS data products. Visit this page to learn more about the tools NASA has to offer.

Last Updated

Feb 27, 2024
