Prithvi-weather-climate: Advancing Our Understanding of the Atmosphere

An in-depth look at the development, attributes, and (many) benefits of this foundation model for applying artificial intelligence (AI) to weather and climate.

Prithvi-weather-climate (Prithvi-WxC) is a weather and climate foundation model (FM) pre-trained on Modern-Era Retrospective analysis for Research and Applications, Version 2 (MERRA-2) data from NASA's Global Modeling and Assimilation Office. It was pre-trained to replicate atmospheric dynamics while remaining able to handle missing information. Named for the Sanskrit word for Earth, the FM was developed as part of an ongoing collaboration involving NASA and IBM Research. This application of artificial intelligence (AI) to NASA weather and climate data has the potential not only to improve public safety but also to expand our understanding of meteorological processes.

Both meteorology and climate science have long been the domain of numerical simulation. That is, researchers carefully encoded the laws of atmospheric physics and then used supercomputers to simulate how weather and climate would evolve. During the last 24 months, the research community has realized that it is possible to use a fully data-driven process in this context.

Prithvi-WxC offers precision, swift fine-tuning, and data efficiency. The FM can be fine-tuned for a wide range of use cases in locations around the world, using different datasets and covering different prediction timescales. It is the first FM for weather and climate to scale to both global and regional areas without tiling or compromising on resolution. The model has been fine-tuned to increase the resolution of long-term climate models by a factor of 12 through a process called downscaling. This AI-based approach significantly reduces the costs associated with traditional high-performance computing methods. Additionally, the model enhances the representation of small-scale physical processes in numerical weather and climate models. By inserting tokens at wind turbine locations, Prithvi-WxC can also generate hyper-localized forecasts, improving the accuracy of short- to medium-range predictions.
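To make the factor of 12 concrete, the minimal Python sketch below upsamples a coarse grid by 12 in each horizontal dimension using nearest-neighbor repetition. It is a naive baseline for illustration only, not the fine-tuned model's method. The grid shape, the function name, and the assumption that the factor applies to each dimension are all hypothetical.

```python
import numpy as np

FACTOR = 12  # resolution increase quoted in the article; assumed here to apply per dimension


def naive_upsample(field: np.ndarray, factor: int = FACTOR) -> np.ndarray:
    """Repeat every coarse cell factor x factor times (nearest-neighbor baseline)."""
    return np.repeat(np.repeat(field, factor, axis=0), factor, axis=1)


coarse = np.arange(6, dtype=float).reshape(2, 3)  # hypothetical 2 x 3 coarse grid
fine = naive_upsample(coarse)
print(coarse.shape, "->", fine.shape)  # (2, 3) -> (24, 36)
```

A learned downscaling model would replace the repeated values with physically plausible fine-scale detail. The sketch only shows how quickly the number of grid cells grows: 144 fine cells for every coarse cell under this assumption.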

The video below shows a track and intensity (Mean Sea Level Pressure [MSLP] and 2-meter sustained wind speed) prediction of Hurricane Ian in September 2022 generated by the Prithvi foundation model. Hurricane Ian is the small blue dot moving past the west coast of Cuba and making landfall in southwest Florida. The blue color indicates a concentrated area of low pressure. Credit: Prithvi-WxC team.

Video file

Prithvi-WxC features a flexible architecture, a scalable hierarchical attention mechanism, and task-independent pretraining. It has 320 million (M) parameters: 220 M in the encoder and 100 M in the decoder. The FM implements a hierarchical two-dimensional vision transformer architecture that scales to large token counts. Thanks to its flexible architecture, Prithvi-WxC can capture the complex dynamics of atmospheric physics even with incomplete data, scaling effectively to both global and regional areas without compromising resolution. It accelerates the generation of regional climate projections by three to four orders of magnitude.
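The idea of a vision transformer working with incomplete data can be sketched generically: a two-dimensional field is cut into patches (tokens), and a random subset of tokens is hidden so that a model must reconstruct them from the rest. The Python sketch below illustrates only that tokenize-and-mask step. The grid size, patch size, masking ratio, and function names are hypothetical, and this is not the Prithvi-WxC implementation.

```python
import numpy as np


def patchify(field: np.ndarray, patch: int) -> np.ndarray:
    """Split an (H, W) field into (num_tokens, patch * patch) flattened patches."""
    h, w = field.shape
    assert h % patch == 0 and w % patch == 0, "grid must divide evenly into patches"
    tokens = field.reshape(h // patch, patch, w // patch, patch)
    tokens = tokens.transpose(0, 2, 1, 3)  # bring the two within-patch axes together
    return tokens.reshape(-1, patch * patch)


def random_mask(num_tokens: int, ratio: float, rng: np.random.Generator) -> np.ndarray:
    """Return a boolean array in which True marks a token hidden from the model."""
    hidden = rng.choice(num_tokens, size=int(num_tokens * ratio), replace=False)
    mask = np.zeros(num_tokens, dtype=bool)
    mask[hidden] = True
    return mask


rng = np.random.default_rng(0)
field = rng.standard_normal((32, 64))  # hypothetical single-variable grid
tokens = patchify(field, patch=8)      # 4 x 8 = 32 tokens of 64 values each
mask = random_mask(len(tokens), ratio=0.5, rng=rng)
visible = tokens[~mask]                # the part an encoder would actually see
print(tokens.shape, visible.shape)     # (32, 64) (16, 64)
```

Because the number of tokens grows with the grid area, attention schemes that scale to large token counts matter for global fields at full resolution.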

The application of AI to weather and climate data also enhances public safety. The research team is developing uses for Prithvi-WxC such as more accurate hurricane track and intensity forecasts and better seasonal precipitation forecasting. The Prithvi-WxC team created a versatile AI model that can be applied in multiple contexts, taking advantage of the efficiency gained by moving from simulation to AI. The model not only promises economic benefits but also has the potential to unlock new use cases. Prithvi-WxC lets researchers support a wide range of climate applications within the scientific community, including detecting and predicting severe weather and natural disasters, creating targeted forecasts from localized observations, enhancing global climate simulations down to regional levels, and improving the representation of physical processes in weather and climate models.

A Glimpse into the Development Process

Before development of Prithvi-WxC began, it was crucial to thoroughly evaluate the project's feasibility, including the model's potential value and impact within the scientific community. A diverse team was assembled with the wide range of expertise necessary for effective model development and evaluation. A workshop for FM subject matter experts, AI experts, and high-performance computing (HPC) experts was held in September 2023 at NASA's Marshall Space Flight Center in Huntsville, AL. Sessions covered the concept of AI foundation models, their potential value in scientific endeavors, available infrastructure resources, and an initial list of science use cases. The team aimed to make the FM flexible for grid or no-grid scenarios and developed cases to test whether the model was able to understand physics-based processes.

The team began by thinking about scales. Atmospheric phenomena span a broad range, from turbulence, which plays out on short spatial and temporal scales, to long-term climate effects that play out over decades. The team therefore organized the potential use cases according to their spatial and temporal scales and set a cutoff. Their ambition was roughly to cover everything from 3 km to 60 km. While ambitious, this design constraint still imposed clear limitations on the model.

The team explored potential AI model architectures and the processes for designing, building, testing, and deploying Prithvi-WxC. They tried out-of-the-box architectures as well as specialized architectures, such as spherical operators with a recurrent neural network (RNN) as a backbone. A foundation model should remain adaptable unless the plan is to develop numerous models for various datasets and spatial coverages. These considerations stemmed from discussions with subject matter experts and imposed constraints on the AI architecture.

The team selected some architecture candidates and went through a process of rapid experimentation and iteration, imposing time and resource limits on each experiment. For example, one experiment was constrained to a maximum of 16 graphics processing units (GPUs) for 24 hours. Using this method, they were able to design short experiments with limited datasets that they could scale later. They validated the experiments from physics-based viewpoints and got feedback from subject matter experts. The team also removed the constraint on grid size and broadened the overall goal for the model training.
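A cap such as 16 GPUs for 24 hours amounts to a fixed compute budget per experiment. As a minimal sketch of how such a cap could be expressed and checked, the Python snippet below uses a hypothetical `ExperimentBudget` class; the class, its fields, and the proposed runs are illustrative assumptions and do not describe the team's actual tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentBudget:
    """Hypothetical per-experiment cap on GPUs and wall-clock hours."""
    max_gpus: int
    max_hours: int

    @property
    def gpu_hours(self) -> int:
        # Total compute available under the cap
        return self.max_gpus * self.max_hours

    def allows(self, gpus: int, hours: int) -> bool:
        # A run fits only if it respects both limits, not just the product
        return gpus <= self.max_gpus and hours <= self.max_hours


budget = ExperimentBudget(max_gpus=16, max_hours=24)  # figures quoted in the article
print(budget.gpu_hours)       # 384
print(budget.allows(8, 24))   # True: a smaller run fits
print(budget.allows(32, 12))  # False: same GPU-hours, but too many GPUs
```

Checking each limit separately, rather than only the GPU-hour product, reflects a cap stated as "at most 16 GPUs for 24 hours".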

It Takes a Team

Prithvi-WxC is the product of a diverse team of individuals. Two key members of the team are Dr. Sujit Roy with NASA's Interagency Implementation and Advanced Concepts Team (IMPACT) and Dr. Johannes Schmude from IBM Research. Describing the team as a whole, Roy commented, "In this project, we've had a highly productive and skilled team, fostering a healthy, open, and critical discourse. This dynamic has allowed us to progress rapidly, making the project truly enjoyable."

Roy earned his PhD in computer science from Ulster University in collaboration with the Indian Institute of Technology Kanpur, where his research focused on magnetoencephalography/electroencephalography (MEG/EEG) signal processing and classification and the real-time control of a hand exoskeleton using EEG/MEG. Following his PhD, he worked as a machine learning engineer at the University of Manchester on an explainable AI project. He also won a place in the United Kingdom (UK) Lean Launch Programme with his idea for Turtle.AI, a customizable cloud-based AI toolkit with advanced machine learning models that produces explainable and unbiased results. By providing justification for decisions, the toolkit is meant to enhance trust in decision-making in sectors like healthcare and finance. Roy joined NASA IMPACT in July 2022 as a computer scientist and currently serves as the team's lead AI researcher, leading all FM and advanced AI efforts.

While a student, Schmude was primarily interested in concise mathematical descriptions of physical theories. This led him to academia and the more formal end of theoretical physics. Since leaving academia and joining IBM, he has realized that it is just as much fun to work on the opposite end of the spectrum in a data-driven field governed by experimentation. "Whether you run AI experiments or do long pen and paper calculations, you need to be able to evaluate approaches and results critically," Schmude said. "There is a saying in theoretical physics that you should never make a calculation before you know the answer. You could say similar things in AI. Don't kick off a complex experiment without having a good rationale for doing so."

Dr. Sujit Roy of NASA IMPACT (left image) and Dr. Johannes Schmude of IBM Research (right image) played key roles in developing Prithvi-WxC. Images courtesy of Roy and Schmude.

An Open Future

The development team has trained an initial version of the Prithvi-WxC FM and has validated its application across several use cases. So far, the model has been applied to gravity wave parameterization, climate model downscaling, hurricane track identification, and localized renewable energy forecasting. A thorough evaluation of its performance in each of these tasks continues. Once these assessments are complete, the model's architecture and performance will be enhanced.

The development of foundation models like Prithvi-WxC, with its open-source contributions and collaborative efforts, marks a significant step forward in leveraging AI for atmospheric science. The model is part of a collection of models openly available on Hugging Face. This initiative not only enhances research but also fosters open science through cross-disciplinary collaborations that drive innovation and progress in addressing critical global challenges.
