End-to-End Machine Learning with High Performance and Cloud Computing Tutorial at IGARSS 2022

IMPACT members led a workshop on best practices for training machine learning models on high performance computing workstations and using cloud technologies to infer from the trained models.

Recently, IMPACT team members Muthukumarann Ramasubramanian (Kumar), Iksha Gurung, and Dr. Manil Maskey, alongside Dr. Gabriele Cavallaro and Rocco Sedona, conducted a tutorial on End-to-End Machine Learning with High Performance and Cloud Computing at IGARSS (International Geoscience and Remote Sensing Symposium) 2022. The goal of the workshop was to educate scientists on best practices for training their ML (machine learning) models on HPC (high performance computing) workstations and using cloud technologies to infer from the trained models. The IMPACT team used pixel-level smoke detection from satellite images as the use case for the tutorial.

This tutorial is relevant to the IGARSS community because Earth science researchers generally end their machine learning modeling life cycle at the testing phase, in which they report model performance metrics in publications. This workshop highlights how they can use cloud computing to move on to the next phase: large-scale, on-demand inference with the trained machine learning models. Additionally, the tutorial presents a blueprint for researchers on the use of HPC clusters provided by universities and introduces the networking and cloud computing concepts required to deploy a machine learning model at scale, making it readily available for inference; that is, for making predictions based on new data.

Iksha and Kumar running the interactive tutorial on end-to-end machine learning using high performance and cloud computing

Researchers in other fields could benefit from this tutorial as well. Wildfires happen every year, resulting in massive smoke events. Results from a trained model could be used to identify regions affected by smoke, thereby aiding recovery operations.

Iksha's work on the type of pixel-level detection covered by this tutorial is driven by his interest in machine learning, cloud computing, and Earth science. Asked what sparked his interest in the topic, Kumar observed:

"Cloud computing is quickly emerging as an integral part of every data workflow, from storage to computing, predominantly because of ease of scalability and faster innovation."

The tutorial is based on GOES-16 (Geostationary Operational Environmental Satellite) data using bands 1–6 and includes labels curated by subject matter experts, which are available as shapefiles. Other topics covered in the tutorial include the use of HPC clusters for training, including simplified processes to submit and monitor ML training and validation jobs and easy scaling of training to multiple GPUs (graphics processing units). Also covered is the cloud deployment pipeline that serves the trained model for real-time inference through APIs (application programming interfaces).
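
To give a sense of what "pixel-level" labels mean in practice, the sketch below turns a polygon outline into a per-pixel mask, which is the kind of target a segmentation model is trained against. It is a minimal illustration only and not the tutorial's own code: the function names, the tiny 4 x 5 grid, and the polygon coordinates are all hypothetical, and the outline is assumed to be already expressed in pixel coordinates. Real shapefile labels would first need to be read and reprojected onto the satellite image grid with geospatial tooling.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if the point (x, y) lies inside the polygon."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            # x coordinate at which the edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def rasterize_polygon(polygon, height, width):
    """Return a height x width mask (1 = smoke, 0 = background).

    Each pixel is classified by testing its centre point.
    """
    return [
        [1 if point_in_polygon(col + 0.5, row + 0.5, polygon) else 0
         for col in range(width)]
        for row in range(height)
    ]


# Hypothetical smoke outline in pixel coordinates (x, y); not real label data.
smoke_polygon = [(1, 1), (4, 1), (4, 3), (1, 3)]
mask = rasterize_polygon(smoke_polygon, height=4, width=5)

for row in mask:
    print(row)
```

A mask like this, paired with the six input bands for the same scene, would form one training example; the model then predicts a value for every pixel rather than a single label for the whole image.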

As part of the tutorial, Dr. Cavallaro provided an overview of HPC and Dr. Maskey provided an overview of cloud computing.

The IMPACT team is grateful to the IEEE Geoscience and Remote Sensing Society Technical Committee on Earth Science Informatics and the Jülich Supercomputing Centre (JSC) for technically sponsoring the tutorial and providing HPC resources, respectively.

Learn more about this tutorial.

Notebooks and shapefiles for this tutorial are available on GitHub.

Access tutorial slides.

More information about IMPACT can be found at NASA Earthdata and the IMPACT project website.
