There's a well-known parable from India about a group of blind men who come across an elephant. They have never seen an elephant, and each of them feels a different part of the animal: one feels the tusk, one the ear, one the tail, and one the leg. When they describe to one another what they are feeling, their descriptions of the creature in front of them differ drastically.
“Studying the atmosphere is like the story of the blind men and the elephant,” said Dr. Robert Levy, a Research Physical Scientist in the Climate and Radiation Laboratory at NASA's Goddard Space Flight Center. “A lot of researchers spend their time analyzing data from just one satellite instrument, but different instruments have their own strengths and weaknesses. It is only when we work together that we can reconstruct the entire elephant.”
Recently, researchers have started combining the data gathered by different satellite instruments in order to improve the accuracy of their data and gain insights that would be difficult to achieve with one instrument alone.
Terra Fusion, a new dataset and toolkit funded by NASA, takes the original radiance measurements from five instruments on the Terra satellite and brings them together into a common format and structure so that scientists can use them as one temporally and spatially consistent dataset. Led by Professor Larry Di Girolamo at the University of Illinois, the Terra Fusion team developed a toolkit of open-source code that researchers can use to resample and reproject radiance data from one Terra instrument onto another, or onto any user-specified grid and map projection, to produce a customized fusion of different instruments' data.
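To give a sense of what resampling one instrument's data onto another grid involves, here is a minimal Python sketch using nearest-neighbor matching. It is not the Terra Fusion toolkit's code or API: the function names, the tiny grids, and the radiance values are all made up for illustration, and the brute-force search would be far too slow for real satellite swaths.

```python
import numpy as np

def to_unit_vectors(lat_deg, lon_deg):
    """Convert latitude/longitude in degrees to 3-D unit vectors on a sphere."""
    lat = np.radians(np.asarray(lat_deg, dtype=float)).ravel()
    lon = np.radians(np.asarray(lon_deg, dtype=float)).ravel()
    return np.column_stack((np.cos(lat) * np.cos(lon),
                            np.cos(lat) * np.sin(lon),
                            np.sin(lat)))

def resample_nearest(src_lat, src_lon, src_values, dst_lat, dst_lon):
    """Give each target pixel the value of the closest source pixel.

    Hypothetical helper: compares every target pixel with every source
    pixel, so it only suits small arrays.
    """
    src_xyz = to_unit_vectors(src_lat, src_lon)
    dst_xyz = to_unit_vectors(dst_lat, dst_lon)
    # For unit vectors, the largest dot product marks the smallest
    # great-circle distance.
    nearest = np.argmax(dst_xyz @ src_xyz.T, axis=1)
    values = np.asarray(src_values, dtype=float).ravel()
    return values[nearest].reshape(np.shape(dst_lat))

# Made-up radiances on a coarse 2 x 2 source grid.
src_lat = np.array([[10.0, 10.0], [11.0, 11.0]])
src_lon = np.array([[20.0, 21.0], [20.0, 21.0]])
src_rad = np.array([[1.0, 2.0], [3.0, 4.0]])

# A finer 4 x 4 target grid covering the same area.
dst_lat, dst_lon = np.meshgrid(np.linspace(10.0, 11.0, 4),
                               np.linspace(20.0, 21.0, 4), indexing="ij")

fused = resample_nearest(src_lat, src_lon, src_rad, dst_lat, dst_lon)
print(fused)
```

Each pixel of the finer grid simply inherits the radiance of the nearest coarse pixel. A production tool would also have to handle map projections, differing pixel footprints, and missing data, and it would use a spatial index rather than an all-pairs comparison.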
The Trouble With Fusion
The motivation for developing Terra Fusion has existed since the beginning of the Terra mission. Part of it came from a research project by one of Di Girolamo's students, Dongwei Fu, who fused data from two instruments onboard Terra, the Moderate Resolution Imaging Spectroradiometer (MODIS) and the Multi-angle Imaging SpectroRadiometer (MISR), to better derive cloud properties.
Clouds are one of the largest sources of uncertainty in climate modeling, and Dongwei Fu was interested in how fusing eight years of MODIS and MISR data could improve estimates of cloud droplet size. According to the World Meteorological Organization, droplet size is an essential variable in climate science for determining how much solar radiation clouds reflect back to space. For the same total amount of water in a cloud, smaller droplets have more surface area relative to their volume, so the cloud reflects more light (it has a higher albedo).
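A rough calculation shows why droplet size matters so much. A common textbook approximation gives a liquid cloud's visible optical depth as 3 × LWP / (2 × ρ × r), where LWP is the column of liquid water, ρ is the density of water, and r is the effective droplet radius. The short Python sketch below applies that approximation; the function name and the input values are illustrative assumptions and are not taken from Fu's study.

```python
RHO_WATER = 1000.0  # density of liquid water, kg/m^3

def cloud_optical_depth(liquid_water_path, droplet_radius):
    """Approximate visible optical depth of a liquid-water cloud.

    liquid_water_path: column liquid water, kg/m^2
    droplet_radius: effective droplet radius, m
    Assumes an extinction efficiency of 2, which is reasonable when
    droplets are much larger than the wavelength of visible light.
    """
    return 3.0 * liquid_water_path / (2.0 * RHO_WATER * droplet_radius)

lwp = 0.1  # 100 g/m^2, an illustrative value
for radius_um in (8.0, 10.0, 16.0):
    tau = cloud_optical_depth(lwp, radius_um * 1e-6)
    print(f"radius {radius_um:4.1f} um -> optical depth {tau:.2f}")
```

With the water amount held fixed, optical depth scales as one over the droplet radius, so halving the radius doubles the optical depth, and an optically thicker cloud reflects more sunlight. An error in retrieved droplet size therefore feeds directly into how bright a model or retrieval expects a cloud to be.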
Satellite-based estimates of cloud droplet size are often much larger than observations made by aircraft, suggesting that the interpretation of satellite data may need to be improved. By fusing data from MODIS with MISR data, Dongwei Fu found that assumptions often made in processing MODIS data introduce regional biases, which resulted in reported cloud droplet sizes 15–60% larger than those found with the fusion. The combined results are more in line with aircraft observations. Improved estimates of cloud droplet size can help weather and climate models represent cloud behavior more accurately.
But this research was computationally intensive and time-consuming. “Just downloading the satellite data we needed for that project took six months,” said Di Girolamo. “And MODIS and MISR are at slightly different resolutions and on different grids.” The team then spent another couple of months processing and reprojecting the data to fuse these datasets.
Terra Fusion's advanced toolkit was developed so that researchers can quickly create their own fusions without having to spend the time and energy working out the fusion process from scratch. “If there are a thousand people wishing to fuse multiple datasets together from Terra, that's a thousand people wishing to solve the same problem. What we did is develop a toolkit, the software, that does that for them. If investigators want to fuse the instrument data in a different way, they can use our open source toolkit to contribute to the code and share with others,” said Di Girolamo.
Data fusion may also facilitate new research into air pollution, smoke from wildfires, clouds and aerosols, ocean biology, agriculture and land use, vegetation dynamics, hydrology, the Earth's radiation budget, and other Earth science fields that have traditionally used Terra data.
Fusion in the Cloud
Terra Fusion was developed under NASA's Earth Science Data System ACCESS Program, which supports and implements technologies that help people use NASA's archive of Earth observations for scientific research and other applications.
Terra's five instruments collect data that are processed into almost 2 terabytes of data products per day and, over 20 years, have provided over 2 petabytes of data on the Earth's systems. Many researchers are constrained by how much data they can download and process, which is why they are turning to the cloud to do their analyses. Di Girolamo and his team have developed customizable code so that researchers can analyze Terra Fusion data in the cloud.
“The exciting thing for me is that Terra Fusion is more in line with how the science community operates today, using large amounts of data,” said Kurt Thome, Terra Project Scientist at Goddard, who wrote a letter of support for the project. “It's developed with the next generation of research in mind.”
As part of the Space Act Agreement, a new public-private partnership between NASA and Amazon Web Services makes several large datasets publicly available in the cloud. Through this agreement, Amazon Web Services is hosting a subset of Terra Fusion data, from 2000 to 2015, for researchers to use.
Further documentation on the Terra Data Fusion project can be found on the ACCESS to Terra Data Fusion Products page. The Advanced Fusion toolkit and other code related to the project can be found on the Terra Fusion GitHub site.