Extreme changes in climate, such as heatwaves, can have devastating effects on the environment, ecosystems, and humans. Imagine if we knew when a heatwave was approaching, and why. This is not possible today, but it’s a focus of climate research, where observations of the past can inform predictions of future occurrences. This post introduces our collaboration with NASA and three students from the Harvard Extension School’s master’s degree program in Data Science, and their capstone project to discover, model, and project patterns of extreme heat. In their Data Science Capstone course, they applied what they had learned to climate models from the NASA Goddard Institute for Space Studies, creating a data pipeline and a deep learning model that identifies patterns of extreme heat in an area and predicts its occurrence.
Building on the Project 15 Open Platform
This project was built for the Project 15 platform, an open platform designed for conservation and ecological sustainability solutions. Project 15’s goal is to bring the latest Microsoft cloud and Internet of Things (IoT) technologies to scientific teams so they can build solutions faster and boost innovation. Building a reusable machine learning model to identify trends in heatwaves on the Project 15 platform allows scientists to focus on studying the climate’s impact instead of building a model from scratch, reducing the time it takes to start generating crucial insights.
Data Processing with Planetary Computer
Climate research often relies on climate models, which build simulations and projections of the climate and aim to become as accurate as possible with each iteration or version of the model. These models, which are large sets of mathematical equations, are first validated against existing observations and then used to forecast the climate into the future. NASA’s Goddard Institute for Space Studies builds and maintains its own climate model, named ModelE, which produces climate simulation projections and generates data based on those projections. Each model iteration creates a new dataset, which is stored in a database named the Coupled Model Intercomparison Project, or CMIP. NASA provided the most recent CMIP model outputs through Microsoft Planetary Computer, a platform for leveraging the cloud in environmental sustainability and Earth science. The platform contains petabytes of geospatial and environmental data, along with a Hub that gave the capstone team convenient computing on that data.
Identifying Extreme Temperatures with Azure Machine Learning
One of the challenges in climate research that the Harvard team initially faced is that there isn’t a standardized definition of a heatwave. In fact, the World Meteorological Organization, the American Meteorological Society, and the National Oceanic and Atmospheric Administration each have their own definition. The team decided to take all three definitions into consideration and built an automated data processing pipeline that identifies areas of extreme heat based on the user’s input. As an example, the team’s report states that a researcher could request a dataset that has “… occurrences of extreme heat events in the output of model Ssp45, for the region of the Pacific Northwest U.S., according to the heatwave definition: temperature is 5 K above the 30-year average for 5 consecutive days or more.” With this output, the team was also able to integrate visualizations of heat events into the pipeline and train a Convolutional Neural Network (CNN) model in Azure Machine Learning to visually identify heatwaves. Next, they used the model output to build a visualization that draws a shape around the regions experiencing extreme heat events. Finally, the team added functionality that lets future teams use the pipeline to explore extreme temperatures in either direction, hot or cold.
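To make the example definition above concrete, here is a minimal sketch of how a rule like “5 K above the 30-year average for 5 consecutive days or more” could be applied to a single location’s daily temperature series. This is not the team’s pipeline code. The function name `find_heat_events`, its parameters, and the sample values are all hypothetical, and it assumes the 30-year average has already been computed and that “5 K above” means at least 5 K above.

```python
from typing import List, Tuple


def find_heat_events(
    daily_temps_k: List[float],
    baseline_k: float,
    threshold_k: float = 5.0,
    min_days: int = 5,
) -> List[Tuple[int, int]]:
    """Return (start_index, length_in_days) for each qualifying hot spell.

    Hypothetical helper: a day counts as hot when its temperature is at
    least threshold_k kelvin above baseline_k (assumed to be a
    precomputed long-term average for the location).
    """
    events: List[Tuple[int, int]] = []
    run_start = None  # index where the current hot run began, if any

    for i, temp in enumerate(daily_temps_k):
        is_hot = (temp - baseline_k) >= threshold_k
        if is_hot and run_start is None:
            run_start = i  # a new hot run begins
        elif not is_hot and run_start is not None:
            # The run just ended; keep it only if it lasted long enough.
            if i - run_start >= min_days:
                events.append((run_start, i - run_start))
            run_start = None

    # Handle a hot run that continues through the last day of the series.
    if run_start is not None and len(daily_temps_k) - run_start >= min_days:
        events.append((run_start, len(daily_temps_k) - run_start))

    return events


# Illustrative values only (kelvin), not real model output.
sample = [291.0, 296.0, 297.0, 296.0, 298.0, 296.0, 292.0, 296.0, 296.0, 290.0]
print(find_heat_events(sample, baseline_k=290.0))
```

With a baseline of 290 K, days 1 through 5 of the sample are at least 5 K warmer than the baseline, so the sketch reports one event starting at index 1 and lasting 5 days; the later two-day warm spell is too short to count. Keeping the threshold and minimum duration as parameters mirrors the idea of letting a user choose among heatwave definitions, and reversing the comparison would give a cold-spell variant.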
Results
With Azure and Planetary Computer, the Harvard team was able to use NASA’s climate data to achieve their goal of identifying and predicting extreme climate occurrences, and to package their work into a robust and reusable pipeline through the Project 15 platform. NASA plans to build on the team’s work to further support climate research.
Learn More
Visit the team’s GitHub repository to see their work. If you’re interested in making your own contributions to Earth and wildlife science, be sure to visit Project 15 and AI for Earth, where funding may be available to support projects through their grant program. To explore the tools and services used, visit Planetary Computer and explore the data catalog. Learn more about Azure Machine Learning and try it yourself by training a CNN in a notebook or creating a classification model with no code in these guided Learn Modules.