
Back in 2016, Chancellor’s Professor of Computer Science Padhraic Smyth was principal investigator of a multidisciplinary team awarded a five-year, $3 million grant from the National Science Foundation (NSF) through its National Research Traineeship (NRT) program. The funds were used to develop UCI’s Machine Learning and Physical Sciences (MAPS) program, which trains and supports graduate students from both the Donald Bren School of Information and Computer Sciences (ICS) and the School of Physical Sciences. The goal is to prepare students working at the intersection of machine learning and the physical sciences to realize the potential of today’s massive scientific data sets while tackling real-world problems.

Three years in, the MAPS program has already supported 25 students through a variety of fellowships, including computer science Ph.D. student Casey Graff and Earth System Science (ESS) Ph.D. student Shane Coffield. “By providing opportunities for computer science graduate students to learn about research opportunities in the physical sciences… interdisciplinary research projects naturally emerge,” says Graff, who is working with Coffield to build models that can forecast the movement of wildfires.

The project is a collaboration with Smyth from ICS, Chancellor’s Professor Jim Randerson and Assistant Researcher Yang Chen from ESS, and Distinguished Professor of Civil and Environmental Engineering Efi Foufoula-Georgiou from the Samueli School of Engineering. “It’s an interesting project,” says Smyth. “We’re trying to predict how large a fire will get given the initial conditions.”

Leveraging Satellite Data
Using a NASA data set of global satellite images captured daily over the past 20 years, the team is applying machine learning techniques to better understand fires’ heat signatures and identify predictors of a fire’s growth and movement. This involves combining the heat signatures with other types of data, such as temperature, humidity, moisture levels and vegetation.

The project involves combining heat signatures with other types of data, including topography, land cover and temperature data.

“A lot of the work involves putting together the satellite data with weather data and land cover data, and getting it all registered,” says Smyth. “The students had to get everything lined up in space and time to then see if the machine learning model could predict how large the fire would be in three days, five days, and so on.”
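
For readers curious what getting data "lined up in space and time" can look like, here is a minimal sketch. It is not the team's actual pipeline. It assumes each record carries a latitude, longitude and date, and it snaps records to a coarse grid so that fire detections and weather readings from the same cell and day can be joined. The field names, the 0.5-degree cell size and the sample values are all hypothetical, chosen only for illustration.

```python
import math
from datetime import date


def grid_cell(lat, lon, cell_deg=0.5):
    """Return the index of the grid cell that contains a coordinate."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def register(detections, weather, cell_deg=0.5):
    """Join fire detections to weather records sharing a grid cell and day."""
    # Index weather records by (cell, day) for constant-time lookup.
    weather_index = {
        (grid_cell(w["lat"], w["lon"], cell_deg), w["day"]): w
        for w in weather
    }
    rows = []
    for d in detections:
        key = (grid_cell(d["lat"], d["lon"], cell_deg), d["day"])
        w = weather_index.get(key)
        if w is None:
            continue  # no weather record for this cell and day
        rows.append({
            "cell": key[0],
            "day": d["day"],
            "heat": d["heat"],
            "temp_c": w["temp_c"],
            "humidity": w["humidity"],
        })
    return rows


# Made-up sample records (hypothetical values, not real observations).
detections = [
    {"lat": 64.82, "lon": -147.71, "day": date(2015, 6, 20), "heat": 41.5},
    {"lat": 61.20, "lon": -149.90, "day": date(2015, 6, 20), "heat": 12.0},
]
weather = [
    {"lat": 64.75, "lon": -147.60, "day": date(2015, 6, 20),
     "temp_c": 27.0, "humidity": 0.31},
]

aligned = register(detections, weather)
```

Each row in `aligned` pairs a heat reading with the weather in the same cell on the same day, which is the kind of table a prediction model could then be trained on. Real registration is harder, since satellite and weather products typically differ in resolution and map projection.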

The team is currently looking at Alaska, but they also hope to apply their models to areas such as the Amazon, Africa and Indonesia, where large fires often burn uncontained. Additionally, they’re focusing on downstream smoke, so populated areas can be warned of potentially poor air quality. “Right now, the models that they use for smoke prediction are very simple,” says Smyth. “We’re trying to replace that with something more sophisticated.”

He adds that these are the kinds of projects that really interest students. “They get to work with some of the world’s experts and with some really cool data sets, and they get to make predictions that are actually very useful to broad segments of society.”

Indeed, Graff notes how improving awareness of and response to wildfires can “lead to more effective uses of state fire resources, improve quality of life for people affected by fire emissions, and help responders better combat active fires.”

Advancing Machine Learning and Data Science
More broadly speaking, the findings from this work could help with a variety of other projects, particularly given the realities of global warming. “Climate change has widespread effects on our environment, many of which we still do not fully appreciate or understand,” says Graff. “Employing the use of experts from different fields and specialties is essential to properly understand the problem, and as the scale of the data and the scope of the problems grow, the role of algorithms to perform processing and develop insights will grow too.”

In other words, in building models for wildfire prediction, this work is also advancing machine learning more generally. According to Smyth, it has brought up “very interesting machine learning challenges for students to work on in depth, generating new research directions that are more abstract and that have potentially broader applications beyond these specific problems.”

MAPS students are thus making discoveries that could have a broad impact in the world of data science. “Our ability to capture data has grown tremendously in the last two decades, and it is common for scientific problems to involve Gigabyte, Terabyte and even Petabyte sized data sets,” explains Graff. “As the size of the data grow, powerful machine learning techniques show great benefits and are often required to properly understand the observed phenomena.”

Smyth makes the point that “there’s a lot of data out there — we need more people bringing it together and using it.”

Shani Murray