Machine Learning and Statistics Collaboration Leads to Outstanding Student Paper at AIStats
Around 1,800 papers were submitted to the 27th International Conference on AI and Statistics (AIStats), an interdisciplinary gathering of researchers at the intersection of computer science, artificial intelligence, machine learning, statistics and related areas. Of those submissions, only seven were highlighted as Outstanding Student Papers — one of which was “Functional Flow Matching,” by UC Irvine students Gavin Kerrigan and Giosuè Migliorini, and Distinguished Professor of Computer Science and Statistics Padhraic Smyth.
“The AI and Statistics conference is one of the leading international conferences in AI and machine learning,” says Smyth, “with a significant emphasis on research at the intersection of machine learning and statistics.”
Kerrigan is a computer science Ph.D. student and Migliorini is a statistics Ph.D. student, both working in Smyth’s DataLab research group in the Donald Bren School of Information and Computer Sciences (ICS). Kerrigan attended the AIStats conference, held May 2–4, 2024, in Valencia, Spain, to accept the award and present the team’s research.
The paper describes the mathematical and algorithmic foundations of a new technique called “functional flow matching” for generative AI modeling of a variety of data types, including time-series data and images.
“To store data on a computer, some kind of finite, discrete representation is needed. For example, an image is saved as a collection of pixels,” explains Kerrigan, noting that the real world is not made of pixels. Rather, it consists of continuous underlying signals. “This motivated us to look at methods for modeling data which are invariant to the discrete representation used, in the sense that changing the representation of the data would not cause our model to err,” he says. “In practice, this allows us to work with data at an arbitrary resolution.”
The team draws on methods from both statistics and computer science in developing the new technique. “The goal of the methodology is to bridge the field of functional data analysis with recent advances in generative deep learning,” says Migliorini. “It finds a natural application in time series data, where different observations may occur at different points in time, and in generating solutions to partial differential equations.” The technique thus has potential applications in simulation and prediction, in fields ranging from weather and climate forecasting to biomedical sciences and healthcare.
— Shani Murray