End-to-end Learnable Particle Filters and Smoothers
Ali Younis
UCI PhD Student
Abstract: Estimating the temporal state of a system from image sequences is an important task for many vision and robotics applications. A number of classical frameworks for state estimation have been proposed, but these methods often require human experts to specify the system dynamics and measurement model, which forces simplifying assumptions that hurt performance. With the increasing abundance of real-world training data, there is enormous potential to boost accuracy by using deep learning to learn state estimation algorithms, but there are also substantial technical challenges in properly accounting for uncertainty. In this presentation, I will develop end-to-end learnable particle filters and particle smoothers, and show how to bring classic state estimation methods into the age of deep learning. We first create an end-to-end learnable particle filter that uses flexible neural networks to propagate multimodal, particle-based representations of state uncertainty. Our gradient estimators are unbiased and have substantially lower variance than those of existing differentiable (but biased) particle filters. We apply our end-to-end learnable particle filter to the difficult task of visual localization in unknown environments, and show large improvements over prior work. We then extend our particle filtering method to create the first end-to-end learnable particle smoother, which incorporates information from future as well as past observations, and apply this particle smoother to the real-world task of city-scale geo-localization using camera and planimetric map data. We compare against state-of-the-art baselines for visual geo-localization, and again show superior performance.
Bio: Ali Younis is a final-year PhD student in Computer Science at UCI, advised by Prof. Erik Sudderth. He previously completed his bachelor’s and master’s degrees at UCI and briefly worked on spacecraft systems before returning for a PhD. He is broadly interested in particle-based belief propagation methods for time-varying systems, with applications in computer vision.