ICS 269, Spring 2022: Theory Seminar
Bren Hall 1423, 1:00 – 1:50


27 May 2022: Ryuto Kitagawa

Straggler identification in round-trip data streams via Newton’s identities and invertible Bloom filters

We study the straggler identification problem, in which an algorithm must determine the identities of the remaining members of a set after it has had a large number of insertion and deletion operations performed on it, and now has relatively few remaining members. The goal is to do this in o(n) space, where n is the total number of identities. Straggler identification has applications, for example, in determining the unacknowledged packets in a high-bandwidth multicast data stream. We provide a deterministic solution to the straggler identification problem that uses only O(dlogn) bits, based on a novel application of Newton’s identities for symmetric polynomials. This solution can identify any subset of d stragglers from a set of n O(logn)-bit identifiers, assuming that there are no false deletions of identities not already in the set. Indeed, we give a lower bound argument that shows that any small-space deterministic solution to the straggler identification problem cannot be guaranteed to handle false deletions. Nevertheless, we provide a simple randomized solution using O(dlognlog(1/ε)) bits that can maintain a multiset and solve the straggler identification problem, tolerating false deletions, where ε>0 is a user-defined parameter bounding the probability of an incorrect response. This randomized solution is based on a new type of Bloom filter, which we call the invertible Bloom filter.

(Based on a paper from WADS 2007 and IEEE TKDE 2011 by David Eppstein and Mike Goodrich, arXiv:0704.3313)