27 May 2022: Ryuto Kitagawa
Straggler identification in round-trip data streams via Newton’s identities and invertible Bloom filters
We study the straggler identification problem, in which an algorithm
must determine the identities of the remaining members of a set after it
has had a large number of insertion and deletion operations performed on
it, and now has relatively few remaining members. The goal is to do this
in space, where is the total number of
identities. Straggler identification has applications, for example, in
determining the unacknowledged packets in a high-bandwidth multicast
data stream. We provide a deterministic solution to the straggler
identification problem that uses only bits, based on a
novel application of Newton’s identities for symmetric polynomials. This
solution can identify any subset of stragglers from a set of
-bit identifiers, assuming that there are no false
deletions of identities not already in the set. Indeed, we give a lower
bound argument that shows that any small-space deterministic solution to
the straggler identification problem cannot be guaranteed to handle
false deletions. Nevertheless, we provide a simple randomized solution
using bits that can maintain
a multiset and solve the straggler identification problem, tolerating
false deletions, where is a user-defined parameter
bounding the probability of an incorrect response. This randomized
solution is based on a new type of Bloom filter, which we call the
invertible Bloom filter.
(Based on a paper from WADS 2007 and IEEE TKDE 2011 by
David Eppstein and Mike
Goodrich, arXiv:0704.3313)