Coresets remembered and items forgotten: submodular maximization with deletions

Guangyi Zhang, Nikolaj Tatti, Aristides Gionis

In recent years we have witnessed an increase on the development of methods for submodular optimization, which have been motivated by the wide applicability of submodular functions in real-world data-science problems. In this paper, we contribute to this line of work by considering the problem of robust submodular maximization against unexpected deletions, which may occur due to privacy issues or user preferences. Specifically, we consider the minimum number of items an algorithm has to remember, in order to achieve a non-trivial approximation guarantee against adversarial deletion of up to $d$ items. We refer to the set of items that an algorithm has to keep before adversarial deletions as a deletion-robust coreset. Our theoretical contributions are two-fold. First, we propose a single-pass streaming algorithm that yields a $(1-2\epsilon)/(4p)$-approximation for maximizing a non-decreasing submodular function under a general p-matroid constraint and requires a coreset of size $k + d/\epsilon$, where $k$ is the maximum size of a feasible solution. To the best of our knowledge, this is the first work to achieve an (asymptotically) optimal coreset, as no constant-factor approximation is possible with a coreset of size sublinear in $d$. Second, we devise an effective offline algorithm that guarantees stronger approximation ratios with a coreset of size $O(d \log(k)/\epsilon)$. We also demonstrate the superior empirical performance of the proposed algorithms in real-life applications.

Knowledge Graph

arrow_drop_up

Comments

Sign up or login to leave a comment