Core Concepts
The authors develop the first constant-factor approximation algorithm for the Fair Max-Min Diversification (FairDiv) problem that runs in near-linear time and uses only linear space. Their approach combines the Multiplicative Weight Update method with geometric data structures to solve a linear program implicitly and approximately.
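To make the objective concrete, here is a minimal Python sketch of what a FairDiv instance asks for. The summary does not spell out the formal definition, so the sketch assumes a common formulation: each point carries a group label, each group g has a quota, a selection is fair if it contains at least quotas[g] points of group g, and its diversity is the minimum pairwise Euclidean distance. The names diversity, is_fair, and brute_force_fairdiv are hypothetical, and the exhaustive search is only a baseline for tiny inputs, not the authors' algorithm.

```python
from collections import Counter
from itertools import combinations
from math import dist


def diversity(points):
    """Minimum pairwise Euclidean distance (needs at least two points)."""
    return min(dist(p, q) for p, q in combinations(points, 2))


def is_fair(groups, quotas):
    """Assumed fairness rule: group g appears at least quotas[g] times."""
    counts = Counter(groups)
    return all(counts[g] >= k for g, k in quotas.items())


def brute_force_fairdiv(points, groups, quotas):
    """Try every subset of size sum(quotas); exponential, for illustration only."""
    size = sum(quotas.values())
    best, best_div = None, -1.0
    for idx in combinations(range(len(points)), size):
        if not is_fair([groups[i] for i in idx], quotas):
            continue
        d = diversity([points[i] for i in idx])
        if d > best_div:
            best, best_div = idx, d
    return best, best_div


# Toy instance: two groups on a line, one point required from each.
pts = [(0, 0), (1, 0), (10, 0), (11, 0)]
labels = ["a", "a", "b", "b"]
chosen, value = brute_force_fairdiv(pts, labels, {"a": 1, "b": 1})
```

The brute-force search takes exponential time, which is the cost that a near-linear-time approximation algorithm avoids.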
Abstract
The paper focuses on the problem of fair and diverse subset selection from high-dimensional data, known as the Fair Max-Min Diversification (FairDiv) problem. The key contributions are:
MFD Algorithm: MFD is the first constant-factor approximation algorithm for FairDiv that runs in near-linear time and uses only linear space. Earlier constant-factor approximation algorithms required super-linear time and space.
MFD Algorithm with High Probability Fairness: The authors extend the MFD algorithm to satisfy the fairness constraints with high probability, by solving a modified linear feasibility problem.
Coreset Construction: The authors show that any algorithm for the k-center clustering problem can be used to derive a (1+ε)-coreset for the FairDiv problem efficiently. Using this coreset, they propose the first efficient streaming algorithm for FairDiv.
Range Query Algorithm: The authors design a data structure that can efficiently return a fair and diverse subset of points within a given query region.
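The coreset contribution listed above relies on a k-center subroutine. The summary does not describe the exact construction, so the following Python sketch only illustrates the general idea under an assumption: run the classic greedy farthest-point heuristic (Gonzalez) separately inside each group and keep the union of the chosen centers as a small summary. The function names greedy_k_center and per_group_summary are hypothetical, and the sketch makes no claim about achieving the (1+ε) guarantee.

```python
from math import dist


def greedy_k_center(points, k):
    """Farthest-point traversal: indices of up to k well-spread centers."""
    if not points or k <= 0:
        return []
    centers = [0]
    # nearest[i] is the distance from point i to its closest chosen center.
    nearest = [dist(p, points[0]) for p in points]
    while len(centers) < min(k, len(points)):
        nxt = max(range(len(points)), key=nearest.__getitem__)
        if nearest[nxt] == 0.0:
            break  # every remaining point coincides with a chosen center
        centers.append(nxt)
        nearest = [min(d, dist(p, points[nxt])) for d, p in zip(nearest, points)]
    return centers


def per_group_summary(points, groups, k):
    """Assumed construction: union of greedy k-center picks from each group."""
    summary = []
    for g in sorted(set(groups)):
        idx = [i for i, label in enumerate(groups) if label == g]
        local = greedy_k_center([points[i] for i in idx], k)
        summary.extend(idx[j] for j in local)
    return sorted(summary)


pts = [(0, 0), (1, 0), (10, 0), (0, 5), (0, 6), (9, 9)]
labels = ["a", "a", "a", "b", "b", "b"]
kept = per_group_summary(pts, labels, k=2)
```

Working per group keeps every group represented in the summary, which a fairness-constrained selection step run on the summary would need.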
The algorithms use a novel combination of the Multiplicative Weight Update method and advanced geometric data structures, such as BBD-trees, to implicitly and approximately solve the linear programs underlying the FairDiv problem.
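As a rough illustration of the Multiplicative Weight Update idea, the Python sketch below solves a small, explicit linear feasibility problem: find a probability vector x with A x >= b - eps in every row. It is the textbook explicit variant, not the paper's method. The FairDiv linear program and the implicit BBD-tree machinery are not reproduced, and the function name mwu_feasibility, the simplex domain, and the parameter choices are assumptions made for this example.

```python
from math import ceil, log


def mwu_feasibility(A, b, eps):
    """Return x on the simplex with A x >= b - eps per row, or None if infeasible."""
    m, n = len(A), len(A[0])
    # Width: largest possible violation or slack of a constraint at a vertex.
    rho = max(abs(A[i][j] - b[i]) for i in range(m) for j in range(n))
    if rho == 0:
        return [1.0 / n] * n  # every point of the simplex is exactly feasible
    eps = min(eps, rho)  # a tighter target still meets the requested tolerance
    eta = eps / (2 * rho)
    rounds = ceil(4 * rho * rho * max(log(m), 1.0) / (eps * eps))
    p = [1.0 / m] * m  # normalized weights over the constraints
    x_sum = [0.0] * n
    for _ in range(rounds):
        # Oracle: the best simplex vertex for the weighted combined constraint.
        scores = [sum(p[i] * A[i][j] for i in range(m)) for j in range(n)]
        j = max(range(n), key=scores.__getitem__)
        if scores[j] - sum(p[i] * b[i] for i in range(m)) < -1e-12:
            return None  # even the combined constraint cannot be satisfied
        x_sum[j] += 1.0
        # Shrink weights of satisfied constraints, grow weights of violated ones.
        w = [p[i] * (1.0 - eta * (A[i][j] - b[i]) / rho) for i in range(m)]
        total = sum(w)
        p = [v / total for v in w]
    return [v / rounds for v in x_sum]


A = [[1.0, 0.0], [0.0, 1.0]]
x = mwu_feasibility(A, [0.5, 0.5], eps=0.05)
```

Each round touches only one weighted combination of the constraints. That is why the method pairs well with data structures that can evaluate the combination without listing every constraint, which is the role the summary attributes to BBD-trees.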