Core Concepts
Designing multi-armed bandit algorithms that preserve privacy through zero Concentrated Differential Privacy (zCDP) while maintaining near-optimal regret.
Summary
The paper investigates how to preserve privacy in multi-armed bandit (MAB) problems within the framework of Differential Privacy (DP). It focuses on a relaxation of pure DP, known as zero Concentrated Differential Privacy (zCDP), and its implications for the regret of MAB algorithms.
The key contributions are:
Formalizing and comparing different adaptations of DP to the bandit setting, including Table DP and View DP, and highlighting the differences between them, especially for relaxations of pure DP.
Proposing three private MAB algorithms, AdaC-UCB, AdaC-GOPE, and AdaC-OFUL, for finite-armed bandits, linear bandits, and linear contextual bandits, respectively. These algorithms share a common blueprint of adding Gaussian noise and running in adaptive episodes to ensure zCDP.
Analyzing the regret of the proposed algorithms and showing that the price of zCDP is asymptotically negligible compared to the non-private regret. Specifically, the additional regret due to zCDP is Õ(ρ^(-1/2) log(T)), where ρ is the zCDP parameter and T is the horizon.
Proving the first minimax lower bounds on the regret of bandits with zCDP, which quantify the hardness of preserving privacy in these settings. The lower bounds show that the proposed algorithms are optimal, up to poly-logarithmic factors.
Experimentally validating the theoretical insights on the performance of the proposed private algorithms in different bandit settings.
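The shared blueprint mentioned above (Gaussian noise plus adaptive episodes) can be illustrated with a minimal Python sketch. It is not the paper's AdaC-UCB, AdaC-GOPE, or AdaC-OFUL; it only shows the two ingredients in isolation. The function names are hypothetical, rewards are assumed to lie in [0, 1], and each reward is assumed to enter exactly one released mean. The noise calibration uses the standard fact that a Gaussian mechanism with sensitivity Δ and standard deviation σ satisfies (Δ² / (2σ²))-zCDP, so σ = Δ / sqrt(2ρ) gives ρ-zCDP.

```python
import math
import random


def gaussian_sigma_for_zcdp(sensitivity, rho):
    """Standard deviation making the Gaussian mechanism rho-zCDP.

    Uses rho = sensitivity**2 / (2 * sigma**2), solved for sigma.
    """
    if sensitivity <= 0 or rho <= 0:
        raise ValueError("sensitivity and rho must be positive")
    return sensitivity / math.sqrt(2.0 * rho)


def private_episode_mean(rewards, rho, rng):
    """Noisy mean of one episode's rewards (assumed to lie in [0, 1]).

    Changing one reward moves the mean by at most 1/n, so the
    sensitivity of the released statistic is 1/n.
    """
    n = len(rewards)
    if n == 0:
        raise ValueError("episode must contain at least one reward")
    sigma = gaussian_sigma_for_zcdp(1.0 / n, rho)
    return sum(rewards) / n + rng.gauss(0.0, sigma)


def doubling_episode_lengths(horizon):
    """Episode lengths 1, 2, 4, ... truncated so they sum to the horizon.

    A simplified, non-adaptive stand-in for the adaptive episodes
    described in the summary (hypothetical helper).
    """
    lengths = []
    length, used = 1, 0
    while used < horizon:
        take = min(length, horizon - used)
        lengths.append(take)
        used += take
        length *= 2
    return lengths


if __name__ == "__main__":
    rng = random.Random(0)
    rewards = [rng.random() for _ in range(8)]
    print(doubling_episode_lengths(10))
    print(private_episode_mean(rewards, rho=0.5, rng=rng))
```

With doubling episodes, the number of noisy releases grows only logarithmically in the horizon, and the noise scale carries a factor of ρ^(-1/2). This is consistent in spirit with the Õ(ρ^(-1/2) log(T)) privacy cost quoted above, though the sketch itself proves nothing about regret.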
Statistics
The paper does not contain any explicit numerical data or statistics. The key results are presented in the form of regret upper and lower bounds.
Quotes
"Bandits serve as the theoretical foundation of sequential learning and an algorithmic foundation of modern recommender systems. However, recommender systems often rely on user-sensitive data, making privacy a critical concern."
"The goal of the policy is to reveal the sequence of actions while protecting the privacy of the users and achieving minimal regret."
"Our analysis shows that in all of these settings, the prices of imposing zCDP are (asymptotically) negligible in comparison with the regrets incurred oblivious to privacy."