Abstract
We study private and robust multi-armed bandits (MABs), where the agent receives Huber's contaminated heavy-tailed rewards and meanwhile needs to ensure differential privacy. We consider both the finite k-th raw moment and the finite k-th central moment settings for heavy-tailed rewards distributions with k ≥ 2. We first present its minimax lower bound, characterizing the information-theoretic limit of regret with respect to privacy budget, contamination level, and heavy-tailedness. Then, we propose a meta-algorithm that builds on a private and robust mean estimation sub-routine PRM that essentially relies on reward truncation and the Laplace mechanism. For the above two different heavy-tailed settings, we give corresponding schemes of PRM, which enable us to achieve nearly-optimal regrets. Moreover, our two proposed truncation-based or histogram-based PRM schemes achieve the optimal trade-off between estimation accuracy, privacy and robustness. Finally, we support our theoretical results and show the effectiveness of our algorithms with experimental studies.
Original language | English (US) |
---|---|
State | Published - 2023 |
Event | 37th Conference on Neural Information Processing Systems, NeurIPS 2023 - New Orleans, United States Duration: Dec 10 2023 → Dec 16 2023 |
Conference
Conference | 37th Conference on Neural Information Processing Systems, NeurIPS 2023 |
---|---|
Country/Territory | United States |
City | New Orleans |
Period | 12/10/23 → 12/16/23 |
Bibliographical note
Publisher Copyright:© 2023 Neural information processing systems foundation. All rights reserved.
ASJC Scopus subject areas
- Computer Networks and Communications
- Information Systems
- Signal Processing