Abstract
In this paper, we consider the problem of designing Differentially Private (DP) algorithms for Stochastic Convex Optimization (SCO) on heavy-tailed data. The irregularity of such data violates some key assumptions used in almost all existing DP-SCO and DP-ERM methods, resulting in failure to provide the DP guarantees. To better understand this type of challenge, we provide in this paper a comprehensive study of DP-SCO under various settings. First, we consider the case where the loss function is strongly convex and smooth. For this case, we propose a method based on the sample-and-aggregate framework, which has an excess population risk of $O(\frac{d^3}{n\epsilon^4})$ (after omitting other factors), where $n$ is the sample size, $d$ is the dimensionality of the data, and $\epsilon$ is the privacy parameter. Then, we show that with some additional assumptions on the loss functions, it is possible to reduce the expected excess population risk to $O(\frac{d^2}{n\epsilon^2})$. To lift these additional conditions, we also provide a gradient smoothing and trimming based scheme to achieve excess population risks of $O(\frac{d^2}{n\epsilon^2})$ and $O(\frac{d^{2/3}}{(n\epsilon^2)^{1/3}})$ for strongly convex and general convex loss functions, respectively, with high probability. Experiments suggest that our algorithms can effectively deal with the challenges caused by data irregularity.
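The gradient trimming idea mentioned in the abstract can be illustrated with a short sketch. The code below is a minimal, hypothetical rendition of noisy gradient descent with per-coordinate gradient trimming, not the paper's exact algorithm: the names `trimmed_noisy_gd` and `grad_fn`, the trimming threshold `s`, step size `eta`, iteration count `T`, and the single-step Gaussian-mechanism calibration are all illustrative assumptions, and the abstract's gradient smoothing step is omitted.

```python
import numpy as np

def trimmed_noisy_gd(grad_fn, theta0, n, d, epsilon, delta,
                     s=1.0, eta=0.1, T=100, rng=None):
    """Illustrative noisy gradient descent with per-coordinate trimming.

    Heavy-tailed per-sample gradients are truncated to [-s, s], so the
    averaged gradient has bounded L2 sensitivity 2*s*sqrt(d)/n; Gaussian
    noise calibrated to (epsilon, delta)-DP is then added per step.
    (Sketch only; privacy composition over the T iterations is omitted.)
    """
    rng = np.random.default_rng() if rng is None else rng
    theta = np.asarray(theta0, dtype=float)
    # L2 sensitivity of the trimmed, averaged gradient under one-sample change.
    sensitivity = 2.0 * s * np.sqrt(d) / n
    # Standard Gaussian-mechanism noise scale for a single release.
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    for _ in range(T):
        g = grad_fn(theta)                  # per-sample gradients, shape (n, d)
        g = np.clip(g, -s, s).mean(axis=0)  # trim each coordinate, then average
        theta = theta - eta * (g + rng.normal(scale=sigma, size=d))
    return theta
```

Trimming, rather than assuming a global Lipschitz bound, is what lets the noise scale stay finite when the per-sample gradients are heavy-tailed.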
| Original language | English (US) |
|---|---|
| Title of host publication | 37th International Conference on Machine Learning, ICML 2020 |
| Publisher | International Machine Learning Society (IMLS) |
| Pages | 10023-10033 |
| Number of pages | 11 |
| ISBN (Print) | 9781713821120 |
| State | Published - Jan 1 2020 |
Bibliographical note
KAUST Repository Item: Exported on 2021-05-25. Acknowledgements: Di Wang and Jinhui Xu were supported in part by the National Science Foundation (NSF) under Grant No. CCF-1716400 and IIS-1919492.