Abstract
We study the performance of a family of randomized parallel coordinate descent methods for minimizing a nonsmooth, nonseparable convex function. The problem class includes as special cases L1-regularized L1 regression and minimization of the exponential loss (the “AdaBoost problem”). We assume that the input data defining the loss function is contained in a sparse $m \times n$ matrix $A$ with at most $\omega$ nonzeros in each row, and that the objective function has a “max structure”, allowing us to smooth it. Our main contribution is to identify parameters, given in closed form, that guarantee a parallelization speedup depending on basic quantities of the problem (such as its size and the number of processors). The theory relies on a careful study of the Lipschitz constant of the smoothed objective restricted to low-dimensional subspaces, and shows increased acceleration for sparser problems.
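
To make the setting concrete, the following is a minimal Python sketch, not the paper's algorithm, of randomized parallel coordinate descent applied to a smoothed L1-regularized L1 regression problem, $\min_x \|Ax - b\|_1 + \lambda \|x\|_1$. The Huber-type smoothing, the step-size factor $\beta = 1 + (\omega-1)(\tau-1)/\max(n-1,1)$ (borrowed from related work on parallel coordinate descent for partially separable smooth functions), and all names such as `parallel_cd_l1l1` are illustrative assumptions; the paper derives its own closed-form constants for the smoothed nonsmooth case.

```python
import numpy as np

def smoothed_abs_grad(r, mu):
    """Gradient of a Huber-type smoothing of |r| with smoothing parameter mu."""
    return np.clip(r / mu, -1.0, 1.0)

def parallel_cd_l1l1(A, b, lam=0.1, mu=0.01, tau=4, iters=2000, seed=0):
    """Illustrative randomized block coordinate descent on the smoothed
    L1-regularized L1-regression objective; the tau coordinate updates in
    each iteration are independent and could be executed in parallel."""
    rng = np.random.default_rng(seed)
    n = A.shape[1]
    omega = max(int((A != 0).sum(axis=1).max()), 1)        # max nonzeros per row
    beta = 1.0 + (omega - 1) * (tau - 1) / max(n - 1, 1)   # assumed ESO-style factor
    # Per-coordinate curvature bound of the smoothed loss: ||A_:i||^2 / mu.
    v = np.maximum((A ** 2).sum(axis=0), 1e-12) / mu
    x = np.zeros(n)
    residual = A @ x - b
    for _ in range(iters):
        S = rng.choice(n, size=tau, replace=False)   # random block of tau coordinates
        g = smoothed_abs_grad(residual, mu)          # one gradient, shared by the block
        for i in S:                                  # these updates are independent
            grad_i = A[:, i] @ g
            step = 1.0 / (beta * v[i])
            t = x[i] - step * grad_i
            x_new = np.sign(t) * max(abs(t) - step * lam, 0.0)  # soft-thresholding
            residual += A[:, i] * (x_new - x[i])
            x[i] = x_new
    return x

# Example usage on a small random problem with roughly 30% nonzeros per row.
if __name__ == "__main__":
    rng = np.random.default_rng(1)
    A = rng.standard_normal((50, 20)) * (rng.random((50, 20)) < 0.3)
    b = rng.standard_normal(50)
    x = parallel_cd_l1l1(A, b)
    print("objective:", np.abs(A @ x - b).sum() + 0.1 * np.abs(x).sum())
```

In this sketch the factor $\beta$ plays the role described in the abstract: sparser rows (smaller $\omega$) yield a smaller $\beta$, hence longer steps and a larger speedup from updating $\tau$ coordinates at once.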
Original language | English (US) |
---|---|
Title of host publication | Springer Proceedings in Mathematics and Statistics |
Publisher | Springer New York |
Pages | 57-96 |
Number of pages | 40 |
ISBN (Print) | 9783030121181 |
DOIs | |
State | Published - Jan 1 2019 |
Externally published | Yes |