Parallel coordinate descent methods for big data optimization

Peter Richtárik*, Martin Takáč

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

198 Scopus citations


In this work we show that randomized (block) coordinate descent methods can be accelerated by parallelization when applied to the problem of minimizing the sum of a partially separable smooth convex function and a simple separable convex function. The theoretical speedup over the serial method, measured by the number of iterations needed to approximately solve the problem with high probability, is a simple expression depending on the number of parallel processors and on a natural, easily computable measure of separability of the smooth component of the objective function. In the worst case, when no degree of separability is present, there may be no speedup; in the best case, when the problem is separable, the speedup equals the number of processors. Our analysis also covers the regime in which the number of blocks updated at each iteration is random, which allows for modeling situations with busy or unreliable processors. We show that our algorithm is able to solve a LASSO problem involving a matrix with 20 billion nonzeros in 2 hours on a large-memory node with 24 cores.
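The idea in the abstract can be illustrated with a toy sketch: at each iteration a random subset of τ coordinates is updated in parallel from the same iterate, with the step size deflated by a factor β that grows with τ and with the degree of partial separability ω of the smooth term (here β = 1 + (τ−1)(ω−1)/(n−1), the expected separable over-approximation constant for the "τ-nice" sampling). This is a minimal single-process simulation, not the authors' released implementation; all function and variable names are chosen for illustration.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t*|.| (elementwise soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def parallel_cd_lasso(A, b, lam, tau, iters=500, seed=0):
    """Sketch of parallel randomized coordinate descent for
    min_x 0.5*||Ax - b||^2 + lam*||x||_1 (hypothetical toy code)."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    L = (A ** 2).sum(axis=0)                 # coordinate-wise Lipschitz constants
    # Degree of partial separability: max number of coordinates any row touches.
    omega = max(int((A != 0).sum(axis=1).max()), 1)
    # ESO-style step-size inflation for updating tau coordinates at once.
    beta = 1.0 + (tau - 1) * (omega - 1) / max(n - 1, 1)
    x = np.zeros(n)
    r = A @ x - b                            # residual, maintained incrementally
    for _ in range(iters):
        S = rng.choice(n, size=tau, replace=False)   # random coordinate block
        g = A[:, S].T @ r                    # partial gradients, same iterate for all
        step = 1.0 / (beta * L[S])
        x_new = soft_threshold(x[S] - step * g, lam * step)
        r += A[:, S] @ (x_new - x[S])
        x[S] = x_new
    return x
```

When ω = 1 (fully separable data) we get β = 1 and the τ updates behave like τ independent serial steps, matching the linear speedup described above; for dense data ω ≈ n, β ≈ τ, and the per-step progress shrinks accordingly.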

Original language: English (US)
Pages (from-to): 433-484
Number of pages: 52
Journal: Mathematical Programming
Issue number: 1-2
State: Published - Mar 1 2016

Bibliographical note

Funding Information:
This paper was awarded the 16th IMA Leslie Fox Prize in Numerical Analysis (2nd Prize; for M.T.) in June 2013. The work of the first author was supported by EPSRC grants EP/J020567/1 (Algorithms for Data Simplicity) and EP/I017127/1 (Mathematics for Vast Digital Resources). The second author was supported by the Centre for Numerical Algorithms and Intelligent Software (funded by EPSRC grant EP/G036136/1 and the Scottish Funding Council). An open source code with an efficient implementation of the algorithm(s) developed in this paper is published here: .

Publisher Copyright:
© 2015, The Author(s).


Keywords

  • Big data optimization
  • Composite objective
  • Convex optimization
  • Expected separable over-approximation
  • Huge-scale optimization
  • Iteration complexity
  • Parallel coordinate descent
  • Partial separability

ASJC Scopus subject areas

  • Software
  • Mathematics (all)

