TY - CONF
T1 - A Better Alternative to Error Feedback for Communication-Efficient Distributed Learning
AU - Horváth, Samuel
AU - Richtárik, Peter
PY - 2021
Y1 - 2021
AB - Modern large-scale machine learning applications require stochastic optimization algorithms to be implemented on distributed compute systems. A key bottleneck of such systems is the communication overhead for exchanging information (e.g., stochastic gradients) across the workers. Among the many techniques proposed to remedy this issue, one of the most successful is the framework of compressed communication with error feedback (EF). EF remains the only known technique that can deal with the error induced by contractive compressors which are not unbiased, such as Top-K or PowerSGD. In this paper, we propose a new alternative to EF for dealing with contractive compressors, one that is better both theoretically and practically. In particular, we propose a construction which can transform any contractive compressor into an induced unbiased compressor, after which existing methods designed for unbiased compressors can be applied. We show that our approach leads to vast improvements over EF, including reduced memory requirements, better communication complexity guarantees, and fewer assumptions. We further extend our results to federated learning with partial participation following an arbitrary distribution over the nodes, and demonstrate the benefits thereof. We perform several numerical experiments which validate our theoretical findings.
UR - http://hdl.handle.net/10754/663911
UR - https://openreview.net/forum?id=vYVI1CHPaQg
UR - http://www.scopus.com/inward/record.url?scp=85150277248&partnerID=8YFLogxK
M3 - Conference contribution
BT - 9th International Conference on Learning Representations, ICLR 2021
PB - International Conference on Learning Representations, ICLR
ER -