CANITA: Faster Rates for Distributed Convex Optimization with Communication Compression

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

15 Scopus citations

Abstract

Due to the high communication cost in distributed and federated learning, methods relying on compressed communication are becoming increasingly popular. Besides, the best theoretically and practically performing gradient-type methods invariably rely on some form of acceleration/momentum to reduce the number of communications (faster convergence), e.g., Nesterov's accelerated gradient descent [31, 32] and Adam [14]. In order to combine the benefits of communication compression and convergence acceleration, we propose a compressed and accelerated gradient method based on ANITA [20] for distributed optimization, which we call CANITA. Our CANITA achieves the first accelerated rate $O\Big(\sqrt{\big(1+\sqrt{\frac{\omega^3}{n}}\big)\frac{L}{\epsilon}} + \omega\big(\frac{1}{\epsilon}\big)^{\frac{1}{3}}\Big)$, which improves upon the state-of-the-art non-accelerated rate $O\big((1+\frac{\omega}{n})\frac{L}{\epsilon} + \frac{\omega^2+\omega}{\omega+n}\frac{1}{\epsilon}\big)$ of DIANA [12] for distributed general convex problems, where $\epsilon$ is the target error, $L$ is the smoothness parameter of the objective, $n$ is the number of machines/devices, and $\omega$ is the compression parameter (larger $\omega$ means more compression can be applied, and no compression implies $\omega = 0$). Our results show that as long as the number of devices $n$ is large (often true in distributed/federated learning), or the compression $\omega$ is not very high, CANITA achieves the faster convergence rate $O\big(\sqrt{\frac{L}{\epsilon}}\big)$, i.e., the number of communication rounds is $O\big(\sqrt{\frac{L}{\epsilon}}\big)$ (vs. $O\big(\frac{L}{\epsilon}\big)$ achieved by previous works). As a result, CANITA enjoys the advantages of both compression (compressed communication in each round) and acceleration (much fewer communication rounds).
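
The abstract refers to unbiased compression operators whose variance is controlled by the parameter $\omega$. Below is a minimal sketch (Python/NumPy, not the authors' code) of one standard compressor of this kind, rand-k sparsification, for which $\omega = d/k - 1$, followed by a toy round of compressed distributed gradient descent. The quadratic local objectives, step size, and problem sizes are illustrative assumptions only; CANITA itself additionally uses acceleration/momentum and ANITA-style variance reduction, which are not shown here.

```python
# Minimal sketch of an unbiased rand-k sparsification compressor, a standard
# example of the compression operators covered by the parameter omega:
#   E[C(x)] = x  and  E||C(x) - x||^2 <= omega * ||x||^2,  omega = d/k - 1
# (omega = 0 corresponds to no compression, i.e., k = d).
import numpy as np

def rand_k_compress(x: np.ndarray, k: int, rng: np.random.Generator) -> np.ndarray:
    """Keep k random coordinates of x, rescaled by d/k so the estimate is unbiased."""
    d = x.size
    out = np.zeros_like(x)
    idx = rng.choice(d, size=k, replace=False)
    out[idx] = x[idx] * (d / k)
    return out

# Toy communication-compressed step: n workers each compress their local
# gradient, the server averages the compressed messages and updates the model.
# (Hypothetical quadratic objectives; NOT the CANITA update rule.)
rng = np.random.default_rng(0)
d, n, k, lr = 100, 10, 10, 0.1              # dimension, workers, kept coords, step size
A = [rng.standard_normal((d, d)) for _ in range(n)]
b = [rng.standard_normal(d) for _ in range(n)]
x = np.zeros(d)

for _ in range(50):                          # communication rounds
    msgs = []
    for Ai, bi in zip(A, b):                 # each worker i
        grad_i = Ai.T @ (Ai @ x - bi) / d    # gradient of 0.5/d * ||A_i x - b_i||^2
        msgs.append(rand_k_compress(grad_i, k, rng))  # send only k coordinates
    x -= lr * np.mean(msgs, axis=0)          # server averages and takes a step
```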

Original language: English (US)
Title of host publication: Advances in Neural Information Processing Systems 34 - 35th Conference on Neural Information Processing Systems, NeurIPS 2021
Editors: Marc'Aurelio Ranzato, Alina Beygelzimer, Yann Dauphin, Percy S. Liang, Jenn Wortman Vaughan
Publisher: Neural information processing systems foundation
Pages: 13770-13781
Number of pages: 12
ISBN (Electronic): 9781713845393
State: Published - 2021
Event: 35th Conference on Neural Information Processing Systems, NeurIPS 2021 - Virtual, Online
Duration: Dec 6, 2021 - Dec 14, 2021

Publication series

Name: Advances in Neural Information Processing Systems
Volume: 17
ISSN (Print): 1049-5258

Conference

Conference: 35th Conference on Neural Information Processing Systems, NeurIPS 2021
City: Virtual, Online
Period: 12/6/21 - 12/14/21

Bibliographical note

Publisher Copyright:
© 2021 Neural information processing systems foundation. All rights reserved.

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Information Systems
  • Signal Processing
