Abstract
Byzantine robustness is an essential feature of algorithms for certain distributed optimization problems, typically encountered in collaborative/federated learning. Because these problems are usually of huge scale, communication compression is also imperative for solving them. These factors have spurred recent algorithmic and theoretical developments in the literature on Byzantine-robust learning with compression. In this paper, we contribute to this research area in two main directions. First, we propose a new Byzantine-robust method with compression, Byz-DASHA-PAGE, and prove that it has a better convergence rate (for non-convex and Polyak-Łojasiewicz smooth optimization problems), a smaller neighborhood size in the heterogeneous case, and tolerates more Byzantine workers under over-parametrization than the previous method with state-of-the-art theoretical convergence guarantees (Byz-VR-MARINA). Second, we develop the first Byzantine-robust method with communication compression and error feedback, Byz-EF21, along with its bidirectional-compression version, Byz-EF21-BC, and derive convergence rates for these methods in the non-convex and Polyak-Łojasiewicz smooth cases. We test the proposed methods and illustrate our theoretical findings in numerical experiments.
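The abstract names two building blocks that are standard in this literature: EF21-style error feedback for compressed communication, and a robust aggregation rule on the server to neutralize Byzantine workers. The sketch below is a minimal illustration of how these pieces combine, assuming Top-K compression and a coordinate-wise median as a stand-in aggregator; it is not the paper's Byz-EF21 algorithm, and all names (`top_k`, `ef21_byzantine_sketch`) and constants are hypothetical choices for the demo.

```python
# Illustrative sketch only: EF21-style error feedback + Top-K compression
# + coordinate-wise median aggregation, with simulated Byzantine workers.
import numpy as np

def top_k(v, k):
    """Top-K sparsifier: keep the k largest-magnitude coordinates, zero the rest."""
    out = np.zeros_like(v)
    idx = np.argpartition(np.abs(v), -k)[-k:]
    out[idx] = v[idx]
    return out

def ef21_byzantine_sketch(grad, n_workers, n_byz, dim, k, lr, steps, rng):
    """grad: callable (worker_id, x) -> local gradient of an honest worker."""
    x = np.zeros(dim)
    g = np.zeros((n_workers, dim))  # server's running estimate g_i per worker
    for _ in range(steps):
        # Robustly aggregate the current estimates (median instead of mean).
        x = x - lr * np.median(g, axis=0)
        for i in range(n_workers):
            if i < n_byz:
                # Byzantine worker sends an arbitrary (here: random) message.
                msg = top_k(rng.standard_normal(dim), k)
            else:
                # EF21 step: transmit the compressed difference between the
                # fresh local gradient and the server's current estimate.
                msg = top_k(grad(i, x) - g[i], k)
            g[i] = g[i] + msg  # server and worker update the estimate identically
    return x

# Toy usage: honest workers hold quadratics f_i(x) = 0.5 * ||x - b_i||^2.
rng = np.random.default_rng(0)
dim, n_workers, n_byz = 50, 10, 2
b = rng.standard_normal((n_workers, dim))
grad = lambda i, x: x - b[i]
x_hat = ef21_byzantine_sketch(grad, n_workers, n_byz, dim, k=5, lr=0.5, steps=200, rng=rng)
honest_grad = np.mean([grad(i, x_hat) for i in range(n_byz, n_workers)], axis=0)
print("norm of average honest gradient:", np.linalg.norm(honest_grad))
```

The key property of the EF21 update is that each honest worker only transmits the compressed difference between its fresh gradient and the server's running estimate, so the estimates `g[i]` track the true local gradients over time even under aggressive compression; the robust aggregator then limits how much the Byzantine messages can bias the model step.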
| Original language | English (US) |
|---|---|
| Pages | 1207-1215 |
| Number of pages | 9 |
| State | Published - 2024 |
| Event | 27th International Conference on Artificial Intelligence and Statistics, AISTATS 2024 - Valencia, Spain |
| Duration | May 2 2024 → May 4 2024 |
Conference
| Conference | 27th International Conference on Artificial Intelligence and Statistics, AISTATS 2024 |
|---|---|
| Country/Territory | Spain |
| City | Valencia |
| Period | 05/2/24 → 05/4/24 |
Bibliographical note
Publisher Copyright: Copyright 2024 by the author(s).
ASJC Scopus subject areas
- Artificial Intelligence
- Software
- Control and Systems Engineering
- Statistics and Probability