Abstract
The computational and storage complexity of kernel machines presents the primary barrier to their scaling to large, modern datasets. A common way to tackle the scalability issue is to use the conjugate gradient algorithm, which relieves the constraints on both storage (the kernel matrix need not be stored) and computation (both stochastic gradients and parallelization can be used). Even so, conjugate gradient is not without its own issues: the conditioning of kernel matrices is often such that conjugate gradients will have poor convergence in practice. Preconditioning is a common approach to alleviating this issue. Here we propose preconditioned conjugate gradients for kernel machines, and develop a broad range of preconditioners particularly useful for kernel matrices. We describe a scalable approach to both solving kernel machines and learning their hyperparameters. We show this approach is exact in the limit of iterations and outperforms state-of-the-art approximations for a given computational budget.
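As a rough illustration of the approach the abstract describes, the sketch below solves a kernel system (K + σ²I)α = y with preconditioned conjugate gradients, using a Nyström-type preconditioner applied through the Woodbury identity (one member of the broad family of preconditioners considered in this setting). This is a minimal sketch under stated assumptions, not the authors' implementation; all function names, the RBF kernel choice, and the parameter values are illustrative.

```python
# Minimal sketch (illustrative, not the paper's code): preconditioned
# conjugate gradients (PCG) for (K + sigma^2 I) alpha = y with a
# Nystrom-type preconditioner applied via the Woodbury identity.
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0):
    # Squared-exponential kernel matrix between row sets A and B.
    d2 = (np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-0.5 * d2 / lengthscale**2)

def nystrom_preconditioner(X, Z, sigma2):
    # M = Knm Kmm^{-1} Kmn + sigma^2 I; return r -> M^{-1} r via Woodbury:
    # M^{-1} r = (r - Knm (sigma^2 Kmm + Kmn Knm)^{-1} Kmn r) / sigma^2.
    Knm = rbf_kernel(X, Z)
    Kmm = rbf_kernel(Z, Z)
    inner = sigma2 * Kmm + Knm.T @ Knm            # small (m x m) system
    L = np.linalg.cholesky(inner + 1e-10 * np.eye(len(Z)))
    def apply(r):
        t = np.linalg.solve(L.T, np.linalg.solve(L, Knm.T @ r))
        return (r - Knm @ t) / sigma2
    return apply

def pcg(matvec, b, precond, tol=1e-8, max_iter=500):
    # Standard PCG; matvec and precond are functions, so the full
    # kernel matrix never needs to be stored explicitly.
    x = np.zeros_like(b)
    r = b - matvec(x)
    z = precond(r)
    p = z.copy()
    rz = r @ z
    for _ in range(max_iter):
        Ap = matvec(p)
        a = rz / (p @ Ap)
        x += a * p
        r -= a * Ap
        if np.linalg.norm(r) < tol * np.linalg.norm(b):
            return x
        z = precond(r)
        rz, rz_old = r @ z, rz
        p = z + (rz / rz_old) * p
    return x

rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 2))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(1000)
sigma2 = 0.1
K = rbf_kernel(X, X)          # formed here only for the demonstration
alpha = pcg(lambda v: K @ v + sigma2 * v, y,
            nystrom_preconditioner(X, X[:100], sigma2))
```

In a genuinely matrix-free setting, the `matvec` would compute K·v in blocks (or stochastically) rather than from a stored K, which is what relieves the storage constraint mentioned in the abstract; K is materialized above only to keep the demonstration short.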
Original language | English (US) |
---|---|
Title of host publication | 33rd International Conference on Machine Learning, ICML 2016 |
Editors | Kilian Q. Weinberger, Maria Florina Balcan |
Publisher | International Machine Learning Society (IMLS) |
Pages | 3747-3760 |
Number of pages | 14 |
ISBN (Electronic) | 9781510829008 |
State | Published - 2016 |
Event | 33rd International Conference on Machine Learning, ICML 2016 - New York City, United States; Duration: Jun 19 2016 → Jun 24 2016 |
Publication series
Name | 33rd International Conference on Machine Learning, ICML 2016 |
---|---|
Volume | 6 |
Other
Other | 33rd International Conference on Machine Learning, ICML 2016 |
---|---|
Country/Territory | United States |
City | New York City |
Period | 06/19/16 → 06/24/16 |
Bibliographical note
Publisher Copyright: © 2016 by the author(s).
ASJC Scopus subject areas
- Artificial Intelligence
- Software
- Computer Networks and Communications