Walsh-Hadamard variational inference for Bayesian deep learning

Simone Rossi*, Sébastien Marmin*, Maurizio Filippone

*Corresponding author for this work

Research output: Contribution to conference › Paper › peer-review

5 Scopus citations

Abstract

Over-parameterized models, such as DeepNets and ConvNets, form a class of models that are routinely adopted in a wide variety of applications, and for which Bayesian inference is desirable but extremely challenging. Variational inference offers the tools to tackle this challenge in a scalable way and with some degree of flexibility on the approximation, but for over-parameterized models this is challenging due to the over-regularization property of the variational objective. Inspired by the literature on kernel methods, and in particular on structured approximations of distributions of random matrices, this paper proposes Walsh-Hadamard Variational Inference (WHVI), which uses Walsh-Hadamard-based factorization strategies to reduce the parameterization and accelerate computations, thus avoiding over-regularization issues with the variational objective. Extensive theoretical and empirical analyses demonstrate that WHVI yields considerable speedups and model reductions compared to other techniques to carry out approximate inference for over-parameterized models, and ultimately show how advances in kernel methods can be translated into advances in approximate Bayesian inference for Deep Learning.
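The parameter reduction and speedup mentioned in the abstract can be illustrated with a minimal NumPy sketch. It assumes a weight matrix of the form W = S1 H diag(g) H S2, where S1 and S2 are diagonal, H is the Walsh-Hadamard matrix, and g is the only random vector. That specific form, the Gaussian reparameterization of g, and the names `fwht`, `structured_matvec`, and `dense_equivalent` are illustrative assumptions, not details stated in the abstract.

```python
import numpy as np


def fwht(x):
    """Unnormalized fast Walsh-Hadamard transform (Sylvester ordering).

    Runs in O(D log D) for a vector of length D, which must be a power of two.
    """
    x = np.asarray(x, dtype=float).copy()
    n = x.shape[0]
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    h = 1
    while h < n:
        # Butterfly step: pair up adjacent blocks of size h and
        # replace them with their sum and difference.
        x = x.reshape(-1, 2, h)
        x = np.stack((x[:, 0] + x[:, 1], x[:, 0] - x[:, 1]), axis=1)
        x = x.reshape(n)
        h *= 2
    return x


def structured_matvec(s1, g, s2, x):
    """Compute (S1 H diag(g) H S2) x without ever building a D x D matrix.

    s1, g, s2 are length-D vectors (assumed parameterization): 3D numbers
    in place of the D*D entries of a dense weight matrix.
    """
    return s1 * fwht(g * fwht(s2 * x))


def dense_equivalent(s1, g, s2):
    """Build the same matrix densely; only for checking small cases."""
    d = len(g)
    H = np.array([[1.0]])
    while H.shape[0] < d:
        # Sylvester construction: [[H, H], [H, -H]]
        H = np.kron(np.array([[1.0, 1.0], [1.0, -1.0]]), H)
    return np.diag(s1) @ H @ np.diag(g) @ H @ np.diag(s2)


rng = np.random.default_rng(0)
D = 8
s1, s2 = rng.normal(size=D), rng.normal(size=D)

# Assumed Gaussian variational posterior over g, sampled by reparameterization.
mu, log_sigma = rng.normal(size=D), rng.normal(size=D)
g = mu + np.exp(log_sigma) * rng.normal(size=D)

x = rng.normal(size=D)
y_fast = structured_matvec(s1, g, s2, x)
y_dense = dense_equivalent(s1, g, s2) @ x
```

Under these assumptions, `y_fast` and `y_dense` agree up to floating-point error, while the structured path stores only vectors of length D and applies the transform in O(D log D) rather than O(D²).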

Original language: English (US)
State: Published - 2020
Event: 34th Conference on Neural Information Processing Systems, NeurIPS 2020 - Virtual, Online
Duration: Dec 6 2020 → Dec 12 2020

Conference

Conference: 34th Conference on Neural Information Processing Systems, NeurIPS 2020
City: Virtual, Online
Period: 12/6/20 → 12/12/20

Bibliographical note

Publisher Copyright:
© 2020 Neural information processing systems foundation. All rights reserved.

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Information Systems
  • Signal Processing
