Learning structured weight uncertainty in Bayesian neural networks

Shengyang Sun, Changyou Chen, Lawrence Carin

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

66 Scopus citations


Deep neural networks (DNNs) are increasingly popular in modern machine learning. Bayesian learning affords the opportunity to quantify posterior uncertainty on DNN model parameters. Most existing work adopts independent Gaussian priors on the model weights, ignoring possible structural information. In this paper, we consider the matrix variate Gaussian (MVG) distribution to model structured correlations within the weights of a DNN. To make posterior inference feasible, a reparametrization is proposed for the MVG prior, simplifying the complex MVG-based model to an equivalent yet simpler model with independent Gaussian priors on the transformed weights. Consequently, we develop a scalable Bayesian online inference algorithm by adopting the recently proposed probabilistic backpropagation framework. Experiments on several synthetic and real datasets demonstrate the advantages of our model: it achieves competitive performance in terms of model likelihood and predictive root mean square error, and it converges faster than related Bayesian DNN models.
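The reparametrization mentioned in the abstract rests on a standard property of the matrix variate Gaussian: a draw W ~ MVG(M, U, V) can be written as W = M + A E Bᵀ, where U = AAᵀ and V = BBᵀ are Cholesky factorizations of the row and column covariances and E has independent standard-normal entries. A minimal NumPy sketch of this sampling identity (illustrative only — `sample_mvg` and the example covariances are our own names, not the authors' code):

```python
import numpy as np

def sample_mvg(M, U, V, rng):
    """Sample W ~ MVG(M, U, V) via the reparametrization
    W = M + A E B^T, where U = A A^T, V = B B^T, and E_ij ~ N(0, 1)."""
    A = np.linalg.cholesky(U)          # row-covariance factor
    B = np.linalg.cholesky(V)          # column-covariance factor
    E = rng.standard_normal(M.shape)   # independent standard Gaussians
    return M + A @ E @ B.T

rng = np.random.default_rng(0)

# Hypothetical 3x2 weight matrix with correlated rows and columns.
M = np.zeros((3, 2))
U = np.array([[2.0, 0.5, 0.0],
              [0.5, 1.0, 0.0],
              [0.0, 0.0, 1.5]])   # row covariance (3x3)
V = np.array([[1.0, 0.3],
              [0.3, 1.0]])        # column covariance (2x2)

W = sample_mvg(M, U, V, rng)
```

The transformed noise E carries an independent Gaussian prior, which is what makes inference with the probabilistic backpropagation framework tractable: cov(vec(W)) factors as a Kronecker product of V and U rather than a full (rows·cols)² matrix.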
Original language: English (US)
Title of host publication: Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, AISTATS 2017
State: Published - Jan 1 2017
Externally published: Yes

Bibliographical note

Generated from Scopus record by KAUST IRTS on 2021-02-09


