
How Good is the Bayes Posterior in Deep Neural Networks Really?

F. Wenzel, K. Roth, B. Veeling, J. Świątkowski, L. Tran, S. Mandt, J. Snoek, T. Salimans, R. Jenatton, and S. Nowozin.
(2020). arXiv:2002.02405.

Abstract

During the past five years the Bayesian deep learning community has developed increasingly accurate and efficient approximate inference procedures that allow for Bayesian inference in deep neural networks. However, despite this algorithmic progress and the promise of improved uncertainty quantification and sample efficiency there are---as of early 2020---no publicized deployments of Bayesian neural networks in industrial practice. In this work we cast doubt on the current understanding of Bayes posteriors in popular deep neural networks: we demonstrate through careful MCMC sampling that the posterior predictive induced by the Bayes posterior yields systematically worse predictions compared to simpler methods including point estimates obtained from SGD. Furthermore, we demonstrate that predictive performance is improved significantly through the use of a "cold posterior" that overcounts evidence. Such cold posteriors sharply deviate from the Bayesian paradigm but are commonly used as heuristic in Bayesian deep learning papers. We put forward several hypotheses that could explain cold posteriors and evaluate the hypotheses through experiments. Our work questions the goal of accurate posterior approximations in Bayesian deep learning: If the true Bayes posterior is poor, what is the use of more accurate approximations? Instead, we argue that it is timely to focus on understanding the origin of the improved performance of cold posteriors.

BibTeX key: wenzel2020bayes
entry type: article
year: 2020
url: http://arxiv.org/abs/2002.02405
note: cite arxiv:2002.02405


Cite this publication

@article{wenzel2020bayes,
  abstract = {During the past five years the Bayesian deep learning community has developed increasingly accurate and efficient approximate inference procedures that allow for Bayesian inference in deep neural networks. However, despite this algorithmic progress and the promise of improved uncertainty quantification and sample efficiency there are---as of early 2020---no publicized deployments of Bayesian neural networks in industrial practice. In this work we cast doubt on the current understanding of Bayes posteriors in popular deep neural networks: we demonstrate through careful MCMC sampling that the posterior predictive induced by the Bayes posterior yields systematically worse predictions compared to simpler methods including point estimates obtained from SGD. Furthermore, we demonstrate that predictive performance is improved significantly through the use of a "cold posterior" that overcounts evidence. Such cold posteriors sharply deviate from the Bayesian paradigm but are commonly used as heuristic in Bayesian deep learning papers. We put forward several hypotheses that could explain cold posteriors and evaluate the hypotheses through experiments. Our work questions the goal of accurate posterior approximations in Bayesian deep learning: If the true Bayes posterior is poor, what is the use of more accurate approximations? Instead, we argue that it is timely to focus on understanding the origin of the improved performance of cold posteriors.},
  added-at = {2020-02-07T17:22:32.000+0100},
  author = {Wenzel, Florian and Roth, Kevin and Veeling, Bastiaan S. and Świątkowski, Jakub and Tran, Linh and Mandt, Stephan and Snoek, Jasper and Salimans, Tim and Jenatton, Rodolphe and Nowozin, Sebastian},
  biburl = {https://www.bibsonomy.org/bibtex/2d01ecf6846fa384c08239fed6781ae79/kirk86},
  description = {[2002.02405] How Good is the Bayes Posterior in Deep Neural Networks Really?},
  interhash = {c937e3d3a876619d5a68a08822a56cfa},
  intrahash = {d01ecf6846fa384c08239fed6781ae79},
  keywords = {bayesian readings uncertainty},
  note = {cite arxiv:2002.02405},
  timestamp = {2020-02-07T17:22:32.000+0100},
  title = {How Good is the Bayes Posterior in Deep Neural Networks Really?},
  url = {http://arxiv.org/abs/2002.02405},
  year = 2020
}
