Variation Due to Regularization Tractably Recovers Bayesian Deep Learning
Abstract: Uncertainty quantification in deep learning is crucial for safe and reliable decision-making in downstream tasks. Existing methods quantify uncertainty at the last layer or other approximations of the network, which may miss some sources of uncertainty in the model. To address this gap, we propose an uncertainty quantification method for large networks based on variation due to regularization. Essentially, predictions that are more (less) sensitive to the regularization of network parameters are less (more, respectively) certain. This principle can be implemented by deterministically tweaking the training loss during the fine-tuning phase and reflects confidence in the output as a function of all layers of the network. We show that regularization variation (RegVar) provides rigorous uncertainty estimates that, in the infinitesimal limit, exactly recover the Laplace approximation in Bayesian deep learning. We demonstrate its success in several deep learning architectures, showing it scales tractably with network size while maintaining or improving uncertainty quantification quality. Our experiments across multiple datasets show that RegVar not only identifies uncertain predictions effectively but also provides insights into the stability of learned representations.
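One plausible finite-difference reading of the stated principle is: fine-tune the network twice with slightly different regularization strengths and score each prediction by how much it moves. The sketch below illustrates that reading in PyTorch; the function names (`finetune`, `regvar_scores`), the choice of an L2 penalty via `weight_decay`, and the per-example output norm are illustrative assumptions, not the paper's actual procedure.

```python
# Hypothetical sketch of the regularization-variation principle from the abstract.
# Assumption: uncertainty is proxied by the sensitivity of each prediction to a
# small perturbation eps of the L2 regularization strength used during fine-tuning.
import copy
import torch


def finetune(model, loader, weight_decay, steps=100, lr=1e-4):
    """Briefly fine-tune `model` with an L2 penalty of strength `weight_decay`."""
    opt = torch.optim.SGD(model.parameters(), lr=lr, weight_decay=weight_decay)
    loss_fn = torch.nn.CrossEntropyLoss()
    it = iter(loader)
    for _ in range(steps):
        try:
            x, y = next(it)
        except StopIteration:
            it = iter(loader)
            x, y = next(it)
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()
    return model


def regvar_scores(model, loader, x_test, base_lambda=1e-4, eps=1e-5):
    """Uncertainty proxy: how much each test prediction changes when the
    regularization strength moves from base_lambda to base_lambda + eps.
    Larger variation -> less certain prediction."""
    model_base = finetune(copy.deepcopy(model), loader, base_lambda)
    model_pert = finetune(copy.deepcopy(model), loader, base_lambda + eps)
    with torch.no_grad():
        f_base = model_base(x_test)  # shape (N, C)
        f_pert = model_pert(x_test)
    # Per-example sensitivity of the outputs to the regularization tweak.
    return (f_pert - f_base).norm(dim=-1) / eps
```

In the infinitesimal limit (eps -> 0), the abstract states that this kind of sensitivity exactly recovers the Laplace approximation; the finite-difference version above is only a sketch of the intuition under the assumptions noted in the comments.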