Calibrate and Prune: Improving Reliability of Lottery Tickets Through Prediction Calibration

Published 10 Feb 2020 in stat.ML and cs.LG | (2002.03875v3)

Abstract: The hypothesis that sub-network initializations (lottery) exist within the initializations of over-parameterized networks, which when trained in isolation produce highly generalizable models, has led to crucial insights into network initialization and has enabled efficient inferencing. Supervised models with uncalibrated confidences tend to be overconfident even when making wrong prediction. In this paper, for the first time, we study how explicit confidence calibration in the over-parameterized network impacts the quality of the resulting lottery tickets. More specifically, we incorporate a suite of calibration strategies, ranging from mixup regularization, variance-weighted confidence calibration to the newly proposed likelihood-based calibration and normalized bin assignment strategies. Furthermore, we explore different combinations of architectures and datasets, and make a number of key findings about the role of confidence calibration. Our empirical studies reveal that including calibration mechanisms consistently lead to more effective lottery tickets, in terms of accuracy as well as empirical calibration metrics, even when retrained using data with challenging distribution shifts with respect to the source dataset.