
Sparsity information and regularization in the horseshoe and other shrinkage priors

Published 6 Jul 2017 in stat.ME (arXiv:1707.01694v1)

Abstract: The horseshoe prior has proven to be a noteworthy alternative for sparse Bayesian estimation, but has previously suffered from two problems. First, there has been no systematic way of specifying a prior for the global shrinkage hyperparameter based on the prior information about the degree of sparsity in the parameter vector. Second, the horseshoe prior has the undesired property that there is no possibility of specifying separately information about sparsity and the amount of regularization for the largest coefficients, which can be problematic with weakly identified parameters, such as the logistic regression coefficients in the case of data separation. This paper proposes solutions to both of these problems. We introduce a concept of effective number of nonzero parameters, show an intuitive way of formulating the prior for the global hyperparameter based on the sparsity assumptions, and argue that the previous default choices are dubious based on their tendency to favor solutions with more unshrunk parameters than we typically expect a priori. Moreover, we introduce a generalization to the horseshoe prior, called the regularized horseshoe, that allows us to specify a minimum level of regularization to the largest values. We show that the new prior can be considered as the continuous counterpart of the spike-and-slab prior with a finite slab width, whereas the original horseshoe resembles the spike-and-slab with an infinitely wide slab. Numerical experiments on synthetic and real world data illustrate the benefit of both of these theoretical advances.


Summary

  • The paper introduces the effective number of nonzero parameters to systematically tie the global shrinkage parameter to sparsity beliefs.
  • The paper proposes the regularized horseshoe, which enforces minimum regularization on large coefficients for improved model stability.
  • The paper validates its approach with extensive experiments showing enhanced prediction accuracy and computational efficiency over traditional methods.

Insights on the Horseshoe Prior: Sparsity and Regularization

The paper by Piironen and Vehtari provides an in-depth exploration of the horseshoe prior, a Bayesian approach used for sparse estimation in high-dimensional settings. It addresses two core issues historically identified with the horseshoe prior and offers theoretical insights along with practical solutions.

Core Contributions

The paper identifies two main limitations of the traditional horseshoe prior: (1) the absence of a systematic methodology for setting the global shrinkage parameter based on prior information about parameter sparsity, and (2) the inability to separately specify information on sparsity and on the regularization of large coefficients. The authors propose solutions to both: the effective number of nonzero parameters as a device for setting the global parameter, and a generalization of the horseshoe prior termed the regularized horseshoe.

Formulation and Implications

  1. Effective Number of Nonzero Parameters: A key contribution is the concept of the effective number of nonzero coefficients, m_eff. This measure lets researchers tie the global shrinkage parameter directly to their prior beliefs about the degree of sparsity in the parameter vector. The authors derive a mathematical relationship between m_eff and the global shrinkage parameter τ, and argue that the traditional default choices for τ implicitly favor solutions with more unshrunk parameters than is typically expected a priori.
  2. Regularized Horseshoe Prior: The regularized horseshoe extends the original prior by allowing one to impose a minimum level of regularization on the largest coefficients, addressing the original model's limitation with weakly identified parameters. The new formulation is both theoretically elegant and practically feasible: it can be viewed as a continuous counterpart of the spike-and-slab prior with a finite slab width, whereas the original horseshoe resembles a spike-and-slab with an infinitely wide slab.
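The two constructions above can be sketched numerically. Below is a minimal numpy sketch, assuming standardized predictors and a Gaussian observation model as in the paper's derivation; the function names (`tau0`, `prior_mean_meff`, `regularized_lambda`) are ours, not the authors':

```python
import numpy as np

def tau0(p0, D, sigma, n):
    """Reference value for the global scale: tau0 = p0/(D - p0) * sigma/sqrt(n).

    p0 is the prior guess for the number of nonzero coefficients, D the total
    number of coefficients, sigma the noise scale, n the number of observations.
    """
    return p0 / (D - p0) * sigma / np.sqrt(n)

def prior_mean_meff(tau, D, sigma, n):
    """Prior mean of the effective number of nonzero coefficients m_eff implied
    by a fixed global scale tau (Gaussian model, standardized predictors)."""
    x = tau * np.sqrt(n) / sigma
    return D * x / (1.0 + x)

def regularized_lambda(lam, tau, c):
    """Regularized local scale: lambda_tilde^2 = c^2 lam^2 / (c^2 + tau^2 lam^2).

    For small lam this behaves like the ordinary horseshoe (lambda_tilde ~ lam);
    for large lam it saturates at c/tau, so the prior on the coefficient
    approaches a Gaussian slab with scale c."""
    return np.sqrt(c**2 * lam**2 / (c**2 + tau**2 * lam**2))

# Choosing tau so that, a priori, about p0 = 5 of D = 100 coefficients
# are left unshrunk:
t0 = tau0(p0=5, D=100, sigma=1.0, n=400)
print(prior_mean_meff(t0, D=100, sigma=1.0, n=400))  # ~= 5 by construction
```

Under this sketch, the ordinary horseshoe is recovered in the limit c → ∞, which matches the paper's analogy of a spike-and-slab with an infinitely wide slab.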

Experimental Validation

The paper backs its theoretical claims with extensive numerical experiments on synthetic and real-world datasets, demonstrating superior prediction accuracy and computational efficiency for the regularized horseshoe over its traditional counterpart. The results show that an appropriate hyperprior choice for τ can substantially improve the model's inferences, with the benefit especially clear in logistic regression with separable data.

Practical and Theoretical Implications

  • Practical Impact: These developments are particularly salient in contexts such as genomic data analysis and other high-dimensional classification tasks where sparsity is paramount. The regularized horseshoe handles large coefficients more gracefully, ensuring that no parameter escapes regularization entirely, which leads to more stable and interpretable models.
  • Future Directions: While the paper thoroughly addresses sparsity control through regularization, future work could explore further the potential for multimodality in Bayesian posteriors and its mitigations, especially considering correlated predictors affecting MCMC sampling effectiveness.

In summary, the solutions proposed by Piironen and Vehtari significantly refine the horseshoe prior’s application toolkit, providing researchers with methodological benefits for more tailored shrinkage priors particularly suited to sparse, high-dimensional Bayesian analysis. These advances further enhance Bayesian inference capabilities in an era increasingly defined by large-scale data challenges.
