Group COMBSS: Group Selection via Continuous Optimization
Abstract: We present a new optimization method for the group selection problem in linear regression. In this problem, predictors are assumed to have a natural group structure, and the goal is to select a small set of groups that best fits the response. Incorporating the group structure of the predictor matrix is key to obtaining better estimators and to identifying associations between the response and the predictors. This discrete constrained problem is well known to be hard, particularly in high-dimensional settings where the number of predictors is much larger than the number of observations. We tackle it by reformulating the underlying discrete, binary-constrained problem as an unconstrained continuous optimization problem. We compare the performance of the proposed approach with state-of-the-art variable selection strategies on simulated data sets, and we illustrate its effectiveness on a genetic data set by identifying groups of markers across chromosomes.
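To make the discrete problem concrete, here is a minimal sketch of group selection by exhaustive search. It is not the continuous method the abstract proposes; it is the brute-force baseline whose cost grows combinatorially with the number of groups, which is why the problem is hard. The function name `best_group_subset`, the group budget `k`, and the toy data are all assumptions made for illustration.

```python
from itertools import combinations

import numpy as np


def best_group_subset(X, y, groups, k):
    """Search every set of at most k groups; return (rss, chosen_groups).

    `groups[j]` is the group label of column j of X (hypothetical encoding).
    """
    labels = sorted(set(groups))
    groups = np.asarray(groups)
    # Start from the empty model, which predicts zero for every observation.
    best = (float(np.sum(y ** 2)), ())
    for size in range(1, k + 1):
        for subset in combinations(labels, size):
            # Columns belonging to any of the selected groups.
            cols = np.flatnonzero(np.isin(groups, subset))
            beta, *_ = np.linalg.lstsq(X[:, cols], y, rcond=None)
            rss = float(np.sum((y - X[:, cols] @ beta) ** 2))
            if rss < best[0]:
                best = (rss, subset)
    return best


# Toy data: six predictors in three groups; only group 1 drives the response.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 6))
groups = [0, 0, 1, 1, 2, 2]
y = X[:, 2:4] @ np.array([2.0, -1.5]) + 0.1 * rng.standard_normal(50)

rss, chosen = best_group_subset(X, y, groups, k=1)
```

With G groups and a budget of k, the loop fits on the order of G-choose-k least-squares problems, so this search is only feasible for very small G.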