
Automated Model Selection for Generalized Linear Models

Published 25 Apr 2024 in stat.ML, cs.LG, and math.OC | arXiv:2404.16560v1

Abstract: In this paper, we show how mixed-integer conic optimization can be used to combine feature subset selection with holistic generalized linear models, fully automating the model selection process. Concretely, we directly optimize the Akaike and Bayesian information criteria while imposing constraints that address multicollinearity in the feature selection task. In particular, we propose a novel pairwise correlation constraint that combines the sign coherence constraint with ideas from classical statistical models such as Ridge regression and the OSCAR model.
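The core idea can be illustrated with a small, self-contained sketch. This is not the paper's mixed-integer conic formulation (which requires a MIP solver); instead it enumerates feature subsets, fits OLS on each, minimizes the BIC, and rejects any subset containing a feature pair whose absolute correlation exceeds a threshold, a crude stand-in for the paper's pairwise correlation constraint. The function names and the `max_corr` parameter are illustrative, not from the paper.

```python
import itertools
import numpy as np

def bic_linear(X, y, subset):
    """BIC of an OLS fit on the selected columns (Gaussian likelihood):
    BIC = n * log(RSS / n) + k * log(n), up to an additive constant."""
    n = len(y)
    Xs = np.column_stack([np.ones(n), X[:, subset]])
    beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
    rss = np.sum((y - Xs @ beta) ** 2)
    k = Xs.shape[1] + 1  # fitted coefficients plus the error variance
    return n * np.log(rss / n) + k * np.log(n)

def select_subset(X, y, max_corr=0.8):
    """Best subset by BIC, skipping subsets that contain a feature pair
    with absolute correlation above max_corr (multicollinearity guard)."""
    corr = np.abs(np.corrcoef(X, rowvar=False))
    p = X.shape[1]
    best, best_bic = (), np.inf
    for r in range(1, p + 1):
        for subset in itertools.combinations(range(p), r):
            pairs = itertools.combinations(subset, 2)
            if any(corr[i, j] > max_corr for i, j in pairs):
                continue  # violates the pairwise correlation constraint
            b = bic_linear(X, y, list(subset))
            if b < best_bic:
                best, best_bic = subset, b
    return best, best_bic
```

The enumeration above is exponential in the number of features; the paper's contribution is to encode the same selection problem with binary indicator variables inside a single mixed-integer conic program, so an off-the-shelf solver can handle it at scale and for non-Gaussian GLM likelihoods.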

References (35)
  1. B. K. Natarajan, Sparse approximate solutions to linear systems, SIAM Journal on Computing 24 (1995) 227–234.
  2. Best subset selection via a modern optimization lens, The Annals of Statistics 44 (2016) 813–852.
  3. D. Bertsimas, A. King, Logistic Regression: From Art to Science, Statistical Science 32 (2017) 367–384. URL: https://projecteuclid.org/journals/statistical-science/volume-32/issue-3/Logistic-Regression-From-Art-to-Science/10.1214/16-STS602.full. doi:10.1214/16-STS602.
  4. H. Hazimeh, R. Mazumder, Fast best subset selection: Coordinate descent and local combinatorial optimization algorithms, Operations Research 68 (2020) 1517–1537.
  5. I. Guyon, A. Elisseeff, An introduction to variable and feature selection, Journal of Machine Learning Research 3 (2003) 1157–1182.
  6. R. Miyashiro, Y. Takano, Subset selection by Mallows' Cp: A mixed integer programming approach, Expert Systems with Applications 42 (2015) 325–331.
  7. lmsubsets: Exact variable-subset selection in linear regression for R, Journal of Statistical Software 93 (2020) 1–21. URL: https://www.jstatsoft.org/index.php/jss/article/view/v093i03. doi:10.18637/jss.v093.i03.
  8. Feature subset selection for logistic regression via mixed integer optimization, Computational Optimization and Applications 64 (2016) 865–880. URL: https://doi.org/10.1007/s10589-016-9832-2. doi:10.1007/s10589-016-9832-2.
  9. Sparse Poisson regression via mixed-integer optimization, PLOS ONE 16 (2021). doi:10.1371/journal.pone.0249916.
  10. J. C. Debuse, V. J. Rayward-Smith, Feature subset selection within a simulated annealing data mining algorithm, Journal of Intelligent Information Systems 9 (1997) 57–81.
  11. J. Yang, V. Honavar, Feature subset selection using a genetic algorithm, IEEE Intelligent Systems and their Applications 13 (1998) 44–49.
  12. V. Calcagno, C. de Mazancourt, glmulti: An r package for easy automated model selection with (generalized) linear models, Journal of Statistical Software 34 (2010) 1–29. URL: https://www.jstatsoft.org/index.php/jss/article/view/v034i12. doi:10.18637/jss.v034.i12.
  13. A rough set approach to feature selection based on ant colony optimization, Pattern Recognition Letters 31 (2010) 226–233.
  14. Feature selection based on rough sets and particle swarm optimization, Pattern Recognition Letters 28 (2007) 459–471.
  15. Holistic generalized linear models, arXiv (2022). URL: https://arxiv.org/abs/2205.15447. doi:10.48550/ARXIV.2205.15447.
  16. A comparison of optimization solvers for log-binomial regression including conic programming, Computational Statistics 36 (2021) 1721–1754. doi:10.1007/s00180-021-01084-5.
  17. Integer constraints for enhancing interpretability in linear regression, SORT-Statistics and Operations Research Transactions (2020) 67–98.
  18. A. E. Hoerl, R. W. Kennard, Ridge regression: Biased estimation for nonorthogonal problems, Technometrics 12 (1970) 55–67.
  19. H. D. Bondell, B. J. Reich, Simultaneous regression shrinkage, variable selection, and supervised clustering of predictors with oscar, Biometrics 64 (2008) 115–123.
  20. J. A. Nelder, R. W. Wedderburn, Generalized linear models, Journal of the Royal Statistical Society: Series A (General) 135 (1972) 370–384.
  21. H. Zou, T. Hastie, Regularization and variable selection via the elastic net, Journal of the Royal Statistical Society: Series B (Statistical Methodology) 67 (2005) 301–320.
  22. R. Tibshirani, Regression shrinkage and selection via the lasso, Journal of the Royal Statistical Society: Series B (Methodological) 58 (1996) 267–288.
  23. Best subset, forward stepwise or lasso? Analysis and recommendations based on extensive comparisons, Statistical Science 35 (2020) 579–592.
  24. Estimation of l0 norm penalized models: A statistical treatment, Computational Statistics & Data Analysis 192 (2024) 107902.
  25. Cutting big m down to size, Interfaces 20 (1990) 61–66. URL: https://doi.org/10.1287/inte.20.5.61. doi:10.1287/inte.20.5.61.
  26. H. Akaike, A new look at the statistical model identification, IEEE Transactions on Automatic Control 19 (1974) 716–723.
  27. G. Schwarz, Estimating the dimension of a model, The Annals of Statistics (1978) 461–464.
  28. C. Mallows, Choosing variables in a linear regression: A graphical aid, in: Central Regional Meeting of the Institute of Mathematical Statistics, Manhattan, KS, 1964.
  29. Akaike's information criterion, C_p and estimators of loss for elliptically symmetric distributions, International Statistical Review 82 (2014) 422–439. URL: http://www.jstor.org/stable/43299006. doi:10.1111/insr.12052.
  30. A. Albert, J. A. Anderson, On the existence of maximum likelihood estimates in logistic regression models, Biometrika 71 (1984) 1–10.
  31. K. Konis, K. Fokianos, Safe density ratio modeling, Statistics & Probability Letters 79 (2009) 1915–1920. doi:10.1016/j.spl.2009.05.020.
  32. D. Bertsimas, A. King, An algorithmic approach to linear regression, Operations Research 64 (2015) 2–16. doi:10.1287/opre.2015.1436.
  33. ECOS: An SOCP solver for embedded systems, in: European Control Conference (ECC), 2013, pp. 3071–3076. doi:10.23919/ECC.2013.6669541.
  34. abess: A fast best-subset selection library in Python and R, The Journal of Machine Learning Research 23 (2022) 9206–9212.
  35. G. C. McDonald, D. I. Galarneau, A Monte Carlo evaluation of some ridge-type estimators, Journal of the American Statistical Association 70 (1975) 407–416. URL: http://www.jstor.org/stable/2285832.
