MINTY: Rule-based Models that Minimize the Need for Imputing Features with Missing Values
Abstract: Rule models are often preferred in prediction tasks with tabular inputs as they can be easily interpreted using natural language and provide predictive performance on par with more complex models. However, most rule models' predictions are undefined or ambiguous when some inputs are missing, forcing users to rely on statistical imputation models or heuristics like zero imputation, undermining the interpretability of the models. In this work, we propose fitting concise yet precise rule models that learn to avoid relying on features with missing values and, therefore, limit their reliance on imputation at test time. We develop MINTY, a method that learns rules in the form of disjunctions between variables that act as replacements for each other when one or more is missing. This results in a sparse linear rule model, regularized to have small dependence on features with missing values, that allows a trade-off between goodness of fit, interpretability, and robustness to missing values at test time. We demonstrate the value of MINTY in experiments using synthetic and real-world data sets and find its predictive performance comparable or favorable to baselines, with smaller reliance on features with missing values.
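The core idea in the abstract, that variables in a disjunction can stand in for each other when one is missing, can be illustrated with a small sketch. This is not the authors' implementation. It assumes binary features stored as floats with NaN marking a missing value, and the helper names `eval_disjunction` and `missing_reliance` are hypothetical. A rule is treated as true if any observed literal is true, false if every literal is observed and false, and undetermined otherwise. The share of undetermined rows is used here as a simple stand-in for a rule's reliance on missing values.

```python
import numpy as np


def eval_disjunction(X, cols):
    """Evaluate an OR over the binary columns `cols` of X (NaN = missing).

    Returns 1.0 where any observed literal is true, 0.0 where all literals
    are observed and false, and NaN where the value cannot be determined
    without imputation. (Illustrative semantics, assumed for this sketch.)
    """
    sub = X[:, cols]
    any_true = np.any(np.nan_to_num(sub, nan=0.0) == 1.0, axis=1)
    any_missing = np.isnan(sub).any(axis=1)
    return np.where(any_true, 1.0, np.where(any_missing, np.nan, 0.0))


def missing_reliance(X, cols):
    """Fraction of rows for which the rule's value is undetermined."""
    return float(np.isnan(eval_disjunction(X, cols)).mean())


X = np.array([
    [1.0, np.nan],  # first literal is true: rule holds despite the missing value
    [0.0, 0.0],     # both literals observed and false: rule does not hold
    [np.nan, 0.0],  # cannot be decided without imputation
    [np.nan, 1.0],  # second literal is true: rule holds
])

values = eval_disjunction(X, [0, 1])
reliance = missing_reliance(X, [0, 1])
```

In this toy example only the third row would need imputation. A learner that penalizes such undetermined rows alongside the prediction loss would prefer rules whose literals are rarely missing together, which is the kind of trade-off the abstract describes.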