ProtoGate: Prototype-based Neural Networks with Global-to-local Feature Selection for Tabular Biomedical Data
Abstract: Tabular biomedical data poses challenges in machine learning because it is often high-dimensional and typically low-sample-size (HDLSS). Previous research has attempted to address these challenges via local feature selection, but existing approaches often fail to achieve optimal performance because they cannot identify globally important features and are susceptible to the co-adaptation problem. In this paper, we propose ProtoGate, a prototype-based neural model for feature selection on HDLSS data. ProtoGate first selects instance-wise features by adaptively balancing global and local feature selection. ProtoGate then employs a non-parametric prototype-based prediction mechanism to tackle the co-adaptation problem, ensuring that feature selection and predictions are consistent with the underlying data clusters. We conduct comprehensive experiments to evaluate the performance and interpretability of ProtoGate on synthetic and real-world datasets. The results show that ProtoGate generally outperforms state-of-the-art methods in prediction accuracy by a clear margin while providing high-fidelity feature selection and explainable predictions. Code is available at https://github.com/SilenceX12138/ProtoGate.
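The two-stage idea in the abstract — produce instance-wise feature gates, then predict non-parametrically from nearby prototypes in the gated feature space — can be illustrated with a minimal sketch. This is a hypothetical toy, not ProtoGate's actual implementation: the gate weights `W` are random rather than learned, the hard-sigmoid relaxation stands in for a generic stochastic-gate mechanism, and the function names are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy HDLSS setting: 20 samples, 100 features, binary labels
X = rng.normal(size=(20, 100))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Hypothetical gating parameters; in a trained model these would be learned
W = rng.normal(scale=0.1, size=(100, 100))

def soft_gates(x, W):
    """Instance-wise gates in [0, 1] via a hard-sigmoid relaxation
    (a common differentiable surrogate for binary feature masks)."""
    mu = x @ W  # per-instance gate logits
    return np.clip(mu + 0.5, 0.0, 1.0)

def prototype_predict(x_new, X_proto, y_proto, W, k=3):
    """Non-parametric prediction: majority vote among the k nearest
    prototypes, with distances computed on the gated features only."""
    g = soft_gates(x_new, W)                      # features for this instance
    d = np.linalg.norm((X_proto - x_new) * g, axis=1)
    nearest = np.argsort(d)[:k]                   # k closest prototypes
    return np.bincount(y_proto[nearest]).argmax()

pred = prototype_predict(X[0], X, y, W)
```

Because the prediction is a vote over concrete training prototypes, each output can be explained by pointing at the retrieved neighbors and the features the gates kept, which is the interpretability property the abstract emphasizes.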