Papers
Topics
Authors
Recent
Search
2000 character limit reached

MAFS: Multi-head Attention Feature Selection for High-Dimensional Data via Deep Fusion of Filter Methods

Published 6 Jan 2026 in cs.LG and stat.ME | (2601.02668v1)

Abstract: Feature selection is essential for high-dimensional biomedical data, enabling stronger predictive performance, reduced computational cost, and improved interpretability in precision medicine applications. Existing approaches face notable challenges. Filter methods are highly scalable but cannot capture complex relationships or eliminate redundancy. Deep learning-based approaches can model nonlinear patterns but often lack stability, interpretability, and efficiency at scale. Single-head attention improves interpretability but is limited in capturing multi-level dependencies and remains sensitive to initialization, reducing reproducibility. Most existing methods rarely combine statistical interpretability with the representational power of deep learning, particularly in ultra-high-dimensional settings. Here, we introduce MAFS (Multi-head Attention-based Feature Selection), a hybrid framework that integrates statistical priors with deep learning capabilities. MAFS begins with filter-based priors for stable initialization and guide learning. It then uses multi-head attention to examine features from multiple perspectives in parallel, capturing complex nonlinear relationships and interactions. Finally, a reordering module consolidates outputs across attention heads, resolving conflicts and minimizing information loss to generate robust and consistent feature rankings. This design combines statistical guidance with deep modeling capacity, yielding interpretable importance scores while maximizing retention of informative signals. Across simulated and real-world datasets, including cancer gene expression and Alzheimer's disease data, MAFS consistently achieves superior coverage and stability compared with existing filter-based and deep learning-based alternatives, offering a scalable, interpretable, and robust solution for feature selection in high-dimensional biomedical data.

Authors (3)

Summary

No one has generated a summary of this paper yet.

Paper to Video (Beta)

No one has generated a video about this paper yet.

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Collections

Sign up for free to add this paper to one or more collections.

Tweets

Sign up for free to view the 1 tweet with 0 likes about this paper.