Feature learning with sharp rates for low-degree polynomial regression

Develop a theoretical framework that characterizes and leverages the feature learning effect of neural networks to learn degree-ℓ0 spherical polynomials on the unit sphere S^{d−1} with sharp (minimax) population regression rates.
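To make the regression setup concrete, here is a minimal sketch of the data model: covariates drawn uniformly from the unit sphere S^{d−1} and noisy labels from a low-degree spherical polynomial. The specific target q(x) = ⟨x, w⟩² − 1/d (a degree-2 example; subtracting 1/d makes it mean-zero on the sphere, so it lies in the degree-2 harmonic component), the direction w, and all function names are illustrative assumptions, not the paper's construction.

```python
import math
import random

def sample_sphere(d, rng):
    """Uniform point on S^{d-1}: normalize a standard Gaussian vector."""
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def target(x, w):
    """Hypothetical degree-2 spherical polynomial q(x) = <x, w>^2 - 1/d.

    For x uniform on the sphere and unit w, E[<x, w>^2] = 1/d, so the
    shift makes q mean-zero, i.e. a degree-2 spherical harmonic in x.
    """
    d = len(x)
    dot = sum(a * b for a, b in zip(x, w))
    return dot * dot - 1.0 / d

def make_dataset(n, d, noise=0.1, seed=0):
    """Draw n noisy observations y_i = q(x_i) + eps_i, x_i uniform on S^{d-1}."""
    rng = random.Random(seed)
    w = sample_sphere(d, rng)  # unit direction defining the target
    data = []
    for _ in range(n):
        x = sample_sphere(d, rng)
        y = target(x, w) + rng.gauss(0.0, noise)
        data.append((x, y))
    return w, data
```

For degree-ℓ0 targets of this kind, the relevant function class has dimension on the order of d^{ℓ0}, which is why sample complexity scaling like d^{ℓ0} is the benchmark a sharp (minimax) rate must match.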

Background

A large body of work analyzes neural networks in the NTK or linearized regimes, where feature learning is limited. While some approaches attempt to go beyond linearization, they typically do not achieve sharp minimax rates for learning low-degree polynomials on the sphere.

The authors identify the need to understand and rigorously exploit feature learning, rather than pure kernel behavior, to obtain sharp statistical guarantees. Their proposed learnable channel attention mechanism is designed to select harmonic degrees and realize feature learning at minimax-optimal rates; the statement frames this direction as an open problem motivating their contribution.

References

Furthermore, it is an open problem how to explore the feature learning effect of neural networks in learning such polynomials with sharp rates.

Shallow Neural Networks Learn Low-Degree Spherical Polynomials with Learnable Channel Attention  (2512.20562 - Yang, 23 Dec 2025) in Section 1 (Introduction)