Learning rates of $l^q$ coefficient regularization learning with Gaussian kernel
Abstract: Regularization is a well-recognized and powerful strategy for improving the performance of a learning machine, and $l^q$ regularization schemes with $0<q<\infty$ are in widespread use. It is known that different values of $q$ lead to different properties of the resulting estimators; for example, $l^2$ regularization yields smooth estimators, while $l^1$ regularization yields sparse estimators. How, then, do the generalization capabilities of $l^q$ regularization learning vary with $q$? In this paper, we study this problem in the framework of statistical learning theory and show that implementing $l^q$ coefficient regularization schemes in the sample-dependent hypothesis space associated with the Gaussian kernel attains the same almost optimal learning rates for all $0<q<\infty$. That is, the upper and lower bounds of the learning rates for $l^q$ regularization learning are asymptotically identical for all $0<q<\infty$. Our finding tentatively reveals that, in some modeling contexts, the choice of $q$ might not have a strong impact on the generalization capability. From this perspective, $q$ can be specified arbitrarily, or chosen according to other, non-generalization criteria such as smoothness, computational complexity, and sparsity.
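The scheme studied here is coefficient regularization: the estimator is a linear combination of Gaussian kernel functions centered at the sample points, with the coefficient vector penalized by its $l^q$ "norm". Below is a minimal numerical sketch of this setup, assuming the standard form $\frac{1}{m}\sum_{i}(y_i - \sum_j a_j K_\sigma(x_j, x_i))^2 + \lambda \sum_j |a_j|^q$ from the coefficient-regularization literature; the function names, parameter values, and the use of SciPy's generic optimizer are illustrative choices, not the authors' implementation.

```python
import numpy as np
from scipy.optimize import minimize

def gaussian_kernel(X1, X2, sigma=1.0):
    """Gaussian kernel matrix K[i, j] = exp(-||x1_i - x2_j||^2 / sigma^2)."""
    d2 = (np.sum(X1**2, axis=1)[:, None]
          + np.sum(X2**2, axis=1)[None, :]
          - 2.0 * X1 @ X2.T)
    return np.exp(-d2 / sigma**2)

def lq_coefficient_regression(X, y, q=1.5, lam=1e-3, sigma=1.0):
    """Fit f(x) = sum_j a_j K_sigma(x, x_j) in the sample-dependent
    hypothesis space by minimizing
        (1/m) * sum_i (y_i - f(x_i))^2 + lam * sum_j |a_j|^q
    over the coefficient vector a (hypothetical illustrative solver)."""
    m = X.shape[0]
    K = gaussian_kernel(X, X, sigma)

    def objective(a):
        residual = y - K @ a
        return np.mean(residual**2) + lam * np.sum(np.abs(a)**q)

    # Derivative-free method, since |a|^q is non-smooth at 0 when q <= 1.
    res = minimize(objective, np.zeros(m), method="Powell")
    a_hat = res.x

    def predict(X_new):
        return gaussian_kernel(X_new, X, sigma) @ a_hat

    return a_hat, predict

# Usage: regress noisy samples of sin(pi * x) with q = 1 (sparse coefficients).
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(30, 1))
y = np.sin(np.pi * X[:, 0]) + 0.1 * rng.standard_normal(30)
a_hat, predict = lq_coefficient_regression(X, y, q=1.0, lam=1e-2, sigma=0.5)
```

In this sketch, varying `q` changes the character of the fitted coefficients (e.g., sparsity for `q <= 1`), while the paper's result concerns the learning rates, which are asymptotically the same for every fixed `q` in $(0, \infty)$.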