Kernel selection for RKHS alignment with empirical data

Determine a principled method for selecting the kernel function in kernel-based subspace clustering so that the induced reproducing kernel Hilbert space (RKHS) is well aligned with empirical data distributions, thereby enabling effective clustering of data drawn from nonlinear manifolds.

Background

Kernel-based subspace clustering methods map data into a reproducing kernel Hilbert space (RKHS) using the kernel trick, but their performance depends heavily on the choice of kernel function. Without a principled selection strategy, the induced RKHS may not reflect the intrinsic structure of the empirical data, leading to suboptimal clustering.
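As a rough illustration of this sensitivity, the sketch below builds Gaussian (RBF) Gram matrices for the same toy data under three bandwidth settings and compares their numerical rank. The toy data (two concentric circles), the helper names `rbf_gram` and `effective_rank`, the `gamma` values, and the tolerance are all assumptions made for this example; they are not taken from the cited paper, and the rank comparison is only a diagnostic, not a kernel selection criterion.

```python
import numpy as np

def rbf_gram(X, gamma):
    """Gram matrix K[i, j] = exp(-gamma * ||x_i - x_j||^2)."""
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    np.maximum(sq_dists, 0.0, out=sq_dists)  # clip tiny negatives from round-off
    return np.exp(-gamma * sq_dists)

def effective_rank(K, tol=1e-6):
    """Count eigenvalues larger than tol times the largest eigenvalue."""
    eigvals = np.linalg.eigvalsh(K)  # returned in ascending order
    return int(np.sum(eigvals > tol * eigvals[-1]))

# Hypothetical toy data: 50 points on each of two concentric circles.
rng = np.random.default_rng(0)
angles = rng.uniform(0.0, 2.0 * np.pi, size=100)
radii = np.repeat([1.0, 3.0], 50)
X = np.column_stack([radii * np.cos(angles), radii * np.sin(angles)])

# Same data, same kernel family, three bandwidths (assumed values).
ranks = {gamma: effective_rank(rbf_gram(X, gamma)) for gamma in (1e-3, 1.0, 1e3)}
for gamma, rank in ranks.items():
    print(f"gamma={gamma:g}: effective rank of Gram matrix = {rank}")
```

With a very small `gamma` the Gram matrix approaches an all-ones matrix, and with a very large `gamma` it approaches the identity. In both extremes the feature-space geometry carries little information about the two circles, which is one way to see why the kernel and its parameters have to be matched to the data.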

This uncertainty has motivated alternatives such as deep subspace clustering networks, which aim to learn embeddings that better fit union-of-linear-subspaces models without requiring kernel selection. Nonetheless, a systematic solution to kernel selection in the RKHS setting remains an open problem, as the introduction of the cited paper notes.

References

After many years of research, it is still unclear how to choose the kernel function so that the kernel-induced RKHS fits empirical data [19].

Label-independent hyperparameter-free self-supervised single-view deep subspace clustering (2504.18179 - Sindicic et al., 25 Apr 2025) in Section 1 (Introduction)