Model-Based Clustering of Functional Data Via Random Projection Ensembles

Published 1 Dec 2025 in stat.ME | (2512.01450v1)

Abstract: Clustering functional data is a challenging task due to intrinsic infinite-dimensionality and the need for stable, data-adaptive partitioning. In this work, we propose a clustering framework based on Random Projections, which simultaneously performs dimensionality reduction and generates multiple stochastic representations of the original functions. Each projection is clustered independently, and the resulting partitions are then aggregated through an ensemble consensus procedure, enhancing robustness and mitigating the influence of any single projection. To focus on the most informative representations, projections are ranked according to clustering quality criteria, and only a selected subset is retained. In particular, we adopt Gaussian Mixture Models as base clusterers and employ the Kullback-Leibler divergence to order the random projections; these choices enable fast computation and eliminate the need to specify the number of clusters a priori. The performance of the proposed methodology is assessed through an extensive simulation study and two real-data applications, one from spectroscopy data for food authentication and one from log-periodograms of speech recording; the obtained results suggest that the proposal represents an effective tool for the clustering of functional data.