Self Supervised Correlation-based Permutations for Multi-View Clustering
Abstract: Combining data from different sources can improve data analysis tasks such as clustering. However, most of the current multi-view clustering methods are limited to specific domains or rely on a suboptimal and computationally intensive two-stage process of representation learning and clustering. We propose an end-to-end deep learning-based multi-view clustering framework for general data types (such as images and tables). Our approach involves generating meaningful fused representations using a novel permutation-based canonical correlation objective. We provide a theoretical analysis showing how the learned embeddings approximate those obtained by supervised linear discriminant analysis (LDA). Cluster assignments are learned by identifying consistent pseudo-labels across multiple views. Additionally, we establish a theoretical bound on the error caused by incorrect pseudo-labels in the unsupervised representations compared to LDA. Extensive experiments on ten multi-view clustering benchmark datasets provide empirical evidence for the effectiveness of the proposed model.