Anchor-free Clustering based on Anchor Graph Factorization

Published 24 Feb 2024 in cs.LG | (2402.15688v2)

Abstract: Anchor-based methods are a pivotal approach to clustering large-scale data. However, these methods typically entail two distinct stages: selecting anchor points and constructing an anchor graph. This separation, along with the initialization of anchor points, significantly influences the overall performance of the algorithm. To mitigate these issues, we introduce a novel method termed Anchor-free Clustering based on Anchor Graph Factorization (AFCAGF). AFCAGF learns the anchor graph directly, requiring only the computation of pairwise distances between samples. This process, achievable through straightforward optimization, circumvents the need for explicit selection of anchor points. More concretely, our approach enhances the Fuzzy k-means clustering algorithm (FKM), introducing a new manifold learning technique that obviates the need for initializing cluster centers. Additionally, we extend the membership matrix between cluster centers and samples in FKM into an anchor graph encompassing multiple anchor points and samples. Applying Non-negative Matrix Factorization (NMF) to this anchor graph yields cluster labels directly, eliminating the need for further post-processing steps. To solve the proposed model, we implement an alternating optimization algorithm that ensures convergence. Empirical evaluations on various real-world datasets show that our algorithm outperforms traditional approaches.
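
The two building blocks the abstract names, the FKM membership matrix and NMF-based label readout, can be illustrated with a minimal sketch. It is not the AFCAGF algorithm: it assumes explicit cluster centers (which AFCAGF avoids), uses the textbook FKM membership update, and substitutes generic Lee-Seung multiplicative updates for the paper's alternating optimization. The function names `fkm_memberships` and `nmf_labels`, the toy data, and all parameter defaults are hypothetical.

```python
import numpy as np


def fkm_memberships(X, centers, m=2.0, eps=1e-12):
    """Textbook fuzzy k-means membership update for fuzzifier m > 1."""
    # Squared Euclidean distance from every sample to every center: shape (n, c).
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    # u_ij is proportional to d_ij^(-2/(m-1)); eps guards against zero distance.
    w = (d2 + eps) ** (-1.0 / (m - 1.0))
    # Normalize so that each sample's memberships sum to 1.
    return w / w.sum(axis=1, keepdims=True)


def nmf_labels(Z, n_clusters, n_iter=200, seed=0, eps=1e-12):
    """Factor a nonnegative matrix Z ~ F @ G.T and read labels from F.

    Assumption: generic Lee-Seung updates, not the paper's solver.
    """
    rng = np.random.default_rng(seed)
    n, k = Z.shape
    F = rng.random((n, n_clusters)) + 0.1
    G = rng.random((k, n_clusters)) + 0.1
    for _ in range(n_iter):
        # Multiplicative updates for the Frobenius-norm objective;
        # they keep F and G nonnegative by construction.
        F *= (Z @ G) / (F @ (G.T @ G) + eps)
        G *= (Z.T @ F) / (G @ (F.T @ F) + eps)
    # Each sample is assigned to the factor with the largest weight.
    return F.argmax(axis=1), F, G


# Toy data: two well-separated groups on a line (illustrative only).
X = np.array([[0.0, 0.0], [1.0, 0.0], [10.0, 0.0], [11.0, 0.0]])
centers = np.array([[0.5, 0.0], [10.5, 0.0]])

U = fkm_memberships(X, centers)          # soft sample-to-center graph, shape (4, 2)
labels, F, G = nmf_labels(U, n_clusters=2)
```

Here `U` plays the role of a small sample-to-anchor graph. In the setting the abstract describes, the graph would span many more anchors than clusters, and the factorization is what reduces it to one label per sample without a separate post-processing step.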

