The Projection Method: a Unified Formalism for Community Detection
Abstract: We present the class of projection methods for community detection, which generalizes many popular community detection methods. In this framework, we represent each clustering (partition) by a vector on a high-dimensional hypersphere. A community detection method is a projection method if it can be described by the following two-step approach: 1) the graph is mapped to a query vector on the hypersphere; and 2) the query vector is projected onto the set of clustering vectors. This projection step is performed by minimizing the distance between the query vector and the clustering vector over the set of all clusterings. We prove that Markov stability optimization, modularity optimization, likelihood maximization for planted partition models, and correlation clustering all fit this framework. A consequence of this equivalence is that algorithms for each of these methods can be modified to perform the projection step in our framework. In addition, we show that these different methods suffer from the same granularity problem: they have parameters that control the granularity of the resulting clustering, but choosing them to obtain clusterings of the desired granularity is nontrivial. We provide a general heuristic to address this granularity problem, which can be applied to any projection method. Finally, we show how, given a generator of graphs with community structure, we can optimize a projection method for this generator in order to obtain a community detection method that performs well on the graphs it produces.
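The two-step approach described in the abstract can be illustrated with a small brute-force sketch. The abstract does not specify the encoding, so everything concrete here is an assumption made for illustration: clusterings are encoded as vectors indexed by vertex pairs (+1 if the pair shares a cluster, -1 otherwise), the query vector is the simplest possible map (+1 for an edge, -1 for a non-edge, which resembles correlation clustering), and the distance is Euclidean. The function names are hypothetical, and the exhaustive search over all partitions is only feasible for very small graphs; it is not the algorithm used in the paper.

```python
from itertools import combinations
import math


def set_partitions(items):
    """Yield every partition of the list `items` as a list of blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Place `first` into each existing block in turn ...
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ... or give it a block of its own.
        yield [[first]] + partition


def clustering_vector(partition, nodes):
    """Assumed encoding: +1 for pairs in the same block, -1 otherwise."""
    label = {v: k for k, block in enumerate(partition) for v in block}
    return [1.0 if label[u] == label[v] else -1.0
            for u, v in combinations(nodes, 2)]


def query_vector(edges, nodes):
    """Step 1 (assumed mapping): +1 for an edge, -1 for a non-edge."""
    edge_set = {frozenset(e) for e in edges}
    return [1.0 if frozenset((u, v)) in edge_set else -1.0
            for u, v in combinations(nodes, 2)]


def project(query, nodes):
    """Step 2: return the partition whose vector is closest to `query`."""
    return min(
        set_partitions(list(nodes)),
        key=lambda p: math.dist(query, clustering_vector(p, nodes)),
    )


# Two triangles joined by a single bridge edge (2, 3).
nodes = [0, 1, 2, 3, 4, 5]
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
best = project(query_vector(edges, nodes), nodes)
print(sorted(sorted(block) for block in best))  # → [[0, 1, 2], [3, 4, 5]]
```

Because every clustering vector in this encoding has the same norm, all of them lie on one hypersphere, and the squared Euclidean distance to the query is four times the number of vertex pairs on which the two vectors disagree. In the example, the only disagreement for the returned partition is the bridge edge.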