Combining Density and Overlap (CoDO): A New Method for Assessing the Significance of Overlap Among Subgraphs
Abstract: Algorithms for detecting clusters (including overlapping clusters) in graphs have received significant attention in the research community. A closely related important aspect of the problem -- quantification of statistical significance of overlap of clusters, remains relatively unexplored. This paper presents the first theoretical and practical results on quantifying statistically significant interactions between clusters in networks. Such problems commonly arise in diverse applications, ranging from social network analysis to systems biology. The paper addresses the problem of quantifying the statistical significance of the observed overlap of the two clusters in an Erd\H{o}s-R\'enyi graph model. The analytical framework presented in the paper assigns a $p$-value to overlapping subgraphs by combining information about both the sizes of the subgraphs and their edge densities in comparison to the corresponding values for their overlapping component. This $p$-value is demonstrated to have excellent discrimination properties in real applications and is shown to be robust across broad parameter ranges. Our results are comprehensively validated on synthetic, social, and biological networks. We show that our framework: (i) derives insight from both the density and the size of overlap among communities (circles/pathways), (ii) consistently outperforms state-of-the-art methods over all tested datasets, and (iii) when compared to other measures, has much broader application scope. In the context of social networks, we identify highly interdependent (social) circles and show that our predictions are highly co-enriched with known user features. In networks of biomolecular interactions, we show that our method identifies novel cross-talk between pathways, sheds light on their mechanisms of interaction, and provides new opportunities for investigations of biomolecular interactions.
Paper Prompts
Sign up for free to create and run prompts on this paper using GPT-5.
Top Community Prompts
Collections
Sign up for free to add this paper to one or more collections.