Balancing Summarization and Change Detection in Graph Streams
Abstract: This study addresses how to balance graph summarization against graph change detection. Graph summarization compresses a large-scale graph into a smaller one. However, a question remains: to what extent should the original graph be compressed? We answer this question from the perspective of graph change detection, which aims to detect statistically significant changes in a stream of summary graphs. If the compression rate is extremely high, important changes can be missed, whereas if it is extremely low, false alarms may increase and more memory is required. There is thus a trade-off between the compression rate in graph summarization and the accuracy of change detection. We propose a novel quantitative methodology that balances this trade-off so that reliable graph summarization and change detection are realized simultaneously. We introduce the probabilistic structure of a hierarchical latent variable model into a graph and design a parameterized summary graph on the basis of the minimum description length (MDL) principle. The parameter specifying the summary graph is then optimized so that the Type I error probability (the probability of raising a false alarm) of change detection is guaranteed to be less than a given confidence level. We first provide a theoretical framework connecting graph summarization with change detection, and then empirically demonstrate its effectiveness on synthetic and real datasets.
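To make the idea of thresholding a description-length statistic concrete, the following is a minimal sketch and not the method of the paper. It assumes a binary stream (for example, whether an edge is observed between two groups of a summary graph at each time step), uses a simplified two-part Bernoulli code length in place of the hierarchical latent variable model, and splits the window at its midpoint. The names `bernoulli_code_length`, `change_score`, and `detect`, as well as the threshold value, are hypothetical.

```python
import math
from typing import Sequence


def bernoulli_code_length(ones: int, n: int) -> float:
    """Simplified two-part code length in nats: n * H(p_hat) + 0.5 * log(n).

    This is an assumed stand-in for the code length of a summary graph.
    """
    if n == 0:
        return 0.0
    p_hat = ones / n
    entropy = 0.0
    for q in (p_hat, 1.0 - p_hat):
        if q > 0.0:
            entropy -= q * math.log(q)
    return n * entropy + 0.5 * math.log(n)


def change_score(window: Sequence[int]) -> float:
    """Code length saved by encoding the two halves of the window separately."""
    half = len(window) // 2
    left, right = window[:half], window[half:]
    whole = bernoulli_code_length(sum(window), len(window))
    split = (bernoulli_code_length(sum(left), len(left))
             + bernoulli_code_length(sum(right), len(right)))
    return whole - split


def detect(window: Sequence[int], threshold: float) -> bool:
    """Raise an alarm when the saving exceeds a (hypothetical) threshold."""
    return change_score(window) > threshold


steady = [0, 1] * 10            # edge density stays near 0.5
shifted = [0] * 10 + [1] * 10   # edge density jumps from 0 to 1
print(detect(steady, threshold=3.0), detect(shifted, threshold=3.0))
```

In this sketch a larger threshold lowers the false-alarm rate at the cost of missing smaller changes. The abstract describes choosing the summary-graph parameter so that the Type I error probability stays below a given level; here the threshold is simply supplied by hand.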