
Node Regression on Latent Position Random Graphs via Local Averaging

Published 29 Oct 2024 in stat.ML and cs.LG (arXiv:2410.21987v1)

Abstract: Node regression consists of predicting the value of a graph label at a node, given observations at the other nodes. To gain insight into the performance of various estimators for this task, we perform a theoretical study in a setting where the graph is random. Specifically, we assume that the graph is generated by a Latent Position Model, in which each node has a latent position and the probability that two nodes are connected depends on the distance between their latent positions. In this context, we begin by studying the simplest possible estimator for graph regression, which averages the label values at all neighboring nodes. We show that in Latent Position Models this estimator converges to a Nadaraya-Watson estimator in the latent space, and that its rate of convergence is in fact the same. One issue with this standard estimator is that it averages over the region consisting of all neighbors of a node, and depending on the graph model this region may be too large or too small. An alternative is to first estimate the true distances between the latent positions and then plug these estimated distances into a classical Nadaraya-Watson estimator. This enables averaging over regions either smaller or larger than the typical graph neighborhood. We show that this method can achieve standard nonparametric rates in certain instances even when the graph neighborhood is too large or too small.
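The two estimators discussed in the abstract can be sketched in a toy setting. This is an illustrative assumption-laden setup, not the paper's exact model: 1-D uniform latent positions, a Gaussian connection kernel, a made-up regression function, and the *true* latent distances plugged into the Nadaraya-Watson step in place of the estimated distances the paper constructs from the graph.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Toy Latent Position Model (illustrative choices throughout) ---
n = 500
x = rng.uniform(0.0, 1.0, size=n)            # 1-D latent positions
def f(t):                                    # unknown regression function
    return np.sin(2 * np.pi * t)
y = f(x) + 0.1 * rng.standard_normal(n)      # noisy labels at the nodes

# Connection probability decays with latent distance; r sets the
# typical graph neighborhood size.
r = 0.1
p = np.exp(-((x[:, None] - x[None, :]) / r) ** 2)
np.fill_diagonal(p, 0.0)
adj = rng.uniform(size=(n, n)) < p
adj = np.triu(adj, 1)
adj = adj | adj.T                            # symmetric: undirected graph

# --- Estimator 1: local averaging over graph neighbors ---
deg = adj.sum(axis=1)
y_avg = np.where(deg > 0, adj @ y / np.maximum(deg, 1), 0.0)

# --- Estimator 2: Nadaraya-Watson with plugged-in latent distances ---
# The paper estimates the latent distances from the graph; here we plug in
# the true ones only to show the plug-in step. The bandwidth h can now be
# tuned independently of the graph neighborhood size r.
h = 0.05
k = np.exp(-((x[:, None] - x[None, :]) / h) ** 2)
np.fill_diagonal(k, 0.0)                     # predict from the *other* nodes
y_nw = k @ y / np.maximum(k.sum(axis=1), 1e-12)
```

With a smaller bandwidth `h < r`, the Nadaraya-Watson weights concentrate on a region tighter than the graph neighborhood that the plain neighbor average is forced to use, which is the flexibility the abstract highlights.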

