
A large-scale study of the World Wide Web: network correlation functions with scale-invariant boundaries

Published 4 Dec 2012 in physics.soc-ph and cs.SI | (1212.0749v2)

Abstract: We performed a large-scale crawl of the World Wide Web, covering 6.9 million domains and 57 million subdomains, including all high-traffic sites of the Internet. We present a study of the correlations found between quantities measuring the structural relevance of each node in the network (the in- and out-degree, the local clustering coefficient, the first-neighbor in-degree and the Alexa rank). We find that some of these properties show strong correlation effects and that the dependencies arising from these correlations follow power laws not only for the averages, but also for the boundaries of the respective density distributions. In addition, these scale-free limits do not follow the same exponents as the corresponding averages. In our study we retain the directionality of the hyperlinks and develop a statistical estimate for the clustering coefficient of directed graphs. We include in our study the correlations between the in-degree and the Alexa traffic rank, a popular index for the traffic volume, finding non-trivial power-law correlations. We find that sites with more or fewer than about one thousand links from different domains have remarkably different statistical properties, for all correlation functions studied, pointing to an underlying hierarchical structure of the World Wide Web.
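The claim that averages and boundaries follow power laws with different exponents can be illustrated with a minimal sketch. This is not the paper's procedure. It assumes, purely for illustration, that the data are (x, y) pairs such as (in-degree, first-neighbor in-degree), that x is grouped into factor-of-two bins, and that the upper boundary is taken as the per-bin maximum. The function names `loglog_slope` and `binned_mean_and_upper` are hypothetical.

```python
import math
from collections import defaultdict


def loglog_slope(xs, ys):
    """Least-squares slope of log(y) against log(x), i.e. a power-law exponent estimate."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mx = sum(lx) / n
    my = sum(ly) / n
    sxx = sum((a - mx) ** 2 for a in lx)
    sxy = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    return sxy / sxx


def binned_mean_and_upper(pairs):
    """Group (x, y) pairs into factor-of-two bins of x.

    Returns three lists, one entry per non-empty bin: the geometric bin
    centre, the mean of y, and the maximum of y. Using the maximum as the
    "upper boundary" is an assumption made for this sketch.
    """
    bins = defaultdict(list)
    for x, y in pairs:
        if x > 0 and y > 0:  # logarithms need positive values
            bins[math.floor(math.log2(x))].append(y)
    centres, means, uppers = [], [], []
    for k in sorted(bins):
        ys = bins[k]
        centres.append(2.0 ** (k + 0.5))
        means.append(sum(ys) / len(ys))
        uppers.append(max(ys))
    return centres, means, uppers


if __name__ == "__main__":
    # Synthetic data only: each x carries one point on y = x and one on y = sqrt(x).
    pairs = [(2.0 ** k, 2.0 ** k) for k in range(1, 9)]
    pairs += [(2.0 ** k, 2.0 ** (k / 2)) for k in range(1, 9)]
    centres, means, uppers = binned_mean_and_upper(pairs)
    print("exponent of the average:", loglog_slope(centres, means))
    print("exponent of the upper boundary:", loglog_slope(centres, uppers))
```

On this synthetic input the fitted exponent of the upper boundary differs from that of the average, which is the qualitative effect described in the abstract. With real crawl data a high quantile may be a more robust boundary estimate than the maximum.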

Citations (3)
