This page provides two large hyperlink graphs for public download. The graphs have been extracted from the 2012 and 2014 versions of the Common Crawl web corpora. The 2012 graph covers 3.5 billion web pages and 128 billion hyperlinks between these pages. To the best of our knowledge, it is the largest hyperlink graph available to the public outside companies such as Google, Yahoo, and Microsoft. The 2014 graph covers 1.7 billion web pages connected by 64 billion hyperlinks. Below we provide instructions on how to download the graphs as well as basic statistics about their topology.
Regarding links: about 12 years ago we built a software framework named "Roku" in the (then new) Java language. Our ontology in Roku (Japanese for 'six') sorted everything into one of six categories (Who, What, When, Where, Why and How).
This page links to 868 pages around the web with information on Artificial Intelligence. Some of the links will pop up additional information when you move the mouse over them.
G. Lee, S. Kang, and J. Whang. Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, ACM, (July 2019)
D. Gibson, J. Kleinberg, and P. Raghavan. Proceedings of the ninth ACM conference on Hypertext and hypermedia: links, objects, time and space - structure in hypermedia systems - HYPERTEXT '98, ACM Press, (1998)
D. Liben-Nowell and J. Kleinberg. Proceedings of the twelfth international conference on Information and knowledge management - CIKM '03, ACM Press, (2003)
Y. Chung, M. Toyoda, and M. Kitsuregawa. Proceedings of the 5th International Workshop on Adversarial Information Retrieval on the Web, pages 9-16. New York, NY, USA, ACM, (2009)
K. Nielsen, R. Gude, M. Petersen, and K. Grønbæk. HT '09: Proceedings of the Twentieth ACM Conference on Hypertext and Hypermedia, New York, NY, USA, ACM, (July 2009)
R. Baeza-Yates, P. Boldi, and C. Castillo. Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval, pages 308-315. New York, NY, USA, ACM, (2006)
S. Stober and A. Nürnberger. Knowledge-Based Intelligent Information and Engineering Systems (KES 2006), volume 4251 of LNAI, pages 763-770. Berlin / Heidelberg, Springer Verlag, (October 2006)
C. Herzog, M. Luger, and M. Herzog. Proceedings of the ESWC'07 workshop "Bridging the Gap between Semantic Web and Web 2.0", Innsbruck, Austria, (June 2007)
E. Oren, J. Breslin, and S. Decker. WWW '06: Proceedings of the 15th international conference on World Wide Web, pages 1071-1072. New York, NY, USA, ACM Press, (2006)