Abstract

Network analysis is a quantitative methodology for studying properties related to connectivity and distances in graphs, with diverse applications such as citation indexing and information retrieval on the Web. The hyperlinked structure of Wikipedia, together with the ongoing, incremental editing process behind it, makes it an interesting and unexplored target domain for network analysis techniques. In this paper we apply two relevance metrics, HITS and PageRank, to the whole set of English Wikipedia entries in order to gain some preliminary insights into the macro-structure of the corpus's organization and into cultural biases related to specific topics.
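
As an illustration of the two metrics named above, the following is a minimal sketch of PageRank and HITS computed by plain iteration on a toy link graph. It is not the paper's implementation: the page names in `toy_links`, the damping factor of 0.85, the fixed iteration count, and the uniform handling of pages without outgoing links are all assumptions made for the example.

```python
from math import sqrt


def _nodes(links):
    # Every page that appears as a link source or as a link target.
    return sorted(set(links) | {t for targets in links.values() for t in targets})


def pagerank(links, damping=0.85, iterations=100):
    """Power-iteration PageRank; `links` maps a page to the pages it links to."""
    nodes = _nodes(links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Rank held by pages with no outgoing links is spread uniformly (assumption).
        dangling = sum(rank[v] for v in nodes if not links.get(v))
        new = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for src in nodes:
            targets = links.get(src, [])
            for t in targets:
                new[t] += damping * rank[src] / len(targets)
        rank = new
    return rank


def hits(links, iterations=100):
    """Iterative HITS; returns (hub, authority) scores, each L2-normalised."""
    nodes = _nodes(links)
    hub = {v: 1.0 for v in nodes}
    auth = {v: 1.0 for v in nodes}
    for _ in range(iterations):
        # Authority of a page: sum of the hub scores of the pages linking to it.
        auth = {v: 0.0 for v in nodes}
        for src in nodes:
            for t in links.get(src, []):
                auth[t] += hub[src]
        norm = sqrt(sum(a * a for a in auth.values())) or 1.0
        auth = {v: a / norm for v, a in auth.items()}
        # Hub score of a page: sum of the authority scores of the pages it links to.
        hub = {v: sum(auth[t] for t in links.get(v, [])) for v in nodes}
        norm = sqrt(sum(h * h for h in hub.values())) or 1.0
        hub = {v: h / norm for v, h in hub.items()}
    return hub, auth


# Hypothetical miniature "wiki": each key links to the pages in its list.
toy_links = {
    "Italy": ["Rome", "Europe"],
    "Rome": ["Italy"],
    "Europe": ["Italy"],
    "Pizza": ["Italy"],
}

ranks = pagerank(toy_links)
hubs, authorities = hits(toy_links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(f"{page:8s} pagerank={ranks[page]:.3f} authority={authorities[page]:.3f} hub={hubs[page]:.3f}")
```

In this toy graph the page with the most incoming links receives the highest PageRank and authority score, while pages that only point to it act as hubs. On the full Wikipedia link graph a sparse-matrix formulation would be needed instead of Python dictionaries.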
