
A framework to compute page importance based on user behaviors

Information Retrieval, 13 (1): 22-45 (February 2010)


Abstract: This paper is concerned with a framework to compute the importance of webpages by using real browsing behaviors of Web users. In contrast, many previous approaches like PageRank compute page importance through the use of the hyperlink graph of the Web. Recently, people have realized that the hyperlink graph is incomplete and inaccurate as a data source for determining page importance, and proposed using the real behaviors of Web users instead. In this paper, we propose a formal framework to compute page importance from user behavior data (which covers some previous works as special cases). First, we use a stochastic process to model the browsing behaviors of Web users. According to the analysis on hundreds of millions of real records of user behaviors, we justify that the process is actually a continuous-time time-homogeneous Markov process, and its stationary probability distribution can be used as the measure of page importance. Second, we propose a number of ways to estimate parameters of the stochastic process from real data, which result in a group of algorithms for page importance computation (all referred to as BrowseRank). Our experimental results have shown that the proposed algorithms can outperform the baseline methods such as PageRank and TrustRank in several tasks, demonstrating the advantage of using our proposed framework.
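The central idea of the abstract, using the stationary distribution of a continuous-time Markov process as page importance, can be illustrated with a minimal sketch. It is not the paper's implementation. It relies on a standard result for continuous-time Markov processes: the stationary probability of a state is proportional to its stationary probability in the embedded (jump) chain multiplied by its mean staying time. The function names, the uniform reset with a `damping` factor, and the toy inputs are all assumptions made for illustration; the abstract does not specify how the parameters are estimated.

```python
def embedded_stationary(P, damping=0.85, iters=200):
    """Approximate the stationary distribution of a row-stochastic
    transition matrix P by power iteration.

    The uniform reset weighted by (1 - damping) is an assumption made
    here so that the iteration converges on any input graph.
    """
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        nxt = [(1.0 - damping) / n] * n
        for i in range(n):
            for j in range(n):
                nxt[j] += damping * pi[i] * P[i][j]
        pi = nxt
    return pi


def page_importance(P, mean_stay, damping=0.85):
    """Combine the embedded-chain distribution with mean staying times.

    P[i][j]      : hypothetical estimated probability of moving from page i to page j
    mean_stay[i] : hypothetical estimated mean time a user spends on page i
    Returns a list of importance scores that sums to 1.
    """
    pi_embedded = embedded_stationary(P, damping)
    weighted = [p * t for p, t in zip(pi_embedded, mean_stay)]
    total = sum(weighted)
    return [w / total for w in weighted]


# Toy example: three pages visited equally often, but users stay
# twice as long on the second page.
P = [
    [0.0, 0.5, 0.5],
    [0.5, 0.0, 0.5],
    [0.5, 0.5, 0.0],
]
scores = page_importance(P, mean_stay=[1.0, 2.0, 1.0])
```

In this toy case the embedded chain visits every page equally often, so the scores differ only because of the staying times. That is the signal a purely link-based score cannot capture.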


SpringerLink - Journal article


