A user browsing model to predict search engine click data from past observations.
G. Dupret and B. Piwowarski. SIGIR '08: Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval, pages 331--338. New York, NY, USA, ACM, (2008)
DOI: 10.1145/1390334.1390392
Abstract
Search engine click logs provide an invaluable source of relevance information, but this information is biased because we do not know which documents from the result list the users have actually seen before and after they clicked. Otherwise, we could estimate document relevance by simple counting. In this paper, we propose a set of assumptions on user browsing behavior that allows the estimation of the probability that a document is seen, thereby providing an unbiased estimate of document relevance. To train, test and compare our model to the best alternatives described in the literature, we gather a large set of real data and carry out an extensive cross-validation experiment. Our solution very significantly outperforms all previous models. As a side effect, we gain insight into the browsing behavior of users and we can compare it to the conclusions of an eye-tracking experiment by Joachims et al. [12]. In particular, our findings confirm that a user almost always sees the document directly after a clicked document. They also explain why documents situated just after a very relevant document are clicked more often.
%0 Conference Paper
%1 dupret2008browsing
%A Dupret, Georges E.
%A Piwowarski, Benjamin
%B SIGIR '08: Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
%C New York, NY, USA
%D 2008
%I ACM
%K clickdata clickthrough implicit-feedback
%P 331--338
%R 10.1145/1390334.1390392
%T A user browsing model to predict search engine click data from past observations.
%U http://portal.acm.org/citation.cfm?id=1390334.1390392
%X Search engine click logs provide an invaluable source of relevance information, but this information is biased because we do not know which documents from the result list the users have actually seen before and after they clicked. Otherwise, we could estimate document relevance by simple counting. In this paper, we propose a set of assumptions on user browsing behavior that allows the estimation of the probability that a document is seen, thereby providing an unbiased estimate of document relevance. To train, test and compare our model to the best alternatives described in the literature, we gather a large set of real data and carry out an extensive cross-validation experiment. Our solution very significantly outperforms all previous models. As a side effect, we gain insight into the browsing behavior of users and we can compare it to the conclusions of an eye-tracking experiment by Joachims et al. [12]. In particular, our findings confirm that a user almost always sees the document directly after a clicked document. They also explain why documents situated just after a very relevant document are clicked more often.
%@ 978-1-60558-164-4
@inproceedings{dupret2008browsing,
abstract = {Search engine click logs provide an invaluable source of relevance information, but this information is biased because we do not know which documents from the result list the users have actually seen before and after they clicked. Otherwise, we could estimate document relevance by simple counting. In this paper, we propose a set of assumptions on user browsing behavior that allows the estimation of the probability that a document is seen, thereby providing an unbiased estimate of document relevance. To train, test and compare our model to the best alternatives described in the literature, we gather a large set of real data and carry out an extensive cross-validation experiment. Our solution very significantly outperforms all previous models. As a side effect, we gain insight into the browsing behavior of users and we can compare it to the conclusions of an eye-tracking experiment by Joachims et al. [12]. In particular, our findings confirm that a user almost always sees the document directly after a clicked document. They also explain why documents situated just after a very relevant document are clicked more often.},
added-at = {2010-08-02T20:26:30.000+0200},
address = {New York, NY, USA},
author = {Dupret, Georges E. and Piwowarski, Benjamin},
biburl = {https://www.bibsonomy.org/bibtex/2f6c9528501178bf8c51644da59670b66/beate},
booktitle = {SIGIR '08: Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval},
description = {A user browsing model to predict search engine click data from past observations.},
doi = {10.1145/1390334.1390392},
interhash = {de6591a388954be3f1bf14e871b663c7},
intrahash = {f6c9528501178bf8c51644da59670b66},
isbn = {978-1-60558-164-4},
keywords = {clickdata clickthrough implicit-feedback},
location = {Singapore, Singapore},
pages = {331--338},
publisher = {ACM},
timestamp = {2011-07-29T07:48:49.000+0200},
title = {A user browsing model to predict search engine click data from past observations},
url = {http://portal.acm.org/citation.cfm?id=1390334.1390392},
year = 2008
}