Recommender systems have been evaluated in many, often incomparable, ways. In this paper we review the key decisions in evaluating
collaborative filtering recommender systems: the user tasks being evaluated, the types of analysis and datasets being used, the ways in which
prediction quality is measured, the evaluation of prediction attributes other than quality, and the user-based evaluation of the system as a whole.
In addition to reviewing the evaluation strategies used by prior researchers, we present empirical results from the analysis of various accuracy
metrics on one content domain where all the tested metrics collapsed roughly into three equivalence classes. Metrics within each equivalency
class were strongly correlated, while metrics from different equivalency classes were uncorrelated.
Description
Paper suggested by Netflix about evaluation of recommender systems (see also Netflix prize: http://www.netflixprize.com/faq)
%0 Journal Article
%1 herlocker2004
%A Herlocker, J.
%A Konstan, J.
%A Terveen, L.
%A Riedl, J.
%D 2004
%I ACM Press
%J ACM Transactions on Information Systems
%K collaborative evaluation netflix recommendation recommender social tagora
%P 5-53
%T Evaluating Collaborative Filtering Recommender Systems
%U http://web.engr.oregonstate.edu/~herlock/papers/eval_tois.pdf
%V 22
%X Recommender systems have been evaluated in many, often incomparable, ways. In this paper we review the key decisions in evaluating
collaborative filtering recommender systems: the user tasks being evaluated, the types of analysis and datasets being used, the ways in which
prediction quality is measured, the evaluation of prediction attributes other than quality, and the user-based evaluation of the system as a whole.
In addition to reviewing the evaluation strategies used by prior researchers, we present empirical results from the analysis of various accuracy
metrics on one content domain where all the tested metrics collapsed roughly into three equivalence classes. Metrics within each equivalency
class were strongly correlated, while metrics from different equivalency classes were uncorrelated.
@article{herlocker2004,
abstract = {Recommender systems have been evaluated in many, often incomparable, ways. In this paper we review the key decisions in evaluating
collaborative filtering recommender systems: the user tasks being evaluated, the types of analysis and datasets being used, the ways in which
prediction quality is measured, the evaluation of prediction attributes other than quality, and the user-based evaluation of the system as a whole.
In addition to reviewing the evaluation strategies used by prior researchers, we present empirical results from the analysis of various accuracy
metrics on one content domain where all the tested metrics collapsed roughly into three equivalence classes. Metrics within each equivalency
class were strongly correlated, while metrics from different equivalency classes were uncorrelated.},
added-at = {2007-03-20T11:53:19.000+0100},
author = {Herlocker, J. and Konstan, J. and Terveen, L. and Riedl, J.},
biburl = {https://www.bibsonomy.org/bibtex/23290b2215a050741a23547b47f195694/andreab},
  description = {Paper suggested by Netflix about evaluation of recommender systems (see also Netflix prize: http://www.netflixprize.com/faq)},
interhash = {f8a70731d983634ac7105896d101c9d2},
intrahash = {3290b2215a050741a23547b47f195694},
journal = {ACM Transactions on Information Systems},
keywords = {collaborative evaluation netflix recommendation recommender social tagora},
  pages = {5--53},
  publisher = {ACM Press},
timestamp = {2007-03-20T11:53:19.000+0100},
title = {Evaluating Collaborative Filtering Recommender Systems},
url = {http://web.engr.oregonstate.edu/~herlock/papers/eval_tois.pdf},
volume = 22,
year = 2004
}