Abstract
While Web archive quality is endangered by Web spam, a side effect
of the high commercial value of top-ranked search-engine results,
so far Web spam filtering technologies are rarely used by Web
archivists. In this paper we make the first attempt to disseminate
existing methodology and envision a solution for Web archives to
share knowledge and unite efforts in Web spam hunting. We survey
the state of the art in Web spam filtering, illustrated by the recent
Web spam challenge data sets and techniques, and describe the filtering
solution for archives envisioned in the LiWA (Living Web
Archives) project.