Identifying featured articles in Wikipedia: writing style matters

Proceedings of the 19th international conference on World wide web, pages 1147--1148. New York, NY, USA, ACM, (2010)
DOI: http://doi.acm.org/10.1145/1772690.1772847

Abstract

Wikipedia provides an information quality assessment model with criteria that human peer reviewers use to identify featured articles. For the classification task “Is an article featured or not?” we present a machine learning approach that exploits an article’s character trigram distribution. Our approach differs from existing research in that it targets writing style rather than evaluating meta features such as the edit history. The approach is robust, straightforward to implement, and outperforms existing solutions. We underpin these claims with an experiment design that analyzes, among other things, domain transferability. The achieved F-measure for featured articles is 0.964 within a single Wikipedia domain and 0.880 in a domain transfer situation.
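The two technical ingredients named in the abstract, a character trigram distribution and the F-measure, can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not specify the classifier, any text preprocessing, or the trigram weighting, so the function names `char_trigram_distribution` and `f_measure` and the choice of plain relative frequencies are hypothetical and not the authors' implementation.

```python
from collections import Counter


def char_trigram_distribution(text):
    """Relative frequency of each overlapping character trigram in text.

    Assumption: no normalization (case folding, markup stripping) is
    applied; the abstract does not say what preprocessing was used.
    """
    trigrams = [text[i:i + 3] for i in range(len(text) - 2)]
    total = len(trigrams)
    if total == 0:
        return {}
    counts = Counter(trigrams)
    return {gram: count / total for gram, count in counts.items()}


def f_measure(true_pos, false_pos, false_neg):
    """Balanced F-measure (harmonic mean of precision and recall)."""
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy input, not data from the paper.
    print(char_trigram_distribution("abab"))
    print(f_measure(true_pos=8, false_pos=2, false_neg=2))
```

Each article would be mapped to such a distribution and passed to a binary classifier of the reader's choice; the reported values 0.964 and 0.880 are F-measures of this kind computed for the "featured" class.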
