Analyzing and Accessing Wikipedia as a Lexical Semantic Resource
T. Zesch, I. Gurevych, and M. Mühlhäuser. Biannual Conference of the Society for Computational Linguistics and Language Technology (2007)
Abstract
We analyze Wikipedia as a lexical semantic resource and compare it with conventional resources, such as dictionaries, thesauri, semantic wordnets, etc. Different parts of Wikipedia reflect different aspects of these resources. We show that Wikipedia contains a vast amount of knowledge about, e.g., named entities, domain specific terms, and rare word senses. If Wikipedia is to be used as a lexical semantic resource in large-scale NLP tasks, efficient programmatic access to the knowledge therein is required. We review existing access mechanisms and show that they are limited with respect to performance and the provided access functions. Therefore, we introduce a general purpose, high performance Java-based Wikipedia API that overcomes these limitations. It is available for research purposes at http://www.ukp.tu-darmstadt.de/software/WikipediaAPI.
%0 Conference Paper
%1 citeulike:2348620
%A Zesch, Torsten
%A Gurevych, Iryna
%A Mühlhäuser, Max
%B Biannual Conference of the Society for Computational Linguistics and Language Technology
%D 2007
%K API READ WW-MUST mining nlp semantic wikipedia
%T Analyzing and Accessing Wikipedia as a Lexical Semantic Resource
%X We analyze Wikipedia as a lexical semantic resource and compare it with conventional resources, such as dictionaries, thesauri, semantic wordnets, etc. Different parts of Wikipedia reflect different aspects of these resources. We show that Wikipedia contains a vast amount of knowledge about, e.g., named entities, domain specific terms, and rare word senses. If Wikipedia is to be used as a lexical semantic resource in large-scale NLP tasks, efficient programmatic access to the knowledge therein is required. We review existing access mechanisms and show that they are limited with respect to performance and the provided access functions. Therefore, we introduce a general purpose, high performance Java-based Wikipedia API that overcomes these limitations. It is available for research purposes at http://www.ukp.tu-darmstadt.de/software/WikipediaAPI.
@inproceedings{citeulike:2348620,
abstract = {We analyze Wikipedia as a lexical semantic resource and compare it with conventional resources, such as dictionaries, thesauri, semantic wordnets, etc. Different parts of Wikipedia reflect different aspects of these resources. We show that Wikipedia contains a vast amount of knowledge about, e.g., named entities, domain specific terms, and rare word senses. If Wikipedia is to be used as a lexical semantic resource in large-scale NLP tasks, efficient programmatic access to the knowledge therein is required. We review existing access mechanisms and show that they are limited with respect to performance and the provided access functions. Therefore, we introduce a general purpose, high performance Java-based Wikipedia API that overcomes these limitations. It is available for research purposes at http://www.ukp.tu-darmstadt.de/software/WikipediaAPI.},
added-at = {2008-02-27T00:39:46.000+0100},
author = {Zesch, Torsten and Gurevych, Iryna and Mühlhäuser, Max},
biburl = {https://www.bibsonomy.org/bibtex/22d8f740fe023824a89405eaaddc4bfce/brightbyte},
booktitle = {Biannual Conference of the Society for Computational Linguistics and Language Technology},
citeulike-article-id = {2348620},
description = {Imported from CiteULike},
interhash = {d60585a882f4e645a653ce1c2babc6dc},
intrahash = {2d8f740fe023824a89405eaaddc4bfce},
keywords = {API READ WW-MUST mining nlp semantic wikipedia},
priority = {4},
school = {Darmstadt University of Technology},
timestamp = {2009-01-23T09:58:50.000+0100},
title = {Analyzing and Accessing Wikipedia as a Lexical Semantic Resource},
year = {2007}
}