Article

Rule-based multi-word term extraction, lemmatization and description

(2016). BibTeX key: krstevrule.

Abstract

In this paper we present a rule-based method for extracting multi-word terms (MWTs) and lemmatizing the extracted terms. Extracted and lemmatized MWT candidates are post-processed using data-driven and heuristic approaches in order to reject falsely offered lemmas (“parasite lemmas”); they are then ranked by several measures before being passed to human evaluators. For accepted terms, dictionary entries are produced automatically, enabling the generation of all of the terms’ inflected forms. All subtasks of this process are integrated into LeXimir, a tool for the development and management of lexical resources (Stanković et al., 2016).
