Semantic Taxonomy Induction from Heterogenous Evidence
R. Snow, D. Jurafsky, and A. Ng. Proceedings of the 44th Annual Meeting of the Association for Computational Linguistics, The Stanford Natural Language Processing Group, (2006). Received Best Paper Award.
Abstract
We propose a novel algorithm for inducing semantic taxonomies. Previous algorithms for taxonomy induction have typically focused on independent classifiers for discovering new single relationships based on hand-constructed or automatically discovered textual patterns. By contrast, our algorithm flexibly incorporates evidence from multiple classifiers over heterogenous relationships to optimize the entire structure of the taxonomy, using knowledge of a word’s coordinate terms to help in determining its hypernyms, and vice versa. We apply our algorithm on the problem of sense-disambiguated noun hyponym acquisition, where we combine the predictions of hypernym and coordinate term classifiers with the knowledge in a preexisting semantic taxonomy (WordNet 2.1). We add 10,000 novel synsets to WordNet 2.1 at 84% precision, a relative error reduction of 70% over a non-joint algorithm using the same component classifiers. Finally, we show that a taxonomy built using our algorithm shows a 23% relative F-score improvement over WordNet 2.1 on an independent testset of hypernym pairs.
%0 Conference Paper
%1 snow06semantic
%A Snow, Rion
%A Jurafsky, Daniel
%A Ng, Andrew Y.
%B Proceedings of the 44th Annual Meeting of the Association for Computational Linguistics
%D 2006
%K 2006 stanford parsetree parser best NT2OD nlp
%T Semantic Taxonomy Induction from Heterogenous Evidence
%U http://ai.stanford.edu/~rion/papers/semtax_acl06.pdf
%X We propose a novel algorithm for inducing semantic taxonomies. Previous algorithms for taxonomy induction have typically focused on independent classifiers for discovering new single relationships based on hand-constructed or automatically discovered textual patterns. By contrast, our algorithm flexibly incorporates evidence from multiple classifiers over heterogenous relationships to optimize the entire structure of the taxonomy, using knowledge of a word’s coordinate terms to help in determining its hypernyms, and vice versa. We apply our algorithm on the problem of sense-disambiguated noun hyponym acquisition, where we combine the predictions of hypernym and coordinate term classifiers with the knowledge in a preexisting semantic taxonomy (WordNet 2.1). We add 10,000 novel synsets to WordNet 2.1 at 84% precision, a relative error reduction of 70% over a non-joint algorithm using the same component classifiers. Finally, we show that a taxonomy built using our algorithm shows a 23% relative F-score improvement over WordNet 2.1 on an independent testset of hypernym pairs.
@inproceedings{snow06semantic,
abstract = {We propose a novel algorithm for inducing semantic taxonomies. Previous algorithms for taxonomy induction have typically focused on independent classifiers for discovering new single relationships based on hand-constructed or automatically discovered textual patterns. By contrast, our algorithm flexibly incorporates evidence from multiple classifiers over heterogenous relationships to optimize the entire structure of the taxonomy, using knowledge of a word’s coordinate terms to help in determining its hypernyms, and vice versa. We apply our algorithm on the problem of sense-disambiguated noun hyponym acquisition, where we combine the predictions of hypernym and coordinate term classifiers with the knowledge in a preexisting semantic taxonomy (WordNet 2.1). We add 10,000 novel synsets to WordNet 2.1 at 84% precision, a relative error reduction of 70% over a non-joint algorithm using the same component classifiers. Finally, we show that a taxonomy built using our algorithm shows a 23% relative F-score improvement over WordNet 2.1 on an independent testset of hypernym pairs.},
added-at = {2007-02-25T19:33:30.000+0100},
author = {Snow, Rion and Jurafsky, Daniel and Ng, Andrew Y.},
biburl = {https://www.bibsonomy.org/bibtex/2422b165064841b49ba7947e3922674c8/butonic},
booktitle = {Proceedings of the 44th Annual Meeting of the Association for Computational Linguistics},
interhash = {c0f5a3a22faa8dc4b61c9a717a6c9037},
intrahash = {422b165064841b49ba7947e3922674c8},
keywords = {2006 stanford parsetree parser best NT2OD nlp},
note = {Received Best Paper Award},
organization = {The Stanford Natural Language Processing Group},
school = {Stanford University},
timestamp = {2007-02-25T19:33:30.000+0100},
title = {{S}emantic {T}axonomy {I}nduction from {H}eterogenous {E}vidence},
url = {http://ai.stanford.edu/~rion/papers/semtax_acl06.pdf},
year = 2006
}