Web taxonomy integration using support vector machines
D. Zhang and W. S. Lee. WWW '04: Proceedings of the 13th International Conference on World Wide Web, pages 472--481. New York, NY, USA, ACM Press, 2004.
Abstract
We address the problem of integrating objects from a source taxonomy into a master taxonomy. This problem is not only currently pervasive on the web, but also important to the emerging semantic web. A straightforward approach to automating this process would be to train a classifier for each category in the master taxonomy, and then classify objects from the source taxonomy into these categories. In this paper we attempt to use a powerful classification method, Support Vector Machine (SVM), to attack this problem. Our key insight is that the availability of the source taxonomy data could be helpful to build better classifiers in this scenario, therefore it would be beneficial to do transductive learning rather than inductive learning, i.e., learning to optimize classification performance on a particular set of test examples. Noticing that the categorizations of the master and source taxonomies often have some semantic overlap, we propose a method, Cluster Shrinkage (CS), to further enhance the classification by exploiting such implicit knowledge. Our experiments with real-world web data show substantial improvements in the performance of taxonomy integration.
%0 Conference Paper
%1 zhang2004web
%A Zhang, Dell
%A Lee, Wee Sun
%B WWW '04: Proceedings of the 13th international conference on World Wide Web
%C New York, NY, USA
%D 2004
%I ACM Press
%K studienarbeit semantic_web ontology_mapping classification transductive_learning support_vector_machines eventually_useful taxonomy_integration
%P 472--481
%T Web taxonomy integration using support vector machines
%U zhang04.ps
@inproceedings{zhang2004web,
abstract = {We address the problem of integrating objects from a source taxonomy into a master taxonomy. This problem is not only currently pervasive on the web, but also important to the emerging semantic web. A straightforward approach to automating this process would be to train a classifier for each category in the master taxonomy, and then classify objects from the source taxonomy into these categories. In this paper we attempt to use a powerful classification method, Support Vector Machine (SVM), to attack this problem. Our key insight is that the availability of the source taxonomy data could be helpful to build better classifiers in this scenario, therefore it would be beneficial to do transductive learning rather than inductive learning, i.e., learning to optimize classification performance on a particular set of test examples. Noticing that the categorizations of the master and source taxonomies often have some semantic overlap, we propose a method, Cluster Shrinkage (CS), to further enhance the classification by exploiting such implicit knowledge. Our experiments with real-world web data show substantial improvements in the performance of taxonomy integration.},
added-at = {2011-01-28T11:34:37.000+0100},
address = {New York, NY, USA},
author = {Zhang, Dell and Lee, Wee Sun},
biburl = {https://www.bibsonomy.org/bibtex/2b683a9061fe344cfbcd62aa85e44d2c4/dbenz},
booktitle = {WWW '04: Proceedings of the 13th international conference on World Wide Web},
file = {zhang2004web.pdf:zhang2004web.pdf:PDF},
interhash = {7edb6ba8814bb3382ffaeb009d5a3183},
intrahash = {b683a9061fe344cfbcd62aa85e44d2c4},
keywords = {studienarbeit semantic_web ontology_mapping classification transductive_learning support_vector_machines eventually_useful taxonomy_integration},
lastdatemodified = {2005-08-07},
lastname = {Zhang},
own = {own},
pages = {472--481},
pdf = {zhang04.pdf},
publisher = {ACM Press},
read = {notread},
timestamp = {2013-07-31T15:39:42.000+0200},
title = {Web taxonomy integration using support vector machines},
url = {zhang04.ps},
year = 2004
}