Semantic Analysis of Tag Similarity Measures in Collaborative Tagging Systems
C. Cattuto, D. Benz, A. Hotho, and G. Stumme. Proceedings of the 3rd Workshop on Ontology Learning and Population (OLP3), Patras, Greece (July 2008)
Abstract
Social bookmarking systems allow users to organise collections of resources on the Web in a collaborative fashion. The increasing popularity of these systems as well as first insights into their emergent semantics have made them relevant to disciplines like knowledge extraction and ontology learning. The problem of devising methods to measure the semantic relatedness between tags and characterizing it semantically is still largely open. Here we analyze three measures of tag relatedness: tag co-occurrence, cosine similarity of co-occurrence distributions, and FolkRank, an adaptation of the PageRank algorithm to folksonomies. Each measure is computed on tags from a large-scale dataset crawled from the social bookmarking system del.icio.us. To provide a semantic grounding of our findings, a connection to WordNet (a semantic lexicon for the English language) is established by mapping tags into synonym sets of WordNet, and applying there well-known metrics of semantic similarity. Our results clearly expose different characteristics of the selected measures of relatedness, making them applicable to different subtasks of knowledge extraction such as synonym detection or discovery of concept hierarchies.
%0 Conference Paper
%1 cattuto08-semantic
%A Cattuto, Ciro
%A Benz, Dominik
%A Hotho, Andreas
%A Stumme, Gerd
%B Proceedings of the 3rd Workshop on Ontology Learning and Population (OLP3)
%C Patras, Greece
%D 2008
%K 2.0 2008 collaborative folksonomies folksonomy itegpub myown semantic systems tagging web web2.0
%T Semantic Analysis of Tag Similarity Measures in Collaborative Tagging Systems
%U http://olp.dfki.de/olp3/
%X Social bookmarking systems allow users to organise collections of resources on the Web in a collaborative fashion. The increasing popularity of these systems as well as first insights into their emergent semantics have made them relevant to disciplines like knowledge extraction and ontology learning. The problem of devising methods to measure the semantic relatedness between tags and characterizing it semantically is still largely open. Here we analyze three measures of tag relatedness: tag co-occurrence, cosine similarity of co-occurrence distributions, and FolkRank, an adaptation of the PageRank algorithm to folksonomies. Each measure is computed on tags from a large-scale dataset crawled from the social bookmarking system del.icio.us. To provide a semantic grounding of our findings, a connection to WordNet (a semantic lexicon for the English language) is established by mapping tags into synonym sets of WordNet, and applying there well-known metrics of semantic similarity. Our results clearly expose different characteristics of the selected measures of relatedness, making them applicable to different subtasks of knowledge extraction such as synonym detection or discovery of concept hierarchies.
@inproceedings{cattuto08-semantic,
abstract = {Social bookmarking systems allow users to organise collections of resources on the Web in a collaborative fashion. The increasing popularity of these systems as well as first insights into their emergent semantics have made them relevant to disciplines like knowledge extraction and ontology learning. The problem of devising methods to measure the semantic relatedness between tags and characterizing it semantically is still largely open. Here we analyze three measures of tag relatedness: tag co-occurrence, cosine similarity of co-occurrence distributions, and FolkRank, an adaptation of the PageRank algorithm to folksonomies. Each measure is computed on tags from a large-scale dataset crawled from the social bookmarking system del.icio.us. To provide a semantic grounding of our findings, a connection to WordNet (a semantic lexicon for the English language) is established by mapping tags into synonym sets of WordNet, and applying there well-known metrics of semantic similarity. Our results clearly expose different characteristics of the selected measures of relatedness, making them applicable to different subtasks of knowledge extraction such as synonym detection or discovery of concept hierarchies.},
added-at = {2008-06-09T14:05:11.000+0200},
address = {Patras, Greece},
author = {Cattuto, Ciro and Benz, Dominik and Hotho, Andreas and Stumme, Gerd},
biburl = {https://www.bibsonomy.org/bibtex/23b0aca61b24e4343bd80390614e3066e/stumme},
booktitle = {Proceedings of the 3rd Workshop on Ontology Learning and Population (OLP3)},
interhash = {cc62b733f6e0402db966d6dbf1b7711f},
intrahash = {3b0aca61b24e4343bd80390614e3066e},
keywords = {2.0 2008 collaborative folksonomies folksonomy itegpub myown semantic systems tagging web web2.0},
month = jul,
timestamp = {2009-03-02T22:15:24.000+0100},
title = {Semantic Analysis of Tag Similarity Measures in Collaborative Tagging Systems},
url = {http://olp.dfki.de/olp3/},
year = 2008
}