Semantic relatedness between words has been extracted from
a variety of sources. In this ongoing work, we explore
and compare several options for determining if semantic
relatedness can be extracted from navigation structures in
Wikipedia. In that direction, we first investigate the potential
of representation learning techniques such as
DeepWalk in comparison to previously applied methods based on counting
co-occurrences. Since both methods are based on (random)
paths in the network, we also study different approaches to
generate paths from Wikipedia link structure. For this task,
we do not only consider the link structure of Wikipedia, but
also actual navigation behavior of users. Finally, we analyze
if semantics can also be extracted from smaller subsets of the
Wikipedia link network. As a result we find that representation
learning techniques mostly outperform the investigated
co-occurrence counting methods on the Wikipedia network.
However, we find that this is not the case for paths sampled
from human navigation behavior.
%0 Conference Paper
%1 dallmann2016extracting
%A Dallmann, Alexander
%A Niebler, Thomas
%A Lemmerich, Florian
%A Hotho, Andreas
%B Wiki Workshop@ICWSM
%D 2016
%E West, Robert
%E Zia, Leila
%E Taraborelli, Dario
%E Leskovec, Jure
%K 2016 deepwalk deepwiki myown navigation semantic wikipedia word2vec
%T Extracting Semantics from Random Walks on Wikipedia: Comparing learning and counting methods
%U http://snap.stanford.edu/wikiworkshop2016/papers/wikiworkshop_icwsm2016_dallmann.pdf
%X Semantic relatedness between words has been extracted from
a variety of sources. In this ongoing work, we explore
and compare several options for determining if semantic
relatedness can be extracted from navigation structures in
Wikipedia. In that direction, we first investigate the potential
of representation learning techniques such as
DeepWalk in comparison to previously applied methods based on counting
co-occurrences. Since both methods are based on (random)
paths in the network, we also study different approaches to
generate paths from Wikipedia link structure. For this task,
we do not only consider the link structure of Wikipedia, but
also actual navigation behavior of users. Finally, we analyze
if semantics can also be extracted from smaller subsets of the
Wikipedia link network. As a result we find that representation
learning techniques mostly outperform the investigated
co-occurrence counting methods on the Wikipedia network.
However, we find that this is not the case for paths sampled
from human navigation behavior.
@inproceedings{dallmann2016extracting,
abstract = {Semantic relatedness between words has been extracted from
a variety of sources. In this ongoing work, we explore
and compare several options for determining if semantic
relatedness can be extracted from navigation structures in
Wikipedia. In that direction, we first investigate the potential
of representation learning techniques such as
DeepWalk in comparison to previously applied methods based on counting
co-occurrences. Since both methods are based on (random)
paths in the network, we also study different approaches to
generate paths from Wikipedia link structure. For this task,
we do not only consider the link structure of Wikipedia, but
also actual navigation behavior of users. Finally, we analyze
if semantics can also be extracted from smaller subsets of the
Wikipedia link network. As a result we find that representation
learning techniques mostly outperform the investigated
co-occurrence counting methods on the Wikipedia network.
However, we find that this is not the case for paths sampled
from human navigation behavior.},
added-at = {2016-07-14T13:38:05.000+0200},
author = {Dallmann, Alexander and Niebler, Thomas and Lemmerich, Florian and Hotho, Andreas},
biburl = {https://www.bibsonomy.org/bibtex/212ce818c62131f722e6d723574ad2a03/hotho},
booktitle = {Wiki Workshop@ICWSM},
editor = {West, Robert and Zia, Leila and Taraborelli, Dario and Leskovec, Jure},
interhash = {a8393a6d07a1ef923eb0a7013639c103},
intrahash = {12ce818c62131f722e6d723574ad2a03},
keywords = {2016 deepwalk deepwiki myown navigation semantic wikipedia word2vec},
timestamp = {2016-10-08T18:36:42.000+0200},
title = {Extracting Semantics from Random Walks on {Wikipedia}: Comparing learning and counting methods},
url = {http://snap.stanford.edu/wikiworkshop2016/papers/wikiworkshop_icwsm2016_dallmann.pdf},
year = 2016
}