Abstract
Machine learning about language can be improved by supplying it with specific
knowledge and external sources of information. We present here a new version of
the linked open data resource ConceptNet that is particularly well suited to be
used with modern NLP techniques such as word embeddings.
ConceptNet is a knowledge graph that connects words and phrases of natural
language with labeled edges. Its knowledge is collected from many sources that
include expert-created resources, crowd-sourcing, and games with a purpose. It
is designed to represent the general knowledge involved in understanding
language, so that natural language applications can better understand the
meanings behind the words people use.
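As a rough illustration of what "words and phrases connected by labeled edges" can look like in practice, here is a minimal Python sketch. It is not ConceptNet's actual storage format or API: the `Edge` and `KnowledgeGraph` names, the example facts, and the source labels are all assumptions made for this example. The relation labels `IsA`, `CapableOf`, and `UsedFor` are used only as sample edge labels.

```python
from collections import defaultdict
from typing import NamedTuple, Optional


class Edge(NamedTuple):
    """One labeled edge: start term --relation--> end term (hypothetical type)."""
    start: str
    relation: str
    end: str
    source: str  # where the assertion came from, e.g. a crowd-sourced project


class KnowledgeGraph:
    """Toy in-memory graph; an illustrative stand-in, not ConceptNet itself."""

    def __init__(self) -> None:
        # Map each start term to the list of edges leaving it.
        self._outgoing = defaultdict(list)

    def add(self, start: str, relation: str, end: str, source: str) -> None:
        self._outgoing[start].append(Edge(start, relation, end, source))

    def related(self, term: str, relation: Optional[str] = None) -> list:
        """Return end terms reachable from `term`, optionally filtered by label."""
        return [
            edge.end
            for edge in self._outgoing.get(term, [])
            if relation is None or edge.relation == relation
        ]


# Example facts and source labels are invented for illustration.
graph = KnowledgeGraph()
graph.add("dog", "IsA", "animal", "expert-created resource")
graph.add("dog", "CapableOf", "bark", "crowd-sourcing")
graph.add("knife", "UsedFor", "cutting", "game with a purpose")

is_a_dog = graph.related("dog", "IsA")
all_dog = graph.related("dog")
```

Keeping the source on each edge mirrors the point that the knowledge comes from several kinds of contributors; an application could use that field to weight or filter assertions.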
When ConceptNet is combined with word embeddings acquired from distributional
semantics (such as word2vec), it provides applications with understanding that
they would not acquire from distributional semantics alone, nor from narrower
resources such as WordNet or DBpedia. We demonstrate this with state-of-the-art
results on intrinsic evaluations of word relatedness that translate into
improvements on applications of word vectors, including solving SAT-style
analogies.