In natural language understanding, there is a hierarchy of lenses through which we can extract meaning - from words to sentences to paragraphs to documents. At the document level, one of the most useful ways to understand text is by analyzing its topics.
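To make this concrete, here is a minimal sketch of document-level topic analysis using LDA via gensim; the toy corpus and all parameter values are illustrative assumptions, not drawn from any particular dataset.

```python
# A minimal sketch of document-level topic analysis with LDA (gensim).
# The toy corpus and hyperparameters below are illustrative assumptions.
from gensim import corpora, models

documents = [
    ["human", "machine", "interface", "computer"],
    ["graph", "trees", "minors", "survey"],
    ["user", "interface", "system", "computer"],
]

dictionary = corpora.Dictionary(documents)               # map tokens to integer ids
corpus = [dictionary.doc2bow(doc) for doc in documents]  # bag-of-words vectors

lda = models.LdaModel(corpus, num_topics=2, id2word=dictionary, passes=10)
for topic_id, words in lda.print_topics():
    print(topic_id, words)  # top words characterizing each inferred topic
```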
ConceptNet Numberbatch consists of state-of-the-art semantic vectors (also known as word embeddings) that can be used directly as a representation of word meanings or as a starting point for further machine learning.
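As an illustration of using the vectors directly as word representations, the sketch below loads them with gensim and queries word similarity. The file name `numberbatch-en.txt` is a placeholder for whichever release you download, and the assumption that the English-only text file follows the word2vec text format should be checked against that release.

```python
# A sketch of using pre-trained semantic vectors directly as word meanings.
# Assumes a local English-only Numberbatch text file in word2vec text format;
# the file name is a placeholder, not a fixed path.
from gensim.models import KeyedVectors

vectors = KeyedVectors.load_word2vec_format("numberbatch-en.txt", binary=False)

# Semantically related words should receive high cosine similarity.
print(vectors.similarity("cat", "dog"))
print(vectors.most_similar("guitar", topn=5))
```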
In this tutorial we look at the word2vec model by Mikolov et al. This model is used for learning vector representations of words, called "word embeddings".
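For a concrete feel of how such embeddings are learned, here is a minimal sketch using gensim's implementation of the word2vec model; the toy corpus and hyperparameter values are illustrative assumptions, not the tutorial's own settings.

```python
# A minimal sketch of training word2vec embeddings with gensim.
# Corpus and hyperparameters are toy values for illustration only.
from gensim.models import Word2Vec

sentences = [
    ["the", "quick", "brown", "fox", "jumps", "over", "the", "lazy", "dog"],
    ["the", "dog", "barks", "at", "the", "fox"],
]

model = Word2Vec(
    sentences,
    vector_size=50,  # dimensionality of the learned embeddings
    window=2,        # context window size
    min_count=1,     # keep even rare words in this toy corpus
    sg=1,            # 1 = skip-gram, 0 = CBOW
)

# Each vocabulary word now maps to a dense vector (its "word embedding").
print(model.wv["fox"].shape)
```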
Q. Le and T. Mikolov. Distributed Representations of Sentences and Documents. Proceedings of the 31st International Conference on Machine Learning, Volume 32 of Proceedings of Machine Learning Research, pages 1188--1196. Beijing, China, PMLR, (June 2014)
M. Baroni, G. Dinu, and G. Kruszewski. Don't count, predict! A systematic comparison of context-counting vs. context-predicting semantic vectors. Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (ACL 2014), (2014)
S. Cordeiro, C. Ramisch, M. Idiart, and A. Villavicencio. Predicting the Compositionality of Nominal Compounds: Giving Word Embeddings a Hard Time. Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1986--1997. Association for Computational Linguistics, (2016)
M. Hartung, F. Kaupmann, S. Jebbara, and P. Cimiano. Learning Compositionality Functions on Word Embeddings for Modelling Attribute Meaning in Adjective-Noun Phrases. Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics (EACL), Volume 1. Association for Computational Linguistics, (2017)