In natural language understanding, there is a hierarchy of lenses through which we can extract meaning: from words to sentences to paragraphs to documents. At the document level, one of the most useful ways to understand text is by analyzing its topics.
I gave an introductory talk on word embeddings some time ago, and this write-up is an extended version of the part about the philosophical ideas behind word vectors.
ConceptNet Numberbatch consists of state-of-the-art semantic vectors (also known as word embeddings) that can be used directly as a representation of word meanings or as a starting point for further machine learning.
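As a quick illustration of the "use directly" part, here is a minimal sketch that loads the English Numberbatch vectors with gensim and queries them; the file name numberbatch-en-19.08.txt.gz is an assumption (use whichever release you have downloaded), and gensim is just one convenient way to read the word2vec-style text format the vectors ship in.

```python
# Minimal sketch: load ConceptNet Numberbatch vectors and query them.
# Assumes the English-only release has been downloaded locally as
# "numberbatch-en-19.08.txt.gz" (file name is an assumption).
from gensim.models import KeyedVectors

vectors = KeyedVectors.load_word2vec_format(
    "numberbatch-en-19.08.txt.gz", binary=False
)

# Cosine similarity between two word vectors.
print(vectors.similarity("coffee", "tea"))

# Nearest neighbours, a quick sanity check of the semantic space.
print(vectors.most_similar("jazz", topn=5))
```

The same KeyedVectors object can also be handed to downstream models as a frozen embedding lookup, which is the "starting point for further machine learning" use case mentioned above.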
S. Bordia and S. Bowman. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Student Research Workshop, pages 7--15. Minneapolis, Minnesota, Association for Computational Linguistics, (June 2019)
M. Artetxe, G. Labaka, I. Lopez-Gazpio, and E. Agirre. Proceedings of the 22nd Conference on Computational Natural Language Learning, pages 282--291. Association for Computational Linguistics, (2018)
M. Kusner, Y. Sun, N. Kolkin, and K. Weinberger. Proceedings of the 32nd International Conference on Machine Learning - Volume 37, pages 957--966. JMLR.org, (2015)