Neural Word Embedding as Implicit Matrix Factorization
O. Levy and Y. Goldberg. Advances in Neural Information Processing Systems, pages 2177--2185. (2014)
Abstract
We analyze skip-gram with negative-sampling (SGNS), a word embedding method introduced by Mikolov et al., and show that it is implicitly factorizing a word-context matrix, whose cells are the pointwise mutual information (PMI) of the respective word and context pairs, shifted by a global constant. We find that another embedding method, NCE, is implicitly factorizing a similar matrix, where each cell is the (shifted) log conditional probability of a word given its context. We show that using a sparse Shifted Positive PMI word-context matrix to represent words improves results on two word similarity tasks and one of two analogy tasks. When dense low-dimensional vectors are preferred, exact factorization with SVD can achieve solutions that are at least as good as SGNS’s solutions for word similarity tasks. On analogy questions SGNS remains superior to SVD. We conjecture that this stems from the weighted nature of SGNS’s factorization.
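The core result stated in the abstract is that SGNS implicitly factorizes a matrix M with cells PMI(w, c) - log k, where k is the number of negative samples, and that a sparse Shifted Positive PMI matrix, SPPMI(w, c) = max(PMI(w, c) - log k, 0), factorized with truncated SVD, yields competitive dense vectors. The following is a minimal sketch of that SPPMI + SVD construction, not the authors' code; the function name, toy corpus handling, and default parameters are illustrative assumptions.

import numpy as np
from collections import Counter

def sppmi_svd_embeddings(tokens, window=2, k=5, dim=50):
    """Build SPPMI word vectors from a token list (hypothetical helper)."""
    vocab = sorted(set(tokens))
    idx = {w: i for i, w in enumerate(vocab)}
    # Count word-context co-occurrences within a symmetric window.
    pair_counts = Counter()
    for i, w in enumerate(tokens):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if i != j:
                pair_counts[(idx[w], idx[tokens[j]])] += 1
    counts = np.zeros((len(vocab), len(vocab)))
    for (wi, ci), c in pair_counts.items():
        counts[wi, ci] = c
    total = counts.sum()
    p_w = counts.sum(axis=1, keepdims=True) / total
    p_c = counts.sum(axis=0, keepdims=True) / total
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log((counts / total) / (p_w * p_c))
    pmi[~np.isfinite(pmi)] = 0.0
    # Shifted Positive PMI: shift by log(k) and clip negatives to zero.
    sppmi = np.maximum(pmi - np.log(k), 0.0)
    # Truncated SVD; word vectors taken as U_d * sqrt(S_d).
    u, s, _ = np.linalg.svd(sppmi, full_matrices=False)
    return vocab, u[:, :dim] * np.sqrt(s[:dim])

# Example (toy corpus): vocab, W = sppmi_svd_embeddings("the quick brown fox jumps over the lazy dog".split(), dim=4)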
%0 Conference Paper
%1 levy_neural_2014
%A Levy, Omer
%A Goldberg, Yoav
%B Advances in Neural Information Processing Systems
%D 2014
%K imported
%P 2177--2185
%T Neural Word Embedding as Implicit Matrix Factorization
%X We analyze skip-gram with negative-sampling (SGNS), a word embedding method introduced by Mikolov et al., and show that it is implicitly factorizing a word-context matrix, whose cells are the pointwise mutual information (PMI) of the respective word and context pairs, shifted by a global constant. We find that another embedding method, NCE, is implicitly factorizing a similar matrix, where each cell is the (shifted) log conditional probability of a word given its context. We show that using a sparse Shifted Positive PMI word-context matrix to represent words improves results on two word similarity tasks and one of two analogy tasks. When dense low-dimensional vectors are preferred, exact factorization with SVD can achieve solutions that are at least as good as SGNS’s solutions for word similarity tasks. On analogy questions SGNS remains superior to SVD. We conjecture that this stems from the weighted nature of SGNS’s factorization.
@inproceedings{levy_neural_2014,
abstract = {We analyze skip-gram with negative-sampling (SGNS), a word embedding method introduced by Mikolov et al., and show that it is implicitly factorizing a word-context matrix, whose cells are the pointwise mutual information (PMI) of the respective word and context pairs, shifted by a global constant. We find that another embedding method, NCE, is implicitly factorizing a similar matrix, where each cell is the (shifted) log conditional probability of a word given its context. We show that using a sparse Shifted Positive PMI word-context matrix to represent words improves results on two word similarity tasks and one of two analogy tasks. When dense low-dimensional vectors are preferred, exact factorization with SVD can achieve solutions that are at least as good as SGNS’s solutions for word similarity tasks. On analogy questions SGNS remains superior to SVD. We conjecture that this stems from the weighted nature of SGNS’s factorization.},
added-at = {2020-02-21T16:09:44.000+0100},
author = {Levy, Omer and Goldberg, Yoav},
biburl = {https://www.bibsonomy.org/bibtex/2a7665d8f5e0aaef21930d9146ea4ea51/tschumacher},
booktitle = {Advances in {Neural} {Information} {Processing} {Systems}},
interhash = {85dfc849c364f186582e7bfc781e28d7},
intrahash = {a7665d8f5e0aaef21930d9146ea4ea51},
keywords = {imported},
language = {en},
pages = {2177--2185},
timestamp = {2020-02-21T16:09:44.000+0100},
title = {Neural {Word} {Embedding} as {Implicit} {Matrix} {Factorization}},
year = 2014
}