Abstract
State-of-the-art unsupervised multilingual models (e.g., multilingual BERT)
have been shown to generalize in a zero-shot cross-lingual setting. This
generalization ability has been attributed to the use of a shared subword
vocabulary and joint training across multiple languages giving rise to deep
multilingual abstractions. We evaluate this hypothesis by designing an
alternative approach that transfers a monolingual model to new languages at the
lexical level. More concretely, we first train a transformer-based masked
language model on one language, and transfer it to a new language by learning a
new embedding matrix with the same masked language modeling objective, while
freezing the parameters of all other layers. This approach does not rely on a shared
vocabulary or joint training. However, we show that it is competitive with
multilingual BERT on standard cross-lingual classification benchmarks and on a
new Cross-lingual Question Answering Dataset (XQuAD). Our results contradict
common beliefs about the basis of the generalization ability of multilingual
models and suggest that deep monolingual models learn some abstractions that
generalize across languages. We also release XQuAD as a more comprehensive
cross-lingual benchmark, which comprises 240 paragraphs and 1190
question-answer pairs from SQuAD v1.1 translated into ten languages by
professional translators.
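
The lexical-level transfer described in the abstract (train a monolingual masked language model, then learn only a new target-language embedding matrix with the same MLM objective while every other layer stays frozen) can be sketched roughly as follows. This is a minimal illustration, assuming a BERT-style model from the HuggingFace transformers library; the checkpoint name, the target-language tokenizer path, and the hyperparameters are placeholders, not the paper's actual setup.

import torch
from transformers import BertForMaskedLM, BertTokenizerFast

# Load a monolingual masked language model trained on the source language
# (placeholder checkpoint, not the paper's own model).
model = BertForMaskedLM.from_pretrained("bert-base-cased")

# Freeze every parameter of the transformer body and the MLM head.
for param in model.parameters():
    param.requires_grad = False

# Build a vocabulary/tokenizer for the target language (hypothetical path),
# resize the token embedding matrix to that vocabulary, and re-initialize it.
target_tokenizer = BertTokenizerFast.from_pretrained("path/to/target-lang-tokenizer")
model.resize_token_embeddings(len(target_tokenizer))
embeddings = model.get_input_embeddings()
embeddings.weight.data.normal_(mean=0.0, std=0.02)

# Only the new embedding matrix is trainable; because BERT ties input and
# output embeddings, the MLM output projection shares this same tensor.
embeddings.weight.requires_grad = True
optimizer = torch.optim.AdamW([embeddings.weight], lr=1e-4)

# Training then proceeds as ordinary masked language modeling on
# target-language text, e.g. with DataCollatorForLanguageModeling and Trainer.

Since gradients flow only into the new embedding matrix, no shared subword vocabulary or joint multilingual training is involved, which is the property the abstract emphasizes.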