Abstract

Unsupervised machine translation, which uses unpaired monolingual corpora as training data, has achieved performance comparable to that of supervised machine translation. However, it still performs poorly in data-scarce domains. To address this issue, this paper presents a novel meta-learning algorithm for unsupervised neural machine translation (UNMT) that trains the model to adapt to another domain using only a small amount of training data. We assume that domain-general knowledge is a significant factor in handling data-scarce domains. Hence, we extend the meta-learning algorithm to exploit knowledge learned from high-resource domains and thereby boost the performance of low-resource UNMT. Our model surpasses a transfer learning-based approach by up to 2-3 BLEU points. Extensive experimental results show that our proposed algorithm is well suited to fast adaptation and consistently outperforms other baselines.

Description

This paper introduces MetaUMT and MetaGUMT, two meta-learning algorithms for unsupervised neural machine translation in low-resource domains. It shows how these models adapt to new domains with minimal training data by leveraging domain-general knowledge learned from high-resource domains.
