Author of the publication

Scaling Laws for Multilingual Neural Machine Translation.

ICML, volume 202 of Proceedings of Machine Learning Research, pages 10053-10071. PMLR, (2023)


Other publications of authors with the same name

A Loss Curvature Perspective on Training Instabilities of Deep Learning Models. ICLR, OpenReview.net, (2022)

Scaling Laws for Neural Machine Translation. ICLR, OpenReview.net, (2022)

Data Scaling Laws in NMT: The Effect of Noise and Architecture. ICML, volume 162 of Proceedings of Machine Learning Research, pages 1466-1482. PMLR, (2022)

An Investigation into Neural Net Optimization via Hessian Eigenvalue Density. ICML, volume 97 of Proceedings of Machine Learning Research, pages 2232-2241. PMLR, (2019)

Binarized Neural Machine Translation. CoRR, (2023)

Limitations of Lazy Training of Two-layers Neural Networks. (2019). arXiv:1906.08899. Comment: 39 pages; 2 pdf figures.

Scaling Laws for Multilingual Neural Machine Translation. ICML, volume 202 of Proceedings of Machine Learning Research, pages 10053-10071. PMLR, (2023)

Examining Scaling and Transfer of Language Model Architectures for Machine Translation. ICML, volume 162 of Proceedings of Machine Learning Research, pages 26176-26192. PMLR, (2022)

Epsilon Sampling Rocks: Investigating Sampling Strategies for Minimum Bayes Risk Decoding for Machine Translation. EMNLP (Findings), pages 9198-9209. Association for Computational Linguistics, (2023)

Linearized two-layers neural networks in high dimension. CoRR, (2019)