Author of the publication

Learning Multiscale Transformer Models for Sequence Generation.

ICML, volume 162 of Proceedings of Machine Learning Research, pages 13225-13241. PMLR, (2022)

Other publications of authors with the same name

Functional Correlation Analysis in Crosstalk Induced Critical Paths Identification. DAC, pages 653-656. ACM, (2001)

Sharing Attention Weights for Fast Transformer. IJCAI, pages 5292-5298. ijcai.org, (2019)

Introduction to Transformers: an NLP Perspective. CoRR, (2023)

Spatially Selective Active Noise Control Systems. CoRR, (2022)

Large Language Models are Parallel Multilingual Learners. CoRR, (2024)

Person Re-Identification With Deep Kronecker-Product Matching and Group-Shuffling Random Walk. IEEE Trans. Pattern Anal. Mach. Intell., 43 (5): 1649-1665 (2021)

The NiuTrans's Submission to the IWSLT22 English-to-Chinese Offline Speech Translation Task. IWSLT@ACL, pages 232-238. Association for Computational Linguistics, (2022)

ESRL: Efficient Sampling-Based Reinforcement Learning for Sequence Generation. AAAI, pages 19107-19115. AAAI Press, (2024)

Context Sensitive Word Deletion Model for Statistical Machine Translation. CCL, volume 10565 of Lecture Notes in Computer Science, pages 73-84. Springer, (2017)

Multi-layer Representation Fusion for Neural Machine Translation. COLING, pages 3015-3026. Association for Computational Linguistics, (2018)