Train Big, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers.

ICML, volume 119 of Proceedings of Machine Learning Research, pages 5958-5968. PMLR, 2020.

Other publications of authors with the same name

- Voice localization using nearby wall reflections. MobiCom, pages 7:1-7:14. ACM, 2020.
- Train Large, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers. arXiv:2002.11794, 2020.
- Multitask Prompted Training Enables Zero-Shot Task Generalization. International Conference on Learning Representations, 2022.
- HallE-Switch: Rethinking and Controlling Object Existence Hallucinations in Large Vision Language Models for Detailed Caption. CoRR, 2023.
- Virtual stereo content rendering technology review for light-field display. Displays, January 2023.
- RAFT: Adapting Language Model to Domain Specific RAG. CoRR, 2024.
- MCEENet: Multi-Scale Context Enhancement and Edge-Assisted Network for Few-Shot Semantic Segmentation. Sensors, 23 (6): 2922, March 2023.
- Discovering Non-monotonic Autoregressive Orderings with Variational Inference. ICLR, OpenReview.net, 2021.
- What's Hidden in a One-layer Randomly Weighted Transformer? EMNLP (1), pages 2914-2921. Association for Computational Linguistics, 2021.
- Poisoning Language Models During Instruction Tuning. ICML, volume 202 of Proceedings of Machine Learning Research, pages 35413-35425. PMLR, 2023.