Author of the publication

Please choose a person to relate this publication to

To distinguish between persons with the same name, the academic degree and the title of an important publication are displayed. You can also use the button next to a name to display some of the publications already assigned to that person.

 

Other publications of authors with the same name

Documenting Large Webtext Corpora: A Case Study on the Colossal Clean Crawled Corpus. EMNLP (1), pages 1286-1305. Association for Computational Linguistics, (2021)
OLMo: Accelerating the Science of Language Models. CoRR, (2024)
DataComp-LM: In search of the next generation of training sets for language models. CoRR, (2024)
Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research. CoRR, (2024)
What's In My Big Data? CoRR, (2023)
Continued Pretraining for Better Zero- and Few-Shot Promptability. EMNLP, pages 4517-4531. Association for Computational Linguistics, (2022)
What's In My Big Data? ICLR, OpenReview.net, (2024)
A Simple Yet Strong Pipeline for HotpotQA. EMNLP (1), pages 8839-8845. Association for Computational Linguistics, (2020)