Author of the publication

Please choose a person to relate this publication to

To distinguish between persons with the same name, the academic degree and the title of an important publication are displayed. You can also use the button next to a name to display some of the publications already assigned to that person.

 

Other publications of authors with the same name

D4: Improving LLM Pretraining via Document De-Duplication and Diversification. CoRR, (2023)

Investigating Generalization by Controlling Normalized Margin. ICML, volume 162 of Proceedings of Machine Learning Research, pages 6324-6336. PMLR, (2022)

Memorization Without Overfitting: Analyzing the Training Dynamics of Large Language Models. NeurIPS, (2022)

The Unreasonable Ineffectiveness of the Deeper Layers. CoRR, (2024)

Decoding Data Quality via Synthetic Corruptions: Embedding-guided Pruning of Code Data. CoRR, (2023)

Effective pruning of web-scale datasets based on complexity of concept clusters. CoRR, (2024)

Dynatask: A Framework for Creating Dynamic AI Benchmark Tasks. ACL (demo), pages 174-181. Association for Computational Linguistics, (2022)

Ensemble Machine Learning Methods for Modeling COVID19 Deaths. CoRR, (2020)

SemDeDup: Data-efficient learning at web-scale through semantic deduplication. CoRR, (2023)