Please choose a person to relate this publication to

To distinguish between persons with the same name, the academic degree and the title of an important publication are displayed.

 

Other publications by persons with the same name

From FreEM to D'AlemBERT: a Large Corpus and a Language Model for Early Modern French., , , , , , and . LREC, pp. 3367-3374. European Language Resources Association, (2022)

A CURATEd CATalog: Rethinking the Extraction of Pretraining Corpora for Mid-Resourced Languages., , , , , , , , , and 1 other author(s). LREC/COLING, pp. 335-349. ELRA and ICCL, (2024)

Semi-automatic staging area for high-quality structured data extraction from scientific literature., , , , , , , , , and . CoRR, (2023)

Tokenizer Choice For LLM Training: Negligible or Crucial?, , , , , , , , , and 11 other author(s). CoRR, (2023)

A Data-driven Approach to Named Entity Recognition for Early Modern French., and . COLING, pp. 3722-3730. International Committee on Computational Linguistics, (2022)

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model., , , , , , , , , and 39 other author(s). CoRR, (2022)

The BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset., , , , , , , , , and 44 other author(s). CoRR, (2023)

Perplexed by Quality: A Perplexity-based Method for Adult and Harmful Content Detection in Multilingual Heterogeneous Web Data., , , and . CoRR, (2022)

Automatic Extraction of Materials and Properties from Superconductors Scientific Literature., , , , , and . CoRR, (2022)

The BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset., , , , , , , , , and 44 other author(s). NeurIPS, (2022)