Author of the publication

Get to Know Your Parallel Data: Performing English Variety and Genre Classification over MaCoCu Corpora.

VarDial@EACL, pages 91-103. Association for Computational Linguistics, (2023)


Other publications of authors with the same name

The GINCO Training Dataset for Web Genre Identification of Documents Out in the Wild. LREC, pages 1584-1594. European Language Resources Association, (2022)

BENCHić-lang: A Benchmark for Discriminating between Bosnian, Croatian, Montenegrin and Serbian. VarDial@EACL, pages 113-120. Association for Computational Linguistics, (2023)

CLASSLA-web: Comparable Web Corpora of South Slavic Languages Enriched with Linguistic and Genre Annotation. LREC/COLING, pages 3271-3282. ELRA and ICCL, (2024)

ChatGPT: Beginning of an End of Manual Linguistic Data Annotation? Use Case of Automatic Genre Identification. CoRR, (2023)

MaCoCu: Massive collection and curation of monolingual and bilingual data: focus on under-resourced languages. EAMT, pages 301-302. European Association for Machine Translation, (2022)

Get to Know Your Parallel Data: Performing English Variety and Genre Classification over MaCoCu Corpora. VarDial@EACL, pages 91-103. Association for Computational Linguistics, (2023)

Verbal Multiword Expressions in Slovene. Europhras, volume 10596 of Lecture Notes in Computer Science, pages 247-259. Springer, (2017)

MaCoCu: Massive collection and curation of monolingual and bilingual data: focus on under-resourced languages. EAMT, pages 505-506. European Association for Machine Translation, (2023)

Do Language Models Care about Text Quality? Evaluating Web-Crawled Corpora across 11 Languages. LREC/COLING, pages 5221-5234. ELRA and ICCL, (2024)

Language Models on a Diet: Cost-Efficient Development of Encoders for Closely-Related Languages via Additional Pretraining. CoRR, (2024)