Multi-label classification for biomedical literature: an overview of the BioCreative VII LitCovid Track for COVID-19 literature topic annotations

Q. Chen, A. Allot, R. Leaman, R. Doğan, J. Du, L. Fang, W. Kai, S. Xu, Y. Zhang, P. Bagherzadeh, S. Bergler, A. Bhatnagar, N. Bhavsar, Y. Chang, S. Lin, W. Tang, H. Zhang, I. Tavchioski, S. Tian, J. Zhang, Y. Otmakhova, A. Yepes, H. Dong, H. Wu, R. Dufour, Y. Labrak, N. Chatterjee, K. Tandon, F. Laleye, L. Rakotoson, E. Chersoni, J. Gu, A. Friedrich, S. Pujari, M. Chizhikova, N. Sivadasan, and Z. Lu.
(2022). arXiv:2204.09781.

Abstract

The COVID-19 pandemic has been severely impacting global society since December 2019. Massive research has been undertaken to understand the characteristics of the virus and design vaccines and drugs. The related findings have been reported in biomedical literature at a rate of about 10,000 articles on COVID-19 per month. Such rapid growth significantly challenges manual curation and interpretation. For instance, LitCovid is a literature database of COVID-19-related articles in PubMed, which has accumulated more than 200,000 articles with millions of accesses each month by users worldwide. One primary curation task is to assign up to eight topics (e.g., Diagnosis and Treatment) to the articles in LitCovid. Despite the continuing advances in biomedical text mining methods, few have been dedicated to topic annotations in COVID-19 literature. To close the gap, we organized the BioCreative LitCovid track to call for a community effort to tackle automated topic annotation for COVID-19 literature. The BioCreative LitCovid dataset, consisting of over 30,000 articles with manually reviewed topics, was created for training and testing. It is one of the largest multilabel classification datasets in biomedical scientific literature. 19 teams worldwide participated and made 80 submissions in total. Most teams used hybrid systems based on transformers. The highest performing submissions achieved 0.8875, 0.9181, and 0.9394 for macro F1-score, micro F1-score, and instance-based F1-score, respectively. The level of participation and results demonstrate a successful track and help close the gap between dataset curation and method development. The dataset is publicly available via https://ftp.ncbi.nlm.nih.gov/pub/lu/LitCovid/biocreative/ for benchmarking and further development.
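The abstract reports three F1 variants for the multi-label task. The sketch below shows one common way to compute them and is not the track's official evaluation script. It assumes macro F1 is the unweighted mean of per-label F1, micro F1 pools the counts over all labels, and instance-based F1 is the mean of per-document F1; the function name `multilabel_f1` and the toy documents are made up for illustration.

```python
from typing import List, Sequence, Set, Tuple


def f1(tp: int, fp: int, fn: int) -> float:
    """F1 from raw counts; defined as 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def multilabel_f1(
    gold: Sequence[Set[str]],
    pred: Sequence[Set[str]],
    labels: Sequence[str],
) -> Tuple[float, float, float]:
    """Return (macro, micro, instance-based) F1 for paired label sets."""
    per_label: List[float] = []
    tot_tp = tot_fp = tot_fn = 0
    for label in labels:
        tp = sum(1 for g, p in zip(gold, pred) if label in g and label in p)
        fp = sum(1 for g, p in zip(gold, pred) if label not in g and label in p)
        fn = sum(1 for g, p in zip(gold, pred) if label in g and label not in p)
        per_label.append(f1(tp, fp, fn))
        tot_tp, tot_fp, tot_fn = tot_tp + tp, tot_fp + fp, tot_fn + fn

    macro = sum(per_label) / len(per_label)   # every label weighs the same
    micro = f1(tot_tp, tot_fp, tot_fn)        # every label decision weighs the same
    instance = sum(                           # every document weighs the same
        f1(len(g & p), len(p - g), len(g - p)) for g, p in zip(gold, pred)
    ) / len(gold)
    return macro, micro, instance


# Toy data (hypothetical): three documents, three topic labels.
labels = ["Treatment", "Diagnosis", "Prevention"]
gold = [{"Treatment", "Diagnosis"}, {"Prevention"}, {"Treatment"}]
pred = [{"Treatment"}, {"Prevention", "Treatment"}, {"Treatment"}]

macro, micro, instance = multilabel_f1(gold, pred, labels)
print(f"macro={macro:.3f} micro={micro:.3f} instance={instance:.3f}")
# macro=0.600 micro=0.750 instance=0.778
```

In this toy case macro F1 is the lowest of the three because the rare label "Diagnosis" is missed entirely and counts as much as the frequent ones. That is consistent with the ordering of the scores in the abstract, although the real gap depends on the actual label distribution.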

BibTeX key: chen2022multilabel
Entry type: misc
Year: 2022
URL: http://arxiv.org/abs/2204.09781
Note: cite arxiv:2204.09781

Cite this publication

%0 Generic
%1 chen2022multilabel
%A Chen, Qingyu
%A Allot, Alexis
%A Leaman, Robert
%A Doğan, Rezarta Islamaj
%A Du, Jingcheng
%A Fang, Li
%A Kai, Wang
%A Xu, Shuo
%A Zhang, Yuefu
%A Bagherzadeh, Parsa
%A Bergler, Sabine
%A Bhatnagar, Aakash
%A Bhavsar, Nidhir
%A Chang, Yung-Chun
%A Lin, Sheng-Jie
%A Tang, Wentai
%A Zhang, Hongtong
%A Tavchioski, Ilija
%A Tian, Shubo
%A Zhang, Jinfeng
%A Otmakhova, Yulia
%A Yepes, Antonio Jimeno
%A Dong, Hang
%A Wu, Honghan
%A Dufour, Richard
%A Labrak, Yanis
%A Chatterjee, Niladri
%A Tandon, Kushagri
%A Laleye, Fréjus
%A Rakotoson, Loïc
%A Chersoni, Emmanuele
%A Gu, Jinghang
%A Friedrich, Annemarie
%A Pujari, Subhash Chandra
%A Chizhikova, Mariia
%A Sivadasan, Naveen
%A Lu, Zhiyong
%D 2022
%K biomedical_literature covid deep_learning digital_libraries document_classification information_retrieval litcovid multi_label_classification myown
%T Multi-label classification for biomedical literature: an overview of the BioCreative VII LitCovid Track for COVID-19 literature topic annotations
%U http://arxiv.org/abs/2204.09781
%X The COVID-19 pandemic has been severely impacting global society since December 2019. Massive research has been undertaken to understand the characteristics of the virus and design vaccines and drugs. The related findings have been reported in biomedical literature at a rate of about 10,000 articles on COVID-19 per month. Such rapid growth significantly challenges manual curation and interpretation. For instance, LitCovid is a literature database of COVID-19-related articles in PubMed, which has accumulated more than 200,000 articles with millions of accesses each month by users worldwide. One primary curation task is to assign up to eight topics (e.g., Diagnosis and Treatment) to the articles in LitCovid. Despite the continuing advances in biomedical text mining methods, few have been dedicated to topic annotations in COVID-19 literature. To close the gap, we organized the BioCreative LitCovid track to call for a community effort to tackle automated topic annotation for COVID-19 literature. The BioCreative LitCovid dataset, consisting of over 30,000 articles with manually reviewed topics, was created for training and testing. It is one of the largest multilabel classification datasets in biomedical scientific literature. 19 teams worldwide participated and made 80 submissions in total. Most teams used hybrid systems based on transformers. The highest performing submissions achieved 0.8875, 0.9181, and 0.9394 for macro F1-score, micro F1-score, and instance-based F1-score, respectively. The level of participation and results demonstrate a successful track and help close the gap between dataset curation and method development. The dataset is publicly available via https://ftp.ncbi.nlm.nih.gov/pub/lu/LitCovid/biocreative/ for benchmarking and further development.

@misc{chen2022multilabel,
  abstract = {The COVID-19 pandemic has been severely impacting global society since December 2019. Massive research has been undertaken to understand the characteristics of the virus and design vaccines and drugs. The related findings have been reported in biomedical literature at a rate of about 10,000 articles on COVID-19 per month. Such rapid growth significantly challenges manual curation and interpretation. For instance, LitCovid is a literature database of COVID-19-related articles in PubMed, which has accumulated more than 200,000 articles with millions of accesses each month by users worldwide. One primary curation task is to assign up to eight topics (e.g., Diagnosis and Treatment) to the articles in LitCovid. Despite the continuing advances in biomedical text mining methods, few have been dedicated to topic annotations in COVID-19 literature. To close the gap, we organized the BioCreative LitCovid track to call for a community effort to tackle automated topic annotation for COVID-19 literature. The BioCreative LitCovid dataset, consisting of over 30,000 articles with manually reviewed topics, was created for training and testing. It is one of the largest multilabel classification datasets in biomedical scientific literature. 19 teams worldwide participated and made 80 submissions in total. Most teams used hybrid systems based on transformers. The highest performing submissions achieved 0.8875, 0.9181, and 0.9394 for macro F1-score, micro F1-score, and instance-based F1-score, respectively. The level of participation and results demonstrate a successful track and help close the gap between dataset curation and method development. The dataset is publicly available via https://ftp.ncbi.nlm.nih.gov/pub/lu/LitCovid/biocreative/ for benchmarking and further development.},
  added-at = {2022-04-22T13:54:03.000+0200},
  author = {Chen, Qingyu and Allot, Alexis and Leaman, Robert and Doğan, Rezarta Islamaj and Du, Jingcheng and Fang, Li and Kai, Wang and Xu, Shuo and Zhang, Yuefu and Bagherzadeh, Parsa and Bergler, Sabine and Bhatnagar, Aakash and Bhavsar, Nidhir and Chang, Yung-Chun and Lin, Sheng-Jie and Tang, Wentai and Zhang, Hongtong and Tavchioski, Ilija and Tian, Shubo and Zhang, Jinfeng and Otmakhova, Yulia and Yepes, Antonio Jimeno and Dong, Hang and Wu, Honghan and Dufour, Richard and Labrak, Yanis and Chatterjee, Niladri and Tandon, Kushagri and Laleye, Fréjus and Rakotoson, Loïc and Chersoni, Emmanuele and Gu, Jinghang and Friedrich, Annemarie and Pujari, Subhash Chandra and Chizhikova, Mariia and Sivadasan, Naveen and Lu, Zhiyong},
  biburl = {https://www.bibsonomy.org/bibtex/2cd69788a86a26c8b06bb6979133316d1/hangdong},
  description = {[2204.09781] Multi-label classification for biomedical literature: an overview of the BioCreative VII LitCovid Track for COVID-19 literature topic annotations},
  interhash = {2eb236e6a376e2e3ef29801d091cbd99},
  intrahash = {cd69788a86a26c8b06bb6979133316d1},
  keywords = {biomedical_literature covid deep_learning digital_libraries document_classification information_retrieval litcovid multi_label_classification myown},
  note = {cite arxiv:2204.09781},
  timestamp = {2022-04-22T13:55:02.000+0200},
  title = {Multi-label classification for biomedical literature: an overview of the BioCreative VII LitCovid Track for COVID-19 literature topic annotations},
  url = {http://arxiv.org/abs/2204.09781},
  year = 2022
}