
Ontology-Based and Weakly Supervised Rare Disease Phenotyping from Clinical Notes

H. Dong, V. Suárez-Paniagua, H. Zhang, M. Wang, A. Casey, E. Davidson, J. Chen, B. Alex, W. Whiteley, and H. Wu. (2022). cite arxiv:2205.05656. Comment: 20 pages, 5 figures, submitted to Journal of Biomedical Informatics.

Abstract

Computational text phenotyping is the practice of identifying patients with certain disorders and traits from clinical notes. Rare diseases are challenging to identify because few cases are available for machine learning and data annotation requires domain experts. We propose a method using ontologies and weak supervision, with recent pre-trained contextual representations from Bi-directional Transformers (e.g. BERT). The ontology-based framework includes two steps: (i) Text-to-UMLS, extracting phenotypes by contextually linking mentions to concepts in the Unified Medical Language System (UMLS), with a Named Entity Recognition and Linking (NER+L) tool, SemEHR, and weak supervision with customised rules and contextual mention representation; (ii) UMLS-to-ORDO, matching UMLS concepts to rare diseases in the Orphanet Rare Disease Ontology (ORDO). The weakly supervised approach is proposed to learn a phenotype confirmation model to improve Text-to-UMLS linking, without annotated data from domain experts. We evaluated the approach on three clinical datasets of discharge summaries and radiology reports from two institutions in the US and the UK. Our best weakly supervised method achieved 81.4% precision and 91.4% recall on extracting rare disease UMLS phenotypes from MIMIC-III discharge summaries. The overall pipeline for processing clinical notes can surface rare disease cases, most of which are not captured in structured data (manually assigned ICD codes). Results on radiology reports from MIMIC-III and NHS Tayside were consistent with the discharge summaries. We discuss the usefulness of the weak supervision approach and propose directions for future studies.
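
The two-step framework in the abstract can be illustrated with a small Python sketch. It is not the authors' implementation: the `Mention` class, the length-based rule in `weak_label`, and the identifiers in `UMLS_TO_ORDO` are hypothetical placeholders (not real UMLS CUIs or ORDO IDs), and the BERT-based phenotype confirmation model is not shown; the rule's output stands in for it here.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Placeholder mapping for step (ii), UMLS-to-ORDO.
# These are NOT real UMLS CUIs or ORDO identifiers.
UMLS_TO_ORDO = {"CUI_A": "ORDO_1", "CUI_B": "ORDO_2"}


@dataclass
class Mention:
    text: str  # surface form found in the clinical note
    cui: str   # candidate UMLS concept proposed by an NER+L tool


def weak_label(mention: Mention) -> bool:
    """Hypothetical customised rule used as a weak label.

    Assumption for illustration: very short mentions are treated as
    ambiguous abbreviations and labelled negative.
    """
    return len(mention.text) > 3


def link_rare_diseases(mentions: List[Mention]) -> List[Tuple[str, str, str]]:
    """Step (i): keep mentions the rule confirms.
    Step (ii): map each confirmed UMLS concept to an ORDO entry."""
    linked = []
    for m in mentions:
        if weak_label(m) and m.cui in UMLS_TO_ORDO:
            linked.append((m.text, m.cui, UMLS_TO_ORDO[m.cui]))
    return linked


mentions = [
    Mention("example syndrome", "CUI_A"),  # confirmed and mapped
    Mention("ES", "CUI_A"),                # rejected by the length rule
    Mention("other disorder", "CUI_C"),    # no ORDO match for this concept
]
print(link_rare_diseases(mentions))
```

In a weakly supervised setup, labels like those from `weak_label` would serve as training targets for a confirmation model rather than as the final decision, which is how the abstract avoids the need for expert-annotated data.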

Description

[2205.05656] Ontology-Based and Weakly Supervised Rare Disease Phenotyping from Clinical Notes

Links and resources

BibTeX key: dong2022ontologybased
entry type: misc
year: 2022
url: http://arxiv.org/abs/2205.05656
note: cite arxiv:2205.05656. Comment: 20 pages, 5 figures, submitted to Journal of Biomedical Informatics


Cite this publication

%0 Generic
%1 dong2022ontologybased
%A Dong, Hang
%A Suárez-Paniagua, Víctor
%A Zhang, Huayu
%A Wang, Minhong
%A Casey, Arlene
%A Davidson, Emma
%A Chen, Jiaoyan
%A Alex, Beatrice
%A Whiteley, William
%A Wu, Honghan
%D 2022
%K clinical_notes ehr icd ontologies ontology_matching ordo phenotyping rare_disease semehr umls weak_supervision
%T Ontology-Based and Weakly Supervised Rare Disease Phenotyping from Clinical Notes
%U http://arxiv.org/abs/2205.05656
%X Computational text phenotyping is the practice of identifying patients with certain disorders and traits from clinical notes. Rare diseases are challenging to be identified due to few cases available for machine learning and the need for data annotation from domain experts. We propose a method using ontologies and weak supervision, with recent pre-trained contextual representations from Bi-directional Transformers (e.g. BERT). The ontology-based framework includes two steps: (i) Text-to-UMLS, extracting phenotypes by contextually linking mentions to concepts in Unified Medical Language System (UMLS), with a Named Entity Recognition and Linking (NER+L) tool, SemEHR, and weak supervision with customised rules and contextual mention representation; (ii) UMLS-to-ORDO, matching UMLS concepts to rare diseases in Orphanet Rare Disease Ontology (ORDO). The weakly supervised approach is proposed to learn a phenotype confirmation model to improve Text-to-UMLS linking, without annotated data from domain experts. We evaluated the approach on three clinical datasets of discharge summaries and radiology reports from two institutions in the US and the UK. Our best weakly supervised method achieved 81.4% precision and 91.4% recall on extracting rare disease UMLS phenotypes from MIMIC-III discharge summaries. The overall pipeline processing clinical notes can surface rare disease cases, mostly uncaptured in structured data (manually assigned ICD codes). Results on radiology reports from MIMIC-III and NHS Tayside were consistent with the discharge summaries. We discuss the usefulness of the weak supervision approach and propose directions for future studies.

@misc{dong2022ontologybased,
  abstract = {Computational text phenotyping is the practice of identifying patients with certain disorders and traits from clinical notes. Rare diseases are challenging to be identified due to few cases available for machine learning and the need for data annotation from domain experts. We propose a method using ontologies and weak supervision, with recent pre-trained contextual representations from Bi-directional Transformers (e.g. BERT). The ontology-based framework includes two steps: (i) Text-to-UMLS, extracting phenotypes by contextually linking mentions to concepts in Unified Medical Language System (UMLS), with a Named Entity Recognition and Linking (NER+L) tool, SemEHR, and weak supervision with customised rules and contextual mention representation; (ii) UMLS-to-ORDO, matching UMLS concepts to rare diseases in Orphanet Rare Disease Ontology (ORDO). The weakly supervised approach is proposed to learn a phenotype confirmation model to improve Text-to-UMLS linking, without annotated data from domain experts. We evaluated the approach on three clinical datasets of discharge summaries and radiology reports from two institutions in the US and the UK. Our best weakly supervised method achieved 81.4% precision and 91.4% recall on extracting rare disease UMLS phenotypes from MIMIC-III discharge summaries. The overall pipeline processing clinical notes can surface rare disease cases, mostly uncaptured in structured data (manually assigned ICD codes). Results on radiology reports from MIMIC-III and NHS Tayside were consistent with the discharge summaries. We discuss the usefulness of the weak supervision approach and propose directions for future studies.},
  added-at = {2022-08-05T10:38:46.000+0200},
  author = {Dong, Hang and Suárez-Paniagua, Víctor and Zhang, Huayu and Wang, Minhong and Casey, Arlene and Davidson, Emma and Chen, Jiaoyan and Alex, Beatrice and Whiteley, William and Wu, Honghan},
  biburl = {https://www.bibsonomy.org/bibtex/25230b384acba476d8328c3176f2f557e/hangdong},
  description = {[2205.05656] Ontology-Based and Weakly Supervised Rare Disease Phenotyping from Clinical Notes},
  interhash = {ee921e0a2eab175a2a641abc8ac579ca},
  intrahash = {5230b384acba476d8328c3176f2f557e},
  keywords = {clinical_notes ehr icd ontologies ontology_matching ordo phenotyping rare_disease semehr umls weak_supervision},
  note = {cite arxiv:2205.05656Comment: 20 pages, 5 figures, submitted to Journal of Biomedical Informatics},
  timestamp = {2022-08-05T10:38:46.000+0200},
  title = {Ontology-Based and Weakly Supervised Rare Disease Phenotyping from Clinical Notes},
  url = {http://arxiv.org/abs/2205.05656},
  year = 2022
}


Metadata

Last modified 2 years ago
Created 2 years ago
