Word embeddings and recurrent neural networks based on Long-Short Term Memory nodes in supervised biomedical word sense disambiguation

A. Yepes.
Journal of Biomedical Informatics, (2017)
DOI: https://doi.org/10.1016/j.jbi.2017.08.001

Abstract

Word sense disambiguation helps identify the proper sense of ambiguous words in text. With large terminologies such as the UMLS Metathesaurus, ambiguities appear and highly effective disambiguation methods are required. Supervised learning methods are used as one of the approaches to perform disambiguation. Features extracted from the context of an ambiguous word are used to identify the proper sense of such a word. The type of features has an impact on machine learning methods and thus affects disambiguation performance. In this work, we have evaluated several types of features derived from the context of the ambiguous word and we have also explored more global features derived from MEDLINE using word embeddings. Results show that word embeddings improve the performance of more traditional features and also allow the use of recurrent neural network classifiers based on Long-Short Term Memory (LSTM) nodes. The combination of unigrams and word embeddings with an SVM sets a new state-of-the-art performance with a macro accuracy of 95.97 on the MSH WSD data set.

BibTeX key: JIMENOYEPES2017137
entry type: article
year: 2017
journal: Journal of Biomedical Informatics
pages: 137 - 147
volume: 73
issn: 1532-0464
DOI: https://doi.org/10.1016/j.jbi.2017.08.001
url: http://www.sciencedirect.com/science/article/pii/S1532046417301806

Cite this publication

@article{JIMENOYEPES2017137,
  abstract = {Word sense disambiguation helps identify the proper sense of ambiguous words in text. With large terminologies such as the UMLS Metathesaurus, ambiguities appear and highly effective disambiguation methods are required. Supervised learning methods are used as one of the approaches to perform disambiguation. Features extracted from the context of an ambiguous word are used to identify the proper sense of such a word. The type of features has an impact on machine learning methods and thus affects disambiguation performance. In this work, we have evaluated several types of features derived from the context of the ambiguous word and we have also explored more global features derived from MEDLINE using word embeddings. Results show that word embeddings improve the performance of more traditional features and also allow the use of recurrent neural network classifiers based on Long-Short Term Memory (LSTM) nodes. The combination of unigrams and word embeddings with an SVM sets a new state-of-the-art performance with a macro accuracy of 95.97 on the MSH WSD data set.},
  added-at = {2019-11-11T22:09:53.000+0100},
  author = {Yepes, Antonio Jimeno},
  biburl = {https://www.bibsonomy.org/bibtex/2472c21ba1c0bb96338a9c37d2362a575/brusilovsky},
  description = {Word embeddings and recurrent neural networks based on Long-Short Term Memory nodes in supervised biomedical word sense disambiguation - ScienceDirect},
  doi = {10.1016/j.jbi.2017.08.001},
  interhash = {8a4c160d2a4ae0e4fe4456c32cf706e4},
  intrahash = {472c21ba1c0bb96338a9c37d2362a575},
  issn = {1532-0464},
  journal = {Journal of Biomedical Informatics},
  keywords = {neural-network word-sense-disambiguation},
  pages = {137--147},
  timestamp = {2019-11-11T22:09:53.000+0100},
  title = {Word embeddings and recurrent neural networks based on Long-Short Term Memory nodes in supervised biomedical word sense disambiguation},
  url = {http://www.sciencedirect.com/science/article/pii/S1532046417301806},
  volume = 73,
  year = 2017
}