An unsupervised model for text message normalization
P. Cook and S. Stevenson. CALC '09: Proceedings of the Workshop on Computational Approaches to Linguistic Creativity, pages 71--78. Morristown, NJ, USA, Association for Computational Linguistics, (2009)
Abstract
Cell phone text messaging users express themselves briefly and colloquially using a variety of creative forms. We analyze a sample of creative, non-standard text message word forms to determine frequent word formation processes in texting language. Drawing on these observations, we construct an unsupervised noisy-channel model for text message normalization. On a test set of 303 text message forms that differ from their standard form, our model achieves 59% accuracy, which is on par with the best supervised results reported on this dataset.
%0 Conference Paper
%1 1642021
%A Cook, Paul
%A Stevenson, Suzanne
%B CALC '09: Proceedings of the Workshop on Computational Approaches to Linguistic Creativity
%C Morristown, NJ, USA
%D 2009
%I Association for Computational Linguistics
%K SMS chatlog normalization
%P 71--78
%T An unsupervised model for text message normalization
%U http://portal.acm.org/citation.cfm?id=1642021&dl=#
%X Cell phone text messaging users express themselves briefly and colloquially using a variety of creative forms. We analyze a sample of creative, non-standard text message word forms to determine frequent word formation processes in texting language. Drawing on these observations, we construct an unsupervised noisy-channel model for text message normalization. On a test set of 303 text message forms that differ from their standard form, our model achieves 59% accuracy, which is on par with the best supervised results reported on this dataset.
%@ 978-1-932432-36-7
@inproceedings{1642021,
abstract = {Cell phone text messaging users express themselves briefly and colloquially using a variety of creative forms. We analyze a sample of creative, non-standard text message word forms to determine frequent word formation processes in texting language. Drawing on these observations, we construct an unsupervised noisy-channel model for text message normalization. On a test set of 303 text message forms that differ from their standard form, our model achieves 59% accuracy, which is on par with the best supervised results reported on this dataset.},
added-at = {2010-06-17T21:16:30.000+0200},
address = {Morristown, NJ, USA},
author = {Cook, Paul and Stevenson, Suzanne},
biburl = {https://www.bibsonomy.org/bibtex/25ffcde106e83f8a0ea825661c75864b8/zhenzhenx},
booktitle = {CALC '09: Proceedings of the Workshop on Computational Approaches to Linguistic Creativity},
description = {An unsupervised model for text message normalization},
interhash = {d9964d96ba52e9e987397c92f93d13a7},
intrahash = {5ffcde106e83f8a0ea825661c75864b8},
isbn = {978-1-932432-36-7},
keywords = {SMS chatlog normalization},
location = {Boulder, Colorado},
pages = {71--78},
publisher = {Association for Computational Linguistics},
timestamp = {2010-06-17T21:16:30.000+0200},
title = {An unsupervised model for text message normalization},
url = {http://portal.acm.org/citation.cfm?id=1642021&dl=#},
year = 2009
}