Get free access to the in-progress manuscript of Programming Pig via the Open Feedback Publishing System (OFPS). Interact with the authors and community and provide your feedback in real time.
The Datawrangling blog was put on the back burner last May while I focused on my startup. Now that I have some bandwidth again, I am getting back to work on several pet projects (including the Amazon EC2 Cluster).
(2000) Sun Le, Jin Youbing, Du Lin, & Sun Yufang: Automatic extraction of English-Chinese term lexicons from noisy bilingual corpora. LREC-2000: Second International Conference on Language Resources and Evaluation. Proceedings, Athens, Greece, 31 May – 2 June 2000; pp. 751-755. [PDF, 128KB]
Because a Bayesian probability calculation on a simple co-occurrence frequency table built from the same data has similar disambiguation capabilities, the paper also compares LSA with the Bayesian model.
M. Carman, M. Baillie, R. Gwadera, and F. Crestani. SIGIR '09: Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval, pp. 123-130. New York, NY, USA, ACM (2009)
S. Bao, G. Xue, X. Wu, Y. Yu, B. Fei, and Z. Su. WWW '07: Proceedings of the 16th international conference on World Wide Web, pp. 501-510. New York, NY, USA, ACM (2007)
L. Muñoz, S. Rojas, and M. Rosell. Proceedings of the First International Workshop on Free/Open-Source Rule-Based Machine Translation, pp. 75-82. Alicante, Universidad de Alicante, Departamento de Lenguajes y Sistemas Informáticos (2009)
M. Zubizarreta, F. Tyers, and G. Ramírez-Sánchez. Proceedings of the First International Workshop on Free/Open-Source Rule-Based Machine Translation, pp. 3-10. Alicante, Universidad de Alicante, Departamento de Lenguajes y Sistemas Informáticos (2009)