My primary area of research is Arabic Computational Linguistics. Specifically:
Stemming: Details about the stemmer I have developed for Arabic, with a link to the Java code.
Tagging: Details about the Part-Of-Speech (POS) tagger I am developing for Arabic.
Corpora: Details about the Arabic corpora I am using. I have manually tagged 50,000 words of Arabic newspaper text with the basic tags (noun, verb, particle). I have also tagged 1,700 words with more detailed tags (e.g. singular, masculine, definite common noun). Both tagged sets are available for research purposes. Please e-mail me if you would like a copy.
Publications: I have included a couple of my publications here that can be viewed or downloaded.
“This guide is intended for anybody with basic programming knowledge or a computer science background who is interested in becoming a Research Scientist with a focus on Deep Learning and NLP”.