“This guide is intended for anybody with basic programming knowledge or a computer science background who is interested in becoming a Research Scientist with a focus on Deep Learning and NLP.”
My primary area of research is Arabic Computational Linguistics. Specifically:
Stemming: Details about the stemmer I have developed for Arabic, with a link to the Java code.
Tagging: Details about the Part-Of-Speech (POS) tagger I am developing for Arabic.
Corpora: Details about the Arabic corpora I am using. I have manually tagged 50,000 words of Arabic newspaper text with the basic tags (noun, verb, particle). I have also tagged 1,700 words with more detailed tags (e.g. singular, masculine, definite common noun). Both tagged corpora are available for research purposes. Please e-mail me if you would like a copy.
Publications: I have included a couple of my publications here that can be viewed or downloaded.
SIGIR is the major international forum for the presentation of new research results and the demonstration of new systems and techniques in the field of information retrieval.
K. Kobs, T. Koopmann, A. Zehe, D. Fernes, P. Krop, and A. Hotho. Findings of the Association for Computational Linguistics: EMNLP 2020, pages 878--883. Online, Association for Computational Linguistics, (November 2020)
S. Gulwani and M. Marron. Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data, pages 803--814. New York, NY, USA, ACM, (2014)