Unitex is a corpus processing system based on automata-oriented technology. The software was conceived at LADL (Laboratoire d'Automatique Documentaire et Linguistique) under the direction of Maurice Gross. It lets you manage electronic resources such as electronic dictionaries and grammars and apply them to texts, working at the levels of morphology, the lexicon, and syntax.
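To give a rough idea of what applying an electronic dictionary through an automaton can look like, here is a minimal Python sketch. It is not Unitex code and does not use Unitex's dictionary format or API: the names `TrieNode`, `build_lexicon`, `lookup`, and `annotate`, as well as the toy entries, are all hypothetical and chosen only for illustration. The lexicon is stored as a trie, which behaves like a simple deterministic automaton that consumes one character per transition.

```python
from typing import Dict, List, Optional, Tuple

# Illustrative only: a (lemma, category) pair attached to an inflected form.
Entry = Tuple[str, str]


class TrieNode:
    """One automaton state: outgoing character transitions plus an optional entry."""

    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.entry: Optional[Entry] = None  # set only on accepting states


def build_lexicon(entries: List[Tuple[str, str, str]]) -> TrieNode:
    """Compile (form, lemma, category) triples into a trie."""
    root = TrieNode()
    for form, lemma, category in entries:
        node = root
        for ch in form:
            node = node.children.setdefault(ch, TrieNode())
        node.entry = (lemma, category)
    return root


def lookup(root: TrieNode, word: str) -> Optional[Entry]:
    """Follow one transition per character; return the entry if the final state accepts."""
    node: Optional[TrieNode] = root
    for ch in word:
        node = node.children.get(ch) if node is not None else None
        if node is None:
            return None
    return node.entry if node is not None else None


def annotate(root: TrieNode, text: str) -> List[Tuple[str, Optional[Entry]]]:
    """Apply the lexicon to whitespace-separated tokens (lowercased before lookup)."""
    return [(token, lookup(root, token.lower())) for token in text.split()]


# Hypothetical toy dictionary, not taken from any real resource.
TOY_ENTRIES = [
    ("cat", "cat", "N"),
    ("cats", "cat", "N"),
    ("sleep", "sleep", "V"),
    ("sleeps", "sleep", "V"),
]

lexicon = build_lexicon(TOY_ENTRIES)
annotated = annotate(lexicon, "Cats sleep")
```

Tokens that are missing from the toy dictionary, or that are only a prefix of an entry (such as `ca`), come back with `None` instead of an entry. A real system would use a far more compact automaton and a richer entry format.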
NLTK (the Natural Language Toolkit) is a suite of open source Python modules, data, and documentation for research and development in natural language processing. NLTK contains code supporting dozens of NLP tasks, along with 40 popular corpora and extensive documentation, including a 375-page online book. Distributions are available for Windows, Mac OS X, and Linux.
P. Cimiano, S. Staab, and J. Tane. Proceedings of the ECML/PKDD Workshop on Adaptive Text Extraction and Mining, Cavtat-Dubrovnik, Croatia, pages 10-17. (2003)