Part-of-Speech Tagging, Phrase Chunking and Named Entity Recognition with Python NLTK. Taggers and chunkers trained on treebank, brown, conll2000, ieer.
In this post, I show how I use NLTK for preprocessing and tokenization, and then apply machine learning techniques with scikit-learn (e.g. training a linear SVM with stochastic gradient descent).
This post is meant as a summary of many of the concepts that I learned in Marti Hearst's Natural Language Processing class at the UC Berkeley School of Information.
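To make the workflow described above concrete, here is a minimal sketch of NLTK tokenization feeding a scikit-learn linear SVM trained with stochastic gradient descent. The toy sentences, the labels, and the helper name `nltk_tokenize` are my own assumptions for illustration, not data or code from the class. I use `wordpunct_tokenize` here only because it is regex-based and needs no extra NLTK data downloads; another NLTK tokenizer could be swapped in.

```python
from nltk.tokenize import wordpunct_tokenize
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import make_pipeline


def nltk_tokenize(text):
    """Hypothetical helper: lowercase, split with NLTK, keep alphabetic tokens."""
    return [tok for tok in wordpunct_tokenize(text.lower()) if tok.isalpha()]


# Toy training data, invented purely for illustration.
train_texts = [
    "I loved this movie, it was wonderful!",
    "A wonderful, moving and great film.",
    "Great acting and a great story.",
    "I hated this movie, it was terrible.",
    "A terrible, boring and awful film.",
    "Awful acting and a boring story.",
]
train_labels = ["pos", "pos", "pos", "neg", "neg", "neg"]

# CountVectorizer builds bag-of-words counts from the NLTK tokens.
# SGDClassifier with hinge loss is a linear SVM fit by stochastic gradient descent.
model = make_pipeline(
    CountVectorizer(tokenizer=nltk_tokenize, lowercase=False, token_pattern=None),
    SGDClassifier(loss="hinge", random_state=0),
)
model.fit(train_texts, train_labels)

predictions = model.predict(["What a wonderful story", "Such a boring movie"])
print(list(predictions))
```

Passing the tokenizer into `CountVectorizer` keeps NLTK responsible for the text handling and scikit-learn responsible for the learning, which is the division of labor this post follows. On a dataset this small the predictions are not meaningful; the point is only the shape of the pipeline.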