You've built a vibrant community of Family Guy enthusiasts. The SVD recommendation algorithm took your site to the next level by allowing you to leverage the implicit knowledge of your community. But now you're ready for the next iteration - you are about
MALLET is a Java-based package for statistical natural language processing, document classification, clustering, topic modeling, information extraction, and other machine learning applications to text.
In this post, I want to show how I use NLTK for preprocessing and tokenization, but then apply machine learning techniques (e.g. building a linear SVM using stochastic gradient descent) using Scikit-Learn.
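As a rough illustration of that workflow, here is a minimal sketch, not the code from the post itself. It assumes NLTK's `wordpunct_tokenize` and `PorterStemmer` for preprocessing (neither needs a corpus download) and scikit-learn's `SGDClassifier` with hinge loss, which trains a linear SVM by stochastic gradient descent. The toy documents, labels, and the helper name `nltk_tokenize` are hypothetical and exist only for the example.

```python
from nltk.stem import PorterStemmer
from nltk.tokenize import wordpunct_tokenize
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import make_pipeline

stemmer = PorterStemmer()


def nltk_tokenize(text):
    """Lowercase, split with NLTK's regex tokenizer, keep alphabetic tokens, stem."""
    tokens = wordpunct_tokenize(text.lower())
    return [stemmer.stem(tok) for tok in tokens if tok.isalpha()]


# Hypothetical toy corpus, made up purely for illustration.
train_docs = [
    "The striker scored two goals in the final match",
    "The team won the league after a late goal",
    "The new processor doubles the laptop's battery life",
    "The phone ships with a faster chip and more memory",
]
train_labels = ["sports", "sports", "tech", "tech"]

# NLTK handles tokenization; scikit-learn handles features and the classifier.
# loss="hinge" makes SGDClassifier fit a linear SVM with stochastic gradient descent.
model = make_pipeline(
    TfidfVectorizer(tokenizer=nltk_tokenize, lowercase=False, token_pattern=None),
    SGDClassifier(loss="hinge", random_state=0),
)
model.fit(train_docs, train_labels)

predictions = model.predict([
    "A late goal decided the match",
    "The laptop has a faster processor",
])
print(list(predictions))
```

Passing the NLTK function as the vectorizer's `tokenizer` keeps all preprocessing inside the pipeline, so the same steps run at both training and prediction time.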
R. Schapire, Y. Singer, and A. Singhal. Proceedings of SIGIR-98, 21st ACM International Conference on Research and Development in Information Retrieval, page 215--223. Melbourne, Australia, ACM Press, New York, US, (1998)
G. Krempl, T. Ha, and M. Spiliopoulou. Proc. of the 18th Int. Conf. on Discovery Science (DS 2015), volume 9356 of Lecture Notes in Computer Science, page 101--115. Springer, (2015)