MIT 6.034 Artificial Intelligence, Fall 2010. View the complete course: http://ocw.mit.edu/6-034F10 Instructor: Patrick Winston. In this lecture, we explore su...
In this post, I show how I use NLTK for preprocessing and tokenization, and then apply machine learning techniques with Scikit-Learn (e.g. building a linear SVM trained with stochastic gradient descent).
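To make the "linear SVM using stochastic gradient descent" idea concrete, here is a minimal, dependency-free sketch of the underlying technique. It does not use the NLTK or Scikit-Learn calls from the post: the regex tokenizer is only a stand-in for an NLTK tokenizer, and the function names (`tokenize`, `featurize`, `train_linear_svm_sgd`, `predict`), the hyperparameters, and the omission of an intercept term are all assumptions made for illustration.

```python
import random
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens.

    Stand-in for an NLTK tokenizer; this is an assumption, not the post's code.
    """
    return re.findall(r"[a-z0-9']+", text.lower())


def featurize(text):
    """Turn a text into sparse bag-of-words counts: {token: count}."""
    counts = {}
    for tok in tokenize(text):
        counts[tok] = counts.get(tok, 0.0) + 1.0
    return counts


def dot(w, x):
    """Sparse dot product between a weight dict and a feature dict."""
    return sum(w.get(f, 0.0) * v for f, v in x.items())


def train_linear_svm_sgd(samples, labels, epochs=20, lr=0.1, lam=0.01, seed=0):
    """SGD on lam/2 * ||w||^2 + hinge loss; labels must be +1 or -1.

    No intercept term is learned, to keep the sketch short.
    """
    rng = random.Random(seed)
    w = {}
    order = list(range(len(samples)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            x, y = samples[i], labels[i]
            margin = y * dot(w, x)
            # Gradient of the L2 penalty: shrink every weight slightly.
            for f in w:
                w[f] *= 1.0 - lr * lam
            # Subgradient of the hinge loss: only active when margin < 1.
            if margin < 1.0:
                for f, v in x.items():
                    w[f] = w.get(f, 0.0) + lr * y * v
    return w


def predict(w, text):
    """Return +1 or -1 depending on the sign of the linear score."""
    return 1 if dot(w, featurize(text)) >= 0 else -1


if __name__ == "__main__":
    docs = [
        "great movie loved it",
        "wonderful acting great plot",
        "terrible movie hated it",
        "awful plot boring acting",
    ]
    labels = [1, 1, -1, -1]
    weights = train_linear_svm_sgd([featurize(d) for d in docs], labels)
    print(predict(weights, "loved wonderful great"))
```

In practice a library implementation would replace the training loop; the sketch is only meant to show what each SGD step does to the weights.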
LIBLINEAR is a linear classifier for data with millions of instances and features. It supports L2-regularized logistic regression (LR), L2-loss linear SVM, and L1-loss linear SVM.
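For orientation, the three supported models can be written as one regularized objective with different loss terms. This is the formulation as it is commonly stated for linear classifiers of this kind; the symbols $C$ (penalty parameter), $l$ (number of training instances), and $\xi$ (per-instance loss) do not appear in the text above and are introduced here as notation.

```latex
\min_{w} \; \frac{1}{2} w^{\top} w + C \sum_{i=1}^{l} \xi(w; x_i, y_i),
\qquad y_i \in \{-1, +1\}

% L1-loss linear SVM
\xi(w; x_i, y_i) = \max\left(0,\; 1 - y_i w^{\top} x_i\right)

% L2-loss linear SVM
\xi(w; x_i, y_i) = \max\left(0,\; 1 - y_i w^{\top} x_i\right)^{2}

% L2-regularized logistic regression
\xi(w; x_i, y_i) = \log\left(1 + e^{-y_i w^{\top} x_i}\right)
```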
Main features of LIBLINEAR include:
* Same data format as LIBSVM, our general-purpose SVM solver, and also similar usage
* Multi-class classification: 1) one-vs-the-rest, 2) Crammer & Singer
* Cross validation for model selection
* Probability estimates (logistic regression only)
* Weights for unbalanced data
* MATLAB/Octave, Java interfaces
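The shared LIBSVM data format stores one instance per line as `<label> <index>:<value> <index>:<value> ...`, with 1-based feature indices and zero-valued features omitted. A minimal sketch of reading and writing such lines follows; the helper names `parse_libsvm_line` and `format_libsvm_line` are hypothetical and are not part of LIBLINEAR or LIBSVM.

```python
def parse_libsvm_line(line):
    """Parse '<label> <index>:<value> ...' into (label, {index: value}).

    Hypothetical helper for illustration; not a LIBLINEAR/LIBSVM function.
    """
    parts = line.split()
    if not parts:
        raise ValueError("empty line")
    label = float(parts[0])
    features = {}
    for item in parts[1:]:
        index, value = item.split(":", 1)
        features[int(index)] = float(value)
    return label, features


def format_libsvm_line(label, features):
    """Write a label and a sparse feature dict as one line.

    Indices are emitted in ascending order and zero values are skipped.
    """
    pairs = [
        f"{index}:{value:g}"
        for index, value in sorted(features.items())
        if value != 0
    ]
    return " ".join([f"{label:g}"] + pairs)


if __name__ == "__main__":
    label, features = parse_libsvm_line("+1 1:0.5 3:2")
    print(label, features)
    print(format_libsvm_line(-1, {4: 1.25, 2: 0.0, 1: 3.0}))
```

Because only non-zero features are stored, the format stays compact for the sparse, high-dimensional data that linear classifiers are typically applied to.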
N. Gunasekara, S. Pang, and N. Kasabov. Neural Information Processing. Models and Applications, volume 6444 of Lecture Notes in Computer Science, Springer Berlin Heidelberg, (2010)
K. Chen, T. Chen, G. Zheng, O. Jin, E. Yao, and Y. Yu. Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval, pp. 661--670. ACM, (2012)