In this post, I show how I use NLTK for preprocessing and tokenization and then apply machine learning techniques (e.g. training a linear SVM with stochastic gradient descent) using scikit-learn.
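A minimal sketch of that workflow might look like the following. It is not the post's actual code: the toy corpus, the `nltk_tokenize` helper, and the choice of `wordpunct_tokenize` plus Porter stemming are assumptions made for illustration (both were picked because they need no extra NLTK data downloads). The linear SVM is approximated in the usual scikit-learn way, with `SGDClassifier` and hinge loss.

```python
from nltk.stem import PorterStemmer
from nltk.tokenize import wordpunct_tokenize
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import make_pipeline

stemmer = PorterStemmer()


def nltk_tokenize(text):
    """Hypothetical helper: lowercase, tokenize with NLTK, keep alphabetic tokens, stem."""
    tokens = wordpunct_tokenize(text.lower())
    return [stemmer.stem(tok) for tok in tokens if tok.isalpha()]


# Toy corpus invented for this sketch; a real task would load its own data.
train_texts = [
    "A wonderful, moving film with great acting",
    "Enjoyable and beautifully shot",
    "I loved every minute of this movie",
    "A boring, badly written mess",
    "Terrible pacing and awful dialogue",
    "I hated this dull movie",
]
train_labels = ["pos", "pos", "pos", "neg", "neg", "neg"]

model = make_pipeline(
    # NLTK does the tokenization, so scikit-learn's own lowercasing
    # and token pattern are switched off.
    TfidfVectorizer(tokenizer=nltk_tokenize, lowercase=False, token_pattern=None),
    # Hinge loss trained with SGD gives a linear SVM.
    SGDClassifier(loss="hinge", max_iter=1000, tol=1e-3, random_state=0),
)
model.fit(train_texts, train_labels)

predictions = model.predict(["a wonderful and enjoyable film"])
print(predictions)
```

Passing the NLTK-based function as `tokenizer` keeps all text handling in one place, so the same preprocessing is applied at training and prediction time.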
T. Lanciano, F. Bonchi, and A. Gionis. Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 3308--3318. (2020)
M. Paris and R. Jäschke. Proceedings of the 14th International Conference on Knowledge Science, Engineering and Management, Volume 12816 of Lecture Notes in Artificial Intelligence, pp. 1--14. Springer, (2021)
W. Martins, M. Goncalves, A. Laender, and G. Pappa. Proceedings of the 9th ACM/IEEE-CS Joint Conference on Digital Libraries, pp. 193--202. New York, NY, USA, ACM, (2009)
C. Henning and R. Ewerth. Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval, pp. 14--22. New York, NY, USA, ACM, (2017)
X. Zhang and Y. LeCun. (2015). arXiv:1502.01710. Comment: This technical report is superseded by a paper entitled "Character-level Convolutional Networks for Text Classification", arXiv:1509.01626. It has considerably more experimental results and a rewritten introduction.
G. Krempl, T. Ha, and M. Spiliopoulou. Proc. of the 18th Int. Conf. on Discovery Science (DS 2015), Volume 9356 of Lecture Notes in Computer Science, pp. 101--115. Springer, (2015)
X. Li, B. Liu, and S. Ng. Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, pp. 218--228. Stroudsburg, PA, USA, Association for Computational Linguistics, (2010)