In natural language understanding (NLU) tasks, there is a hierarchy of lenses through which we can extract meaning — from words to sentences to paragraphs to documents. At the document level, one of the most useful ways to understand text is by analyzing its topics. The process of learning, recognizing, and extracting these topics across a collection of documents is called topic modeling.
In this post, we will explore topic modeling through 4 of the most popular techniques today: LSA, pLSA, LDA, and the newer, deep learning-based lda2vec.
From the user's perspective, MDP is a collection of supervised and unsupervised learning algorithms and other data processing units that can be combined into data processing sequences and more complex feed-forward network architectures.
"Here's a preliminary data mining analysis of musical social networking service Last.fm. An automated classification into clusters or sub populations with related musical genres reveals the structure of musical preferences among the users in a relatively large sample population. Musical tag clouds are adopted to characterise users and populations, which adds a highly descriptive value and aids with the interpretation of the results."
Modular toolkit for Data Processing (MDP) is a Python data processing framework. Implemented algorithms include: Principal Component Analysis (PCA), Independent Component Analysis (ICA), Slow Feature Analysis (SFA), Growing Neural Gas (GNG), Factor Analys