A Simple Algorithm for Topic Identification in 0–1 Data

Knowledge Discovery in Databases: PKDD 2003 (2003)

Abstract

Topics in 0-1 datasets are sets of variables whose occurrences are positively connected together. Earlier, we described a simple generative topic model. In this paper we show that, given data produced by this model, the lift statistics of attributes can be described in matrix form. We use this result to obtain a simple algorithm for finding topics in 0-1 data. We also show that a problem related to the identification of topics is NP-hard. We give experimental results on the topic identification problem, both on generated and real data.
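
To make the term "lift statistics" concrete, the following is a minimal sketch that computes a pairwise lift matrix for 0-1 data. It assumes the usual definition lift(a, b) = P(a=1, b=1) / (P(a=1) P(b=1)), which the abstract itself does not spell out. The function name `lift_matrix` and the toy data are hypothetical illustrations. This is not the paper's topic identification algorithm, only the statistic that algorithm is said to build on.

```python
def lift_matrix(rows):
    """Pairwise lift for 0-1 data given as a list of equal-length 0/1 rows.

    Assumed definition: lift(a, b) = P(a=1, b=1) / (P(a=1) * P(b=1)).
    Values above 1 suggest positive association between two attributes.
    An entry is None when either marginal frequency is zero.
    """
    n = len(rows)
    m = len(rows[0])
    # Marginal frequency of each attribute (column).
    p = [sum(r[j] for r in rows) / n for j in range(m)]
    lift = [[None] * m for _ in range(m)]
    for a in range(m):
        for b in range(m):
            if p[a] > 0 and p[b] > 0:
                # Joint frequency of attributes a and b both being 1.
                p_ab = sum(r[a] * r[b] for r in rows) / n
                lift[a][b] = p_ab / (p[a] * p[b])
    return lift


# Toy data (made up): attributes 0 and 1 always co-occur, attribute 2 never
# occurs together with them.
data = [
    [1, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
    [0, 0, 1],
]
L = lift_matrix(data)
```

On this toy input the entry for attributes 0 and 1 exceeds 1, while the entries pairing either of them with attribute 2 are 0, which matches the intuition that a topic groups positively associated variables.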

Description

SpringerLink - Book Chapter
