Bookmark

An Information-Theoretical Approach to Clustering Categorical Databases using Genetic Algorithms


Description

CiteSeerX - Document Details (Isaac Councill, Lee Giles): Clustering categorical databases is especially difficult because objects have no natural dissimilarity measure. We present a solution based on an information-theoretical definition of dissimilarity between partitions of finite sets, applied here to the partitions of the set of objects to be clustered that the categorical attributes determine. The method uses genetic algorithms to find an acceptable approximate clustering. We tested it on databases for which the clustering of the rows is known in advance, and we show that it finds the natural clustering of the data with a good classification rate, better than that of the classical k-means algorithm.
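
The idea of measuring dissimilarity between partitions can be illustrated with a minimal sketch. The abstract does not state the exact definition used in the paper, so this example assumes a Shannon conditional-entropy distance, d(P, Q) = H(P|Q) + H(Q|P). The function names (entropy, partition_distance, fitness) and the toy data are hypothetical. A genetic algorithm would search for a clustering that minimizes such a score; that search is not shown here.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of the partition induced by a label sequence."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def partition_distance(p, q):
    """Assumed distance: H(P|Q) + H(Q|P) = 2*H(P,Q) - H(P) - H(Q)."""
    joint = list(zip(p, q))  # blocks of the common refinement of P and Q
    return 2 * entropy(joint) - entropy(p) - entropy(q)


def fitness(clustering, rows):
    """Sum of distances from a candidate clustering to the partition
    induced by each categorical attribute (lower is better)."""
    columns = list(zip(*rows))  # one label sequence per attribute
    return sum(partition_distance(clustering, col) for col in columns)


# Toy table (illustrative only): each row is an object, each column an attribute.
rows = [
    ("red", "small"),
    ("red", "small"),
    ("blue", "large"),
    ("blue", "large"),
]
good = [0, 0, 1, 1]  # agrees with both attribute partitions
bad = [0, 1, 0, 1]   # cuts across both attribute partitions

print(fitness(good, rows), fitness(bad, rows))
```

In this toy case the clustering that matches the attribute partitions scores lower than the one that cuts across them, which is the property a search procedure would exploit.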

Users

  • @k.e.