
Data Mining using Genetic Programming: Classification and Symbolic Regression

Institute for Programming research and Algorithmics, Leiden Institute of Advanced Computer Science, Faculty of Mathematics & Natural Sciences, Leiden University, The Netherlands, (14 September 2005)

Abstract

Sir Francis Bacon said about four centuries ago: "Knowledge is Power". In today's society, information is becoming increasingly important. According to [73], about five exabytes (5 × 10^18 bytes) of new information were produced in 2002, 92% of which was stored on magnetic media (e.g., hard disks). This was more than double the amount of information produced in 1999 (2 exabytes). However, as Albert Einstein observed: "Information is not Knowledge". One of the challenges posed by the large amounts of information stored in databases is to find or extract potentially useful, understandable and novel patterns in the data which can lead to new insights. To quote T.S. Eliot: "Where is the knowledge we have lost in information?" [35]. This is the goal of a process called Knowledge Discovery in Databases (KDD) [36]. The KDD process consists of several phases; the actual discovery of new knowledge takes place in the Data Mining phase. The rest of this introduction is organized as follows. We start with an introduction to Data Mining and, more specifically, the two subject areas of Data Mining we will be looking at: classification and regression. Next we give an introduction to evolutionary computation in general and tree-based genetic programming in particular. In Section 1.4 we give our motivation for using genetic programming for Data Mining. Finally, in the last sections we give an overview of the thesis and related publications.
