Exhaustive search algorithms to mine subgroups on Big Data using Apache Spark

Progress in Artificial Intelligence (2017)
DOI: 10.1007/s13748-017-0112-x

Abstract

Subgroup discovery is a well-known technique for extracting patterns with respect to a variable of interest in the data. However, the explosion in data gathering has hampered the performance of traditional algorithms in discovering interesting relationships between different objects in a set with respect to a specific property that is of interest to the user. In this regard, our goal is to propose a set of efficient techniques to mine subgroups on Big Data by means of Apache Spark. To this end, AprioriK-SD-OE and PFP-SD-OE are proposed as fast exhaustive search algorithms to discover subgroups on Big Data using Apache Spark. The experimental study includes more than 70 datasets, considering search spaces bigger than $10^{15}$ subgroups. The scalability of our proposals is analyzed by considering datasets with 200 million instances, demonstrating the usefulness of using Spark to tackle Big Data.
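
To make the idea of exhaustive subgroup search concrete, the following is a minimal single-machine sketch in Python. It is not AprioriK-SD-OE or PFP-SD-OE and does not use Spark; it only enumerates every conjunction of attribute-value pairs up to a given length and ranks them by weighted relative accuracy (WRAcc), a common subgroup quality measure that is assumed here and not taken from the abstract. The function names, the `max_len` parameter, and the toy data are all hypothetical.

```python
from itertools import combinations


def wracc(rows, target, conditions):
    """Weighted relative accuracy of the subgroup described by `conditions`.

    WRAcc = coverage * (P(target | subgroup) - P(target)).
    `target` is an (attribute, value) pair; `conditions` is a tuple of such pairs.
    """
    t_attr, t_val = target
    n = len(rows)
    covered = [r for r in rows if all(r[a] == v for a, v in conditions)]
    if not covered:
        return 0.0
    p_target = sum(1 for r in rows if r[t_attr] == t_val) / n
    p_target_sub = sum(1 for r in covered if r[t_attr] == t_val) / len(covered)
    return (len(covered) / n) * (p_target_sub - p_target)


def exhaustive_subgroups(rows, target, max_len=2):
    """Score every description with up to `max_len` conditions (assumed limit)."""
    attrs = sorted({a for r in rows for a in r if a != target[0]})
    items = sorted({(a, r[a]) for r in rows for a in attrs})
    results = []
    for k in range(1, max_len + 1):
        for conds in combinations(items, k):
            # Skip descriptions that constrain the same attribute twice.
            if len({a for a, _ in conds}) < k:
                continue
            results.append((wracc(rows, target, conds), conds))
    # Highest-quality subgroups first.
    return sorted(results, key=lambda t: t[0], reverse=True)


# Hypothetical toy dataset; the variable of interest is play == "yes".
rows = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "yes", "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
    {"outlook": "rainy", "windy": "no", "play": "no"},
]
ranked = exhaustive_subgroups(rows, ("play", "yes"))
best_score, best_conds = ranked[0]
```

The number of candidate descriptions grows combinatorially with the number of attribute-value pairs, which is why the abstract's search spaces of more than $10^{15}$ subgroups call for a distributed approach rather than a loop like this one.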
