
Exhaustive search algorithms to mine subgroups on Big Data using Apache Spark

F. Padillo, J. Luna, and S. Ventura. Progress in Artificial Intelligence, (2017)
DOI: 10.1007/s13748-017-0112-x

Abstract

Subgroup discovery is a well-known technique for the extraction of patterns with respect to a variable of interest in the data. However, the explosion in data gathering has hampered the performance of traditional algorithms to discover interesting relationships between different objects in a set with respect to a specific property which is of interest to the user. In this regard, our goal is to propose a set of efficient techniques to mine subgroups on Big Data by means of Apache Spark. On this matter, AprioriK-SD-OE and PFP-SD-OE are proposed as fast exhaustive search algorithms to discover subgroups on Big Data using Apache Spark. The experimental study includes more than 70 datasets considering search spaces bigger than $10^{15}$ subgroups. The scalability of our proposals is analyzed by considering datasets with 200 million instances, demonstrating the usefulness of using Spark to tackle Big Data.

Description

Exhaustive search algorithms to mine subgroups on Big Data using Apache Spark | SpringerLink


Cite this publication

@article{padillo2017exhaustive,
  abstract = {Subgroup discovery is a well-known technique for the extraction of patterns with respect to a variable of interest in the data. However, the explosion in data gathering has hampered the performance of traditional algorithms to discover interesting relationships between different objects in a set with respect to a specific property which is of interest to the user. In this regard, our goal is to propose a set of efficient techniques to mine subgroups on Big Data by means of Apache Spark. On this matter, AprioriK-SD-OE and PFP-SD-OE are proposed as fast exhaustive search algorithms to discover subgroups on Big Data using Apache Spark. The experimental study includes more than 70 datasets considering search spaces bigger than $10^{15}$ subgroups. The scalability of our proposals is analyzed by considering datasets with 200 million instances, demonstrating the usefulness of using Spark to tackle Big Data.},
  added-at = {2017-02-01T09:17:54.000+0100},
  author = {Padillo, F. and Luna, J. M. and Ventura, S.},
  biburl = {https://www.bibsonomy.org/bibtex/298dc9843d08ee2d2e0141617ddb2c33d/becker},
  description = {Exhaustive search algorithms to mine subgroups on Big Data using Apache Spark | SpringerLink},
  doi = {10.1007/s13748-017-0112-x},
  interhash = {e2d9e11931a577ba376ca8b91c263435},
  intrahash = {98dc9843d08ee2d2e0141617ddb2c33d},
  issn = {2192-6360},
  journal = {Progress in Artificial Intelligence},
  keywords = {distributed emm mapreduce parallel spark subgroup subgroups},
  pages = {1--14},
  timestamp = {2017-02-01T09:17:54.000+0100},
  title = {Exhaustive search algorithms to mine subgroups on Big Data using Apache Spark},
  url = {http://dx.doi.org/10.1007/s13748-017-0112-x},
  year = 2017
}
