@macek

Pfp: parallel fp-growth for query recommendation

Proceedings of the 2008 ACM conference on Recommender systems, pages 107--114. New York, NY, USA, ACM, (2008)
DOI: 10.1145/1454008.1454027

Abstract

Frequent itemset mining (FIM) is a useful tool for discovering frequently co-occurrent items. Since its inception, a number of significant FIM algorithms have been developed to speed up mining performance. Unfortunately, when the dataset size is huge, both the memory use and computational cost can still be prohibitively expensive. In this work, we propose to parallelize the FP-Growth algorithm (we call our parallel algorithm PFP) on distributed machines. PFP partitions computation in such a way that each machine executes an independent group of mining tasks. Such partitioning eliminates computational dependencies between machines, and thereby communication between them. Through empirical study on a large dataset of 802,939 Web pages and 1,021,107 tags, we demonstrate that PFP can achieve virtually linear speedup. Besides scalability, the empirical study demonstrates PFP to be promising for supporting query recommendation for search engines.
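The partitioning idea in the abstract can be illustrated with a minimal sketch. The abstract gives no implementation details, so everything below is an assumption made for illustration: the function names `group_dependent_shards` and `mine_shard` are hypothetical, items are assigned to groups round-robin by frequency rank, and a brute-force subset count stands in for FP-Growth inside each group. The point is only that each shard contains everything its group needs, so groups can be mined with no communication between them.

```python
from collections import Counter, defaultdict
from itertools import combinations


def group_dependent_shards(transactions, min_support, num_groups):
    """Split transactions into per-group shards that can be mined independently."""
    counts = Counter(item for t in transactions for item in set(t))
    # Keep frequent items, ordered by descending support (ties broken by name).
    frequent = sorted((i for i, c in counts.items() if c >= min_support),
                      key=lambda i: (-counts[i], i))
    rank = {item: r for r, item in enumerate(frequent)}
    # Assumption: round-robin assignment of items to groups by rank.
    group_of = {item: r % num_groups for item, r in rank.items()}

    shards = defaultdict(list)
    for t in transactions:
        ordered = sorted((i for i in set(t) if i in rank), key=rank.get)
        seen = set()
        # Scan right to left; for each group, emit the prefix ending at the
        # rightmost item of that group. Each transaction contributes at most
        # one prefix per group.
        for pos in range(len(ordered) - 1, -1, -1):
            g = group_of[ordered[pos]]
            if g not in seen:
                seen.add(g)
                shards[g].append(ordered[:pos + 1])
    return shards, group_of


def mine_shard(group, shard, group_of, min_support):
    """Count itemsets whose least frequent item belongs to `group`.

    Brute-force enumeration is used here in place of FP-Growth; it is only
    practical for tiny inputs.
    """
    support = Counter()
    for t in shard:
        for size in range(1, len(t) + 1):
            # combinations() preserves input order, so itemset[-1] is the
            # least frequent item of the itemset.
            for itemset in combinations(t, size):
                if group_of[itemset[-1]] == group:
                    support[itemset] += 1
    return {s: c for s, c in support.items() if c >= min_support}


if __name__ == "__main__":
    txns = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a", "b", "c"]]
    shards, group_of = group_dependent_shards(txns, min_support=2, num_groups=2)
    merged = {}
    for g, shard in shards.items():  # each iteration could run on its own machine
        merged.update(mine_shard(g, shard, group_of, min_support=2))
    for itemset, count in sorted(merged.items()):
        print(itemset, count)
```

Every itemset is owned by exactly one group (the group of its least frequent item), so the per-group results can be merged by a plain union without reconciling counts.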

Description

CiteULike: Pfp: parallel fp-growth for query recommendation
