Article

Fast Detection of Connected Components in Large Scale Graphs Using MapReduce

IOSR Journal of Engineering (IOSRJEN), (February 2014)

Abstract

Finding the connected components of a graph is a fundamental problem in graph theory that arises in many applications, including data mining and network analysis. With the increasing popularity of social networks and information systems, real-world graphs have grown to billions of nodes and edges. As a result, finding the connected components of large-scale graphs has become a computationally challenging task. For this reason, several recent works have addressed the problem using MapReduce, the well-known framework for distributed large-scale data processing. However, their performance is not acceptable, and there is still great potential for improvement. In this paper, we introduce a new approach for finding the connected components of large-scale graphs using the MapReduce framework. Experiments on real-world datasets show that the new algorithm yields significant performance improvements. We also explain that the main idea of our algorithm is based on a general theory for effectively utilizing the computational resources provided by the nodes of a MapReduce cluster in order to reduce communication and I/O load.
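For readers unfamiliar with the problem setting, the following is a minimal single-machine sketch of how connected components can be computed in a MapReduce style. It is not the algorithm proposed in the paper, which the abstract does not describe; it is the generic minimum-label propagation baseline, and the names `map_phase`, `reduce_phase`, and `connected_components` are hypothetical. Each round stands in for one MapReduce job, so the repeated passes over the whole edge list hint at the communication and I/O load that the abstract refers to.

```python
from collections import defaultdict


def map_phase(edges, labels):
    """Emit (node, candidate_label) pairs.

    Every node proposes its own current label to itself, and every edge
    passes each endpoint's label to the other endpoint.
    """
    for node, label in labels.items():
        yield node, label
    for u, v in edges:
        yield u, labels[v]
        yield v, labels[u]


def reduce_phase(pairs):
    """Group candidates by node (the shuffle) and keep the smallest label."""
    grouped = defaultdict(list)
    for node, label in pairs:
        grouped[node].append(label)
    return {node: min(candidates) for node, candidates in grouped.items()}


def connected_components(nodes, edges):
    """Repeat map/reduce rounds until no label changes.

    Returns the final labelling and the number of rounds executed.
    Illustrative baseline only; assumes an undirected graph.
    """
    labels = {n: n for n in nodes}  # each node starts as its own component
    rounds = 0
    while True:
        new_labels = reduce_phase(map_phase(edges, labels))
        rounds += 1
        if new_labels == labels:
            return labels, rounds
        labels = new_labels


if __name__ == "__main__":
    nodes = [1, 2, 3, 4, 5, 6]
    edges = [(1, 2), (2, 3), (4, 5)]
    labels, rounds = connected_components(nodes, edges)
    print(labels, rounds)
```

In this sketch, nodes that end up with the same label belong to the same component. The number of rounds grows with the graph's diameter, which is one reason such a baseline can be slow on large graphs.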
