Efficient FP Growth using Hadoop - (Improved Parallel FP-Growth)

IJSRP, Volume 4, Issue 7, July 2014 Edition [ISSN 2250-3153]

Sankalp Mitra, Suchit Bande, Shreyas Kudale, Advait Kulkarni, Asst. Prof. Leena A. Deshpande

Abstract: As an important part of discovering association rules, frequent itemset mining plays a key role in mining associations, correlations, causality, and other important data mining tasks. Traditional frequent itemset mining algorithms handle datasets made up of massive numbers of small files poorly: they suffer from high memory cost, high I/O overhead, and low computing performance. In this paper we propose an improved Parallel FP-Growth (IPFP) algorithm and discuss its applications. In particular, we introduce a small-files processing strategy for such datasets to compensate for the low read/write speed and low processing efficiency of Hadoop on small files. Moreover, we use MapReduce to parallelize the FP-Growth algorithm, thereby improving the overall performance of frequent itemset mining. The experimental results show that the IPFP algorithm is feasible and valid, with good speedup and higher mining efficiency, and can meet the rapidly growing need for frequent itemset mining on massive small-file datasets.
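
The two ideas named in the abstract, packing many small files into larger input splits and counting item support with map and reduce steps, can be illustrated with a minimal single-process sketch. This is not the authors' IPFP implementation: the function names (`pack_small_files`, `map_count`, `reduce_counts`, `frequent_items`), the greedy packing rule, and the `split_size` parameter are assumptions made for illustration, and only the first pass of a parallel FP-Growth (finding frequent single items) is shown, without Hadoop.

```python
from collections import Counter
from functools import reduce
from itertools import chain


def pack_small_files(files, split_size):
    """Greedily merge (name, transactions) pairs into splits.

    Hypothetical stand-in for a small-files strategy: a split is closed
    when adding the next file would push it past split_size transactions.
    """
    splits, current, count = [], [], 0
    for _name, transactions in files:
        if current and count + len(transactions) > split_size:
            splits.append(current)
            current, count = [], 0
        current.extend(transactions)
        count += len(transactions)
    if current:
        splits.append(current)
    return splits


def map_count(split):
    """Map step: per-split support counts (each item counted once per transaction)."""
    return Counter(chain.from_iterable(set(t) for t in split))


def reduce_counts(left, right):
    """Reduce step: sum the partial counts from two splits."""
    return left + right


def frequent_items(files, split_size, min_support):
    """Return items whose total support is at least min_support."""
    splits = pack_small_files(files, split_size)
    totals = reduce(reduce_counts, map(map_count, splits), Counter())
    return {item: n for item, n in totals.items() if n >= min_support}
```

In a real Hadoop job the map calls would run on separate nodes and the frequent items found here would then be used to order and shard transactions before the per-group FP-tree construction; the sketch only shows why fewer, larger splits mean fewer map tasks for the same data.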

Reference this Research Paper:

Sankalp Mitra, Suchit Bande, Shreyas Kudale, Advait Kulkarni, Asst. Prof. Leena A. Deshpande (2014); Efficient FP Growth using Hadoop - (Improved Parallel FP-Growth); Int J Sci Res Publ 4(7) (ISSN: 2250-3153). http://www.ijsrp.org/research-paper-0714.php?rp=P312905