A Novel approach for k-means++ approximation using hadoop

IJSRP, Volume 5, Issue 12, December 2015 Edition [ISSN 2250-3153]

Prajakta chandgude, Ashwini Bhagwat, Mayuri Autade, Anjali Pansare

Abstract: k-means is one of the most widely used clustering algorithms because it is simple to understand and efficient. However, the algorithm is highly sensitive to the initial centers chosen, so a proper initialization, which is needed to obtain an ideal solution, is hard to achieve. To overcome this problem, k-means++ chooses the centers one by one so as to achieve a solution close to the optimal one. Because of its limited scalability, k-means++ becomes inefficient as the size of the data increases. To improve its scalability and efficiency, we use MapReduce together with the k-means++ method, reducing the number of MapReduce jobs by using only a single job to obtain the k centers. In this approach, the standard k-means++ initialization algorithm runs in the first phase, the Mapper phase, and the weighted k-means++ initialization algorithm runs in the second phase, the Reducer phase. Because this MapReduce k-means++ method replaces iterations across multiple machines with iterations on a single machine, it can significantly reduce communication and I/O costs.
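
The two phases described in the abstract can be sketched in plain Python, without Hadoop, to show the data flow. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`weighted_kmeanspp`, `mapper`, `reducer`, `mapreduce_kmeanspp_init`) are hypothetical, and the weight emitted with each mapper center is assumed to be the number of points in that split closest to it, which the abstract does not specify.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def weighted_kmeanspp(points, weights, k, rng):
    """Choose up to k centers; each draw is proportional to weight * D(x)^2."""
    # First center: sampled in proportion to the weights alone.
    centers = [rng.choices(points, weights=weights, k=1)[0]]
    # d2[i] is the squared distance from points[i] to its nearest center so far.
    d2 = [sq_dist(p, centers[0]) for p in points]
    while len(centers) < k:
        scores = [w * d for w, d in zip(weights, d2)]
        if sum(scores) == 0:  # every point already coincides with a center
            break
        c = rng.choices(points, weights=scores, k=1)[0]
        centers.append(c)
        d2 = [min(d, sq_dist(p, c)) for p, d in zip(points, d2)]
    return centers


def mapper(split, k, rng):
    """Mapper phase: plain k-means++ on one split (all weights equal to 1)."""
    centers = weighted_kmeanspp(split, [1] * len(split), k, rng)
    # Assumption: a center's weight is the number of split points nearest to it.
    counts = [0] * len(centers)
    for p in split:
        nearest = min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
        counts[nearest] += 1
    return list(zip(centers, counts))


def reducer(pairs, k, rng):
    """Reducer phase: weighted k-means++ over all centers emitted by mappers."""
    points = [p for p, _ in pairs]
    weights = [w for _, w in pairs]
    return weighted_kmeanspp(points, weights, k, rng)


def mapreduce_kmeanspp_init(splits, k, seed=0):
    """Simulate the single job: every mapper runs once, then one reducer."""
    rng = random.Random(seed)
    emitted = []
    for split in splits:
        emitted.extend(mapper(split, k, rng))
    return reducer(emitted, k, rng)
```

In this sketch each mapper sends only k weighted points to the reducer, so the reducer works on at most k times the number of splits rather than on the full data set, which is where the reduction in communication and I/O would come from.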

Reference this Research Paper (copy & paste below code):

Prajakta chandgude, Ashwini Bhagwat, Mayuri Autade, Anjali Pansare (2018); A Novel approach for k-means++ approximation using hadoop; Int J Sci Res Publ 5(12) (ISSN: 2250-3153). http://www.ijsrp.org/research-paper-1215.php?rp=P484884