IJSRP, Volume 6, Issue 1, January 2016 Edition [ISSN 2250-3153]
Pratiksha D. Mandal, Madhuri S. Kadam, Sayali R. Kakade, Monali J. Reddy, Guided By - Prof. Amar More
Whatever industry you are in, being able to obtain information from the analysis of data coming from a wide variety of sources can help you make better decisions. In 2004 Google introduced MapReduce, a programming framework for processing large datasets across distributed systems. MapReduce was popularized by the open source Apache Hadoop framework. In 2009 Amazon introduced Elastic MapReduce (EMR), which processes large datasets efficiently using the Apache Hadoop framework on the fly. It allows customers to write their MapReduce applications without dealing with hardware, network, or Hadoop configuration. Users only need to submit their Map and Reduce functions along with the required number of nodes, and in return they receive the processed data according to the specifications given in the application. The drawback of Amazon EMR is that it runs only on computing resources provided by Amazon datacenters, at a cost. The goal of this project is to create an open source counterpart of Amazon Elastic MapReduce by adding the features of Elastic MapReduce to an open source private cloud.
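To make the submission model concrete, the following is a minimal single-process sketch of what a user supplies (a Map function, a Reduce function, and a node count) and what comes back. It is an illustration only, not the project's implementation and not the Amazon EMR or Hadoop API: the names `map_fn`, `reduce_fn`, and `run_job` are hypothetical, word count is an assumed example workload, and the "nodes" are simulated as in-memory partitions.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, Tuple


def map_fn(line: str) -> Iterator[Tuple[str, int]]:
    """Hypothetical user-supplied Map function: emit (word, 1) per word."""
    for word in line.lower().split():
        yield word, 1


def reduce_fn(key: str, values: Iterable[int]) -> Tuple[str, int]:
    """Hypothetical user-supplied Reduce function: sum the counts for a word."""
    return key, sum(values)


def run_job(
    records: List[str],
    mapper: Callable[[str], Iterator[Tuple[str, int]]],
    reducer: Callable[[str, Iterable[int]], Tuple[str, int]],
    num_nodes: int,
) -> Dict[str, int]:
    """Simulate a MapReduce job locally (assumption: no real cluster)."""
    if num_nodes < 1:
        raise ValueError("num_nodes must be at least 1")

    # Split the input round-robin into one partition per simulated node.
    partitions = [records[i::num_nodes] for i in range(num_nodes)]

    # Map phase followed by shuffle: group every emitted value by its key.
    grouped: Dict[str, List[int]] = defaultdict(list)
    for partition in partitions:
        for record in partition:
            for key, value in mapper(record):
                grouped[key].append(value)

    # Reduce phase: one reducer call per distinct key.
    return dict(reducer(key, values) for key, values in sorted(grouped.items()))


if __name__ == "__main__":
    lines = ["big data needs big clusters", "data drives decisions"]
    print(run_job(lines, map_fn, reduce_fn, num_nodes=2))
```

In a real deployment the partitions would be processed on separate machines and the shuffle would move data over the network; the sketch only shows the contract between the user's two functions and the framework.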