IJSRP, Volume 4, Issue 1, January 2014 Edition [ISSN 2250-3153]
Big data is characterized by the increasing volume and velocity of data. IBM estimates that 2.5 quintillion bytes of data are created every day, so much that 90% of the data in the world today has been created in the last two years. Common example domains are the traditional data-intensive sciences, such as astronomy, high-energy physics, meteorology, genomics, and biological and environmental research, in which petabytes and exabytes of data are generated. In these domains, even capturing and storing the data is a challenge. Google implemented hundreds of special-purpose computations that process large amounts of raw data, such as crawled documents and Web request logs, to compute various kinds of derived data, such as inverted indices, various representations of the graph structure of Web documents, summaries of the number of pages crawled per host, and the set of most frequent queries in a given day. This paper analyzes how big data has evolved from the past to the present and where it is heading in the future. To address the problem space of unstructured analytics, MapReduce with the Hadoop Distributed File System (HDFS) is also discussed. Some of the tools and techniques available for processing terabytes of data efficiently on a daily basis are listed, along with the challenges, issues, and benefits of big data.