A Survey of Big Data Processing in Perspective of Hadoop and MapReduce
Pages: 602-606
Abstract
Big Data refers to data whose scale, diversity, and complexity require new architectures, techniques, algorithms, and analytics to manage it and to extract value and hidden knowledge from it. Hadoop is the core platform for structuring Big Data, and it solves the problem of making such data useful for analytics. Hadoop is an open-source software project that enables the distributed processing of large data sets across clusters of commodity servers. It is designed to scale from a single server to thousands of machines, with a very high degree of fault tolerance. Hadoop MapReduce is the implementation of the MapReduce programming model developed and maintained by the Apache Hadoop project. MapReduce is a programming model for processing large data sets with a parallel, distributed algorithm on a cluster. This paper presents a survey of Big Data processing from the perspective of Hadoop and MapReduce.
Keywords: Big Data, Hadoop, MapReduce, HDFS.
Article published in International Journal of Current Engineering and Technology, Vol. 4, No. 2 (April 2014).
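To make the MapReduce programming model described in the abstract concrete, the sketch below shows the canonical word-count job written against the Hadoop MapReduce Java API: the mapper emits a (word, 1) pair for each token, and the reducer sums the counts for each word. This example is illustrative only and is not taken from the surveyed paper; the class names and the use of command-line arguments for the input and output paths are conventional choices, not the paper's code.

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Map phase: tokenize each input line and emit (word, 1) for every token.
  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {

    private final static IntWritable one = new IntWritable(1);
    private final Text word = new Text();

    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  // Reduce phase: the framework groups values by key, so each call
  // receives one word and all of its counts; sum them to get the total.
  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {

    private final IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    // The reducer also serves as a combiner, pre-aggregating on each node
    // to cut the data shuffled across the cluster.
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    // Input and output HDFS paths are taken from the command line.
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

Packaged into a jar, such a job would typically be submitted with the standard hadoop jar command, e.g. hadoop jar wordcount.jar WordCount <input dir> <output dir> (the jar name here is hypothetical). Because the mapper and reducer run in parallel across the cluster and HDFS supplies fault-tolerant storage, the same small program scales from a single machine to thousands of nodes, which is the property the abstract emphasizes.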