IJSRP, Volume 3, Issue 5, May 2013 Edition [ISSN 2250-3153]
Mrudula Varade, Vimla Jethani
A Hadoop Distributed File System (HDFS) is designed to store very large data sets reliably and to stream those data sets at high bandwidth to user applications. Metadata management is critical to distributed file system. In HDFS architecture, a single master server manages all metadata, while a number of data servers store file data. This architecture cannot meet the exponentially increased storage demand in cloud computing, as the single master server may become a performance bottleneck. Comparative study of a metadata management scheme is done. There is three of techniques sub-tree partitioning, hashing and consistent hashing of metadata management scheme. Out of these three schemes consistent hashing is the best techniques which employs multiple NameNodes, and divides the metadata into buckets which can be dynamically migrated among NameNodes according to system workloads. To maintain reliability, metadata is replicated in different NameNodes with log replication technology, and Paxos algorithm is adopted to keep replication consistency.