Replication causes data redundancy, so why is it pursued in HDFS?
Answer / Rakesh Kumar Gupta
Replication is pursued in HDFS to provide high availability and fault tolerance. Although replication leads to data redundancy, it ensures that the system keeps functioning even if one or more DataNodes fail: each block is stored (by default) on three different DataNodes, and when a node goes down the NameNode re-replicates its blocks from the surviving copies. Replication also helps data locality, since MapReduce and other frameworks can schedule a task on whichever node already holds a copy of the block.
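The cluster-wide replication factor is set by the `dfs.replication` property in `hdfs-site.xml`. A minimal sketch, assuming a factor of 2 is chosen for a space-constrained cluster (HDFS ships with a default of 3):

```xml
<!-- hdfs-site.xml: default replication factor for newly written files.
     The value 2 below is an illustrative assumption; the HDFS default is 3. -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
</configuration>
```

The factor of an existing file can also be changed from the shell with `hdfs dfs -setrep -w 2 /path/to/file`, where `-w` waits until re-replication completes.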
What is a block in HDFS, why block size 64MB?
What is the optimal block size in HDFS?
What is the difference between an input split and an HDFS block?
Why HDFS performs replication, although it results in data redundancy?
What do you mean by the High Availability of a NameNode in Hadoop HDFS?
How to use the hdfs put command for data transfer from Flume to HDFS?
Why is block size large in Hadoop?
Define data integrity?
What is a block in HDFS? what is the default size in Hadoop 1 and Hadoop 2? Can we change the block size?
What is Fault Tolerance in HDFS?
How to Delete file from HDFS?
What are the main properties of hdfs-site.xml file?