What is the difference between a MapReduce InputSplit and an HDFS block?
Answer / Vinita Chaudhary
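An HDFS block is a physical unit of storage: the Hadoop Distributed File System cuts each file into fixed-size blocks (128 MB by default in Hadoop 2.x and later) and stores and replicates them on DataNodes without regard to the file's contents. An InputSplit, on the other hand, is a logical unit of work: it holds no data itself, only a description of the byte range (and preferred host locations) that a single mapper should read. MapReduce creates one map task per InputSplit. One or more HDFS blocks can contribute to an InputSplit, and the size of an InputSplit does not necessarily match the size of an HDFS block; by default the two are equal, but the job's minimum and maximum split-size settings can make a split larger or smaller than a block.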
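To make the size relationship concrete, here is a minimal sketch in plain Java (no Hadoop dependency) of how file-based input formats typically derive split sizes from the block size. The class and method names (`SplitSketch`, `splitLengths`) are hypothetical and only for illustration; the formula and the 10% slop factor are modeled on what Hadoop's `FileInputFormat` does, but treat this as an approximation rather than the actual implementation. The minimum and maximum values correspond to the job settings `mapreduce.input.fileinputformat.split.minsize` and `mapreduce.input.fileinputformat.split.maxsize`.

```java
import java.util.ArrayList;
import java.util.List;

public class SplitSketch {
    // Split size is the block size, clamped between the configured minimum and maximum.
    static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    // Hypothetical helper: lengths of the logical splits for one file.
    // A trailing remainder of up to 10% of the split size is folded into the last split.
    static List<Long> splitLengths(long fileLength, long splitSize) {
        final double splitSlop = 1.1;
        List<Long> lengths = new ArrayList<>();
        long remaining = fileLength;
        while ((double) remaining / splitSize > splitSlop) {
            lengths.add(splitSize);
            remaining -= splitSize;
        }
        if (remaining != 0) {
            lengths.add(remaining);
        }
        return lengths;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long blockSize = 128 * mb;

        // Default settings: split size equals block size.
        long defaultSplit = computeSplitSize(blockSize, 1L, Long.MAX_VALUE);
        System.out.println(splitLengths(300 * mb, defaultSplit).size()); // 3 splits: 128, 128, 44 MB

        // Raising the minimum to 256 MB makes one split span two blocks.
        long bigSplit = computeSplitSize(blockSize, 256 * mb, Long.MAX_VALUE);
        System.out.println(splitLengths(300 * mb, bigSplit).size()); // 2 splits: 256, 44 MB

        // A 130 MB file occupies two blocks but fits in a single split because of the slop.
        System.out.println(splitLengths(130 * mb, defaultSplit).size()); // 1 split
    }
}
```

In the last case the file is stored as two HDFS blocks (128 MB and 2 MB) yet is processed by a single mapper, which illustrates that blocks describe how data is stored while splits describe how work is divided.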