Since data is replicated three times in HDFS, does that mean any computation done on one node is also replicated on the other two?
If a particular file is only 50 MB, will the HDFS block still consume 64 MB (the default block size)?
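One way to check this empirically is to compare a file's length with its configured block size through the Hadoop FileSystem API. A minimal sketch (the path /data/sample-50mb.txt is a hypothetical example):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BlockVsFileSize {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        // Hypothetical path used for illustration:
        FileStatus st = fs.getFileStatus(new Path("/data/sample-50mb.txt"));
        System.out.println("File length (bytes): " + st.getLen());
        System.out.println("Block size (bytes):  " + st.getBlockSize());
    }
}
```

A 50 MB file stored with a 64 MB block size occupies a single block, but each replica consumes only about 50 MB of physical disk; HDFS blocks are not padded out to the full block size.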
What is the difference between an HDFS block and an input split?
Explain HDFS.
Explain the key features of HDFS.
How does HDFS achieve high throughput?
How is indexing done in HDFS?
Explain the difference between the MapReduce engine and an HDFS cluster.
What do you know about SequenceFileInputFormat?
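As context, here is a minimal driver sketch showing where SequenceFileInputFormat (the input format for flat files of binary key/value pairs) plugs into a job; the job name and paths are illustrative:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class SeqFileDriver {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "seqfile-demo");
        job.setJarByClass(SeqFileDriver.class);
        // Treat the input as SequenceFiles instead of plain text:
        job.setInputFormatClass(SequenceFileInputFormat.class);
        FileInputFormat.addInputPath(job, new Path("/in"));    // hypothetical path
        FileOutputFormat.setOutputPath(job, new Path("/out")); // hypothetical path
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```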
Why can't we do aggregation (e.g., addition) in the mapper? Why do we need a reducer for that?
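A mapper only ever sees the records of its own input split, so it cannot compute a global total; local pre-aggregation is typically delegated to a combiner, with the reducer producing the final result across all mappers. A sketch of the familiar word-count pattern, reusing the reducer class as the combiner:

```java
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

public class WordCountWithCombiner {
    public static class TokenMapper
            extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();
        @Override
        protected void map(LongWritable key, Text value, Context ctx)
                throws IOException, InterruptedException {
            // Each mapper sees only its own split, so it can only emit
            // per-record partial counts, never a global sum.
            for (String tok : value.toString().split("\\s+")) {
                word.set(tok);
                ctx.write(word, ONE);
            }
        }
    }

    public static class SumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> vals, Context ctx)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : vals) sum += v.get();
            ctx.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance();
        job.setJarByClass(WordCountWithCombiner.class);
        job.setMapperClass(TokenMapper.class);
        job.setCombinerClass(SumReducer.class); // local, per-mapper aggregation
        job.setReducerClass(SumReducer.class);  // global aggregation across mappers
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        // ... add input/output paths before submitting
    }
}
```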
Why the name ‘Hadoop’?
What do you know about NLineInputFormat?
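A minimal sketch of how one might use it; the value of 100 lines per split is an arbitrary example:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.NLineInputFormat;

public class NLineDriver {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "nline-demo");
        // Each input split (and therefore each mapper) gets exactly N lines:
        job.setInputFormatClass(NLineInputFormat.class);
        NLineInputFormat.setNumLinesPerSplit(job, 100);
        // ... set mapper, reducer, and input/output paths as usual
    }
}
```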
How can we change the input split size if our commodity hardware has limited storage space?
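One common approach is to cap the maximum split size on the job, which makes each map task handle less data at once. A sketch, where the 32 MB cap is an arbitrary example rather than a recommendation:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;

public class SplitSizeDriver {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "split-size-demo");
        // Cap each input split at 32 MB (example value):
        FileInputFormat.setMaxInputSplitSize(job, 32L * 1024 * 1024);
        // Equivalent configuration property:
        //   mapreduce.input.fileinputformat.split.maxsize=33554432
    }
}
```

The HDFS block size itself is a separate setting (dfs.blocksize in hdfs-site.xml) and can also be lowered on storage-constrained hardware.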
How is Hadoop different from other data processing tools?