What is the difference between NAS (network-attached storage) and HDFS?
What is a rack awareness algorithm?
What is the problem with having lots of small files in HDFS?
Why is the rack awareness algorithm used in Hadoop?
Can you change the block size of HDFS files?
What is an identity mapper and identity reducer?
What are the advantages of using MapReduce with Hadoop?
What do you know about NLineInputFormat?
Why is the output of map tasks stored (spilled) on local disk and not in HDFS?
What is the role of RecordReader in Hadoop MapReduce?
What happens when the node running the map task fails before the map output has been sent to the reducer?
What is speculative execution?
Is it legal to set the number of reduce tasks to zero? Where will the output be stored in this case?
What are the advantages of using a map-side join in MapReduce?
What is a map-side join?