What is the difference between an input split and an HDFS block?
Replication causes data redundancy, so why is it pursued in HDFS?
Since data is replicated three times in HDFS, does that mean any computation done on one node is also repeated on the other two?
If a particular file is 50 MB, will the HDFS block still consume 64 MB, the default size?
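A point worth knowing for this question: HDFS blocks are not preallocated, so a 50 MB file stored under a 64 MB block size occupies only about 50 MB on disk; the block size is just a per-file upper bound. A minimal sketch of where that bound is configured (assuming a Hadoop 2.x+ cluster, where the property is `dfs.blocksize` in `hdfs-site.xml`; the value shown is illustrative):

```xml
<!-- hdfs-site.xml: dfs.blocksize sets the maximum size of each HDFS block.
     A 50 MB file still occupies a single ~50 MB block on disk,
     not the full 64 MB. -->
<property>
  <name>dfs.blocksize</name>
  <value>67108864</value> <!-- 64 * 1024 * 1024 bytes = 64 MB -->
</property>
```

Note that newer Hadoop releases default to 128 MB; the 64 MB figure in the question reflects the older default.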
Explain HDFS.
Explain the key features of HDFS.
How does HDFS achieve high throughput?
Explain how indexing is done in HDFS.
Explain the difference between the MapReduce engine and the HDFS cluster.
What is big data?
Give some examples of big data.
Give a detailed overview of the big data generated by Facebook.
What are the three characteristics of big data according to IBM?