What is HDFS High Availability?
VirtualBox & Ubuntu Installation?
What are the sources generating big data?
What is Apache Flume?
What will you do when NameNode is down?
What do you know about collaborative filtering?
How is indexing done in HDFS?
On what basis does the NameNode decide which DataNode to write to?
What are combiners and what is their purpose?
How do you define "block" in HDFS?
How is the distance between two nodes defined in Hadoop?
What are some of the interesting facts about Big Data?
What are active and passive "NameNodes"?
When is Hive run in embedded mode?
How does the JobTracker schedule a task?