What do you know about KeyValueTextInputFormat?
How is 0xdata's H2O different from Apache Mahout?
What are the problems with small files in HDFS?
What is a Spark driver application?
What is HBase fsck (hbck)?
What is Cassandra?
What are a cluster, a node, and a keyspace in Cassandra?
How would you modify that solution to count only the number of unique words across all the documents?
What does the "USE" command in Hive do?
On which hosts does Impala run?
What are IdentityMapper and IdentityReducer in MapReduce?
What are combiners, and what is their purpose?
What is Fault Tolerance?
Do we need Scala for Spark?
Explain the concept of tunable consistency in Cassandra.