Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
What does metastore mean in Hive?
What is the fundamental difference between a MapReduce InputSplit and an HDFS block?
What are the drawbacks of Apache Spark?
Is Spark better than MapReduce?
What is the relationship between Hadoop, HBase, Hive, and Cassandra?
How would you tackle counting words in several text documents?
What is rack awareness in Hadoop?
Assume that the HBase table Student is disabled. How can you access the Student table with the Scan command once it is disabled?
Compare Spark with Hadoop MapReduce.
How can you manually partition an RDD?
Differentiate between the terms node, cluster, and data center in Cassandra.
What is an offset?
What are the identity mapper and identity reducer? In which cases can we use them?
What is a Sqoop job?
Why do we need indexing?