Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
How do you integrate Spark and Hive?
What is the difference between an HDFS block and an input split?
After the Map phase finishes, the Hadoop framework performs partitioning, shuffle, and sort. What happens in this phase?
What are tokens in Cassandra?
Is the Secondary NameNode a substitute for the NameNode?
Can you describe accumulators in Apache Spark in detail?
Can you explain logistic regression?
What are the relational operations in Pig Latin?
How do I download Apache Mahout?
What are the key differences between Pig and MapReduce?
What is the difference between groupByKey and reduceByKey in Apache Spark?
How to write a custom partitioner for a Hadoop MapReduce job?
Does Spark load all data into memory?
What is the difference between an RDD and a DataFrame in Spark?
How can you override the replication factor in HDFS?