What is the fundamental difference between a MapReduce InputSplit and an HDFS block?
What is a bag in Pig?
How can you set an arbitrary number of mappers to be created for a job in Hadoop?
Which databases are supported by Hive?
What are the use cases of Impala?
What are accumulators in Spark?
Why do we use Flume?
What daemons run on master nodes?
What is WebDAV in Hadoop?
What is a pipelined RDD?
What is a column-store database? Explain with an example.
Explain the countByValue() operation on an Apache Spark RDD.
How does speculative execution work in Hadoop?
How much faster is Apache Spark than Hadoop?
What are the three different modes in which Hive can be run?