Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
Is bigger than spark.driver.maxResultSize?
When to use secondary indexes?
What is replication in Kafka?
What is the difference between an external table and a managed table?
Why do we need password-less SSH in a fully distributed environment?
What are some of the most notable applications of Kafka?
What is Flume?
Does Apache Spark provide checkpoints?
What is InputSplit and RecordReader?
What is the maximum size of a message that Kafka can receive?
What are the core components of Flume?
What are ‘maps’ and ‘reduces’?
What is an RDD lineage graph? How is it useful in achieving fault tolerance?
What features do Pig and Hive have in common?
What is a Spark RDD?