Is Apache Flume a real-time processing framework?
Explain the process to trigger automatic clean-up in Spark to manage accumulated metadata.
What is the DISTINCT operator in Impala?
What is closing out ledgers?
Explain how ‘map’ and ‘reduce’ work.
What is the importance of the driver in Hive?
Explain how a message is consumed by a consumer in Kafka.
What is partitioning in MapReduce?
Can we broadcast an RDD?
How do you explain Big Data developer projects?
What is the maximum number of rows in a table?
What is SerDe in Apache Hive?
What is the difference between ORDER BY and SORT BY in Hive?
What is Spark deploy mode?
Can you give some examples of how Hadoop is used in a real-time environment?