Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Presto (15)
Apache Tajo (26)
Hadoop General (407)
When do you call the cleanup method?
Which one will you choose for a project: Hadoop MapReduce or Apache Spark?
Can you describe the process of creating an Ambari client?
What is Rack Awareness in Apache Hadoop?
What do you understand by a data center in Cassandra?
If I create a folder in HDFS, will there be metadata created corresponding to the folder? If yes, what will be the size of metadata created for a directory?
What is a rack awareness algorithm and why is it used in Hadoop?
Is it necessary to know Java to learn Hadoop?
Explain the different channel types in Flume. Which channel type is faster?
What is a Spark standalone cluster?
What is a data log in Kafka?
What are the prerequisites for Apache Hive installation?
When we create an RDD, does it bring the data and load it into memory?
What is a skewed join?
Is Hadoop a database?