How can we take Hadoop out of safe mode?
What is the difference between groupByKey and reduceByKey in Apache Spark?
Explain clustering in Hive.
How many daemon processes run on a Hadoop cluster?
What are the two ways to create an RDD in Spark?
Explain the key features of HDFS.
What guarantees does Kafka provide?
If certain data is reused again and again across different transformations, what should we do to improve performance?
Define compaction in HBase.
Explain various Apache Spark ecosystem components. In which scenarios can we use these components?
What is the Derby database?
Explain the role of ZooKeeper.
What are Flume and Sqoop?
Why Flume?
What are DataFrames?