What is a bloom filter and how does it help in searching rows?
What are the roles and responsibilities of worker nodes in an Apache Spark cluster? Is a worker node in Spark the same as a slave node?
What happens if an RDD partition is lost due to a worker node failure?
Is there any difference between FileSink and FileRollSink?
What is the master node in Spark?
In how many ways can we create an RDD?
How can a developer use Hive?
What are some drawbacks of using Spark?
What are the relational operations in Pig? Explain any two with examples.
What is speculative execution in Apache Hadoop MapReduce?
What is the default partitioner in Hadoop?
What is an RDD?
How does Spark differ from MapReduce? Is Spark faster than MapReduce?
Why is the number of input splits equal to the number of map tasks?
What is the JobTracker?