What are the uses and applications of Mahout?
Can we set the number of reducers to zero in MapReduce?
Hadoop achieves parallelism by dividing tasks across many nodes, so a few slow nodes can rate-limit the rest of the program. What mechanism does Hadoop provide to combat this?
What are the types of Hive DDL commands?
What does the Hive query processor do?
What is the role of coalesce() and repartition() in MapReduce?
What is structured and unstructured data?
What is the need of MapReduce?
What is HBase?
What is the use of Flume in Hadoop?
Explain the concept of a Bloom filter.
What language is Apache Spark written in?
What is Distributed Cache in Hadoop?
How does an executor work in Spark?
Can you explain how to minimize data transfers while working with Spark?