Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)

How can you reduce churn in the ISR? When does a broker leave the ISR?
What is the difference between groupByKey and reduceByKey in Apache Spark?
How can we import data from a particular row or column? What destination types are allowed in the Sqoop import command?
Which files are used by the startup and shutdown commands?
What is a broadcast variable?
What is the role of coalesce() and repartition() in MapReduce?
What is Apache Pig?
What are Spark partitions and partitioners?
What sorts of actions does the job tracker process perform?
What are the different cluster managers in Apache Spark?
Why use HBase?
What do you understand by the term "straggler"?
How do users interact with HDFS in Apache Pig?
Is Apache Spark an ETL tool?
What are the HCatLoader APIs?