Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
Who creates the DAG in Spark?
Does Apache Spark provide checkpoints?
Before deploying a Hadoop instance, what checks should you perform?
Can Ambari Python clients be used to make good use of the Ambari APIs?
What do slaves consist of?
When optimizing queries in Hive, how should tables be ordered by size in a join query?
Compare Hive, HBase, and Impala.
Can you give a detailed overview of the big data generated by Facebook?
What is the DataFrame API?
Name some complex data types that Avro supports.
How can you minimize data transfers when working with Spark?
How does a broadcast join work in Spark?
Name various types of Cluster Managers in Spark.
What is shuffling in MapReduce?
What is meant by high availability of the NameNode? How is it achieved?