Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
What are the three modes in which Hadoop can run?
What is RDD Lineage?
What is the use of checkpoints in Spark?
How can you minimize data transfers when working with Spark?
What is a broker?
What is CQL?
Explain the process to trigger automatic clean-up in Spark to manage accumulated metadata.
What is the difference between logical and physical plans?
What are the advantages of Spark over MapReduce?
What is stored in HDFS?
What is the difference between client mode and cluster mode in Spark?
What are the key features of HDFS?
What is meant by a columnar storage format?
What is the Apache Spark architecture?
What is the best practice for deploying the Secondary NameNode?