What is data integrity? How does HDFS ensure the integrity of the data blocks it stores?
What is the difference between Hive and Spark?
What is the difference between a MapReduce InputSplit and an HDFS block?
What is the use of checkpoints in Spark?
What is SequenceFileInputFormat in Hadoop?
What is the use of the import-all-tables command in Sqoop?
What is spark.executor.memory in a Spark application?
What are the various levels of persistence in Apache Spark?
What do you understand by SchemaRDD in Apache Spark?
What are the different components of a Hive query processor?
Is it possible to run Apache Spark without Hadoop?
In MapReduce, ideally how many mappers should be configured on a slave?
What are the various table design approaches in HBase?
In which modes can Hadoop run?
What is lazy evaluation in Spark?