Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
What do you mean by schema on read?
How are record boundaries handled in text files or SequenceFiles across MapReduce InputSplits?
Why Hadoop MapReduce?
When would you use HBase?
Explain how indexing is done in HDFS.
When should you choose an internal (managed) table versus an external table in Hive?
What is MapReduce used for?
What features of relational databases or Hive are not available in Impala?
What is a record reader?
Compare Hadoop 2 and Hadoop 3.
What is the main difference between Kafka and Flume?
Do we need to install Spark on all nodes?
What are the core benefits of Apache Ambari for Hadoop users?
Explain what a row key is.
Explain the flatMap operation on an Apache Spark RDD.