Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
What is a map-only job?
What are the components of Hive architecture?
What is shuffling in MapReduce?
Explain the run-time architecture of Spark.
What is the default extension of the files produced from a Sqoop import using the --compress parameter?
What job does the conf class do?
Name a few other popular column-oriented databases like HBase.
What is flatMap in Apache Spark?
Does the HDFS client or the NameNode decide the input split?
How do you do a file system check in HDFS?
What are good use cases for Impala as opposed to Hive or MapReduce?
What is the default replication factor in HDFS?
What is the purpose of sqoop-merge?
What is the optimum number of reducers?
Explain the format of an Apache Kafka message.