Big Data Interview Questions
Questions Answers Views Company eMail

What are the identity mapper and reducer in MapReduce?

487

How to optimize a Hadoop MapReduce job?

372

How to specify more than one directory as input in a Hadoop MapReduce program?

410

How to change the name of the output file from part-r-00000 in Hadoop MapReduce?

382

Which would you choose for a project: Hadoop MapReduce or Apache Spark?

362

In which kinds of scenarios are MapReduce jobs more useful than Pig in Hadoop?

389

How to overwrite an existing output file/dir during execution of Hadoop MapReduce jobs?

390

What is the relation between MapReduce and Hive?

377

What are the advantages of Spark over MapReduce?

348

What is the use of InputFormat in the MapReduce process?

384

What is a Counter in MapReduce?

359

What is the difference between a MapReduce InputSplit and HDFS block?

419

How to compress mapper output in Hadoop?

390

Why is Apache Spark faster than Hadoop MapReduce?

372

Define MapReduce.

430


Unanswered Questions { Big Data }

How to forcefully take the NameNode out of safe mode in HDFS?

27


Explain the HCatalog architecture in brief.

5


What needs to be taken care of while adding a column?

81


List the benefits of Spark over MapReduce.

206


What is the difference between groupByKey and reduceByKey in Apache Spark?

248


What do you understand about YARN?

204


How much is Flume worth?

58


How does Hive distribute rows into buckets?

476


How do I download Apache Mahout?

47


Hadoop uses replication to achieve fault tolerance. How is this achieved in Apache Spark?

304


What is the difference between Flume and Kafka?

58


What is off-heap memory in Spark?

184


What do you know about the case sensitivity of Apache Pig?

238


What is a Hive database?

435


Can you describe the process of creating an Ambari client?

48