Big Data Interview Questions
Questions Answers Views Company eMail

Where is the output of the Mapper written in Hadoop?

384

Is MapReduce required for Impala? Will Impala continue to work as expected if MapReduce is stopped?

383

Explain the difference between a MapReduce InputSplit and an HDFS block.

398

How do you create a custom key and a custom value in a MapReduce job?

355

How does Spark differ from MapReduce? Is Spark faster than MapReduce?

364

What is the difference between Job and Task in MapReduce?

377

Compare Pig, Hive, and Hadoop MapReduce.

367

What is the fundamental difference between a MapReduce InputSplit and an HDFS block?

340

How do you submit extra files (JARs, static files) for a Hadoop MapReduce job at runtime?

388

How do you get a single file as the output of a MapReduce job?

369

Why is the output file name in Hadoop MapReduce part-r-00000?

364

Which of the two is preferable for a project: Hadoop MapReduce or Apache Spark?

351

Explain the process of spilling in Hadoop MapReduce.

373

What are good use cases for Impala as opposed to Hive or MapReduce?

331

What are the issues associated with the slot-based map and reduce mechanism in MapReduce?

386


Un-Answered Questions { Big Data }

Explain HCatOutputFormat.

5


What is a Databricks cluster?

281


Explain the Reducer's Sort phase.

614


What is the difference between HiveServer1 and HiveServer2?

479


Explain textFile vs. wholeTextFiles in Spark.

285


How does HDFS index data blocks? Explain.

19


Why do we need RDDs in Spark?

184


What is a session in Cassandra?

105


How do we write our own custom SerDe?

423


State the difference between persist() and cache() functions.

192


What is the difference between a MapReduce InputSplit and an HDFS block?

384


How can you launch Spark jobs inside Hadoop MapReduce?

246


What happens in a TextInputFormat?

386


What is the use of the help command in Sqoop?

5


List a few differences between Apache Kafka and RabbitMQ.

303