Big Data Interview Questions
Questions Answers Views Company eMail

What are the benefits of Spark over MapReduce?

347

What are the disadvantages of using Apache Spark over Hadoop MapReduce?

358

Why do we need MapReduce during Pig programming?

370

What is the difference between a MapReduce InputSplit and an HDFS block?

382

What is a lineage graph in Apache Spark?

217

What are the different running modes of Apache Spark?

213

How will you calculate the number of executors required to do real-time processing using Apache Spark? What factors need to be considered for deciding on the number of nodes for real-time processing?

197

List the popular use cases of Apache Spark.

204

What is spark.executor.memory in a Spark application?

191

Compare Hadoop and Spark.

195

What is the write-ahead log (journaling) in Spark?

244

What are Actions?

191

What are the limitations of Spark?

201

What are reliable and unreliable receivers in Spark?

229

In a given Spark program, how will you identify whether a given operation is a transformation or an action?

248


Un-Answered Questions { Big Data }

How do you start and stop Spark in the interactive shell?

202


What are accumulators in Spark?

210


What is the local repository, and where is it useful in an Ambari environment?

48


What is Apache Spark SQL?

187


What is Spark used for?

172


Explain what the master class and the output class do.

365


How do I start a Spark cluster?

179


What do you understand by receivers in Spark Streaming?

216


How can you delete the DBPROPERTY in Hive?

401


Explain what shuffling is in MapReduce.

353


What is a rack?

233


What are the differences between Pig and MapReduce?

351


Can you explain sequence files in Hadoop?

229


What is the difference between Hive and HDFS?

380


Name a few companies that use Apache Spark in production.

247