Hadoop Interview Questions
Questions Answers Views Company eMail

How can I improve my Spark performance?

188

What is the Apache Spark architecture?

214

Why is Spark faster than Hive?

186

What happens if an RDD partition is lost due to a worker node failure?

301

What is a pair RDD in Spark?

196

What is the difference between cache and persist in Spark?

192

What does "is bigger than spark.driver.maxResultSize" mean?

215

Does Spark use Java?

199

How do you process big data with Spark?

177

What is a Spark shuffle?

208

Why do we need Apache Spark?

191

How do I optimize my Spark code?

197

What is the difference between client mode and cluster mode in Spark?

203

What are transformations in Spark?

204

What are the driver and executor in Spark?

184


Un-Answered Questions { Hadoop }

What does high availability of a NameNode mean? How is it accomplished?

228


What mode(s) can Hadoop code be run in?

248


Can Kafka be used without ZooKeeper?

290


How can you use Akka with Spark?

210


Why would NoSQL be better than a SQL database? And how much better is it?

254


Why is HDFS only suitable for large data sets and not the correct tool to use for many small files?

35


Why did Spark come into existence?

200


What is OutputFormat in MapReduce?

406


How do I download Adobe Spark?

275


What are sink processors?

633


What is HBase Shell?

128


What tasks can we perform to manage hosts using the Ambari Hosts tab?

61


How can one increase replication factor to a desired value in Hadoop?

778


What is an "Accumulator"?

196


How can we create a Hadoop cluster from scratch?

222