Hadoop Interview Questions
Questions Answers Views Company eMail

What is the difference between coalesce and repartition in spark?

197

What does rdd mean?

204

How does groupbykey work in spark?

192

What is tungsten in spark?

241

What is the difference between dataframe and dataset in spark?

188

What is an accumulator in spark?

200

What is apache spark used for?

179

What is the difference between spark and scala?

211

What are accumulators and broadcast variables in spark?

212

Does spark run hadoop?

202

What is coalesce in spark sql?

191

What are the features of spark rdd?

185

What is cluster mode in spark?

190

What are shared variables in spark?

213

What is the future of apache spark?

193


Un-Answered Questions { Hadoop }

Does an RDD have a schema?

205


How often do you need to reformat the namenode?

264


What causes breaker to spark?

220


What are file permissions in HDFS?

17


What are the uses and applications of Mahout?

38


What is version-id mismatch error in hadoop?

751


Explain the difference between Mahout and MLlib.

41


Define the Parquet file format. How do you convert data to Parquet format?

222


How can we see all the clusters that are available in Ambari?

125


What are the benefits of NoSQL over relational databases?

47


Apache Spark is a good fit for which type of machine learning techniques?

211


How do you start a Kafka server?

344


What is the data storage component used by Hadoop?

368


What are the use cases of Apache Pig?

508


What is spark certification?

195