Big Data Interview Questions
Questions Answers Views Company eMail

What is the difference between the secondary NameNode, checkpoint NameNode, and backup node?

248

Name the most common input formats defined in Hadoop.

248

Which input format is the default?

236

Is it possible to write Hadoop job output to multiple directories?

254

What is the TaskTracker in Hadoop?

236

What is a combiner in Hadoop?

256

Have you ever used counters in Hadoop?

235

What are some typical functions of the JobTracker in Hadoop?

243

Can you define InputSplit in Hadoop?

263

What daemons run on master nodes?

243

Can you explain rack awareness?

248

Can you explain the RecordReader?

240

What are the side effects of not running a secondary NameNode?

282

What are the main components of Hadoop?

227

What do you understand by Unit and () in Scala?

293


Un-Answered Questions { Big Data }

What should the HDFS block size be to get maximum performance from a Hadoop cluster?

22


Define data lake.

208


What can you say about the ZooKeeper WebUI?

5


What is an identity reducer?

237


What are the main features and characteristics of Hadoop that make it the most popular and powerful Big Data tool?

280


What can I do with my M&S Sparks points?

194


Do you know the comparative differences between Apache Spark and Hadoop?

184


What is the JobTracker?

373


Hadoop uses replication to achieve fault tolerance. How is this achieved in Apache Spark?

304


How do you identify whether a given operation in your program is a transformation or an action?

183


What is the write-ahead log (journaling) in Spark?

248


When would you use HBase?

142


What tasks can we perform to manage services using the Ambari service tab?

51


What is Flume used for?

58


Explain the concept of a resilient distributed dataset (RDD).

1948