Big Data Interview Questions
Questions Answers Views Company eMail

What is the data storage component used by Hadoop?

232

What does the text input format do?

231

Which daemons run on the master node and on the slave nodes?

280

What is the NameNode in Hadoop?

237

What is a sequence file in Hadoop?

258

How many InputSplits does the Hadoop framework create?

294

What is the JobTracker in Hadoop? What actions does Hadoop follow?

223

Is it possible to search for files using wildcards?

225

What is Hadoop?

257

What is SequenceFileInputFormat?

237

How is Hadoop different from other data processing tools?

234

What are the most common input formats defined in Hadoop?

247

What is rack awareness?

223

What is the difference between an RDBMS and Hadoop?

237

What are the three modes in which Hadoop can be run?

232


Un-Answered Questions { Big Data }

Which is the reliable channel in Flume to ensure that there is no data loss?

98


How does HCatalog help capture processing states to enable sharing?

5


Is it necessary to start Hadoop to run an Apache Spark application?

197


Is Kafka an ETL tool?

259


What is a TaskTracker in Hadoop?

231






Is Spark a language?

179


Is it possible to run real-time analysis directly on the big data collected by Flume? If yes, explain how.

53


Define "Transformations" in Spark

200


Explain textFile vs. wholeTextFiles in Spark.

285


What is JPS? Why is it used in Hadoop?

237


What are active and passive "NameNodes"?

1060


Explain Thrift and Protocol Buffers vs. Avro.

63


What features do Pig and Hive have in common?

649


Does Spark work with Python 3?

180


What are the basic steps for writing a UDF in Pig?

446