Big Data Interview Questions
Questions Views

What are column families?

127

Which code is used to open a connection in HBase?

220

What is the role of the offset?

296

What is the role of the Kafka producer API?

312

Explain the main difference between Kafka and Flume.

319

What is Kafka?

300

What is the role of ZooKeeper?

279

What are the various components in Kafka?

316

Explain the process for starting a Kafka server.

296

What is a partitioning key?

291

What is a consumer group?

307

Why is replication critical in Kafka?

306

What is data cleansing?

232

What is a UDF?

222

What is a record reader?

224


Unanswered Questions { Big Data }

What is the difference between structured and unstructured big data?

208


How can we check whether Hadoop Sqoop is installed on a system?

5


Compare MapReduce and Spark.

188


Can Flume provide 100% reliability to the data flow?

62


What is coalesce in Spark?

185


Differentiate between Hadoop MapReduce and Pig.

412


What does the jps command do in Hadoop?

303


What are Spark jobs?

186


Name some companies that are already using Spark Streaming?

168


What happens to the NameNode when the JobTracker is down?

413


What is a Spark pipeline?

198


What is configured in /etc/hosts, and what is its role in setting up a Hadoop cluster?

262


What is SimpleStrategy?

91


Can you explain TextInputFormat?

247


Can you use Spark to access and analyse data stored in Cassandra databases?

272