Big Data Interview Questions
Questions Answers Views Company eMail

Explain the input and output data formats of the Hadoop framework.

408

When does Hadoop enter safe mode?

442

How do you enable the recycle bin (trash) in Hadoop?

447

What is the distributed cache in Hadoop?

395

What should the ideal replication factor be in Hadoop?

384

What is the difference between the Secondary NameNode (a poorly named component of Hadoop), the Checkpoint Node, and the Backup Node?

444

How do you resolve "IOException: Cannot create directory" while formatting the NameNode in Hadoop?

414

What does a Hadoop application look like, and what are its basic components?

413

What mechanism does the Hadoop framework provide to synchronize changes made to the distributed cache while the application is running?

387

What is partitioning?

250

Can we change a file cached by the distributed cache?

231

Why would NoSQL be better than a SQL database, and how much better is it?

254

Why do we use IntWritable instead of int? Why do we use LongWritable instead of long?

228

What happens if you don't override the mapper methods and keep them as they are?

243

What are the side data distribution techniques?

263


Un-Answered Questions { Big Data }

What are partitions in Cassandra?

48


What are the executor and the driver in Spark?

191


What is the use of Cloudera?

226


How is security achieved in Apache Hadoop?

430


What is AWS Spark?

188






What does Apache Spark stand for?

192


What if a NameNode has no data?

409


Define a data lake.

195


What is the maximum size of a message that Kafka can receive?

477


What is the problem with HDFS and streaming data such as logs?

673


Why big data?

243


What are the important tools for big data?

224


What is the configuration of a typical slave node in a Hadoop cluster? How many JVMs run on a slave node?

723


Is fs.mapr.working.dir a single directory?

408


What is the ObjectInspector functionality in Hive?

683