Hadoop Interview Questions
Questions Answers Views Company eMail

What is a pair RDD?

208

What is a data pipeline in Spark?

203

What is a Spark RDD?

221

What are the optimization techniques in Spark?

184

Can you run Spark on Windows?

199

Why is Spark good?

197

Do I need to know Hadoop to learn Spark?

204

Which distributed machine learning framework runs on top of Spark?

194

How does Hadoop achieve fault tolerance?

218

Is Hadoop still in demand?

221

What is winutils in Hadoop?

237

Is Hive a NoSQL database?

379

Is Hive similar to SQL?

426

What is the difference between Hive and HDFS?

387

What is skewed data in Hive?

430


Un-Answered Questions { Hadoop }

Is the output of the mapper or the output of the partitioner written to local disk?

391


What is the difference between an external table and an internal table in Hive?

616


What are the similarities and differences between Apache Flume and Apache Kafka?

81


What are the data types of Pig Latin?

307


What is a secondary NameNode?

363


Are there any problems that can be solved only by MapReduce and not by Pig? In what kinds of scenarios are MapReduce jobs more useful than Pig?

511


Why is HBase a schema-less database?

119


Can multiple clients write to an HDFS file concurrently in Hadoop?

41


What is partitioning?

254


What are the different tombstone markers in HBase?

109


What databases are supported by Hive?

409


Replication causes data redundancy, so why is it pursued in HDFS?

25


How is Impala metadata managed?

36


What are some advantages of Impala?

30


What is the default replication factor and how will you change it?

249