Big Data Interview Questions
Questions Answers Views Company eMail

What is the Catalyst framework?

203

What do you understand by Pair RDD?

216

How can you launch Spark jobs inside Hadoop MapReduce?

244

How can you compare Hadoop and Spark in terms of ease of use?

194

Which one will you choose for a project: Hadoop MapReduce or Apache Spark?

204

What do you understand by Lazy Evaluation?

207

How can you remove the elements with a key present in any other RDD?

217

How does Spark use Hadoop?

195

What is a DStream?

238

What are the various data sources available in SparkSQL?

215

Explain the core components of a distributed Spark application.

205

What are the benefits of using Spark with Apache Mesos?

168

What are the common mistakes developers make when running Spark applications?

202

When running Spark applications, is it necessary to install Spark on all the nodes of a YARN cluster?

239

What is the significance of the Sliding Window operation?

210


Un-Answered Questions { Big Data }

Suppose Hadoop spawned 100 tasks for a job and one of the tasks failed. What will Hadoop do?

278


Is Kafka open source?

273


Does Hadoop require RAID?

651


Illustrate some demerits of using Spark.

211


Why does my select statement fail?

41


How do you create Avro schemas?

41


Define the role of value in big data.

234


What is the best hardware configuration to run Hadoop?

1327


Which file systems does Spark support?

207


What is the default block size in HDFS?

719


Is Apache Kafka a distributed streaming platform? If yes, what can you do with it?

367


What is a shuffle block in Spark?

180


Does Spark use YARN?

175


How does an RDD persist data?

207


What are the three types of tombstone markers in HBase?

112