Big Data Interview Questions
Questions Answers Views Company eMail

What is Apache Spark and what are the benefits of Spark over MapReduce?

187

What are the cases where Apache Spark surpasses Hadoop?

192

What is the bottom layer of abstraction in the Spark Streaming API?

634

What is the fundamental data structure of Spark?

212

List the advantages of Parquet files in Apache Spark.

259

What does map transformation do? Provide an example.

190

What are the different ways of representing data in Spark?

180

What are the features of Spark?

194

What are shared variables in Apache Spark?

222

What are the various libraries available on top of Apache Spark?

215

Explain the operations of Apache Spark RDDs.

198

What are the limitations of Apache Spark?

189

State the difference between persist() and cache() functions.

188

What is a Directed Acyclic Graph (DAG)?

217

What are Actions? Give some examples.

219


Un-Answered Questions { Big Data }

What is FLATTEN and what does it do in Pig?

324


What databases are supported by Hive?

392


What is 'jps'?

235


What do you know about partitions in Kafka?

361


What is a DataFrame in Spark?

188






Is spark a special attack?

169


What is Mapper? How can we compress Mapper output in Hadoop?

230


What is the DISTINCT clause in Apache Tajo?

5


What is lazy evaluation and how is it useful?

210


On what basis does the NameNode distribute blocks across the DataNodes in HDFS?

33


When is a large data set maintained?

408


Can we say a COGROUP is a group of more than one data set?

307


What is a Spark DataFrame?

144


How do you optimize Hive performance?

422


What is the main difference between HBase and Hive?

378