What happens when the data set exceeds available memory?
How much memory is required?
Are results returned as they become available, or all at once when a query completes?
Why do I have to use REFRESH and INVALIDATE METADATA? What do they do?
Why does my SELECT statement fail?
Does Impala performance improve as it is deployed on more hosts in a cluster, in much the same way that Hadoop performance does?
What is an RDD in Apache Spark? How are RDDs computed, and what are the various ways to create one?
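As a reference point for this question, here is a minimal Scala sketch of the usual creation paths: parallelizing a driver-side collection, loading from external storage, and transforming an existing RDD. The app name, master setting, and file path are illustrative placeholders.

```scala
import org.apache.spark.sql.SparkSession

object RddCreationSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("rdd-creation-sketch")
      .master("local[*]")              // local mode, for illustration only
      .getOrCreate()
    val sc = spark.sparkContext

    // 1. Parallelize an existing in-driver collection
    val fromCollection = sc.parallelize(Seq(1, 2, 3, 4))

    // 2. Load from external storage (the path is a placeholder)
    // val fromFile = sc.textFile("hdfs:///data/input.txt")

    // 3. Transform an existing RDD -- this only records lineage; nothing runs yet
    val doubled = fromCollection.map(_ * 2)

    // RDDs are computed lazily: an action such as collect() triggers the job
    println(doubled.collect().mkString(","))   // 2,4,6,8

    spark.stop()
  }
}
```

The comment on `map` vs. `collect` points at the second half of the question: RDDs are not computed when defined, only when an action forces evaluation of the recorded lineage.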
What role does a worker node play in an Apache Spark cluster? Why must a worker node register with the driver program?
What is SparkSession in Apache Spark? Why is it needed?
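A short spark-shell-style sketch for this question: since Spark 2.0, SparkSession is the single entry point that unifies the older SQLContext and HiveContext and wraps a SparkContext. The app name and config value below are illustrative assumptions.

```scala
import org.apache.spark.sql.SparkSession

// Single unified entry point (Spark 2.0+)
val spark = SparkSession.builder()
  .appName("session-sketch")
  .master("local[*]")                             // local mode, for illustration
  .config("spark.sql.shuffle.partitions", "4")    // example config setting
  .getOrCreate()

val df = spark.range(5)   // DataFrame API is available directly from the session
df.show()

spark.stop()
```

Before SparkSession, an application had to juggle separate contexts for RDD, SQL, and Hive work; the builder pattern above replaces all of them with one handle.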
What is the task of the Spark engine?
What is the use of SparkContext?
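A sketch illustrating the answer: SparkContext is the application's connection to the cluster, and it is what creates RDDs, broadcast variables, and accumulators. In spark-shell it is predefined as `sc`; in an application you obtain it from the SparkSession, as assumed here.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("sc-sketch")
  .master("local[*]")          // local mode, for illustration
  .getOrCreate()
val sc = spark.sparkContext    // the low-level entry point

// The three things SparkContext hands out:
val rdd   = sc.parallelize(1 to 10)                  // RDDs
val bcast = sc.broadcast(Map("factor" -> 3))         // broadcast variables
val acc   = sc.longAccumulator("processed")          // accumulators

val result = rdd.map { x => acc.add(1); x * bcast.value("factor") }.sum()
println(result)   // 165.0 = 3 * (1 + 2 + ... + 10)

spark.stop()
```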
How is streaming data processed in Apache Spark? Explain.
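For reference when answering: the classic DStream API cuts the stream into micro-batches and processes each batch as an RDD. The host, port, and batch interval below are placeholder assumptions for a socket source.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

val conf = new SparkConf().setAppName("streaming-sketch").setMaster("local[2]")
val ssc  = new StreamingContext(conf, Seconds(5))   // 5-second micro-batches

// Placeholder socket source; each batch of lines arrives as an RDD
val lines  = ssc.socketTextStream("localhost", 9999)
val counts = lines.flatMap(_.split("\\s+"))
                  .map(word => (word, 1))
                  .reduceByKey(_ + _)

counts.print()          // emit per-batch word counts to the console

ssc.start()             // begin receiving and processing
ssc.awaitTermination()
```

The key point the code illustrates: the same RDD transformations used in batch jobs (`flatMap`, `map`, `reduceByKey`) are applied to each micro-batch, which is why Spark Streaming is described as batch processing on small time slices.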
Can you do real-time processing with Spark SQL?
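A sketch relevant to this question: Structured Streaming (Spark 2.x+) lets you run SQL and DataFrame queries over an unbounded stream in near real time, executed as incremental micro-batches. The socket source, host, and port below are illustrative assumptions.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("structured-streaming-sketch")
  .master("local[*]")            // local mode, for illustration
  .getOrCreate()

// Unbounded input table backed by a placeholder socket source
val events = spark.readStream
  .format("socket")
  .option("host", "localhost")
  .option("port", 9999)
  .load()

// Plain SQL over the stream, exactly as over a static table
events.createOrReplaceTempView("events")
val counts = spark.sql("SELECT value, COUNT(*) AS n FROM events GROUP BY value")

val query = counts.writeStream
  .outputMode("complete")        // re-emit the full aggregate each trigger
  .format("console")
  .start()

query.awaitTermination()
```

Strictly speaking this is near-real-time (micro-batch) rather than event-at-a-time processing, which is the nuance the question is usually probing for.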
Discuss the role of the Spark driver in a Spark application.
What features of RDDs make them an important abstraction in Spark?
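A sketch touching the features usually named in the answer: immutability, lazy evaluation, lineage-based fault tolerance, explicit partitioning, and in-memory caching. Sizes and partition counts are arbitrary illustrative choices.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.storage.StorageLevel

val spark = SparkSession.builder()
  .appName("rdd-features-sketch")
  .master("local[*]")            // local mode, for illustration
  .getOrCreate()
val sc = spark.sparkContext

val base    = sc.parallelize(1 to 1000, numSlices = 8)  // explicit partitioning
val squared = base.map(x => x * x)   // immutable: a new RDD, evaluated lazily

squared.persist(StorageLevel.MEMORY_ONLY)   // in-memory caching for reuse

println(squared.getNumPartitions)   // 8 -- map preserves partitioning
println(squared.toDebugString)      // lineage graph, used to recompute lost partitions

// Two actions: the second reuses the cached data instead of recomputing
println(squared.sum())
println(squared.max())

spark.stop()
```

`toDebugString` is the handy hook here: the printed lineage is exactly what Spark replays to rebuild a lost partition, which is how RDDs achieve fault tolerance without replication.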