Big Data Interview Questions
Questions Views

What are the functions of Spark SQL?

53

What is RDD?

72

How to create RDD?

71

Does Apache Spark provide checkpointing?

58

What are the popular use cases of Apache Spark?

54

Do you need to install Spark on all nodes of a YARN cluster when running Spark on YARN?

81

What are the different string functions available in Pig?

94

What is the difference between the physical plan and the logical plan of a Pig script?

102

What are the use cases of Apache Pig?

106

What do you understand by an inner bag and an outer bag in Pig?

112

What are the different execution modes available in Pig?

90

How do users interact with HDFS in Apache Pig?

99

What are the basic parameters of a Mapper?

32

What is a MapReduce Combiner?

38

Where is the Mapper output stored?

37

Unanswered Questions { Big Data }

How to change the replication factor of files already stored in HDFS?

182


How to keep an HDFS cluster balanced?

207


What is the difference between the GROUP and COGROUP operators in Pig?

52


What are the basic steps for writing a UDF in Pig?

49


What is the default replication factor in HDFS?

149

What are combiners? When should I use a combiner in my MapReduce job?

185


What is the Derby database?

150


How to stop a partition from being queried?

65


Does Hive support temporary tables?

61


What are the nodes in a Hadoop cluster?

143


What is the difference between a traditional RDBMS and Hadoop?

285


How to use Hive from the command line and Beeline?

55


What is the HDFS block size, and what did you choose in your project?

137


What are the most commonly used input formats in Hadoop?

248


How is reporting controlled in Hadoop?

41