Big Data Interview Questions
Questions Answers Views Company eMail

List some commonly used machine learning algorithms in Apache Spark.

19

What are the commands to start and stop the Spark interactive shell?

16

List the ways of creating an RDD in Apache Spark.

17

What are the various advantages of DataFrame over RDD in Apache Spark?

15

What is flatMap in Apache Spark?

15

What is standalone mode in a Spark cluster?

15

Explain Apache Spark Streaming. How is the processing of streaming data achieved in Apache Spark?

13

In what ways is SparkSession different from SparkContext?

11

Explain the fold() operation in Spark.

15

Define SparkContext in Apache Spark.

11

List the various advantages of DataFrame over RDD in Apache Spark.

27

What is map in Apache Spark?

13

Write the commands to start and stop the Spark interactive shell.

12

Define the various running modes of Apache Spark.

23

What are the ways to run Spark over Hadoop?

15

Un-Answered Questions { Big Data }

Which serialization libraries are supported in Spark?

21


Explain the different queries performed by Apache Tajo.

1


Does Pig differ from MapReduce? If yes, how?

35


What are the consistency levels for write operations in Cassandra?

8


What does SerDes mean in Apache Kafka?

44


What is clustering in Hive?

79


What are the parameters used to create a keyspace in Cassandra?

5


When should you use a reducer?

38


What is the heartbeat used for?

305


Does HBase support a SQL-like syntax structure? Yes or no?

16


Explain the core components of a distributed Spark application.

19


What is Federation?

7


Is MapReduce required for Impala? Will Impala continue to work as expected if MapReduce is stopped?

46


Why does Hive not store metadata information in HDFS?

1


What is the small file problem in Hadoop?

67