Big Data Interview Questions
Questions Answers Views Company eMail

What is Apache Spark for beginners?

9

What is deploy mode in Spark?

10

What is a pair RDD?

7

What is a data pipeline in Spark?

13

What is a Spark RDD?

9

What are the optimization techniques in Spark?

7

Can you run Spark on Windows?

9

Why is Spark good?

9

Do I need to know Hadoop to learn Spark?

13

Is there a distributed machine learning framework on top of Spark?

10

How does Hadoop achieve fault tolerance?

1

Is Hadoop still in demand?

5

What is winutils in Hadoop?

4

Is Hive a NoSQL database?

35

Is Hive similar to SQL?

40

Un-Answered Questions { Big Data }

Explain how you can reduce churn in the ISR. When does a broker leave the ISR?

45


What are 'slaves' and 'masters' in Hadoop?

3


What is a map in Pig?

40


What is a MapReduce job in Hadoop?

39


What is the relationship between Job and Task in Hadoop?

52


What needs to be taken care of while adding a column?

39


What are the side effects of not running a Secondary NameNode?

3


How is Ambari different from ZooKeeper?

1


What are the parameters used to create a keyspace?

21


What is a lineage graph?

156


Explain erasure coding in Apache Hadoop.

73


The web UI shows that half of the DataNodes are in decommissioning mode. What does that mean? Is it safe to remove those nodes from the network?

3


Explain how Hadoop is different from other data processing tools.

47


What do you mean by NameNode High Availability in Hadoop?

19


What are the three steps involved in big data?

15