Big Data Interview Questions
Questions Answers Views Company eMail

In how many ways can we create an RDD?

13

What does repartition do in Spark?

11

What is the driver program in Spark?

9

What is spark-submit?

11

How do I clear my Spark cache?

9

What is a partition in Spark?

9

What is Spark vectorization?

15

What is off-heap memory in Spark?

10

What is a tuple in Spark?

13

Is Spark an ETL tool?

7

How is an RDD distributed?

14

What are the common transformations in Apache Spark?

8

What is the difference between Dataset and DataFrame in Spark?

7

What is the distributed cache in Spark?

7

What is the Catalyst framework in Spark?

13


Unanswered Questions { Big Data }

What is the difference between the TextInputFormat and KeyValueTextInputFormat classes?

3


Can you explain clustering in Mahout?

1


Explain the concept of a resilient distributed dataset (RDD).

348


How can we take Hadoop out of safe mode?

277


How does Cassandra perform a read operation?

7


What are WAL and HLog in HBase?

26


How do you create a table in Hive for a JSON input file?

88


What are tokens in Cassandra?

5


What is the small file problem in Hadoop? How can it be resolved?

15


What are the different commands used to start up and shut down Hadoop daemons?

2


Can a MapReduce job run with no Reducer?

252


Can you execute Hadoop dfs commands from the Hive CLI? How?

457


What are the advantages of using a map-side join in MapReduce?

39


What are keywords?

9


When should you use HBase?

227