Big Data Interview Questions
Questions Answers Views Company eMail

I have a row or key cache hit rate of 0.XX123456789 reported by JMX. Is that XX% or 0.XX%?

51

When to avoid secondary indexes?

54

What ports does Cassandra use?

43

When should you not use Cassandra? Or, when should you use an RDBMS instead of Cassandra?

90

What are secondary indexes?

55

What is the design architecture of Cassandra?

55

Who uses Cassandra?

47

What does JMX stand for?

51

What are the different features of Cassandra?

59

What happens to existing data in my cluster when I add new nodes?

121

What is the primary objective of NoSQL databases?

64

What is Cassandra Query Language?

64

What are the different components of Cassandra?

72

What are “Seed Nodes” in Cassandra?

257

What do you understand by high availability?

59


Un-Answered Questions { Big Data }

What is the big data concept?

232


Can multiple clients write into a Hadoop HDFS file concurrently?

33


What is sc.textFile?

226


Why should we use the ‘distinct’ keyword in Pig scripts?

308


Why does HDFS store data on commodity hardware despite the higher chance of failures in Hadoop?

26






What is vectorized query execution?

217


Define a sequence file in Hadoop.

301


Explain how you can improve the throughput of a remote consumer.

325


What is InputFormat in Hadoop MapReduce?

366


Explain the partitioning, shuffle, and sort phases in MapReduce.

501


Explain the basic parameters of a mapper.

469


In MapReduce, ideally how many mappers should be configured on a slave?

384


In which language is Cassandra written?

78


What are the important steps in the configuration?

60


When are the reducers started in a MapReduce job?

351