How many JVMs run on a slave node?
What is a counter in Hadoop MapReduce?
How does Hive serialize and deserialize data?
How can we create a child znode (sub-znode) in ZooKeeper?
Is it possible to run real-time analysis directly on the big data collected by Flume? If yes, explain how.
How much memory is required?
Who divides a file into blocks when it is stored in HDFS?
In how many ways can we create an RDD in Spark?
Define the term "column families".
Why should we use the ‘distinct’ keyword in Pig scripts?
How can you prevent a large job from running for a long time?
Which do you think is more popular among developers: Pig or Hive?
How is NFS different from HDFS?
Can Kafka be used without ZooKeeper?
When should you choose an “internal table” and when an “external table” in Hive?
How can an RDD be created in Spark?