Why is it named ‘Hadoop’?
Why do we need Hadoop?
If no custom partitioner is defined in Hadoop, how is data partitioned before it is sent to the reducer?
What is a session in Cassandra?
When the NameNode is down, what happens to the JobTracker?
What is Apache Spark Streaming?
What is indexing and why do we need it?
What is Spark code?
What is a DStream?
What is an Event Serializer in Flume?
Write a Hive UDF that returns a sentiment score. For example, if good = 1, bad = -1, and average = 0, and a restaurant review states "Good food, bad service," your score might be 1 - 1 = 0.
How does HBase handle a write failure?
Explain the scope operators used in HBase.
What is the difference between Pig and MapReduce?
Explain the Spark leftOuterJoin() and rightOuterJoin() operations.
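
For the sentiment-score question above, the arithmetic in the example (1 - 1 = 0) can be illustrated with a minimal sketch. This is only the scoring logic in plain Python, not a Hive UDF; in Hive the same logic would typically be wrapped in a UDF class. The names `WEIGHTS` and `sentiment_score` are hypothetical, and the word weights are the ones assumed in the question.

```python
import re

# Word weights assumed from the example in the question (hypothetical table).
WEIGHTS = {"good": 1, "bad": -1, "average": 0}


def sentiment_score(review):
    """Sum the weights of the known words in a review, ignoring case.

    Words that are not in WEIGHTS contribute 0 to the score.
    """
    # Lowercase the text and split it into alphabetic words,
    # which also drops punctuation such as commas.
    words = re.findall(r"[a-z]+", review.lower())
    return sum(WEIGHTS.get(word, 0) for word in words)


# "good" contributes 1 and "bad" contributes -1; "food" and "service" contribute 0.
print(sentiment_score("Good food, bad service"))  # prints 0
```

A lookup table keeps the weights separate from the scoring loop, so adding new words would not change the function itself.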