What are the advantages of using a map-side join in MapReduce?
What is a map-side join?
What is a combiner, and where should you use it?
When should you use SequenceFileInputFormat?
What is the purpose of TextInputFormat?
What is a reduce-side join in MapReduce?
What do you mean by InputFormat?
What are the various configuration parameters required to run a MapReduce job?
What is the distributed cache in the MapReduce framework?
What do you mean by data locality?
How can we ensure that all values for a particular key go to the same reducer?
What is Pig Statistics?
List the relational operators in Pig.
Which stats classes are available in the Java API package?
List the diagnostic operators in Pig.