How would you calculate the number of unique visitors per hour from a huge Apache log? You may post-process the output of the MapReduce job.
What is a combiner, and when should you use one in a MapReduce job?
Why is the comparison of types important in MapReduce?
What is WebDAV in Hadoop?
What are the phases of the Reducer? Describe each in detail.
What is shuffling in MapReduce?
What is a Distributed Cache in Hadoop?
How do you set the number of mappers for a MapReduce job?
How do you optimize a MapReduce job?
Can the number of combiners be changed in MapReduce?
How many Mappers run for a MapReduce job in Hadoop?
What is an InputSplit in Hadoop?
What is the role of InputFormat in the MapReduce process?
Is the output of the mapper or the output of the partitioner written to local disk?
How do you change the name of the output file from the default part-r-00000 in Hadoop MapReduce?