How would you calculate the number of unique visitors for each hour by mining a huge Apache log? You may post-process the output of the MapReduce job.
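One possible approach, sketched here as a local Python simulation rather than a real Hadoop job: the map step emits a composite key of (hour, IP), the reduce step writes each distinct key once, and a post-processing pass counts the surviving pairs per hour. This sketch assumes that a "visitor" is identified by client IP and that the log uses the Common Log Format timestamp layout; the names `mapper`, `reducer`, `run_job`, and `post_process` are hypothetical and chosen for illustration only.

```python
import re
from collections import Counter
from itertools import groupby

# Assumed Common Log Format prefix: ip ident user [dd/Mon/yyyy:HH:MM:SS zone]
# Group 1 is the client IP, group 2 is the day plus hour (e.g. "10/Oct/2000:13").
LOG_RE = re.compile(
    r'^(\S+) \S+ \S+ \[(\d{2}/\w{3}/\d{4}:\d{2}):\d{2}:\d{2} [^\]]+\]'
)


def mapper(line):
    """Map step: emit ((hour, ip), 1) for each parseable line; skip the rest."""
    match = LOG_RE.match(line)
    if match:
        ip, hour = match.groups()
        yield (hour, ip), 1


def reducer(key, values):
    """Reduce step: emit each distinct (hour, ip) key exactly once."""
    yield key


def run_job(lines):
    """Simulate shuffle and sort locally, then apply the reducer per key."""
    shuffled = sorted(kv for line in lines for kv in mapper(line))
    output = []
    for key, group in groupby(shuffled, key=lambda kv: kv[0]):
        output.extend(reducer(key, (value for _, value in group)))
    return output


def post_process(distinct_pairs):
    """Post-processing: count distinct IPs per hour from the job output."""
    return dict(Counter(hour for hour, _ip in distinct_pairs))
```

Because the deduplication happens on the composite key, no single reducer has to hold every IP for an hour in memory; the per-hour count is then a simple tally over the job output, for example `post_process(run_job(open_log_lines))` where `open_log_lines` is any iterable of log lines.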
What is shuffling and sorting in MapReduce?
What happens when a DataNode fails?
Why do we need MapReduce during Pig programming?
How to submit extra files (jars, static files) for a MapReduce job during runtime in Hadoop?
How to handle record boundaries in Text files or Sequence files in MapReduce InputSplits?
What is the difference between MapReduce and Spark?
How does InputSplit in MapReduce determine the record boundaries correctly?
What are the various InputFormats in Hadoop?
How to change the number of mappers running on a slave in MapReduce?
What do you understand by the term straggler?
Can we rename the output file?
What daemons run on a master node and on slave nodes?