how would you modify that solution to only count the number of unique words in all the documents?
No Answer is Posted For this Question
Be the First to Post Answer
Explain the hadoop configuration files at present?
How many instances of JobTracker can run on a Hadoop Cluser?
What is unstructured data?
Why do we use HDFS for applications having large data sets and not when there are lot of small files?
what should be the ideal replication factor in hadoop?
Define a combiner?
what is meaning Replication factor?
Which are the two types of 'writes' in HDFS?
What is IdentityMapper?
What is a “Distributed Cache” in Apache Hadoop?
How is security achieved in Apache Hadoop?
What is Safemode in Apache Hadoop?
Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS Hadoop Distributed File System (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)