Data Engineer
Given a list of followers in the format:
123, 345
234, 678
345, 123
…
Where column one is the ID of the follower and column two is the ID of the followee. Find all mutual following pairs (the pair 123, 345 in the example above). How would you use Map/Reduce to solve the problem when the list does not fit in memory?
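One possible approach, sketched here rather than given as a posted answer: have the mapper emit each edge under a key that is the unordered pair (smaller ID first), with the follower ID as the value. Both directions of a mutual relationship then land on the same key, and the reducer reports a key only when it sees two distinct follower IDs. The function names (`map_edge`, `reduce_pair`, `find_mutual_pairs`) are hypothetical, and the in-memory dictionary only stands in for the framework's shuffle phase, which on a real cluster sorts and groups by key on disk.

```python
from collections import defaultdict


def map_edge(follower, followee):
    """Map step: key on the unordered pair so both directions share a key."""
    key = (min(follower, followee), max(follower, followee))
    # The value records which side of the pair did the following.
    yield key, follower


def reduce_pair(key, followers):
    """Reduce step: a pair is mutual if both members appear as followers."""
    if len(set(followers)) == 2:
        yield key


def find_mutual_pairs(edges):
    """Local stand-in for the shuffle: group mapper output by key."""
    grouped = defaultdict(list)
    for follower, followee in edges:
        for key, value in map_edge(follower, followee):
            grouped[key].append(value)

    mutual = []
    for key in sorted(grouped):
        mutual.extend(reduce_pair(key, grouped[key]))
    return mutual


# Sample rows from the question.
edges = [(123, 345), (234, 678), (345, 123)]
mutual = find_mutual_pairs(edges)
print(mutual)
```

Because each key carries at most two distinct follower IDs, the reducer needs only constant memory per key, so the full list never has to fit in memory on any single node. Duplicate rows and self-follows are ignored by the distinct-value check.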
What are the most commonly defined input formats in Hadoop?
What are the network requirements for using Hadoop?
What should be considered when planning hardware for the master node in a Hadoop architecture?
What is the distributed cache in Hadoop?
What are the additional benefits YARN brings in to Hadoop?
Who is a 'user' in HDFS?
What factors are considered in determining the block size before creation?
What do you know about nlineoutputformat?
Name the various types of lists supported by bootstrap.
What is formatting of the DFS?
Which InputFormat is the default in Hadoop?
What are the two main components of ResourceManager?