Data Engineer
Given a list of followers in the format:
123, 345
234, 678
345, 123
…
where column one is the ID of the follower and column two is the ID of the followee. Find all mutual following pairs (the pair 123, 345 in the example above). How would you use Map/Reduce to solve the problem when the list does not fit in memory?
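One possible approach to the mutual-follow question, sketched in plain Python rather than on a real Hadoop cluster: the mapper keys each edge by its unordered pair of IDs, and the reducer reports a pair only when it has seen both directions. The function names (`map_edge`, `reduce_pair`, `mutual_pairs`) are hypothetical, and the in-memory dictionary only stands in for the shuffle phase. In an actual Map/Reduce job the framework would do that grouping on disk across nodes, which is what lets the input exceed memory.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

Edge = Tuple[int, int]  # (follower_id, followee_id)


def map_edge(follower: int, followee: int) -> Iterator[Tuple[Edge, Edge]]:
    """Mapper: key each edge by its unordered pair, keep the direction as the value."""
    key = (min(follower, followee), max(follower, followee))
    yield key, (follower, followee)


def reduce_pair(key: Edge, directions: Iterable[Edge]) -> Iterator[Edge]:
    """Reducer: a pair is mutual only if both directions were seen for this key."""
    if len(set(directions)) == 2:
        yield key


def mutual_pairs(edges: Iterable[Edge]) -> List[Edge]:
    """Local stand-in for a job run: map, group by key (the 'shuffle'), then reduce."""
    groups: Dict[Edge, List[Edge]] = defaultdict(list)
    for follower, followee in edges:
        for key, value in map_edge(follower, followee):
            groups[key].append(value)

    result: List[Edge] = []
    for key in sorted(groups):
        result.extend(reduce_pair(key, groups[key]))
    return result


if __name__ == "__main__":
    sample = [(123, 345), (234, 678), (345, 123)]
    print(mutual_pairs(sample))
```

Because every key receives at most two distinct values, each reducer call needs only constant memory. Duplicate edges and self-follows are not reported as mutual pairs in this sketch.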
How would you use Map/Reduce to split a very large graph into smaller pieces and parallelize the computation of edges when the data changes quickly and dynamically?
What is a 'key-value pair' in HDFS?
What is a Partitioner in Hadoop? Where does it run?
What is the meaning of speculative execution in Hadoop? Why is it important?
What are the steps to submit a Hadoop job?
What is the functionality of the JobTracker in Hadoop? How many instances of a JobTracker run on a Hadoop cluster?
What do you know about SequenceFileInputFormat?
Define a JobTracker.
Explain what happens if rack 2 and a DataNode fail.
Explain the features of fully distributed mode.
What are watches?
What is the port number for the NameNode?
Can Hadoop be compared to a NoSQL database like Cassandra?
Explain how Hadoop is different from other data processing tools.
What does ‘jps’ command do?
Explain the hadoop-core configuration.