Data Engineer: Given a list of followers in the format:
123, 345
234, 678
345, 123
…
where column one is the ID of the follower and column two is the ID of the followee, find all mutual following pairs (the pair 123, 345 in the example above). How would you use Map/Reduce to solve the problem when the list does not fit in memory?
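One possible approach to the mutual-follow question can be sketched in plain Python. This is a local simulation under stated assumptions, not a definitive answer: the function names (map_edge, reduce_pair, mutual_pairs) are hypothetical, and the in-memory dictionary only stands in for the shuffle phase that a real Map/Reduce framework would perform on disk across machines. The idea is to key every edge by its unordered pair, so that both directions of a relationship arrive at the same reducer.

```python
from collections import defaultdict
from typing import Iterable, Iterator

Edge = tuple[int, int]


def map_edge(follower: int, followee: int) -> tuple[Edge, int]:
    """Map step: key the edge by its unordered pair; the value is the follower."""
    key = (min(follower, followee), max(follower, followee))
    return key, follower


def reduce_pair(key: Edge, followers: Iterable[int]) -> Iterator[Edge]:
    """Reduce step: emit the pair only if both members appear as followers."""
    # Self-follows (a, a) are ignored; duplicates of one direction do not count.
    if key[0] != key[1] and set(followers) == set(key):
        yield key


def mutual_pairs(edges: Iterable[Edge]) -> list[Edge]:
    """Local stand-in for the shuffle: group mapped values by key, then reduce."""
    groups: dict[Edge, list[int]] = defaultdict(list)
    for follower, followee in edges:
        key, value = map_edge(follower, followee)
        groups[key].append(value)
    result: list[Edge] = []
    for key in sorted(groups):
        result.extend(reduce_pair(key, groups[key]))
    return result


# Sample input taken from the question.
edges = [(123, 345), (234, 678), (345, 123)]
print(mutual_pairs(edges))  # [(123, 345)]
```

In an actual job the grouping is done by the framework's sort-and-shuffle, so each reducer sees at most a handful of values per key and the full list never has to fit in memory on one machine.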
How would you use Map/Reduce to split a very large graph into smaller pieces and parallelize the computation of edges when the data changes rapidly?
Apache Hadoop Questions
What is a JobTracker?
What are the different hdfs dfs shell commands for performing copy operations?
How can we change the split size if our commodity hardware has limited storage space?
What is Cloudera, and why is it used?
On what concept does the Hadoop framework work?
What is HDFS Federation?
Why is it named ‘Hadoop’?
What is an InputSplit in Hadoop?
What will you do when the NameNode is down?
What are the modules that constitute the Apache Hadoop 2.0 framework?
Can we have multiple entries in the masters file?
What are the limitations of importing RDBMS tables directly into HCatalog?
What does /var/hadoop/pids do?