Which machine learning algorithms does Apache Mahout support?
What is the difference between Kafka and Flume?
What is Grunt shell?
What operations does an RDD support?
What language is Apache Spark written in?
Name some complex data types that Avro supports.
What happens on the NameNode when a client tries to read a data file?
What does the mapred.job.tracker property do?
Is it possible to provide multiple inputs to Hadoop? If yes, how can you give multiple directories as input to a Hadoop job?
Data Engineer: Given a list of followers in the format (one pair per row): 123, 345 / 234, 678 / 345, 123 / … where column one is the ID of the follower and column two is the ID of the followee, find all mutual following pairs (the pair 123, 345 in the example above). How would you use Map/Reduce to solve the problem when the list does not fit in memory?
What is the Secondary NameNode?
Specify the different methods of Hive.
How to write a custom partitioner for a Hadoop MapReduce job?
What is salting in Spark?
What is the traditional method of message transfer?
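
For the mutual-following question in the list above, one possible approach is sketched below in plain Python. It is only an illustration, not a definitive answer: the function names (map_edge, reduce_pair, run) are hypothetical, and the in-memory dictionary merely stands in for the shuffle phase, which a real Hadoop job performs on disk across the cluster so the full list never has to fit in memory. The idea is that the mapper keys each edge by the unordered pair of IDs, so both directions of a relationship reach the same reducer, which emits the pair only if it sees both directions.

```python
from collections import defaultdict


def map_edge(line):
    """Map step: emit the unordered ID pair as key, the directed edge as value."""
    follower, followee = (int(x) for x in line.split(","))
    key = (min(follower, followee), max(follower, followee))
    yield key, (follower, followee)


def reduce_pair(key, edges):
    """Reduce step: a pair is mutual only if both directions were seen."""
    if len(set(edges)) == 2:
        yield key


def run(lines):
    """Simulate map, shuffle (group by key), and reduce on a small input."""
    groups = defaultdict(list)  # stand-in for the framework's shuffle
    for line in lines:
        for key, value in map_edge(line):
            groups[key].append(value)
    result = []
    for key in sorted(groups):
        result.extend(reduce_pair(key, groups[key]))
    return result


sample = ["123, 345", "234, 678", "345, 123"]
print(run(sample))
```

Because each reducer key receives at most two distinct edges, the per-key work stays constant regardless of how large the follower list grows; duplicate rows are collapsed by the set before counting.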