What are the elements of Kafka?
When is it not recommended to use the MapReduce paradigm for large
What do you mean by schema on read?
Data Engineer: Given a list of followers in the format "123, 345; 234, 678; 345, 123; …", where column one is the ID of the follower and column two is the ID of the followee, find all mutual following pairs (the pair 123, 345 in the example above). How would you use Map/Reduce to solve the problem when the list does not fit in memory?
Can you define RDD lineage?
What is PageRank in Spark?
What is the Apache Spark machine learning library?
Explain the machine learning library in Spark.
Explain the different execution modes available in Pig.
What is LazyOutputFormat in MapReduce?
What are ‘reduces’?
What is a scarce system resource?
What is Spark and what is its purpose?
Why would NoSQL be better than using a SQL database? And how much better is it?
Why does Hadoop perform replication, even though it results in data redundancy?
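For the mutual-following question above, one possible approach is sketched below. It is only an in-memory simulation of the map, shuffle, and reduce steps, not code for a real Hadoop job. It assumes the input rows are (follower, followee) pairs such as (123, 345), (234, 678), (345, 123), and the function names `map_phase`, `shuffle`, `reduce_phase`, and `mutual_pairs` are hypothetical. The idea is that the mapper emits each edge under an order-independent key, so both directions of a relationship land in the same reducer, and the reducer keeps a key only when it has seen both directions. On a cluster the shuffle would be done by the framework, so no single machine needs to hold the whole list.

```python
from collections import defaultdict


def map_phase(edges):
    """Emit each edge under a canonical (smaller_id, larger_id) key."""
    for follower, followee in edges:
        key = (min(follower, followee), max(follower, followee))
        # The value records the direction of this particular edge.
        yield key, (follower, followee)


def shuffle(pairs):
    """Group values by key; a real framework does this between map and reduce."""
    groups = defaultdict(set)  # a set drops duplicate rows for the same direction
    for key, value in pairs:
        groups[key].add(value)
    return groups


def reduce_phase(groups):
    """Keep only the keys for which both directions were seen."""
    for key, directions in groups.items():
        if len(directions) == 2:
            yield key


def mutual_pairs(edges):
    return sorted(reduce_phase(shuffle(map_phase(edges))))


if __name__ == "__main__":
    # Assumed reading of the example rows in the question.
    edges = [(123, 345), (234, 678), (345, 123)]
    print(mutual_pairs(edges))  # [(123, 345)]
```

Because each reducer only needs the values for one key, and there are at most two distinct directions per key, the reduce step uses constant memory per key regardless of the size of the full list.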