Which Spark library allows reliable file sharing at memory speed across different cluster frameworks?
Given a list of followers in the format:
123, 345
234, 678
345, 123
…
where column one is the ID of the follower and column two is the ID of the followee, find all mutual following pairs (the pair 123, 345 in the example above). How would you use Map/Reduce to solve the problem when the list does not fit in memory?
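One possible approach to the mutual-follower question, sketched here in plain Python rather than on a real Map/Reduce framework: the map step keys each edge by its unordered pair of IDs, and the reduce step keeps a pair only when both members appear as followers. The function names (`map_edge`, `shuffle`, `reduce_pair`, `mutual_pairs`) are hypothetical, and the in-memory `shuffle` is only a stand-in for the framework's disk-backed sort and partition phase, which is what lets the real job handle input that does not fit in memory.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Set, Tuple

Edge = Tuple[int, int]   # (follower_id, followee_id)
Pair = Tuple[int, int]   # unordered pair, stored as (smaller_id, larger_id)


def map_edge(edge: Edge) -> Iterator[Tuple[Pair, int]]:
    """Map step: emit (unordered pair, follower_id) for one input row."""
    follower, followee = edge
    if follower == followee:
        return  # a self-follow can never form a mutual pair
    key = (min(follower, followee), max(follower, followee))
    yield key, follower


def shuffle(mapped: Iterable[Tuple[Pair, int]]) -> Dict[Pair, Set[int]]:
    """Stand-in for the framework's shuffle: group values by key.

    Assumption: a real framework does this with an external sort, so no
    single machine has to hold the whole edge list.
    """
    groups: Dict[Pair, Set[int]] = defaultdict(set)
    for key, follower in mapped:
        groups[key].add(follower)  # a set ignores duplicate rows
    return groups


def reduce_pair(key: Pair, followers: Set[int]) -> Iterator[Pair]:
    """Reduce step: the pair is mutual if both IDs appear as followers."""
    if len(followers) == 2:
        yield key


def mutual_pairs(edges: Iterable[Edge]) -> List[Pair]:
    """Run map, shuffle, and reduce over the edges; return sorted pairs."""
    mapped = (kv for edge in edges for kv in map_edge(edge))
    groups = shuffle(mapped)
    return sorted(
        pair
        for key, followers in groups.items()
        for pair in reduce_pair(key, followers)
    )


if __name__ == "__main__":
    rows = [(123, 345), (234, 678), (345, 123)]
    print(mutual_pairs(rows))
```

Because each reducer only ever sees the values for one key (at most two distinct follower IDs), the per-key memory needed is constant regardless of the total input size.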
What is a column family in Cassandra?
What are "coordinator nodes" in Cassandra?
What are the different ways of representing data in Spark?
What is the history of the Cassandra database?
What are the benefits of block transfer?
What do you understand by a super column in Cassandra?
Why big data?
What features do Pig and Hive have in common?
Can you tell us more about SSH?
In how many ways can we create an RDD?
Is there any difference between FileSink and FileRollSink?
How can a user create a Keyspace in Cassandra?
Is Spark a language?
What is a Kafka producer?
What is the difference between a split and a block in Hadoop?