Illustrate a simple example of the working of MapReduce.
What is the difference between Python and Spark?
Explain what combiners are and when you should use a combiner in a MapReduce job.
What is anti-entropy and how is it associated with a Merkle tree?
What is the driver program in Spark?
What is the relation between a job and a task in Hadoop?
What is the difference between Hive CLI and Beeline?
If I create a folder in HDFS, will there be metadata created corresponding to the folder? If yes, what will be the size of metadata created for a directory?
Can Hadoop be compared to a NoSQL database like Cassandra?
What are the features of HCatalog?
Explain the reduceByKey() operation in Spark.
Where does the data of a Hive table get stored?
What does RDD mean?
What happens when Hadoop spawns 50 tasks for a job and one of the tasks fails?
What is vectorized query execution?