What is Apache Spark in big data?
Does Impala use caching?
Explain the Spark coalesce() operation.
What does the conf class do?
What are the port numbers of the JobTracker?
How would you calculate the number of unique visitors for each hour by mining a huge Apache log? You can post-process the output of the MapReduce job.
Why is block size large in Hadoop?
Compare Pig, Hive, and Hadoop MapReduce.
What is the latest version of Spark?
Explain the features of fully distributed mode.
How does the Apache Spark engine work?
What are internal and external tables in Hive?
Does Spark need Hadoop?
What is the difference between Spark ML and Spark MLlib?
Where do you specify the Mapper implementation?