How can you set an arbitrary number of Reducers to be created for a job in Hadoop?
Why is Spark RDD immutable?
Explain the different logging levels in Cassandra.
What are the various types of shared variables in Apache Spark?
How does Cassandra store data?
What are accumulators in Spark?
How do users interact with the shell in Apache Pig?
What is a worker node in an Apache Spark cluster?
What is the use of flatMap in Spark?
What is the function of the UNION and SPLIT operators? Give examples.
What is a local repository and when will you use it?
Why is the DataNode block size in HDFS 64 MB?
What is Streaming / Log Data?
What role does ZooKeeper play in a Kafka cluster?
What is the difference between Flume and Kafka?