What are the various levels of persistence in Apache Spark?
What are the different types of NoSQL databases?
Why is replication required in Kafka?
How do ‘map’ and ‘reduce’ work?
Explain the different execution modes available in Pig.
Explain mapPartitions() and mapPartitionsWithIndex().
What is the command to change the replication factor?
What is a 'key-value pair' in HDFS?
What do the master class and the output class do?
Explain how we can change the split size if our commodity hardware has limited storage space.
What is Apache Spark and what are the benefits of Spark over MapReduce?
Does 'ILLUSTRATE' run an MR job?
What are the data components used by Hadoop?
What is a DataFrame in Spark?
What is Spark SQL?