What features of RDD make it an important abstraction in Spark?
How do you identify whether a given operation in your program is a transformation or an action?
What does the mapred.job.tracker property do?
What is the functionality of the Query Processor in Apache Hive?
Explain the distinct(), union(), intersection(), and subtract() transformations in Spark.
What are “Seed Nodes” in Cassandra?
How can you start the Kafka server?
How do you add a partition to an existing table that was not created as a partitioned table?
How does groupByKey work in Spark?
What are the similarities and differences between Apache Flume and Apache Kafka?
What is the Grunt shell?
Why do I have to use REFRESH and INVALIDATE METADATA, and what do they do?
Why is jconsole used? What are its different elements?
When is a large data set maintained?
Elaborate on identifiers.