How does HBase use ZooKeeper?
What is the importance of the --split-by clause in running parallel import tasks in Sqoop?
What is Flume used for?
Why do we need MapReduce during Pig programming?
What is logging in Cassandra?
What is the Sqoop command to see the contents of the job named myjob?
What does SerDes mean in Apache Kafka?
What is the default replication factor in Hadoop, and how do you change it?
Why do we use persist() on the links RDD?
What are some of the best tools for data analysis?
Can you explain broadcast variables?
How does the gossip protocol work?
What are shuffle read and shuffle write in Spark?
What do you understand by Cassandra CQL collections?
What are the 3 core dimensions of big data?