What is Spark deploy mode?
Explain the terms Spark partitions and partitioners.
Explain how HBase actually deletes a row.
Can you explain what a recommendation engine is?
What are the three components of a Cassandra write?
What is a map-side join?
Is Kafka an ETL tool?
Hadoop achieves parallelism by dividing tasks across many nodes, so a few slow nodes can rate-limit the rest of the program and slow it down. What mechanism does Hadoop provide to combat this?
How is data or a file written into Hadoop HDFS?
Is Hadoop a memory?
Explain the Catalyst framework.
What do you mean by tunable consistency?
What is the role of ZooKeeper in Kafka?
What is the problem with small files in Hadoop?
What is an RDD partition?