Hadoop achieves parallelism by dividing tasks across many nodes, so a few slow nodes can rate-limit the rest of the program and slow it down. What mechanism does Hadoop provide to combat this?
Is it possible to provide multiple inputs to Hadoop? If so, how can you give multiple directories as input to a Hadoop job?
What is the HAVING clause in Apache Tajo?
What are Kafka logs?
What is the heartbeat used for?
How can you store data in Spark?
What are the uses of the different consistency levels for write operations?
Why are replications critical in Kafka?
What are the ways of creating an RDD in Apache Spark?
Can multiple users share the same metastore when Hive runs in embedded mode?
Why is block size large in Hadoop?
What is the meaning of speculative execution in Hadoop? Why is it important?
Is Hadoop open source?
Explain the flatMap() transformation in Apache Spark.
What is the JobTracker in Hadoop?
What is Client API?
How do I achieve FIFO behavior with Kafka?