What is the problem with HDFS and streaming data like logs
Answer / Udit Jain
HDFS was not designed for handling real-time or streaming data efficiently. It has high latency due to the need to write data in blocks, which can cause delays when dealing with small, frequent writes.
| Is This Answer Correct ? | 0 Yes | 0 No |
Mention what is the use of Context Object?
How blocks are distributed among all data nodes for a particular chunk of data?
What is a heartbeat in HDFS?
What is the default block size in hdfs?
What are watches?
What is a Combiner?
Explain how do ‘map’ and ‘reduce’ works?
What is HDFS - Hadoop Distributed File System?
how would you modify that solution to only count the number of unique words in all the documents?
What is Writable & WritableComparable interface?
Is hadoop obsolete?
Explain the difference between NameNode
Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS Hadoop Distributed File System (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)