What is the problem with small files in Apache Hadoop?




Answer / Gaurav Nidhar

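The small files problem in Apache Hadoop is mainly a metadata problem. The NameNode keeps the metadata for every file, directory, and block in memory, so its heap usage grows with the number of objects in the namespace rather than with the amount of data stored. Millions of small files can therefore exhaust NameNode memory and slow down namespace operations, even when the cluster has plenty of free disk space. A file smaller than the HDFS block size does not occupy a full block on disk, so raw storage is not what gets wasted. The second cost is in processing: MapReduce normally creates at least one input split, and therefore one map task, per file, so a job over many small files spends much of its time on task startup and disk seeks rather than on reading data.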
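To give a rough sense of scale, the sketch below compares the NameNode memory needed to track the same amount of data stored as small files versus block-sized files. It assumes the commonly quoted rule of thumb of about 150 bytes of NameNode heap per namespace object (one object per file plus one per block) and a 128 MiB block size. Both figures are assumptions for illustration, and the function names are hypothetical, not part of any Hadoop API.

```python
# Back-of-the-envelope estimate of NameNode heap used by file metadata.
# ASSUMPTION: ~150 bytes per namespace object (file or block); this is a
# widely quoted rule of thumb, not an exact or version-specific figure.
BYTES_PER_OBJECT = 150
DEFAULT_BLOCK_SIZE = 128 * 1024**2  # assumed 128 MiB HDFS block size


def ceil_div(a, b):
    """Integer ceiling division, avoiding floating-point rounding."""
    return -(-a // b)


def namenode_objects(total_bytes, file_size, block_size=DEFAULT_BLOCK_SIZE):
    """Return (files, blocks) needed to hold total_bytes as files of file_size."""
    files = ceil_div(total_bytes, file_size)
    blocks_per_file = max(1, ceil_div(file_size, block_size))
    return files, files * blocks_per_file


def estimated_heap_bytes(total_bytes, file_size, block_size=DEFAULT_BLOCK_SIZE):
    """Estimate NameNode heap (bytes) for the file and block objects only."""
    files, blocks = namenode_objects(total_bytes, file_size, block_size)
    return (files + blocks) * BYTES_PER_OBJECT


if __name__ == "__main__":
    one_tib = 1024**4
    small = estimated_heap_bytes(one_tib, 1024**2)        # 1 MiB files
    large = estimated_heap_bytes(one_tib, 128 * 1024**2)  # 128 MiB files
    print(f"1 TiB as 1 MiB files:   {small / 1024**2:.1f} MiB of heap")
    print(f"1 TiB as 128 MiB files: {large / 1024**2:.1f} MiB of heap")
    print(f"ratio: {small // large}x")
```

Under these assumptions the same 1 TiB of data needs roughly 128 times more NameNode memory when stored as 1 MiB files than when stored as 128 MiB files, even though the disk usage is identical.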

