Mention what are the data components used by Hadoop?
Answer / Maneesh Jain
Hadoop primarily works with two types of data components: HDFS (Hadoop Distributed File System) is used for storing large volumes of data in a distributed and fault-tolerant manner. MapReduce is the processing engine that processes the data stored in HDFS.
| Is This Answer Correct ? | 0 Yes | 0 No |
Explain what happens in textinformat ?
What is difference between split and block in hadoop?
Data Engineer Given a list of followers in the format:123, 345234, 678345, 123…Where column one is the ID of the follower and column two is the ID of the followee. Find all mutual following pairs (the pair 123, 345 in the example above). How would you use Map/Reduce to solve the problem when the list does not fit in memory?
Can you explain how do ‘map’ and ‘reduce’ work?
Where do you specify the Mapper Implementation?
What is the best hardware configuration to run Hadoop?
How to change replication factor of files already stored in HDFS?
Which command is used for the retrieval of the status of daemons running the hadoop cluster?
Explain Erasure Coding in Apache Hadoop?
what factors the block size takes before creation?
Explain the use of .mecia class?
What is the difference between rdbms and hadoop?
Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS Hadoop Distributed File System (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)