What is Apache Presto?
What is meant by a columnar storage format?
What is a rack?
What happens on the NameNode when a client tries to read a data file?
What is the maximum size of a message that Kafka can receive?
What does the producer API do in Kafka?
Explain Hadoop Streaming.
What is MapReduce in Hadoop?
Why do we need compression, and which compression formats are supported?
When should you choose an internal table versus an external table in Hive?
Explain how Hadoop differs from other data processing tools.
What is the difference between Spark ML and Spark MLlib?
Explain what a TaskTracker is in Hadoop.
What are the configuration files in Hadoop?
What is the use of checkpoints in Spark?