Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
What is the full form of MSLAB?
What are the features of Apache Spark?
How can Flume be used with HBase?
What is the difference between reduceByKey and groupByKey?
Replication causes data redundancy, so why is it used in HDFS?
Why is BlinkDB used?
How can you stop a partition from being queried?
What are the components of the Cassandra data model?
Does Flume provide 100% reliability to the data flow?
What are "map" and "reducer" in Hadoop?
What happens if a block in Hadoop HDFS is corrupted?
How are tables managed in Apache Tajo?
What is the replication factor?
What are Avro schemas?
What features of RDD make it an important abstraction in Spark?