Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
What file formats does Hive support for storage?
What is log compaction?
What are the modes in which Apache Hadoop can run?
How can you improve the throughput of a remote consumer?
What does the reduce action do?
What is the replica placement strategy in Cassandra?
What are some highlights of Ambari?
What is the role of the “ambari-qa” user?
How does Spark work?
Is it possible to search for files using wildcards?
What are the NameNode and DataNode in HDFS?
What is a Partitioner in Hadoop? Where does it run?
Is it possible to iterate through the rows of an HBase table in reverse order?
How do you parse data in XML? Which kind of class do you use with Java to parse the data?
What is a DStream in Apache Spark?