Why does HDFS store data on commodity hardware despite the higher chance of failures?
Give examples of the SerDe classes that Hive uses to serialize and deserialize data.
List the network requirements for using Hadoop.
What are the most commonly used commands in Sqoop?
What are some of the characteristics of the Hadoop framework?
What is Spark SQL?
What are compute and storage nodes?
Have you ever run into a lopsided job that resulted in an out-of-memory error? If yes, how did you handle it?
What is a DAG (directed acyclic graph)?
Differentiate between HDFS and HBase.
What are the management tools in Cassandra?
What is the difference between HBase and HDFS?
How can I improve my Spark performance?
When we send data to a node, do we allow settling time before sending more data to that node?
What is a SuperColumn in Cassandra?