Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
What are the different Eval functions available in Pig?
What are the benefits of DataFrames in Spark?
What is a 'block' in HDFS?
The web UI shows that half of the DataNodes are in decommissioning mode. What does that mean? Is it safe to remove those nodes from the network?
How do I check my Spark status?
How does the JobTracker schedule a task?
What is the significance of Cassandra at Facebook?
What is the difference between Gen1 and Gen2 Hadoop with regard to the NameNode?
If the number of DataNodes increases, do we need to upgrade the NameNode in Hadoop?
What is the data log in Kafka?
What is the difference between an RDD and a DataFrame?
What are "coordinator nodes" in Cassandra?
With an embedded Hive metastore, can multiple users share the same metastore?
What are some basic Tajo shell commands?
What is an offset?