Can you explain Hadoop Streaming?
How would you calculate the number of unique visitors for each hour by mining a huge Apache log? You can post-process the output of the MapReduce job.
What is a table-generating function in Hive?
What is the significance of the 'IF EXISTS' clause when dropping a table?
What is Fault Tolerance in Hadoop HDFS?
What is the difference between Hadoop and HDFS?
Does HDFS go wrong? If so, how?
Where can I find Impala documentation?
Is there an API for implementing graphs in Spark?
What is the Job interface in the MapReduce framework?
In which language is the Ambari Shell developed?
How do you set up the local repository manually?
Which method is used to access HFile directly without using HBase?
What is the difference between HDFS and HBase?
What is the difference between Python and Spark?