Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
What does the JobConf class do?
What are the port numbers of the JobTracker?
What is Kafka technology?
Why are nodes frequently added to or removed from a Hadoop cluster?
A partition of a Hive table has been modified to point to a new directory location. Do I have to move the data to the new location, or will the data be moved there automatically?
On what basis can you differentiate RDD, DataFrame, and Dataset?
What is Apache Spark? What is the reason behind the evolution of this framework?
What are the functions of "Spark Core"?
What are the different ways you can use to secure a cluster using Ambari?
What do you understand by MapReduce?
What is Spark SQL?
What does NameNode mean in Hadoop?
Which channel type is faster in Flume?
What is the difference between an input split and an HDFS block?
Who uses Cassandra?