Why are nodes removed and added frequently in a Hadoop cluster?
What query language does the Cassandra database use?
What is a “Distributed Cache” in Apache Hadoop?
What are the differences between Hive and Spark SQL?
Is the HDFS block size reduced to achieve faster query results?
What is a combiner, and where should you use it?
What is MLlib in Apache Spark?
What is YARN in Hadoop?
What data sources are available in Spark SQL?
What are the various input and output types supported by MapReduce?
Are the NameNode and JobTracker on the same host?
What are the purposes of using Ambari shell?
What is the role of ZooKeeper?
What are the complex data types in Pig?
What is the use of the context object?