Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
What is the difference between Flume and Sqoop?
Explain the sortByKey() operation.
What is the function of the CONSISTENCY cqlsh command in Cassandra?
What is a Spark application?
What is off-heap memory in Spark?
Explain the different execution modes available in Pig.
Explain the core components of Flume.
How many types of Ambari repositories are available?
What is the difference between a node, a cluster, and a data centre?
Are there any problems that can be solved only by MapReduce and not by Pig? In what kinds of scenarios are MR jobs more useful than Pig?
Do we need to install Spark on all nodes?
How would you modify that solution to count only the number of unique words in all the documents?
What is Cassandra Query Language?
Do we need to install Scala for Spark?
What are the languages supported by Apache Spark, and which is the most popular one?