Is Apache Spark a database?
What does the map transformation do? Provide an example.
What are the different input sources for Spark Streaming?
How is Hadoop different from other data processing tools?
What are the important modes of Hadoop?
What are the active and passive NameNodes in HDFS?
What is an "Accumulator"?
What is ZooKeeper?
What do you understand by receivers in Spark Streaming?
Define speculative execution.
Which one is the master node in HDFS? Can it be commodity hardware?
What is an SSTable? How is it different from a relational table?
What do you understand by high availability?
Explain the difference between a node, a cluster, and a data center in Cassandra.
What is the difference between Reducer and Combiner in Hadoop MapReduce?
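
As a rough illustration of the last question, here is a minimal pure-Python sketch of the combiner idea. It does not use Hadoop itself. The names `mapper`, `combine`, `reduce_all`, and the sample `splits` are hypothetical and chosen only for this example. The sketch assumes a word-count job, where the aggregation is associative and commutative, so pre-aggregating each mapper's output locally leaves the final totals unchanged while shrinking the number of pairs that would be sent to the reducer.

```python
from collections import Counter
from itertools import chain


def mapper(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def combine(pairs):
    """Combiner stand-in: pre-aggregate the output of one mapper locally."""
    local = Counter()
    for word, count in pairs:
        local[word] += count
    return list(local.items())


def reduce_all(pairs):
    """Reducer stand-in: merge pairs from all mappers into final totals."""
    totals = Counter()
    for word, count in pairs:
        totals[word] += count
    return dict(totals)


# Two hypothetical input splits, each processed by its own mapper.
splits = [["a b a", "b a"], ["b b c"]]

# Map phase: one flat list of (word, 1) pairs per split.
mapped = [list(chain.from_iterable(mapper(line) for line in split))
          for split in splits]

# Optional combine phase: one partial count per word per split.
combined = [combine(pairs) for pairs in mapped]

# Reduce phase, with and without the combiner in front of it.
without_combiner = reduce_all(chain.from_iterable(mapped))
with_combiner = reduce_all(chain.from_iterable(combined))

# Number of pairs that would cross the network in each case.
shuffled_without = sum(len(pairs) for pairs in mapped)
shuffled_with = sum(len(pairs) for pairs in combined)

print(with_combiner, shuffled_without, shuffled_with)
```

Both paths produce the same totals, but the combined path hands fewer pairs to the reducer. In Hadoop the combiner is an optional optimization that the framework may run zero or more times, which is why it is only safe for operations whose result does not depend on how the partial aggregation is grouped.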