Why does a Mapper run in a heavyweight process and not in a thread in MapReduce?
Is Spark an ETL tool?
What is the advantage of Hadoop over Java serialization?
What exactly is Apache Spark?
What is a Spark context?
How does Flume work for log collection?
How does the MapReduce framework view its input internally?
Explain transformations in RDDs. How is lazy evaluation helpful in reducing the complexity of the system?
How can Spark be connected to Apache Mesos?
How can Hive avoid MapReduce?
What are the prime features of Apache ZooKeeper?
Give a brief overview of Hadoop's history.
Briefly explain what an action is in Apache Spark. How is the final result generated using an action?
In Hadoop, what is an InputSplit?
What are the components of a Hive query processor?