What is the master node in Spark?
Explain the cogroup() operation in Spark.
Why is Spark so fast?
Why do we need SparkContext?
What tools are needed, or helpful, to build Ambari?
Define metadata.
What is MLlib?
How do users interact with HDFS in Apache Pig?
Why is HBase a schema-less database?
What is KeyValueTextInputFormat?
What is the difference between cache and persist in Spark?
List Hadoop's three configuration files.
What are the ways to run Spark over Hadoop?
What are the different compaction types in HBase?
Is Apache Kafka a distributed streaming platform? If yes, what can you do with it?