Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
How do you identify whether a given operation is a transformation or an action?
What is the ZooKeeper ensemble?
What is Spark deploy mode?
Do we need to give a password even if the key has been added in SSH?
Replication causes data redundancy and consumes a lot of space, so why is it pursued in HDFS?
How is the read operation performed on a Cassandra node?
What type of data can Hadoop handle?
What is the MapReduce algorithm?
What is the primary objective of NoSQL databases?
How do you check if a particular partition exists?
What kind of data warehouse application is suitable for Hive? What are the types of tables in Hive?
Why is Spark so fast?
What are the basic steps for writing a UDF in Pig?
What is a TaskInstance?
What are the ways to run Spark over Hadoop?