Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
When we create an RDD, does it bring the data and load it into memory?
What is Rack Awareness? Why is it needed in Hadoop?
How to access HDFS?
Explain what a TaskTracker is in Hadoop.
When we write a = load …, what is 'a' called?
What does RDD stand for?
Is there a distributed machine learning framework on top of Spark?
What are the possible types of channel selectors?
What is the difference between HDFS and HBase?
Why is Scala used in Spark?
What is closing out ledgers?
What are the data components used by Hadoop?
What are the design goals of ZooKeeper?
What is Azure Spark?
How to optimize a Hadoop MapReduce job?