Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
Explain HDFS.
How does an RDD persist data?
When to use explode in Hive?
What do you mean by replication strategy?
What is the difference between Spark ML and Spark MLlib?
What do you understand by standalone (or local) mode?
What guarantees does Kafka provide?
Name the different types of NoSQL databases.
What platform and Java version are required to run Hadoop?
What is an "RDD Lineage"?
Can we deploy the JobTracker on a node other than the NameNode?
Why would NoSQL be better than a SQL database? And how much better is it?
Can you explain logistic regression?
Explain the use of .mecia class?
Is Apache Spark a framework?