Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
Does Spark store data?
What is the problem with having lots of small files in HDFS?
What is a DataFrame in Spark?
What is the heartbeat used for?
What are "coordinator nodes" in Cassandra?
What is the Reducer used for?
What is a keyspace in Cassandra?
What do you understand by a pair RDD?
Explain the distinct(), union(), intersection(), and subtract() transformations in Spark.
What are the differences between Hadoop 1 and Hadoop 2?
Can you explain Apache Ambari?
On what basis does the NameNode distribute blocks across the DataNodes?
What is SparkContext?
What are the various data sources available in Spark SQL?
How can one set a space quota on a Hadoop (HDFS) directory?