Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
MapReduce jobs take too long. What can be done to improve the performance of the cluster?
What is a "Spark Executor"?
Why does HDFS perform replication, even though it results in data redundancy?
What are the ways of creating an RDD in Apache Spark?
What do you understand by MapReduce?
What are the three types of tombstone markers in HBase?
What happens when a DataNode fails?
Can I do transforms or add new functionality?
What do you mean by speculative execution in Apache Spark?
Explain the process of spilling in MapReduce.
What is the UPPER or UCASE function in Hive? Give an example.
What is the importance of Java in Apache Kafka?
What is an InputSplit in MapReduce?
What is the port number for the NameNode?
What is a YAML file in Cassandra?