Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
Explain how Hadoop is different from other data processing tools.
Assume that an HBase table Student is disabled. How can the Student table be accessed with the Scan command once it is disabled?
Explain reliability and failure handling in Apache Flume.
Is there an easy way to expire a session for testing?
What is a Combiner?
What is the sequence file input format?
What is the difference between a Dataset and a DataFrame?
What is MLlib in Apache Spark?
Explain the general MapReduce algorithm.
What is DistributedCache and its purpose?
When should you use Hadoop, HBase, Hive and Pig?
How many partitions are created by default in Apache Spark RDD?
Can you give a detailed overview about the Big Data being generated by Facebook?
What are the functions of Spark Core?
In MapReduce, how do you change the name of the output file from part-r-00000?