Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
What is a block in Hadoop HDFS? What block size gives optimum performance on a Hadoop cluster?
Explain the sequence of execution of all the MapReduce components, such as map, reduce, RecordReader, split, combiner, partitioner, sort, and shuffle.
When should we use SORT BY instead of ORDER BY?
What are the key differences between Pig and MapReduce?
Can we deploy the JobTracker on a node other than the NameNode?
What are the types of transformation in RDD in Apache Spark?
Are results returned as they become available, or all at once when a query completes?
Can you override Hadoop MapReduce configuration in Hive?
Can you list a few commonly used Hive services?
What is the maximum size of a message that a Kafka server can receive?
What is Apache Spark SQL?
How do you create a managed table in Hive?
How do you specify more than one directory as input in a Hadoop MapReduce program?
What are the actions followed by Hadoop?
What are the problems with Hadoop 1.0?
What are buckets in Hive?
What is the HAVING clause in Apache Tajo?