Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
What are all the stats classes in the org.apache.pig.tools.pigstats package?
What is Apache HCatalog?
What does the rack awareness algorithm mean, and why is it used in Hadoop?
When using fields grouping in Storm, is there any timeout or limit on the known field values?
How can a multi-hop agent be set up in Flume?
What is a Kafka cluster?
What is the difference between HiveServer1 and HiveServer2?
When the NameNode is down, what happens to the JobTracker?
What are the most common input formats defined in Hadoop?
Explain the common input formats in Hadoop.
Explain the use of the TaskTracker in a Hadoop cluster.
Since data is replicated three times in HDFS, does that mean any calculation done on one node will also be replicated on the other two?
What is HBase in Hadoop?
What is an HFile?
What is a RecordReader in MapReduce?