What is Kafka?
What are the functions of the JobTracker?
What is a column-store database? Explain with an example.
Explain the difference between NAS and HDFS.
Name the scalar and complex data types in Pig.
Explain Apache Spark Streaming. How is streaming data processed in Apache Spark?
How does Cassandra write changed data to the commit log?
In MapReduce, ideally how many mappers should be configured on a slave?
How would you tackle counting words in several text documents?
What is the best hardware configuration to run Hadoop?
What is REST?
What is the best practice for deploying the Secondary NameNode?
Name the operations supported by an RDD.
Should a RegionServer be located on every DataNode?
Explain how hardware planning and provisioning is done for a Hadoop cluster.