Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
In which language is Apache Kafka written?
How do you change the replication factor of data that is already stored in HDFS?
What is the write-ahead log (journaling) in Spark?
What are the different modes of Hive?
Explain caching in Spark Streaming.
How does Hive distribute rows into buckets?
What are the different ways of debugging a job in MapReduce?
How does HDFS ensure the integrity of the data blocks it stores?
Mention some machine learning algorithms exposed by Mahout.
What is PigStorage?
What are the parameters of mappers and reducers?
Ideally, what should the replication factor be in a Hadoop cluster?
What is KeyValueTextInputFormat?
What is the process of creating an Ambari client?
List the various types of "Cluster Managers" in Spark.