What happens when the NameNode enters safe mode in Hadoop?
What is the latest version of Spark?
What are the port numbers of the NameNode, JobTracker, and TaskTracker?
What is a graph database?
What is the Parquet file format? How do you convert data to Parquet format?
Can Ambari manage multiple clusters?
How can Flume be used with HBase?
What does a split do?
Replication causes data redundancy, so why is it pursued in HDFS?
What are the different modes in which Pig can run? Explain each.
Why should we use ‘distinct’ keyword in Pig scripts?
What is Apache Hadoop YARN?
What are the problems with small files in HDFS?
How do you organize Pig Latin statements?
What is the maximum size of a message that Kafka can receive?