What are the advantages of Parquet files?
What file formats does Spark support?
What does the map transformation do? Provide an example.
How can you use the AdminClient API?
What are the differences between Hadoop 1 and Hadoop 2?
What does the Spark Engine do?
What is shuffling and sorting in Hadoop MapReduce?
Does Spark use Hive?
Replication causes data redundancy, so why is it pursued in HDFS?
What is a Hive variable? What do we use it for?
What is a Kafka producer?
What are brokers in Kafka?
Wherever I run a Hive query (in a different directory), it creates a new metastore_db. What is the reason for this?
Is Avro supported?
Can you list down the limitations of using Apache Spark?