Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
Explain the input and output data formats of the Hadoop framework.
Which command is used to list partitions in HCatalog?
What do you mean by replication factor?
What is a keyspace in Cassandra?
What is the difference between an “HDFS Block” and an “Input Split”?
Explain what conf.setMapperClass does.
What is off-heap memory in Spark?
What is Hive on Spark?
How does Spark achieve fault tolerance?
If we want to reuse certain data repeatedly across different transformations, what should we do to improve performance?
How can one change the replication factor when data is already stored in HDFS?
What is a Hive database?
Why are Spark transformations lazy?
What is the difference between an RDBMS and Hadoop?
Can you explain how ‘map’ and ‘reduce’ work?