Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
Explain the flatMap operation on an Apache Spark RDD.
How do you copy data between clusters in HDFS?
What file systems does Spark support?
What is the bottom layer of abstraction in the Spark Streaming API?
What is a DAG (directed acyclic graph)?
State the use cases of Impala.
What is Hadoop Sqoop?
In MapReduce, why does the map task write its output to local disk instead of HDFS?
What problems can be addressed by using ZooKeeper?
What is a primary key, and what are its different types?
Explain the difference between an input split and an HDFS block.
Define the term "column family".
What is Row Key?
When should you use Spark cache?
Does Apache Sqoop have a default database?