Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
What are the common mistakes developers make when using Apache Spark?
What is Spark Streaming?
How will you back up an HBase cluster?
What are the three different modes in which Hive can be run?
How does Cassandra perform write operations?
Briefly explain what an action is in Apache Spark. How is the final result generated using an action?
What are the main features of Impala?
Name a few import control commands. How can Sqoop handle large objects?
How will you submit extra files or data (such as JARs, static files, etc.) for a MapReduce job at runtime?
Explain the Reducer's sort phase.
What are the various programming languages supported by Spark?
What are some of the best tools for data analysis?
What is Data Locality in Hadoop?
Is Apache Spark going to replace Hadoop?
What are the different ways of debugging a job in MapReduce?