Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
What is Safemode in Apache Hadoop?
How can you see the list of stored jobs in the Sqoop metastore?
How do you write your own SerDe?
How will you calculate the number of executors required to do real-time processing using Apache Spark? What factors need to be considered for deciding on the number of nodes for real-time processing?
To use Spark on an existing Hadoop Cluster, do we need to install Spark on all nodes of Hadoop?
What is the difference between cache and persist in Spark?
Differentiate between DROP and TRUNCATE in cqlsh.
Name some companies that use Hadoop.
What is the use of flatMap in Spark?
Are multiline comments supported in Hive?
How does Spark work?
Explain the flatMap operation on an Apache Spark RDD.
How can Hive avoid MapReduce?
What is the use of the “void close()” method?
Why is block size set to 128 MB in HDFS?