Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
What is the difference between a DataFrame and a Dataset in Spark?
What is the data storage component used by Hadoop?
How is the processing of streaming data achieved in Apache Spark? Explain.
What is BookKeeper?
What is a sample query in Hive?
What are the NameNode and DataNode in HDFS?
Do I need to install Hadoop for Spark?
Is Spark a language?
Explain the repartition() operation in Spark.
What is PyArrow?
Define actions in Spark.
What are a block and a block scanner in HDFS?
What is lazy evaluation in Spark?
What are the different methods to run Spark over Apache Hadoop?
What is the primary purpose of Pig in the Hadoop architecture?