Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Presto (15)
Apache Tajo (26)
Hadoop General (407)
What are the different logging levels in Cassandra?
State some key points about Apache Avro.
What are the different components available in Kafka?
What are the exact differences between the reduce and fold operations in Spark?
What is the flatMap transformation in an Apache Spark RDD?
How many DataNodes can run on a single Hadoop cluster?
What is the use of the import-all-tables command in Sqoop?
What are the disadvantages of Spark SQL?
Suppose your data is stored in collections; for instance, binary data, message data, and metadata are all keyed on the same value. Would you use HBase for this?
Does the MapReduce programming model provide a way for reducers to communicate with each other? That is, can one reducer in a MapReduce job communicate with another reducer?
Is it possible to provide multiple inputs to Hadoop? If so, how can you give multiple directories as input to a Hadoop job?
How does Impala achieve its performance improvements?
What is the use of tools command?
How can we create a Hadoop cluster from scratch?
Explain Hadoop streaming.