Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
Define the term "column families".
What are some of the best tools for data analysis?
Is Databricks an ETL tool?
Can a MapReduce job run with no Reducer?
Can a MapReduce program be written in any language other than Java?
Compare Spark and Hadoop MapReduce.
Is it possible to provide multiple inputs to Hadoop? If yes, how can you give multiple directories as input to a Hadoop job?
Explain the Spark join() operation.
Explain the Catalyst query optimizer in Apache Spark.
Is it possible to provide multiple inputs to Hadoop? If yes, explain.
Is Spark a distributed computing framework?
What are Scala and Spark?
What is the Hive Data Definition Language (DDL)?
Which object can be used to get the progress of a particular job?
What are the types of cluster managers in Spark?