Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
What does HBase consist of?
Which bit version does Ambari need, and which operating systems is it compatible with?
What are the advantages of DataFrame over RDD in Apache Spark?
What is the common workflow of a Spark program?
What are the differences between the types of primary keys in Cassandra?
What happens if the preferred replica is not in the ISR?
What additional benefits does YARN bring to Hadoop?
What is Simple Strategy?
What is a commit log?
Can you elaborate on Cassandra CQL?
Why is Spark faster than Hadoop?
What are shared variables?
Do you need to install Spark on all nodes of a YARN cluster when running Spark on YARN?
What is Cassandra cqlsh?
In which language is Apache Kafka written?