Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
What are the features of Apache Mahout?
What is Apache Spark used for?
What is CoarseGrainedExecutorBackend?
How does HDFS ensure the data integrity of blocks stored in HDFS?
Is Kafka part of Hadoop?
What is the difference between an input split and an HDFS block?
How many instances of the JobTracker run on a Hadoop cluster?
Can you explain Spark Streaming?
What is a Dataset? What are its advantages over DataFrame and RDD?
Can you explain TextInputFormat?
Can you name some components of Cassandra?
What is a common use case for Flume?
What can skew the mean?
How do you write your own SerDe?
Can you define PageRank?