Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
What is the maximum number of rows in a table?
What is the sequence file input format?
What management tools are available for Cassandra?
What does the producer API do in Kafka?
Explain the input and output data formats of the Hadoop framework.
Define replication strategy.
What is the stable version of Hive?
What is the role of the Kafka producer API?
Discuss and explain the various types of partitioners in Cassandra.
Explain caching in Spark Streaming.
What are broadcast variables in Apache Spark? Why do we need them?
What are the features of Presto?
What is KeyValueTextInputFormat in Hadoop MapReduce?
What are the characteristics of the Hadoop framework?
Is it possible to write Hadoop job output to multiple directories?