Double type in Hive - Important points?
What are the main components of Kafka?
What is structured and unstructured data?
What are the differences between the caching and persistence methods in Apache Spark?
What is Apache Spark good for?
Can I do insert … select * into a partitioned table?
Is it necessary to install Spark on all the nodes of a YARN cluster when running Apache Spark on YARN?
What are a cluster, a node, and a keyspace in Cassandra?
What is the use of ZooKeeper?
Can we use the Ambari Python client to access the Ambari APIs?
What is high availability in Hadoop?
When we send data to a node, do we allow settling time before sending more data to that node?
Why is Spark faster than Hadoop?
What is the difference between Pig Latin and Hive?
What is structured data?