Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
What is the use of Spark SQL?
How can you debug Hadoop code?
HDFS stores data on commodity hardware, which has a higher chance of failure. So how does HDFS ensure the fault tolerance of the system?
What are some real-time use cases for Kafka Streams?
Which one is better, Hadoop or Spark?
What is the optimal block size in HDFS?
When should you avoid secondary indexes?
What is a Dataproc cluster?
What do you mean by the High Availability of a NameNode in Hadoop HDFS?
What is a keyspace in Cassandra?
What is the use of checkpoints in Spark?
What is a composite type in Cassandra?
Is Spark SQL a database?
What is a MapFile?
What is PigStorage?