Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
What is the architecture of Spark, in brief?
What is cqlsh?
What is the replication factor?
What is the problem with HDFS and streaming data such as logs?
What do you understand by high availability?
What happens if the JobTracker machine is down?
What precautions should be taken when adding a column?
Is Spark SQL faster than Hive?
What is cqlsh in Cassandra?
How can you debug a Pig script?
What is SparkContext?
How do you set a property in Apache Tajo?
Who divides a file into blocks when it is stored in HDFS in Hadoop?
What do you mean by InputFormat?
Can the Ambari Python client be used to make good use of Ambari's APIs?