Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
What is data cleansing?
What is Spark in big data?
What is the key difference between the textFile and wholeTextFiles methods?
How does the NameNode handle DataNode failures in Hadoop?
What is Spark certification?
How do we write our own custom SerDe?
What is the Apache Spark architecture?
Can we run Unix shell commands from Hive? Can Hive queries be executed from script files? If so, how? Give an example.
What do you mean by stream processing in Kafka?
What are the common input formats in Hadoop?
What is an SSTable, and how is it different from other tables?
What is the use of Cassandra CQL collections?
State the difference between Spark SQL and HQL.
What is the Derby database?
Which machine learning algorithms are supported in Apache Mahout?