Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
Are there any problems that can only be solved by MapReduce and cannot be solved by Pig? In what kinds of scenarios are MapReduce jobs more useful than Pig?
How does Spark work with Python?
What is the difference between a Column and a SuperColumn?
What are the default record and field delimiters used for Hive text files?
What do the shell commands “capture” and “consistency” determine?
What is the use of the “void close()” method?
What is the replication factor?
How can one increase the replication factor to a desired value in Hadoop?
How do you parse data in XML? Which kind of class do you use in Java to parse the data?
Can Hive queries be executed from script files? How?
What language is Apache Kafka written in?
What is Apache Spark SQL?
How can we drop a table in HCatalog?
On what basis will data be stored on a rack?
How is it completely different from doing machine learning in R or SAS?