Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
Why do we use HDFS for applications with large data sets, but not when there are a lot of small files?
What is lambda in Spark?
What is the ObjectInspector functionality in Hive?
Why use the HColumnDescriptor class?
What is CQL?
What is OutputCommitter?
How does an RDD persist data?
What does COGROUP do in Pig?
How will you explain COGROUP in Pig?
How will you design or modify a schema in HBase programmatically?
Which companies are already using Spark Streaming?
What are the port numbers of the TaskTracker?
What is the need of MapReduce?
Explain the replicating and multiplexing channel selectors in Flume.
What do you understand by compute and storage nodes?
When writing an evaluate UDF, which method has to be overridden?