Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
How do you configure the number of Combiners in MapReduce?
What is the heartbeat used for?
When should we use SORT BY instead of ORDER BY?
If a Replica stays out of the ISR for a long time, what does it signify?
Can a partition be archived? What are the advantages and disadvantages?
How do you load data into a table created in Hive?
Can we run Spark on Windows?
Suppose a file of size 514 MB is stored in HDFS (Hadoop 2.x) using the default block size and the default replication factor. How many blocks will be created in total, and what will be the size of each block?
What are views in Hive?
What is Apache Flume?
Can you explain how it is different from doing machine learning in R or SAS?
What is the number of executors in Spark?
What are the components of Pig Execution Environment?
Explain the commit log.
What is a row RDD in Spark?