What is a Parquet file?
What do you know about SequenceFileInputFormat?
What does Secondary NameNode mean?
What are the parameters of mappers and reducers?
What are “Seed Nodes” in Cassandra?
How does HDFS achieve good throughput?
According to IBM, what are the three characteristics of Big Data?
How do I know how many Impala nodes are in my cluster?
Who uses Cassandra?
How can an RDD be created in Spark?
Where is an RDD stored?
What is the Repository?
How is the distance between two nodes defined in Hadoop?
What does the “USE” command in Hive do?
How do you categorize big data?