What are the various input and output types supported by MapReduce?
What is Geo-Replication in Kafka?
Why is a new metastore_db created whenever a Hive query is run from a different directory?
What does the hadoop-metrics.properties file do?
What are the different zkclientbindings?
How many operational commands are there in HBase?
When optimizing join queries in Hive, in what order of table size should the tables be listed?
What is a Kafka topic?
What is InputFormat in Hadoop?
How does Spark compare with Hadoop?
Does Apache Spark provide checkpointing?
What is a Counter in MapReduce?
What is a partitioner, and how is it used?
What do you understand by storage node and compute node?
What is HDFS, and what are its components?