Explain the transformation and action operations on an Apache Spark RDD.
How do you create a Hadoop archive?
What are views in Hive?
What do you mean by shuffling and sorting in MapReduce?
How does Apache Spark handle accumulated metadata?
How can we see only the top 15 records of student.txt, out of 100 records, in an HDFS directory?
What are the components of the Spark ecosystem?
Suppose a 514 MB file is stored in HDFS (Hadoop 2.x) with the default block size and default replication factor. How many blocks will be created in total, and what will be the size of each block?
What is metadata?
How does Apache Flume work?
How do ‘map’ and ‘reduce’ work?
What are the four V's of big data?
What is Apache HBase?
What are the features of Kafka?
Explain the Parquet file format in Apache Spark. When is it the best choice?
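
The arithmetic behind the 514 MB block-count question above can be sketched as follows. This is a minimal illustration, assuming the Hadoop 2.x defaults of a 128 MB block size and a replication factor of 3; the function name hdfs_block_layout is hypothetical and not part of any Hadoop API.

```python
def hdfs_block_layout(file_mb, block_mb=128, replication=3):
    """Return (list of block sizes in MB, total block replicas) for one file.

    Assumes Hadoop 2.x defaults: 128 MB blocks, replication factor 3.
    """
    # Full-size blocks, plus whatever is left over for the final block.
    full_blocks, remainder = divmod(file_mb, block_mb)
    sizes = [block_mb] * full_blocks
    if remainder:
        # The last block only occupies the space it actually needs.
        sizes.append(remainder)
    return sizes, len(sizes) * replication


sizes, replicas = hdfs_block_layout(514)
print(sizes)     # [128, 128, 128, 128, 2]
print(replicas)  # 15
```

Under these assumptions the file splits into five blocks (four of 128 MB and one of 2 MB), and with three replicas of each the cluster stores 15 block replicas in total.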