Apache Hadoop (394)
MapReduce (354)
Apache Hive (345)
Apache Pig (225)
Apache Spark (991)
Apache HBase (164)
Apache Flume (95)
Apache Impala (72)
Apache Cassandra (392)
Apache Mahout (35)
Apache Sqoop (82)
Apache ZooKeeper (65)
Apache Ambari (93)
Apache HCatalog (34)
Apache HDFS (Hadoop Distributed File System) (214)
Apache Kafka (189)
Apache Avro (26)
Apache Presto (15)
Apache Tajo (26)
Hadoop General (407)
Explain the execution plans of a Pig script.
or
Differentiate between the logical and physical plans of an Apache Pig script.
You have a file employee.txt with 100 records in an HDFS directory. You want to see only the first 10 records of the file. How will you do this?
Does Flume provide 100% reliability to the data flow?
Why do we need Hive?
What is a TaskInstance?
Explain the various transformations on an Apache Spark RDD, such as distinct(), union(), intersection(), and subtract().
Name the filter that accepts the page size as a parameter in HBase.
Can Ambari manage multiple clusters, and why?
Why is Flume used?
Why is Spark popular?
What are the uses of explode in Hive?
What are the different life cycle commands in Ambari?
Can you explain how you can use Apache Spark along with Hadoop?
What are the disadvantages of Spark SQL?
How many layers of Hadoop components are supported by Apache Ambari and what are they?
What is the purpose of JConsole?
What are the features of a Spark RDD?