How does HDFS achieve high throughput?
What is configured in /etc/hosts, and what is its role in setting up a Hadoop cluster?
What is the difference between a Cassandra schema and an RDBMS schema?
Is Spark SQL a database?
Explain the term pair RDD (paired RDD) in Apache Spark.
What is the difference between map and flatMap?
If data is already present in HDFS with a replication factor (RF) defined, how can we change the replication factor?
Which file controls reporting in Hadoop?
What are the advantages of Datasets in Spark?
Highlight the difference between the GROUP and COGROUP operators in Pig.
What is InputFormat in Hadoop?
What features from relational databases or Hive are not available in Impala?
What is meant by meta information in HDFS? List the files related to metadata.
What is the difference between Pig and Hive?
What types of costs are associated with creating an index on Hive tables?