Explain a simple Map/Reduce problem.
Do you need to install Spark on all nodes of a YARN cluster?
In which language is the Ambari Shell developed?
What is throughput? How does HDFS provide high throughput?
What are the different hdfs dfs shell commands for performing copy operations?
What characteristic of the streaming API makes it flexible enough to run MapReduce jobs in languages like Perl, Ruby, Awk, etc.?
List some common problems faced by data analysts.
What is a row RDD in Spark?
Explain the SMB join in Hive.
What is the sequence file input format?
What is meant by streaming access?
Can you explain Spark RDDs?
What is the TaskTracker in Hadoop?
What do you understand by Consistency in Cassandra?
When is it not recommended to use the MapReduce paradigm for large