Explain what happens when Hadoop spawns 50 tasks for a job and one of the tasks fails.
Why do we use persist() on the links RDD?
Explain some important features of Hadoop.
Explain how RDDs work with Scala in Spark.
What is a Combiner in MapReduce?
According to IBM, what are the three characteristics of Big Data?
What do the shell commands “Capture” and “Consistency” determine?
What is Pig?
What is “Non DFS Used” in the HDFS web console?
Can we install Spark on Windows?
What languages does Apache Spark support for developing big data applications?
Are there any problems that can only be solved by MapReduce and cannot be solved by Pig? In which kinds of scenarios are MapReduce jobs more useful than Pig?
How do you set up a local repository manually?
How is HDFS fault tolerant?
What will you do when NameNode is down?