What is the difference between Spark and Hadoop?
Name the job control options specified by MapReduce.
What is parallelize in Spark?
What are the common Hadoop Pig interview questions that you have been asked in a Hadoop job interview?
How do you write a custom key class?
What is the best way to copy files between HDFS clusters?
Define streaming.
Mention the default read and write classes in Hive.
What is the difference between a local and a remote metastore?
Define data integrity. How does HDFS ensure the integrity of the data blocks it stores?
How is Spark SQL different from HQL and SQL?
Explain what happens when Hadoop spawns 50 tasks for a job and one of the tasks fails.
In MapReduce, is sorting done on the mapper node or the reducer node?
Can you tell us how many daemon processes run on a Hadoop system?
Is Spark good for machine learning?