What are the four basic parameters of a mapper?
What are some typical functions of the JobTracker?
How will you write a custom partitioner for a Hadoop job?
What are the limitations of Hadoop?
What are the restrictions on the key and value classes?
Define data cleansing.
Is it possible to write Hadoop job output to multiple directories?
List some use cases where classification machine learning algorithms can be used.
Is it possible to provide multiple inputs to a Hadoop job? If yes, how can you give multiple directories as input to the job?
Can you give some examples of Big Data?
What is fsck?
What is the use of the Context object?
Do the JobTracker and the TaskTrackers run on separate machines?
Give some examples of columnar databases.
List Hadoop’s three configuration files.