Big Data Interview Questions
Questions Answers Views Company eMail

Briefly explain the architecture of Spark.

205

What are Spark DataFrames?

163

In how many ways can we use Spark over Hadoop?

203

Explain the machine learning library (MLlib) in Spark.

199

Explain transformations and actions in the context of RDDs.

216

Explain the difference between Spark SQL and Hive.

241

Explain the Catalyst framework.

240

What are the downsides of Spark?

210

What is the need for Spark DAG?

209

What is a broadcast variable?

273

Why did Spark come into existence?

204

How can we launch a Spark application on YARN?

243

Explain the Catalyst query optimizer in Apache Spark.

196

What are the various modes in which Spark runs on YARN? (Local vs Client vs Cluster Mode)

168

Apache Spark is a good fit for which type of machine learning techniques?

211


Un-Answered Questions { Big Data }

Why does a Mapper run in a heavyweight process and not in a thread in MapReduce?

380


What is nlineoutputformat?

410


What are the different methods to set up local repositories?

50


What is a BookKeeper client in BookKeeper?

1


Can copper cause a spark?

183


Define “speculative execution” in Hadoop.

239


How can one write a custom record reader?

260


Define the functions of Spark Core.

387


How does Spark handle monitoring and logging in Standalone mode?

195


Discuss the different tombstone markers used for deletion purposes in HBase.

136


Which Java class handles the encoding of input records into the files that store tables in Hive?

423


What is the use of HCatalog?

410


What is standalone mode in Spark?

209


What is the difference between Spark and MapReduce?

188


Explain data type conversion in Hive.

429