Big Data Interview Questions
Questions Answers Views Company eMail

What is the difference between DSM and RDD?

213

What do you mean by Persistence?

202

How do you create a sparse vector from a dense vector?

216

What are common uses of Apache Spark?

194

In a very large text file, you just want to check whether a particular keyword exists. How would you do this using Spark?

256

How is fault tolerance achieved in Apache Spark?

211

What are the benefits of lazy evaluation?

196

How does Apache Spark handle accumulated metadata?

242

What is lazy evaluation in Spark?

224

What is the use of checkpoints in Spark?

187

How can Spark be connected to Apache Mesos?

208

In how many ways can RDDs be created? Explain.

207

What are the advantages of Datasets?

206

What makes Apache Spark good at low-latency workloads like graph processing and machine learning?

234

Can you use Spark to access and analyse data stored in Cassandra databases?

272


Un-Answered Questions { Big Data }

What are the main configuration parameters in a MapReduce program?

388


What is meant by Spark in big data?

182


What is a write-ahead log (journaling)?

210


What is the Hadoop MapReduce API's contract for a key and value class?

410


What are secondary indexes?

55






What are Actions?

191


What are the different types of data models?

69


How do you administer Hadoop?

648


What is Apache Tajo?

5


What are Spark vcores?

195


What is fluming?

106


What is the procedure for data storage in Cassandra?

54


What is throughput? How does HDFS provide good throughput?

24


What platform and Java version are required to run Hadoop?

381


What is a write-ahead log (journaling) in Spark?

244