Big Data Interview Questions
Questions Answers Views Company eMail

What kind of data warehouse application is suitable for Hive?

422

Explain the REPEAT function in Hive with an example.

471

What does the "USE" command in Hive do?

418

Can you give us some examples of how Hadoop is used in a real-time environment?

414

Big Data Engineer: Can you explain what REST is?

LinkedIn

5

How do you deal with sparse data?

LinkedIn

5

Name and describe three different kernel functions and in what situation you would use each.

LinkedIn

5

How would you pipeline large amounts of data?

Mozilla

297

What is Apache Spark?

195

Explain the key features of Apache Spark.

216

How is Apache Spark better than Hadoop?

202

Explain the term "paired RDD" in Apache Spark.

262

How is RDD in Spark different from Distributed Storage Management?

208

Which languages does Apache Spark support?

233

Explain the concept of RDD (Resilient Distributed Dataset). Also, state how you can create RDDs in Apache Spark.

283


Un-Answered Questions { Big Data }

What are producers in Kafka?

296


What is meant by RDD lazy evaluation?

303


What do we mean by partitions or slices?

194


What is the Avro format?

33


What is a Hive database?

414


How can you reduce churn in the ISR? When does a broker leave the ISR?

336


What is connection_loss error?

1


How is HCatalog different from Hive?

395


Can you use Spark to access and analyse data stored in Cassandra databases?

270


What is the difference between Apache Sqoop and Flume?

5


Explain what a task tracker is in Hadoop.

212


What are the different tasks we can perform when managing hosts using the Ambari Hosts tab?

61


How does an RDD persist data?

201


If the Hadoop administrator needs to make a change, which configuration file do they need to change?

241


What is the relationship between Jobs and Tasks in Hadoop?

410