What is RDD?




Answer / Bhaskar Shukla

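An RDD (Resilient Distributed Dataset) is an immutable, partitioned collection of records that can be operated on in parallel across a cluster. It is the fundamental data structure in Apache Spark. An RDD can be created from external storage such as HDFS files or local text files, by parallelizing an existing collection in the driver program, or by transforming another RDD. "Resilient" refers to fault tolerance: Spark records the lineage of transformations that produced each RDD, so a lost partition can be recomputed instead of being restored from a replica.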

