What is a Resilient Distributed Dataset (RDD) in Apache Spark? How does it make Spark operator-rich?

Question Posted / Ankit Bhatnagar


Answer Posted / Ankit Bhatnagar

A Resilient Distributed Dataset (RDD) is the fundamental data structure in Apache Spark. It is an immutable collection of objects, split into partitions that are distributed across the nodes of a cluster. RDDs are fault-tolerant: each RDD keeps lineage information, a record of the operations that produced it, so Spark can recompute a lost partition if a failure occurs. Spark is described as operator-rich because the RDD API provides a wide range of operations, such as map(), filter(), reduce(), and join().
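To make the lineage idea concrete, here is a minimal sketch in plain Python. It is a toy model, not Spark: the class ToyRDD and its methods partition() and lose_partition() are hypothetical names invented for this illustration, everything runs in a single process, and the join is a naive in-memory inner join. It only aims to show that each operation returns a new dataset, that every dataset remembers how it was built, and that a lost partition can be recomputed from that record.

```python
from functools import reduce as fold


class ToyRDD:
    """Toy single-process stand-in for an RDD (hypothetical, not Spark's API)."""

    def __init__(self, compute, num_partitions, lineage):
        self._compute = compute            # partition index -> list of records
        self.num_partitions = num_partitions
        self.lineage = lineage             # names of the operations that built this dataset
        self._cache = {}                   # materialized partitions, which may be "lost"

    @classmethod
    def parallelize(cls, data, num_partitions=2):
        frozen = tuple(data)               # the source data is never modified
        return cls(lambda i: list(frozen[i::num_partitions]),
                   num_partitions, ("parallelize",))

    def map(self, f):
        # Returns a NEW dataset; the parent is left untouched.
        return ToyRDD(lambda i: [f(x) for x in self.partition(i)],
                      self.num_partitions, self.lineage + ("map",))

    def filter(self, keep):
        return ToyRDD(lambda i: [x for x in self.partition(i) if keep(x)],
                      self.num_partitions, self.lineage + ("filter",))

    def join(self, other):
        # Naive inner join on (key, value) pairs; the result has one partition.
        def compute(_):
            right = {}
            for k, w in other.collect():
                right.setdefault(k, []).append(w)
            return [(k, (v, w)) for k, v in self.collect() for w in right.get(k, [])]
        return ToyRDD(compute, 1, self.lineage + ("join",))

    def partition(self, i):
        # Compute the partition from its parents only if it is not materialized.
        if i not in self._cache:
            self._cache[i] = self._compute(i)
        return list(self._cache[i])

    def lose_partition(self, i):
        # Simulate a failure by discarding the materialized partition.
        self._cache.pop(i, None)

    def collect(self):
        return [x for i in range(self.num_partitions) for x in self.partition(i)]

    def reduce(self, f):
        return fold(f, self.collect())


nums = ToyRDD.parallelize(range(1, 7))
even_squares = nums.map(lambda x: x * x).filter(lambda x: x % 2 == 0)

before = even_squares.collect()            # [4, 16, 36]
even_squares.lose_partition(1)             # simulated failure
after = even_squares.collect()             # recomputed from lineage: [4, 16, 36]
total = even_squares.reduce(lambda a, b: a + b)   # 56

users = ToyRDD.parallelize([(1, "ann"), (2, "bob")])
orders = ToyRDD.parallelize([(1, "book"), (1, "pen"), (3, "cup")])
joined = users.join(orders).collect()      # only key 1 appears on both sides

print(even_squares.lineage)                # ('parallelize', 'map', 'filter')
print(before, after, total)
print(joined)
```

In real PySpark the corresponding calls are sc.parallelize(), rdd.map(), rdd.filter(), rdd.reduce(), rdd.join(), and rdd.collect(); the recomputation of lost partitions happens inside Spark and is not something user code triggers by hand as this toy does.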
