What is a Resilient Distributed Dataset (RDD) in Apache Spark? How does it make Spark operator-rich?
Answer Posted / Ankit Bhatnagar
A Resilient Distributed Dataset (RDD) is the fundamental data structure in Apache Spark: an immutable, partitioned collection of objects distributed across the nodes of a cluster. RDDs are fault-tolerant because each one records its lineage, the sequence of transformations used to build it, so Spark can recompute a lost partition from its parent data if a failure occurs. RDDs make Spark operator-rich by exposing a wide range of operations: lazy transformations such as map(), filter(), and join(), which return new RDDs, and actions such as reduce(), which trigger the computation and return a result.
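To make the lineage idea concrete, here is a minimal single-process sketch in plain Python. It is a toy model, not Spark's API or implementation: the class ToyRDD and its methods compute_partition() and lineage() are hypothetical names invented for illustration. It only shows how immutable datasets that remember their parent and transformation can rebuild any one partition on demand.

```python
from functools import reduce as _reduce


class ToyRDD:
    """Hypothetical single-process stand-in for an RDD (not Spark's API)."""

    def __init__(self, partitions=None, parent=None, fn=None, op="source"):
        self._source = partitions  # raw data, set only on the root dataset
        self._parent = parent      # lineage: the dataset this one derives from
        self._fn = fn              # per-partition function applied to the parent
        self._op = op              # name of the operation, for display

    # Transformations: return a new ToyRDD and record lineage; nothing runs yet.
    def map(self, f):
        return ToyRDD(parent=self, fn=lambda part: [f(x) for x in part], op="map")

    def filter(self, pred):
        return ToyRDD(parent=self, fn=lambda part: [x for x in part if pred(x)], op="filter")

    def num_partitions(self):
        if self._parent is None:
            return len(self._source)
        return self._parent.num_partitions()

    def compute_partition(self, i):
        # Rebuild partition i by replaying the lineage from the source data.
        if self._parent is None:
            return list(self._source[i])
        return self._fn(self._parent.compute_partition(i))

    def lineage(self):
        ops = [] if self._parent is None else self._parent.lineage()
        return ops + [self._op]

    # Actions: trigger the computation and return a plain Python value.
    def collect(self):
        return [x for i in range(self.num_partitions()) for x in self.compute_partition(i)]

    def reduce(self, f):
        return _reduce(f, self.collect())


numbers = ToyRDD(partitions=[[1, 2, 3], [4, 5, 6]])
even_squares = numbers.map(lambda x: x * x).filter(lambda x: x % 2 == 0)

steps = even_squares.lineage()                # ['source', 'map', 'filter']
all_values = even_squares.collect()           # [4, 16, 36]
total = even_squares.reduce(lambda a, b: a + b)  # 56
# "Losing" partition 1 costs nothing permanent: it is recomputed from lineage.
recovered = even_squares.compute_partition(1)    # [16, 36]
```

In this sketch, map() and filter() never modify the original dataset and do no work until collect() or reduce() is called, which mirrors the transformation/action split described above. Real Spark additionally schedules partitions across executors and handles shuffles for operations such as join(), which this toy model leaves out.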