What are the features of an RDD that make it an important abstraction in Spark?
Answer Posted / Neelam
RDD (Resilient Distributed Dataset) is a fundamental data structure in Apache Spark. Its key features include: (1) Immutable: Once created, an RDD cannot be modified; instead, transformations produce new RDDs from existing ones. (2) Distributed and partitioned: The data in an RDD is split into partitions, which are distributed across the nodes of a cluster and processed in parallel. (3) Fault-tolerant: Each RDD records its lineage, that is, the chain of transformations used to build it from its source data. When a node fails, Spark recomputes only the lost partitions from that lineage; it does not keep multiple copies of every partition by default, although replicated storage levels such as MEMORY_ONLY_2 can be requested when persisting an RDD. (4) Rich API: RDDs provide a rich set of transformations (such as map and filter) and actions (such as collect and count) that are easy to use and extend.
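A small sketch may make the immutability, partitioning, and recovery points concrete. It is a toy model in plain Python, not Spark's API: the class `ToyRDD` and its `lose_partition` helper are hypothetical names invented for illustration, and only the method names `parallelize`, `map`, `filter`, and `collect` mirror real Spark operations. Recovery is modeled here as recomputation from lineage, which is how Spark rebuilds lost RDD partitions by default.

```python
from math import ceil
from typing import Any, Callable, Dict, Iterable, List


class ToyRDD:
    """Hypothetical toy model of an RDD; not part of Spark."""

    def __init__(self, num_partitions: int, compute: Callable[[int], List[Any]]):
        self.num_partitions = num_partitions
        self._compute = compute  # lineage: how to build partition i
        self._materialized: Dict[int, List[Any]] = {}  # partitions computed so far

    @staticmethod
    def parallelize(data: Iterable[Any], num_partitions: int) -> "ToyRDD":
        # Split the input into contiguous chunks, one per partition.
        items = list(data)
        size = ceil(len(items) / num_partitions) if items else 0
        return ToyRDD(num_partitions, lambda i: items[i * size:(i + 1) * size])

    def map(self, f: Callable[[Any], Any]) -> "ToyRDD":
        # Returns a NEW dataset; the parent is left untouched (immutability).
        parent = self
        return ToyRDD(self.num_partitions,
                      lambda i: [f(x) for x in parent.partition(i)])

    def filter(self, pred: Callable[[Any], bool]) -> "ToyRDD":
        parent = self
        return ToyRDD(self.num_partitions,
                      lambda i: [x for x in parent.partition(i) if pred(x)])

    def partition(self, i: int) -> List[Any]:
        # Compute the partition on first use, or again after it was lost.
        if i not in self._materialized:
            self._materialized[i] = self._compute(i)
        return self._materialized[i]

    def lose_partition(self, i: int) -> None:
        # Simulates a node failure that drops one computed partition.
        self._materialized.pop(i, None)

    def collect(self) -> List[Any]:
        # Action: gather every partition into a single list.
        return [x for i in range(self.num_partitions) for x in self.partition(i)]


numbers = ToyRDD.parallelize(range(1, 7), num_partitions=2)
squares = numbers.map(lambda x: x * x)
evens = squares.filter(lambda x: x % 2 == 0)

before = evens.collect()
evens.lose_partition(0)   # pretend the node holding partition 0 failed
after = evens.collect()   # partition 0 is rebuilt from its lineage
print(before, after)
```

In this model nothing is computed until `collect` is called, and the lost partition is rebuilt by replaying `filter` over `map` over the source chunk rather than by reading a stored copy. The source dataset `numbers` is never changed by the transformations applied to it.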