What are the major features/characteristics of RDDs (Resilient Distributed Datasets)?
Answer Posted / Umesh Kumar Chaurasia
Resilient Distributed Datasets (RDDs) in Apache Spark are immutable, partitioned collections of records distributed across a cluster. They can be created from external data sources or by transforming other RDDs. RDDs are the fundamental data abstraction for processing data in Spark, and they offer several important features: 1) Immutability: once an RDD is created, it cannot be modified; a transformation returns a new RDD instead; 2) Fault tolerance: when a worker node fails, Spark recomputes the lost partitions from the RDD's lineage, so results stay consistent; 3) Lineage: each RDD records the chain of transformations that produced it from its parent RDDs; and 4) Scalable, parallel computation: an RDD is split into partitions that are processed in parallel across the cluster, so the same code scales to large datasets.
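To make the four features concrete, here is a minimal sketch in plain Python. It is a toy model, not Spark's API: `ToyRDD`, `compute_partition`, and the sample data are hypothetical names invented for illustration. It only shows how an immutable, partitioned dataset that records its lineage can rebuild a single lost partition.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)  # frozen: fields cannot be reassigned (immutability)
class ToyRDD:
    """Toy stand-in for an RDD. Hypothetical class, not part of Spark."""
    source: Tuple[Tuple[int, ...], ...]              # base data, one tuple per partition
    lineage: Tuple[Tuple[str, Callable], ...] = ()   # recorded transformation history

    def map(self, fn: Callable) -> "ToyRDD":
        # A transformation returns a NEW ToyRDD with a longer lineage.
        return ToyRDD(self.source, self.lineage + (("map", fn),))

    def filter(self, pred: Callable) -> "ToyRDD":
        return ToyRDD(self.source, self.lineage + (("filter", pred),))

    def compute_partition(self, index: int) -> List[int]:
        # Rebuild one partition by replaying the lineage on its base data.
        # This is what recovery after a lost partition amounts to in this model.
        rows = list(self.source[index])
        for kind, fn in self.lineage:
            if kind == "map":
                rows = [fn(x) for x in rows]
            else:
                rows = [x for x in rows if fn(x)]
        return rows

    def collect(self) -> List[int]:
        # Each partition is independent, so a real engine could run these in parallel.
        out: List[int] = []
        for i in range(len(self.source)):
            out.extend(self.compute_partition(i))
        return out


base = ToyRDD(source=((1, 2, 3), (4, 5, 6)))   # two partitions
doubled = base.map(lambda x: x * 2)
big = doubled.filter(lambda x: x > 6)

result = big.collect()
recovered = big.compute_partition(1)           # recompute only partition 1
```

In this sketch `base` is left unchanged by `map` and `filter`, and `big` carries two lineage entries, which is all that is needed to recompute partition 1 on its own. Real Spark additionally distributes partitions across executors and evaluates transformations lazily; neither is modeled here.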