Can you define rdd lineage?
Answer Posted / Mohit Pal
RDD (Resilient Distributed Dataset) in Apache Spark is an immutable distributed collection of objects that can be processed in parallel across a cluster.
Post New Answer View All Answers
What is the latest version of spark?
287
List the advantage of Parquet file in Apache Spark?
473
Explain how RDDs work with Scala in Spark
355
What is meant by Transformation? Give some examples.
328