Answer Posted / Peeyush Tripathi
Apache Spark manages accumulated metadata using a structure called the lineage. Each RDD in the computation has a lineage, which is a record of all its ancestors. This allows Spark to trace back the history of data and recalculate any RDD that is needed again if it or one of its dependencies fails. Additionally, Spark periodically prunes the lineage graph to reduce memory usage.
| Is This Answer Correct ? | 0 Yes | 0 No |
Post New Answer View All Answers