Answer Posted / Smit Agarwal
In PySpark, RDDs (Resilient Distributed Datasets) are the basic building blocks of Spark. An RDD is an immutable, distributed collection of data that can be created from datasets in Hadoop storage systems such as HDFS, or from local files and in-memory collections. RDDs are processed through transformations (e.g., map, filter), which lazily build new RDDs, and actions (e.g., count, saveAsTextFile), which trigger computation and return or persist a result.