Answer Posted / Hitesh Pandey
The primary way to represent data in Apache Spark is the Resilient Distributed Dataset (RDD). An RDD is an immutable, distributed collection of elements that is split into logical partitions, and those partitions are distributed across the nodes of a cluster so they can be processed in parallel. Because an RDD is immutable, transformations such as map or filter do not modify it in place; they produce a new RDD.