Answer Posted / Hitesh Pandey
The primary way to represent data in Apache Spark is the Resilient Distributed Dataset (RDD). An RDD is an immutable, distributed collection of elements that is split into logical partitions, and those partitions are distributed across the nodes of a cluster so they can be processed in parallel. Because an RDD is immutable, transformations such as map or filter do not modify it in place; they produce a new RDD.