What is RDD?

Question Posted / nilesh

What is RDD?

Answer / Bhaskar Shukla

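An RDD (Resilient Distributed Dataset) is an immutable, partitioned collection of records that can be operated on in parallel across a cluster. It is the fundamental data structure in Apache Spark. An RDD can be created by loading external data (for example, a text file in HDFS or on a local file system), by parallelizing an existing collection in the driver program, or by applying a transformation to another RDD. It is called resilient because Spark tracks the lineage of transformations used to build it and can recompute lost partitions after a failure.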
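
To make the idea concrete, here is a minimal pure-Python sketch of the concept. It is a toy model and not Spark's implementation: the class name `MiniRDD` and its `compute` method are hypothetical, chosen only for illustration, and everything runs in one process. The method names `parallelize`, `map`, `filter`, and `collect` mirror the real RDD operations of the same names. The sketch assumes that each transformation returns a new dataset that remembers its parent, so any partition can be rebuilt from the source data on demand.

```python
class MiniRDD:
    """Toy stand-in for an RDD: partitioned, never mutated, lazily evaluated."""

    def __init__(self, partitions=None, parent=None, fn=None):
        self._partitions = partitions  # source data; set only on a root dataset
        self._parent = parent          # lineage: the dataset this one derives from
        self._fn = fn                  # per-partition function applied to the parent

    @classmethod
    def parallelize(cls, data, num_partitions=2):
        """Split an in-memory collection into contiguous partitions."""
        items = list(data)
        n = len(items)
        bounds = [n * i // num_partitions for i in range(num_partitions + 1)]
        parts = [items[bounds[i]:bounds[i + 1]] for i in range(num_partitions)]
        return cls(partitions=parts)

    def num_partitions(self):
        if self._parent is not None:
            return self._parent.num_partitions()
        return len(self._partitions)

    def map(self, f):
        # Nothing is computed here; only the lineage step is recorded.
        return MiniRDD(parent=self, fn=lambda part: (f(x) for x in part))

    def filter(self, pred):
        return MiniRDD(parent=self, fn=lambda part: (x for x in part if pred(x)))

    def compute(self, index):
        """Rebuild one partition by replaying the lineage from the source data."""
        if self._parent is None:
            return iter(self._partitions[index])
        return self._fn(self._parent.compute(index))

    def collect(self):
        """Action: evaluate every partition and gather the results in order."""
        out = []
        for i in range(self.num_partitions()):
            out.extend(self.compute(i))
        return out


numbers = MiniRDD.parallelize(range(1, 7), num_partitions=3)
squares_of_evens = numbers.filter(lambda x: x % 2 == 0).map(lambda x: x * x)
result = squares_of_evens.collect()
print(result)  # [4, 16, 36]
```

Because `map` and `filter` only record a step, no work happens until `collect` is called. Calling `compute(1)` on its own rebuilds just the second partition, which is roughly how a lost partition can be recovered from lineage without recomputing the whole dataset.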
