What is rdd in spark with example?
Answer / Neeraj P Singh
Resilient Distributed Datasets (RDDs) are the basic data structure of Apache Spark. An RDD is an immutable, distributed collection of objects that can be processed in parallel across a cluster. For example, you can create an RDD from a local collection using the SparkContext's `parallelize()` method, like so: `val rdd = sparkContext.parallelize(Array(1, 2, 3, 4))`.
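A slightly fuller sketch of the same idea, assuming a local Spark installation (the `SparkSession` setup and the `map`/`reduce` calls below are illustrative additions, not part of the original answer):

```scala
import org.apache.spark.sql.SparkSession

// Start a local SparkSession; in the Spark shell, `spark` and `sc` already exist.
val spark = SparkSession.builder()
  .appName("RDDExample")
  .master("local[*]")
  .getOrCreate()
val sc = spark.sparkContext

// Create an RDD from a local collection.
val rdd = sc.parallelize(Array(1, 2, 3, 4))

// Transformations (like map) are lazy: they build a lineage, nothing runs yet.
val squared = rdd.map(x => x * x)

// An action (like reduce) triggers the computation and returns a result
// to the driver: 1 + 4 + 9 + 16 = 30.
val total = squared.reduce(_ + _)

spark.stop()
```

Because RDDs are immutable, `map` does not change `rdd`; it produces a new RDD, and the lineage between them is what lets Spark recompute lost partitions after a failure.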
What is apache spark architecture?
Explain the term paired RDD in Apache Spark?
What are the types of Transformation in Spark RDD Operations?
What is spark machine learning?
What is spark vcores?
Is the following approach correct? Is `sqrtOfSumOfSq` a valid reducer?
What is a "Parquet" in Spark?
How can you identify whether a given operation is a transformation or an action?
Explain the write-ahead log (journaling) in Spark.
Why is there a need for broadcast variables when working with Apache Spark?
Name various types of Cluster Managers in Spark.
Does Spark use Java?