What is a Spark RDD?
Answer / Neeraj Sahu
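A Resilient Distributed Dataset (RDD) is the fundamental data structure in Apache Spark. It is an immutable, fault-tolerant collection of objects that is split into partitions across the nodes of a cluster so that it can be processed in parallel. Transformations such as map and filter are lazy and only record how a new RDD is derived from its parent; if a partition is lost, Spark recomputes it from that recorded lineage instead of relying on replicated copies of the data.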
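To make the idea concrete, here is a minimal sketch in plain Python. It is a toy model and not the Spark API: the class name ToyRDD and the method compute_partition are made up for illustration, and everything runs in a single process. It only shows the three properties from the answer: the data is partitioned, transformations are lazy, and any partition can be rebuilt from lineage.

```python
class ToyRDD:
    """Toy stand-in for an RDD (hypothetical, not part of Spark)."""

    def __init__(self, source=None, parent=None, fn=None):
        self._source = source  # list of partitions; set only on the root
        self._parent = parent  # lineage: the ToyRDD this one derives from
        self._fn = fn          # function applied to one parent partition

    @classmethod
    def parallelize(cls, data, num_partitions):
        # Split the input into contiguous chunks, one per partition.
        data = list(data)
        size = -(-len(data) // num_partitions)  # ceiling division
        parts = [data[i * size:(i + 1) * size] for i in range(num_partitions)]
        return cls(source=parts)

    @property
    def num_partitions(self):
        if self._parent is None:
            return len(self._source)
        return self._parent.num_partitions

    def map(self, fn):
        # Lazy: nothing is computed here, only the lineage is recorded.
        return ToyRDD(parent=self, fn=lambda part: [fn(x) for x in part])

    def filter(self, pred):
        # Lazy as well; returns a new ToyRDD and leaves this one unchanged.
        return ToyRDD(parent=self, fn=lambda part: [x for x in part if pred(x)])

    def compute_partition(self, index):
        # Rebuild a single partition by replaying the lineage from the root.
        if self._parent is None:
            return list(self._source[index])
        return self._fn(self._parent.compute_partition(index))

    def collect(self):
        # Action: evaluate every partition and concatenate the results.
        out = []
        for i in range(self.num_partitions):
            out.extend(self.compute_partition(i))
        return out


numbers = ToyRDD.parallelize(range(1, 7), num_partitions=3)
even_squares = numbers.map(lambda x: x * x).filter(lambda x: x % 2 == 0)

result = even_squares.collect()                # [4, 16, 36]
recovered = even_squares.compute_partition(1)  # only partition 1 is rebuilt: [16]
```

In real Spark the same pipeline would start from SparkContext.parallelize, and the partitions would be computed on executors across the cluster rather than in one Python process.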