What are the various types of shared variables in Apache Spark?
Answer / Aman Varshney
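Spark has two types of shared variables:
1. Broadcast variables: Read-only values that the driver sends to each executor once and caches there, instead of shipping a copy with every task. They are useful for large lookup data that many tasks need.
2. Accumulators: Variables that tasks can only add to (for example counters and sums). Only the driver program can read the accumulated value.
A "distributed cache" is not a third type of shared variable in Spark. Files can be shipped to every node with SparkContext.addFile and located on the executors with SparkFiles.get, but that is a file distribution mechanism, not a shared variable.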