Describe accumulators in Apache Spark in detail.
Answer / Manish Kumar Gupta
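An accumulator in Apache Spark is a shared variable that tasks can only add to, using an associative and commutative operation; only the driver program can read its value. Each task updates its own local copy, and Spark merges those updates back on the driver. This makes accumulators a convenient way to implement counters and sums (for example, counting malformed records) alongside a computation on a large dataset, without a separate pass over the data or a shuffle. Spark supports numeric accumulators natively, and support for custom types can be added. Updates made inside actions are applied exactly once per task. Updates made inside transformations may be applied more than once if a task or stage is re-executed, and because transformations are lazy, the value does not change until an action runs.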
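To make the add-only, driver-reads pattern concrete, here is a minimal plain-Python sketch of the idea. It is a model of the semantics, not Spark's API: the class `CountAccumulator`, the helper `run_task`, and the sample `partitions` are all hypothetical names made up for illustration. In PySpark itself the corresponding calls would be `sc.accumulator(0)` to create the variable, `acc.add(1)` inside a task, and `acc.value` on the driver.

```python
class CountAccumulator:
    """Toy stand-in for an accumulator (hypothetical, not a Spark class)."""

    def __init__(self, zero=0):
        self._value = zero

    def add(self, term):
        # Tasks may only add; addition is associative and commutative.
        self._value += term

    def merge(self, other):
        # The driver folds each task's local copy into the global value.
        self._value += other._value

    @property
    def value(self):
        # In Spark, only the driver is allowed to read this.
        return self._value


def run_task(partition):
    """Simulate one task: parse a partition and count bad records locally."""
    local_bad = CountAccumulator()
    parsed = []
    for record in partition:
        try:
            parsed.append(int(record))
        except ValueError:
            local_bad.add(1)  # side-channel count, no extra pass needed
    return parsed, local_bad


# Hypothetical input split into three partitions.
partitions = [["1", "2", "x"], ["3", "oops", "4"], ["5"]]

bad_records = CountAccumulator()  # lives on the "driver"
total = 0
for partition in partitions:
    parsed, local_bad = run_task(partition)
    total += sum(parsed)
    bad_records.merge(local_bad)

print("sum of valid records:", total)
print("malformed records:", bad_records.value)
```

The point of the design is that the count of malformed records comes out of the same pass that computes the sum, and each task only ever contributes an increment that the driver merges.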