Explain Accumulator in Spark?
Answer / Gourav Pandey
Accumulator: a shared variable that tasks can only add to (through an associative, commutative operation) and whose aggregated value only the driver program can read. Accumulators are typically used to count occurrences of a particular event (for example, malformed records) or to collect simple statistics about the data processed during a Spark job. They are useful when you want such statistics from a job's execution but don't want to store the underlying data itself, because it is large or unnecessary for your final result.
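Since the real behavior needs a running Spark cluster, here is a minimal pure-Python sketch of the accumulator pattern, not the PySpark API itself; the equivalent real PySpark calls (`sc.accumulator`, `acc.add`, `acc.value`) are shown in the comments, and the partition data and `run_task` helper are made up for illustration:

```python
# Conceptual sketch of Spark's accumulator pattern in plain Python.
# In real PySpark you would write something like:
#   bad_records = sc.accumulator(0)
#   rdd.foreach(lambda rec: bad_records.add(1) if rec < 0 else None)
#   print(bad_records.value)   # value is readable on the driver only

def run_task(partition):
    """Each task keeps a local accumulator it can only add to."""
    local_count = 0
    results = []
    for record in partition:
        if record < 0:                   # the "event" we want to count
            local_count += 1             # add-only update, like acc.add(1)
        else:
            results.append(record * 2)   # the actual transformation
    return results, local_count

# Driver side: tasks run over partitions, then the per-task counts
# are merged with an associative + operation on the driver.
partitions = [[1, -2, 3], [-4, -5], [6, 7]]
outputs, total_bad = [], 0
for part in partitions:
    out, count = run_task(part)
    outputs.extend(out)
    total_bad += count                   # driver merges task updates

print(total_bad)   # 3 negative records seen across all tasks
print(outputs)     # [2, 6, 12, 14]
```

Note that, as in Spark, the tasks never read the accumulator's merged value; only the driver does, which is why accumulators suit counters and statistics rather than coordination between tasks.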
Explain Spark's join() operation?
What is the difference between map and flatMap?
Define a worker node?
Where are RDDs stored?
Explain Spark SQL caching and uncaching?
Define the Parquet file format? How to convert data to Parquet format?
What are broadcast variables in Apache Spark? Why do we need them?
How to process data using transformation operations in Spark?
What is Apache Spark Core?
What is Spark ETL?
What is the starvation scenario in Spark Streaming?
What is the difference between Dataset and DataFrame in Spark?