Why is there a need for broadcast variables when working with Apache Spark?
Answer / Amit Jeet Kumar
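Broadcast variables are useful in Apache Spark when the same read-only data, such as a lookup table, must be accessed by many tasks. Without broadcasting, Spark serializes that data into the closure of every task and ships it again with each one. A broadcast variable instead sends a single read-only copy to each executor, where it is cached and shared by all tasks running there. This reduces network communication and serialization overhead, because the data travels once per executor rather than once per task.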
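As a rough illustration of the idea, here is a minimal PySpark sketch. It assumes the caller already has a SparkContext and passes it in as `sc`; the function name `label_country_codes` and the sample data are hypothetical and not part of the original answer. The only Spark calls it relies on are `SparkContext.broadcast`, `SparkContext.parallelize`, `RDD.map`, `RDD.collect`, and `Broadcast.unpersist`.

```python
def label_country_codes(sc, codes, country_names):
    """Map country codes to names using a broadcast lookup table.

    sc            -- an existing pyspark SparkContext (supplied by the caller)
    codes         -- iterable of country-code strings (hypothetical sample data)
    country_names -- dict from code to name; assumed small enough to fit
                     in each executor's memory
    """
    # Ship the read-only dict to each executor once, instead of
    # serializing it into the closure of every task.
    names_bc = sc.broadcast(country_names)

    # Tasks read the cached copy through .value and must not modify it.
    labelled = (
        sc.parallelize(codes)
        .map(lambda code: (code, names_bc.value.get(code, "unknown")))
        .collect()
    )

    # Release the copies cached on the executors once the job is done.
    names_bc.unpersist()
    return labelled
```

With a live context, for example one created by `SparkContext("local[2]", "broadcast-demo")`, a call such as `label_country_codes(sc, ["IN", "US", "ZZ"], {"IN": "India", "US": "United States"})` should pair each code with its name and fall back to "unknown" for codes missing from the table. Broadcasting pays off mainly when the table is large and is reused across many tasks or stages; for a handful of values, an ordinary closure variable is usually sufficient.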