Why is transformation a lazy operation in Apache Spark RDD? How is it useful?
Answer / Rahul Bajpai
Transformations on an Apache Spark RDD (such as map() or filter()) are lazy: calling one computes nothing immediately. Spark only records the transformation in the RDD's lineage graph (a DAG) and executes it when an action, like collect(), count(), or saveAsTextFile(), is triggered. This is useful because Spark sees the whole plan before running any of it: it can pipeline chained transformations into a single pass over the data, skip work whose results are never used, recompute lost partitions from the lineage, and cache intermediate results that are reused, all of which improves performance on large datasets.
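The idea above can be illustrated with a small, self-contained Python sketch. This is a toy model, not the real pyspark API; the class name `LazyRDD` and its internals are invented purely to show how transformations record a plan while only an action walks the data:

```python
# Toy model of Spark's lazy evaluation -- NOT the real pyspark API.
# Transformations (map, filter) only append a step to a recorded plan;
# actions (collect, count) replay the plan over the data in one pass.

class LazyRDD:
    def __init__(self, data, plan=None):
        self.data = data
        self.plan = plan or []  # recorded transformations: the "lineage"

    # --- transformations: return a new LazyRDD, compute nothing ---
    def map(self, fn):
        return LazyRDD(self.data, self.plan + [("map", fn)])

    def filter(self, pred):
        return LazyRDD(self.data, self.plan + [("filter", pred)])

    # --- actions: run the whole pipeline, element by element ---
    def collect(self):
        out = []
        for item in self.data:
            keep = True
            for kind, fn in self.plan:
                if kind == "map":
                    item = fn(item)
                elif kind == "filter" and not fn(item):
                    keep = False
                    break
            if keep:
                out.append(item)
        return out

    def count(self):
        return len(self.collect())


rdd = LazyRDD(range(10))
squares = rdd.map(lambda x: x * x).filter(lambda x: x % 2 == 0)
# Nothing has been computed yet -- only the plan exists.
print(squares.collect())  # [0, 4, 16, 36, 64]
print(squares.count())    # 5
```

Note that the two chained steps run in a single loop over the data, which is the pipelining benefit laziness buys real Spark as well.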
Name some sources from which the Spark Streaming component can process real-time data.
What are shared variables in Apache Spark?
Explain the transformation and action operations on an Apache Spark RDD.
What does the reduce action do?
Which languages does Apache Spark support?
How does groupByKey work in Spark?
Explain the run-time architecture of Spark?
What are the ways in which Apache Spark handles accumulated Metadata?
Explain the key features of Spark.
Why is Spark faster than Hive?
What is the command to start and stop Spark in an interactive shell?
How can data transfer be minimized when working with Apache Spark?