What is the difference between persist() and cache()?
Answer / Hari Shankar
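The main difference lies in the storage level. cache() always uses the default storage level (MEMORY_ONLY for RDDs, MEMORY_AND_DISK for DataFrames and Datasets), while persist() lets you specify a storage level explicitly (such as MEMORY_ONLY, MEMORY_AND_DISK, or OFF_HEAP). Calling persist() with no argument is equivalent to calling cache().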