State the difference between the persist() and cache() functions.
What is a Directed Acyclic Graph (DAG)?
What are Actions? Give some examples.
What is the difference between DSM and RDD?
What do you mean by Persistence?
How do you create a sparse vector from a dense vector?
What are common uses of Apache Spark?
In a very large text file, you want to check only whether a particular keyword exists. How would you do this using Spark?
How is fault tolerance achieved in Apache Spark?
What are the benefits of lazy evaluation?
How does Apache Spark handle accumulated metadata?
What is lazy evaluation in Spark?
What is the use of checkpoints in Spark?
How can Spark be connected to Apache Mesos?
In how many ways can RDDs be created? Explain.