What are the ways in which one can know that the given operation is transformation or action?
What is spark lineage?
What is apache spark architecture?
What do we mean by Partitions or slices?
Is spark based on hadoop?
How do you parse data in xml? Which kind of class do you use with java to pass data?
What does map transformation do? Provide an example.
How does spark rdd work?
What is a DStream?
What are the languages in which Apache Spark create API?
What is apache spark written in?
What do you know about schemardd?
What is difference between cache and persist in spark?
What is a spark rdd?
Explain benefits of lazy evaluation in RDD in Apache Spark?