What is a DAG (directed acyclic graph)?
Explain SchemaRDD.
Describe the coalesce() operation. When can you coalesce to a larger number of partitions? Explain.
When we create an RDD, does it fetch the data and load it into memory?
What does the reduce action do?
How can you identify whether a given operation is a transformation or an action?
Explain the use of broadcast variables.
How do you parse data in XML? Which kind of class do you use with Java to parse data?
Explain the sortByKey() operation.
List various commonly used machine learning algorithms.
What are the ways to tell whether a given operation is a transformation or an action?
Describe the join() operation. How is outer join supported?
What is the method to create a DataFrame?
On what basis can you differentiate RDD, DataFrame, and Dataset?
How can you manually partition an RDD?