Can you define data reduction?
Can you define convex hull?
Can you explain what a recommender system is?
Can you explain data preparation?
Can you define A/B testing?
What cross-validation technique would you use on a time series data set? Is it k-fold or LOOCV?
Can you explain interpolation and extrapolation?
Can you explain the difference between data modeling and database design?
While working on a data set, how do you select important variables?
How can you assess whether a logistic model is good?
Define cluster sampling.
Explain data munging.
Explain root cause analysis.
How are true positive rate and recall related?
Explain the differences between bivariate and multivariate analysis.