What is recall?
Define standard deviation, mean, mode and median.
How would you deal with imbalanced data where the ratio of negative to positive examples is huge?
Can you explain data munging or data wrangling?
Would treating a categorical variable as a continuous variable result in a better predictive model?
If you had to choose between the programming languages R and Python, which one would you use for text analytics?
What is market basket analysis? How would you do it in R and Python?
Explain how to determine the number of clusters in a clustering algorithm.
What is meant by a random forest classifier?
What is collaborative filtering?
Explain the types of clustering algorithms.
What are random forests, and how are they different from decision trees?
How can you decide whether one algorithm is better than another?
What is linear optimization? Where is it used?
Do gradient descent methods always converge to the same point?