What is an outlier? How do you treat outlier data?
What is column storage?
How are random forests different from decision trees?
Name some kinds of graphs and explain how you would build them in Python or R.
How do you import data in SAS?
What is churn? How would you predict and control customer churn?
Which packages are used to import data in R and Python?
What is clustering? What is the difference between k-means clustering and hierarchical clustering?
What is the advantage of using the apply family of functions in R?
How do you use lambda in Python?
What is the difference between supervised and unsupervised methods?
What are the z-test, chi-square test, F-test, and t-test?
What packages are used for data mining in Python and R?
What are roles in cqlsh?