Create a program in a language of your choice to read a text file containing various tweets. The output should be two text files: the first lists every unique word across all tweets along with its count, and the second contains the median number of unique words per tweet.
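One possible approach, sketched in Python. The tokenization (whitespace split, case-sensitive), the one-tweet-per-line input format, and the function and file-path names are assumptions for illustration and are not specified by the question. The second output is read here as the median of the per-tweet unique word counts.

```python
from collections import Counter
from statistics import median


def summarize_tweets(tweets):
    """Return (word counts across all tweets, median unique words per tweet)."""
    counts = Counter()
    unique_per_tweet = []
    for tweet in tweets:
        # Assumption: words are whitespace-separated and case-sensitive.
        words = tweet.split()
        counts.update(words)
        unique_per_tweet.append(len(set(words)))
    med = median(unique_per_tweet) if unique_per_tweet else 0
    return counts, med


def write_reports(in_path, counts_path, median_path):
    """Read one tweet per line and write the two output files."""
    with open(in_path, encoding="utf-8") as f:
        tweets = [line.rstrip("\n") for line in f]
    counts, med = summarize_tweets(tweets)
    with open(counts_path, "w", encoding="utf-8") as f:
        for word in sorted(counts):
            f.write(f"{word}\t{counts[word]}\n")
    with open(median_path, "w", encoding="utf-8") as f:
        f.write(f"{med}\n")
```

For very large inputs, a running median (for example with two heaps) would avoid holding every per-tweet count in memory; the sketch above favors brevity.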
You can roll a die up to three times. You will be given $X, where X is the highest roll you get. You can choose to stop rolling at any time (for example, if you roll a 6 on the first roll, you can stop). What is your expected pay-out?
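The wording admits two readings, and a short Python sketch can compute both exactly. Under the common interview reading, you are paid the roll you stop on, so the optimal rule is to keep rolling whenever the current face is below the expected value of continuing. Under the literal reading, you are paid the maximum of all rolls, so stopping early never helps. The function names below are hypothetical.

```python
from fractions import Fraction
from itertools import product

FACES = range(1, 7)


def value_keep_last(rolls_left):
    """Expected payout if you are paid the roll you stop on, playing optimally."""
    if rolls_left == 0:
        return Fraction(0)
    # Value of declining the current roll and continuing with one fewer roll.
    cont = value_keep_last(rolls_left - 1)
    # For each face, take the better of stopping now or continuing.
    return sum(max(Fraction(f), cont) for f in FACES) / 6


def value_keep_max(n_rolls):
    """Expected payout if you are paid the maximum over all n_rolls rolls."""
    outcomes = list(product(FACES, repeat=n_rolls))
    return Fraction(sum(max(o) for o in outcomes), len(outcomes))


print(value_keep_last(3))  # 14/3, about 4.67
print(value_keep_max(3))   # 119/24, about 4.96
```

In the first reading the backward induction gives 7/2 with one roll left, 17/4 with two, and 14/3 with three: stop on 5 or 6 on the first roll, and on 4 or higher on the second.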
You are at a casino. You have two dice to play with. You win $10 every time you roll a 5. If you play until you win and then stop, what is the expected pay-out?
What metrics would you use to track whether Uber's paid advertising strategies for acquiring customers work? How would you determine an acceptable cost of customer acquisition?
How would you design a heatmap that recommends to Uber drivers where to wait for passengers? How would you approach this?
Case study question: Cars are fitted with speed trackers so that insurance companies can monitor how we drive. Given this new scheme, what kinds of business questions can be answered?
Which technique will you use to compare the performance of two back-end engines that generate automatic friend recommendations on Facebook?
You have two tables: the first has data about users and their friends, and the second has data about users and the pages they have liked. Write an SQL query that recommends pages to a user based on the pages their friends have liked. The query result should not recommend pages the user has already liked.
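A minimal sketch of one such query, run through Python's built-in `sqlite3` module so it can be tried directly. The table and column names (`friends(user_id, friend_id)`, `likes(user_id, page_id)`), the sample rows, and the ranking by number of friends who liked a page are all assumptions, since the question does not give a schema.

```python
import sqlite3

# Assumed schema: friends holds one row per direction of a friendship.
RECOMMEND_SQL = """
SELECT f.user_id, l.page_id, COUNT(*) AS friend_likes
FROM friends AS f
JOIN likes AS l
  ON l.user_id = f.friend_id
LEFT JOIN likes AS own
  ON own.user_id = f.user_id AND own.page_id = l.page_id
WHERE own.page_id IS NULL          -- drop pages the user already likes
GROUP BY f.user_id, l.page_id
ORDER BY f.user_id, friend_likes DESC, l.page_id
"""


def build_demo_db():
    """Create an in-memory database with a few illustrative rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE friends (user_id INTEGER, friend_id INTEGER)")
    conn.execute("CREATE TABLE likes (user_id INTEGER, page_id TEXT)")
    conn.executemany(
        "INSERT INTO friends VALUES (?, ?)",
        [(1, 2), (1, 3), (2, 1), (3, 1)],
    )
    conn.executemany(
        "INSERT INTO likes VALUES (?, ?)",
        [(1, "A"), (2, "A"), (2, "B"), (3, "B"), (3, "C")],
    )
    return conn


def recommend(conn):
    """Return (user_id, page_id, friend_likes) rows for pages not yet liked."""
    return conn.execute(RECOMMEND_SQL).fetchall()


print(recommend(build_demo_db()))
```

The `LEFT JOIN ... IS NULL` anti-join filters out pages the user has already liked; a `NOT EXISTS` subquery would express the same condition.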
Define some key performance indicators for the product
How do you handle multicollinearity?
Why is it mandatory to clean a data set?
Explain what an autoencoder is.
What prior subject knowledge is required to become a data analyst?
What is meant by R statistics?
What is a data science job?
Explain the differences between univariate and multivariate analysis.
How is the decision tree algorithm different from the random forest algorithm?
Which would you prefer for text analytics, Python or R?
Name some packages in R and Python for building regression models.
Can you cite some examples where a false negative is more important than a false positive?
What is cluster sampling?
What are the factors used to produce "People You May Know" data product on LinkedIn?
Give an example of a data set that has a non-Gaussian distribution.