Create a program in a language of your choice to read a text file containing various tweets. The output should be two text files: the first should contain the list of all unique words across all tweets along with a count for repeated words, and the second should contain the median number of unique words for all tweets.
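One possible approach to the tweet exercise above, sketched in Python. It assumes one tweet per line, treats a "word" as a lowercase run of letters, digits, or apostrophes, and reads "median number of unique words" as the median of the per-tweet unique-word counts. The function names and the tab-separated output layout are illustrative choices, not part of the question.

```python
import re
from collections import Counter
from pathlib import Path
from statistics import median

# Assumption: a word is a run of lowercase letters, digits, or apostrophes.
WORD_RE = re.compile(r"[a-z0-9']+")


def tokenize(tweet):
    """Split one tweet into lowercase words."""
    return WORD_RE.findall(tweet.lower())


def summarize_tweets(tweets):
    """Return (word counts over all tweets, median unique words per tweet)."""
    counts = Counter()
    unique_per_tweet = []
    for tweet in tweets:
        words = tokenize(tweet)
        counts.update(words)
        unique_per_tweet.append(len(set(words)))
    med = median(unique_per_tweet) if unique_per_tweet else 0
    return counts, med


def process_file(in_path, counts_path, median_path):
    """Read tweets (one per line) and write the two output files."""
    lines = Path(in_path).read_text(encoding="utf-8").splitlines()
    tweets = [line for line in lines if line.strip()]
    counts, med = summarize_tweets(tweets)
    # First output file: one "word<TAB>count" line per unique word, sorted.
    Path(counts_path).write_text(
        "".join(f"{word}\t{n}\n" for word, n in sorted(counts.items())),
        encoding="utf-8",
    )
    # Second output file: the median on a single line.
    Path(median_path).write_text(f"{med}\n", encoding="utf-8")
```

A call such as `process_file("tweets.txt", "word_counts.txt", "median_unique.txt")` would produce both files; the file names here are placeholders.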
You can roll a die three times. You will be given $X, where X is the highest roll you get. You can choose to stop rolling at any time (for example, if you roll a 6 on the first roll, you can stop). What is your expected payout?
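The wording of the die question above admits two readings, so a quick calculation of both may help. If the payout really is the highest of the rolls, stopping early never helps and the answer is the expected maximum of three rolls. If instead the payout is the roll you stop on (the more common version of this puzzle), the answer comes from backward induction: keep a roll only when it beats the expected value of continuing. The sketch below assumes a fair six-sided die; the function names are illustrative.

```python
from fractions import Fraction


def expected_max(rolls, sides=6):
    """Expected value of the highest face over a fixed number of rolls."""
    # P(max == k) = (k/sides)^rolls - ((k-1)/sides)^rolls
    return sum(
        k * (Fraction(k, sides) ** rolls - Fraction(k - 1, sides) ** rolls)
        for k in range(1, sides + 1)
    )


def expected_stop_value(rolls, sides=6):
    """Expected payout when you keep the roll you stop on, playing optimally."""
    faces = range(1, sides + 1)
    # With a single roll left you must accept whatever comes up.
    value = Fraction(sum(faces), sides)
    # Work backwards: keep a face only if it beats the value of rolling again.
    for _ in range(rolls - 1):
        value = sum(max(Fraction(face), value) for face in faces) / sides
    return value


print(expected_max(3))         # 119/24, about 4.96
print(expected_stop_value(3))  # 14/3, about 4.67
```

Under the second reading the optimal policy is to stop on a 5 or 6 after the first roll and on a 4, 5, or 6 after the second.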
You are at a casino. You have two dice to play with. You win $10 every time you roll a 5. If you play until you win and then stop, what is the expected payout?
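For the casino question above, note that if rolling is free and you stop at the first win, the payout is $10 with certainty; what varies is how many throws it takes. The question does not say whether "roll a 5" means at least one die showing 5 or the two dice summing to 5, so this sketch enumerates both readings, assuming fair dice and no cost per throw.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes for two fair six-sided dice.
outcomes = list(product(range(1, 7), repeat=2))

# Reading 1 (assumption): a win means at least one die shows a 5.
p_any_five = Fraction(sum(1 for a, b in outcomes if 5 in (a, b)), len(outcomes))

# Reading 2 (assumption): a win means the two dice sum to 5.
p_sum_five = Fraction(sum(1 for a, b in outcomes if a + b == 5), len(outcomes))

# The number of throws until the first win is geometric, with mean 1/p.
expected_throws_any_five = 1 / p_any_five
expected_throws_sum_five = 1 / p_sum_five

print(p_any_five, expected_throws_any_five)  # 11/36 36/11
print(p_sum_five, expected_throws_sum_five)  # 1/9 9
```

If each throw carried a cost, the expected net payout would be $10 minus that cost times the expected number of throws.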
What metrics will you use to track whether Uber's paid advertising strategies to acquire customers work? How will you figure out the acceptable cost of customer acquisition?
Uber
How will you design the heatmap for Uber drivers to provide recommendations on where to wait for passengers? How would you approach this?
Uber
Case study question: Cars are fitted with speed trackers so that insurance companies can monitor how we drive. Based on this new scheme, what kinds of business questions can be answered?
Which technique will you use to compare the performance of two back-end engines that generate automatic friend recommendations on Facebook?
You have two tables: the first has data about users and their friends, and the second has data about users and the pages they have liked. Write a SQL query to make recommendations using pages that your friends liked. The query result should not recommend pages that a user has already liked.
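A minimal sketch of one way to answer the SQL question above, run against an in-memory SQLite database from Python so it can be tried directly. The table and column names (`friends(user_id, friend_id)`, `likes(user_id, page_id)`) and the sample rows are assumptions, since the question does not give a schema. The query joins each user to the pages their friends liked, then uses a `LEFT JOIN ... IS NULL` anti-join to drop pages the user already likes, ranking the rest by how many friends liked them.

```python
import sqlite3

# Assumed schema: friends(user_id, friend_id) and likes(user_id, page_id).
RECOMMENDATION_QUERY = """
SELECT f.user_id, l.page_id, COUNT(*) AS friend_likes
FROM friends AS f
JOIN likes AS l
  ON l.user_id = f.friend_id
LEFT JOIN likes AS own
  ON own.user_id = f.user_id AND own.page_id = l.page_id
WHERE own.page_id IS NULL          -- exclude pages the user already likes
GROUP BY f.user_id, l.page_id
ORDER BY f.user_id, friend_likes DESC, l.page_id
"""


def demo_connection():
    """Build a small in-memory database with made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE friends (
            user_id INTEGER, friend_id INTEGER,
            PRIMARY KEY (user_id, friend_id)
        );
        CREATE TABLE likes (
            user_id INTEGER, page_id TEXT,
            PRIMARY KEY (user_id, page_id)
        );
        """
    )
    conn.executemany(
        "INSERT INTO friends VALUES (?, ?)", [(1, 2), (1, 3), (2, 1)]
    )
    conn.executemany(
        "INSERT INTO likes VALUES (?, ?)",
        [(1, "A"), (2, "A"), (2, "B"), (3, "B"), (3, "C")],
    )
    return conn


def recommend(conn):
    """Return (user_id, page_id, friend_likes) rows for every user."""
    return conn.execute(RECOMMENDATION_QUERY).fetchall()


print(recommend(demo_connection()))  # [(1, 'B', 2), (1, 'C', 1)]
```

A `NOT EXISTS` subquery would work equally well for the exclusion step; the anti-join is used here only because it keeps the whole query in one `FROM` clause.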
Do you prefer Python or R for text analytics?
Why does L1 regularization cause parameter sparsity whereas L2 regularization does not?
How will you design a recommendation engine for jobs?
How can the outlier values be treated?
What will be the output of runif(7) in R?
What is a z test?
What are the differences between univariate, bivariate, and multivariate analysis?
How often should an algorithm be updated?
What is cluster sampling?
What are the major skills a data scientist needs?
Why is Python used in data science?
Explain the structure of artificial neural networks.
What are a skewed distribution and a uniform distribution?
In k-means or kNN, we use Euclidean distance to calculate the distance between nearest neighbors. Why not Manhattan distance?
How regularly must an algorithm be updated?