Week 2: Data leakages (How to Win a Data Science Competition: Learn from Top Kagglers)
1. Suppose that you have a credit scoring task, where you have to create an ML model that approximates an expert's evaluation of an individual's creditworthiness. Which of the following could be a source of data leakage? Select all that apply.
2. What is the most foolproof way to set up a time series competition?
3. Suppose that you have a binary classification task evaluated by the logloss metric. You know that there are 10000 rows in the public chunk of the test set and that a constant prediction of 0.3 gives a public score of 1.01. The mean of the target variable in train is 0.44. What is the mean of the target variable in the public part of the test data (up to 4 decimal places)?
4. Suppose that you are solving an image classification task. What is the label of this picture?
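
The arithmetic behind question 3 can be sketched in a few lines. This is a minimal illustration, assuming the leaderboard uses the standard binary logloss with natural logarithms; the function names below are hypothetical and not part of the course material. For a constant prediction p and a target mean m, logloss reduces to -(m*ln(p) + (1 - m)*ln(1 - p)), which is linear in m and can be solved directly. Under that assumption, neither the row count nor the train mean enters the formula.

```python
import math


def constant_logloss(m: float, p: float) -> float:
    """Logloss of a constant prediction p when the target mean is m.

    Assumes natural logarithms (an assumption about the leaderboard metric).
    """
    return -(m * math.log(p) + (1.0 - m) * math.log(1.0 - p))


def mean_target_from_constant_logloss(score: float, p: float) -> float:
    """Invert constant_logloss: solve score = -(m*ln(p) + (1-m)*ln(1-p)) for m."""
    log_p = math.log(p)
    log_q = math.log(1.0 - p)
    # Rearranged: score = -log_q - m * (log_p - log_q)
    return (score + log_q) / (log_q - log_p)


# Values taken from question 3: constant prediction 0.3, public score 1.01.
public_mean = mean_target_from_constant_logloss(score=1.01, p=0.3)
print(round(public_mean, 4))
```

Feeding the recovered mean back into `constant_logloss` should reproduce the public score, which is a quick way to check the rearrangement.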