Recognizing Potential New Customers

Task

An online retailer is interested in identifying potential new customers from a population of consumers. Your task is to rank-order the consumer pool according to who is most likely to become a customer of the retailer.
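As a minimal sketch of what rank ordering means here, assume some model has already produced one score per consumer, with higher scores meaning "more likely to become a customer". The function name `rank_consumers` and the score list are hypothetical and not part of the task materials.

```python
from typing import List, Sequence


def rank_consumers(scores: Sequence[float]) -> List[int]:
    """Return consumer indices ordered from most to least likely customer.

    `scores` is assumed to hold one model score per consumer, where a
    higher score means a higher estimated likelihood of becoming a customer.
    Ties keep their original relative order because Python's sort is stable.
    """
    return sorted(range(len(scores)), key=lambda i: -scores[i])
```

For example, `rank_consumers([0.2, 0.9, 0.5])` places consumer 1 first, then consumer 2, then consumer 0. Only the ordering matters for this task, so the scores do not need to be calibrated probabilities.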

The first task is a binary classification problem: distinguish the retailer's customers from non-customers. The training data contains 334 variables for a known set of 130,475 customers and non-customers, in a ratio of 1:10 respectively.

Training Data (39 MB, 130,475 examples): One line per example; feature values are comma-delimited.

Training Labels (12 KB, 130,475 examples): One line per example, aligned with the training data file. 1 means the corresponding training example is positive; 0 means it is negative.
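The file layout described above could be read along the following lines. This is a sketch under stated assumptions: the function name `load_training_set` is hypothetical, the files are assumed to have no header row, and every feature value is assumed to parse as a number (the task description does not say how missing values, if any, are encoded).

```python
import csv
from typing import List, Tuple


def load_training_set(
    data_path: str, labels_path: str
) -> Tuple[List[List[float]], List[int]]:
    """Load comma-delimited features and their line-aligned 0/1 labels.

    Assumptions (not confirmed by the task description): no header row,
    and every feature value can be converted with float().
    """
    with open(data_path, newline="") as f:
        # One example per line; skip completely empty lines.
        rows = [[float(v) for v in record] for record in csv.reader(f) if record]

    with open(labels_path) as f:
        # One label per line; int() tolerates surrounding whitespace.
        labels = [int(line) for line in f if line.strip()]

    # The two files are aligned line by line, so their lengths must match.
    if len(rows) != len(labels):
        raise ValueError(
            f"{len(rows)} examples but {len(labels)} labels; files are misaligned"
        )
    if any(y not in (0, 1) for y in labels):
        raise ValueError("labels must be 0 (negative) or 1 (positive)")
    return rows, labels
```

The paths passed in are whatever names the downloaded files have locally. With 130,475 examples of 334 variables each, plain Python lists are memory-hungry, so an array library such as NumPy is likely a more practical choice for the real data; the alignment check carries over unchanged.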

Scoring Predictions

You need to use cross-validation on the training data to evaluate your predictor. The evaluation metric for E-commerce Customer Identification (Raw) is AUC (the area under the receiver operating characteristic curve).
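The evaluation loop could be sketched as follows, using only the standard library. AUC is computed here through its rank interpretation: the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as one half. The helper names, the choice of 5 stratified folds, and the `fit` callable (which trains on the given rows and labels and returns a scoring function) are all assumptions made for illustration; the task does not prescribe a fold count or a model.

```python
import random
from typing import Callable, List, Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        # Find the run of tied scores starting at position i.
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0  # average 1-based rank of the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative")
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


def stratified_folds(labels: Sequence[int], k: int = 5, seed: int = 0) -> List[List[int]]:
    """Split indices into k folds that each keep roughly the overall class ratio."""
    rng = random.Random(seed)
    folds: List[List[int]] = [[] for _ in range(k)]
    for cls in (0, 1):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)  # deal each class out round-robin
    return folds


def cross_validated_auc(
    rows: Sequence[Sequence[float]],
    labels: Sequence[int],
    fit: Callable[[list, list], Callable[[Sequence[float]], float]],
    k: int = 5,
    seed: int = 0,
) -> float:
    """Mean held-out AUC over k stratified folds.

    `fit` is a hypothetical user-supplied trainer: it receives training rows
    and labels and returns a function mapping one row to a score.
    """
    fold_aucs = []
    for held_out in stratified_folds(labels, k, seed):
        held = set(held_out)
        train_idx = [i for i in range(len(labels)) if i not in held]
        score = fit([rows[i] for i in train_idx], [labels[i] for i in train_idx])
        fold_aucs.append(
            auc([labels[i] for i in held_out], [score(rows[i]) for i in held_out])
        )
    return sum(fold_aucs) / len(fold_aucs)
```

Stratifying the folds is a design choice motivated by the 1:10 class ratio: it keeps positives present in every held-out fold, which AUC requires. In practice, scikit-learn's `roc_auc_score` and `StratifiedKFold` cover the same ground and would normally be preferred over hand-written versions.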