Bagging Probit Models for Unbalanced Classification
Hualin Wang, Xiaogang Su
Strategic Advancements in Utilizing Data Mining and Warehousing Technologies
DOI: 10.4018/978-1-60566-717-1.CH017 (https://doi.org/10.4018/978-1-60566-717-1.CH017)
Citations: 5
Abstract
The 11th Pacific-Asia Knowledge Discovery and Data Mining Conference (PAKDD 2007) hosted a data mining competition, co-organized by the Singapore Institute of Statistics. The data set comes from a consumer finance company seeking solutions to a cross-selling business problem. The company currently has two databases, one for credit card holders and the other for home loan (mortgage) customers, and would like to use this opportunity to cross-sell home loans to its credit card holders. It is therefore keenly interested in an effective scoring model for predicting potential cross-sell take-ups. The training dataset contains information on 40,700 customers with 40 input variables, most of which relate to the point of application for the company’s credit card, plus a binary target variable indicating home loan take-up status. The sample consists of customers who opened a new credit card with the company within a specific 2-year period and did not have an existing home loan with the company. The binary target variable has a value of 1 if the customer then opened a home loan with the company within 12 months after opening the credit card…