Strategic Advancements in Utilizing Data Mining and Warehousing Technologies: Latest Publications

The Power of Sampling and Stacking for the PAKDD-2007 Cross-Selling Problem

Strategic Advancements in Utilizing Data Mining and Warehousing Technologies Pub Date : 2008-04-01 DOI: 10.4018/jdwm.2008040104

P. Adeodato, G. C. Vasconcelos, A. L. Arnaud, Rodrigo C. L. V. Cunha, Domingos S. M. P. Monteiro, R. Neto

Citations: 22

Selecting Salient Features and Samples Simultaneously to Enhance Cross-Selling Model Performance

Strategic Advancements in Utilizing Data Mining and Warehousing Technologies Pub Date : 1900-01-01 DOI: 10.4018/978-1-60566-717-1.CH021

Dehong Qiu, Ye Wang, Qifeng Zhang

Citations: 0

Seismological Data Warehousing and Mining

Strategic Advancements in Utilizing Data Mining and Warehousing Technologies Pub Date : 1900-01-01 DOI: 10.4018/978-1-60566-098-1.ch019

Gerasimos Marketos, Y. Theodoridis, I. Kalogeras

Citations: 1

Bagging Probit Models for Unbalanced Classification

Strategic Advancements in Utilizing Data Mining and Warehousing Technologies Pub Date : 1900-01-01 DOI: 10.4018/978-1-60566-717-1.CH017

Hualin Wang, Xiaogang Su

Abstract: The 11th Pacific-Asia Knowledge Discovery and Data Mining Conference (PAKDD 2007) hosted a data mining competition, co-organized by the Singapore Institute of Statistics. The data set is from a consumer finance company with the aim of finding solutions for a cross-selling business problem. The company currently has two databases, one for credit card holders and the other for home loan (mortgage) customers and they would like to make use of this opportunity to cross-sell home loans to its credit card holders. Thus, it is of their keen interest to have an effective scoring model for predicting potential cross-sell take-ups. The training dataset contains information on 40,700 customers with 40 input variables, most of which are related to the point of application for the company’s credit card, plus a binary target variable indicating the home loan take-up status. This is a sample of customers who opened a new credit card with the company within a specific 2-year period and did not have an existing home loan with the company. The binary target variable has a value of 1 if the customer then opened a home loan with the company within 12 months after opening the credit ...

Citations: 5

An Integrated Framework for Fuzzy Classification and Analysis of Gene Expression Data

Strategic Advancements in Utilizing Data Mining and Warehousing Technologies Pub Date : 1900-01-01 DOI: 10.4018/978-1-60566-717-1.CH009

M. Khabbaz, K. Kianmehr, Mohammed Al-Shalalfa, R. Alhajj

Abstract: This chapter uses fuzzy classifier rules to capture the correlations between genes. The main motivation for the study is that a fuzzy classifier rule is essentially an “if-then” rule that contains linguistic terms to represent the feature values. A rule in this form, which shows the correlations among the genes, is very simple for domain experts to understand and interpret. In the proposed gene selection procedure, instead of measuring the effectiveness of every single gene for building the classifier model, the authors incorporate the importance of a gene's correlation with other existing genes into the process of gene selection. That is, a gene is rejected if it is not in a significant correlation with other genes in the dataset. Furthermore, to improve the reliability of this approach, the process is repeated several times in these experiments, and the genes reported as the result are the genes selected in most experiments. This chapter reports test results on five datasets and analyzes the achieved results from a biological perspective.

Citations: 3