Mining for combined association rules on multiple datasets

Yanchang Zhao, Huaifeng Zhang, F. Figueiredo, Longbing Cao, Chengqi Zhang
Proceedings of the 2007 international workshop on Domain driven data mining - DDDM '07. Published 2007-08-12. DOI: 10.1145/1288552.1288555 (https://doi.org/10.1145/1288552.1288555)
Citations: 37

Abstract

Many organisations store their digital information in distributed systems, whether across different locations or in vertically and horizontally partitioned repositories, which adds a high level of complexity to data mining. Classical data mining algorithms expect a denormalised structure to operate on, so heterogeneous data sources, such as static demographic data and dynamic transactional data, must be manipulated and integrated before commercial association rule algorithms can be applied. Bearing in mind the usefulness and understandability of the application from a business perspective, combined rules of multiple patterns derived from different repositories, containing historical and point-in-time data, were used to produce new techniques in association mining applied to debt recovery. Initially, debt repayment patterns were discovered using transactional data and class labels defined by domain expertise; demographic patterns were then attached to each of the class labels. After combining the patterns, two types of rules were discovered, leading to different results: 1) the same demographic pattern with different repayment patterns, and 2) the same repayment pattern with different demographic patterns. The rules produced are interesting, valuable, complete and understandable, which shows the applicability and effectiveness of the new method.
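The idea of combining demographic and repayment patterns and then grouping the resulting rules in two ways can be illustrated with a minimal sketch. This is not the authors' algorithm: the toy records, the thresholds, and the names `combined_rules` and `group_rules` are all assumptions made for illustration, and support and confidence are computed in the standard way for rules of the form (demographic pattern, repayment pattern) -> class label.

```python
from collections import Counter, defaultdict

# Hypothetical toy records (invented for illustration, not from the paper):
# (demographic pattern, repayment pattern, class label).
records = [
    ("age<30", "regular", "fast"),
    ("age<30", "regular", "fast"),
    ("age<30", "irregular", "slow"),
    ("age<30", "irregular", "slow"),
    ("age>=30", "regular", "fast"),
    ("age>=30", "irregular", "fast"),
    ("age>=30", "irregular", "slow"),
    ("age>=30", "irregular", "slow"),
]

def combined_rules(records, min_support=0.1, min_confidence=0.6):
    """Return rules (demographic AND repayment -> label) that pass both thresholds."""
    n = len(records)
    antecedent_counts = Counter((d, r) for d, r, _ in records)
    rule_counts = Counter(records)
    rules = []
    for (d, r, label), count in rule_counts.items():
        support = count / n                              # share of all records
        confidence = count / antecedent_counts[(d, r)]   # share of matching antecedents
        if support >= min_support and confidence >= min_confidence:
            rules.append({
                "demographic": d,
                "repayment": r,
                "label": label,
                "support": support,
                "confidence": confidence,
            })
    return rules

def group_rules(rules, key):
    """Group rules that share the same value of `key`; keep groups with more than one rule."""
    groups = defaultdict(list)
    for rule in rules:
        groups[rule[key]].append(rule)
    return {k: v for k, v in groups.items() if len(v) > 1}

rules = combined_rules(records)
# Type 1: same demographic pattern, different repayment patterns.
by_demographic = group_rules(rules, "demographic")
# Type 2: same repayment pattern, different demographic patterns.
by_repayment = group_rules(rules, "repayment")

for demographic, group in by_demographic.items():
    for rule in group:
        print(demographic, "+", rule["repayment"], "->", rule["label"],
              f"(conf={rule['confidence']:.2f})")
```

Grouping by one side of the antecedent shows how the class label changes when only the other side varies, which is what makes the two rule types in the abstract read differently from a business perspective.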