Mining for combined association rules on multiple datasets

Proceedings of the 2007 international workshop on Domain driven data mining - DDDM '07. Pub Date: 2007-08-12. DOI: 10.1145/1288552.1288555

Yanchang Zhao, Huaifeng Zhang, F. Figueiredo, Longbing Cao, Chengqi Zhang


Citations: 37

Abstract

Many organisations store their digital information in distributed systems, whether across different locations or in vertically and horizontally distributed repositories, which adds a high level of complexity to data mining. Classical data mining algorithms expect a denormalised structure to operate on, so heterogeneous data sources, such as static demographic data and dynamic transactional data, must be manipulated and integrated before commercial association rule algorithms can be applied. Bearing in mind the usefulness and understandability of the application from a business perspective, combined rules of multiple patterns derived from different repositories, containing historical and point-in-time data, were used to produce new techniques in association mining applied to debt recovery. First, debt repayment patterns were discovered using transactional data and class labels defined using domain expertise; demographic patterns were then attached to each of the class labels. Combining the patterns yielded two types of rules leading to different results: 1) the same demographic pattern with different repayment patterns, and 2) the same repayment pattern with different demographic patterns. The rules produced are interesting, valuable, complete and understandable, which shows the applicability and effectiveness of the new method.
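The combination step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the records, the pattern names (`age<30`, `regular`, and so on), the class labels (`quick`, `slow`), the thresholds, and the helper functions `combined_rules` and `group_by` are all hypothetical, and each customer is assumed to have already been reduced to one demographic pattern, one repayment pattern, and one expert-defined class label. The sketch computes support and confidence for rules of the form (demographic pattern, repayment pattern) -> class label, then groups the surviving rules in the two ways the abstract lists.

```python
from collections import defaultdict

# Made-up toy records: (demographic pattern, repayment pattern, class label).
# In the paper these would come from separate demographic and transactional
# repositories; here they are assumed to be already joined per customer.
records = [
    ("age<30", "irregular", "slow"),
    ("age<30", "irregular", "slow"),
    ("age<30", "regular", "quick"),
    ("age<30", "regular", "quick"),
    ("age>=30", "regular", "quick"),
    ("age>=30", "regular", "quick"),
    ("age>=30", "irregular", "quick"),
    ("age>=30", "irregular", "slow"),
]


def combined_rules(records, min_support=0.1, min_confidence=0.6):
    """Return rules (demo, repay) -> label that pass both thresholds."""
    n = len(records)
    body_counts = defaultdict(int)   # occurrences of (demo, repay)
    rule_counts = defaultdict(int)   # occurrences of (demo, repay, label)
    for demo, repay, label in records:
        body_counts[(demo, repay)] += 1
        rule_counts[(demo, repay, label)] += 1

    rules = []
    for (demo, repay, label), count in rule_counts.items():
        support = count / n
        confidence = count / body_counts[(demo, repay)]
        if support >= min_support and confidence >= min_confidence:
            rules.append({"demo": demo, "repay": repay, "label": label,
                          "support": support, "confidence": confidence})
    return rules


def group_by(rules, key):
    """Group rules sharing the same value of `key`; keep groups of 2+ rules."""
    groups = defaultdict(list)
    for rule in rules:
        groups[rule[key]].append(rule)
    return {k: v for k, v in groups.items() if len(v) > 1}


rules = combined_rules(records)
# Type 1: same demographic pattern, different repayment patterns.
same_demo = group_by(rules, "demo")
# Type 2: same repayment pattern, different demographic patterns.
same_repay = group_by(rules, "repay")

for name, groups in (("same demographic", same_demo),
                     ("same repayment", same_repay)):
    for shared, group in groups.items():
        for r in group:
            print(f"[{name}: {shared}] {r['demo']} & {r['repay']} "
                  f"-> {r['label']} (conf={r['confidence']:.2f})")
```

Reading the rules in groups is what makes them actionable in this toy setting: within one demographic group, different repayment behaviour points to different outcomes, while one repayment behaviour can be compared across demographic groups.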
