Mining of Attribute Interactions Using Information Theoretic Metrics

P. Chanda, Young-Rae Cho, A. Zhang, M. Ramanathan
DOI: 10.1109/ICDMW.2009.51 (https://doi.org/10.1109/ICDMW.2009.51)
Published in: 2009 IEEE International Conference on Data Mining Workshops
Publication date: 2009-12-06
Citations: 63

Abstract

Knowledge of the statistical interactions between the attributes in a data set provides insight into the underlying structure of the data and explains the relationships (independence, synergy, redundancy) between the attributes. In a supervised learning problem, typically only a small subset of the classifying attributes is actually associated with the class label. Interaction information captures the multivariate dependencies (synergy and redundancy) among the attributes and the class label. Mining the significant statistical interactions that contain information about the class label is computationally challenging: the number of possible interactions increases exponentially with the number of attributes, and most of these interactions contain redundant information when many correlated attributes are present. In this paper, we present a data mining method, named IM (Interaction Mining), that mines non-redundant attribute sets having significant interactions with the class label. We further demonstrate that the mined statistical interactions improve feature selection because they capture the multivariate inter-dependencies among the attributes.
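To make the notions of synergy and redundancy concrete, the following is a minimal sketch of three-way interaction information for discrete attributes. It is not the IM method itself, which the abstract does not specify. The function names are hypothetical, and the sign convention I(A;B;C) = I(A,B;C) - I(A;C) - I(B;C), under which positive values indicate synergy and negative values indicate redundancy, is an assumption; other conventions flip the sign.

```python
from collections import Counter
from math import log2


def entropy(*columns):
    """Joint Shannon entropy, in bits, of one or more equal-length discrete columns."""
    rows = list(zip(*columns))
    n = len(rows)
    return -sum((count / n) * log2(count / n) for count in Counter(rows).values())


def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y), where xs and ys are lists of columns."""
    return entropy(*xs) + entropy(*ys) - entropy(*xs, *ys)


def interaction_information(a, b, c):
    """Assumed convention: I(A,B;C) - I(A;C) - I(B;C).

    Positive: A and B are synergistic with respect to C.
    Negative: A and B carry redundant information about C.
    """
    joint = mutual_information([a, b], [c])
    return joint - mutual_information([a], [c]) - mutual_information([b], [c])


# Synergy: the class label is the XOR of two independent attributes,
# so neither attribute alone is informative but the pair determines the label.
a_xor = [0, 0, 1, 1]
b_xor = [0, 1, 0, 1]
c_xor = [0, 1, 1, 0]
print(interaction_information(a_xor, b_xor, c_xor))  # 1.0 bit (synergy)

# Redundancy: the second attribute is a copy of the first, and the label equals both.
a_dup = [0, 0, 1, 1]
print(interaction_information(a_dup, a_dup, a_dup))  # -1.0 bit (redundancy)
```

The exhaustive version of this computation over all attribute subsets is what grows exponentially with the number of attributes, which is why a mining strategy that prunes redundant sets is needed.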