The effects of change in statistical properties of datasets on feature selection stability

P. M. Chelvan, K. Perumal
2017 International Conference on Information Communication and Embedded Systems (ICICES). Publication date: 2017-02-01. DOI: 10.1109/ICICES.2017.8070728 (https://doi.org/10.1109/ICICES.2017.8070728)
Citations: 0

Abstract

Data mining extracts previously unknown knowledge from the large volumes of operational data that organizations store, and this knowledge can support managerial decision making. Advances in information and communication technologies mean that most such datasets are high dimensional. Feature selection is an important dimensionality reduction technique for managing the “curse of dimensionality”. The subsets of features selected in successive runs of a feature selection algorithm should be identical, or at least similar, under small perturbations of the experimental dataset. This robustness of a feature selection algorithm is called feature selection stability. A good privacy-preserving data mining technique must deliver high data quality together with security and privacy. This paper explores how perturbing datasets with privacy-preserving data mining techniques changes their statistical properties, and how those changes affect feature selection stability.
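The notion of feature selection stability described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: it assumes a simple correlation-based filter as the selector, additive Gaussian noise as a stand-in for a privacy-preserving perturbation, and the Jaccard index as the similarity measure between feature subsets. All function names (`select_top_k`, `add_noise`, `stability_under_noise`) are hypothetical.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def select_top_k(rows, labels, k):
    """Assumed filter selector: indices of the k features most correlated with the label."""
    n_features = len(rows[0])
    scores = [abs(pearson([r[j] for r in rows], labels)) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return set(ranked[:k])


def add_noise(rows, sigma, rng):
    """Additive Gaussian noise, an assumed stand-in for a privacy-preserving perturbation."""
    return [[v + rng.gauss(0.0, sigma) for v in r] for r in rows]


def jaccard(a, b):
    """Jaccard similarity of two feature subsets: |A & B| / |A | B|."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def stability_under_noise(rows, labels, k, sigma, seed=0):
    """Similarity between the subsets selected before and after perturbing the data."""
    rng = random.Random(seed)
    original = select_top_k(rows, labels, k)
    perturbed = select_top_k(add_noise(rows, sigma, rng), labels, k)
    return jaccard(original, perturbed)
```

A value of 1.0 means the perturbation left the selected subset unchanged, and values closer to 0.0 mean the selector picked largely different features. Sweeping `sigma` upward on a fixed dataset gives a rough picture of how quickly a given selector loses stability as the perturbation distorts the data's statistical properties.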