Undiagnosed samples aided rough set feature selection for medical data

D. Guan, Weiwei Yuan, Zilong Jin, Sungyoung Lee
Published in: 2012 2nd IEEE International Conference on Parallel, Distributed and Grid Computing (2012-12-01)
DOI: 10.1109/PDGC.2012.6449895
Citations: 4

Abstract

Medical data often consist of a large number of disease markers. Some of these markers do not help medical data analysis, and some even degrade it. Feature selection is therefore necessary to remove the unimportant markers. Among the many feature selection methods, rough set based feature selection (RSFS) has been widely used. Unlike other methods, RSFS is completely data-driven: it requires no additional information such as probability distributions. Traditional RSFS methods extract information only from diagnosed samples, so they usually need a large number of diagnosed samples to achieve good feature selection performance. In many real medical applications, however, diagnosed samples are limited while undiagnosed samples are plentiful. Motivated by semi-supervised learning, this paper proposes a novel RSFS method that learns from both diagnosed and undiagnosed samples, called undiagnosed samples aided rough set feature selection (USA-RSFS). Its main benefit is that undiagnosed samples reduce the number of diagnosed samples required. Finally, the promising performance of USA-RSFS is validated through a set of experiments on medical datasets.
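To make "completely data-driven" concrete, the following is a minimal sketch of classic supervised rough set feature selection: the dependency degree plus a greedy forward search in the style of QuickReduct. It is not USA-RSFS, whose details the abstract does not give, and it uses diagnosed (labeled) samples only. The function names, the toy table, and the assumption that markers are already discretized are all illustrative and do not come from the paper.

```python
from collections import defaultdict

def partition(rows, attrs):
    """Group row indices into equivalence classes by their values on attrs."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return list(blocks.values())

def dependency(rows, labels, attrs):
    """Dependency degree: fraction of samples in the positive region,
    i.e. in equivalence classes whose members all share one label."""
    if not rows:
        return 0.0
    consistent = sum(
        len(block)
        for block in partition(rows, attrs)
        if len({labels[i] for i in block}) == 1
    )
    return consistent / len(rows)

def quickreduct(rows, labels):
    """Greedy forward selection: repeatedly add the attribute that raises the
    dependency degree most, until it matches that of the full attribute set."""
    all_attrs = list(range(len(rows[0])))
    target = dependency(rows, labels, all_attrs)
    selected = []
    while dependency(rows, labels, selected) < target:
        best = max(
            (a for a in all_attrs if a not in selected),
            key=lambda a: dependency(rows, labels, selected + [a]),
        )
        selected.append(best)
    return selected

# Hypothetical toy table: three discretized markers per diagnosed sample.
# The diagnosis is constructed to equal marker 0, so markers 1 and 2 are noise.
rows = [
    (1, 0, 1),
    (1, 1, 0),
    (0, 0, 1),
    (0, 1, 0),
    (1, 0, 0),
    (0, 1, 1),
]
labels = [1, 1, 0, 0, 1, 0]

print(quickreduct(rows, labels))
```

On this toy table the sketch selects only marker 0, because that column alone already separates the two diagnoses. The only inputs are the table and its labels, with no probability model or tuning parameter, which is the sense in which the abstract calls RSFS data-driven. With few labeled rows the equivalence classes become sparse and the dependency degree becomes unreliable, which is the limitation the abstract cites as motivation for also using undiagnosed samples.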