Choosing SNPs using feature selection.

Tu Minh Phuong, Zhen Lin, Russ B Altman
Proceedings. IEEE Computational Systems Bioinformatics Conference, 2005, pp. 301-9. DOI: 10.1109/csb.2005.22 (https://doi.org/10.1109/csb.2005.22)
Citations: 89

Abstract

A major challenge for genome-wide disease association studies is the high cost of genotyping large numbers of single nucleotide polymorphisms (SNPs). The correlations between SNPs, however, make it possible to select a parsimonious set of informative SNPs, known as "tagging" SNPs, that captures most of the variation in a population. Considerable research interest has recently focused on developing methods for finding such SNPs. In this paper, we present an efficient method for finding tagging SNPs. The method does not involve a computation-intensive search over SNP subsets; instead, it discards redundant SNPs using a feature selection algorithm. In contrast to most existing methods, the method presented here is not limited to correlations between SNPs in local groups. By using correlations that occur across different chromosomal regions, it can reduce the number of globally redundant SNPs. Experimental results show that our method selects fewer tagging SNPs than block-based methods.
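
To make the idea of discarding redundant SNPs concrete, the following is a minimal sketch, not the algorithm from the paper, whose details the abstract does not give. It assumes haplotypes are coded as rows of 0/1 alleles and uses a simple greedy filter: a SNP is kept only if its squared correlation with every already-kept SNP is below a threshold. The names `r_squared`, `select_tag_snps`, and the `threshold` value are hypothetical choices made for illustration. Each SNP is compared against all kept SNPs, not only its neighbours, so correlations between distant positions can also be exploited.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length allele vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A SNP with no variation has undefined correlation;
        # this sketch treats it as uncorrelated with everything.
        return 0.0
    return cov * cov / (var_x * var_y)


def select_tag_snps(haplotypes, threshold=0.9):
    """Greedy redundancy filter (illustrative assumption, not the paper's method).

    haplotypes: list of rows, one per haplotype, each a list of 0/1 alleles.
    Returns the column indices of the SNPs that are kept as tags.
    """
    n_snps = len(haplotypes[0])
    # Transpose so that each entry is the allele vector of one SNP.
    columns = [[row[j] for row in haplotypes] for j in range(n_snps)]
    kept = []
    for j in range(n_snps):
        # Compare against every kept SNP, regardless of chromosomal distance.
        if all(r_squared(columns[j], columns[k]) < threshold for k in kept):
            kept.append(j)
    return kept


# Toy data: SNPs 0, 1 and 3 carry identical information; SNP 2 is independent.
haplotypes = [
    [0, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
]
tags = select_tag_snps(haplotypes)
print(tags)  # [0, 2]
```

In this toy example SNP 3 is dropped because it is perfectly correlated with SNP 0, even though SNP 2 lies between them. A filter restricted to adjacent SNPs would not detect that redundancy.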
