A Case-Control Study of Non-parametric Approaches for Detecting SNP-SNP Interactions

F. R. B. Araujo, E. Gusmão, K. Guimaraes
Published in: 2011 30th International Conference of the Chilean Computer Science Society
Publication date: 2011-11-09
DOI: 10.1109/SCCC.2011.4 (https://doi.org/10.1109/SCCC.2011.4)
Citations: 2

Abstract

The massive volume of available SNP data requires adequate computational strategies to handle it properly. A central task is to identify the SNP-SNP and SNP-environment combinations that best explain the propensity for a certain disease. We introduce a website (https://jaqueira.cin.ufpe.br/pit/faces/index.jsp) where three previously reported methods and one new method for SNP-SNP interaction analysis are implemented; they can be executed individually or all together on the same dataset. We also present the results of a case-control study of those methods, based on 70 epistatic models with varying heritability rates and minor allele frequencies. The experiments also consider different numbers of SNPs and different sizes of case-control sets. We observe that for a small number of SNPs the four methods are statistically equivalent, but as the number of SNPs grows they behave differently, except for ESNP2 and our method. Although all the methods are exhaustive, in our analysis ESNP2 generally runs much faster and achieves better accuracy. Nonetheless, the performance of ESNP2 can degrade in a scenario where a single gene explains most of the epistatic effects. In those cases, considering the interaction effects of all SNPs, instead of only the most significant ones, can deliver more accurate results. The proposed method, called Multi-Approach SNP-SNP Interaction Analysis (MASS), although statistically equivalent to ESNP2, achieves better results than ESNP2 in that situation. Our experiments show that specific epistatic models lead to markedly better or worse performance. While a small minor allele frequency can negatively affect accuracy, a small heritability rate is the single factor studied that has the strongest negative impact on accuracy.
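To make the notion of an "exhaustive" SNP-SNP interaction search concrete, the following is a minimal sketch of scoring every SNP pair in a case-control dataset. It is not ESNP2, MASS, or any other method evaluated in the paper: the Pearson chi-square statistic on the 9 x 2 table of joint genotype versus disease status is used here only as a simple stand-in score, and the function names (`pair_chi_square`, `exhaustive_scan`) and the 0/1/2 genotype coding are assumptions made for illustration.

```python
from itertools import combinations


def pair_chi_square(snp_a, snp_b, status):
    """Pearson chi-square statistic for a 9 x 2 contingency table.

    Rows are the joint genotypes (a, b) with a, b in {0, 1, 2};
    columns are control (0) and case (1). Illustrative stand-in score only.
    """
    counts = [[0, 0] for _ in range(9)]
    for a, b, y in zip(snp_a, snp_b, status):
        counts[3 * a + b][y] += 1

    n = len(status)
    col_totals = [sum(row[c] for row in counts) for c in (0, 1)]

    stat = 0.0
    for row in counts:
        row_total = row[0] + row[1]
        if row_total == 0:
            continue  # joint genotype never observed; contributes nothing
        for c in (0, 1):
            expected = row_total * col_totals[c] / n
            if expected > 0:
                stat += (row[c] - expected) ** 2 / expected
    return stat


def exhaustive_scan(genotypes, status):
    """Score every unordered SNP pair and rank pairs by decreasing score.

    genotypes: dict mapping a (hypothetical) SNP name to a list of 0/1/2 codes.
    status:    list of 0 (control) / 1 (case), one entry per individual.
    """
    scores = {}
    for name_a, name_b in combinations(sorted(genotypes), 2):
        scores[(name_a, name_b)] = pair_chi_square(
            genotypes[name_a], genotypes[name_b], status
        )
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

The number of pairs grows quadratically with the number of SNPs, which is why the running time of exhaustive methods becomes a practical concern as the SNP count increases. A purely epistatic pattern (for example, disease status depending on two SNPs jointly while neither shows a marginal effect) is the kind of signal that a pairwise scan like this can rank highly but a single-SNP test would miss.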