A Novel Detection Method for High-Order SNP Epistatic Interactions Based on Explicit-Encoding-Based Multitasking Harmony Search.

IF 3.9 | Zone 2 (Biology) | JCR Q1, Mathematical & Computational Biology
Shouheng Tuo, Jiewei Jiang
Interdisciplinary Sciences: Computational Life Sciences, pp. 688-711. Journal Article. Published 2024-09-01 (Epub 2024-07-02). DOI: 10.1007/s12539-024-00621-2 (https://doi.org/10.1007/s12539-024-00621-2)
Citations: 0

Abstract

To elucidate the genetic basis of complex diseases, it is crucial to discover the single-nucleotide polymorphisms (SNPs) that contribute to disease susceptibility. This is particularly challenging for high-order SNP epistatic interactions (HEIs), which exhibit small individual effects but potentially large joint effects. These interactions are difficult to detect because the search space is vast, encompassing billions of possible combinations, and because evaluating those combinations is computationally expensive. This study proposes a novel explicit-encoding-based multitasking harmony search algorithm (MTHS-EE-DHEI) designed to address this challenge. The algorithm operates in three stages. First, a harmony search algorithm uses four lightweight evaluation functions, such as Bayesian network and entropy, to efficiently explore potential SNP combinations related to disease status. Second, a G-test is applied to filter out statistically insignificant SNP combinations. Finally, two machine learning-based methods, multifactor dimensionality reduction (MDR) and random forest (RF), are used to validate the classification performance of the remaining significant SNP combinations. This research aims to demonstrate the effectiveness of MTHS-EE-DHEI in identifying HEIs compared to existing methods, potentially providing valuable insights into the genetic architecture of complex diseases. The performance of MTHS-EE-DHEI was evaluated on twenty simulated disease datasets and three real-world datasets covering age-related macular degeneration (AMD), rheumatoid arthritis (RA), and breast cancer (BC). The results indicate that MTHS-EE-DHEI outperforms four state-of-the-art algorithms in both detection power and computational efficiency. The source code is available at https://github.com/shouhengtuo/MTHS-EE-DHEI.git.
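
The second stage described in the abstract can be illustrated with a minimal sketch of a G-test on a contingency table of genotype combinations against disease status. This is not the authors' implementation (their code is in the repository linked in the abstract). The function name `g_test`, the 0/1/2 genotype coding, the toy data, and the choice to count only the genotype combinations actually observed when computing degrees of freedom are all assumptions made for illustration.

```python
import math
from collections import Counter


def g_test(genotypes, labels):
    """Return (G statistic, degrees of freedom) for a SNP combination.

    genotypes: one tuple per sample holding the genotype (assumed coded
               0/1/2) of each SNP in the combination.
    labels:    one disease-status value per sample (e.g. 0 = control,
               1 = case).
    """
    if len(genotypes) != len(labels):
        raise ValueError("genotypes and labels must have the same length")
    n = len(labels)

    # Observed counts per (genotype combination, status) cell, plus margins.
    cell = Counter(zip(genotypes, labels))
    row = Counter(genotypes)
    col = Counter(labels)

    # G = 2 * sum(O * ln(O / E)); empty cells contribute zero and are
    # never visited because Counter only stores observed cells.
    g = 0.0
    for (combo, label), observed in cell.items():
        expected = row[combo] * col[label] / n
        g += 2.0 * observed * math.log(observed / expected)

    # Assumption: degrees of freedom use only the observed rows and columns.
    df = (len(row) - 1) * (len(col) - 1)
    return g, df


if __name__ == "__main__":
    # Hypothetical toy data: 8 samples, 2 SNPs; all cases carry (1, 2).
    combos = [(0, 0), (0, 1), (1, 0), (0, 0), (1, 2), (1, 2), (1, 2), (1, 2)]
    status = [0, 0, 0, 0, 1, 1, 1, 1]
    g, df = g_test(combos, status)
    print(f"G = {g:.3f}, df = {df}")
```

Under the null hypothesis of independence, G approximately follows a chi-square distribution with `df` degrees of freedom, so a combination would be kept when the corresponding p-value falls below a chosen threshold. The abstract does not state the threshold or any multiple-testing correction, so none is assumed here.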

Source journal: Interdisciplinary Sciences: Computational Life Sciences (Mathematical & Computational Biology)
CiteScore: 8.60
Self-citation rate: 4.20%
Articles published: 55
About the journal: Interdisciplinary Sciences: Computational Life Sciences aims to cover the most recent and outstanding developments in interdisciplinary areas of science, with a focus on computational life sciences, an area that is developing rapidly at the forefront of scientific research and technology. The journal publishes original papers of significant general interest covering recent research and developments. Articles are published rapidly by taking full advantage of internet technology for online submission and peer review of manuscripts, and then by publishing Online First™ through SpringerLink even before the issue is built or sent to the printer. The editorial board consists of many leading scientists with international reputations, including Luc Montagnier (UNESCO, France), Dennis Salahub (University of Calgary, Canada), and Weitao Yang (Duke University, USA). Prof. Dongqing Wei of Shanghai Jiao Tong University is the editor-in-chief; he has made important contributions in bioinformatics and computational physics and is best known for his ground-breaking work on the theory of ferroelectric liquids. With the help of a team of associate editors and the editorial board, the journal aims to build a sound international reputation.