De novo prediction of functional effects of genetic variants from DNA sequences based on context-specific molecular information
Jiaxin Yang, Sikta Das Adhikari, Hao Wang, Binbin Huang, Wenjie Qi, Yuehua Cui, Jianrong Wang
Frontiers in Systems Biology, published 2024-06-03. DOI: 10.3389/fsysb.2024.1402664 (https://doi.org/10.3389/fsysb.2024.1402664)
Abstract
Deciphering the functional effects of noncoding genetic variants is a fundamental challenge in human genetics. Traditional approaches, such as Genome-Wide Association Studies (GWAS), Transcriptome-Wide Association Studies (TWAS), and Quantitative Trait Loci (QTL) studies, are limited because they leave the underlying molecular-level mechanisms obscured, making it difficult to unravel the genetic basis of complex traits. The advent of Next-Generation Sequencing (NGS) technologies has enabled context-specific, genome-wide measurements of gene expression, chromatin accessibility, epigenetic marks, and transcription factor binding sites across diverse cell types and tissues, paving the way for decoding the effects of genetic variation directly from DNA sequence alone. De novo predictions of functional effects are pivotal for understanding transcriptional regulation and how it is disrupted by the many noncoding genetic variants linked to human diseases and traits. This review provides a systematic overview of state-of-the-art models and algorithms for predicting genetic variant effects, including traditional sequence-based models, deep learning models, and foundation models. It also discusses ongoing challenges and prospective directions in this domain.