Can classical statistics and deep learning converge on explainable, causally driven target discovery?

IF 2.9 · CAS Zone 2 (Biology) · JCR Q1, Genetics & Heredity

DNA Research · Pub Date: 2025-09-15 · DOI: 10.1093/dnares/dsaf024

Liyin Chen


Citations: 0

Abstract


Can classical statistics and deep learning converge on explainable, causally driven target discovery?

Understanding the molecular causes of complex diseases remains one of the most pressing challenges in biomedicine. Despite large-scale genome-wide association studies mapping thousands of risk loci, identifying which genetic variants truly drive disease remains difficult. Traditional statistical genetics has laid a strong foundation for variant discovery, but it often struggles to capture non-linear interactions and cannot fully integrate the breadth of the interconnected multi-omics data. In recent years, deep learning approaches have shown promise in bridging these gaps: modeling high-order genetic interactions, uncovering latent biological structure, and enabling multi-layered data integration. However, most current deep learning models for genomics remain exploratory in nature, and issues such as susceptibility to overfitting, difficulties in interpretability, and the general lack of standardized evaluation frameworks have limited their widespread adoption for genomics research. In this review, we explore how traditional statistical and deep learning methods can be applied to uncover causal mechanisms in complex disease. We critically compare these two frameworks for their advantages and limitations in detecting genetic associations and prioritizing causal associations. Toward the end, we propose a future direction centered around hybrid models that blend the scalability of deep learning with the inferential power of statistical genetics. Our goal is to guide researchers in developing next-generation computational tools to uncover the molecular basis of complex diseases and accelerate the translation of genetic findings into effective treatments.


Source journal

DNA Research (Biology: Genetics)

CiteScore

6.00

Self-citation rate

4.90%

Articles published

Review time

4.5 months

About the journal: DNA Research is an internationally peer-reviewed journal that aims to publish papers of the highest quality on broad aspects of DNA and genome-related research. Emphasis is placed on the following subjects: 1) sequencing and characterization of genomes and important genomic regions; 2) comprehensive analysis of the functions of genes, gene families, and genomes; 3) techniques and equipment useful for structural and functional analysis of genes, gene families, and genomes; 4) computer algorithms and/or their applications relevant to structural and functional analysis of genes and genomes. The journal also welcomes novel findings in other scientific disciplines related to genomes.