MELGene: knowledge-enhanced multimodel ensemble learning for disease-gene association prediction
Haoyu Tian, Kuo Yang, Zeyu Liu, Hong Gao, Jian Yu, Lei Zhang, Xuezhong Zhou
Briefings in Bioinformatics, 27(2), published 2026-03-01 (Journal Article; impact factor 7.7, JCR Q1, Biochemical Research Methods)
DOI: https://doi.org/10.1093/bib/bbag172
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13082380/pdf/
Abstract
Disease-gene prediction (DGP) plays a pivotal role in understanding the genetic underpinnings of diseases, offering insights for diagnosis, treatment, and prevention. Accurate identification of disease-related genes can advance personalized medicine and the development of targeted therapies. Although numerous DGP methods have been proposed, a significant challenge remains in effectively capturing and modeling the complex relationships among biological entities such as diseases, symptoms, genes, and pathways. These interactions are essential for learning robust representations of phenotypes and genotypes, which in turn are critical for accurate DGP. In this study, we introduce MELGene, a knowledge-enhanced multimodel ensemble learning framework for DGP. MELGene adaptively integrates multiple pretrained, knowledge graph-based inference models, combining the strengths of diverse models to achieve more accurate gene predictions. The framework incorporates Model-aware Importance Learning, which dynamically adjusts the contributions of individual models, and a dynamic ensemble mechanism that produces robust consensus predictions. Comprehensive experiments, including performance comparisons, demonstrated the strong performance of MELGene. Ablation experiments highlighted the positive impact of each module, while case studies on gastric, lung, and liver cancers showcased the reliability and biological relevance of the predictions, as supported by network medicine analysis, functional enrichment, and literature mining. MELGene offers a flexible framework for DGP through knowledge enhancement and adaptive ensemble learning, with broad potential for decoding disease mechanisms.
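The abstract gives no implementation details for the importance-weighted ensemble, so the following is only a minimal illustrative sketch of the general idea, not MELGene's actual method. It assumes each pretrained model outputs one score per candidate gene, and that per-model importance values are available as plain numbers (in a real system they would be learned). The function names, gene labels, and scores are all hypothetical.

```python
import math


def softmax(logits):
    """Turn arbitrary importance values into positive weights that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_scores(model_scores, importance_logits):
    """Combine per-model gene scores into one consensus score per gene.

    model_scores: one list per model, each holding a score per candidate gene.
    importance_logits: one hypothetical importance value per model.
    """
    weights = softmax(importance_logits)
    n_genes = len(model_scores[0])
    return [
        sum(w * scores[g] for w, scores in zip(weights, model_scores))
        for g in range(n_genes)
    ]


def rank_genes(genes, scores):
    """Order candidate genes from highest to lowest consensus score."""
    ranked = sorted(zip(genes, scores), key=lambda pair: pair[1], reverse=True)
    return [gene for gene, _ in ranked]


# Hypothetical data: two models scoring three candidate genes for one disease.
genes = ["GENE_A", "GENE_B", "GENE_C"]
model_scores = [
    [0.9, 0.2, 0.4],  # scores from hypothetical model 1
    [0.1, 0.8, 0.3],  # scores from hypothetical model 2
]
importance_logits = [1.0, 0.0]  # model 1 is treated as more important

consensus = ensemble_scores(model_scores, importance_logits)
ranking = rank_genes(genes, consensus)
print(ranking)
```

With equal importance values the combination reduces to a plain average of the model scores; raising one model's importance shifts the consensus ranking toward that model's preferences.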
About the journal:
Briefings in Bioinformatics is an international journal serving as a platform for researchers and educators in the life sciences. It also appeals to mathematicians, statisticians, and computer scientists applying their expertise to biological challenges. The journal focuses on reviews tailored for users of databases and analytical tools in contemporary genetics, molecular and systems biology. It stands out by offering practical assistance and guidance to non-specialists in computerized methodologies. Covering a wide range from introductory concepts to specific protocols and analyses, the papers address bacterial, plant, fungal, animal, and human data.
The journal's detailed subject areas include genetic studies of phenotypes and genotypes, mapping, DNA sequencing, expression profiling, gene expression studies, microarrays, alignment methods, protein profiles and HMMs, lipids, metabolic and signaling pathways, structure determination and function prediction, phylogenetic studies, and education and training.