DA-HGL: a domain-augmented heterogeneous graph learning framework for protein function prediction.

IF 7.7, Zone 2 (Biology), Q1 BIOCHEMICAL RESEARCH METHODS

Briefings in bioinformatics Pub Date : 2025-08-31 DOI:10.1093/bib/bbaf511

Sai Hu, Wei Zhang, Bihai Zhao

Volume 26, issue 5. Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12476837/pdf/

Citations: 0

Abstract

Accurate protein function prediction is critical for deciphering disease mechanisms and advancing precision medicine, yet remains challenging for proteins with sparse annotations. Traditional methods struggle with annotation sparsity and fail to integrate multimodal data holistically. We propose DA-HGL, a heterogeneous graph learning framework that integrates protein sequences, domain architectures, and Gene Ontology (GO) hierarchies through a multilayered graph and non-negative matrix factorization with dual biological constraints. DA-HGL uniquely models domain-function coherence, GO semantic consistency, and topological congruence. Evaluated on yeast and human proteomes, DA-HGL achieves Fmax gains of 9.0% (yeast CC) and 17.2% (human BP) over state-of-the-art methods. By dynamically learning domain-context associations and resolving annotation sparsity, DA-HGL excels in cold-start scenarios and disease-specific predictions (e.g. Parkinson's "ubiquitin-dependent catabolism"). This framework offers a robust tool for accelerating functional genomics and precision medicine. Code/data: https://github.com/husaiccsu/DA-HGL.
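The abstract reports its results as Fmax gains but does not define the metric. As a rough illustration only, the sketch below computes the protein-centric Fmax commonly used in CAFA-style evaluations: the maximum, over score thresholds, of the harmonic mean of averaged precision and recall. It assumes this is the variant the authors mean; the function name `fmax`, the threshold grid, and the toy matrices are hypothetical and are not taken from the DA-HGL code.

```python
import numpy as np

def fmax(y_true, y_score, thresholds=None):
    """Protein-centric Fmax (CAFA-style), a sketch under stated assumptions.

    y_true  : (n_proteins, n_terms) binary matrix of true GO annotations.
    y_score : (n_proteins, n_terms) predicted scores in [0, 1].
    """
    y_true = np.asarray(y_true, dtype=bool)
    y_score = np.asarray(y_score, dtype=float)
    if thresholds is None:
        # Assumed grid: 0.01, 0.02, ..., 1.00
        thresholds = np.linspace(0.01, 1.0, 100)

    n_true = y_true.sum(axis=1)
    has_truth = n_true > 0          # recall is averaged over annotated proteins
    best = 0.0
    for t in thresholds:
        pred = y_score >= t
        n_pred = pred.sum(axis=1)
        covered = n_pred > 0        # precision only over proteins with a prediction
        if not covered.any() or not has_truth.any():
            continue
        tp = (pred & y_true).sum(axis=1)
        precision = (tp[covered] / n_pred[covered]).mean()
        recall = (tp[has_truth] / n_true[has_truth]).mean()
        if precision + recall > 0:
            best = max(best, 2 * precision * recall / (precision + recall))
    return best

# Toy example: two proteins, two GO terms (made-up numbers).
toy_true = [[1, 0], [0, 1]]
toy_score = [[0.9, 0.8], [0.2, 0.6]]
toy_fmax = fmax(toy_true, toy_score)
```

In the toy example, the best threshold keeps both predictions for the first protein and only the second term for the second protein, giving precision 0.75 and recall 1.0. Real evaluations usually also propagate annotations up the GO hierarchy before scoring, which this sketch leaves out.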

Source journal

Briefings in Bioinformatics (Biology: Biochemical Research Methods)

CiteScore: 13.20
Self-citation rate: 13.70%
Articles published: 549
Review time: 6 months

About the journal: Briefings in Bioinformatics is an international journal serving as a platform for researchers and educators in the life sciences. It also appeals to mathematicians, statisticians, and computer scientists applying their expertise to biological challenges. The journal focuses on reviews tailored for users of databases and analytical tools in contemporary genetics, molecular and systems biology. It stands out by offering practical assistance and guidance to non-specialists in computerized methodologies. Covering a wide range from introductory concepts to specific protocols and analyses, the papers address bacterial, plant, fungal, animal, and human data. The journal's detailed subject areas include genetic studies of phenotypes and genotypes, mapping, DNA sequencing, expression profiling, gene expression studies, microarrays, alignment methods, protein profiles and HMMs, lipids, metabolic and signaling pathways, structure determination and function prediction, phylogenetic studies, and education and training.