M-DeepAssembly：基于多目标多结构域蛋白质构象采样的增强DeepAssembly。

IF 3.3 3区生物学 Q2 BIOCHEMICAL RESEARCH METHODS

BMC Bioinformatics Pub Date : 2025-05-05 DOI:10.1186/s12859-025-06131-2

Xinyue Cui, Yuhao Xia, Minghua Hou, Xuanfeng Zhao, Suhui Wang, Guijun Zhang

{"title":"M-DeepAssembly：基于多目标多结构域蛋白质构象采样的增强DeepAssembly。","authors":"Xinyue Cui, Yuhao Xia, Minghua Hou, Xuanfeng Zhao, Suhui Wang, Guijun Zhang","doi":"10.1186/s12859-025-06131-2","DOIUrl":null,"url":null,"abstract":"Background: Association and cooperation among structural domains play an important role in protein function and drug design. Despite remarkable advancements in highly accurate single-domain protein structure prediction through the collaborative efforts of the community using deep learning, challenges still exist in predicting multi-domain protein structures when the evolutionary signal for a given domain pair is weak or the protein structure is large.Results: To alleviate the above challenges, we proposed M-DeepAssembly, a protocol based on multi-objective protein conformation sampling algorithm for multi-domain protein structure prediction. Firstly, the inter-domain interactions and full-length sequence distance features are extracted through DeepAssembly and AlphaFold2, respectively. Secondly, subject to these features, we constructed a multi-objective energy model and designed a sampling algorithm for exploring and exploiting conformational space to generate ensembles. Finally, the output protein structure was selected from the ensembles using our in-house developed model quality assessment algorithm. On the test set of 164 multi-domain proteins, the results show that the average TM-score of M-DeepAssembly is 15.4% and 2.0% higher than AlphaFold2 and DeepAssembly, respectively. It is worth noting that there are models with higher accuracy in ensembles, achieving an improvement of 20.3% and 6.4% relative to the two baseline methods, although these models were not selected. Furthermore, when compared to the prediction results of AlphaFold2 for CASP15 multi-domain targets, M-DeepAssembly demonstrates certain performance advantages.Conclusions: M-DeepAssembly provides a distinctive multi-domain protein assembly algorithm, which can alleviate the current challenges of weak evolutionary signals and large structures to some extent by forming diverse ensembles using multi-objective protein conformation sampling algorithm. The proposed method contributes to exploring the functions of multi-domain proteins, especially providing new insights into targets with multiple conformational states.","PeriodicalId":8958,"journal":{"name":"BMC Bioinformatics","volume":"26 1","pages":"120"},"PeriodicalIF":3.3000,"publicationDate":"2025-05-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12054043/pdf/","citationCount":"0","resultStr":"{\"title\":\"M-DeepAssembly: enhanced DeepAssembly based on multi-objective multi-domain protein conformation sampling.\",\"authors\":\"Xinyue Cui, Yuhao Xia, Minghua Hou, Xuanfeng Zhao, Suhui Wang, Guijun Zhang\",\"doi\":\"10.1186/s12859-025-06131-2\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Background: Association and cooperation among structural domains play an important role in protein function and drug design. Despite remarkable advancements in highly accurate single-domain protein structure prediction through the collaborative efforts of the community using deep learning, challenges still exist in predicting multi-domain protein structures when the evolutionary signal for a given domain pair is weak or the protein structure is large.Results: To alleviate the above challenges, we proposed M-DeepAssembly, a protocol based on multi-objective protein conformation sampling algorithm for multi-domain protein structure prediction. Firstly, the inter-domain interactions and full-length sequence distance features are extracted through DeepAssembly and AlphaFold2, respectively. Secondly, subject to these features, we constructed a multi-objective energy model and designed a sampling algorithm for exploring and exploiting conformational space to generate ensembles. Finally, the output protein structure was selected from the ensembles using our in-house developed model quality assessment algorithm. On the test set of 164 multi-domain proteins, the results show that the average TM-score of M-DeepAssembly is 15.4% and 2.0% higher than AlphaFold2 and DeepAssembly, respectively. It is worth noting that there are models with higher accuracy in ensembles, achieving an improvement of 20.3% and 6.4% relative to the two baseline methods, although these models were not selected. Furthermore, when compared to the prediction results of AlphaFold2 for CASP15 multi-domain targets, M-DeepAssembly demonstrates certain performance advantages.Conclusions: M-DeepAssembly provides a distinctive multi-domain protein assembly algorithm, which can alleviate the current challenges of weak evolutionary signals and large structures to some extent by forming diverse ensembles using multi-objective protein conformation sampling algorithm. The proposed method contributes to exploring the functions of multi-domain proteins, especially providing new insights into targets with multiple conformational states.\",\"PeriodicalId\":8958,\"journal\":{\"name\":\"BMC Bioinformatics\",\"volume\":\"26 1\",\"pages\":\"120\"},\"PeriodicalIF\":3.3000,\"publicationDate\":\"2025-05-05\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12054043/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"BMC Bioinformatics\",\"FirstCategoryId\":\"99\",\"ListUrlMain\":\"https://doi.org/10.1186/s12859-025-06131-2\",\"RegionNum\":3,\"RegionCategory\":\"生物学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"BIOCHEMICAL RESEARCH METHODS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"BMC Bioinformatics","FirstCategoryId":"99","ListUrlMain":"https://doi.org/10.1186/s12859-025-06131-2","RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"BIOCHEMICAL RESEARCH METHODS","Score":null,"Total":0}

引用次数: 0

摘要

背景：结构域之间的关联和合作在蛋白质功能和药物设计中起着重要作用。尽管通过深度学习社区的共同努力，在高精度的单结构域蛋白质结构预测方面取得了显著进展，但当给定结构域对的进化信号较弱或蛋白质结构较大时，预测多结构域蛋白质结构仍然存在挑战。结果：为了缓解上述挑战，我们提出了一种基于多目标蛋白质构象采样算法的M-DeepAssembly协议，用于多结构域蛋白质结构预测。首先，分别通过DeepAssembly和AlphaFold2提取域间相互作用和全长序列距离特征；其次，针对这些特点，构建了多目标能量模型，并设计了一种探索和利用构象空间生成系综的采样算法；最后，使用我们内部开发的模型质量评估算法从集成中选择输出的蛋白质结构。在164个多结构域蛋白的测试集上，结果表明M-DeepAssembly的平均tm分数分别比AlphaFold2和DeepAssembly高15.4%和2.0%。值得注意的是，尽管没有选择这些模型，但在集成中存在精度更高的模型，相对于两种基线方法，它们的精度分别提高了20.3%和6.4%。此外，与AlphaFold2对CASP15多域靶标的预测结果相比，M-DeepAssembly显示出一定的性能优势。结论：M-DeepAssembly提供了一种独特的多结构域蛋白质组装算法，通过多目标蛋白质构象采样算法形成多样化的集成，在一定程度上缓解了当前弱进化信号和大结构的挑战。该方法有助于探索多结构域蛋白的功能，特别是对具有多种构象状态的靶标提供了新的认识。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

M-DeepAssembly: enhanced DeepAssembly based on multi-objective multi-domain protein conformation sampling.

Background: Association and cooperation among structural domains play an important role in protein function and drug design. Despite remarkable advancements in highly accurate single-domain protein structure prediction through the collaborative efforts of the community using deep learning, challenges still exist in predicting multi-domain protein structures when the evolutionary signal for a given domain pair is weak or the protein structure is large.

Results: To alleviate the above challenges, we proposed M-DeepAssembly, a protocol based on multi-objective protein conformation sampling algorithm for multi-domain protein structure prediction. Firstly, the inter-domain interactions and full-length sequence distance features are extracted through DeepAssembly and AlphaFold2, respectively. Secondly, subject to these features, we constructed a multi-objective energy model and designed a sampling algorithm for exploring and exploiting conformational space to generate ensembles. Finally, the output protein structure was selected from the ensembles using our in-house developed model quality assessment algorithm. On the test set of 164 multi-domain proteins, the results show that the average TM-score of M-DeepAssembly is 15.4% and 2.0% higher than AlphaFold2 and DeepAssembly, respectively. It is worth noting that there are models with higher accuracy in ensembles, achieving an improvement of 20.3% and 6.4% relative to the two baseline methods, although these models were not selected. Furthermore, when compared to the prediction results of AlphaFold2 for CASP15 multi-domain targets, M-DeepAssembly demonstrates certain performance advantages.

Conclusions: M-DeepAssembly provides a distinctive multi-domain protein assembly algorithm, which can alleviate the current challenges of weak evolutionary signals and large structures to some extent by forming diverse ensembles using multi-objective protein conformation sampling algorithm. The proposed method contributes to exploring the functions of multi-domain proteins, especially providing new insights into targets with multiple conformational states.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

BMC Bioinformatics 生物-生化研究方法

CiteScore

5.70

自引率

3.30%

发文量

506

审稿时长

4.3 months

期刊介绍： BMC Bioinformatics is an open access, peer-reviewed journal that considers articles on all aspects of the development, testing and novel application of computational and statistical methods for the modeling and analysis of all kinds of biological data, as well as other areas of computational biology. BMC Bioinformatics is part of the BMC series which publishes subject-specific journals focused on the needs of individual research communities across all areas of biology and medicine. We offer an efficient, fair and friendly peer review service, and are committed to publishing all sound science, provided that there is some advance in knowledge presented by the work.