The proteomic code: Novel amino acid residue pairing models “encode” protein folding and protein-protein interactions

IF 6.3 | Zone 2, Medicine | JCR Q1, Biology

Computers in Biology and Medicine | Pub Date: 2025-03-19 | DOI: 10.1016/j.compbiomed.2025.110033

Tareq Hameduh, Andrew D. Miller, Zbynek Heger, Yazan Haddad

Volume 190, Article 110033. Publication type: Journal Article. Open access: no. Full text: https://www.sciencedirect.com/science/article/pii/S0010482525003841

Citations: 0

Abstract



Recent advances in protein 3D structure prediction using deep learning have focused on the importance of amino acid residue-residue connections (i.e., pairwise atomic contacts) for accuracy at the expense of mechanistic interpretability. Therefore, we decided to perform a series of analyses based on an alternative framework of residue-residue connections making primary use of the TOP2018 dataset. This framework of residue-residue connections is derived from amino acid residue pairing models both historic and new, all based on genetic principles complemented by relevant biophysical principles. Of these pairing models, three new models (named the GU, Transmuted and Shift pairing models) exhibit the highest observed-over-expected ratios and highest correlations in statistical analyses with various intra- and inter-chain datasets, in comparison to the remaining models. In addition, these new pairing models are universally frequent across different connection ranges, secondary structure connections, and protein sizes. Accordingly, following further statistical and other analyses described herein, we have come to a major conclusion that all three pairing models together could represent the basis of a universal proteomic code (second genetic code) sufficient, in and of itself, to “encode” for both protein folding mechanisms and protein-protein interactions.
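The abstract ranks pairing models by their observed-over-expected ratios. As a rough illustration of what such a ratio measures, the sketch below computes it for unordered residue pairs from a list of contacts. The function name `observed_over_expected`, the toy contact list, and the null model (both residues drawn independently from the pooled composition of contact endpoints) are assumptions made for this example; the abstract does not specify how the authors define the expected frequencies.

```python
from collections import Counter


def observed_over_expected(contacts):
    """Return {(res_a, res_b): O/E ratio} for unordered residue pairs.

    contacts: iterable of (res_a, res_b) one-letter codes, one per contact.
    Assumed null model (illustrative only): the two residues of a contact
    are drawn independently from the pooled composition of all contact
    endpoints.
    """
    # Treat (A, G) and (G, A) as the same unordered pair.
    pairs = [tuple(sorted(c)) for c in contacts]
    n = len(pairs)
    if n == 0:
        return {}

    pair_counts = Counter(pairs)
    residue_counts = Counter(res for pair in pairs for res in pair)
    total_endpoints = 2 * n

    ratios = {}
    for (a, b), count in pair_counts.items():
        p_a = residue_counts[a] / total_endpoints
        p_b = residue_counts[b] / total_endpoints
        # An unordered pair of two different residues can arise in two orders.
        expected = p_a * p_b if a == b else 2 * p_a * p_b
        ratios[(a, b)] = (count / n) / expected
    return ratios


# Hypothetical toy data, not taken from TOP2018.
toy_contacts = [("G", "A"), ("A", "G"), ("G", "A"), ("L", "L")]
ratios = observed_over_expected(toy_contacts)
for pair, ratio in sorted(ratios.items()):
    print(pair, round(ratio, 2))
```

A ratio above 1 means the pair occurs in contact more often than this null model predicts, and a ratio below 1 means less often. A real analysis would also need a contact definition (distance cutoff, sequence separation) and a dataset-specific background, neither of which is given in the abstract.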


Source journal

Computers in Biology and Medicine (Engineering and Technology: Biomedical Engineering)

CiteScore: 11.70

Self-citation rate: 10.40%

Articles published: 1086

Review time: 74 days

About the journal: Computers in Biology and Medicine is an international forum for sharing groundbreaking advancements in the use of computers in bioscience and medicine. This journal serves as a medium for communicating essential research, instruction, ideas, and information regarding the rapidly evolving field of computer applications in these domains. By encouraging the exchange of knowledge, we aim to facilitate progress and innovation in the utilization of computers in biology and medicine.