Identifying sequential differences between protein structural classes using network and statistical approaches.

Xiaogeng Wan, Xinying Tan
Molecular & Cellular Biomechanics. Published 2024-07-11. DOI: 10.62617/mcb.v21.199 (https://doi.org/10.62617/mcb.v21.199)

Abstract

Protein sequences are believed to carry hints about their structures. In this study, we use network and statistical approaches to identify the sequential differences between protein structural classes and between structural motifs. By examining significant amino acid feature interactions and the statistical distributions of feature series, we identify both common and distinctive characteristics of the different protein structural types. The analyses suggest that all top-level protein structural classes of CATH and SCOP show Leu, Val, and Asn as sources of strong feature interactions, while Cys, His, Trp, and Met exhibit weak intra-type interactions with other features. There are also significant interactions between amino acid features such as Ala and α-helix and bend preference, Ala and side-chain size, Ala and Gly, and Met and Leu. Because these phenomena are observed in all structural classes, they are assumed to have little influence in distinguishing the different structures. In α structures, Glu, Pro, side-chain size, and hydrophobicity exhibit high importance in feature interactions, while in β structures, Gly, Thr, and physical properties such as α-helix and bend preference, extended structural preference, pK-C value, and surrounding hydrophobicity for β structures show high importance in feature interactions. When the α and β structures are compared, both types show Ser as a common source of feature interactions. The mixed α and β structures not only share feature interactions with the α and β structures, but also exhibit special interactions between Met, Lys, and the double-bend preference property, and between the sequence arrangements of Cys, His, Met, and Tyr and amino acid composition features. Intrinsically disordered proteins (IDPs) tend to present repeats of a single kind of amino acid in high-frequency k-mers, while the nine typical types of structural motifs also show different characteristics. Statistical tests further reveal different value ranges for the different structural types. The outcomes of this comparison study not only help to illuminate the mechanism of amino acid feature interactions in different types of structures, but also deepen our understanding of how protein sequence influences structure.
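
The k-mer observation for IDPs can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the toy sequence, and the choice of k = 3 are all assumptions made for illustration. It counts overlapping k-mers in a sequence and flags those made of a single repeated amino acid.

```python
from collections import Counter


def kmer_counts(sequence: str, k: int) -> Counter:
    """Count every overlapping k-mer (substring of length k) in a sequence."""
    if k <= 0:
        raise ValueError("k must be positive")
    # A sequence shorter than k yields an empty Counter.
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def is_single_residue_repeat(kmer: str) -> bool:
    """Return True when the k-mer is one amino acid repeated, e.g. 'QQQ'."""
    return len(set(kmer)) == 1


# Hypothetical toy sequence, not taken from the study's data.
toy_sequence = "MSQQQQQQPAGSSSSSEEEK"
counts = kmer_counts(toy_sequence, k=3)

# List the most frequent k-mers and whether each is a single-residue repeat.
for kmer, n in counts.most_common(3):
    print(kmer, n, is_single_residue_repeat(kmer))
```

On real data one would aggregate such counts over many sequences per structural type before comparing the high-frequency k-mers between types.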