Identifying sequential differences between protein structural classes using network and statistical approaches

Xiaogeng Wan, Xinying Tan

Molecular & Cellular Biomechanics, published 2024-07-11. DOI: 10.62617/mcb.v21.199 (https://doi.org/10.62617/mcb.v21.199)
Abstract
Protein sequences are believed to carry hints of the structures they fold into. In this study, we use network and statistical approaches to identify the sequential differences between protein structural classes and between structural motifs. By examining significant amino acid feature interactions and the statistical distributions of feature series, we identify both common and class-specific characteristics of the different protein structural types. The analyses suggest that in all top-level structural classes of CATH and SCOP, Leu, Val, and Asn are sources of strong feature interactions, while Cys, His, Trp, and Met exhibit weak intra-type interactions with other features. There are also significant interactions between amino acid features such as Ala and α-helix and bend preference, Ala and side-chain size, Ala and Gly, and Met and Leu. Because these phenomena are observed in all structural classes, they are assumed to contribute little to distinguishing the different structures. In α structures, Glu, Pro, side-chain size, and hydrophobicity are highly important in feature interactions, while in β structures, Gly, Thr, and physical properties such as α-helix and bend preference, extended structural preference, pK-C value, and surrounding hydrophobicity for β structures are highly important. Comparing α and β structures, both show Ser as a common source of feature interactions. The mixed α and β structures not only share feature interactions with the α and β structures, but also exhibit special interactions between Met, Lys, and the double-bend preference property, and between the sequence arrangements of Cys, His, Met, and Tyr and the amino acid composition features. Intrinsically disordered proteins (IDPs) tend to show repeats of a single kind of amino acid in their high-frequency k-mers, while the nine typical types of structural motifs also show different characteristics. Statistical tests also find different value ranges for different structural types. The outcomes of this comparison study help to illuminate the mechanism of amino acid feature interactions in different types of structures and deepen our understanding of how protein sequence influences structure.
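
The observation about high-frequency k-mers in IDPs can be illustrated with a minimal sketch. It assumes overlapping k-mers counted over plain one-letter amino acid strings; the function names (`count_kmers`, `is_single_residue_repeat`, `top_kmers`) and the toy sequences are hypothetical and are not taken from the study, whose actual pipeline is not described in this abstract.

```python
from collections import Counter


def count_kmers(sequence: str, k: int) -> Counter:
    """Count every overlapping k-mer (length-k substring) in a sequence."""
    if k <= 0:
        raise ValueError("k must be positive")
    # A sequence shorter than k yields an empty range, hence an empty Counter.
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def is_single_residue_repeat(kmer: str) -> bool:
    """Return True when the k-mer is one amino acid repeated, e.g. 'SSS'."""
    return len(set(kmer)) == 1


def top_kmers(sequences, k: int, n: int):
    """Pool k-mer counts over several sequences and return the n most frequent."""
    totals = Counter()
    for seq in sequences:
        totals.update(count_kmers(seq, k))
    return totals.most_common(n)


if __name__ == "__main__":
    # Toy sequences invented for illustration; not data from the study.
    toy_sequences = ["MSSSSPQQQQEEEK", "GSSSQQQQAEEEE"]
    for kmer, count in top_kmers(toy_sequences, k=3, n=5):
        # The last column flags k-mers made of a single repeated residue.
        print(kmer, count, is_single_residue_repeat(kmer))
```

Under these assumptions, a set of sequences whose most frequent k-mers are mostly flagged as single-residue repeats would match the repetition pattern described for IDPs. The choice of k and the cutoff for "high frequency" are left open here because the abstract does not state them.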