Patterny: A Troupe of Decipherment Helpers for Intrinsic Disorder, Low Complexity and Compositional Bias in Proteins.

IF 4.8 | Region 2, Biology | Q1, BIOCHEMISTRY & MOLECULAR BIOLOGY
Biomolecules Pub Date : 2025-09-18 DOI:10.3390/biom15091332
Paul M Harrison
Citations: 0

Abstract


Intrinsically disordered regions (IDRs) are sometimes considered parts of the 'dark proteomes', i.e., protein parts that have been largely under-appreciated, as are the overlapping phenomena of low-complexity or compositionally biased regions (LCRs/CBRs). Experimentalists and computationalists alike are still learning how to decrypt the functionally meaningful features of such regions. Here, I report the creation of the support troupe Patterny to aid such protein cryptanalysis. The current troupe members are named Blocky, Bandy, Moduley, Repeaty, and Runny. To discern important features, protein regions are compared to ideal assortments wherein everything is sampled proportionally and dispersed randomly. Blocky discerns the segregation of amino acids by type and scores them for it. Bandy focuses on picking out compositional bands and calculating their evenness. Moduley labels the boundaries of optimized compositional modules ('CModules') and other possible boundary sets for compositionally biased regions. Repeaty concisely summarizes repetitiveness using an information entropy of amino-acid interval diversity. Runny enumerates homopeptide content and assesses its significance. Both the original whole sequences and the CModules from Moduley are fed into the other Patterny members. Patterny is applied to some illustrative sample data from the yeast proteome and the DISPROT database. It is available on GitHub, and might aid those aiming to intensify light-shedding and hypothesis generation for protein regions with function encoded in a distributed manner, such as IDRs and LCRs/CBRs more generally.
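
To make two of the ideas in the abstract concrete, here is a minimal Python sketch of homopeptide enumeration (the kind of feature Runny reports) and of an entropy over amino-acid intervals (one plausible reading of what Repeaty summarizes). This is not Patterny's code: the function names, the `min_len` threshold, and the exact entropy definition are assumptions made for illustration, and the significance testing against randomized sequences that the abstract mentions is not attempted.

```python
import math
from collections import Counter
from itertools import groupby


def homopeptide_runs(seq, min_len=3):
    """List (residue, start, length) for runs of one residue of at least min_len.

    min_len is a hypothetical threshold chosen for this sketch.
    """
    runs = []
    pos = 0
    for residue, group in groupby(seq):
        length = len(list(group))
        if length >= min_len:
            runs.append((residue, pos, length))
        pos += length
    return runs


def interval_entropy(seq, residue):
    """Shannon entropy (bits) of the spacings between successive `residue` sites.

    Assumed definition: low values mean the residue recurs at few distinct
    spacings (more repetitive); high values mean the spacings are diverse.
    """
    positions = [i for i, aa in enumerate(seq) if aa == residue]
    intervals = [b - a for a, b in zip(positions, positions[1:])]
    if not intervals:
        return 0.0
    counts = Counter(intervals)
    total = len(intervals)
    return sum((n / total) * math.log2(total / n) for n in counts.values())


if __name__ == "__main__":
    toy = "MQQQQAPSSSG"  # toy sequence, not real data
    print(homopeptide_runs(toy))
    print(interval_entropy("QAQAQAQA", "Q"))    # regular spacing
    print(interval_entropy("QAQAAQAAAQ", "Q"))  # varied spacing
```

In this sketch a perfectly periodic residue has zero interval entropy, while irregular spacing raises it; a real analysis would compare such values with those from composition-matched shuffled sequences, as the abstract describes.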

Source journal
Biomolecules (Biochemistry, Genetics and Molecular Biology: Molecular Biology)
CiteScore: 9.40
Self-citation rate: 3.60%
Articles published: 1640
Review time: 18.28 days
Journal description: Biomolecules (ISSN 2218-273X) is an international, peer-reviewed open access journal focusing on biogenic substances and their biological functions, structures, interactions with other molecules, and their microenvironment as well as biological systems. Biomolecules publishes reviews, regular research papers and short communications. Our aim is to encourage scientists to publish their experimental and theoretical results in as much detail as possible. There is no restriction on the length of the papers. The full experimental details must be provided so that the results can be reproduced.