Faster and More Accurate Estimation of Protein Hinges Based on Information Criteria.

IF 1.4 · Zone 4 (Biology) · Q4 BIOCHEMICAL RESEARCH METHODS

Journal of Computational Biology Pub Date : 2025-05-01 Epub Date: 2025-04-28 DOI:10.1089/cmb.2024.0731

Bunsho Koyano, Tetsuo Shibuya

Journal of Computational Biology, volume 32, issue 5, pages 498-519. Publication type: Journal Article. Open access: no. https://doi.org/10.1089/cmb.2024.0731

Citations: 0

Abstract


Faster and More Accurate Estimation of Protein Hinges Based on Information Criteria.

Protein hinges are flexible parts that connect several rigid substructures of a protein and are crucial in determining protein function. As the number of known protein structures grows, various methods have been developed to estimate hinge positions efficiently and accurately by comparing two different conformations of the same protein. However, few studies have focused on accurately estimating the number of hinges, even though both the number and the positions of hinges need to be estimated accurately. We propose faster and more accurate algorithms, based on information criteria, that estimate the number and positions of hinges in O(n²) time, where n is the protein length. The algorithms use the Bayesian Information Criterion (BIC) or Akaike's Information Criterion (AIC) together with a newly proposed k-hinge structure generation model, which models the hinge motions between two protein conformations. On our simulation dataset, our exact BIC-based algorithm outperformed the most accurate previous method in both hinge number and hinge position accuracy, and it was approximately as fast as DynDom, the fastest previous method. On one hinge-annotated dataset, the hinge number and position accuracy of our exact algorithm were comparable to those of the most accurate previous method. We further propose even faster O(n)-time heuristic algorithms. Our heuristic algorithm achieved almost the same hinge number and position accuracy as our exact algorithm and was over 18 times faster than both our exact algorithm and DynDom.
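To illustrate how an information criterion can choose the number of hinges, the following is a minimal 2D toy sketch in Python. It is not the authors' algorithm or their k-hinge structure generation model. It assumes Gaussian residuals, splits the chain into contiguous rigid segments by dynamic programming, and scores each hinge count k with BIC = N ln(RSS/N) + p ln N. The function names, the parameter count p = 3(k+1) + k + 1, the minimum segment length, and the RSS floor are all assumptions made for this illustration.

```python
import math

def prefix_sums(P, Q):
    """Running sums that let any contiguous segment be superposed in O(1)."""
    # Columns: px, py, qx, qy, |p|^2 + |q|^2, p.q (dot), p x q (cross).
    S = [[0.0] * 7]
    for (px, py), (qx, qy) in zip(P, Q):
        row = (px, py, qx, qy,
               px * px + py * py + qx * qx + qy * qy,
               px * qx + py * qy,
               px * qy - py * qx)
        S.append([a + b for a, b in zip(S[-1], row)])
    return S

def segment_rss(S, i, j):
    """Residual sum of squares after optimal 2D rigid superposition of residues i..j-1."""
    m = j - i
    sx, sy, tx, ty, sq, dot, cross = (S[j][c] - S[i][c] for c in range(7))
    # Subtract the centroid terms, then apply the closed-form optimal rotation.
    sq_c = sq - (sx * sx + sy * sy + tx * tx + ty * ty) / m
    a = dot - (sx * tx + sy * ty) / m
    b = cross - (sx * ty - sy * tx) / m
    return max(0.0, sq_c - 2.0 * math.hypot(a, b))

def estimate_hinges(P, Q, max_hinges=5, min_len=3, floor=1e-6):
    """Return (k, hinge_indices, bic_scores) for two 2D conformations P and Q.

    A hinge index is the first residue of the segment that follows the hinge.
    All defaults are illustrative assumptions, not values from the paper.
    """
    n = len(P)
    if n != len(Q) or n < min_len:
        raise ValueError("P and Q must have equal length >= min_len")
    S = prefix_sums(P, Q)
    INF = float("inf")
    # best[s][j]: minimum RSS when residues 0..j-1 form s+1 rigid segments.
    best = [[INF] * (n + 1) for _ in range(max_hinges + 1)]
    back = [[0] * (n + 1) for _ in range(max_hinges + 1)]
    for j in range(min_len, n + 1):
        best[0][j] = segment_rss(S, 0, j)
    for s in range(1, max_hinges + 1):
        for j in range((s + 1) * min_len, n + 1):
            for i in range(s * min_len, j - min_len + 1):
                cost = best[s - 1][i] + segment_rss(S, i, j)
                if cost < best[s][j]:
                    best[s][j], back[s][j] = cost, i
    n_obs = 2 * n  # two coordinates per residue
    scores = {}
    for k in range(max_hinges + 1):
        if best[k][n] == INF:
            continue  # the chain is too short for k hinges
        # Assumed count: 3 rigid-motion parameters per segment, k positions, 1 variance.
        n_params = 3 * (k + 1) + k + 1
        rss = max(best[k][n], floor)  # the floor avoids log(0) on noise-free input
        scores[k] = n_obs * math.log(rss / n_obs) + n_params * math.log(n_obs)
    k_best = min(scores, key=scores.get)
    hinges, j = [], n
    for s in range(k_best, 0, -1):  # walk the back-pointers to recover the positions
        j = back[s][j]
        hinges.append(j)
    return k_best, sorted(hinges), scores
```

Because each segment cost comes from prefix sums in constant time, this sketch runs in O(K·n²) time for at most K hinges. Replacing the `n_params * math.log(n_obs)` penalty with `2 * n_params` would give an AIC-style score under the same assumptions.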


Source journal

Journal of Computational Biology (Biology; Computer Science, Interdisciplinary Applications)

CiteScore: 3.60

Self-citation rate: 5.90%

Articles published: 113

Review time: 6-12 weeks

About the journal: Journal of Computational Biology is the leading peer-reviewed journal in computational biology and bioinformatics, publishing in-depth statistical, mathematical, and computational analysis of methods, as well as their practical impact. Available only online, this is an essential journal for scientists and students who want to keep abreast of developments in bioinformatics. Journal of Computational Biology coverage includes:
- Genomics
- Mathematical modeling and simulation
- Distributed and parallel biological computing
- Designing biological databases
- Pattern matching and pattern detection
- Linking disparate databases and data
- New tools for computational biology
- Relational and object-oriented database technology for bioinformatics
- Biological expert system design and use
- Reasoning by analogy, hypothesis formation, and testing by machine
- Management of biological databases