Performance Analysis of Protein Contact Prediction Algorithms

2020 XLVI Latin American Computing Conference (CLEI) | Pub Date: 2020-10-01 | DOI: 10.1109/CLEI52000.2020.00015

Romina Valdez, Khevin Roig, Diego Pinto, José Colbes


Citations: 0

Abstract

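Predicting protein structures is one of the most important unsolved problems in computational biology. A key element of this problem is predicting the contacts in a protein from its amino acid sequence, since contacts provide fundamental information for determining the three-dimensional structure. Because of the attention devoted to this subproblem, especially over the last decade, the literature contains a large number of methods that obtain very good results, but there is still considerable room for improvement. In the 13th edition of the Critical Assessment of protein Structure Prediction (CASP13), notable progress was achieved thanks to the use of deep learning and deep convolutional residual neural networks in state-of-the-art methods, together with additional information from other predictions, such as solvent accessibility and secondary structure conformation. The present work analyzes the performance of the most outstanding CASP13 methods on a larger test set (483 proteins) covering four different SCOP classes. The results were evaluated using the CASP metrics. The analysis indicates that most of the selected methods achieve an accuracy above 90% on this test set; SPOT-Contact is the best prediction method overall and is among the best in each SCOP class. The test cases and the implementations used to evaluate the results are publicly available.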
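The abstract does not state which CASP metrics were computed. As a rough illustration only, one metric widely used in CASP contact assessment is the precision of the top L/5 long-range predictions, where L is the sequence length and "long-range" means a sequence separation of at least 24 residues. The sketch below assumes that definition; the function name `top_k_precision`, its parameters, and the toy data are hypothetical and are not taken from the authors' implementation.

```python
def top_k_precision(pred_prob, true_contact, min_sep=24, divisor=5):
    """Precision of the top L/divisor predicted contacts with j - i >= min_sep.

    pred_prob    -- L x L nested list of predicted contact probabilities
    true_contact -- L x L nested list of booleans (True = residues in contact)
    Assumption: only the upper triangle (i < j) is read.
    """
    length = len(pred_prob)
    # Candidate residue pairs that satisfy the sequence-separation cutoff.
    pairs = [(i, j) for i in range(length) for j in range(i + min_sep, length)]
    if not pairs:
        raise ValueError("sequence too short for the requested separation")
    k = max(1, length // divisor)
    # Rank candidates by predicted probability, highest first, and keep the top k.
    ranked = sorted(pairs, key=lambda p: pred_prob[p[0]][p[1]], reverse=True)[:k]
    hits = sum(1 for i, j in ranked if true_contact[i][j])
    return hits / len(ranked)


# Toy example (fabricated for illustration): a 30-residue protein, so k = 6.
L = 30
pred = [[0.0] * L for _ in range(L)]
true = [[False] * L for _ in range(L)]
scored_pairs = [(0, 24, 0.9), (0, 25, 0.8), (0, 26, 0.7),
                (1, 25, 0.6), (2, 26, 0.5), (3, 27, 0.4)]
for i, j, p in scored_pairs:
    pred[i][j] = p
for i, j in [(0, 24), (0, 25), (0, 26), (1, 25)]:
    true[i][j] = True

precision = top_k_precision(pred, true)
print(round(precision, 3))  # 4 of the 6 top-ranked pairs are true contacts -> 0.667
```

In practice the true contact map would be derived from an experimental structure (a common convention treats two residues as in contact when their Cβ atoms are closer than 8 Å), and the same function could be called with other `min_sep` and `divisor` values to cover medium-range pairs or top-L/2 lists.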



