Prediction of Catalytic Residues in Proteins Using a Consensus of Prediction (CoP) Approach

2010 IEEE International Conference on BioInformatics and BioEngineering. Pub Date: 2010-05-31. DOI: 10.1109/BIBE.2010.44

N. Petrova, Cathy H. Wu



Abstract

One of the aims of the Protein Structure Initiative (PSI) in the post-genome-sequencing era is to elucidate the biochemical and biophysical functions of each protein structure. New methods for large-scale analysis and annotation of protein functional residues are therefore necessary. Existing methods cannot meet this need because of a lack of automation, limited availability, and/or poor performance. In our previous work we improved the prediction accuracy to ~86%, although the number of false positives remained high. In this paper we present a fully automated method for predicting catalytic residues in proteins that improves accuracy by reducing false positives and is applicable to large-scale analysis. Catalytic residues are predicted by a machine learning approach, followed by hierarchical analysis of the predicted residues. The method was tested on a diverse family of hydrolytic enzymes with the α/β hydrolase fold and widely differing phylogenetic origins and catalytic functions. It was first executed manually and then fully reproduced automatically. In the manual analysis of 17 enzymes, the method correctly predicted all 3 residues of the catalytic triad, with on average 3 false positives out of 282 residues. Our method substantially reduces the number of false positives while remaining applicable to large-scale analysis of protein function.
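To put the reported figures in perspective, the following minimal sketch derives standard classification metrics from the numbers quoted in the abstract. It assumes a single representative enzyme of 282 residues in which all 3 catalytic-triad residues are found (no false negatives) and 3 non-catalytic residues are flagged; treating the averages as one protein is a simplification, and the function name `confusion_metrics` is hypothetical rather than taken from the paper.

```python
def confusion_metrics(tp: int, fp: int, fn: int, total: int) -> dict:
    """Derive per-residue classification metrics from confusion counts."""
    # Every residue that is not TP, FP, or FN is a true negative.
    tn = total - tp - fp - fn
    return {
        "tn": tn,
        "precision": tp / (tp + fp),      # fraction of predictions that are catalytic
        "recall": tp / (tp + fn),         # fraction of catalytic residues recovered
        "specificity": tn / (tn + fp),    # fraction of non-catalytic residues rejected
        "accuracy": (tp + tn) / total,    # fraction of all residues labeled correctly
    }


# Assumed representative case: 3 triad residues found, 3 false positives,
# 282 residues in total (the averages reported in the abstract).
metrics = confusion_metrics(tp=3, fp=3, fn=0, total=282)

for name, value in metrics.items():
    print(f"{name}: {value}")
```

Under these assumptions, precision is 0.5 and recall is 1.0, while specificity and accuracy are both close to 0.989. Because catalytic residues are so rare, accuracy is dominated by true negatives, which is why the false-positive count is the more informative figure here.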
