从进化概况预测抗冻蛋白质

IF 3.4 4区 生物学 Q2 BIOCHEMICAL RESEARCH METHODS
Proteomics Pub Date : 2024-09-20 DOI:10.1002/pmic.202400157
Nishant Kumar, Shubham Choudhury, Nisha Bajiya, Sumeet Patiyal, Gajendra P S Raghava
{"title":"从进化概况预测抗冻蛋白质","authors":"Nishant Kumar, Shubham Choudhury, Nisha Bajiya, Sumeet Patiyal, Gajendra P S Raghava","doi":"10.1002/pmic.202400157","DOIUrl":null,"url":null,"abstract":"<p><p>Prediction of antifreeze proteins (AFPs) holds significant importance due to their diverse applications in healthcare. An inherent limitation of current AFP prediction methods is their reliance on unreviewed proteins for evaluation. This study evaluates, proposed and existing methods on an independent dataset containing 80 AFPs and 73 non-AFPs obtained from Uniport, which have been already reviewed by experts. Initially, we constructed machine learning models for AFP prediction using selected composition-based protein features and achieved a peak AUROC of 0.90 with an MCC of 0.69 on the independent dataset. Subsequently, we observed a notable enhancement in model performance, with the AUROC increasing from 0.90 to 0.93 upon incorporating evolutionary information instead of relying solely on the primary sequence of proteins. Furthermore, we explored hybrid models integrating our machine learning approaches with BLAST-based similarity and motif-based methods. However, the performance of these hybrid models either matched or was inferior to that of our best machine-learning model. Our best model based on evolutionary information outperforms all existing methods on independent/validation dataset. To facilitate users, a user-friendly web server with a standalone package named \"AFPropred\" was developed (https://webs.iiitd.edu.in/raghava/afpropred).</p>","PeriodicalId":224,"journal":{"name":"Proteomics","volume":null,"pages":null},"PeriodicalIF":3.4000,"publicationDate":"2024-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Prediction of Anti-Freezing Proteins From Their Evolutionary Profile.\",\"authors\":\"Nishant Kumar, Shubham Choudhury, Nisha Bajiya, Sumeet Patiyal, Gajendra P S Raghava\",\"doi\":\"10.1002/pmic.202400157\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>Prediction of antifreeze proteins (AFPs) holds significant importance due to their diverse applications in healthcare. An inherent limitation of current AFP prediction methods is their reliance on unreviewed proteins for evaluation. This study evaluates, proposed and existing methods on an independent dataset containing 80 AFPs and 73 non-AFPs obtained from Uniport, which have been already reviewed by experts. Initially, we constructed machine learning models for AFP prediction using selected composition-based protein features and achieved a peak AUROC of 0.90 with an MCC of 0.69 on the independent dataset. Subsequently, we observed a notable enhancement in model performance, with the AUROC increasing from 0.90 to 0.93 upon incorporating evolutionary information instead of relying solely on the primary sequence of proteins. Furthermore, we explored hybrid models integrating our machine learning approaches with BLAST-based similarity and motif-based methods. However, the performance of these hybrid models either matched or was inferior to that of our best machine-learning model. Our best model based on evolutionary information outperforms all existing methods on independent/validation dataset. To facilitate users, a user-friendly web server with a standalone package named \\\"AFPropred\\\" was developed (https://webs.iiitd.edu.in/raghava/afpropred).</p>\",\"PeriodicalId\":224,\"journal\":{\"name\":\"Proteomics\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":3.4000,\"publicationDate\":\"2024-09-20\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proteomics\",\"FirstCategoryId\":\"99\",\"ListUrlMain\":\"https://doi.org/10.1002/pmic.202400157\",\"RegionNum\":4,\"RegionCategory\":\"生物学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"BIOCHEMICAL RESEARCH METHODS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proteomics","FirstCategoryId":"99","ListUrlMain":"https://doi.org/10.1002/pmic.202400157","RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"BIOCHEMICAL RESEARCH METHODS","Score":null,"Total":0}
引用次数: 0

摘要

由于抗冻蛋白(AFP)在医疗保健领域的广泛应用,对其进行预测具有重要意义。当前 AFP 预测方法的一个固有局限是依赖于未审查的蛋白质进行评估。本研究在一个独立的数据集上对所提出的方法和现有方法进行了评估,该数据集包含从 Uniport 获取的 80 个 AFP 和 73 个非 AFP,这些数据已经过专家审查。最初,我们利用选定的基于组成的蛋白质特征构建了用于 AFP 预测的机器学习模型,并在独立数据集上实现了 0.90 的峰值 AUROC 和 0.69 的 MCC。随后,我们观察到模型的性能有了显著提高,在纳入进化信息而不是仅仅依赖蛋白质的主序列后,AUROC 从 0.90 提高到了 0.93。此外,我们还探索了混合模型,将我们的机器学习方法与基于 BLAST 的相似性和基于主题的方法整合在一起。然而,这些混合模型的性能要么与我们的最佳机器学习模型相当,要么不如。我们基于进化信息的最佳模型在独立/验证数据集上的表现优于所有现有方法。为了方便用户,我们开发了一个用户友好型网络服务器,并将其独立打包,命名为 "AFPropred"(https://webs.iiitd.edu.in/raghava/afpropred)。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Prediction of Anti-Freezing Proteins From Their Evolutionary Profile.

Prediction of antifreeze proteins (AFPs) holds significant importance due to their diverse applications in healthcare. An inherent limitation of current AFP prediction methods is their reliance on unreviewed proteins for evaluation. This study evaluates, proposed and existing methods on an independent dataset containing 80 AFPs and 73 non-AFPs obtained from Uniport, which have been already reviewed by experts. Initially, we constructed machine learning models for AFP prediction using selected composition-based protein features and achieved a peak AUROC of 0.90 with an MCC of 0.69 on the independent dataset. Subsequently, we observed a notable enhancement in model performance, with the AUROC increasing from 0.90 to 0.93 upon incorporating evolutionary information instead of relying solely on the primary sequence of proteins. Furthermore, we explored hybrid models integrating our machine learning approaches with BLAST-based similarity and motif-based methods. However, the performance of these hybrid models either matched or was inferior to that of our best machine-learning model. Our best model based on evolutionary information outperforms all existing methods on independent/validation dataset. To facilitate users, a user-friendly web server with a standalone package named "AFPropred" was developed (https://webs.iiitd.edu.in/raghava/afpropred).

求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
Proteomics
Proteomics 生物-生化研究方法
CiteScore
6.30
自引率
5.90%
发文量
193
审稿时长
3 months
期刊介绍: PROTEOMICS is the premier international source for information on all aspects of applications and technologies, including software, in proteomics and other "omics". The journal includes but is not limited to proteomics, genomics, transcriptomics, metabolomics and lipidomics, and systems biology approaches. Papers describing novel applications of proteomics and integration of multi-omics data and approaches are especially welcome.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信