APLpred: A machine learning-based tool for accurate prediction and characterization of asparagine peptide lyases using sequence-derived optimal features

IF 4.2, Zone 3 (Biology), JCR Q1 (Biochemical Research Methods)
Adeel Malik, Majid Rasool Kamli, Jamal S.M. Sabir, Irfan A. Rather, Le Thi Phan, Chang-Bae Kim, Balachandran Manavalan
Methods, volume 229, pages 133-146. Published 2024-06-28.
DOI: 10.1016/j.ymeth.2024.05.014
https://www.sciencedirect.com/science/article/pii/S1046202324001336

Abstract

Asparagine peptide lyases (APLs) are one of the seven groups of proteases, also known as proteolytic enzymes, which are classified according to their catalytic residue. APLs are synthesized as precursors or propeptides that undergo self-cleavage through an autoproteolytic reaction. At present, APLs are grouped into 10 families belonging to six different clans of proteases. Given their critical roles in many biological processes, including virus maturation and virulence, accurate identification and characterization of APLs is indispensable. Experimental identification and characterization of APLs is laborious and time-consuming. Here, we developed APLpred, a novel support vector machine (SVM)-based predictor that can predict APLs from primary sequences. APLpred was developed using optimal features selected with Boruta from seven encodings; models were then trained using five machine learning algorithms. After evaluating each model on an independent dataset, we selected APLpred (an SVM-based model) due to its consistent performance during cross-validation and independent evaluation. We anticipate that APLpred will be an effective tool for identifying APLs, which could aid in designing inhibitors against these enzymes and exploring their functions. The APLpred web server is freely available at https://procarb.org/APLpred/.
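
To make "sequence-derived features" concrete, here is a minimal sketch of how a primary sequence can be turned into a fixed-length numeric vector. The abstract does not name the seven encodings, so amino acid composition (AAC) and dipeptide composition (DPC) are used purely as common illustrative examples and should not be read as the authors' confirmed feature set. The function names `aac`, `dpc`, and `encode` are hypothetical.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids, in a fixed order so feature positions are stable.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aac(sequence: str) -> list:
    """Amino acid composition: fraction of each standard residue (20 values).

    Illustrative encoding only; not confirmed as one of APLpred's seven.
    """
    residues = [r for r in sequence.upper() if r in AMINO_ACIDS]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    counts = Counter(residues)
    return [counts[a] / len(residues) for a in AMINO_ACIDS]


def dpc(sequence: str) -> list:
    """Dipeptide composition: fraction of each adjacent residue pair (400 values).

    Pairs that include a non-standard character are skipped.
    """
    seq = sequence.upper()
    pairs = Counter(
        seq[i:i + 2]
        for i in range(len(seq) - 1)
        if seq[i] in AMINO_ACIDS and seq[i + 1] in AMINO_ACIDS
    )
    total = sum(pairs.values())
    if total == 0:
        raise ValueError("sequence contains no valid dipeptides")
    return [pairs[a + b] / total for a, b in product(AMINO_ACIDS, repeat=2)]


def encode(sequence: str) -> list:
    """Concatenate the two encodings into one 420-dimensional feature vector."""
    return aac(sequence) + dpc(sequence)
```

Vectors of this kind, computed for every sequence in a labelled dataset, are what a feature selector such as Boruta would prune and what a classifier such as an SVM would be trained on. The selection and training steps are omitted here because the abstract gives no parameters for them.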

Source journal: Methods (Biology: Biochemical Research Methods)
CiteScore: 9.80
Self-citation rate: 2.10%
Articles published: 222
Review time: 11.3 weeks
Journal description: Methods focuses on rapidly developing techniques in the experimental biological and medical sciences. Each topical issue, organized by a guest editor who is an expert in the area covered, consists solely of invited quality articles by specialist authors, many of them reviews. Issues are devoted to specific technical approaches, with emphasis on clear, detailed descriptions of protocols that allow them to be reproduced easily. The background information provided enables researchers to understand the principles underlying the methods; other helpful sections include comparisons of alternative methods giving the advantages and disadvantages of particular methods, guidance on avoiding potential pitfalls, and suggestions for troubleshooting.