A machine learning classification model for cholesterol-lowering peptides

Jose Isagani B. Janairo
Artificial intelligence chemistry. Published 2023-11-14. DOI: 10.1016/j.aichem.2023.100026
https://www.sciencedirect.com/science/article/pii/S294974772300026X

Abstract

Cholesterol-lowering peptides (CLPs) are bioactive biomolecules often derived from food proteins. These short peptides bind bile acids, which decreases the intestinal absorption of cholesterol. CLPs are promising bioceuticals that could support interventions for the management of high cholesterol. Integrating machine learning (ML) into the screening and discovery workflow for CLPs can reduce trial and error, thereby accelerating the overall process and increasing its efficiency. In this study, a support vector machine model that can distinguish CLPs from non-CLPs is presented. The model was built on a diverse dataset of 1840 peptides, with sequence lengths ranging from 4 to 7. The ML model needs only 8 features (VHSE scores), and feature importance analysis using Shapley and permutation-based methods found the most important features to be related to peptide polarity and hydrophobicity. The formulated ML classifier is reliable, as demonstrated by AUC >0.7 for a diverse test dataset and AUC >0.9 for a conservative validation dataset composed mainly of the top and bottom CLPs. Overall, the presented ML model offers an incremental yet meaningful advance in the application of ML to understanding the nature of CLPs and to their discovery and development.
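The two evaluation ideas named in the abstract, AUC and permutation-based feature importance, can be illustrated with a minimal standard-library sketch. This is not the author's implementation: the function names `roc_auc` and `permutation_importance`, the toy 8-feature data, and the stand-in scorer `toy_score` are all assumptions made for illustration. In the study itself the scores would come from the trained support vector machine and the 8 features would be VHSE scores.

```python
import random
from typing import Callable, List, Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def permutation_importance(
    score_fn: Callable[[Sequence[float]], float],
    X: List[List[float]],
    y: Sequence[int],
    n_repeats: int = 20,
    seed: int = 0,
) -> List[float]:
    """Mean drop in AUC when one feature column is shuffled at a time."""
    rng = random.Random(seed)
    baseline = roc_auc(y, [score_fn(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the label
            shuffled = [list(row[:j]) + [v] + list(row[j + 1:])
                        for row, v in zip(X, column)]
            permuted_auc = roc_auc(y, [score_fn(row) for row in shuffled])
            drops.append(baseline - permuted_auc)
        importances.append(sum(drops) / n_repeats)
    return importances


# Toy data (assumption): 8 random features standing in for VHSE scores,
# with the label driven by feature 0 plus noise.
rng = random.Random(42)
X = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(200)]
y = [1 if row[0] + 0.5 * rng.gauss(0, 1) > 0 else 0 for row in X]


def toy_score(row: Sequence[float]) -> float:
    """Hypothetical scorer; a real model's decision function would go here."""
    return row[0]


print("AUC:", round(roc_auc(y, [toy_score(r) for r in X]), 3))
print("importances:", [round(v, 3) for v in permutation_importance(toy_score, X, y)])
```

In this toy setup only feature 0 shows a non-zero importance, because the stand-in scorer ignores the other seven columns. With a real classifier, the same procedure ranks features by how much shuffling each one degrades the AUC.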
