AMPCliff: Quantitative definition and benchmarking of activity cliffs in antimicrobial peptides

IF 11.4 | Zone 1 (multidisciplinary journals) | JCR Q1, Multidisciplinary Sciences
Kewei Li, Yuqian Wu, Yinheng Li, Yutong Guo, Yanwen Kong, Yan Wang, Yiyang Liang, Yusi Fan, Lan Huang, Ruochi Zhang, Fengfeng Zhou
Journal: Journal of Advanced Research. Publication date: 2025-05-01. DOI: 10.1016/j.jare.2025.04.046 (https://doi.org/10.1016/j.jare.2025.04.046)

Abstract

Introduction

An activity cliff (AC) is a phenomenon in which a pair of similar molecules differ by a small structural alteration but exhibit a large difference in their biochemical activities. This phenomenon affects tasks ranging from virtual screening to lead optimization in drug development. The AC of small molecules has been extensively investigated, but little is known about the AC phenomenon in pharmaceutical peptides composed of canonical amino acids.

Objectives

This study introduces AMPCliff, a quantitative definition and benchmarking framework for the AC phenomenon in antimicrobial peptides (AMPs) composed of canonical amino acids.

Methods

This study establishes a benchmark dataset of paired AMPs against Staphylococcus aureus from the publicly available AMP dataset GRAMPA, and follows a rigorous procedure to evaluate a range of AMP AC prediction models: nine machine learning algorithms, four deep learning algorithms, four masked language models, and four generative language models.

Results

A comprehensive analysis of the existing AMP dataset reveals that AC is prevalent among AMPs. AMPCliff quantifies AMP activity by the minimum inhibitory concentration (MIC) and defines an AC as a pair of aligned peptides whose normalized BLOSUM62 similarity score is at least 0.9 and whose MIC values differ by at least two-fold. Our analysis shows that the evaluated models can detect AMP AC events and that the pre-trained protein language model ESM2 performs best across the evaluations. Predictive performance on AMP activity cliffs still needs improvement: ESM2 with 33 layers achieves a Spearman correlation coefficient of only 0.4669 on the regression task for the −log(MIC) values of the benchmark dataset.
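The AC criterion and the evaluation metric described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation. The abstract does not give the normalization formula, so the sketch assumes S(p, q) / sqrt(S(p, p) · S(q, q)) over two peptides that are already aligned to equal length without gaps. `toy_score` is a hypothetical stand-in for a real BLOSUM62 lookup, the base-10 logarithm is an assumption, and all function names are made up for illustration.

```python
import math
from typing import Callable, Sequence

Scorer = Callable[[str, str], float]


def toy_score(a: str, b: str) -> float:
    """Hypothetical stand-in for BLOSUM62: +4 for a match, -1 for a mismatch."""
    return 4.0 if a == b else -1.0


def raw_score(p: str, q: str, score: Scorer) -> float:
    """Sum substitution scores over two aligned, equal-length, gap-free peptides."""
    if len(p) != len(q):
        raise ValueError("peptides must be aligned to equal length")
    return sum(score(a, b) for a, b in zip(p, q))


def normalized_similarity(p: str, q: str, score: Scorer = toy_score) -> float:
    """Assumed normalization: S(p, q) / sqrt(S(p, p) * S(q, q))."""
    denom = math.sqrt(raw_score(p, p, score) * raw_score(q, q, score))
    return raw_score(p, q, score) / denom


def is_activity_cliff(p: str, q: str, mic_p: float, mic_q: float,
                      score: Scorer = toy_score,
                      min_sim: float = 0.9, min_fold: float = 2.0) -> bool:
    """True if similarity >= min_sim and the MIC values differ by >= min_fold."""
    fold = max(mic_p, mic_q) / min(mic_p, mic_q)
    return normalized_similarity(p, q, score) >= min_sim and fold >= min_fold


def neg_log_mic(mic: float) -> float:
    """Regression target -log(MIC); base 10 is an assumption."""
    return -math.log10(mic)


def ranks(values: Sequence[float]) -> list:
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation, computed as the Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


# Made-up 20-residue pair differing at one position, with made-up MIC values.
p = "KKLLKKLLKKLLKKLLKKLL"
q = "RKLLKKLLKKLLKKLLKKLL"
print(normalized_similarity(p, q), is_activity_cliff(p, q, 4.0, 16.0))
```

With a real substitution matrix, `toy_score` would be replaced by a lookup into BLOSUM62. Because Spearman correlation depends only on ranks, the choice of logarithm base for −log(MIC) does not change the coefficient.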

Conclusion

Our findings highlight limitations in current deep learning–based representation models. To more accurately capture the properties of antimicrobial peptides (AMPs), it is essential to integrate atomic-level dynamic information that reflects their underlying mechanisms of action.
Source journal
Journal of Advanced Research (Multidisciplinary)
CiteScore: 21.60
Self-citation rate: 0.90%
Publication volume: 280
Review time: 12 weeks
About the journal: Journal of Advanced Research (J. Adv. Res.) is an applied/natural sciences, peer-reviewed journal that focuses on interdisciplinary research. The journal aims to contribute to applied research and knowledge worldwide through the publication of original and high-quality research articles in the fields of Medicine, Pharmaceutical Sciences, Dentistry, Physical Therapy, Veterinary Medicine, and Basic and Biological Sciences. The following abstracting and indexing services cover the Journal of Advanced Research: PubMed/Medline, Essential Science Indicators, Web of Science, Scopus, PubMed Central, PubMed, Science Citation Index Expanded, Directory of Open Access Journals (DOAJ), and INSPEC.