AMPCliff: Quantitative definition and benchmarking of activity cliffs in antimicrobial peptides
Authors: Kewei Li, Yuqian Wu, Yinheng Li, Yutong Guo, Yanwen Kong, Yan Wang, Yiyang Liang, Yusi Fan, Lan Huang, Ruochi Zhang, Fengfeng Zhou
Journal: Journal of Advanced Research
Publication date: 2025-05-01
DOI: 10.1016/j.jare.2025.04.046 (https://doi.org/10.1016/j.jare.2025.04.046)
Introduction
An activity cliff (AC) is a phenomenon in which a pair of similar molecules differ by a small structural alteration but exhibit a large difference in biochemical activity. This phenomenon affects tasks ranging from virtual screening to lead optimization in drug development. ACs in small molecules have been extensively investigated, but little is known about the AC phenomenon in pharmaceutical peptides composed of canonical amino acids.
Objectives
This study introduces AMPCliff, a quantitative definition and benchmarking framework for the AC phenomenon in antimicrobial peptides (AMPs) composed of canonical amino acids.
Methods
This study establishes a benchmark dataset of paired AMPs against Staphylococcus aureus from the publicly available AMP dataset GRAMPA, and follows a rigorous procedure to evaluate a range of AMP AC prediction models: nine machine learning algorithms, four deep learning algorithms, four masked language models, and four generative language models.
Results
A comprehensive analysis of the existing AMP dataset reveals a significant prevalence of ACs among AMPs. AMPCliff quantifies AMP activity by the minimum inhibitory concentration (MIC) and defines an AC as a pair of aligned peptides with a normalized BLOSUM62 similarity score of at least 0.9 and at least a two-fold change in MIC. Our analysis shows that the evaluated models are capable of detecting AMP AC events, and that the pre-trained protein language model ESM2 performs best across the evaluations. Predictive performance on AMP activity cliffs still leaves considerable room for improvement: ESM2 with 33 layers achieves a Spearman correlation coefficient of only 0.4669 on the regression task for the −log(MIC) values of the benchmark dataset.
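The AC criterion stated above (normalized BLOSUM62 similarity of at least 0.9 together with at least a two-fold MIC change) can be illustrated with a minimal Python sketch. The abstract does not specify how the peptides are aligned or how the BLOSUM62 score is normalized, so two assumptions are made here: the two peptides have equal length and are already aligned without gaps, and the score is normalized by the geometric mean of the two self-scores. The function names, the toy sequences, and the MIC values are hypothetical and are not taken from GRAMPA or from the paper.

```python
import math

AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"

# Standard BLOSUM62 scores for the 20 canonical amino acids,
# rows and columns both in AMINO_ACIDS order.
_BLOSUM62_ROWS = """
 4 -1 -2 -2  0 -1 -1  0 -2 -1 -1 -1 -1 -2 -1  1  0 -3 -2  0
-1  5  0 -2 -3  1  0 -2  0 -3 -2  2 -1 -3 -2 -1 -1 -3 -2 -3
-2  0  6  1 -3  0  0  0  1 -3 -3  0 -2 -3 -2  1  0 -4 -2 -3
-2 -2  1  6 -3  0  2 -1 -1 -3 -4 -1 -3 -3 -1  0 -1 -4 -3 -3
 0 -3 -3 -3  9 -3 -4 -3 -3 -1 -1 -3 -1 -2 -3 -1 -1 -2 -2 -1
-1  1  0  0 -3  5  2 -2  0 -3 -2  1  0 -3 -1  0 -1 -2 -1 -2
-1  0  0  2 -4  2  5 -2  0 -3 -3  1 -2 -3 -1  0 -1 -3 -2 -2
 0 -2  0 -1 -3 -2 -2  6 -2 -4 -4 -2 -3 -3 -2  0 -2 -2 -3 -3
-2  0  1 -1 -3  0  0 -2  8 -3 -3 -1 -2 -1 -2 -1 -2 -2  2 -3
-1 -3 -3 -3 -1 -3 -3 -4 -3  4  2 -3  1  0 -3 -2 -1 -3 -1  3
-1 -2 -3 -4 -1 -2 -3 -4 -3  2  4 -2  2  0 -3 -2 -1 -2 -1  1
-1  2  0 -1 -3  1  1 -2 -1 -3 -2  5 -1 -3 -1  0 -1 -3 -2 -2
-1 -1 -2 -3 -1  0 -2 -3 -2  1  2 -1  5  0 -2 -1 -1 -1 -1  1
-2 -3 -3 -3 -2 -3 -3 -3 -1  0  0 -3  0  6 -4 -2 -2  1  3 -1
-1 -2 -2 -1 -3 -1 -1 -2 -2 -3 -3 -1 -2 -4  7 -1 -1 -4 -3 -2
 1 -1  1  0 -1  0  0  0 -1 -2 -2  0 -1 -2 -1  4  1 -3 -2 -2
 0 -1  0 -1 -1 -1 -1 -2 -2 -1 -1 -1 -1 -2 -1  1  5 -2 -2  0
-3 -3 -4 -4 -2 -2 -3 -2 -2 -3 -2 -3 -1  1 -4 -3 -2 11  2 -3
-2 -2 -2 -3 -2 -1 -2 -3  2 -1 -1 -2 -1  3 -3 -2 -2  2  7 -1
 0 -3 -3 -3 -1 -2 -2 -3 -3  3  1 -2  1 -1 -2 -2  0 -3 -1  4
"""

BLOSUM62 = {
    (a, b): int(score)
    for a, row in zip(AMINO_ACIDS, _BLOSUM62_ROWS.strip().splitlines())
    for b, score in zip(AMINO_ACIDS, row.split())
}


def raw_score(seq_a: str, seq_b: str) -> int:
    """Sum of per-position BLOSUM62 scores for two gap-free, equal-length peptides."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sketch assumes non-empty, equal-length, pre-aligned peptides")
    return sum(BLOSUM62[(a, b)] for a, b in zip(seq_a, seq_b))


def normalized_similarity(seq_a: str, seq_b: str) -> float:
    """ASSUMPTION: normalize by the geometric mean of the two self-scores."""
    self_scores = raw_score(seq_a, seq_a) * raw_score(seq_b, seq_b)
    return raw_score(seq_a, seq_b) / math.sqrt(self_scores)


def is_activity_cliff(seq_a, mic_a, seq_b, mic_b, min_similarity=0.9, min_fold=2.0):
    """True if the pair is similar enough and the MICs differ by at least min_fold."""
    fold_change = max(mic_a, mic_b) / min(mic_a, mic_b)
    return normalized_similarity(seq_a, seq_b) >= min_similarity and fold_change >= min_fold


# Toy data (invented for illustration): one L->I substitution, MIC 4 vs 16.
print(is_activity_cliff("GKWKLLKKAL", 4.0, "GKWKILKKAL", 16.0))
```

With these toy inputs the single conservative substitution keeps the similarity above 0.9 while the MIC changes four-fold, so the pair counts as a cliff under the assumed normalization. A different normalization or a gapped alignment would shift which pairs pass the 0.9 threshold.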
Conclusion
Our findings highlight limitations in current deep learning–based representation models. To capture the properties of AMPs more accurately, it is essential to integrate atomic-level dynamic information that reflects their underlying mechanisms of action.
About the journal:
Journal of Advanced Research (J. Adv. Res.) is an applied/natural sciences, peer-reviewed journal that focuses on interdisciplinary research. The journal aims to contribute to applied research and knowledge worldwide through the publication of original and high-quality research articles in the fields of Medicine, Pharmaceutical Sciences, Dentistry, Physical Therapy, Veterinary Medicine, and Basic and Biological Sciences.
The following abstracting and indexing services cover the Journal of Advanced Research: PubMed/Medline, Essential Science Indicators, Web of Science, Scopus, PubMed Central, PubMed, Science Citation Index Expanded, Directory of Open Access Journals (DOAJ), and INSPEC.