Anand Singh Rathore, Nishant Kumar, Shubham Choudhury, Naman Kumar Mehta, Gajendra P S Raghava
Prediction of hemolytic peptides and their hemolytic concentration.
Communications Biology 8(1), 176. Published 2025-02-04. DOI: 10.1038/s42003-025-07615-w (https://doi.org/10.1038/s42003-025-07615-w)
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11794569/pdf/
Citations: 0
Abstract
Peptide-based drugs often fail in clinical trials due to toxicity or hemolytic activity against red blood cells (RBCs). Existing methods predict whether a peptide is hemolytic but not the concentration (HC50) required to lyse 50% of RBCs. This study develops classification and regression models to identify and quantify hemolytic activity. The models were trained on 1926 peptides with experimentally determined HC50 values against mammalian RBCs. Analysis indicates that hydrophobic and positively charged residues are associated with higher hemolytic activity. Among the classification models evaluated, which include machine learning (ML), quantum ML, and protein language models, a hybrid model combining random forest (RF) with a motif-based approach achieves the highest area under the receiver operating characteristic curve (AUROC), 0.921. Regression models achieve a Pearson correlation coefficient (R) of 0.739 and a coefficient of determination (R²) of 0.543. These models outperform existing methods and are implemented in HemoPI2, a web-based platform and standalone software for designing peptides with desired HC50 values ( http://webs.iiitd.edu.in/raghava/hemopi2/ ).
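The abstract reports two regression metrics, Pearson R and the coefficient of determination R², and the two are not interchangeable: 0.739² ≈ 0.546, which is close to but not the same as the reported R² of 0.543. The sketch below shows how the two quantities can differ, using the standard textbook definitions. The function names and the toy values are assumptions made for illustration; they are not data or code from the study, and the study's exact evaluation procedure is not described here.

```python
import math


def pearson_r(y_true, y_pred):
    """Pearson correlation: strength of the linear relationship only."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Unlike Pearson R, this penalizes systematic offset and scale errors.
    """
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical toy values (not from the paper): predictions track the
# measurements perfectly but carry a constant offset of 0.5.
measured = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.5, 2.5, 3.5, 4.5, 5.5]

r = pearson_r(measured, predicted)
r2 = r_squared(measured, predicted)
print(f"Pearson R = {r:.3f}, R squared = {r ** 2:.3f}, R2 = {r2:.3f}")
```

In this toy case the correlation is perfect while R² is lower, because the constant offset adds to the residual sum of squares. Reporting both metrics, as the abstract does, therefore conveys more than either one alone.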
About the journal:
Communications Biology is an open access journal from Nature Research publishing high-quality research, reviews and commentary in all areas of the biological sciences. Research papers published by the journal represent significant advances bringing new biological insight to a specialized area of research.